I wrote this code to copy text from one RichTextBox to another without the stop words, but it copies all the text from richTextBox1.
string[] Stopwords = { "a", "about", "actually", "after", "also", "am", "an", "and",
"any", "are", "as", "at", "be", "because", "but", "by", "could", "do", "each", "either",
"en", "for", "from", "has", "have", "how","i", "if", "in", "is", "it", "its", "just",
"of", "or", "so", "some", "such", "that", "the", "their", "these", "thing", "this", "to",
"too", "very", "was", "we", "well", "what", "when", "where", "who", "will", "with",
"you", "your"
};
This is the code in the button handler that should do it:
private void button2_Click(object sender, EventArgs e)
{
string st = richTextBox1.Text;
string[] split = st.Split(' ');
richTextBox2.Text = "";
int c = 0;
foreach (string s in split)
{
if (!Stopwords.Contains(s))
{
richTextBox2.Text += s + " ";
}
else c++;
}
}
When I write if (Stopwords.Contains(s)) instead, it prints all the stop words in richTextBox1.
As it stands, the input and the output are the same.
I would not recommend appending directly to the text box's Text property inside the loop. Have you considered the following:
string st = richTextBox1.Text;
string[] split = st.Split(' ');
StringBuilder sb = new StringBuilder();
int c = 0;
foreach (string s in split)
{
if (!Stopwords.Contains(s))
sb.Append($"{s} ");
else
c++;
}
richTextBox2.Text = sb.ToString().Trim();
A simpler version of the code, using LINQ:
private void button2_Click(object sender, EventArgs e)
{
string st = richTextBox1.Text;
string[] split = st.Split(' ');
richTextBox2.Text = string.Join(" ", split.Where(x => !Stopwords.Contains(x.ToLower())));
}
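One likely reason the filter lets words through is that Contains on a string[] compares case-sensitively, and punctuation attached to a word ("the," for example) also stops it from matching. Here is a rough sketch of one way around both. It assumes the stop words are kept in a HashSet<string> built with StringComparer.OrdinalIgnoreCase; the names StopwordFilter and RemoveStopwords, the shortened word list and the trimmed punctuation set are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopwordFilter
{
    // Case-insensitive set, so "The" and "the" both count as stop words.
    // (Shortened list; use the full Stopwords array from the question.)
    private static readonly HashSet<string> Stopwords = new HashSet<string>(
        new[] { "a", "an", "and", "is", "it", "of", "the", "to" },
        StringComparer.OrdinalIgnoreCase);

    // Assumed set of punctuation to ignore when looking a word up.
    private static readonly char[] Punctuation = { '.', ',', ';', ':', '!', '?' };

    public static string RemoveStopwords(string text)
    {
        var kept = text
            .Split(new[] { ' ', '\r', '\n', '\t' }, StringSplitOptions.RemoveEmptyEntries)
            // Look up the word without its punctuation, but keep the original token.
            .Where(word => !Stopwords.Contains(word.Trim(Punctuation)));
        return string.Join(" ", kept);
    }
}
```

In the button handler this would be used as richTextBox2.Text = StopwordFilter.RemoveStopwords(richTextBox1.Text);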
I need help with my problem:
Right now I have a WPF application with a form that includes a button and a textbox.
Also, I have an open file directory - that opens .cs and .txt files.
I need to loop through the string of those files and display the most common words in them, starting from the largest to the smallest.
For example, a string would be:
"The sun is bright. The sun is yellow".
Would return:
The = 2;
sun = 2;
is = 2;
bright = 1;
yellow = 1;
My code as of now:
private void btn1_Click(object sender, RoutedEventArgs e)
{
OpenFileDialog ofd = new OpenFileDialog() { Filter = "Text Documents |*.cs;*.txt", ValidateNames = true, Multiselect = false };
if (ofd.ShowDialog() == true)
rtb.Text = File.ReadAllText(ofd.FileName);
string[] userText = rtb.Text.ToLower().Split(new char[] { ' ', ',', '=', '+', '}', '{', '\r', '\n', '(', ')', ';' }, StringSplitOptions.RemoveEmptyEntries);
var frequencies = new Dictionary<string, int>();
foreach (string word in userText) //search words in our array userText that we declared at the beginning.
{
}
}
I am not sure how to continue from here... Help is appreciated.
Using the example and expected output that you have provided for us, I was able to accomplish this by using .GroupBy along with creating an anonymous class.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
// example string
var myString = "The sun is bright. The sun is yellow";
// split the string into an array based on space characters and the period
var stringArray = myString.Split(new char[] {' ', '.'}, StringSplitOptions.RemoveEmptyEntries);
// group the items in the array based on the value, then create an anonymous class with the properties `Word` and `Count`
var groupedWords = stringArray.GroupBy(x => x).Select(x => new { Word = x.Key, Count = x.Count() }).ToList();
// print the properties based on order of largest to smallest count to screen
foreach(var item in groupedWords.OrderByDescending(x => x.Count))
{
Console.WriteLine(item.Word + " = " + item.Count);
}
// Output
// ---------
// The = 2
// sun = 2
// is = 2
// bright = 1
// yellow = 1
}
}
Let me know if this helps!
I would go with @GTown-Coder's approach as the easiest. But if you really just want to know how to implement the same code using a dictionary as in your sample...
private void btn1_Click(object sender, RoutedEventArgs e)
{
OpenFileDialog ofd = new OpenFileDialog() { Filter = "Text Documents |*.cs;*.txt", ValidateNames = true, Multiselect = false };
if (ofd.ShowDialog() == true)
rtb.Text = File.ReadAllText(ofd.FileName);
string[] userText = rtb.Text.ToLower().Split(new char[] { ' ', ',', '=', '+', '}', '{', '\r', '\n', '(', ')', ';' }, StringSplitOptions.RemoveEmptyEntries);
var frequencies = new Dictionary<string, int>();
foreach (string word in userText) //search words in our array userText that we declared at the beginning.
{
// Sample here
if (frequencies.ContainsKey(word))
{
frequencies[word]++;
}
else
{
frequencies.Add(word,1);
}
}
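// Largest count first, as asked (requires using System.Linq)
foreach (var kvp in frequencies.OrderByDescending(pair => pair.Value))
Console.WriteLine("Word: {0} \t Frequency: {1}", kvp.Key, kvp.Value);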
}
This almost sounds like the original definition of a dictionary, so that might be a good place to start:
IDictionary<string, int> actualDictionary = new Dictionary<string, int>();
You can place words in the dictionary, and increment their count each time you find them.
IDictionary<string, int> actualDictionary = new Dictionary<string, int>();
foreach (string word in userText) //search words in our array userText that we declared at the beginning.
{
if (!actualDictionary.ContainsKey(word)) {
actualDictionary[word] = 0;
}
actualDictionary[word]++;
}
foreach(var thing in actualDictionary) {
Console.WriteLine(thing.Key + " " + thing.Value);
}
See a running example on .NET Fiddle.
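The dictionary approach above counts the words but does not sort them. A minimal sketch that does both, assuming case-insensitive counting is wanted; WordCounter, CountWords and the separator list are hypothetical names and choices, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Counts words case-insensitively and returns them from most to least frequent.
    public static List<KeyValuePair<string, int>> CountWords(string text)
    {
        var separators = new[] { ' ', '.', ',', ';', '\r', '\n' };
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // TryGetValue leaves current at 0 when the word has not been seen yet.
            int current;
            counts.TryGetValue(word, out current);
            counts[word] = current + 1;
        }

        return counts.OrderByDescending(pair => pair.Value).ToList();
    }
}
```

Each entry of the returned list exposes Key (the word as first seen) and Value (its count), so the result can be written to the text box or console in a simple loop.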
I want to check if a String s, contains "a" or "b" or "c", in C#.
I am looking for a nicer solution than using
if (s.Contains("a") || s.Contains("b") || s.Contains("c"))
Well, there's always this:
public static bool ContainsAny(this string haystack, params string[] needles)
{
foreach (string needle in needles)
{
if (haystack.Contains(needle))
return true;
}
return false;
}
Usage:
bool anyLuck = s.ContainsAny("a", "b", "c");
Nothing's going to match the performance of your chain of || comparisons, however.
Here's a LINQ solution which is virtually the same but more scalable:
new[] { "a", "b", "c" }.Any(c => s.Contains(c))
var values = new [] {"abc", "def", "ghj"};
var str = "abcedasdkljre";
values.Any(str.Contains);
If you are looking for single characters, you can use String.IndexOfAny().
If you want arbitrary strings, then I'm not aware of a .NET method to achieve that "directly", although a regular expression would work.
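For the single-character case, a small sketch of what the IndexOfAny call could look like. CharSearch and ContainsAnyChar are made-up names; only String.IndexOfAny itself is the framework method:

```csharp
using System;

public static class CharSearch
{
    // True when text contains at least one of the given characters.
    public static bool ContainsAnyChar(string text, params char[] candidates)
    {
        // IndexOfAny returns -1 when none of the characters occur.
        return text.IndexOfAny(candidates) >= 0;
    }
}
```

Usage would be bool found = CharSearch.ContainsAnyChar(s, 'a', 'b', 'c');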
You can try a regular expression:
string s = "some text to search"; // the string to test
Regex r = new Regex("a|b|c");
bool containsAny = r.IsMatch(s);
If you need ContainsAny with a specific StringComparison (for example to ignore case) then you can use this string extension method.
public static class StringExtensions
{
public static bool ContainsAny(this string input, IEnumerable<string> containsKeywords, StringComparison comparisonType)
{
return containsKeywords.Any(keyword => input.IndexOf(keyword, comparisonType) >= 0);
}
}
Usage with StringComparison.CurrentCultureIgnoreCase:
var input = "My STRING contains Many Substrings";
var substrings = new[] {"string", "many substrings", "not containing this string" };
input.ContainsAny(substrings, StringComparison.CurrentCultureIgnoreCase);
// The statement above returns true.
"xyz".ContainsAny(substrings, StringComparison.CurrentCultureIgnoreCase);
// This statement returns false.
This is a "nicer solution" and quite simple
if(new string[] { "A", "B", ... }.Any(s=>myString.Contains(s)))
List<string> includedWords = new List<string>() { "a", "b", "c" };
bool string_contains_words = includedWords.Exists(o => s.Contains(o));
public static bool ContainsAny(this string haystack, IEnumerable<string> needles)
{
return needles.Any(haystack.Contains);
}
As a string is a collection of characters, you can use LINQ extension methods on them:
if (s.Any(c => c == 'a' || c == 'b' || c == 'c')) ...
This will scan the string once and stop at the first occurrence, instead of scanning the string once for each character until a match is found.
This can also be used for any expression you like, for example checking for a range of characters:
if (s.Any(c => c >= 'a' && c <= 'c')) ...
// Nice method name, @Dan Tao
// ("params" is a keyword, so the parameter is named needles here)
public static bool ContainsAny(this string value, params string[] needles)
{
return needles.Any(p => value.IndexOf(p, StringComparison.Ordinal) >= 0);
// or
// return needles.Any(p => value.Contains(p));
}
Any for any, All for every
static void Main(string[] args)
{
string illegalCharacters = "!@#$%^&*()\\/{}|<>,.~`?"; //We'll call these the bad guys
string goodUserName = "John Wesson"; //This is a good guy. We know it. We can see it!
//But what if we want the program to make sure?
string badUserName = "*_Wesson*_John!?"; //We can see this has one of the bad guys. Underscores not restricted.
Console.WriteLine("goodUserName " + goodUserName +
(!HasWantedCharacters(goodUserName, illegalCharacters) ?
" contains no illegal characters and is valid" : //This line is the expected result
" contains one or more illegal characters and is invalid"));
string captured = "";
Console.WriteLine("badUserName " + badUserName +
(!HasWantedCharacters(badUserName, illegalCharacters, out captured) ?
" contains no illegal characters and is valid" :
//We can expect this line to print and show us the bad ones
" is invalid and contains the following illegal characters: " + captured));
}
//Takes a string to check for the presence of one or more of the wanted characters within a string
//As soon as one of the wanted characters is encountered, return true
//This is useful if a character is required, but NOT if a specific frequency is needed
//ie. you wouldn't use this to validate an email address
//but could use it to make sure a username is only alphanumeric
static bool HasWantedCharacters(string source, string wantedCharacters)
{
foreach(char s in source) //One by one, loop through the characters in source
{
foreach(char c in wantedCharacters) //One by one, loop through the wanted characters
{
if (c == s) //Is the current illegalChar here in the string?
return true;
}
}
return false;
}
//Overloaded version of HasWantedCharacters
//Checks to see if any one of the wantedCharacters is contained within the source string
//string source ~ String to test
//string wantedCharacters ~ string of characters to check for
static bool HasWantedCharacters(string source, string wantedCharacters, out string capturedCharacters)
{
capturedCharacters = ""; //Haven't found any wanted characters yet
foreach(char s in source)
{
foreach(char c in wantedCharacters) //Is the current illegalChar here in the string?
{
if(c == s)
{
if(!capturedCharacters.Contains(c.ToString()))
capturedCharacters += c.ToString(); //Send these characters to whoever's asking
}
}
}
if (capturedCharacters.Length > 0)
return true;
else
return false;
}
You could have a class for your extension methods and add this one:
public static bool Contains<T>(this string s, List<T> list)
{
foreach (char c in s)
{
foreach (T value in list)
{
if (c == Convert.ToChar(value))
return true;
}
}
return false;
}
You can use Regular Expressions
if (System.Text.RegularExpressions.Regex.IsMatch(s, "a|b|c"))
If this is for a password checker with requirements, try this:
public static bool PasswordChecker(string input)
{
// determines whether a password is safe enough
if (input.Length < 8)
return false;
if (!new string[] { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R",
"S", "T", "U", "V", "W", "X", "Y", "Z", "Ä", "Ü", "Ö"}.Any(s => input.Contains(s)))
return false;
if (!new string[] { "1", "2", "3", "4", "5", "6", "7", "8", "9", "0"}.Any(s => input.Contains(s)))
return false;
if (!new string[] { "!", "'", "§", "$", "%", "&", "/", "(", ")", "=", "?", "*", "#", "+", "-", "_", ".",
",", ";", ":", "`", "´", "^", "°", }.Any(s => input.Contains(s)))
return false;
return true;
}
This requires a password to have a minimum length of 8, at least one uppercase character, at least one digit, and at least one of the special characters.
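The same three rules can also be written with the char classification methods instead of literal lists. This is only a sketch under assumptions: char.IsUpper and char.IsDigit accept any Unicode uppercase letter or digit, which is broader than the lists above, and "special" is taken to mean any punctuation or symbol character. PasswordRules and IsStrongEnough are hypothetical names.

```csharp
using System;
using System.Linq;

public static class PasswordRules
{
    public static bool IsStrongEnough(string input)
    {
        if (input == null || input.Length < 8)
            return false;

        bool hasUpper = input.Any(c => char.IsUpper(c));
        bool hasDigit = input.Any(c => char.IsDigit(c));
        // Assumption: any punctuation or symbol counts as a special character.
        bool hasSpecial = input.Any(c => char.IsPunctuation(c) || char.IsSymbol(c));

        return hasUpper && hasDigit && hasSpecial;
    }
}
```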
I have to extract all variables from Formula
Fiddle for below problem
eg. (FB+AB+ESI) / 12
Output {FB,AB,ESI}
Code written so far
var length = formula.Length;
List<string> variables = new List<string>();
List<char> operators = new List<char> { '+', '-', '*', '/', ')', '(', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
int count = 0;
string character = string.Empty;
for (int i = 0; i < length; i++)
{
if (!operators.Contains(formula[i]))
character += formula[i];
else
{
if (!string.IsNullOrWhiteSpace(character))
variables.Add(character);
character = string.Empty;
count = i;
}
}
if (!string.IsNullOrWhiteSpace(character))
variables.Add(character);
return variables;
The output of the method is {FB,AB,ESI}, which is correct.
My problem is when a variable name contains digits, e.g.
(FB1+AB1)/100
Expected output: {FB1,AB1}
But my method returns {FB,AB}
If variable's names must start with
letter A..Z, a..z
and if variable's names can contain
letters A..Z, a..z
digits 0..9
underscores _
you can use regular expressions:
String source = "(FB2+a_3B+EsI) / 12";
String pattern = @"[A-Za-z][A-Za-z0-9_]*";
// output will be "{FB2,a_3B,EsI}"
String output = "{" + String.Join(",",
Regex.Matches(source, pattern)
.OfType<Match>()
.Select(item => item.Value)) + "}";
In case you need a collection, say an array of variable's names, just modify the Linq:
String[] names = Regex.Matches(source, pattern)
.OfType<Match>()
.Select(item => item.Value)
.ToArray();
However, what is implemented is just a kind of naive tokenizer: you have to separate "variable names" found from function names, class names, check if they are commented out etc.
I have changed your code to do what you asked, but I am not sure about the approach, since bracket and operator precedence are not taken into consideration.
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
string formula = "AB1+FB+100";
var length = formula.Length;
List<string> variables = new List<string>();
List<char> operators = new List<char>{'+', '-', '*', '/', ')', '('};
List<char> numerals = new List<char>{'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
int count = 0;
string character = string.Empty;
char prev_char = '\0';
for (int i = 0; i < length; i++)
{
bool is_operator = operators.Contains(formula[i]);
bool is_numeral = numerals.Contains(formula[i]);
bool is_variable = !(is_operator || is_numeral);
bool was_variable = character.Contains(prev_char);
if (is_variable || (was_variable && is_numeral) )
character += formula[i];
else
{
if (!string.IsNullOrWhiteSpace(character))
variables.Add(character);
character = string.Empty;
count = i;
}
prev_char = formula[i];
}
if (!string.IsNullOrWhiteSpace(character))
variables.Add(character);
foreach (var item in variables)
Console.WriteLine(item);
Console.WriteLine();
Console.WriteLine();
}
}
Maybe also consider something like Math-Expression-Evaluator (on nuget)
Here is how you could do it with Regular Expressions.
// \w* rather than \w+ so that single-letter names such as "A" also match
Regex regex = new Regex(@"[A-Z]\w*");
List<string> matchedStrings = new List<string>();
foreach (Match match in regex.Matches("(FB1+AB1)/100"))
{
matchedStrings.Add(match.Value);
}
This will create a list of strings of all the matches.
Without regex, you can split on the actual operators (not numbers), and then remove any items that begin with a number:
public static List<string> GetVariables(string formula)
{
if (string.IsNullOrWhiteSpace(formula)) return new List<string>();
var operators = new List<char> { '+', '-', '*', '/', '^', '%', '(', ')' };
return formula
.Split(operators.ToArray(), StringSplitOptions.RemoveEmptyEntries)
.Select(operand => operand.Trim()) // drop spaces around operands
.Where(operand => operand.Length > 0 && !char.IsDigit(operand[0]))
.ToList();
}
You can do it this way, just optimize the code as you want.
string ss = "(FB+AB+ESI) / 12";
string[] spl = ss.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
string final = spl[0].Replace("(", "").Replace(")", "").Trim();
string[] entries = final.Split(new char[] {'+'}, StringSplitOptions.RemoveEmptyEntries);
StringBuilder sbFinal = new StringBuilder();
sbFinal.Append("{");
foreach(string en in entries)
{
sbFinal.Append(en + ",");
}
string finalString = sbFinal.ToString().TrimEnd(',');
finalString += "}";
What you are trying to do is an interpreter.
I can't give you the whole code but what I can give you is a head start (it will require a lot of coding).
First, learn about reverse polish notation.
Second, you need to learn about stacks.
Third, you have to apply both to get what you want to interpret.
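To give an idea of how the two fit together, here is a minimal sketch of evaluating a formula that is already in reverse Polish notation, using a stack. Converting the infix formula to postfix (for example with the shunting-yard algorithm) is left out. RpnEvaluator, Evaluate and the idea of passing variable values in a dictionary are assumptions of this sketch, not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class RpnEvaluator
{
    // Evaluates space-separated postfix tokens, e.g. "FB AB + ESI + 12 /".
    public static double Evaluate(string rpn, IDictionary<string, double> variables)
    {
        var stack = new Stack<double>();
        foreach (string token in rpn.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            double number;
            if (token == "+" || token == "-" || token == "*" || token == "/")
            {
                // An operator pops its right operand first, then the left one.
                double right = stack.Pop();
                double left = stack.Pop();
                switch (token)
                {
                    case "+": stack.Push(left + right); break;
                    case "-": stack.Push(left - right); break;
                    case "*": stack.Push(left * right); break;
                    default: stack.Push(left / right); break;
                }
            }
            else if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out number))
            {
                stack.Push(number);
            }
            else
            {
                // Anything else is treated as a variable name and looked up.
                stack.Push(variables[token]);
            }
        }
        return stack.Pop();
    }
}
```

The tokens that fall through to the last branch are exactly the variable names, so the same loop could collect them instead of looking them up.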
I have a string that I need to convert into a string[] containing each word in the string. However, I do not want any white space or punctuation EXCEPT hyphens and apostrophes that belong to a word.
Example Input:
Hello! This is a test and it's a short-er 1. - [ ] { } ___)
Example of the Array made from Input:
[ "Hello", "this", "is", "a", "test", "and", "it's", "a", "short-er", "1" ]
Currently this is the code I have tried
(Note: the 2nd gives an error later in the program when string.First() is called):
private string[] ConvertWordsFromFile(String NewFileText)
{
char[] delimiterChars = { ' ', ',', '.', ':', '/', '|', '<', '>', '@', '#', '$', '%', '^', '&', '*', '"', '(', ')', ';' };
string[] words = NewFileText.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
return words;
}
or
private string[] ConvertWordsFromFile(String NewFileText)
{
return Regex.Split(NewFileText, @"\W+");
}
The second example crashes in the following code:
private string GroupWordsByFirstLetter(List<String> words)
{
var groups =
from w in words
group w by w.First();
return FormatGroupsByAlphabet(groups);
}
specifically, when w.First() is called.
To remove unwanted characters from a String
string randomString = "thi$ is h#ving s*me inva!id ch#rs";
string excpList ="$#*!";
LINQ Option 1
var chRemoved = randomString
.Select(ch => excpList.Contains(ch) ? (char?)null : ch);
var Result = string.Concat(chRemoved.ToArray());
LINQ Option 2
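// Where keeps repeated letters; Except is a set operation and would also drop duplicates
var Result = randomString.Split().Select(x => x.Where(ch => !excpList.Contains(ch)))
.Select(c => new string(c.ToArray()))
.ToArray();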
Here is a little something I worked up. Splits on \n and removes any unwanted characters.
private string ValidChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'-";
private IEnumerable<string> SplitRemoveInvalid(string input)
{
string tmp = "";
foreach(char c in input)
{
if(c == '\n')
{
if(!String.IsNullOrEmpty(tmp))
{
yield return tmp;
tmp = "";
}
continue;
}
if(ValidChars.Contains(c))
{
tmp += c;
}
}
if (!String.IsNullOrEmpty(tmp)) yield return tmp;
}
Usage could be something like this:
string[] array = SplitRemoveInvalid("Hello! This is a test and it's a short-er 1. - [ ] { } _)")
.ToArray();
I didn't actually test it, but it should work. If it doesn't, it should be easy enough to fix.
Use string.Split(char [])
string strings = "4,6,8\n9,4";
string [] split = strings .Split(new Char [] {',' , '\n' });
OR
Try the following if you get unwanted empty items, using String.Split Method (Char[], StringSplitOptions):
string [] split = strings .Split(new Char [] {',' , '\n' },
StringSplitOptions.RemoveEmptyEntries);
This can be done quite easily with a RegEx, by matching words. I am using the following RegEx, which will allow hyphens and apostrophes in the middle of words, but will strip them out if they occur at a word boundary.
\w(?:[\w'-]*\w)?
See it in action here.
In C# it could look like this:
private string[] ConvertWordsFromFile(String NewFileText)
{
return (from Match m in new Regex(@"\w(?:[\w'-]*\w)?").Matches(NewFileText)
select m.Value).ToArray();
}
I am using LINQ to get an array of words from the MatchCollection returned by Matches.
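As for the crash in the question: Regex.Split returns an empty string when the input begins or ends with a delimiter match. The example input ends with ")", so the array ends with "" and w.First() throws on that empty string. If you want to keep the Split version, a sketch like the following drops the empty entries. WordSplitter and SplitWords are made-up names, and note that splitting on \W+ still breaks "it's" and "short-er" apart, so matching words as shown above fits the stated requirement better.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordSplitter
{
    public static string[] SplitWords(string text)
    {
        // Regex.Split can yield "" at the start or end of the result; drop those entries.
        return Regex.Split(text, @"\W+")
                    .Where(word => word.Length > 0)
                    .ToArray();
    }
}
```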