Occurrence of elements in a file with C# and Dictionary - c#

I have a file like this:
outlook temperature Humidity Windy PlayTennis
sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
I want to find the number of occurrences of each element, e.g.
sunny: 2
rain: 3
overcast:1
hot: 3
and so on
My code is:
string file = openFileDialog1.FileName;
var text1 = File.ReadAllLines(file);
StringBuilder str = new StringBuilder();
string[] lines = File.ReadAllLines(file);
string[] nonempty=lines.Where(s => s.Trim(' ')!="")
.Select(s => Regex.Replace(s, @"\s+", " ")).ToArray();
string[] colheader = null;
if (nonempty.Length > 0)
colheader = nonempty[0].Split();
else
return;
var linevalue = nonempty.Skip(1).Select(l => l.Split());
int colcount = colheader.Length;
Dictionary<string, string> colvalue = new Dictionary<string, string>();
for (int i = 0; i < colcount; i++)
{
int k = 0;
foreach (string[] values in linevalue)
{
if(! colvalue.ContainsKey(values[i]))
{
colvalue.Add(values[i],colheader[i]);
}
label2.Text = label2.Text + k.ToString();
}
}
foreach (KeyValuePair<string, string> pair in colvalue)
{
label1.Text += pair.Key+ "\n";
}
Output I get here is
sunny
overcast
rain
hot
mild
cool
N
P
true
false
I also want to find the number of occurrences, which I am unable to get. Can you please help me out here?

This LINQ query returns a Dictionary<string, int> that contains each word in the file as a key and the word's number of occurrences as the value:
var occurences = File.ReadAllLines(file).Skip(1) // skip titles line
.SelectMany(l => l.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries))
.GroupBy(w => w)
.ToDictionary(g => g.Key, g => g.Count());
Usage of dictionary:
int sunnyOccurences = occurences["sunny"];
foreach(var pair in occurences)
label1.Text += String.Format("{0}: {1}\n", pair.Key, pair.Value);

Seems to me like you are implementing a simple tag cloud. I have used non-generic collections here, but you can replace them with generic ones, e.g. replace the Hashtable with a Dictionary.
Follow this code:
Hashtable tagCloud = new Hashtable();
ArrayList frequency = new ArrayList();
Read from a file and store it as array
string[] lines = File.ReadAllLines("file.txt");
//use the specific delimiter
char[] delimiter = new char[] { ' ' };
StringBuilder buffer = new StringBuilder();
foreach (string line in lines)
{
if (line.ToString().Length != 0)
{
buffer.Append((" " + line.Trim()));
}
}
string[] words = buffer.ToString().Trim().Split(delimiter);
Storing occurrence of each word.
List<string> listOfWords = new List<string>(words);
foreach (string i in listOfWords)
{
int c = 0;
foreach (string j in words)
{
if (i.Equals(j))
c++;
}
frequency.Add(c);
}
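Store as key-value pairs. The key is the word and the value is its number of occurrences.
for (int i = 0; i < listOfWords.Count; i++)
{
// use a Dictionary here; the indexer is used instead of Add because
// listOfWords still contains duplicates and Add throws on an existing key
tagCloud[listOfWords[i]] = (int)frequency[i];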
}
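As suggested above, the Hashtable/ArrayList pair can be swapped for a generic Dictionary<string, int>, which also avoids the nested loop. A minimal sketch, assuming the lines are already in memory; the names WordFrequency and Count are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class WordFrequency
{
    // Counts how often each whitespace-separated word occurs in the given lines.
    public static Dictionary<string, int> Count(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            // RemoveEmptyEntries drops the empty strings produced by repeated spaces.
            string[] words = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string word in words)
            {
                // TryGetValue leaves count at 0 when the word has not been seen yet.
                counts.TryGetValue(word, out int count);
                counts[word] = count + 1;
            }
        }
        return counts;
    }
}
```

It could be called as WordFrequency.Count(File.ReadAllLines("file.txt").Skip(1)) to skip the header row; each word is then visited once instead of once per other word.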

If all you want is the keyword and a count of how many times they appear in the file, then lazyberezovsky's solution is about as elegant of a solution as you will find. But if you need to do any other metrics on the file's data, then I would load the file into a collection that keeps your other metadata intact.
Something simple like:
var forecasts = File.ReadAllLines(file).Skip(1) // skip the header row
.Select(line => line.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries)) // split the line into an array of strings
.Select (f =>
new
{
Outlook = f[0],
Temperature = f[1],
Humidity = f[2],
Windy = f[3],
PlayTennis = f[4]
});
will give you an IEnumerable<> of an anonymous type that has properties that can be queried.
For example if you wanted to see how many times "sunny" occurred in the Outlook then you could just use LINQ to do this:
var count = forecasts.Count( f => f.Outlook == "sunny");
Or if you just wanted the list of all outlooks you could write:
var outlooks = forecasts.Select(f => f.Outlook).Distinct();
Where this is useful is when you want to do more complicated queries like "How many rainy cool days are there?":
var count = forecasts.Count (f => f.Outlook == "rain" && f.Temperature == "cool");
Again if you just want all words and their occurrence count, then this is overkill.
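If per-column counts are what is needed (for example, how often "true" appears in Windy only), the same idea can be sketched with a small named type in place of the anonymous one. The Forecast class and the ForecastStats helper below are illustrative names, not part of the answer above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Forecast
{
    public string Outlook { get; set; }
    public string Temperature { get; set; }
    public string Humidity { get; set; }
    public string Windy { get; set; }
    public string PlayTennis { get; set; }
}

public static class ForecastStats
{
    // Parses data rows (header already skipped) into Forecast objects.
    public static List<Forecast> Parse(IEnumerable<string> rows)
    {
        return rows
            .Select(r => r.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(f => f.Length >= 5) // ignore blank or short lines
            .Select(f => new Forecast
            {
                Outlook = f[0], Temperature = f[1], Humidity = f[2],
                Windy = f[3], PlayTennis = f[4]
            })
            .ToList();
    }

    // Counts the distinct values of one column, chosen by the selector.
    public static Dictionary<string, int> CountColumn(
        IEnumerable<Forecast> forecasts, Func<Forecast, string> column)
    {
        return forecasts.GroupBy(column).ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For the sample file, ForecastStats.CountColumn(forecasts, f => f.Outlook) would map sunny to 2, overcast to 1 and rain to 3, while keeping values from different columns apart.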

Related

C# looping through a list to find character counts

I'm trying to loop through a string to find the character, ASCII value, and the number of times the character occurs. So far, I have found each unique character and ASCII value using foreach statements, and finding if the value was already in the list, then don't add it, otherwise add it. However I'm struggling with the count portion. I was thinking the logic would be "if I am already in the list, don't count me again, however, increment my frequency"
I've tried a few different things, such as trying to find the index of the character it found and adding to that specific index, but I'm lost.
string String = "hello my name is lauren";
char[] String1 = String.ToCharArray();
// int [] frequency = new int[String1.Length]; //array of frequency counter
int length = 0;
List<char> letters = new List<char>();
List<int> ascii = new List<int>();
List<int> frequency = new List<int>();
foreach (int ASCII in String1)
{
bool exists = ascii.Contains(ASCII);
if (exists)
{
//add to frequency at same index
//ascii.Insert(1, ascii);
//get { ASCII[index]; }
}
else
{
ascii.Add(ASCII);
//add to frequency at new index
}
}
foreach (char letter in String1)
{
bool exists = letters.Contains(letter);
if (exists)
{
//add to frequency at same index
}
else
{
letters.Add(letter);
//add to frequency at new index
}
}
length = letters.Count;
for (int j = 0; j<length; ++j)
{
Console.WriteLine($"{letters[j].ToString(),3} {"(" + ascii[j] + ")"}\t");
}
Console.ReadLine();
}
}
}
I'm not sure I understand your question, but what you are looking for may be a Dictionary<TKey, TValue> instead of a List<T>. Here are example solutions to the problems I think you are trying to solve.
Counting how often each character appears
Dictionary<int, int> frequency = new Dictionary<int, int>();
foreach (int j in String)
{
if (frequency.ContainsKey(j))
{
frequency[j] += 1;
}
else
{
frequency.Add(j, 1);
}
}
Linking characters to their ASCII values
Dictionary<char, int> ASCIIofCharacters = new Dictionary<char, int>();
foreach (char i in String)
{
if (!ASCIIofCharacters.ContainsKey(i))
{
ASCIIofCharacters.Add(i, (int)i);
}
}
A simple LINQ approach is to do this:
string String = "hello my name is lauren";
var results =
String
.GroupBy(x => x)
.Select(x => new { character = x.Key, ascii = (int)x.Key, frequency = x.Count() })
.ToArray();
That gives me:
If I understood your question, you want to map each char in the provided string to the count of times it appears in the string, right?
If that is the case, there are tons of ways to do that, and you also need to choose in which data structure you want to store the result.
Assuming you want to use linq and store the result in a Dictionary<char, int>, you could do something like this:
static IDictionary<char, int> getAsciiAndFrequencies(string str) {
return (
from c in str
group c by Convert.ToChar(c)
).ToDictionary(c => c.Key, c => c.Count());
}
And use it like this:
var f = getAsciiAndFrequencies("hello my name is lauren");
// result: { h: 1, e: 3, l: 3, o: 1, ... }
You are creating a histogram. But you should not use List.Contains, as it gets inefficient as the list grows: it has to go through the list one item after another. Better to use a Dictionary, which is based on hashing and goes directly to the item. The code may look like this:
string str = "hello my name is lauren";
var dict = new Dictionary<char, int>();
foreach (char c in str)
{
dict.TryGetValue(c, out int count);
dict[c] = ++count;
}
foreach (var pair in dict.OrderBy(r => r.Key))
{
Console.WriteLine(pair.Value + "x " + pair.Key + " (" + (int)pair.Key + ")");
}
which gives
4x (32)
2x a (97)
3x e (101)
1x h (104)
1x i (105)
3x l (108)
2x m (109)
2x n (110)
1x o (111)
1x r (114)
1x s (115)
1x u (117)
1x y (121)

Ignoring case sensitivity

I have to count how many times each word from given input text appears in it.
And the thing where I'm stuck: The character casing differences should be ignored.
For example: "You are here.You you" -> the output :
are=1
here=1
You=3
What I've done:
string text = "You are here.You you";
IDictionary<string, int> wordsCount = new SortedDictionary<string, int>();
string[] words = text.Split(' ',',','.','-','!');
foreach (string word in words)
{
int count = 1;
if (wordsCount.ContainsKey(word))
count = wordsCount[word] + 1;
wordsCount[word] = count;
}
var items = from pair in wordsCount
orderby pair.Value ascending
select pair;
foreach (var p in items)
{
Console.WriteLine("{0} -> {1}", p.Key, p.Value);
}
Is there a way to do this without manually checking every word of the given text? For example, if I have a very long paragraph, I don't want to check every word with a specific method.
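Just convert the text to lower case. A C-style loop like the following does not work in C#, because strings are immutable and are not terminated by '\0':
for(i = 0; text[i] != '\0'; i++){
text[i] = text[i].ToLower();
}
Since text is a string, just do: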
text = text.ToLower();
Just before the string[] words = text.Split(' ',',','.','-','!'); line.
And then enjoy !
How about linq?
var text = "You are here.You you";
var words = text.Split(' ', ',', '.', '-', '!');
words
.GroupBy(word => word.ToLowerInvariant())
.OrderByDescending(group => group.Count())
.ToList()
.ForEach(g=> Console.WriteLine(g.Key + "=" + g.Count()));
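Both answers above lower-case the words, so they print "you=3" rather than the "You=3" shown in the question. One possible alternative, sketched here under the assumption that the first-seen casing should be kept, is a dictionary built with a case-insensitive comparer. The class name CaseInsensitiveWordCount is made up for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseInsensitiveWordCount
{
    // Returns "word=count" lines ordered by ascending count.
    public static List<string> Count(string text)
    {
        // The comparer treats "You", "you" and "YOU" as the same key;
        // the stored key keeps the casing of the first occurrence.
        var counts = new SortedDictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        char[] separators = { ' ', ',', '.', '-', '!' };
        foreach (string word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // TryGetValue leaves count at 0 for a word not seen before.
            counts.TryGetValue(word, out int count);
            counts[word] = count + 1;
        }
        // OrderBy is a stable sort, so words with equal counts stay in key order.
        return counts.OrderBy(p => p.Value)
                     .Select(p => p.Key + "=" + p.Value)
                     .ToList();
    }
}
```

For "You are here.You you" this yields are=1, here=1, You=3, without modifying the original text.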

Counting words using LinkedList

I have a class WordCount which has string wordDic and int count. Next, I have a List<WordCount>.
I have ANOTHER List<string> which has lots of words inside it. I am trying to use the List<WordCount> to count the occurrences of each word inside the List<string>.
Below is where I am stuck.
class WordCount
{
string wordDic;
int count;
}
List<WordCount> usd = new List<WordCount>();
foreach (string word in wordsList)
{
if (usd.wordDic.Contains(new WordCount {wordDic=word, count=0 }))
usd.count[value] = usd.counts[value] + 1;
else
usd.Add(new WordCount() {wordDic=word, count=1});
}
I don't know how to properly implement this in code but I am trying to search my List to see if the word in wordsList already exists and if it does, add 1 to count but if it doesn't then insert it inside usd with count of 1.
Note: *I have to use Lists to do this. I am not allowed to use anything else like hash tables...*
This is the answer before you edited to only use lists...btw, what is driving that requirement?
List<string> words = new List<string> {...};
// For case-insensitive you can instantiate with
// new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
Dictionary<string, int> counts = new Dictionary<string, int>();
foreach (string word in words)
{
if (counts.ContainsKey(word))
{
counts[word] += 1;
}
else
{
counts[word] = 1;
}
}
If you can only use lists, can you use List<KeyValuePair<string,int>> counts? It is similar to a dictionary, although a list does not enforce unique keys. The solution would be very similar. If you can only use plain lists, the following will work.
List<string> words = new List<string>{...};
List<string> foundWord = new List<string>();
List<int> countWord = new List<int>();
foreach (string word in words)
{
if (foundWord.Contains(word))
{
countWord[foundWord.IndexOf(word)] += 1;
}
else
{
foundWord.Add(word);
countWord.Add(1);
}
}
Using your WordCount class
List<string> words = new List<string>{...};
List<WordCount> foundWord = new List<WordCount>();
foreach (string word in words)
{
WordCount match = foundWord.SingleOrDefault(w => w.wordDic == word);
if (match!= null)
{
match.count += 1;
}
else
{
foundWord.Add(new WordCount { wordDic = word, count = 1 });
}
}
You can use Linq to do this.
static void Main(string[] args)
{
List<string> wordsList = new List<string>()
{
"Cat",
"Dog",
"Cat",
"Hat"
};
List<WordCount> usd = wordsList.GroupBy(x => x)
.Select(x => new WordCount() { wordDic = x.Key, count = x.Count() })
.ToList();
}
Use linq: Assuming your list of words :
string[] words = { "blueberry", "chimpanzee", "abacus", "banana", "abacus","apple", "cheese" };
You can do:
var count =
from word in words
group word.ToUpper() by word.ToUpper() into g
where g.Count() > 0
select new { g.Key, Count = g.Count() };
(or in your case, select new WordCount()... it'll depend on how you have your constructor set up)...
the result will look like:
First, all of your class members are private, so they cannot be accessed from outside the class. Let's assume they are made accessible to the code that uses them.
Second, your count member is an int, so the following statement will not work:
usd.count[value] = usd.counts[value] + 1;
I also think you have a typo: counts versus count.
To solve your problem, find the counter corresponding to your word. If it exists, increment its count; otherwise, create a new one.
foreach (string word in wordsList) {
WordCount counter = usd.Find(c => c.wordDic == word);
if (counter != null) // Counter exists
counter.count++;
else
usd.Add(new WordCount() { wordDic=word, count = 1 }); // Create new one
}
You should use a Dictionary, as it's faster when checking whether a key exists (ContainsKey).
Just replace your list with this:
Dictionary<string, WordCount> usd = new Dictionary<string, WordCount>();
foreach (string word in wordsList)
{
if (usd.ContainsKey(word.ToLower()))
usd[word.ToLower()].count++;
else
usd.Add(word.ToLower(), new WordCount() {wordDic=word, count=1});
}

C#: Loop over Textfile, split it and Print a new Textfile

I get many lines of String as an Input that look like this. The Input is a String that comes from
theObjects.Runstate;
each #VAR;****;#ENDVAR; represents one Line and one step in the loop.
#VAR;Variable=Speed;Value=Fast;Op==;#ENDVAR;#VAR;Variable=Fabricator;Value=Freescale;Op==;#ENDVAR;
I split it, to remove the unwanted fields, like #VAR,#ENDVAR and Op==.
The optimal Output would be:
Speed = Fast;
Fabricator = Freescale; and so on.
I am able to cut out the #VAR and the #ENDVAR. Cutting out the "Op==" won't be that hard, so that's not the main focus of the question. My biggest concern right now is that I want to print the output as a text file. To print an array I would have to loop over it, but in every iteration, when I get a new line, I overwrite the array with the currently split string. I think the last line of the input file is an empty string, so the output I get is just an empty text file. It would be nice if someone could help me.
string[] w;
TextWriter tw2;
foreach (EA.Element theObjects in myPackageObject.Elements)
{
theObjects.Type = "Object";
foreach (EA.Element theElements in PackageHW.Elements)
{
if (theObjects.ClassifierID == theElements.ElementID)
{
t = theObjects.RunState;
w = t.Replace("#ENDVAR;", "#VAR;").Replace("#VAR;", ";").Split(new string[] { ";" }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in w)
{
tw2.WriteLine(s);
}
}
}
}
This LINQ query gives the expected result:
var keyValuePairLines = File.ReadLines(pathInputFile)
.Select(l =>
{
l = l.Replace("#VAR;", "").Replace("#ENDVAR;", "").Replace("Op==;", "");
IEnumerable<string[]> tokens = l.Split(new[]{';'}, StringSplitOptions.RemoveEmptyEntries)
.Select(t => t.Split('='));
return tokens.Select(t => {
return new KeyValuePair<string, string>(t.First(), t.Last());
});
});
foreach(var keyValLine in keyValuePairLines)
foreach(var keyVal in keyValLine)
Console.WriteLine("Key:{0} Value:{1}", keyVal.Key, keyVal.Value);
Output:
Key:Variable Value:Speed
Key:Value Value:Fast
Key:Variable Value:Fabricator
Key:Value Value:Freescale
If you want to output it to another text-file with one key-value pair on each line:
File.WriteAllLines(pathOutputFile, keyValuePairLines.SelectMany(l =>
l.Select(kv => string.Format("{0}:{1}", kv.Key, kv.Value))));
Edit according to your question in the comment:
"What would I have to change/add so that the Output is like this. I
need AttributeValuePairs, for example: Speed = Fast; or Fabricator =
Freescale ?"
Now I understand the logic: you have key-value pairs, but you are interested only in the values. Every two key-value pairs belong together: the first value specifies the attribute and the second the value of that attribute (e.g. Speed=Fast).
Then it's a little bit more complicated:
var keyValuePairLines = File.ReadLines(pathInputFile)
.Select(l =>
{
l = l.Replace("#VAR;", "").Replace("#ENDVAR;", "").Replace("Op==;", "");
string[] tokens = l.Split(new[]{';'}, StringSplitOptions.RemoveEmptyEntries);
var lineValues = new List<KeyValuePair<string, string>>();
for(int i = 0; i < tokens.Length; i += 2)
{
// Value to a variable can be found on the next index, therefore i += 2
string[] pair = tokens[i].Split('=');
string key = pair.Last();
string value = null;
string nextToken = tokens.ElementAtOrDefault(i + 1);
if (nextToken != null)
{
pair = nextToken.Split('=');
value = pair.Last();
}
var keyVal = new KeyValuePair<string, string>(key, value);
lineValues.Add(keyVal);
}
return lineValues;
});
File.WriteAllLines(pathOutputFile, keyValuePairLines.SelectMany(l =>
l.Select(kv=>string.Format("{0} = {1}", kv.Key, kv.Value))));
Output in the file with your single sample-line:
Speed = Fast
Fabricator = Freescale
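As an alternative to the token-pair loop, the records could also be pulled out with a single regular expression. This is only a sketch: it assumes every record has the exact field order shown in the question (Variable, then Value), and RunStateParser and ToLines are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RunStateParser
{
    // Matches one "#VAR;Variable=<name>;Value=<value>;" record; the trailing
    // "Op==;#ENDVAR;" part is simply not captured.
    private static readonly Regex Record =
        new Regex(@"#VAR;Variable=(?<name>[^;]*);Value=(?<value>[^;]*);");

    // Returns one "name = value" line per record found in the input.
    public static List<string> ToLines(string runState)
    {
        var lines = new List<string>();
        foreach (Match m in Record.Matches(runState))
        {
            lines.Add(m.Groups["name"].Value + " = " + m.Groups["value"].Value);
        }
        return lines;
    }
}
```

Collecting the lines from every element into one list and calling File.WriteAllLines once at the end would avoid the overwriting problem described in the question.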

Convert array of strings to Dictionary<string, int> c# then output to Visual Studio

I have an array of strings like so:
[0]Board1
[1]Messages Transmitted75877814
[2]ISR Count682900312
[3]Bus Errors0
[4]Data Errors0
[5]Receive Timeouts0
[6]TX Q Overflows0
[7]No Handler Failures0
[8]Driver Failures0
[9]Spurious ISRs0
Just to clarify, the numbers in the square brackets indicate each string's position in the array.
I want to convert the array of strings to a dictionary, with the string to the left of each number acting as the key, for example (ISR Count, 682900312).
I then want to output specific entries in the dictionary to a text box or table in Visual Studio (whichever is better). It would be preferable for the numbers to be left-aligned.
Excuse my naivety, I'm a newbie!
Pretty Simple. Tried and Tested
string[] arr = new string[] { "Board1", "ISR Count682900312", ... };
var numAlpha = new Regex("(?<Alpha>[a-zA-Z ]*)(?<Numeric>[0-9]*)");
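// .Value extracts the matched text; without it the dictionary would hold Group objects
var res = arr.ToDictionary(x => numAlpha.Match(x).Groups["Alpha"].Value,
x => int.Parse(numAlpha.Match(x).Groups["Numeric"].Value));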
string[] strings =
{
"Board1", "Messages232"
};
Dictionary<string, int> dictionary = new Dictionary<string, int>();
foreach (var s in strings)
{
int index = 0;
for (int i = 0; i < s.Length; i++)
{
if (Char.IsDigit(s[i]))
{
index = i;
break;
}
}
dictionary.Add(s.Substring(0, index), int.Parse(s.Substring(index)));
}
var stringArray = new[]
{
"[0]Board1",
"[1]Messages Transmitted75877814",
"[2]ISR Count682900312",
"[3]Bus Errors0",
"[4]Data Errors0",
"[5]Receive Timeouts0",
"[6]TX Q Overflows0",
"[7]No Handler Failures0",
"[8]Driver Failures0",
"[9]Spurious ISRs0"
};
var resultDict = stringArray.Select(s => s.Substring(3))
.ToDictionary(s =>
{
int i = s.IndexOfAny("0123456789".ToCharArray());
return s.Substring(0, i);
},
s =>
{
int i = s.IndexOfAny("0123456789".ToCharArray());
return int.Parse(s.Substring(i));
});
EDIT: If the numbers in brackets are not included in the strings, remove .Select(s => s.Substring(3)).
Here you go:
string[] strA = new string[10]
{
"Board1",
"Messages Transmitted75877814",
"ISR Count682900312",
"Bus Errors0",
"Data Errors0",
"Receive Timeouts0",
"TX Q Overflows0",
"No Handler Failures0",
"Driver Failures0",
"Spurious ISRs0"
};
Dictionary<string, int> list = new Dictionary<string, int>();
foreach (var item in strA)
{
// this Regex matches any digit one or more times so it picks
// up all of the digits on the end of the string
var match = Regex.Match(item, @"\d+");
// this code will substring out the first part and parse the second as an int
list.Add(item.Substring(0, match.Index), int.Parse(match.Value));
}
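None of the answers covers the second half of the question, printing selected entries with the numbers left-aligned. A minimal sketch using composite formatting follows; CounterReport is a made-up helper name, and the columns only line up when the text box uses a monospaced font:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CounterReport
{
    // Formats each requested entry as "name<padding>number". A negative
    // alignment component ({0,-N}) left-aligns the name in a field of N
    // characters, so every number starts in the same column.
    public static string Format(IDictionary<string, int> counters, params string[] keys)
    {
        if (keys.Length == 0) return string.Empty;
        int width = keys.Max(k => k.Length) + 2; // width of the name column
        var lines = keys
            .Where(k => counters.ContainsKey(k)) // skip keys that are not present
            .Select(k => string.Format("{0,-" + width + "}{1}", k, counters[k]));
        return string.Join(Environment.NewLine, lines);
    }
}
```

With the dictionary built above, something like textBox1.Text = CounterReport.Format(list, "ISR Count", "Bus Errors"); would show just those two counters (textBox1 is an assumed control name, with Multiline enabled).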
