Counting words using LinkedList - c#

I have a class WordCount which has string wordDic and int count. Next, I have a List<WordCount>.
I have ANOTHER List<string> which has lots of words inside it. I am trying to use the List<WordCount> to count the occurrences of each word inside the List<string>.
Below is where I am stuck.
class WordCount
{
string wordDic;
int count;
}
List<WordCount> usd = new List<WordCount>();
foreach (string word in wordsList)
{
if (usd.wordDic.Contains(new WordCount {wordDic=word, count=0 }))
usd.count[value] = usd.counts[value] + 1;
else
usd.Add(new WordCount() {wordDic=word, count=1});
}
I don't know how to properly implement this in code but I am trying to search my List to see if the word in wordsList already exists and if it does, add 1 to count but if it doesn't then insert it inside usd with count of 1.
Note: *I have to use Lists to do this. I am not allowed to use anything else like hash tables...*

This is the answer before you edited to only use lists...btw, what is driving that requirement?
List<string> words = new List<string> {...};
// For case-insensitive you can instantiate with
// new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
Dictionary<string, int> counts = new Dictionary<string, int>();
foreach (string word in words)
{
if (counts.ContainsKey(word))
{
counts[word] += 1;
}
else
{
counts[word] = 1;
}
}
If you can only use lists, could you use a List<KeyValuePair<string,int>> counts? It stores the same key/value pairs as a dictionary, although it does not guarantee that keys are unique. The solution would be very similar. If you can only use plain lists, the following will work.
List<string> words = new List<string>{...};
List<string> foundWord = new List<string>();
List<int> countWord = new List<int>();
foreach (string word in words)
{
if (foundWord.Contains(word))
{
countWord[foundWord.IndexOf(word)] += 1;
}
else
{
foundWord.Add(word);
countWord.Add(1);
}
}
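The List<KeyValuePair<string,int>> idea mentioned above could be sketched roughly as follows. This is only an illustration: the class name KeyValueListCounter and the method name Count are made up for the example, and because KeyValuePair is an immutable struct, an existing entry is replaced rather than updated in place.

```csharp
using System;
using System.Collections.Generic;

static class KeyValueListCounter
{
    // Hypothetical helper: counts words using only a List of KeyValuePair entries.
    public static List<KeyValuePair<string, int>> Count(IEnumerable<string> words)
    {
        var counts = new List<KeyValuePair<string, int>>();
        foreach (string word in words)
        {
            // Linear search for an existing entry; FindIndex returns -1 when none matches.
            int index = counts.FindIndex(p => p.Key == word);
            if (index >= 0)
            {
                // KeyValuePair cannot be modified, so store a new pair with the bumped count.
                counts[index] = new KeyValuePair<string, int>(word, counts[index].Value + 1);
            }
            else
            {
                counts.Add(new KeyValuePair<string, int>(word, 1));
            }
        }
        return counts;
    }

    static void Main()
    {
        var words = new List<string> { "Cat", "Dog", "Cat", "Hat" };
        foreach (var pair in Count(words))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

Uniqueness here comes only from the FindIndex check, not from the list itself, and each lookup scans the whole list.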
Using your WordCount class
List<string> words = new List<string>{...};
List<WordCount> foundWord = new List<WordCount>();
foreach (string word in words)
{
WordCount match = foundWord.SingleOrDefault(w => w.wordDic == word);
if (match!= null)
{
match.count += 1;
}
else
{
foundWord.Add(new WordCount { wordDic = word, count = 1 });
}
}

You can use Linq to do this.
static void Main(string[] args)
{
List<string> wordsList = new List<string>()
{
"Cat",
"Dog",
"Cat",
"Hat"
};
List<WordCount> usd = wordsList.GroupBy(x => x)
.Select(x => new WordCount() { wordDic = x.Key, count = x.Count() })
.ToList();
}

Use LINQ. Assuming your list of words is:
string[] words = { "blueberry", "chimpanzee", "abacus", "banana", "abacus","apple", "cheese" };
You can do:
var count =
from word in words
group word.ToUpper() by word.ToUpper() into g
where g.Count() > 0
select new { g.Key, Count = g.Count() };
(or in your case, select new WordCount()... it'll depend on how you have your constructor set up)...
the result will look like:

First, all of your class members are private, so they cannot be accessed from outside the class. Let's assume you're using them inside the WordCount class too.
Second, your count member is an int. Therefore, the following statement will not work:
usd.count[value] = usd.counts[value] + 1;
And I think you've made a typo between counts and count.
To solve your problem, find the counter corresponding to your word. If it exists, increase its count value; otherwise, create a new one.
foreach (string word in wordsList) {
WordCount counter = usd.Find(c => c.wordDic == word);
if (counter != null) // Counter exists
counter.count++;
else
usd.Add(new WordCount() { wordDic=word, count = 1 }); // Create new one
}
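Putting the two points above together, a complete version might look like the sketch below. It assumes you are free to make the WordCount fields public; the wrapper class ListWordCounter and its Count method are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

class WordCount
{
    // Public (an assumption) so that code outside the class can read and update them.
    public string wordDic;
    public int count;
}

static class ListWordCounter
{
    // Counts occurrences using only List<WordCount>, as the question requires.
    public static List<WordCount> Count(IEnumerable<string> wordsList)
    {
        var usd = new List<WordCount>();
        foreach (string word in wordsList)
        {
            // Find returns null when no element matches the predicate.
            WordCount counter = usd.Find(c => c.wordDic == word);
            if (counter != null)
                counter.count++;
            else
                usd.Add(new WordCount { wordDic = word, count = 1 });
        }
        return usd;
    }

    static void Main()
    {
        var wordsList = new List<string> { "Cat", "Dog", "Cat", "Hat" };
        foreach (WordCount wc in Count(wordsList))
        {
            Console.WriteLine(wc.wordDic + ": " + wc.count);
        }
    }
}
```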

You should use a Dictionary, as it's faster than a List for "Contains"-style lookups.
Just replace your list with this:
Dictionary<string, WordCount> usd = new Dictionary<string, WordCount>();
foreach (string word in wordsList)
{
if (usd.ContainsKey(word.ToLower()))
usd[word.ToLower()].count++;
else
usd.Add(word.ToLower(), new WordCount() {wordDic=word, count=1});
}


Formatting List of String

I have an array of strings. I need to sort the list and save each letter's item in a single line. After this, I need to find the longest line of string.
I have done the first part in an inefficient way but I am trying to make it concise.
List<string> fruits = new List<string>
{
"Pomme",
"Apple",
"Apricots",
"Avocado",
"Banana",
"Blackberries",
"Blackcurrant",
"Blueberries",
"Cherries",
"Clementine",
"Cranberries",
"Custard-Apple",
"Durian",
"Elderberries",
"Feijoa",
"Figs",
"Gooseberries",
"Grapefruit",
"Grapes",
"Guava",
"Breadfruit",
"Cantaloupe",
"Carambola",
"Cherimoya",
};
fruits.Sort();
List<string> sortedString = new List<string> { };
foreach (var str in fruits)
{
sortedString.Add(str);
}
//string A, B, C, D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S;
var A = "";
var B = "";
var C = "";
var D = "";
var E = "";
var F = "";
var G = "";
foreach (var item in sortedString)
{
if (item.StartsWith('A'))
{
A += item;
}
else if (item.StartsWith('B'))
{
B += item;
}
else if (item.StartsWith('C'))
{
C += item;
}
else if (item.StartsWith('D'))
{
D += item;
}
else if (item.StartsWith('E'))
{
E += item;
}
else if (item.StartsWith('F'))
{
F += item;
}
}
The result will be like -
AppleApricotsAvocado
BananaBlackberriesBlackcurrantBlueberriesBreadfruit
CantaloupeCarambolaCherimoyaCherriesClementineCranberriesCustard-Apple
Durian
Elderberries
FeijoaFigs
GooseberriesGrapefruitGrapesGuava
After this, I need to find the longest line and put space between each item. Without effective looping, the code will be messy. Can you assist me to show the right way to solve the problem?
The Sort() method already sorts your list and you don't need to assign it to a new one.
My proposal to resolve your problem is
fruits.Sort();
var result = fruits.GroupBy(f => f[0]);
int[] lineslength = new int[result.Count()];
int index = 0;
foreach (var group in result)
{
foreach (var item in group)
{
lineslength[index] += item.Length;
Console.Write(item + " ");
}
Console.WriteLine();
index++;
}
int longestIndex = Array.FindIndex(lineslength, val => val.Equals(lineslength.Max()));
Console.WriteLine(longestIndex);
I used the GroupBy method to group the strings by their first letter. Then, while displaying the strings, I also summed their lengths. Using the static FindIndex method of the Array class, I found the index of the maximum value in the array, which corresponds to the line with the maximum length. So index zero is the first line, one is the second line, etc.
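If the lines themselves are needed rather than only an index, the same grouping can build each line with string.Join. The following is a sketch, not the answer's own code: FruitLines, BuildLines and Longest are hypothetical names, the input is assumed to contain no empty strings (s[0] would throw on one), and line length here includes the separating spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FruitLines
{
    // Builds one space-separated line per first letter, in alphabetical order.
    public static List<string> BuildLines(IEnumerable<string> items)
    {
        return items
            .OrderBy(s => s, StringComparer.Ordinal)
            .GroupBy(s => s[0])                 // groups keep the sorted order
            .Select(g => string.Join(" ", g))   // "Apple Apricots Avocado"
            .ToList();
    }

    // Returns the longest line, or null when there are no lines.
    public static string Longest(IEnumerable<string> lines)
    {
        return lines.OrderByDescending(l => l.Length).FirstOrDefault();
    }

    static void Main()
    {
        var fruits = new List<string> { "Banana", "Apple", "Avocado", "Blueberries", "Cherries" };
        List<string> lines = BuildLines(fruits);
        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
        Console.WriteLine("Longest: " + Longest(lines));
    }
}
```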

Occurrence of elements in the file with c# and Dictionary

I have a file as
outlook temperature Humidity Windy PlayTennis
sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
I want to find the occurrence of each element, e.g.
sunny: 2
rain: 3
overcast:1
hot: 3
and so on
My code is:
string file = openFileDialog1.FileName;
var text1 = File.ReadAllLines(file);
StringBuilder str = new StringBuilder();
string[] lines = File.ReadAllLines(file);
string[] nonempty=lines.Where(s => s.Trim(' ')!="")
.Select(s => Regex.Replace(s, @"\s+", " ")).ToArray();
string[] colheader = null;
if (nonempty.Length > 0)
colheader = nonempty[0].Split();
else
return;
var linevalue = nonempty.Skip(1).Select(l => l.Split());
int colcount = colheader.Length;
Dictionary<string, string> colvalue = new Dictionary<string, string>();
for (int i = 0; i < colcount; i++)
{
int k = 0;
foreach (string[] values in linevalue)
{
if(! colvalue.ContainsKey(values[i]))
{
colvalue.Add(values[i],colheader[i]);
}
label2.Text = label2.Text + k.ToString();
}
}
foreach (KeyValuePair<string, string> pair in colvalue)
{
label1.Text += pair.Key+ "\n";
}
Output I get here is
sunny
overcast
rain
hot
mild
cool
N
P
true
false
I also want to find the number of occurrences, which I am unable to get. Can you please help me out here?
This LINQ query will return Dictionary<string, int> which will contain each word in file as key, and word's occurrences as value:
var occurences = File.ReadAllLines(file).Skip(1) // skip titles line
.SelectMany(l => l.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries))
.GroupBy(w => w)
.ToDictionary(g => g.Key, g => g.Count());
Usage of dictionary:
int sunnyOccurences = occurences["sunny"];
foreach(var pair in occurences)
label1.Text += String.Format("{0}: {1}\n", pair.Key, pair.Value);
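To try the query above without a file or a form, the same pipeline can be run over an in-memory array of lines. This is a sketch under assumptions: WordOccurrences and CountWords are hypothetical names, and splitting on tabs as well as spaces is an addition in case the columns are tab-separated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WordOccurrences
{
    // Counts every whitespace-separated value after the header line.
    public static Dictionary<string, int> CountWords(IEnumerable<string> lines)
    {
        return lines
            .Skip(1) // skip the header line
            .SelectMany(l => l.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    static void Main()
    {
        string[] lines =
        {
            "outlook temperature Humidity Windy PlayTennis",
            "sunny hot high false N",
            "sunny hot high true N",
            "overcast hot high false P",
            "rain mild high false P",
            "rain cool normal false P",
            "rain cool normal true N"
        };
        foreach (var pair in CountWords(lines))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

Note that this counts a value wherever it appears, so the same text in two different columns would be merged into one entry.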
Seems to me like you are implementing a simple Tag Cloud. I have used non-generic collection but you can replace it with generic. Replace the HashTable with Dictionary
Follow this code:
Hashtable tagCloud = new Hashtable();
ArrayList frequency = new ArrayList();
Read from a file and store it as array
string[] lines = File.ReadAllLines("file.txt");
//use the specific delimiter
char[] delimiter = new char[] { ' ' };
StringBuilder buffer = new StringBuilder();
foreach (string line in lines)
{
if (line.ToString().Length != 0)
{
buffer.Append((" " + line.Trim()));
}
}
string[] words = buffer.ToString().Trim().Split(delimiter);
Storing occurrence of each word.
List<string> listOfWords = new List<string>(words);
foreach (string i in listOfWords)
{
int c = 0;
foreach (string j in words)
{
if (i.Equals(j))
c++;
}
frequency.Add(c);
}
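Store as key/value pairs. The key will be the word and the value will be its occurrence count.
for (int i = 0; i < listOfWords.Count; i++)
{
//use dictionary here
// listOfWords still contains repeated words, and Hashtable.Add throws on a
// duplicate key, so assign through the indexer instead
tagCloud[listOfWords[i]] = (int)frequency[i];
}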
If all you want is the keyword and a count of how many times they appear in the file, then lazyberezovsky's solution is about as elegant of a solution as you will find. But if you need to do any other metrics on the file's data, then I would load the file into a collection that keeps your other metadata intact.
Something simple like:
var forecasts = File.ReadAllLines(file).Skip(1) // skip the header row
.Select(line => line.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries)) // split the line into an array of strings
.Select (f =>
new
{
Outlook = f[0],
Temperature = f[1],
Humidity = f[2],
Windy = f[3],
PlayTennis = f[4]
});
will give you an IEnumerable<> of an anonymous type that has properties that can be queried.
For example if you wanted to see how many times "sunny" occurred in the Outlook then you could just use LINQ to do this:
var count = forecasts.Count( f => f.Outlook == "sunny");
Or if you just wanted the list of all outlooks you could write:
var outlooks = forecasts.Select(f => f.Outlook).Distinct();
Where this is useful is when you want to do more complicated queries like "How many rainy cool days are there?"
var count = forecasts.Count (f => f.Outlook == "rain" && f.Temperature == "cool");
Again if you just want all words and their occurrence count, then this is overkill.

Convert array of strings to Dictionary<string, int> c# then output to Visual Studio

I have an array of strings like so:
[0]Board1
[1]Messages Transmitted75877814
[2]ISR Count682900312
[3]Bus Errors0
[4]Data Errors0
[5]Receive Timeouts0
[6]TX Q Overflows0
[7]No Handler Failures0
[8]Driver Failures0
[9]Spurious ISRs0
Just to clarify, the numbers in the square brackets indicate each string's position in the array.
I want to convert the array of strings to a dictionary, with the string to the left of each number acting as the key, for example (ISR Count, 682900312).
I then want to output specific entries in the dictionary to a text box/table in Visual Studio (whichever is better). It would be preferable for the numbers to be left-aligned.
Excuse my naivety, I'm a newbie!
Pretty simple. Tried and tested.
string[] arr = new string[] { "Board1", "ISR Count682900312", ... };
var numAlpha = new Regex("(?<Alpha>[a-zA-Z ]*)(?<Numeric>[0-9]*)");
// .Value turns each Group into its matched text; int.Parse yields a Dictionary<string, int>
var res = arr.ToDictionary(x => numAlpha.Match(x).Groups["Alpha"].Value,
x => int.Parse(numAlpha.Match(x).Groups["Numeric"].Value));
string[] strings =
{
"Board1", "Messages232"
};
Dictionary<string, int> dictionary = new Dictionary<string, int>();
foreach (var s in strings)
{
int index = 0;
for (int i = 0; i < s.Length; i++)
{
if (Char.IsDigit(s[i]))
{
index = i;
break;
}
}
dictionary.Add(s.Substring(0, index), int.Parse(s.Substring(index)));
}
var stringArray = new[]
{
"[0]Board1",
"[1]Messages Transmitted75877814",
"[2]ISR Count682900312",
"[3]Bus Errors0",
"[4]Data Errors0",
"[5]Receive Timeouts0",
"[6]TX Q Overflows0",
"[7]No Handler Failures0",
"[8]Driver Failures0",
"[9]Spurious ISRs0"
};
var resultDict = stringArray.Select(s => s.Substring(3))
.ToDictionary(s =>
{
int i = s.IndexOfAny("0123456789".ToCharArray());
return s.Substring(0, i);
},
s =>
{
int i = s.IndexOfAny("0123456789".ToCharArray());
return int.Parse(s.Substring(i));
});
EDIT: If the numbers in brackets are not included in the strings, remove .Select(s => s.Substring(3)).
Here you go:
string[] strA = new string[10]
{
"Board1",
"Messages Transmitted75877814",
"ISR Count682900312",
"Bus Errors0",
"Data Errors0",
"Receive Timeouts0",
"TX Q Overflows0",
"No Handler Failures0",
"Driver Failures0",
"Spurious ISRs0"
};
Dictionary<string, int> list = new Dictionary<string, int>();
foreach (var item in strA)
{
// this Regex matches any digit one or more times so it picks
// up all of the digits on the end of the string
var match = Regex.Match(item, @"\d+");
// this code will substring out the first part and parse the second as an int
list.Add(item.Substring(0, match.Index), int.Parse(match.Value));
}
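None of the answers above covers the output half of the question. One possible approach, sketched here under assumptions, is a composite format string with a negative alignment width, which pads the key so that every number starts in the same column. StatsFormatter, FormatEntry and the width of 22 are choices made for this example, and the columns only line up in a text box that uses a monospaced font.

```csharp
using System;
using System.Collections.Generic;

static class StatsFormatter
{
    // Pads the key to 22 characters (left-aligned), then appends the number.
    public static string FormatEntry(string key, int value)
    {
        return string.Format("{0,-22}{1}", key, value);
    }

    static void Main()
    {
        var stats = new Dictionary<string, int>
        {
            { "Messages Transmitted", 75877814 },
            { "ISR Count", 682900312 },
            { "Bus Errors", 0 }
        };

        // Output only the entries of interest; TryGetValue avoids an exception
        // if a key is missing.
        foreach (string key in new[] { "ISR Count", "Bus Errors" })
        {
            int value;
            if (stats.TryGetValue(key, out value))
            {
                Console.WriteLine(FormatEntry(key, value));
            }
        }
    }
}
```

In a Windows Forms text box, the formatted lines could be appended to the Text property instead of being written to the console.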

Count the occurrence of strings in an array [duplicate]

This question already has answers here:
A method to count occurrences in a list
(7 answers)
Counting words in a collection using LINQ
(7 answers)
Closed 9 years ago.
I am counting how many strings are present in the array:
Tags = "the cat the mat the sat";
string[] words = Tags.Split(' ');
int counter = 0;
foreach (string item in words)
{
if (item != "")
{
counter++;
}
}
However, how could I modify my code so that it counts the occurrences of every string?
So for instance -
"the" = 3
"cat" = 1
"mat" = 1
"sat" = 1
and then store these values some way?
You don't say what language you use, but from what I see it looks like C#. Here is one way to do it.
Dictionary<string, int> dictionary = new Dictionary<string, int>();
foreach (string word in words)
{
if (dictionary.ContainsKey(word))
{
dictionary[word] += 1;
}
else
{
dictionary.Add(word,1);
}
}
Try this:
var result = Tags.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(tag => tag)
.ToDictionary(group => group.Key, group => group.Count());
var max = result.MaxBy(kvp => kvp.Value);
var min = result.MinBy(kvp => kvp.Value);
using MaxBy and MinBy from MoreLINQ.
Store in a map where the key is the word and the value is a counter of how many times it appears....
You must use Dictionary. Here it is:
string Tags = "the cat the mat the sat";
string[] words = Tags.Split(' ');
Dictionary<string, int> oddw = new Dictionary<string, int>();
foreach (string item in words)
{
if (item != "")
{
if (oddw.ContainsKey(item) == false)
{
oddw.Add(item, 1);
}
else
{
oddw[item]++;
}
}
}
foreach (var item in oddw)
{
Console.WriteLine(item);
}
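The ContainsKey-then-update pattern used in the answers above can also be written with TryGetValue, which does a single lookup and leaves the out value at 0 when the key is missing. A minimal sketch, with TagCounter and Count as made-up names:

```csharp
using System;
using System.Collections.Generic;

static class TagCounter
{
    // Counts each space-separated word in the input string.
    public static Dictionary<string, int> Count(string tags)
    {
        var counts = new Dictionary<string, int>();
        foreach (string word in tags.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the word is new
            counts[word] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        foreach (var pair in Count("the cat the mat the sat"))
        {
            Console.WriteLine("\"{0}\" = {1}", pair.Key, pair.Value);
        }
    }
}
```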

C# Comparing two sorted lists and outputting to a file

I'm trying to compare a list of strings compiled together against a master list and print them out to a text file. The problem I'm having is the printable list remains empty. How do I populate the third list? And, is this a proper use of List<>, if not, what should I use?
Edit: Sorry about that. Prior to this method running, textInput and textCompare are read from two files and populated with strings 7 characters in length: one pulled from a text file, the other from an Excel sheet. I then remove any nulls and attempt to compare the two lists with listA.Intersect(listB). MSDN mentioned that the result needs to be enumerated for Intersect to work, which is why I put it in a foreach.
void Compare()
{
List<string> matches = new List<string>();
textInput.Sort();
textCompare.Sort();
progressBar.Maximum = textInput.Count;
int increment = 0;
for (int i = textCompare.Count - 1; i >= 0; i--)
{
if (textCompare[i] == null)
{
textCompare.RemoveAt(i);
}
}
foreach (string item in textInput)
{
matches = textInput.Intersect(textCompare).ToList();
increment++;
progressBar.Value = increment;
}
//A break point placed on the foreach reveals matches is empty.
foreach (object match in matches)
{
streamWriter.WriteLine(match);
}
doneLabel.Text = "Done!";
}
From the description in your comment this would do it:
var textOutput = textCompare.Where(s => !string.IsNullOrEmpty(s))
.Intersect(textInput)
.OrderBy(s => s);
File.WriteAllLines("outputfile.txt", textOutput);
Note that you can remove the .Where() condition provided you don't have empty strings in your masterlist "textInput" (very likely there aren't). Also, if order doesn't matter remove the .OrderBy(), you end up with this then:
var textOutput = textCompare.Intersect(textInput);
File.WriteAllLines("outputfile.txt", textOutput);
Not sure why you have this in the loop.
foreach (string item in textInput)
{
matches = textInput.Intersect(textCompare).ToList();
increment++;
progressBar.Value = increment;
}
you just need
matches = textInput.Intersect(textCompare).ToList();
if you try something like
List<string> matches = new List<string>();
List<string> textInput = new List<string>(new[] {"a", "b", "c"});
textInput.Sort();
List<string> textCompare = new List<string>(new[] { "b", "c", "d" }); ;
textCompare.Sort();
int increment = 0;
for (int i = textCompare.Count - 1; i >= 0; i--)
{
if (textCompare[i] == null)
{
textCompare.RemoveAt(i);
}
}
matches = textInput.Intersect(textCompare).ToList();
matches should have { "b", "c" }, so your problem might be somewhere else.
Consider something like this:
Are the 2 sort calls even necessary?
Use List<T>.RemoveAll to remove the blanks/nulls.
void Compare()
{
textCompare.RemoveAll(x => string.IsNullOrEmpty(x));
List<string> matches= textInput.Intersect(textCompare).ToList();
matches.ForEach(x=> streamWriter.WriteLine(x));
doneLabel.Text = "Done!";
}
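Since the Intersect call itself works on clean data, one thing worth ruling out is that the two sources differ in whitespace or letter case. That is purely an assumption, because the input files are not shown. The sketch below normalizes both lists before intersecting; ListMatcher, Clean and FindMatches are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ListMatcher
{
    // Drops null/blank entries and trims surrounding whitespace.
    static IEnumerable<string> Clean(IEnumerable<string> items)
    {
        return items.Where(s => !string.IsNullOrWhiteSpace(s)).Select(s => s.Trim());
    }

    // Returns the distinct values present in both lists, ignoring case,
    // in the order they appear in textInput.
    public static List<string> FindMatches(IEnumerable<string> textInput, IEnumerable<string> textCompare)
    {
        return Clean(textInput)
            .Intersect(Clean(textCompare), StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    static void Main()
    {
        var textInput = new List<string> { "a ", "b", "c" };
        var textCompare = new List<string> { "B", " c", "d", null };
        foreach (string match in FindMatches(textInput, textCompare))
        {
            Console.WriteLine(match);
        }
    }
}
```

If the exact-match version and this version give different results, the difference points at the data rather than at the comparison code.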
