Saving a split string to an arraylist using LINQ - c#

I have some code that takes a string and processes it by splitting it into words, and giving the count of each word.
The trouble is that it only returns void, because I am only able to print to the screen after the processing is done. Is there any way I can save the results in an ArrayList, so that I can return them to the method that called it?
The current code:
message.Split(' ').Where(messagestr => !string.IsNullOrEmpty(messagestr))
.GroupBy(messagestr => messagestr).OrderByDescending(groupCount => groupCount.Count())
.Take(20).ToList().ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
Thank you.

Try this code
var wordCountList = message.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(messagestr => messagestr)
.OrderByDescending(grp => grp.Count())
.Take(20) //or take the whole
.Select(grp => new KeyValuePair<string, int>(grp.Key, grp.Count()))
.ToList(); //return wordCountList
//usage
wordCountList.ForEach(item => Console.WriteLine("{0}\t{1}", item.Key, item.Value));
You can return wordCountList, which is a List<KeyValuePair<string, int>> containing the words and their counts in descending order of count.
The last line shows how to use it.
If you want all the words rather than only the first 20, remove the .Take(20) call.
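Since the question asks how to return the result to the caller, the query can be wrapped in a method. This is only a minimal sketch: the class name WordCounter, the method name CountWords and the top parameter are hypothetical and do not come from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Hypothetical helper: returns up to `top` words with their counts,
    // most frequent first.
    public static List<KeyValuePair<string, int>> CountWords(string message, int top = 20)
    {
        return message
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word)
            .OrderByDescending(grp => grp.Count())
            .Take(top)
            .Select(grp => new KeyValuePair<string, int>(grp.Key, grp.Count()))
            .ToList();
    }
}
```

The caller can then decide what to do with the list, for example print it with the ForEach call shown in the usage line.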

First of all, calling Take(20) keeps only the first 20 words and discards the rest. So, if you want all the results, remove it.
After that, you can do it like this:
var words = message.Split(' ').
Where(messagestr => !string.IsNullOrEmpty(messagestr)).
GroupBy(messagestr => messagestr).
OrderByDescending(groupCount => groupCount.Count()).
ToList();
words.ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
To put the results into some other data structure, you can use one of these ways:
var w = words.SelectMany(x => x.Distinct()).ToList(); // Add this line to get all the distinct words as a List<string>
// OR Use Dictionary
var dic = new Dictionary<string, int>();
foreach(var item in words)
{
dic.Add(item.Key, item.Count());
}
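The dictionary can also be built directly in the query with ToDictionary instead of a separate loop. A sketch under the assumption that a hypothetical helper named WordDictionary.Build is acceptable; GroupBy yields each word once as a key, so ToDictionary does not run into a duplicate-key exception here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordDictionary
{
    // Hypothetical helper: maps each distinct word to the number of times it occurs.
    public static Dictionary<string, int> Build(string message)
    {
        return message
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word)
            .ToDictionary(grp => grp.Key, grp => grp.Count());
    }
}
```

Note that a Dictionary does not guarantee any enumeration order, so if the descending order by count matters, the list-based result is the better fit.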

Related

How to modify string list for duplicate values?

I am working on an ASP.NET Core MVC project. I have a list of strings in which some keys are duplicated, and I want to merge the values of duplicate keys into a single comma-separated entry.
I build the list like this:
List<string> stringList = surveylist.Split('&').ToList();
This generates the following output:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF
1003=ABC
1=JKL
And I want to change the output to this:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF,ABC
1=JKL
Duplicate items values should be comma separated.
There are probably 20 ways to do this. One simple one would be:
List<string> newStringList = stringList
.Select(a => new { KeyValue = a.Split("=") })
.GroupBy(a => a.KeyValue[0])
.Select(a => $"{a.Select(x => x.KeyValue[0]).First()}={string.Join(",", a.Select(x => x.KeyValue[1]))}")
.ToList();
Take a look at your output. Notice that an equal sign separates each string into a key-value pair. Think about how you want to approach this problem. Is a list of strings really the structure you want to build on? You could take a different approach and use a list of KeyValuePairs or a Dictionary instead.
If you really need to do it with a List, then look at the methods LINQ's Enumerable has to offer. Namely Select and GroupBy.
You can use Select to split once more on the equal sign: .Select(s => s.Split('=')).
You can use GroupBy to group values by a key: .GroupBy(pair => pair[0]).
To join it back to a string, you can use a Select again.
An end result could look something like this:
List<string> stringList = values.Split('&')
.Select(s => {
string[] pair = s.Split('=');
return new { Key = pair[0], Value = pair[1] };
})
.GroupBy(pair => pair.Key)
.Select(g => string.Concat(
g.Key,
'=',
string.Join(
",",
g.Select(pair => pair.Value)
)
))
.ToList();
The group contains pairs so you need to select the value of each pair and join them into a string.
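The suggestion to build on a Dictionary rather than a list of strings could look roughly like this. It is a sketch, not the answerer's code: SurveyParser and Parse are hypothetical names, and it assumes every item contains an '=' sign.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SurveyParser
{
    // Hypothetical helper: "1003=DEF&1003=ABC&1=JKL" becomes
    // { "1003": [DEF, ABC], "1": [JKL] }.
    // Assumes every item contains '='; an item without it would throw on pair[1].
    public static Dictionary<string, List<string>> Parse(string survey)
    {
        return survey
            .Split('&')
            .Select(item => item.Split(new[] { '=' }, 2)) // split on the first '=' only
            .GroupBy(pair => pair[0], pair => pair[1])
            .ToDictionary(grp => grp.Key, grp => grp.ToList());
    }
}
```

The comma-separated form is then only needed at output time, for example with string.Join(",", result["1003"]).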

How can i remove same item in list

Could you help me? I can't remove the duplicated items from a list.
List<string> text = new List<string>();
text.Add("A");
text.Add("B");
text.Add("C");
text.Add("D");
text.Add("D");
text.Add("A");
foreach(string i in text)
{
Console.WriteLine(i);
}
The result is A,B,C,D,D,A, but I need it to be B,C. How can I do that?
Here is my solution,
var result = text.GroupBy(x => x).Where(y => y.Count() == 1).Select(z => z.Key);
Console.WriteLine(string.Join(", ", result));
Explanation:
GroupBy(x => x): Groups the list by the items themselves; x => x is the key selector.
.Where(y => y.Count() == 1): Keeps only the groups with exactly one element, which filters out the duplicated items.
.Select(z => z.Key): Creates a new enumerable containing the keys of the remaining groups.
Something like
text.GroupBy(t => t).Where(tg => tg.Count()==1).Select(td => td.First());
This is untested, so you may need to adjust it yourself.
The idea is:
1. Group by item.
2. Take all groups with exactly 1 item
3. Select the Item
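The three steps can be sketched as a small generic helper. The names UniqueItems and OccurringOnce are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueItems
{
    // Hypothetical helper: keeps only the items that occur exactly once.
    public static List<T> OccurringOnce<T>(IEnumerable<T> items)
    {
        return items
            .GroupBy(item => item)          // 1. group by item
            .Where(grp => grp.Count() == 1) // 2. keep groups with exactly one item
            .Select(grp => grp.Key)         // 3. select the item
            .ToList();
    }
}
```

For the list from the question, UniqueItems.OccurringOnce(text) yields B and C, because GroupBy returns the groups in the order in which each key first appears.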

LINQ returns List<{int,double}> two values after .selected but I need List<int> with one value only

I have got this assignment. I need to create a method that works with JSON data and answers this question:
Given an input N, what are the top N movies? The score of a movie is its average grade.
I have a JSON file with 5 million entries. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then saved as an IEnumerable. I then wanted to create a method that returns a List<int>, where each int is a movie ID. The movies in the list are ordered by average grade, descending, and the number of "top" movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.ToList();
var moviesSortedList = new List<int>();
foreach (var movie in moviesAndAverageGradeSortedList)
{
var key = movie.Key;
moviesSortedList.Add(key);
}
return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int,double}> because of the .Select call. I cannot return it directly, because the method's return type is List<int>: I want only the movie IDs, not their average grades.
So I created a new List<int> and a foreach loop that goes through moviesAndAverageGradeSortedList and saves only the keys from that list.
I don't think this solution is right, because the foreach loop could become very slow when I pass a big number as the parameter. Does anybody know how I can get the keys (movie IDs) from the first list and so avoid creating another List<int> and the foreach loop?
I will be thankful for every solution.
You can avoid the second list creation by just adding another .Select after the ordering. Also to make it all a bit cleaner you could:
return _deserializator.RatingCollection()
.GroupBy(i => i.Movie)
.OrderByDescending(g => g.Average(i => i.Grade))
.Select(g => g.Key)
.Take(amountOfMovies)
.ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is built only from the first n items. The expensive operation is ordering the groups by their averages, and that has to run over all items in the JSON file regardless of how many items you want to return.
You could add another select after you have ordered the list by average
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.Select(s=> s.Key)
.ToList();
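To try either variant without the original deserializer, here is a self-contained sketch. The Rating class and the MovieStats.TopMovies helper are hypothetical; only the property names Reviewer, Movie and Grade are taken from the JSON rows in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of one JSON row (the Date field is omitted for brevity).
public class Rating
{
    public int Reviewer { get; set; }
    public int Movie { get; set; }
    public int Grade { get; set; }
}

public static class MovieStats
{
    // Returns the IDs of the `amountOfMovies` movies with the highest average grade.
    public static List<int> TopMovies(IEnumerable<Rating> ratings, int amountOfMovies)
    {
        return ratings
            .GroupBy(r => r.Movie)
            .OrderByDescending(g => g.Average(r => r.Grade))
            .Take(amountOfMovies)
            .Select(g => g.Key)
            .ToList();
    }
}
```

With the four sample rows from the question, movie 1535440 has an average grade of 3 and movie 1666666 has an average of 2, so 1535440 comes first.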

IEnumerable<string> to Dictionary<char, IEnumerable<string>>

I suppose this question might partially duplicate other similar questions, but I'm having trouble with the following situation:
I want to extract the words from a sentence.
For example, from
`string sentence = "We can store these chars in separate variables. We can also test against other string characters.";`
I want to build an IEnumerable<string> of words:
var separators = new[] {',', ' ', '.'};
IEnumerable<string> words = sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries);
After that, go through all these words and collect their first characters into a distinct collection of characters in ascending order.
var firstChars = words.Select(x => x.ToCharArray().First()).OrderBy(x => x).Distinct();
After that, go through both collections and, for each character in firstChars, get all the items from words whose first character equals the current character, and build a Dictionary<char, IEnumerable<string>> from them.
I'm doing it this way:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.ToDictionary(x => x);
and here is the problem: An item with the same key has already been added.
This is because it tries to add a key that already exists in the dictionary.
I added a GroupBy call to my query:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.GroupBy(x => x)
.ToDictionary(x => x);
The solution above runs without errors, but it gives me a different type than I need.
What should I do to get a Dictionary<char, IEnumerable<string>> as the result, rather than a Dictionary<IGrouping<'a,'a>>?
The result I want is as in the image below:
But here I have to iterate with two foreach loops to show what I want, and I don't understand well how this happens.
Any suggestions and advice are welcome. Thank you.
As the relation is one to many, you can use a lookup instead of a dictionary:
var lookup = words.ToLookup(word => word[0]);
// lookup['s'] -> store, separate, ... as an IEnumerable<string>
And if you want to display the key/values sorted by first char:
foreach (var sortedEntry in lookup.OrderBy(entry => entry.Key))
{
Console.WriteLine(string.Format("First letter: {0}", sortedEntry.Key));
foreach (string word in sortedEntry)
{
Console.WriteLine(word);
}
}
You can do this:
var words = ...
var dictionary = words.GroupBy(w => w[0])
.ToDictionary(g => g.Key, g => g.AsEnumerable());
But for that matter, why not use an ILookup?
var lookup = words.ToLookup(w => w[0]);
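Put together with the splitting step from the question, the GroupBy and ToDictionary approach could be sketched as follows. FirstLetterIndex and Build are hypothetical names; the separators are the ones from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstLetterIndex
{
    // Hypothetical helper: maps each first character to the words starting with it.
    // The comparison is case-sensitive, so 'W' and 'w' are different keys.
    public static Dictionary<char, IEnumerable<string>> Build(string sentence)
    {
        var separators = new[] { ',', ' ', '.' };
        return sentence
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word[0])
            .ToDictionary(grp => grp.Key, grp => grp.AsEnumerable());
    }
}
```

A Dictionary does not guarantee enumeration order, so to display the entries sorted by first character, order them at output time, for example with dictionary.OrderBy(entry => entry.Key).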

Converting Collection of Strings to Dictionary

This is probably a simple question, but the answer is eluding me.
I have a collection of strings that I'm trying to convert to a dictionary.
Each string in the collection is a comma-separated list of values that I obtained from a regex match. I would like the key for each entry in the dictionary to be the fourth element in the comma-separated list, and the corresponding value to be the second element in the comma-separated list.
When I attempt a direct call to ToDictionary, I end up in some kind of loop that appears to kick me off the BackgroundWorker thread I'm in:
var MoveFromItems = matches.Cast<Match>()
.SelectMany(m => m.Groups["args"].Captures
.Cast<Capture>().Select(c => c.Value));
var dictionary1 = MoveFromItems.ToDictionary(s => s.Split(',')[3],
s => s.Split(',')[1]);
When I create the dictionary manually, everything works fine:
var MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
if(!MoveFroms.ContainsKey(sKey))
MoveFroms[sKey.ToUpper()] = sVal;
}
I appreciate any help you might be able to provide.
The problem is most likely that the keys have duplicates. You have three options.
Keep First Entry (This is what you're currently doing in the foreach loop)
Keys only have one entry, the first one that shows up - meaning you can have a Dictionary:
var first = MoveFromItems.Select(x => x.Split(','))
.GroupBy(x => x[3])
.ToDictionary(x => x.Key, x => x.First()[1]);
Keep All Entries, Grouped
Keys will have more than one entry (each key returns an Enumerable), and you use a Lookup instead of a Dictionary:
var lookup = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);
Keep All Entries, Flattened
No such thing as a key, simply a flattened list of entries:
var flat = MoveFromItems.Select(x => x.Split(','))
.Select(x => new KeyValuePair<string,string>(x[3], x[1]));
You could also use a tuple here (Tuple.Create(x[3], x[1]);) instead.
Note: You will need to decide where/if you want the keys to be upper or lower case in these cases. I haven't done anything related to that yet. If you want to store the key as upper, just change x[3] to x[3].ToUpper() in everything above.
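As an illustration of the note about casing, here is the "keep first entry" option with upper-cased keys. This is a sketch with hypothetical names (MoveFromIndex, KeepFirst); it assumes every input string has at least four comma-separated values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MoveFromIndex
{
    // Hypothetical helper: key = 4th value (upper-cased), value = 2nd value.
    // When a key occurs more than once, the first entry wins.
    public static Dictionary<string, string> KeepFirst(IEnumerable<string> items)
    {
        return items
            .Select(item => item.Split(','))
            .GroupBy(parts => parts[3].ToUpper())
            .ToDictionary(grp => grp.Key, grp => grp.First()[1]);
    }
}
```

Because the grouping key is already upper-cased, "key" and "KEY" land in the same group, so ToDictionary sees each key only once.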
This splits each item and selects key out of the 4th split-value, and value out of the 2nd split-value, all into a dictionary.
var dictionary = MoveFromItems.Select(s => s.Split(','))
.ToDictionary(split => split[3],
split => split[1]);
There is no point in splitting the string twice, just to use different indices.
This would be just like saving the split results into a local variable, then using it to access index 3 and 1.
However, if indeed you don't know if keys might reoccur, I would go for the simple loop you've implemented, without a doubt.
Although you have a small bug in your loop:
var MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
// sKey might not exist as a key
if (!MoveFroms.ContainsKey(sKey))
//if (!MoveFroms.ContainsKey(sKey.ToUpper()))
{
// but sKey.ToUpper() might exist!
MoveFroms[sKey.ToUpper()] = sVal;
}
}
Use ContainsKey(sKey.ToUpper()) in your condition as well if you really want the keys in upper case.
This splits each string in MoveFromItems on ',' and uses the 4th item (index 3) as the key and the 2nd item (index 1) as the value. Because it uses ToLookup, the result is an ILookup<string, string>, so duplicate keys do not throw.
var dict = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);
