edit: Thanks Jason, the fact that it was a dictionary isn't that important. I just wanted the solution to have a low runtime. Is that LINQ method fast? Also, I know this is off topic, but what does the n => n mean?
I have a list of numbers and I want to make another list with the numbers that appear most at the beginning and the least at the end.
So what I did was go through the list and check whether each number x was in the dictionary. If it wasn't, I added x as a key with a value of one. If it was, I changed the value to be the value plus one.
Now I want to order the dictionary so that I can make a list with the ones that appear the most at the beginning and the least at the end.
How can I do that in C#?
ps. runtime is very important.
So it sounds like you have a Dictionary<int, int> where each key is an integer from your list and the corresponding value is the number of times that integer appeared. You are saying that you want the keys ordered by frequency, descending. Then you can say
// dict is Dictionary<int, int>
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
Now, it sounds like you started with a List<int> which are the values that you want to count and order by count. You can do this very quickly in LINQ like so:
// list is IEnumerable<int> (e.g., List<int>)
var ordered = list.GroupBy(n => n)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.ToList();
Or in query syntax
var ordered = (from n in list
group n by n into g
orderby g.Count() descending
select g.Key).ToList();
Now, if you need to have the intermediate dictionary you can say
var dict = list.GroupBy(n => n)
.ToDictionary(g => g.Key, g => g.Count());
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
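As a quick check of the two routes above, here is a minimal sketch assuming a small made-up input list (the values are hypothetical, not from the question). It also touches on the `n => n` question: that is a lambda that takes an element `n` and returns it unchanged, so each number serves as its own grouping key. Both routes count in one pass over the list and then sort only the distinct numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample input; any IEnumerable<int> works.
var list = new List<int> { 3, 1, 3, 2, 3, 2 };

// Route 1: count by hand into a dictionary, then order the keys by count.
var counts = new Dictionary<int, int>();
foreach (var x in list)
{
    counts[x] = counts.TryGetValue(x, out var c) ? c + 1 : 1;
}
var viaDictionary = counts.Keys.OrderByDescending(k => counts[k]).ToList();

// Route 2: let GroupBy do the counting.
// n => n means "use the number itself as the grouping key".
var viaGroupBy = list.GroupBy(n => n)
                     .OrderByDescending(g => g.Count())
                     .Select(g => g.Key)
                     .ToList();

Console.WriteLine(string.Join(", ", viaDictionary)); // 3, 2, 1
Console.WriteLine(string.Join(", ", viaGroupBy));    // 3, 2, 1
```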
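Use the GroupBy extension on IEnumerable<T> to group the numbers and extract the count of each. This creates the dictionary from the list and orders it in one statement. (Note that Dictionary<TKey, TValue> does not document a guaranteed enumeration order, so don't rely on the ordering surviving inside the dictionary.)
var ordered = list.GroupBy( l => l )
.OrderByDescending( g => g.Count() )
.ToDictionary( g => g.Key, g => g.Count() );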
You may also consider using SortedDictionary.
It keeps the items sorted by key as they are inserted.
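To make the "sorted by key" point concrete, here is a small sketch with made-up counts (hypothetical data). It suggests that a SortedDictionary alone does not give a most-frequent-first order, because the counts live in the values; an explicit sort on the values is still needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical counts: number -> occurrences.
var sorted = new SortedDictionary<int, int>
{
    [7] = 5,
    [2] = 1,
    [4] = 3,
};

// Enumeration follows key order, regardless of the counts.
var keysInOrder = sorted.Keys.ToList();

// Ordering by count still needs an explicit sort on the values.
var byCount = sorted.OrderByDescending(kv => kv.Value)
                    .Select(kv => kv.Key)
                    .ToList();

Console.WriteLine(string.Join(", ", keysInOrder)); // 2, 4, 7
Console.WriteLine(string.Join(", ", byCount));     // 7, 4, 2
```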
List<KeyValuePair<type, type>> listEquivalent =
new List<KeyValuePair<type, type>>(dictionary);
listEquivalent.Sort((first,second) =>
{
return first.Value.CompareTo(second.Value);
});
Something like that maybe?
edit: Thanks Jason for the notice on my omission
Related
I have got this assignment. I need to create method which works with JSON data in this form:
On input N, what is top N of movies? The score of a movie is its average rate
So I have a JSON file with 5 mil. movies inside. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then saved as an IEnumerable. Then I wanted to create a method that returns a List<int>, where each int is a MovieId. Movies in the list are ordered descending, and the number of "top" movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.ToList();
var moviesSortedList = new List<int>();
foreach (var movie in moviesAndAverageGradeSortedList)
{
var key = movie.Key;
moviesSortedList.Add(key);
}
return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int,double}> because of the .Select call. I could not return it directly, because the method's return type is List<int>: I want only the movieIds, not their average grades.
So I created a new List<int> and a foreach loop that goes through moviesAndAverageGradeSortedList and saves only the Keys from that list.
I think this solution is not correct, because the foreach loop can become very slow when I pass a big number as the parameter. Does somebody know how I can get the "Keys" (movieIds) from the first list and so avoid creating another List<int> and the foreach loop?
I will be thankful for every solution.
You can avoid the second list creation by just adding another .Select after the ordering. Also to make it all a bit cleaner you could:
return _deserializator.RatingCollection()
.GroupBy(i => i.Movie)
.OrderByDescending(g => g.Average(i => i.Grade))
.Select(g => g.Key)
.Take(amountOfMovies)
.ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is created only from the first n items. The expensive operation is the ordering by the groups' averages, which has to be performed over all items in the JSON file regardless of the number of items you want to return.
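For a quick sanity check, the chain can be run against the four sample rows from the question. This is only a sketch: the `ratings` array is a stand-in for `_deserializator.RatingCollection()`, and named tuples replace the real rating type, which the question does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for _deserializator.RatingCollection(); tuples are an assumption.
var ratings = new[]
{
    (Reviewer: 1, Movie: 1535440, Grade: 1),
    (Reviewer: 1, Movie: 1666666, Grade: 2),
    (Reviewer: 2, Movie: 1535440, Grade: 3),
    (Reviewer: 3, Movie: 1535440, Grade: 5),
};

int amountOfMovies = 1;

// Group by movie, order the groups by average grade, keep only the ids.
List<int> top = ratings.GroupBy(r => r.Movie)
                       .OrderByDescending(g => g.Average(r => r.Grade))
                       .Select(g => g.Key)
                       .Take(amountOfMovies)
                       .ToList();

// Movie 1535440 averages (1 + 3 + 5) / 3 = 3, movie 1666666 averages 2.
Console.WriteLine(string.Join(", ", top)); // 1535440
```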
You could add another select after you have ordered the list by average
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
.GroupBy(movieId => movieId.Movie)
.Select(group => new
{
Key = group.Key,
Average = group.Average(g => g.Grade)
})
.OrderByDescending(a => a.Average)
.Take(amountOfMovies)
.Select(s=> s.Key)
.ToList();
So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
**.Select(x => new ReportItem(x.Key, x[0], x[1]));**
Obviously last line doesn't work because the grouping doesn't support array indexing like that. The item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Where(g => g.Count() == 2) // Make sure we have exactly two items
.Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it makes sense for ReportItem to accept an IEnumerable, not just two strings?
I've checked many solutions on different sites but couldn't find what I was looking for. I'm working on a dictionary object with different Values against Keys. The structure is as follows:
Key Value
6 4
3 4
2 2
1 1
If the dictionary contains elements like this, the output should be 6 and 3. If only Key (6) has the highest value, it should print only 6. However, if all the values are the same for every key, it should print all the keys.
Trying to use the following but it only prints the highest Value.
var Keys_ = dicCommon.GroupBy(x => x.Value).Max(p => p.Key);
Any ideas?
Instead of using Max(x=>x.Key), use .OrderByDescending(x=>x.Key) and .FirstOrDefault(); that will give you the group that has the max value. You can then iterate over the group and display whatever you need.
var dicCommon = new Dictionary<int, int>();
dicCommon.Add(6, 4);
dicCommon.Add(3, 4);
dicCommon.Add(2, 2);
dicCommon.Add(1, 1);
var maxGroup = dicCommon.GroupBy(x => x.Value).OrderByDescending(x => x.Key).FirstOrDefault();
foreach (var keyValuePair in maxGroup)
{
Console.WriteLine("Key: {0}, Value {1}", keyValuePair.Key, keyValuePair.Value);
}
First off, a query can't return one result and more than one result at the same time, so you need to pick one.
In this case, if you want all Keys that have the highest corresponding Value, you can sort the groups by Value and then just take the first group, which has the highest Value:
var Keys_ = dicCommon.GroupBy(x => x.Value)
.OrderByDescending(g => g.Key)
.First()
.Select(x => x.Key)
.ToList();
var keys = String.Join(",", dicCommon
.OrderByDescending(x=>x.Value)
.GroupBy(x => x.Value)
.First()
.Select(x=>x.Key));
You’re almost there:
dicCommon.GroupBy(x => x.Value)
.OrderByDescending(pair => pair.First().Value)
.First().Select(pair => pair.Key).ToList()
GroupBy returns an enumerable of IGrouping. So sort these descending by value, then get the first, and select the key of each containing element.
Since this requires sorting, the runtime complexity is not linear, although a linear solution is easy to write. One way is to find the maximum value first and then get all the keys whose value is equal to it:
int maxValue = dicCommon.Max(x => x.Value);
List<int> maxKeys = dicCommon.Where(x => x.Value == maxValue).Select(x => x.Key).ToList();
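The two-pass approach can be checked against the three cases described in the question. This is a sketch under assumptions: `KeysWithMaxValue` is a hypothetical helper name, the second and third dictionaries are made-up inputs, and the extra `OrderBy` is there only to make the printed order predictable, since Dictionary enumeration order is not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper wrapping the two passes shown above.
// Note: Max throws InvalidOperationException on an empty dictionary.
List<int> KeysWithMaxValue(Dictionary<int, int> dict)
{
    int maxValue = dict.Max(x => x.Value);
    return dict.Where(x => x.Value == maxValue)
               .Select(x => x.Key)
               .OrderBy(k => k) // only for deterministic output
               .ToList();
}

// Case 1: two keys share the highest value (data from the question).
var tied = KeysWithMaxValue(new Dictionary<int, int> { [6] = 4, [3] = 4, [2] = 2, [1] = 1 });

// Case 2: a single key has the highest value (made-up data).
var single = KeysWithMaxValue(new Dictionary<int, int> { [6] = 5, [3] = 4 });

// Case 3: every key has the same value (made-up data).
var allSame = KeysWithMaxValue(new Dictionary<int, int> { [1] = 2, [2] = 2, [3] = 2 });

Console.WriteLine(string.Join(", ", tied));    // 3, 6
Console.WriteLine(string.Join(", ", single));  // 6
Console.WriteLine(string.Join(", ", allSame)); // 1, 2, 3
```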
So far I have this:
List<Item> duplicates = items.GroupBy(x => x.Id)
.SelectMany(g => g.Skip(1)).ToList();
List<Item> nonDuplicates = items.GroupBy(x => x.Id)
.Select(x => x.First()).ToList();
Is there a more efficient way to do this (i.e. one select)?
Example input:
Id Value (added for some perspective)
-- -----
1 12
1 909
1231 0
1 577
Example Output:
duplicates -> {1, 909}, {1, 577}
non-duplicates -> {1, 12}, {1231, 0}
If you really want to avoid doing the actual grouping more than once, and thus avoid iterating the source sequence more than once, you can group the items, materialize that query into a list, and then grab the info that you want from that list.
var query = items.GroupBy(x => x.Id)
.ToList();
var duplicates = query.SelectMany(group => group.Skip(1));
var nonDuplicates = query.Select(group => group.First());
Having said that, grouping items isn't a particularly expensive operation, so this may not actually be a huge win. Odds are reasonably high that your existing code is "good enough".
I'd be mostly interested in doing this if I wasn't confident that the source sequence would return the same items if iterated multiple times, or if it's say an IQueryable that needs to do a round trip to the database to get the items. In those cases this is a change worth implementing.
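The effect of materializing the groups can be observed with a source that counts how often it is enumerated. This is a sketch, not the asker's code: `Source` is a hypothetical iterator standing in for `items`, and named tuples replace the `Item` type, using the example rows from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int enumerations = 0;

// Hypothetical source that records each time it is iterated.
IEnumerable<(int Id, int Value)> Source()
{
    enumerations++;
    yield return (1, 12);
    yield return (1, 909);
    yield return (1231, 0);
    yield return (1, 577);
}

// Group once and materialize, then derive both results from the list.
var groups = Source().GroupBy(x => x.Id).ToList();
var duplicates = groups.SelectMany(g => g.Skip(1)).ToList();
var nonDuplicates = groups.Select(g => g.First()).ToList();

Console.WriteLine(enumerations);                     // 1
Console.WriteLine(string.Join(", ", duplicates));    // (1, 909), (1, 577)
Console.WriteLine(string.Join(", ", nonDuplicates)); // (1, 12), (1231, 0)
```

Calling `Source().GroupBy(...)` separately for each result, as in the question, would iterate the source twice.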
Get the first one for each Id, then use Except to get the others.
List<Item> nonDupes = items.GroupBy(x => x.Id).Select(x => x.First()).ToList();
List<Item> dupes = items.Except(nonDupes).ToList();
This is, however, assuming that Equals hasn't been overridden to be simply the Id.
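The equality caveat can be illustrated with a sketch in which value tuples stand in for an `Item` type that compares by value (an assumption made purely for illustration; the data is made up). `Except` is a set difference based on the default equality comparer, so any element that is equal to one of the "first" items is dropped, even when it is a genuine duplicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Value tuples compare by value, so the first two entries are equal.
var items = new[]
{
    (Id: 1, Value: 12),
    (Id: 1, Value: 12),
    (Id: 1231, Value: 0),
};

var nonDupes = items.GroupBy(x => x.Id).Select(g => g.First()).ToList();

// Except removes everything equal to an element of nonDupes,
// so the second (1, 12) disappears as well.
var dupesViaExcept = items.Except(nonDupes).ToList();

// Skipping the first element of each group keeps it.
var dupesViaSkip = items.GroupBy(x => x.Id).SelectMany(g => g.Skip(1)).ToList();

Console.WriteLine(dupesViaExcept.Count); // 0
Console.WriteLine(dupesViaSkip.Count);   // 1
```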
EDIT: And here's a fiddle: http://dotnetfiddle.net/4GaPK4
var result = items.GroupBy(x => x.Id)
.Select(g => new {
// Count() is a method call, and Where expects a predicate
Dups = g.Where(_ => g.Count() > 1),
NonDups = g.Where(_ => g.Count() == 1) })
.ToList();
Not in one query, but arguably more efficient approach can be achieved by using DistinctBy from MoreLinq:
var nonDuplicates = items.DistinctBy(i => i.Id);
var duplicates = items.Except(nonDuplicates);
I would like to take a list of objects and convert it to a dictionary where the key is a field in the object, and the value is a list of a different field in the objects that match on the key. I can do this now with a loop but I feel this should be able to be accomplished with linq and not having to write the loop. I was thinking a combination of GroupBy and ToDictionary but have been unsuccessful so far.
Here's how I'm doing it right now:
var samplesWithSpecificResult = new Dictionary<string, List<int>>();
foreach(var sample in sampleList)
{
List<int> sampleIDs = null;
if (samplesWithSpecificResult.TryGetValue(sample.ResultString, out sampleIDs))
{
sampleIDs.Add(sample.ID);
continue;
}
sampleIDs = new List<int>();
sampleIDs.Add(sample.ID);
samplesWithSpecificResult.Add(sample.ResultString, sampleIDs);
}
The farthest I can get with .GroupBy().ToDictionary() is Dictionary<sample.ResultString, List<sample>>.
Any help would be appreciated.
Try the following
var dictionary = sampleList
.GroupBy(x => x.ResultString, x => x.ID)
.ToDictionary(x => x.Key, x => x.ToList());
The GroupBy clause will group every Sample instance in the list by its ResultString member, but it will keep only the Id part of each sample. This means every element will be an IGrouping<string, int>.
The ToDictionary portion uses the Key of the IGrouping<string, int> as the dictionary Key. IGrouping<string, int> implements IEnumerable<int> and hence we can convert that collection of samples' Id to a List<int> with a call to ToList, which becomes the Value of the dictionary for that given Key.
Yeah, super simple. The key is that when you do a GroupBy on IEnumerable<T>, each "group" is an object that implements IEnumerable<T> as well (that's why I can say g.Select below, and I'm projecting the elements of the original sequence with a common key):
var dictionary =
sampleList.GroupBy(x => x.ResultString)
.ToDictionary(
g => g.Key,
g => g.Select(x => x.ID).ToList()
);
See, the result of sampleList.GroupBy(x => x.ResultString) is an IEnumerable<IGrouping<string, Sample>> and IGrouping<T, U> implements IEnumerable<U> so that every group is a sequence of Sample with the common key!
Dictionary<string, List<int>> resultDictionary =
(
from sample in sampleList
group sample.ID by sample.ResultString
).ToDictionary(g => g.Key, g => g.ToList());
You might want to consider using a Lookup instead of the Dictionary of Lists
ILookup<string, int> idLookup = sampleList.ToLookup(
sample => sample.ResultString,
sample => sample.ID
);
used thusly
foreach(IGrouping<string, int> group in idLookup)
{
string resultString = group.Key;
List<int> ids = group.ToList();
//do something with them.
}
//and
List<int> ids = idLookup[resultString].ToList();
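One practical difference between the two shapes shows up with keys that are not present. The sketch below uses made-up samples (named tuples in place of the sample type, with hypothetical result strings): indexing a lookup with a missing key yields an empty sequence, while a dictionary needs TryGetValue to avoid an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical samples; tuples stand in for the real sample type.
var sampleList = new[]
{
    (ID: 1, ResultString: "pass"),
    (ID: 2, ResultString: "fail"),
    (ID: 3, ResultString: "pass"),
};

ILookup<string, int> idLookup = sampleList.ToLookup(s => s.ResultString, s => s.ID);

var passIds = idLookup["pass"].ToList();       // ids 1 and 3
var missingIds = idLookup["unknown"].ToList(); // empty, no exception

// The dictionary version reports a missing key instead.
var dictionary = sampleList.GroupBy(s => s.ResultString, s => s.ID)
                           .ToDictionary(g => g.Key, g => g.ToList());
bool found = dictionary.TryGetValue("unknown", out var ignored);

Console.WriteLine(string.Join(", ", passIds)); // 1, 3
Console.WriteLine(missingIds.Count);           // 0
Console.WriteLine(found);                      // False
```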
var samplesWithSpecificResult =
sampleList.GroupBy(s => s.ResultString)
.ToDictionary(g => g.Key, g => g.Select(s => s.ID).ToList());
What we're doing here is grouping the samples based on their ResultString -- this puts them into an IGrouping<string, Sample>. Then we project the collection of IGroupings to a dictionary, using the Key of each as the dictionary key and enumerating over each grouping (IGrouping<string, Sample> is also an IEnumerable<Sample>) to select the ID of each sample to make a list for the dictionary value.