LINQ returns List<{int,double}> (two values) after .Select, but I need List<int> with one value only - c#

I have got this assignment. I need to create a method which works with JSON data and answers this question:
On input N, what are the top N movies? The score of a movie is its average rating.
So I have a JSON file with 5 mil. movies inside. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then saved as an IEnumerable. Then I wanted to create a method which returns a List<int>, where each int is a MovieId. Movies in the list are ordered by descending score, and the number of "top" movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
    var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
        .GroupBy(movieId => movieId.Movie)
        .Select(group => new
        {
            Key = group.Key,
            Average = group.Average(g => g.Grade)
        })
        .OrderByDescending(a => a.Average)
        .Take(amountOfMovies)
        .ToList();
    var moviesSortedList = new List<int>();
    foreach (var movie in moviesAndAverageGradeSortedList)
    {
        var key = movie.Key;
        moviesSortedList.Add(key);
    }
    return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int,double}> because of the .Select method. I could not return this value directly, because the method's return type is List<int>: I want only the movieIds, not their average grades.
So I created a new List<int> and then a foreach loop which goes through moviesAndAverageGradeSortedList and saves only the Keys from that list.
I think this solution is not correct, because the foreach loop can become very slow when I pass a big number as the parameter. Does somebody know how I can get the "Keys" (movieIds) from the first list and therefore avoid creating another List<int> and the foreach loop?
I will be thankful for every solution.

You can avoid the second list creation by just adding another .Select after the ordering. To make it all a bit cleaner, you could also write:
return _deserializator.RatingCollection()
    .GroupBy(i => i.Movie)
    .OrderByDescending(g => g.Average(i => i.Grade))
    .Select(g => g.Key)
    .Take(amountOfMovies)
    .ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is created only from the subset of the first n items. The expensive operations are the grouping and the ordering by the averages of the groups, and those have to be performed on all items in the JSON file, regardless of the number of items you want to return.
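A self-contained sketch of the query above may make it easier to try out. The Rating record and the in-memory sample are assumptions that stand in for the question's _deserializator.RatingCollection(), whose real types are not shown; the sample rows are the four rows from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one deserialized JSON row (requires C# 9 or later).
public record Rating(int Reviewer, int Movie, int Grade);

public static class TopMovies
{
    // Returns the ids of the `amountOfMovies` movies with the highest average grade.
    public static List<int> GetTopMovieIds(IEnumerable<Rating> ratings, int amountOfMovies)
    {
        return ratings
            .GroupBy(r => r.Movie)
            .OrderByDescending(g => g.Average(r => r.Grade))
            .Select(g => g.Key)      // keep only the movie id
            .Take(amountOfMovies)
            .ToList();
    }

    public static void Main()
    {
        var ratings = new[]
        {
            new Rating(1, 1535440, 1),
            new Rating(1, 1666666, 2),
            new Rating(2, 1535440, 3),
            new Rating(3, 1535440, 5),
        };
        // 1535440 averages (1 + 3 + 5) / 3 = 3.0; 1666666 averages 2.0.
        Console.WriteLine(string.Join(", ", GetTopMovieIds(ratings, 1))); // prints 1535440
    }
}
```

Projecting to the key before Take means no anonymous type and no second list are needed; the grouping and sorting still run over every rating.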

You could add another Select after you have ordered the list by average:
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
    .GroupBy(movieId => movieId.Movie)
    .Select(group => new
    {
        Key = group.Key,
        Average = group.Average(g => g.Grade)
    })
    .OrderByDescending(a => a.Average)
    .Take(amountOfMovies)
    .Select(s => s.Key)
    .ToList();

Related

C# Linq union multiple properties to one list

Basically I have an object with 2 different properties, both int and I want to get one list with all values from both properties. As of now I have a couple of linq queries to do this for me, but I am wondering if this could be simplified somehow -
var componentsWithDynamicApis = result
    .Components
    .Where(c => c.DynamicApiChoicesId.HasValue ||
                c.DynamicApiSubmissionsId.HasValue);
var choiceApis = componentsWithDynamicApis
    .Select(c => c.DynamicApiChoicesId.Value);
var submissionApis = componentsWithDynamicApis
    .Select(c => c.DynamicApiSubmissionsId.Value);
var dynamicApiIds = choiceApis
    .Union(submissionApis)
    .Distinct();
Not every component will have both Choices and Submissions.
By simplify, I assume you want to combine into fewer statements. You can also simplify in terms of execution by reducing the number of times you iterate the collection (the current code does it 3 times).
One way is to use a generator function (assuming the type of items in your result.Components collection is Component):
IEnumerable<int> GetIds(IEnumerable<Component> components)
{
    foreach (var component in components)
    {
        if (component.DynamicApiChoicesId.HasValue) yield return component.DynamicApiChoicesId.Value;
        if (component.DynamicApiSubmissionsId.HasValue) yield return component.DynamicApiSubmissionsId.Value;
    }
}
Another option is to use SelectMany. The trick there is to create a temporary enumerable holding the appropriate values of DynamicApiChoicesId and DynamicApiSubmissionsId. I can't think of a one-liner for this, but here is one option:
var dynamicApiIds = result
    .Components
    .SelectMany(c => {
        var temp = new List<int>();
        if (c.DynamicApiChoicesId.HasValue) temp.Add(c.DynamicApiChoicesId.Value);
        if (c.DynamicApiSubmissionsId.HasValue) temp.Add(c.DynamicApiSubmissionsId.Value);
        return temp;
    })
    .Distinct();
@Eldar's answer gave me an idea for an improvement on option #2:
var dynamicApiIds = result
    .Components
    .SelectMany(c => new[] { c.DynamicApiChoicesId, c.DynamicApiSubmissionsId })
    .Where(c => c.HasValue)
    .Select(c => c.Value)
    .Distinct();
Similar to some of the other answers, but I think this covers all your bases with a very minimal amount of code.
var dynamicApiIds = result.Components
    .SelectMany(c => new[] { c.DynamicApiChoicesId, c.DynamicApiSubmissionsId }) // combine
    .OfType<int>() // remove nulls
    .Distinct();
To map each element in the source list onto more than one element in the destination list, you can use SelectMany.
// Caution: .Value throws InvalidOperationException when an id is null,
// so this only works if every component has both ids.
var combined = componentsWithDynamicApis
    .SelectMany(x => new[] { x.DynamicApiChoicesId.Value, x.DynamicApiSubmissionsId.Value })
    .Distinct();
I have not tested it, but you can use SelectMany and filter out the null values as below:
var componentsWithDynamicApis = result
    .Components
    .Select(r => new[] { r.DynamicApiChoicesId, r.DynamicApiSubmissionsId })
    .SelectMany(r => r.Where(p => p != null).Cast<int>())
    .Distinct();
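To check how the SelectMany approaches above behave when one or both ids are missing, here is a small runnable sketch. The Component class is a hypothetical model containing only the two nullable properties named in the question; the real class is not shown there.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical component type; only the two nullable ids from the question are modelled.
public class Component
{
    public int? DynamicApiChoicesId { get; set; }
    public int? DynamicApiSubmissionsId { get; set; }
}

public static class DynamicApiIds
{
    // Flattens both id properties into one list of distinct, non-null ids.
    public static List<int> Collect(IEnumerable<Component> components)
    {
        return components
            .SelectMany(c => new[] { c.DynamicApiChoicesId, c.DynamicApiSubmissionsId })
            .Where(id => id.HasValue)   // drop missing ids
            .Select(id => id.Value)     // int? -> int
            .Distinct()
            .ToList();
    }

    public static void Main()
    {
        var components = new[]
        {
            new Component { DynamicApiChoicesId = 1, DynamicApiSubmissionsId = 2 },
            new Component { DynamicApiChoicesId = 2 },      // no submissions id
            new Component { DynamicApiSubmissionsId = 3 },  // no choices id
            new Component(),                                // neither id
        };
        // The result contains 1, 2 and 3, each exactly once.
        Console.WriteLine(string.Join(", ", Collect(components)));
    }
}
```

Because the nulls are filtered after flattening, no initial Where on the components is needed and the collection is enumerated only once.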

Highest Number's key in a GroupBy

I have a simple class:
class Balls
{
public int BallType;
}
And I have a really simple list:
var balls = new List<Balls>()
{
new Balls() { BallType = 1},
new Balls() { BallType = 1},
new Balls() { BallType = 1},
new Balls() { BallType = 2}
};
I've used GroupBy on this list and I want to get back the key which has the highest count/amount:
After I used x.GroupBy(q => q.BallType) I tried to use .Max(), but it returns 3 and I need the key which is 1.
I also tried to use Console.WriteLine(x.GroupBy(q => q.BallType).Max().Key); but it throws System.ArgumentException.
Here's what I came up with:
var mostCommonBallType = balls
    .GroupBy(k => k.BallType)
    .OrderBy(g => g.Count())
    .Last().Key;
You group by the BallType, order by the count of items in each group, get the last group (since OrderBy sorts in ascending order, the most common value is the last) and then return its key.
Someone came up with the idea to order the sequence:
var mostCommonBallType = balls
    .GroupBy(k => k.BallType)
    .OrderBy(g => g.Count())
    .Last().Key;
Apart from the fact that it is more efficient to use OrderByDescending and then take FirstOrDefault, you also get into trouble if your collection of Balls is empty: Last() throws on an empty sequence.
If you use a different overload of GroupBy, you won't have these problems
var mostCommonBallType = balls.GroupBy(
        // KeySelector:
        k => k.BallType,
        // ResultSelector:
        (ballType, ballsWithThisBallType) => new
        {
            BallType = ballType,
            Count = ballsWithThisBallType.Count(),
        })
    .OrderByDescending(group => group.Count)
    .Select(group => group.BallType)
    .FirstOrDefault();
This solves the previously mentioned problems. However, if you only need the 1st element, why would you order the 2nd and the 3rd element? Using Aggregate instead of OrderByDescending enumerates the groups only once.
Assuming your collection is not empty:
var result = ... GroupBy(...)
    .Aggregate((groupWithHighestBallCount, nextGroup) =>
        (groupWithHighestBallCount.Count >= nextGroup.Count) ?
            groupWithHighestBallCount : nextGroup)
    .BallType; // Aggregate returns a single group, not a sequence, so read its property directly
Aggregate takes the first element of your non-empty sequence and assigns it to groupWithHighestBallCount. Then it iterates over the rest of the sequence and compares each nextGroup.Count with groupWithHighestBallCount.Count. It keeps the one with the highest value as the next groupWithHighestBallCount. The return value is the final groupWithHighestBallCount.
See that Aggregate only enumerates once?
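For reference, a complete version of the Aggregate idea might look like the sketch below. The Balls class is copied from the question; the MostCommon class and its FindBallType method are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Balls
{
    public int BallType;
}

public static class MostCommon
{
    // Returns the BallType that occurs most often.
    // Throws InvalidOperationException for an empty input, because
    // Aggregate without a seed needs at least one element.
    public static int FindBallType(IEnumerable<Balls> balls)
    {
        return balls
            .GroupBy(
                b => b.BallType,
                (ballType, group) => new { BallType = ballType, Count = group.Count() })
            // Keep whichever group has the higher count; ties keep the earlier group.
            .Aggregate((best, next) => best.Count >= next.Count ? best : next)
            .BallType;
    }

    public static void Main()
    {
        var balls = new List<Balls>
        {
            new Balls { BallType = 1 },
            new Balls { BallType = 1 },
            new Balls { BallType = 1 },
            new Balls { BallType = 2 },
        };
        Console.WriteLine(FindBallType(balls)); // prints 1
    }
}
```

If an empty list is possible, check for it before calling Aggregate, or fall back to the OrderByDescending and FirstOrDefault variant.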

How to do this in LINQ C#

So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously last line doesn't work because the grouping doesn't support array indexing like that. The item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Where(g => g.Count() == 2) // Make sure we have exactly two items
    .Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
    .GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
    .Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it makes sense for ReportItem to accept an IEnumerable, not just two strings?

LINQ Syntax for Selecting a Parameter to Be Copied

I have some code to sort my collection with LINQ in C#. I want it to group by the houseName to sum over the volumes, order that collection, but also pass a third parameter, pctVol, to the new sorted collection. What am I doing wrong? I know that the problem lies in the pctVol = group.Select(item => item.pctVol) line.
var inBetween = this.GroupBy(item => item.houseName)
    .Select(group =>
        new DataItem
        {
            houseName = group.Key,
            VOLUME = group.Sum(item => item.VOLUME),
            pctVol = group.Select(item => item.pctVol)
        })
    .ToList();
ObservableCollection<DataItem> objSort = new ObservableCollection<DataItem>(
    inBetween.OrderBy(DataItem => DataItem.VOLUME));
return objSort;
What kind of value do you want pctVol to have? With that code, it looks like DataItem.pctVol will be an IEnumerable containing all the pctVol values in that group.
If you want a single value, and all the pctVol values in each group are guaranteed to be the same, then you could just take the value from the first element, like this: pctVol = group.First().pctVol
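As a rough sketch of that suggestion, assuming a DataItem with a string houseName and double VOLUME and pctVol (the real property types are not shown in the question, and HouseTotals and SumByHouse are names invented here):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of DataItem; property types are assumptions.
public class DataItem
{
    public string houseName { get; set; }
    public double VOLUME { get; set; }
    public double pctVol { get; set; }
}

public static class HouseTotals
{
    // Sums VOLUME per house and orders the result by ascending total volume.
    public static List<DataItem> SumByHouse(IEnumerable<DataItem> items)
    {
        return items
            .GroupBy(item => item.houseName)
            .Select(group => new DataItem
            {
                houseName = group.Key,
                VOLUME = group.Sum(item => item.VOLUME),
                // Assumes every item in a group carries the same pctVol.
                pctVol = group.First().pctVol
            })
            .OrderBy(item => item.VOLUME)
            .ToList();
    }

    public static void Main()
    {
        var items = new[]
        {
            new DataItem { houseName = "A", VOLUME = 10, pctVol = 0.3 },
            new DataItem { houseName = "B", VOLUME = 7, pctVol = 0.1 },
            new DataItem { houseName = "A", VOLUME = 5, pctVol = 0.3 },
        };
        foreach (var item in SumByHouse(items))
        {
            Console.WriteLine($"{item.houseName}: {item.VOLUME} ({item.pctVol})");
        }
    }
}
```

The resulting list can be passed to the ObservableCollection<DataItem> constructor as in the question. If the pctVol values within a group can differ, First() silently picks one of them, so a different aggregate may be more appropriate.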

How can I order a Dictionary in C#?

edit: Thanks Jason, the fact that it was a dictionary isn't that important. I just wanted a low runtime. Is that LINQ method fast? Also, I know this is off topic, but what does the n => n mean?
I have a list of numbers and I want to make another list with the numbers that appear most often at the beginning and those that appear least often at the end.
So what I did was go through the list and check if the number x was in the dictionary. If it wasn't, I added the key x with the value 1. If it was, I changed the value to be the value plus one.
Now I want to order the dictionary so that I can make a list with the numbers that appear the most at the beginning and the least at the end.
How can I do that in C#?
ps. runtime is very important.
So it sounds like you have a Dictionary<int, int> where each key is an integer from your list and the corresponding value is the number of times that integer appeared. You are saying that you want the keys ordered by descending frequency. Then you can say
// dict is Dictionary<int, int>
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
Now, it sounds like you started with a List<int> which are the values that you want to count and order by count. You can do this very quickly in LINQ like so:
// list is IEnumerable<int> (e.g., List<int>)
var ordered = list.GroupBy(n => n)
    .OrderByDescending(g => g.Count())
    .Select(g => g.Key)
    .ToList();
Or in query syntax
var ordered = (from n in list
               group n by n into g
               orderby g.Count() descending
               select g.Key).ToList();
Now, if you need to have the intermediate dictionary you can say
var dict = list.GroupBy(n => n)
    .ToDictionary(g => g.Key, g => g.Count());
var ordered = dict.Keys.OrderByDescending(k => dict[k]).ToList();
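A runnable sketch of the GroupBy version, with made-up sample numbers, is shown here. FrequencyOrder and ByDescendingCount are illustrative names, not part of any library. On the side question: n => n is a lambda that takes n and returns it unchanged, so each number is grouped by its own value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrequencyOrder
{
    // Returns the distinct numbers, most frequent first.
    public static List<int> ByDescendingCount(IEnumerable<int> list)
    {
        return list
            .GroupBy(n => n)                    // key = the number itself
            .OrderByDescending(g => g.Count())  // most frequent group first
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var numbers = new[] { 3, 1, 3, 2, 3, 1 };
        // 3 appears three times, 1 twice, 2 once.
        Console.WriteLine(string.Join(", ", ByDescendingCount(numbers))); // prints 3, 1, 2
    }
}
```

GroupBy makes one pass to build the groups and the sort only touches the distinct values, so the cost is dominated by the size of the input plus sorting the number of distinct keys.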
Use the GroupBy extension on IEnumerable<T> to group the numbers and extract the count of each. This creates the dictionary from the list in one statement. Note that Dictionary<TKey, TValue> does not guarantee any enumeration order, so do not rely on the ordering step being preserved in the result.
var ordered = list.GroupBy( l => l )
    .OrderByDescending( g => g.Count() )
    .ToDictionary( g => g.Key, g => g.Count() );
You may also consider using SortedDictionary.
It keeps the items sorted by key as they are inserted.
List<KeyValuePair<type, type>> listEquivalent =
    new List<KeyValuePair<type, type>>(dictionary);
listEquivalent.Sort((first, second) =>
{
    return first.Value.CompareTo(second.Value);
});
Something like that maybe?
edit: Thanks Jason for the notice on my omission
