LINQ: Getting the row with the maximum value of a given attribute - c#

I have a bunch of rows grouped on an attribute called MyID. Now I want the one row from each group where the StatusDate attribute is the highest in that one group.
This is what I've come up with.
rows.Select(x => x.Where(y => y.StatusDate == x.Max(z => z.StatusDate)).First())
With a bit more explanation:
rows.Select(x => // x is a group
x.Where(y => // get all rows in that group where...
// the status date is equal to the largest
// status date in the group
y.StatusDate == x.Max(z => z.StatusDate)
).First()) // and then get the first one of those rows
Is there any faster or more idiomatic way to do this?

One alternative would be to use:
rows.Select(x => x.OrderByDescending(y => y.StatusDate).First());
... and check that the query optimiser knows that it doesn't really need to sort everything. (In LINQ to Objects this would sort each whole group just to take one row, but you could use MaxBy from MoreLINQ in that case :)
(Apologies for previous version - I hadn't fully comprehended the grouping bit.)
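For LINQ to Objects, the single-pass idea behind MaxBy can be sketched with Aggregate, which is part of standard LINQ. This is only an illustration: the sample rows and the Note property are made up, and the code is written as C# 9 top-level statements. On .NET 6 or later, the built-in Enumerable.MaxBy should be able to replace the Aggregate call.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows: MyID and StatusDate follow the question, Note is made up.
var rows = new[]
{
    new { MyID = 1, StatusDate = new DateTime(2020, 1, 1), Note = "a" },
    new { MyID = 1, StatusDate = new DateTime(2020, 3, 1), Note = "b" },
    new { MyID = 2, StatusDate = new DateTime(2020, 2, 1), Note = "c" },
};

// One pass per group: keep whichever row has the later StatusDate.
// The strict '>' means the earliest-seen row wins on ties.
var latestPerGroup = rows
    .GroupBy(r => r.MyID)
    .Select(g => g.Aggregate((best, next) => next.StatusDate > best.StatusDate ? next : best))
    .ToList();

foreach (var row in latestPerGroup)
    Console.WriteLine($"{row.MyID}: {row.StatusDate:yyyy-MM-dd} ({row.Note})");
```

Unlike the Where/Max version in the question, this does not recompute the maximum for every row in the group.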

Don't know if this is LINQ to SQL, but if it is, you could alternatively accomplish this via a rank() function in SQL (rank each group by date, then select the first-ranked row from each), then call this as a stored proc from LINQ. I think that's an approach that is becoming more idiomatic as people hit the boundaries of LINQ to SQL...

Related

How is it possible that selecting first after GroupBy on a DataTable does not return one value for every group?

I am trying to get one row per id from a DataTable, and I do not care which row I take. The same id can exist on several rows in the table.
Here's the expression that's giving me trouble:
dt.AsEnumerable().GroupBy(i => i.Field<int>("id")).Select(i => i.First())
Running just this section, dt.AsEnumerable().GroupBy(i => i.Field<int>("id")), correctly gives me a result of 22 groupings for my DataTable. (I have 22 ids with data in this table.)
However, when adding on the .Select(i => i.First()), I am only seeing 10 data rows.
To me this doesn't seem to make any sense. If the GroupBy function managed to find 22 distinct id values, I would expect this logic to grab one of each.
My only other thought is that maybe it's just a weird side effect of viewing this data through a watch in Visual Studio rather than assigning to a variable.
If you think it's just weird side effects of viewing the data in a watch, which can happen with LINQ statements, then split it out into
var groups = dt.AsEnumerable().GroupBy(i => i.Field<int>("id")).ToList();
var firstOfGroups = groups.Select(i => i.First()).ToList();
and then look at groups and firstOfGroups in the debugger. Temporarily evaluating items with .ToList() can help a lot with viewing things in the debugger.
I think it is possible. You can double-check the number of items in each group:
.Select(g=>new { k = g.Key, c = g.Count() })
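The debugging advice above can be tried on a small table. This is a minimal sketch with a made-up DataTable (the "name" column and all values are assumptions), written as top-level statements. On .NET Framework, AsEnumerable and Field need a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical table: an "id" column as in the question plus a made-up "name" column.
var dt = new DataTable();
dt.Columns.Add("id", typeof(int));
dt.Columns.Add("name", typeof(string));
dt.Rows.Add(1, "a");
dt.Rows.Add(1, "b");
dt.Rows.Add(2, "c");
dt.Rows.Add(3, "d");

// Materialize each step so the debugger shows concrete lists, not deferred queries.
var groups = dt.AsEnumerable().GroupBy(r => r.Field<int>("id")).ToList();
var firstOfGroups = groups.Select(g => g.First()).ToList();

// Key and item count per group, to confirm no group is empty or missing.
var sizes = groups.Select(g => new { Key = g.Key, Count = g.Count() }).ToList();

Console.WriteLine($"{groups.Count} groups, {firstOfGroups.Count} first rows");
```

If the two counts differ once everything is materialized, the problem is in the data or the query rather than in the watch window.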

How to group records in Entity Framework with where clause

I want to apply a where clause on the COLID column and take the very last value for that COLID against each ENTRYID. For example, for COLID 1 the last returned value should not be NULL, and for COLID 2 it would be 30.
I can do it well in SQL, look at the query and data:
You can edit your question to append code rather than pasting it into comments. Paste the code into the question and use the "{}" button in the editor to code format.
You can group and sort items in Linq without too much issue.
context.Entries
.GroupBy(x => x.EntryID)
.Select(x => x.OrderByDescending(y => y.ColID).FirstOrDefault())
.ToList();
GroupBy defines which columns make up the unique grouping. In your case, if you want the latest entry, then EntryID is enough to define what to group on; it forms the Key for each group. From there we use Select to say what to take from each group. Each group holds all entries with that EntryID, so we order by ColID descending to put the biggest one first, then use FirstOrDefault to take it. The ToList() at the end materializes the result: the row with the highest ColID for each EntryID.
Edit: If you want to only consider non-null values:
context.Entries
.Where(x => x.ColValueId != null)
.GroupBy(x => x.EntryID)
.Select(x => x.OrderByDescending(y => y.ColID).FirstOrDefault())
.ToList();
Linq is a bit different than SQL and can have some limitations when used with EF because EF will ultimately need to convert it to SQL. Still, it is well worth reading up on Linq because it is a very powerful tool when working with objects and it allows EF to do a lot of the heavy lifting.
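To see what the non-null filter changes, here is a small in-memory sketch. It uses LINQ to Objects as a stand-in for context.Entries, so it says nothing about the SQL that EF would generate, and the sample values are made up.

```csharp
using System;
using System.Linq;

// In-memory stand-in for context.Entries; all values are hypothetical.
var entries = new[]
{
    new { EntryID = 1, ColID = 1, ColValueId = (int?)10 },
    new { EntryID = 1, ColID = 2, ColValueId = (int?)30 },
    new { EntryID = 1, ColID = 3, ColValueId = (int?)null },
    new { EntryID = 2, ColID = 1, ColValueId = (int?)20 },
};

// Highest ColID per EntryID, regardless of whether the value is null.
var latestAny = entries
    .GroupBy(e => e.EntryID)
    .Select(g => g.OrderByDescending(e => e.ColID).First())
    .ToList();

// Same query, but rows with a null value are dropped before grouping.
var latestNonNull = entries
    .Where(e => e.ColValueId != null)
    .GroupBy(e => e.EntryID)
    .Select(g => g.OrderByDescending(e => e.ColID).First())
    .ToList();

foreach (var e in latestNonNull)
    Console.WriteLine($"Entry {e.EntryID}: ColID {e.ColID}, value {e.ColValueId}");
```

Filtering before grouping matters: for EntryID 1 the unfiltered query picks the row whose value is null, while the filtered one falls back to the previous ColID.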

Using linq to retrieve distinct dates

Hi, I'm having trouble retrieving distinct dates from a database. In this database I have several events on any particular day, and the results show a list of events with the same date ('StartDate' field). How could I retrieve just the distinct days? I've tried:
ICollection result;
result= client.GetEventInstances().Select(x => x.StartDate).Distinct();
I expect to see just one distinct date along with just the first event for that date only.
Then you need to group on the events and get the first item out of every date:
client.GetEventInstances()
.GroupBy(k => k.StartDate.Date)
.ToDictionary(k => k.Key, v => v.First())
Well, I managed to search and find a solution for my needs; it won't suit everyone, of course. Thanks for the time people gave me on this, always appreciated.
ICollection<EventInstance> result;
result = client.GetEventInstances();
IEnumerable<EventInstance> distinctDate = result
.GroupBy(e => e.StartDate.Date)
.Select(group => group.First());
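A self-contained sketch of the same grouping, using made-up events in place of client.GetEventInstances(). One assumption beyond the original: each day's group is ordered by StartDate so that "first" means the earliest event rather than whatever happens to come first in the source.

```csharp
using System;
using System.Linq;

// Hypothetical events standing in for client.GetEventInstances().
var events = new[]
{
    new { Name = "Lunch", StartDate = new DateTime(2021, 5, 1, 13, 0, 0) },
    new { Name = "Breakfast", StartDate = new DateTime(2021, 5, 1, 9, 0, 0) },
    new { Name = "Workshop", StartDate = new DateTime(2021, 5, 2, 10, 0, 0) },
};

// Group on the date part only, then take the earliest event of each day.
var firstPerDay = events
    .GroupBy(e => e.StartDate.Date)
    .Select(g => g.OrderBy(e => e.StartDate).First())
    .ToList();

foreach (var e in firstPerDay)
    Console.WriteLine($"{e.StartDate:yyyy-MM-dd}: {e.Name}");
```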

Get "lowest" date of parents great grand child with Linq

I'm trying to learn LINQ - the hard way.
I have a few entities which all are connected in sequence
Department -> Groups -> Works -> project
project has a startdate (and an end date)
I'm trying to get the startdate of the Group with the first startdate so to speak.
I've tried:
Department.Groups.Select(g => g.Works.Select(w => w.project.StartDate).Min())
and various variations thereof.
The problem being it returns a list in a list in a list, and I'm getting dizzy from just thinking about it :)
I've tried to work my way backwards from
g.Works.Select(w => w.project.StartDate).Min(), which gives me the lowest date for Works
Any help is greatly appreciated
Flatten your List with SelectMany and then apply Min like:
var minDate = Department.Groups
.SelectMany(g => g.Works.Select(w => w.project.StartDate))
.Min();
That will return the minimum date for all the Works under all Groups.
If you just want the date, then the SelectMany flattening above is good. If you want the whole record, you can order the groups by their earliest start date and take the first one. Note that OrderBy needs a comparable key, so use Min rather than the Select sequence itself:
Department
.Groups
.OrderBy(g => g.Works.Min(w => w.project.StartDate))
.FirstOrDefault();
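Both ideas can be tried on in-memory data. The sketch below uses anonymous objects with made-up names and dates in place of the real Department, Groups, Works and project entities, and keys the ordering on each group's minimum start date because OrderBy needs a comparable key. It also assumes every group has at least one work; Min throws on an empty sequence of DateTime.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for Department.Groups -> Works -> project.
var groups = new[]
{
    new { Name = "A", Works = new[]
    {
        new { project = new { StartDate = new DateTime(2022, 3, 1) } },
        new { project = new { StartDate = new DateTime(2022, 2, 1) } },
    } },
    new { Name = "B", Works = new[]
    {
        new { project = new { StartDate = new DateTime(2022, 1, 15) } },
    } },
};

// Date only: flatten all start dates into one sequence, then take the minimum.
var minDate = groups
    .SelectMany(g => g.Works.Select(w => w.project.StartDate))
    .Min();

// Whole record: order the groups by their own earliest start date.
var earliestGroup = groups
    .OrderBy(g => g.Works.Min(w => w.project.StartDate))
    .First();

Console.WriteLine($"{minDate:yyyy-MM-dd} belongs to group {earliestGroup.Name}");
```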

Linq using sum() in list within a list

The relationship between them is one order to many journals. Now I want to get the sum of all pending orders (or records that are flagged as false under IsDelivered in Order entity).
So far, I have this query, but I can't seem to get it working when I add .Sum()
Orders
.Where(o => o.IsDelivered == false)
.Select(o => new {
pendingOrders = o.Journals.Sum(j => j.TotalAmount)
})
So far this results in:
In a nutshell, how can I get the sum of them? If the query needs to be altered, or replaced with a new one, that is welcome. Any help would be much appreciated. Thanks!
You can do it in several ways:
You could add Sum(n => n.pendingOrders) to the end of your query to add up the values, or
You could use SelectMany before Select, and use Sum instead of Select.
Either of the two approaches is going to work.
In this case, it is easiest to use SelectMany to flatten your collections, then use Sum:
Orders.Where(o => !o.IsDelivered).SelectMany(o => o.Journals).Sum(j => j.TotalAmount);
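As a quick check that both approaches agree, here is a sketch on in-memory data. The orders and amounts are made up, and TotalAmount is assumed to be a decimal.

```csharp
using System;
using System.Linq;

// Hypothetical orders; only the undelivered ones should count as pending.
var orders = new[]
{
    new { IsDelivered = false, Journals = new[] { new { TotalAmount = 10m }, new { TotalAmount = 5m } } },
    new { IsDelivered = true,  Journals = new[] { new { TotalAmount = 100m } } },
    new { IsDelivered = false, Journals = new[] { new { TotalAmount = 2.5m } } },
};

// Approach 1: per-order sums as in the question, then a second Sum over them.
var viaSelect = orders
    .Where(o => !o.IsDelivered)
    .Select(o => new { pendingOrders = o.Journals.Sum(j => j.TotalAmount) })
    .Sum(n => n.pendingOrders);

// Approach 2: flatten all pending journals, then sum once.
var viaSelectMany = orders
    .Where(o => !o.IsDelivered)
    .SelectMany(o => o.Journals)
    .Sum(j => j.TotalAmount);

Console.WriteLine($"{viaSelect} {viaSelectMany}");
```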
