How to group using LINQ, then get value of largest group - c#

Here's what I have so far:
var bestReason =
from p in successfulReasons
group p by p.Reason into g
select new { Excuse = g.Key, ExcuseCount = g.Count() };
What I need to do now is return one reason that is the best reason, determined by which were successful in the past.
Sample data:
ID,Reason
---------
0,Weather
1,Traffic
2,Illness
3,Weather
4,Traffic
5,Traffic
6,Pirates
should return "Traffic"
Would like to do it all in one LINQ statement, if possible.
Thanks.
EDIT: If there are 7 Pirate Attacks, and 7 Traffic Accidents, I'm ok with returning either one (the first alphabetically would be fine).

var bestReason = successfulReasons
.GroupBy(r => r.Reason)
.OrderByDescending(grp => grp.Count())
.First().Key;
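As a minimal sketch of the tie-break mentioned in the question's EDIT, the in-memory version below adds a `ThenBy` on the key so that equal counts resolve to the alphabetically first reason. The class and method names (`BestReasonDemo`, `MostCommon`) are made up for illustration, and it runs against a plain array rather than a database, so it says nothing about how a particular LINQ provider would translate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BestReasonDemo
{
    // Returns the most frequent reason; ties go to the alphabetically first one.
    // Returns null when the input is empty.
    public static string MostCommon(IEnumerable<string> reasons)
    {
        return reasons
            .GroupBy(r => r)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key, StringComparer.Ordinal)
            .Select(g => g.Key)
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Reasons from the sample data in the question (IDs 0 to 6)
        var reasons = new[]
        {
            "Weather", "Traffic", "Illness", "Weather", "Traffic", "Traffic", "Pirates"
        };
        Console.WriteLine(MostCommon(reasons)); // prints "Traffic"
    }
}
```

Using `FirstOrDefault` after `Select` means an empty list yields null instead of throwing, which `First().Key` would do.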

If I understand your question correctly, you can do:
string bestReason =
(from p in successfulReasons
orderby p.Reason
group p by p.Reason into g
orderby g.Count() descending
select g.Key).FirstOrDefault();

var group = excuses.GroupBy(m => m.Reason)
.OrderByDescending(m => m.Count())
.Select(m => m.Key)
.FirstOrDefault();
This produces the following SQL statement:
SELECT TOP (1) [t1].[Reason]
FROM (
SELECT COUNT(*) AS [value], [t0].[Reason]
FROM [dbo].[Excuses] AS [t0]
GROUP BY [t0].[Reason]
) AS [t1]
ORDER BY [t1].[value] DESC
Since this is a moderately complicated IQueryable expression, you might consider compiling it to speed up the response time:
Func<ExcusesDataContext, string> commonResult = CompiledQuery.Compile(
(ExcusesDataContext c) => c.Excuses.GroupBy(m => m.Reason).OrderByDescending(m => m.Count()).Select(m => m.Key).FirstOrDefault()
);
Console.WriteLine(commonResult(new ExcusesDataContext()));
Console.ReadLine();
You could also just call the stored procedure via a repository and snag the particular value that you're looking for. This would be the fastest path to happiness, but the least fun to maintain:
string excuse = this.repo.Excuses.MostCommonFor(ProblemList.BeingLate);

Related

How to write linq query for this sql statement

How would you write a linq query with the following SQL statement. I've tried several methods referenced on stackoverflow but they either don't work with the EF version I'm using (EF core 3.5.1) or the DBMS (SQL Server).
select a.ProductID, a.DateTimeStamp, a.LastPrice
from Products a
where a.DateTimeStamp = (select max(DateTimeStamp) from Products where a.ProductID = ProductID)
For reference, a couple that I've tried (both get run-time errors).
var results = _context.Products
.GroupBy(s => s.ProductID)
.Select(s => s.OrderByDescending(x => x.DateTimeStamp).FirstOrDefault());
var results = _context.Products
.GroupBy(x => new { x.ProductID, x.DateTimeStamp })
.SelectMany(y => y.OrderByDescending(z => z.DateTimeStamp).Take(1));
Thanks!
As I understand it, you would like a list of the latest price of each product?
First of all, I would prefer a GROUP BY version even over your first query:
select a.ProductID, a.DateTimeStamp, a.LastPrice
from Products a
where a.DateTimeStamp IN (select max(DateTimeStamp) from Products group by ProductID)
Then in LINQ:
var maxDateTimeStamps = _context.Products
.GroupBy(s => s.ProductID)
.Select(s => s.Max(x => x.DateTimeStamp)).ToArray();
var results = _context.Products.Where(s=>maxDateTimeStamps.Contains(s.DateTimeStamp));
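All of this assumes that the max DateTimeStamp values are unique across products; otherwise a row whose timestamp happens to equal another product's maximum would also be returned.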
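The uniqueness assumption can be avoided by matching on both ProductID and DateTimeStamp. The sketch below shows that idea on an in-memory list; the `Product` class, `LatestPriceDemo`, and `LatestPerProduct` are hypothetical names, and the sample data is invented so that one product's older timestamp equals another product's maximum. It is LINQ to Objects only, so whether a given EF Core version translates the same shape is not something this example demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity mirroring the columns used in the question
public class Product
{
    public int ProductID { get; set; }
    public DateTime DateTimeStamp { get; set; }
    public decimal LastPrice { get; set; }
}

public static class LatestPriceDemo
{
    // Returns the newest row of each product by joining on (ProductID, DateTimeStamp)
    public static List<Product> LatestPerProduct(IEnumerable<Product> products)
    {
        var rows = products.ToList();

        // One (ProductID, max DateTimeStamp) pair per product
        var maxPerProduct = rows
            .GroupBy(p => p.ProductID)
            .Select(g => new { ProductID = g.Key, DateTimeStamp = g.Max(p => p.DateTimeStamp) });

        // Composite-key join: a timestamp only matches within its own product
        return rows
            .Join(maxPerProduct,
                  p => new { p.ProductID, p.DateTimeStamp },
                  m => new { m.ProductID, m.DateTimeStamp },
                  (p, m) => p)
            .ToList();
    }

    public static void Main()
    {
        var noon = new DateTime(2020, 1, 1, 12, 0, 0);
        var products = new List<Product>
        {
            new Product { ProductID = 1, DateTimeStamp = noon.AddHours(-1), LastPrice = 10m },
            new Product { ProductID = 1, DateTimeStamp = noon,              LastPrice = 11m },
            // Same timestamp as product 1's maximum, but not product 2's latest row
            new Product { ProductID = 2, DateTimeStamp = noon,              LastPrice = 20m },
            new Product { ProductID = 2, DateTimeStamp = noon.AddHours(1),  LastPrice = 21m },
        };

        foreach (var p in LatestPerProduct(products))
        {
            Console.WriteLine($"{p.ProductID}: {p.LastPrice}");
        }
    }
}
```

With this data, a filter on timestamp alone would also keep the product 2 row priced at 20, while the composite-key join keeps only the rows priced 11 and 21.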
I've managed to do it with the following which replicates the correlated sub query in the original post (other than using TOP and order by instead of the Max aggregate), though I feel like there must be a more elegant way to do this.
var results = from x
in _context.Products
where x.DateTimeStamp == (from y
in _context.Products
where y.ProductID == x.ProductID
orderby y.DateTimeStamp descending
select y.DateTimeStamp
).FirstOrDefault()
select x;
I prefer to break up these queries into IQueryable parts, so you can debug each "step".
Something like this:
IQueryable<ProductOrmEntity> pocoPerParentMaxUpdateDates =
entityDbContext.Products
//.Where(itm => itm.x == 1)/*if you need where */
.GroupBy(i => i.ProductID)
.Select(g => new ProductOrmEntity
{
ProductID = g.Key,
DateTimeStamp = g.Max(row => row.DateTimeStamp)
});
//// next line is for debugging; do not leave it in production code
var temppocoPerParentMaxUpdateDates = await pocoPerParentMaxUpdateDates.ToListAsync(CancellationToken.None);
IQueryable<ProductOrmEntity> filteredChildren =
from itm
in entityDbContext.Products
join pocoMaxUpdateDatePerParent in pocoPerParentMaxUpdateDates
on new { a = itm.DateTimeStamp, b = itm.ProductID }
equals
new { a = pocoMaxUpdateDatePerParent.DateTimeStamp, b = pocoMaxUpdateDatePerParent.ProductID }
// where
select itm;
IEnumerable<ProductOrmEntity> hereIsWhatIWantItems = await filteredChildren.ToListAsync(CancellationToken.None);
In the join, the composite key is built as an anonymous object. In the final select you can either project only some columns (for example "new ProductOrmEntity() { ProductID = pocoMaxUpdateDatePerParent.ProductID }") or return the FULL ProductOrmEntity object. From your original code, I can't tell whether you want all columns of the Product object or only some of them.

Converting SQL to LINQ with Multiple Tables and Group

There seem to be lots of questions about SQL to LINQ, but I can't seem to find examples with joined tables and grouping; specifically with a need to get data from multiple tables.
Take this simple SQL:
SELECT
s.showId, s.showName, v.venueName, Min(dateTime) startDate
FROM
shows s
INNER JOIN venues v ON s.venueId = v.venueId
INNER JOIN showDates d ON s.showId = d.showId
GROUP BY
s.showId
The best I can come up with is the following
var ungrouped = (
from s in db.Shows
join v in db.Venues on s.VenueId equals v.VenueId
join d in db.ShowDates on s.ShowId equals d.ShowId
select new { s, v, d }
).ToList();
var grouped = (
from s in ungrouped
group s by s.s.ShowId into grp
select new
{
showId = grp.Key,
name = (from g in grp select g.s.showName).FirstOrDefault(),
venue = (from g in grp select g.v.VenueName).FirstOrDefault(),
startDate = grp.Min(g => g.d.DateTime)
}
);
This works but it feels messy. I don't like:
It being split into two statements
Having to repeatedly write (from g in grp select ...).FirstOrDefault()
Bits like s.s.ShowId
How it's vastly more lines of code than the SQL
This example is a simple one, it only gets worse when I have 5+ tables to join and 10+ columns to select.
Question: Is this the best way to do this, and I should just accept it; or is there a better way to write this query?
I am not sure if you are looking for something like this but it's a bit cleaner, it's not split in 2 statements and you might find it helpful. I couldn't use a dbcontext so I used lists to make sure the syntax is correct.
var res = Shows.Join(Venues,
show => show.VenueID,
venue => venue.VenueID,
(show, venue) => new { show, venue })
.Join(ShowDates,
val => val.show.ShowID,
showdate => showdate.ShowID,
(val, showDate) => new { val.show, val.venue, showDates = showDate })
.GroupBy(u => u.show.ShowID)
.Select(grp => new
{
showId = grp.Key,
name = grp.FirstOrDefault()?.show.showName,
venue = grp.FirstOrDefault()?.venue.VenueName,
startDate = grp.Min(g => g.showDates.DateTime)
});
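One way to drop the repeated `FirstOrDefault()` calls is to group by every non-aggregated column at once, which is roughly what the SQL does implicitly. The following is a sketch under assumptions: the entity classes, `ShowSummary`, and `ShowQueryDemo.Summarize` are hypothetical, the property names are guessed from the question, and it runs on in-memory collections, so it does not prove how any particular EF version would translate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities; property names are guessed from the question
public class Show { public int ShowId { get; set; } public string ShowName { get; set; } public int VenueId { get; set; } }
public class Venue { public int VenueId { get; set; } public string VenueName { get; set; } }
public class ShowDate { public int ShowId { get; set; } public DateTime DateTime { get; set; } }

public class ShowSummary
{
    public int ShowId { get; set; }
    public string ShowName { get; set; }
    public string VenueName { get; set; }
    public DateTime StartDate { get; set; }
}

public static class ShowQueryDemo
{
    public static List<ShowSummary> Summarize(
        IEnumerable<Show> shows, IEnumerable<Venue> venues, IEnumerable<ShowDate> showDates)
    {
        var query =
            from s in shows
            join v in venues on s.VenueId equals v.VenueId
            join d in showDates on s.ShowId equals d.ShowId
            // Group only the dates, keyed by all the columns we want to output
            group d.DateTime by new { s.ShowId, s.ShowName, v.VenueName } into grp
            select new ShowSummary
            {
                ShowId = grp.Key.ShowId,
                ShowName = grp.Key.ShowName,
                VenueName = grp.Key.VenueName,
                StartDate = grp.Min() // earliest date, matching Min(dateTime) in the SQL
            };
        return query.ToList();
    }

    public static void Main()
    {
        var shows = new[] { new Show { ShowId = 1, ShowName = "Hamlet", VenueId = 10 } };
        var venues = new[] { new Venue { VenueId = 10, VenueName = "Globe" } };
        var dates = new[]
        {
            new ShowDate { ShowId = 1, DateTime = new DateTime(2021, 5, 2) },
            new ShowDate { ShowId = 1, DateTime = new DateTime(2021, 5, 1) },
        };

        foreach (var row in Summarize(shows, venues, dates))
        {
            Console.WriteLine($"{row.ShowId} {row.ShowName} {row.VenueName} {row.StartDate:yyyy-MM-dd}");
        }
    }
}
```

Because the key carries the show and venue names, the select reads them from `grp.Key`, and the single statement avoids both the `s.s.ShowId` style and the intermediate `ToList()`.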
We need to know the relation between them (one-to-one or one-to-many), but the query should not be too far from this:
var groupedResult = Shows.Include(x => x.Venue).Include(x => x.ShowDates)
.Where(x => x.Venue != null && x.ShowDates.Any())
.GroupBy(x => x.ShowId)
.Select(x => new { /* anything you want */ });
or
from show in Shows
join venue in Venues on show.VenueId equals venue.VenueId
join showDate in ShowDates on show.ShowId equals showDate.ShowId
group show by show.ShowId into groupedShows
select new { /* what you want */ };

How to GroupBy data from many tables in dotnet core Entity Framework

I'm building some marketplace web app, let's say something like e-bay. Typical scenario is:
A user makes an offer which consists of one or more items, and those items are of a certain type. After that, other users bid on that offer.
Here is simplified diagram.
On SQL Fiddle (here) you can see both CREATE TABLE and INSERT INTO statements
Sample data:
There are two offers. The first offer (Id 1) consists of one item of type "watch". The second offer (Id 2) has one item of type "headphone".
Both offers have bids. On the watch, there are two bids: one of 100 dollars and another of 120. On the headphones, there are bids of 50 and 80 dollars.
What I want to achieve is the average bid per type. In this sample, that means I want to get 110 as the average bid for watch and 65 as the average bid for headphone. To achieve that using T-SQL, I would write a query like this:
SELECT t.name,
avg(amount)
FROM bid b
LEFT JOIN offer o ON b.OfferId = o.id
LEFT JOIN offeritem oi ON o.id = oi.OfferId
LEFT JOIN itemType t ON oi.itemtypeid = t.Id
GROUP BY t.name
So, my question is - how to achieve that in dotnet core 3.0 EntityFramework
Using GroupBy, like this:
_context.Bids
.Include(b => b.Offer)
.ThenInclude(o => o.OfferItems)
.ThenInclude(os => os.ItemType)
.GroupBy(b => b.Offer.OfferItems.First().ItemType.Name);
gives exception:
Client side GroupBy is not supported.
When I try with a projection, like this:
_context.Bids
.Include(b => b.Offer)
.ThenInclude(o => o.OfferItems)
.ThenInclude(os => os.ItemType)
.GroupBy(b => b.Offer.OfferItems.First().ItemType.Name)
.Select(g => new
{
Key = g,
Value = g.Average(b => b.Amount)
});
I get an exception again:
Processing of the LINQ .... failed. This may indicate either a bug or
a limitation in EF Core.
EDIT:
This approach
_context.Bids
.Include(b => b.Offer)
.ThenInclude(o => o.OfferItems)
.ThenInclude(os => os.ItemType)
.GroupBy(b => new { b.Offer.OfferItems.First().ItemType.Name}, b => b.Amount)
.Select(g => new
{
Key = g.Key.Name,
Value = g.Average()
});
also threw an exception, but this time:
Cannot use an aggregate or a subquery in an expression used for the
group by list of a GROUP BY clause.
...
So, is there a way to group that data (to get a simple average), or should I run another query, iterate through the collection, and do the calculation myself? That would surely lower performance (I was hoping to do the grouping on the server, but as you can see, I ran into the issues mentioned above). Any ideas? Thanks in advance.
In your case it is hard to hide the subquery from the grouping.
You can try it this way:
var joined =
context.Bid
.SelectMany(x =>
x.Offer.OfferItem
.Select(y => new
{
Amount = x.Amount,
Name = y.ItemType.Name
})
.Take(1));
var grouped = from i in joined
group i by i.Name into groups
select new
{
Key = groups.Key,
Amount = groups.Average(x => x.Amount)
};
This gives me the following query:
SELECT [t].[Name] AS [Key], AVG([t].[Amount]) AS [Amount]
FROM [Bid] AS [b]
INNER JOIN [Offer] AS [o] ON [b].[OfferId] = [o].[Id]
CROSS APPLY (
SELECT TOP(1) [b].[Amount], [i].[Name], [o0].[Id], [i].[Id] AS [Id0], [o0].[OfferId]
FROM [OfferItem] AS [o0]
INNER JOIN [ItemType] AS [i] ON [o0].[ItemTypeId] = [i].[Id]
WHERE [o].[Id] = [o0].[OfferId]
) AS [t]
GROUP BY [t].[Name]
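To check that the SelectMany-then-group shape produces the expected averages, here is a small in-memory sketch using the sample bids from the question (100 and 120 on the watch, 50 and 80 on the headphones). The entity classes and `AverageBidDemo.AveragePerType` are hypothetical stand-ins for the real model, and the code runs as LINQ to Objects, so it only illustrates the logic, not the SQL that EF Core generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified model of the question's tables
public class ItemType { public string Name { get; set; } }
public class OfferItem { public ItemType ItemType { get; set; } }
public class Offer { public List<OfferItem> OfferItems { get; set; } = new List<OfferItem>(); }
public class Bid { public Offer Offer { get; set; } public decimal Amount { get; set; } }

public static class AverageBidDemo
{
    public static Dictionary<string, decimal> AveragePerType(IEnumerable<Bid> bids)
    {
        // Flatten first: pair each bid amount with the type name of the offer's first item
        var joined = bids.SelectMany(b => b.Offer.OfferItems
            .Select(i => new { b.Amount, i.ItemType.Name })
            .Take(1));

        // Then group the flat rows by plain column, with no subquery in the key
        return joined
            .GroupBy(x => x.Name)
            .ToDictionary(g => g.Key, g => g.Average(x => x.Amount));
    }

    public static void Main()
    {
        var watchOffer = new Offer { OfferItems = { new OfferItem { ItemType = new ItemType { Name = "watch" } } } };
        var headphoneOffer = new Offer { OfferItems = { new OfferItem { ItemType = new ItemType { Name = "headphone" } } } };
        var bids = new List<Bid>
        {
            new Bid { Offer = watchOffer, Amount = 100m },
            new Bid { Offer = watchOffer, Amount = 120m },
            new Bid { Offer = headphoneOffer, Amount = 50m },
            new Bid { Offer = headphoneOffer, Amount = 80m },
        };

        var averages = AveragePerType(bids);
        Console.WriteLine($"watch: {averages["watch"]}");         // 110
        Console.WriteLine($"headphone: {averages["headphone"]}"); // 65
    }
}
```

The point of the two-step shape is that the grouping key becomes an ordinary projected column, which is what lets the provider emit a plain GROUP BY.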

Get last 2 rows from group on database side

I'm trying to improve the performance of a LINQ query for PostgreSQL. There are two tables (Parcels, ParcelStates) with a 1:n relation. I need to get the last 2 ParcelStates for each Parcel. It looks simple; I have the following code:
IQueryable<Parcel> parcels = _dbContext.Parcels
.OrderByDescending(x => x.Id)
.Take(100);
Then getting states:
var states = await parcels
.GroupJoin(_dbContext.ParcelStates, ps => ps.Id, p => p.ParcelId, (ps, p) => new { ps, p })
.SelectMany(x => x.p.DefaultIfEmpty().OrderByDescending(y => y.Id).Take(2), (x,c) => c)
.ToListAsync();
It returns 180 states, which is correct. But there is a performance issue, because it generates an inefficient SQL query:
SELECT *
FROM (
SELECT *
FROM parcels AS x
WHERE x.isdeleted = FALSE
ORDER BY c DESC, c0 DESC
LIMIT @__p_1 OFFSET @__p_0
) AS t
LEFT JOIN parcelstates AS p ON t.id = p.parcelid
ORDER BY t.c DESC, t.c0 DESC, t.id
It takes all states from database, when I need only 2.
How to change LINQ to filter result on database side?
In logs I found:
The LINQ expression 'Take(2)' could not be translated and will be evaluated
If you insert the SelectMany expression into the GroupJoin, will it convert to SQL?
var states = await parcels
.GroupJoin(_dbContext.ParcelStates, ps => ps.Id, p => p.ParcelId,
(ps, p) => p.DefaultIfEmpty().OrderByDescending(y => y.Id).Take(2))
.ToListAsync();
We can use a foreach loop, which translates to several very fast SQL lookups (it should execute in under 1 second). It is not ideal, and I would still recommend writing a stored procedure to get this data instead of relying on LINQ to SQL, which doesn't always generate the optimal query:
// Store a list of parcel states
var parcelStates = new List<ParcelState>();
// Read the top 100 parcels from the database; ToList() runs the query once
// so the loop below does not enumerate an open reader
var parcels = dbContext.Parcels
.OrderByDescending(p => p.Id)
.Take(100)
.ToList();
// For each parcel, use SQL to look up the 2 most recent parcel states
foreach (var p in parcels)
{
var latestStates = dbContext.ParcelStates
.Where(s => s.ParcelId == p.Id)
.OrderByDescending(s => s.Id)
.Take(2);
parcelStates.AddRange(latestStates);
}
// Now we have the 2 most recent parcel states for each of those parcels
Console.WriteLine($"Found {parcelStates.Count} parcel states for {parcels.Count} parcels");
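For reference, the result the question is after (the two newest states per parcel) can be pinned down with a short in-memory sketch. `ParcelState` here is a hypothetical two-property class and `LastTwoPerParcel` an invented helper; this is LINQ to Objects, so it defines the expected output rather than offering a server-side translation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down entity with only the columns the query needs
public class ParcelState
{
    public int Id { get; set; }
    public int ParcelId { get; set; }
}

public static class ParcelStateDemo
{
    // For every parcel, keep the two states with the highest Id
    public static List<ParcelState> LastTwoPerParcel(IEnumerable<ParcelState> states)
    {
        return states
            .GroupBy(s => s.ParcelId)
            .SelectMany(g => g.OrderByDescending(s => s.Id).Take(2))
            .ToList();
    }

    public static void Main()
    {
        var states = new List<ParcelState>
        {
            new ParcelState { Id = 1, ParcelId = 1 },
            new ParcelState { Id = 2, ParcelId = 1 },
            new ParcelState { Id = 3, ParcelId = 1 },
            new ParcelState { Id = 4, ParcelId = 2 },
        };

        // Parcel 1 keeps states 3 and 2; parcel 2 keeps its only state, 4
        foreach (var s in LastTwoPerParcel(states))
        {
            Console.WriteLine($"parcel {s.ParcelId}: state {s.Id}");
        }
    }
}
```

Comparing a database query's output against this helper on a small data set is one way to confirm that a rewritten query still returns the right rows.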

How to filter entity framework result with multiple columns using a lambda expression

I have the following table:
And the following data:
How can I filter the result so that I only get the latest row for each omraade_id (sorted descending by timestamp)?
Which in this case would be the rows with id: 1010 and 1005
--
From @lazyberezovsky's answer, I have created the following expression:
dbConnection = new ElecEntities();
var query = from data in dbConnection.Valgdata
orderby data.timestamp descending
group data by data.omraade_id into g
select g.FirstOrDefault();
return query.ToList();
It returns two rows with the ID 3 and 4, which are the first two rows in the database, and also the ones with the lowest timestamp. Any idea why?
var query = dbConnection.Valgdata
.GroupBy(x => x.omraade_id)
.Select(g => g
.OrderByDescending(x => x.timestamp)
.FirstOrDefault());
I have no experience with EF, so I'm unsure if only SQL-esque linq works here. A plain C#-ish:
var query = dbConnection.Valgdata.GroupBy(u => u.omraade_id)
.Select(x => x.FirstOrDefault(y => x.Max(p => p.timestamp) == y.timestamp));
You have put the filter on every item. It should be applied to the complete query result, not to every item.
Following is the updated query.
var query = (from data in dbConnection.Valgdata
orderby data.timestamp descending
group data by data.omraade_id into g
select g).FirstOrDefault();
var query = from v in dbConnection.Valgdata
orderby v.timestamp descending
group v by v.omraade_id into g
select g.First();
This will return only record with max timestamp for each omraade_id.
UPDATE: the query above works fine for me (at least with the MS SQL LINQ provider). Also, you don't need FirstOrDefault: if rows are grouped by omraade_id, each group definitely has at least one row.
var query = from v in dbConnection.Valgdata
group v by v.omraade_id into g
select g.OrderByDescending(x => x.timestamp).First();
This is my solution so far:
var data = dbConnection.Valgdata.Where(x => x.godkendt == false).ToList();
var dataGrouped = data.GroupBy(x => x.omraade_id).ToList();
List<Valgdata> list = new List<Valgdata>();
foreach (var grpdata in dataGrouped)
{
var dataGroup = grpdata.OrderByDescending(x => x.timestamp).ToList();
list.Add(dataGroup.FirstOrDefault());
}
return list;
I don't know if it is the most efficient, but it works.
