I have a list of post objects. Each post object has a car property and each car object has a brand property. I am trying to find the total number of posts for each brand. For this I am using the following code:
var grp = posts.Where(t=> !t.Car.Brand.Name.Equals("Test"))
.Select(t=> new Brand
{
BrandId = t.Car.Brand.Id,
Name = t.Car.Brand.Name,
Url = t.Car.Brand.Url,
})
.GroupBy(t => t.BrandId)
.Select(t=> new Brand
{
BrandId = t.First().BrandId,
Name = t.First().Name,
Url = t.First().Url,
Count = t.Count()
}).OrderByDescending(t=>t.Count).ToList();
This code works, but it is a bit slow. Any suggestions to improve performance?
Using First() on a grouping result dramatically decreases performance. Up to EF Core 6 it will throw an exception saying that the query is not translatable. If you want to write performant queries, always think the SQL way: a grouping can return only grouping keys and aggregation results. Other constructs are slow, even if they are translatable to SQL.
var grp = posts
.Where(t => !t.Car.Brand.Name.Equals("Test"))
.Select(t => new Brand
{
BrandId = t.Car.Brand.Id,
Name = t.Car.Brand.Name,
Url = t.Car.Brand.Url,
})
.GroupBy(t => t)
.Select(t => new Brand
{
BrandId = t.Key.BrandId,
Name = t.Key.Name,
Url = t.Key.Url,
Count = t.Count()
})
.OrderByDescending(t => t.Count)
.ToList();
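To make the "keys and aggregates only" shape concrete, here is a minimal in-memory sketch (LINQ to Objects, not EF Core, so it says nothing about the generated SQL). The BrandRow and BrandCount records and the sample data are hypothetical, introduced only for illustration. Note that in memory the key has to be an anonymous type (or a record) to get value equality; grouping on a plain class instance would compare references.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened row: one per post, carrying the brand columns.
public record BrandRow(int BrandId, string Name, string Url);
// Hypothetical result type: brand columns plus the post count.
public record BrandCount(int BrandId, string Name, string Url, int Count);

public static class BrandCounts
{
    // Every output column is either part of the grouping key or an
    // aggregate, which is the same restriction SQL GROUP BY imposes.
    public static List<BrandCount> CountPerBrand(IEnumerable<BrandRow> rows) =>
        rows.Where(r => r.Name != "Test")
            .GroupBy(r => new { r.BrandId, r.Name, r.Url })
            .Select(g => new BrandCount(g.Key.BrandId, g.Key.Name, g.Key.Url, g.Count()))
            .OrderByDescending(b => b.Count)
            .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new BrandRow(1, "Audi", "/audi"),
            new BrandRow(2, "Volvo", "/volvo"),
            new BrandRow(1, "Audi", "/audi"),
            new BrandRow(3, "Test", "/test"),
        };
        foreach (var b in CountPerBrand(rows))
            Console.WriteLine($"{b.Name}: {b.Count}");
    }
}
```

No First() is needed anywhere, because Name and Url come from the key rather than from a row inside the group.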
Let's say I have a table of locations with location ID and location name. And let's say I want to get the revenues for each location (in this simple scenario I might not even need GroupBy - but please assume that I do!)
var revenues = await _context.SaleTransaction.GroupBy(s => s.LocationId)
.Select(x => new LocationDTO {
LocationId = x.Key,
LocationName = ???
Revenues = x.Sum(i => i.Amount)
}).ToListAsync();
I tried to cheat
LocationName = x.Select(i => i.Location.LocationName).First()
since all location names for this ID are the same. But EF can't translate First() unless I use AsEnumerable() and bring the whole sales table into application memory.
Or I can traverse the result the second time:
foreach(var revenue in revenues) {
revenue.LocationName = _context.Location.Find(revenue.LocationId).LocationName;
}
Given that the number of locations is fixed (and relatively small), it may be the best approach. Still, neither going to the DB for every location, O(n), nor pulling the whole location list into memory sits well with me. Maybe there is a way to assign LocationName (and some other attributes) as part of the GroupBy statement.
I am using EF Core 5; or if something is coming in EF Core 6 - that would work as well.
From what I can see, you need a LINQ join query to join the two tables. With an EF LINQ query, the rows are not loaded into memory until the query is executed, so this avoids loading the whole table.
You could write something like:
var revenues = await _context.SaleTransactions
.Join(_context.Locations, s => s.LocationId, l => l.Id, (s, l) => new { s.LocationId, l.LocationName, s.Amount })
// Sum is an aggregate, so it has to run over a group, not over a single joined row
.GroupBy(x => new { x.LocationId, x.LocationName })
.Select(g => new { g.Key.LocationId, g.Key.LocationName, Revenues = g.Sum(x => x.Amount) })
.ToListAsync();
I will link the whole fiddle with the mock of your possible model
https://dotnetfiddle.net/BGJmjj
You can group by more than one value, e.g.:
var revenues = await _context.SaleTransaction
.GroupBy(s => new {
s.LocationId,
s.Location.Name
})
.Select(x => new LocationDTO {
LocationId = x.Key.LocationId,
LocationName = x.Key.Name,
Revenues = x.Sum(i => i.Amount)
}).ToListAsync();
Though it seems like you are calculating a total per location, in which case you can build your query around locations instead.
var revenues = await _context.Location
.Select(x => new LocationDTO {
LocationId = x.Id,
LocationName = x.Name,
Revenues = x.SaleTransactions.Sum(i => i.Amount)
}).ToListAsync();
There is an example on .NetFiddle.
In my database I have two tables Organizations and OrganizationMembers, with a 1:N relationship.
I want to express a query that returns each organization with the first and last name of the first organization owner.
My current select expression works, but it's neither efficient nor does it look right to me, since every subquery gets defined multiple times.
await dbContext.Organizations
.AsNoTracking()
.Select(x => new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).FirstName,
OwnerLastName = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).LastName,
OwnerEmailAddress = (x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner)).EmailAddress
})
.ToArrayAsync();
Is it somehow possible to summarize or reuse the subqueries, so I don't need to define them multiple times?
Note that I've already tried storing the subquery result in a variable. This doesn't work, because it requires converting the expression into a statement body, which results in a compiler error.
The subquery can be reused by introducing an intermediate projection (Select), which is the equivalent of the let operator in query syntax.
For instance:
dbContext.Organizations.AsNoTracking()
// intermediate projection
.Select(x => new
{
Organization = x,
Owner = x.Members
.Where(member => member.Role == RoleType.Owner)
.OrderBy(member => member.CreatedAt)
.FirstOrDefault()
})
// final projection
.Select(x => new OrganizationListItem
{
Id = x.Organization.Id,
Name = x.Organization.Name,
OwnerFirstName = x.Owner.FirstName,
OwnerLastName = x.Owner.LastName,
OwnerEmailAddress = x.Owner.EmailAddress
})
Note that in versions before EF Core 3.0 you have to use FirstOrDefault instead of First if you want to avoid client evaluation.
Also, this does not make the generated SQL query better or faster: it still contains a separate inline subquery for each property included in the final select. Hence it improves readability, but not efficiency.
That's why it's usually better to project the nested object into an unflattened DTO property. That is, instead of OwnerFirstName, OwnerLastName and OwnerEmailAddress, have a class with the properties FirstName, LastName and EmailAddress, and a property (let's say Owner) of that type in OrganizationListItem, similar to an entity with a reference navigation property. This way you will be able to use something like
dbContext.Organizations.AsNoTracking()
.Select(x => new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
Owner = x.Members
.Where(member => member.Role == RoleType.Owner)
.OrderBy(member => member.CreatedAt)
.Select(member => new OwnerInfo // the new class
{
FirstName = member.FirstName,
LastName = member.LastName,
EmailAddress = member.EmailAddress
})
.FirstOrDefault()
})
Unfortunately in pre 3.0 versions EF Core will generate N + 1 SQL queries for this LINQ query, but in 3.0+ it will generate a single and quite efficient SQL query.
How about this:
await dbContext.Organizations
.AsNoTracking()
.Select(x =>
{
var firstMember = x.Members.OrderBy(member => member.CreatedAt).First(member => member.Role == RoleType.Owner);
return new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = firstMember.FirstName,
OwnerLastName = firstMember.LastName,
OwnerEmailAddress = firstMember.EmailAddress
};
})
.ToArrayAsync();
How about doing it like this:
await dbContext.Organizations
.AsNoTracking()
.Select(x => new OrganizationListItem
{
Id = x.Id,
Name = x.Name,
OwnerFirstName = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).FirstName,
OwnerLastName = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).LastName,
OwnerEmailAddress = x.Members.FirstOrDefault(member => member.Role == RoleType.Owner).EmailAddress
})
.ToArrayAsync();
I have a query that needs to filter a large set of data by some search criteria.
The search goes through 3 tables: Products, ProductPrimaryCodes, ProductCodes.
The largest data set is in the ProductCodes table (around 2000 records, so not that large, but larger than the data in the other tables).
Here is an example of what I've done.
var result = products.Where(x => x.Code.Contains(se) ||
x.ProductPrimaryCodes.Any(p => p.Code.Contains(se)) ||
x.ProductCodes.Any(p => p.Code.Contains(se)))
.Select(x => new ProductDto
{
Id = x.Id,
Name = x.Name,
InStock = x.InStock,
BrandId = (BrandType)x.BrandId,
Code = x.Code,
CategoryName = x.Category.Name,
SubCategoryName = x.SubCategory.Name,
});
The query takes around 8-9 seconds to execute, which I believe is quite long for this kind of search. As a note, without ProductCodes.Any(), the query executes in less than a second and returns the result to the page.
ProductCodes table:
Id,
Code,
ProductId
Any suggestions how to get better performance of the query?
This is the solution that worked for me.
var filteredProductsByCode = products.Where(x => x.Code.Contains(se));
var filteredProducts = products.Where(x => x.ProductCodes.Any(p => p.Code.Contains(se))
|| x.ProductPrimaryCodes.Any(p => p.Code.Contains(se)));
return filteredProductsByCode.Union(filteredProducts).Select(x => new ProductDto
{
Id = x.Id,
Name = x.Name,
InStock = x.InStock,
BrandId = (BrandType)x.BrandId,
Code = x.Code,
CategoryName = x.Category.Name,
SubCategoryName = x.SubCategory.Name,
}).OrderByDescending(x => x.Id);
Clearly not the cleanest, but I will also consider introducing stored procedures for these kinds of queries.
I have a class (ApplicationHistory) with 3 properties:
ApplicantId, ProviderId, ApplicationDate
I return the data from the database into a list, however this contains duplicate ApplicantId/ProviderId keys.
I want to suppress the list so that it only contains the earliest ApplicationDate for each ApplicantId/ProviderId.
The example below is where I'm currently at, but I'm not sure how to ensure the earliest date is returned.
var supressed = history
.GroupBy(x => new
{
ApplicantId = x.ApplicantId,
ProviderId = x.ProviderId
})
.First();
All advice appreciated.
Recall that each group formed by the GroupBy call is an IGrouping<TKey, ApplicationHistory>, which implements IEnumerable<ApplicationHistory>. You can order the items in each group and pick the first one:
var oldestPerGroup = history
.GroupBy(x => new
{
ApplicantId = x.ApplicantId,
ProviderId = x.ProviderId
})
.Select(g => g.OrderBy(x => x.ApplicationDate).FirstOrDefault());
You are selecting the first group. Instead, select the first item from each group:
var supressed = history
.GroupBy(x => new {
ApplicantId = x.ApplicantId,
ProviderId = x.ProviderId
})
.Select(g => g.OrderBy(x => x.ApplicationDate).First());
Or query syntax (btw you don't need to specify names for anonymous object properties in this case):
var supressed = from h in history
group h by new {
h.ApplicantId,
h.ProviderId
} into g
select g.OrderBy(x => x.ApplicationDate).First();
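Since the class has only the two key properties and the date, an alternative is to skip the ordering and rebuild each row from the group key plus Min(ApplicationDate). This is a sketch of that swapped-in approach, not what the answers above do; it only works because there are no other columns to carry along. The record shape and the int id types are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the class from the question; the id types are a guess.
public record ApplicationHistory(int ApplicantId, int ProviderId, DateTime ApplicationDate);

public static class EarliestApplications
{
    // One row per (ApplicantId, ProviderId) pair, carrying the minimum date.
    // No per-group sort is needed because the result uses only key + aggregate.
    public static List<ApplicationHistory> Earliest(IEnumerable<ApplicationHistory> history) =>
        history
            .GroupBy(h => new { h.ApplicantId, h.ProviderId })
            .Select(g => new ApplicationHistory(
                g.Key.ApplicantId,
                g.Key.ProviderId,
                g.Min(h => h.ApplicationDate)))
            .ToList();

    public static void Main()
    {
        var history = new[]
        {
            new ApplicationHistory(1, 10, new DateTime(2020, 3, 1)),
            new ApplicationHistory(1, 10, new DateTime(2020, 1, 5)),
            new ApplicationHistory(2, 10, new DateTime(2020, 2, 2)),
        };
        foreach (var h in Earliest(history))
            Console.WriteLine($"{h.ApplicantId}/{h.ProviderId}: {h.ApplicationDate:yyyy-MM-dd}");
    }
}
```

If the rows carried extra properties that must come from the earliest record, the OrderBy(...).First() form above would still be the one to use.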
I have the following block of code which works fine;
var boughtItemsToday = (from DBControl.MoneySpent
bought in BoughtItemDB.BoughtItems
select bought);
BoughtItems = new ObservableCollection<DBControl.MoneySpent>(boughtItemsToday);
It returns data from my MoneySpent table which includes ItemCategory, ItemAmount, ItemDateTime.
I want to change it to group by ItemCategory and ItemAmount so I can see where I am spending most of my money, so I created a GroupBy query, and ended up with this;
var finalQuery = boughtItemsToday.AsQueryable().GroupBy(category => category.ItemCategory);
BoughtItems = new ObservableCollection<DBControl.MoneySpent>(finalQuery);
Which gives me 2 errors;
Error 1 The best overloaded method match for 'System.Collections.ObjectModel.ObservableCollection.ObservableCollection(System.Collections.Generic.List)' has some invalid arguments
Error 2 Argument 1: cannot convert from 'System.Linq.IQueryable>' to 'System.Collections.Generic.List'
And this is where I'm stuck! How can I use the GroupBy and Sum aggregate function to get a list of my categories and the associated spend in 1 LINQ query?!
Any help/suggestions gratefully received.
Mark
.GroupBy(category => category.ItemCategory); returns an enumerable of IGrouping objects, where the key of each IGrouping is a distinct ItemCategory value, and the value is a list of MoneySpent objects. So, you won't be able to simply drop these groupings into an ObservableCollection as you're currently doing.
Instead, you probably want to Select each grouped result into a new MoneySpent object:
var finalQuery = boughtItemsToday
.GroupBy(category => category.ItemCategory)
.Select(grouping => new MoneySpent { ItemCategory = grouping.Key, ItemAmount = grouping.Sum(moneySpent => moneySpent.ItemAmount) });
BoughtItems = new ObservableCollection<DBControl.MoneySpent>(finalQuery);
You can project each group to an anonymous class (or, better yet, create a new type for this) with the properties you want:
var finalQuery = boughtItemsToday.GroupBy(category => category.ItemCategory)
.Select(g => new
{
ItemCategory = g.Key,
Cost = g.Sum(x => x.ItemAmount)
});
The AsQueryable() should not be needed at all, since boughtItemsToday is an IQueryable anyway. You can also just combine the queries:
var finalQuery = BoughtItemDB.BoughtItems
.GroupBy(item => item.ItemCategory)
.Select(g => new
{
ItemCategory = g.Key,
Cost = g.Sum(x => x.ItemAmount)
});
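An anonymous type cannot be used as the type argument of the ObservableCollection property, so the "create a new type" route might look like the sketch below. It runs in memory on a plain list rather than on BoughtItemDB; CategoryTotal is a hypothetical type, and the string and decimal property types of MoneySpent are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Assumed shape of the MoneySpent rows; the property types are a guess.
public record MoneySpent(string ItemCategory, decimal ItemAmount, DateTime ItemDateTime);
// Hypothetical named result type so the collection can be strongly typed.
public record CategoryTotal(string ItemCategory, decimal Cost);

public static class Spending
{
    // Groups by category, sums the amounts, and lists the biggest spend first.
    public static ObservableCollection<CategoryTotal> TotalsByCategory(IEnumerable<MoneySpent> items) =>
        new ObservableCollection<CategoryTotal>(
            items.GroupBy(i => i.ItemCategory)
                 .Select(g => new CategoryTotal(g.Key, g.Sum(i => i.ItemAmount)))
                 .OrderByDescending(t => t.Cost));

    public static void Main()
    {
        var items = new[]
        {
            new MoneySpent("Food", 12.50m, new DateTime(2014, 1, 1)),
            new MoneySpent("Fuel", 40.00m, new DateTime(2014, 1, 1)),
            new MoneySpent("Food", 7.50m, new DateTime(2014, 1, 2)),
        };
        foreach (var t in TotalsByCategory(items))
            Console.WriteLine($"{t.ItemCategory}: {t.Cost}");
    }
}
```

The bound property would then be an ObservableCollection<CategoryTotal> instead of an ObservableCollection<DBControl.MoneySpent>.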