I have a query written in LINQ to Entities:
db.Table<Operation>()
.Where(x => x.Date >= dateStart)
.Where(x => x.Date < dateEnd)
.GroupBy(x => new
{
x.EntityId,
x.EntityName,
x.EntityToken
})
.Select(x => new EntityBrief
{
EntityId = x.Key.EntityId,
EntityName = x.Key.EntityName,
EntityToken = x.Key.EntityToken,
Quantity = x.Count()
})
.OrderByDescending(x => x.Quantity)
.Take(5)
.ToList();
The problem is that it takes 4 seconds when executing in the application using EF. But when I take the created pure SQL Query from that query object (using Log) and fire it directly on SQL Server, then it takes 0 seconds. Is it a known problem?
Firstly, try improving your query:
var entityBriefs = db.Table<Operation>()
.Where(x => x.Date >= dateStart && x.Date < dateEnd)
.GroupBy(x => x.EntityId)
.OrderByDescending(x => x.Count())
.Take(5)
.Select(x => new EntityBrief
{
EntityId = x.Key, // the group key is the EntityId itself
Quantity = x.Count()
});
// Index the top five by EntityId so the names can be filled in afterwards.
var mapping = entityBriefs.ToDictionary(e => e.EntityId, e => e);
var entityIds = mapping.Keys.ToList();
var entityInfo = db.Table<Operation>()
.Where(o => entityIds.Contains(o.EntityId))
.Select(o => new { o.EntityId, o.EntityName, o.EntityToken })
.Distinct()
.ToList();
foreach (var entity in entityInfo)
{
mapping[entity.EntityId].EntityName = entity.EntityName;
mapping[entity.EntityId].EntityToken = entity.EntityToken;
}
You may also compile queries with CompiledQuery.Compile and reuse the compiled delegate, which avoids translating the LINQ expression again on every call.
http://msdn.microsoft.com/en-us/library/bb399335%28v=vs.110%29.aspx
The problem was with database locks. I was using the wrong isolation level, so my queries were blocked under some circumstances. Now I use read committed snapshot isolation and the execution time looks good.
Currently I am doing a keyword search on the Plates table (Name column), but I also have a Search table (searched on its SearchTerm column) that contains Plate IDs, which I also want to search so I can return the corresponding plates.
The code below works, but I'd like to simplify the logic using an .Include statement if possible, although I'm not quite sure how. Any help would be greatly appreciated.
if (!string.IsNullOrEmpty(request.Keyword))
{
var searchTermPlateIds = await _db.Search
.Where(x=> x.SearchTerm.ToLower().Contains(request.Keyword.Trim().ToLower()))
.Select(x => x.PlatformId)
.ToListAsync(ct);
var platesFromPlateIds = await _db.Plates
.OrderBy(x => x.Name)
.Where(x => searchTermPlateIds.Contains(x.Id) && x.Status != PlateStatus.Disabled)
.ToListAsync(ct);
plates = await _db.Plates
.OrderBy(x => x.Name)
.Where(x => !string.IsNullOrEmpty(request.Keyword.Trim()) && x.Name.ToLower().Contains(request.Keyword.Trim().ToLower()) && x.Status != PlateStatus.Disabled)
.ToListAsync(ct);
plates = plates.Union(platesFromPlateIds).ToList();
}
Remember one simple thing: Include is ONLY for loading related data, not for filtering.
What we can do here is optimize the query so that it makes only one request to the database instead of three.
var query = _db.Plates
.Where(x => x.Status != PlateStatus.Disabled);
if (!string.IsNullOrEmpty(request.Keyword))
{
// do not materialize Ids
var searchTermPlateIds = _db.Search
.Where(x => x.SearchTerm.ToLower().Contains(request.Keyword.Trim().ToLower()))
.Select(x => x.PlatformId);
// queryable will be combined into one query
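query = query
.Where(x => searchTermPlateIds.Contains(x.Id)
// keep the original match on the plate name as well
|| x.Name.ToLower().Contains(request.Keyword.Trim().ToLower()));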
}
// final materialization, here you can add Includes if needed.
var plates = await query
.OrderBy(x => x.Name)
.ToListAsync(ct);
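The composition idea above can be tried without a database. The following is a minimal sketch that uses LINQ to Objects via AsQueryable() in place of EF; the Plate, SearchEntry and PlateSearch types are hypothetical stand-ins for the asker's entities, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public enum PlateStatus { Active, Disabled }
public record Plate(int Id, string Name, PlateStatus Status);
public record SearchEntry(int PlatformId, string SearchTerm);

public static class PlateSearch
{
    // Builds one deferred query; nothing runs until the caller materializes it.
    public static IQueryable<Plate> Build(
        IQueryable<Plate> plates, IQueryable<SearchEntry> search, string keyword)
    {
        var query = plates.Where(p => p.Status != PlateStatus.Disabled);

        if (!string.IsNullOrWhiteSpace(keyword))
        {
            var term = keyword.Trim().ToLower();

            // No ToList() here, so the id lookup stays part of the outer query.
            var matchingIds = search
                .Where(s => s.SearchTerm.ToLower().Contains(term))
                .Select(s => s.PlatformId);

            // Match on the plate name OR on a search term that points at the plate.
            query = query.Where(p =>
                p.Name.ToLower().Contains(term) || matchingIds.Contains(p.Id));
        }

        return query.OrderBy(p => p.Name);
    }
}
```

With a real EF context, the same shape lets the provider fold the id subquery into a single SQL statement; here it only illustrates that the pieces compose before anything is enumerated.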
I have the following Entity Framework Core 2.0 query:
var user = context.Users.AsNoTracking()
.Include(x => x.UserSkills).ThenInclude(x => x.Skill)
.Include(x => x.UserSkills).ThenInclude(x => x.SkillLevel)
.FirstOrDefault(x => x.Id == userId);
var userSkills = user.UserSkills.Select(z => new {
SkillId = z.SkillId,
SkillLevelId = z.SkillLevelId
}).ToList();
Then I tried the following query:
var lessons = _context.Lessons.AsNoTracking()
.Where(x => x.LessonSkills.All(y =>
userSkills.Any(z => y.SkillId == z.SkillId && y.SkillLevelId <= z.SkillLevelId)))
.ToList();
This query evaluates locally and I get the message:
The LINQ expression 'where (([y].SkillId == [z].SkillId) AndAlso ([y].SkillLevelId <= [z].SkillLevelId))' could not be translated and will be evaluated locally.
I tried to solve it using userSkills instead of user.UserSkills but no luck.
Is there a way to run this query on the server?
Try to limit the use of in-memory collections inside LINQ to Entities queries to Contains on a collection of primitive values, which is currently the only such construct that translates to the server.
Since Contains is not applicable here, don't use the in-memory collection; use the corresponding server-side subquery instead:
var userSkills = context.UserSkills
.Where(x => x.UserId == userId);
var lessons = context.Lessons.AsNoTracking()
.Where(x => x.LessonSkills.All(y =>
userSkills.Any(z => y.SkillId == z.SkillId && y.SkillLevelId <= z.SkillLevelId)))
.ToList();
or even embed the first subquery into the main query:
var lessons = context.Lessons.AsNoTracking()
.Where(x => x.LessonSkills.All(y =>
context.UserSkills.Any(z => z.UserId == userId && y.SkillId == z.SkillId && y.SkillLevelId <= z.SkillLevelId)))
.ToList();
Use Contains on the server then filter further on the client:
var userSkillIds = userSkills.Select(s => s.SkillId).ToList();
var lessons = _context.Lessons.AsNoTracking()
.Where(lsn => lsn.LessonSkills.All(lsnskill => userSkillIds.Contains(lsnskill.SkillId)))
.AsEnumerable() // depending on EF Core translation, may not be needed
.Where(lsn => lsn.LessonSkills.All(lsnskill => userSkills.Any(uskill => uskill.SkillId == lsnskill.SkillId && lsnskill.SkillLevelId <= uskill.SkillLevelId)))
.ToList();
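To check what the All/Any predicate in these answers actually selects, here is a minimal in-memory sketch. The record types and the LessonFilter helper are hypothetical and only mirror the property names used in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes that mirror the property names in the question.
public record UserSkill(int SkillId, int SkillLevelId);
public record LessonSkill(int SkillId, int SkillLevelId);
public record Lesson(string Title, List<LessonSkill> LessonSkills);

public static class LessonFilter
{
    // A lesson qualifies when every skill it requires is held by the user
    // at the same or a higher level.
    public static List<Lesson> Accessible(
        IEnumerable<Lesson> lessons, IReadOnlyCollection<UserSkill> userSkills) =>
        lessons
            .Where(lesson => lesson.LessonSkills.All(required =>
                userSkills.Any(owned =>
                    owned.SkillId == required.SkillId &&
                    required.SkillLevelId <= owned.SkillLevelId)))
            .ToList();
}
```

Note that All returns true for an empty sequence, so a lesson with no required skills always passes this filter, both in memory and in the translated SQL.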
I have the following query:
var enumerable = repository.Elemtents.Where((s) =>
DbFunctions.TruncateTime(s.Timestamp) <= parameter.To.Date &&
DbFunctions.TruncateTime(s.Timestamp) >= parameter.From.Date)
.OrderByDescending((s) => s.Timestamp)
.GroupBy((s) => new {Date = DbFunctions.TruncateTime(s.Timestamp), s.Timestamp.Hour})
.OrderByDescending((s) => s.Key.Date);
I now want to apply paging with Skip() and Take(). My table (protocol entries) can contain a large amount of data. I could do the following, but it would perform poorly because everything is loaded into memory first:
var result = enumerable
.ToList()
.SelectMany((x) => x)
.Skip(0)
.Take(2);
I want to apply Skip() and Take() to the query directly so that the paging is done on the SQL server. If I do the following, I get weird results:
var result = repository.Elemtents.Where((s) =>
DbFunctions.TruncateTime(s.Timestamp) <= parameter.To.Date &&
DbFunctions.TruncateTime(s.Timestamp) >= parameter.From.Date)
.OrderByDescending((s) => s.Timestamp)
.GroupBy((s) => new {Date = DbFunctions.TruncateTime(s.Timestamp), s.Timestamp.Hour})
.OrderByDescending((s) => s.Key.Date)
.Skip(0)
.Take(2)
.ToList();
Does anyone know how to resolve this?
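One possible reading of the weird results is that Skip() and Take() placed after GroupBy() page over groups rather than over rows. Assuming the goal is to page over rows, a sketch is to page the ordered rows first and group only the fetched page in memory. The Entry type and EntryPaging helper below are hypothetical, and the DbFunctions.TruncateTime comparison is swapped for an equivalent half-open date range so the example runs on plain LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for a protocol entry.
public record Entry(int Id, DateTime Timestamp);

public static class EntryPaging
{
    public static List<IGrouping<(DateTime Date, int Hour), Entry>> Page(
        IQueryable<Entry> entries, DateTime from, DateTime to, int skip, int take)
    {
        // Same day range as the TruncateTime comparison: [from.Date, to.Date + 1 day).
        var fromDate = from.Date;
        var toExclusive = to.Date.AddDays(1);

        // Page over rows on the query side; only the requested rows are materialized.
        var page = entries
            .Where(e => e.Timestamp >= fromDate && e.Timestamp < toExclusive)
            .OrderByDescending(e => e.Timestamp)
            .Skip(skip)
            .Take(take)
            .ToList();

        // Group the small page in memory by date and hour.
        return page
            .GroupBy(e => (e.Timestamp.Date, e.Timestamp.Hour))
            .ToList();
    }
}
```

A group can be split across two pages with this approach, which may or may not be acceptable depending on how the groups are displayed.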
I have a massive LINQ query that fetches information that looks like this:
In other words, first-level categories, which own second-level categories, which own third level categories. For each category we retrieve the number of listings it contains.
Here is the query:
categories = categoryRepository
.Categories
.Where(x => x.ParentID == null)
.Select(x => new CategoryBrowseIndexViewModel
{
CategoryID = x.CategoryID,
FriendlyName = x.FriendlyName,
RoutingName = x.RoutingName,
ListingCount = listingRepository
.Listings
.Where(y => y.SelectedCategoryOneID == x.CategoryID
&& y.Lister.Status != Subscription.StatusEnum.Cancelled.ToString())
.Count(),
BrowseCategoriesLevelTwoViewModels = categoryRepository
.Categories
.Where(a => a.ParentID == x.CategoryID)
.Select(a => new BrowseCategoriesLevelTwoViewModel
{
CategoryID = a.CategoryID,
FriendlyName = a.FriendlyName,
RoutingName = a.RoutingName,
ParentRoutingName = x.RoutingName,
ListingCount = listingRepository
.Listings
.Where(n => n.SelectedCategoryTwoID == a.CategoryID
&& n.Lister.Status != Subscription.StatusEnum.Cancelled.ToString())
.Count(),
BrowseCategoriesLevelThreeViewModels = categoryRepository
.Categories
.Where(b => b.ParentID == a.CategoryID)
.Select(b => new BrowseCategoriesLevelThreeViewModel
{
CategoryID = b.CategoryID,
FriendlyName = b.FriendlyName,
RoutingName = b.RoutingName,
ParentRoutingName = a.RoutingName,
ParentParentID = x.CategoryID,
ParentParentRoutingName = x.RoutingName,
ListingCount = listingRepository
.Listings
.Where(n => n.SelectedCategoryThreeID == b.CategoryID
&& n.Lister.Status != Subscription.StatusEnum.Cancelled.ToString())
.Count()
})
.Distinct()
.OrderBy(b => b.FriendlyName)
.ToList()
})
.Distinct()
.OrderBy(a => a.FriendlyName)
.ToList()
})
.Distinct()
.OrderBy(x => x.FriendlyName == jobVacanciesFriendlyName)
.ThenBy(x => x.FriendlyName == servicesLabourHireFriendlyName)
.ThenBy(x => x.FriendlyName == goodsEquipmentFriendlyName)
.ToList();
This was fast enough on my dev machine, but alas! Deployed to Azure it's very slow. The reason seems to be that this query is making hundreds of dependency calls to the database, I'm pretty sure because of the immediate execution of the Count statements. Although the app and the database are in the same datacenter, the calls add up in a way they didn't on my dev machine (~40s vs < 1s). So what I'd like to do is send this whole thing off to the database, let it crunch, and get it all back in one hit, if it's possible. How do I do this? Also if I'm approaching this whole thing wrong please tell me. This is the biggest bottleneck in my web app so any help to make it more efficient is appreciated. Thank you! (I'm less concerned about web app memory usage than I am about the cumulative effect of all the database calls.)
Here are my suggestions for your massive query:
Don't use ToList() inside the inner queries.
Don't use Count() inside the inner queries.
Try to retrieve all the data in one go, without the IEnumerable operations above. In other words, keep the query as an IQueryable until the final fetch. Once the data is loaded into the app's memory, you can build your data model however you wish. This should give your app a big performance boost, so try it and let us know.
Update: about Count()
If the list has a lot of columns, fetch just one column using a projection, without Count(). After that you can call Count() on the resulting IEnumerable, in other words in your app's memory after fetching the data from the db.
Here's what I've got so far. It's working really well, but I'm still curious whether I can do this in one DB trip instead of two. That seems to be complicated by the fact that each repository has its own DbContext. If you guys have any more thoughts I'd be more than happy to upvote you.
var allCategories = categoryRepository
.Categories
.Select(x => new
{
x.CategoryID,
x.FriendlyName,
x.RoutingName,
x.ParentID
})
.ToList();
var allListings = listingRepository
.Listings
.Where(x => x.Lister.Status != Subscription.StatusEnum.Cancelled.ToString())
.Select(x => new
{
x.SelectedCategoryOneID,
x.SelectedCategoryTwoID,
x.SelectedCategoryThreeID,
})
.ToList();
categories =
allCategories
.Where(x => x.ParentID == null)
.Select(a => new CategoryBrowseIndexViewModel
{
CategoryID = a.CategoryID,
FriendlyName = a.FriendlyName,
RoutingName = a.RoutingName,
ListingCount = allListings
.Where(x => x.SelectedCategoryOneID == a.CategoryID)
.Count(),
BrowseCategoriesLevelTwoViewModels =
allCategories
.Where(x => x.ParentID == a.CategoryID)
.Select(b => new BrowseCategoriesLevelTwoViewModel
{
CategoryID = b.CategoryID,
FriendlyName = b.FriendlyName,
RoutingName = b.RoutingName,
ParentRoutingName = a.RoutingName,
ListingCount = allListings
.Where(x => x.SelectedCategoryTwoID == b.CategoryID)
.Count(),
BrowseCategoriesLevelThreeViewModels =
allCategories
.Where(x => x.ParentID == b.CategoryID)
.Select(c => new BrowseCategoriesLevelThreeViewModel
{
CategoryID = c.CategoryID,
FriendlyName = c.FriendlyName,
RoutingName = c.RoutingName,
ParentRoutingName = b.RoutingName,
ParentParentID = a.CategoryID,
ParentParentRoutingName = a.RoutingName,
ListingCount = allListings
.Where(x => x.SelectedCategoryThreeID == c.CategoryID)
.Count()
})
.OrderBy(x => x.FriendlyName)
})
.OrderBy(x => x.FriendlyName)
})
.OrderBy(x => x.FriendlyName == jobVacanciesFriendlyName)
.ThenBy(x => x.FriendlyName == servicesLabourHireFriendlyName)
.ThenBy(x => x.FriendlyName == goodsEquipmentFriendlyName);
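In the in-memory version above, every category rescans allListings to get its count. A possible refinement is to tally each category column once and look the counts up afterwards. This is a sketch only; the ListingCounts helper is hypothetical, and it assumes the SelectedCategory...ID columns are nullable ints.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListingCounts
{
    // Tallies how often each non-null category id occurs, in a single pass.
    // Assumption: the category id columns are of type int?.
    public static Dictionary<int, int> Tally(IEnumerable<int?> categoryIds) =>
        categoryIds
            .Where(id => id.HasValue)
            .GroupBy(id => id.Value)
            .ToDictionary(g => g.Key, g => g.Count());

    // Returns the tallied count, or 0 for a category with no listings.
    public static int For(Dictionary<int, int> counts, int categoryId) =>
        counts.TryGetValue(categoryId, out var n) ? n : 0;
}
```

Under those assumptions, one dictionary would be built per level, for example from allListings.Select(x => x.SelectedCategoryOneID), and each ListingCount would become a call to ListingCounts.For with that dictionary and the category's CategoryID.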
I'm working on a report right now that runs great with our on-premises DB (just refreshed from PROD). However, when I deploy the site to Azure, I get a SQL Timeout during its execution. If I point my development instance at the SQL Azure instance, I get a timeout as well.
Goal: To output a list of customers that have had an activity created during the search range, and when that customer is found, get some other information about that customer regarding policies, etc. I've removed some of the properties below for brevity (as best I can)...
UPDATE
After lots of trial and error, I can get the entire query to run fairly consistently within 1000 ms so long as this block of code is not executed.
CurrentStatus = a.Activities
.Where(b => b.ActivityType.IsReportable)
.OrderByDescending(b => b.DueDateTime)
.Select(b => b.Status.Name)
.FirstOrDefault(),
With this code in place, things begin to go haywire. I think this Where clause is a big part of it: .Where(b => b.ActivityType.IsReportable). What is the best way to grab the status name?
EXISTING CODE
Any thoughts as to why SQL Azure would time out whereas on-premises would turn this around in less than 100 ms?
return db.Customers
.Where(a => a.Activities.Where(
b => b.CreatedDateTime >= search.BeginDateCreated
&& b.CreatedDateTime <= search.EndDateCreated).Count() > 0)
.Where(a => a.CustomerGroup.Any(d => d.GroupId== search.GroupId))
.Select(a => new CustomCustomerReport
{
CustomerId = a.Id,
Manager = a.Manager.Name,
Customer = a.FirstName + " " + a.LastName,
ContactSource= a.ContactSource!= null ? a.ContactSource.Name : "Unknown",
ContactDate = a.DateCreated,
NewSale = a.Sales
.Where(p => p.Employee.IsActive)
.OrderByDescending(p => p.DateCreated)
.Select(p => new PolicyViewModel
{
//MISC PROPERTIES
}).FirstOrDefault(),
ExistingSale = a.Sales
.Where(p => p.CancellationDate == null || p.CancellationDate <= myDate)
.Where(p => p.SaleDate < myDate)
.OrderByDescending(p => p.DateCreated)
.Select(p => new SalesViewModel
{
//MISC PROPERTIES
}).FirstOrDefault(),
CurrentStatus = a.Activities
.Where(b => b.ActivityType.IsReportable)
.OrderByDescending(b => b.DueDateTime)
.Select(b => b.Disposition.Name)
.FirstOrDefault(),
CustomerGroup = a.CustomerGroup
.Where(cd => cd.GroupId == search.GroupId)
.Select(cd => new GroupViewModel
{
//MISC PROPERTIES
}).FirstOrDefault()
}).ToList();
I cannot give you a definite answer but I would recommend approaching the problem by:
Run SQL profiler locally when this code is executed and see what SQL is generated and run. Look at the query execution plan for each query and look for table scans and other slow operations. Add indexes as needed.
Check your lambdas for things that cannot be easily translated into SQL. You might be pulling the contents of a table into memory and running lambdas on the results, which will be very slow. Change your lambdas or consider writing raw SQL.
Is the Azure database the same as your local database? If not, pull the data locally so your local system is indicative.
Remove sections one at a time (i.e. CustomerGroup, then CurrentStatus, then ExistingSale, then NewSale) and check after each removal whether performance improves significantly. Focus on the section whose removal made the difference.
Looking at the query itself:
You use ".Count() > 0" on line 4. Use ".Any()" instead: Count() has to count every matching row to give you an exact number, whereas Any() translates to EXISTS and only needs to know whether at least one row satisfies the condition.
Ensure fields referenced in where clauses have indexes, such as IsReportable.
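The difference between ".Count() > 0" and ".Any()" can be seen with a small in-memory analogy. This sketch counts how many source items each check pulls; the AnyVersusCount helper is hypothetical, and a database optimizer may behave differently from LINQ to Objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Runs "Count() > 0" and reports how many source items were visited.
    public static (bool Result, int Visited) WithCount(
        IEnumerable<int> source, Func<int, bool> predicate)
    {
        int visited = 0;
        bool result = Track(source, () => visited++).Where(predicate).Count() > 0;
        return (result, visited);
    }

    // Runs "Any()" and reports how many source items were visited.
    public static (bool Result, int Visited) WithAny(
        IEnumerable<int> source, Func<int, bool> predicate)
    {
        int visited = 0;
        bool result = Track(source, () => visited++).Any(predicate);
        return (result, visited);
    }

    // Passes items through unchanged and invokes onVisit for each one pulled.
    private static IEnumerable<T> Track<T>(IEnumerable<T> source, Action onVisit)
    {
        foreach (var item in source)
        {
            onVisit();
            yield return item;
        }
    }
}
```

For a sequence where the first item already matches, WithAny stops after one item while WithCount walks the whole sequence.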
Short answer: use memory.
Long answer:
Because of either bad maintenance plans or limited hardware, running this query in one big lump is what's causing it to fail on Azure. Even if that weren't the case, because of all the navigation properties you're using, this query would generate a staggering number of joins. The answer here is to break it down in smaller pieces that Azure can run. I'm going to try to rewrite your query into multiple smaller, easier to digest queries that use the memory of your .NET application. Please bear with me as I make (more or less) educated guesses about your business logic/db schema and rewrite the query accordingly. Sorry for using the query form of LINQ but I find things such as join and group by are more readable in that form.
var activityFilterCustomerIds = db.Activities
.Where(a =>
a.CreatedDateTime >= search.BeginDateCreated &&
a.CreatedDateTime <= search.EndDateCreated)
.Select(a => a.CustomerId)
.Distinct()
.ToList();
var groupFilterCustomerIds = db.CustomerGroups
.Where(g => g.GroupId == search.GroupId)
.Select(g => g.CustomerId)
.Distinct()
.ToList();
var customers = db.Customers
.AsNoTracking()
.Where(c =>
activityFilterCustomerIds.Contains(c.Id) &&
groupFilterCustomerIds.Contains(c.Id))
.ToList();
var customerIds = customers.Select(x => x.Id).ToList();
var newSales =
(from s in db.Sales
where customerIds.Contains(s.CustomerId)
&& s.Employee.IsActive
group s by s.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Sale = grouped
.OrderByDescending(x => x.DateCreated)
.Select(x => new PolicyViewModel
{
// properties
})
.FirstOrDefault()
}).ToList();
var existingSales =
(from s in db.Sales
where customerIds.Contains(s.CustomerId)
&& (s.CancellationDate == null || s.CancellationDate <= myDate)
&& s.SaleDate < myDate
group s by s.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Sale = grouped
.OrderByDescending(x => x.DateCreated)
.Select(x => new SalesViewModel
{
// properties
})
.FirstOrDefault()
}).ToList();
var currentStatuses =
(from a in db.Activities.AsNoTracking()
where customerIds.Contains(a.CustomerId)
&& a.ActivityType.IsReportable
group a by a.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Status = grouped
.OrderByDescending(x => x.DueDateTime)
.Select(x => x.Disposition.Name)
.FirstOrDefault()
}).ToList();
var customerGroups =
(from cg in db.CustomerGroups
where cg.GroupId == search.GroupId
group cg by cg.CustomerId into grouped
select new
{
CustomerId = grouped.Key,
Group = grouped
.Select(x =>
new GroupViewModel
{
// ...
})
.FirstOrDefault()
}).ToList();
return customers
.Select(c =>
new CustomCustomerReport
{
// ... simple props
// ...
// ...
NewSale = newSales
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Sale)
.FirstOrDefault(),
ExistingSale = existingSales
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Sale)
.FirstOrDefault(),
CurrentStatus = currentStatuses
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Status)
.FirstOrDefault(),
CustomerGroup = customerGroups
.Where(s => s.CustomerId == c.Id)
.Select(x => x.Group)
.FirstOrDefault(),
})
.ToList();
It's hard to suggest anything without seeing the actual table definitions, especially the indexes and foreign keys on the Activities entity.
As far as I understand, Activity is (CustomerId, ActivityTypeId, DueDateTime, DispositionId). If this is a standard warehousing table (DateTime, ClientId, Activity), I'd suggest the following:
If the number of Activities is reasonably small, force the use of Contains (which translates to a SQL IN clause) by
var activities = db.Activities.Where( x => x.IsReportable ).ToList();
...
.Where( b => activities.Contains(b.Activity) )
You can even help the optimiser by specifying that you want ActivityId.
Indexes on the Activity entity should be up to date. For this particular query I suggest (CustomerId, ActivityId, DueDateTime DESC).
Precache the Disposition table; my crystal ball tells me that it's a dictionary (lookup) table.
For a similar task, to avoid constantly hitting the Activity table, I made another small table (CustomerId, LastActivity, LastValue) and updated it whenever the status changed.