I use Entity Framework 6 and I currently have a query with many includes which loads about 1200 entities into the DbContext. Loading the entities seems to be quite slow, as the query takes almost a minute. Is there anything I can do about the performance? I have 4 such queries, which together take 2.5 minutes to load.
Lazy loading is enabled, but for performance reasons I preload the entities.
var report = DbContext.REPORT.Single(r => r.ID == reportId);
//this query takes a bit less than 1 minute
DbContext.REPORT_ELEMENT
.Include(re => re.LAYOUT)
.Include(re => re.PAGEMASTER)
.Include(re => re.REPORT_ELEMENTS)
.Include(re => re.SUBTITLE_CONTENT)
.Include(re => re.REPORT_ELEMENT_NOTE)
.Include("SUBTITLE_CONTENT.CONTENT_ELEMENT.LANGUAGE")
.Include("TITLE_CONTENT.CONTENT_ELEMENT.LANGUAGE")
.Where(re => re.REPORT_ID == report.ID)
.Load();
Performance suggestions:
Prevent tracking. Query in read-only mode.
Prevent getting too much data in one query. Try to page it.
Prevent include. The query has too many Includes which makes performance bad.
Prevent tracking
Consider adding AsNoTracking. A read-only query skips change tracking, which makes it faster.
Reference: https://learn.microsoft.com/en-us/ef/core/querying/tracking#no-tracking-queries
Only get the data you need
The key reason your query is slow is that it returns too much data. Consider adding Skip() and Take(200) so that you fetch only the data you need, or only what the current page requires. Use a pager to generate the report. This might help a lot.
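To make the paging suggestion concrete, here is a minimal sketch. It runs against an in-memory sequence rather than a real DbContext, and the page size and page index are made-up values; with EF the same Skip/Take chain would be applied to the IQueryable and translated to SQL.

```csharp
using System;
using System.Linq;

// Stand-in for a table: 1200 element IDs. With EF this would be an
// IQueryable such as DbContext.REPORT_ELEMENT (assumption: not used here).
var elementIds = Enumerable.Range(1, 1200).AsQueryable();

const int pageSize = 200;  // hypothetical page size
const int pageIndex = 2;   // zero-based, so this is the third page

// Order first: EF6 only supports Skip on sorted input, and a stable
// order keeps pages from overlapping between requests.
var page = elementIds
    .OrderBy(id => id)
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();

Console.WriteLine($"{page.Count} rows, IDs {page.First()}..{page.Last()}");
// prints "200 rows, IDs 401..600"
```

Whether paging fits depends on the report: if all 1200 entities are needed at once, it only spreads the cost over several round trips.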
Prevent Include
Include generates SQL that joins and selects from multiple tables, which greatly increases complexity. Instead, you can select only the data you need and avoid calling Include at all.
For example, if you only want to get the last ball in the box, consider writing like this:
public class Box
{
public int Id { get; set; }
public IEnumerable<Ball> Balls { get; set; }
}
public class Ball
{
public int Id { get; set; }
public int BoxId { get; set; }
public Box Box { get; set; }
// Needed by the OrderByDescending in the query below
public DateTime CreationTime { get; set; }
}
var boxes = await Boxes
// DO NOT Call Include(t => t.Balls) here!
.Where(somecondition)
.Select(t => new Box(){
Id = t.Id,
Balls = t.Balls.OrderByDescending(x => x.CreationTime)
.Take(1) // Only get what you need
})
.ToListAsync();
Also when we use Select we can remove .Include because it won’t have any effect here.
Disclaimer: I'm the owner of the project Entity Framework Plus
The Query IncludeOptimized feature lets you filter included entities and optimize query performance at the same time.
It usually improves performance by splitting the query into smaller queries.
DbContext.REPORT_ELEMENT
.IncludeOptimized(re => re.LAYOUT)
.IncludeOptimized(re => re.PAGEMASTER)
.IncludeOptimized(re => re.REPORT_ELEMENTS)
.IncludeOptimized(re => re.SUBTITLE_CONTENT)
.IncludeOptimized(re => re.REPORT_ELEMENT_NOTE)
.IncludeOptimized(re => re.SUBTITLE_CONTENT.Select(sc => sc.CONTENT_ELEMENT)) // SelectMany?
.IncludeOptimized(re => re.SUBTITLE_CONTENT.Select(sc => sc.CONTENT_ELEMENT).Select(ce => ce.LANGUAGE)) // SelectMany?
.IncludeOptimized(re => re.TITLE_CONTENT)
.IncludeOptimized(re => re.TITLE_CONTENT.Select(tc => tc.CONTENT_ELEMENT)) // SelectMany?
.IncludeOptimized(re => re.TITLE_CONTENT.Select(tc => tc.CONTENT_ELEMENT).Select(ce => ce.LANGUAGE)) // SelectMany?
.Where(re => re.REPORT_ID == report.ID)
.Load();
Documentation: EF+ Query IncludeOptimized
In addition to the advice from Anduin, I would suggest splitting the Includes up into several distinct queries. EF will keep track of the references between the entities within the same DbContext. As a general rule of thumb, do not use more than three Includes in the same query. Also ensure that you have an index in the DB for every resulting JOIN.
To be able to do so, you must expose your FK fields in the entities in addition to the navigation properties.
Your initial query would become something like this:
DbContext.LAYOUT
.Where(re => re.LAYOUT_ID == report.LAYOUT_FK)
.Load();
DbContext.PAGEMASTER
.Where(re => re.PAGEMASTER_ID == report.PAGEMASTER_FK)
.Load();
I have an Entity Framework database that I'm querying, so I'm using linq-to-entities.
Here's my query:
// 'Find' is just a wrapper method that returns IQueryable
var q = r.Find(topic =>
topic.PageId != null &&
!topic.Page.IsDeleted &&
topic.Page.IsActive)
// These are standard EF extension methods, which are used to include
// linked tables. Note: Page_Topic has a one-to-many relationship with topic.
.Include(topic => topic.Page.Route)
.Include(topic => topic.Page_Topic.Select(pt => pt.Page.Route))
// HERE'S THE QUESTION: This select statement needs to flatten Page_Topic (which it does). But it seems to do it in the wrong place. To explain, if I were to include another column that depended on Page_Topic (for example, 'PillarRoutName2'), I'd have to apply the same flattening logic to that column too. Surely the filtering of Page_Topic should be done higher up the query in a DRY way.
.Select(x => new
{
TopicName = x.Name,
HubRouteName = x.Page.Route.Name,
PillarRouteName = x.Page_Topic.FirstOrDefault(y => y.IsPrimary).Page.Route.Name
}).ToList();
Surely the filtering of Page_Topic should be done higher up the query in a DRY way.
Correct! And it's easy to do this:
.Select(x => new
{
TopicName = x.Name,
HubRouteName = x.Page.Route.Name,
FirstTopic = x.Page_Topic.FirstOrDefault(y => y.IsPrimary)
})
.Select(x => new
{
TopicName = x.TopicName,
HubRouteName = x.HubRouteName,
PillarRouteName = x.FirstTopic.Page.Route.Name,
PillarRoutName2 = x.FirstTopic. ...
}).ToList();
Depending on where you start to get properties from FirstTopic you can also use x.Page_Topic.FirstOrDefault(y => y.IsPrimary).Page or .Page.Route in the first part.
Note that you don't need the Includes. They will be ignored because the query is a projection (Select(x => new ...)).
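For illustration, the two-stage projection can be tried out in memory. This is only a sketch: the tuples stand in for the Topic and Page_Topic entities, the names and values are invented, and it uses First because value tuples have no null default (the EF query above would keep FirstOrDefault).

```csharp
using System;
using System.Linq;

// In-memory stand-in for topics and their Page_Topic links (made-up data).
var topics = new[]
{
    (Name: "Cats", Links: new[]
    {
        (IsPrimary: false, Route: "pets", Title: "Pets"),
        (IsPrimary: true, Route: "felines", Title: "Felines"),
    }),
    (Name: "Dogs", Links: new[]
    {
        (IsPrimary: true, Route: "canines", Title: "Canines"),
    }),
};

var rows = topics
    // Stage 1: pick the primary link exactly once per topic.
    .Select(t => new { TopicName = t.Name, Primary = t.Links.First(l => l.IsPrimary) })
    // Stage 2: read as many columns from it as needed, without repeating the filter.
    .Select(x => new
    {
        x.TopicName,
        PillarRouteName = x.Primary.Route,
        PillarTitle = x.Primary.Title,
    })
    .ToList();

foreach (var r in rows)
    Console.WriteLine($"{r.TopicName}: {r.PillarRouteName} ({r.PillarTitle})");
```

The point of the intermediate anonymous type is that the IsPrimary filter appears in one place, however many columns the second Select reads.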
I'm trying to use Include() in a query but it doesn't work as I expect.
I have a table called Bes which has these columns:
Id
Name
Pesta
CeId
OpId
The CeId and OpId are foreign keys to these tables:
Ces:
Id
Name
Ops:
Id
Name
I want to run this query (which works but doesn't fill the Ces and Ops tables)
var bes = await _context.Bes
.Include(x => x.Ops)
.Include(x => x.Ces)
.Where(x => besWithXPs.Contains(x.Pesta))
.GroupBy(x => x.Pesta)
.Select(x => x.First())
.ToListAsync();
I tried this query without the Select(), and then Ops and Ces were filled, but I only want one Bes per Pesta (that's why I use the GroupBy and Select).
Does anyone know what is happening?
Btw I'm using Entity Framework Core with .NET Core 2
I think you should add
[JsonIgnore]
public Ces Ce { get; set; }
[JsonIgnore]
public Ods Od { get; set; }
to your model so that they can be included, and then do:
var bes = await _context.Bes.AsNoTracking()
.Include(x => x.Ops)
.Include(x => x.Ces)
.Where(x => besWithXPs.Contains(x.Pesta))
.GroupBy(x => x.Pesta)
.FirstOrDefaultAsync();
1 - What AsNoTracking Does:
Entity Framework exposes a number of performance tuning options to help you optimise the performance of your applications. One of these tuning options is .AsNoTracking(). This optimisation allows you to tell Entity Framework not to track the results of a query. This means that Entity Framework performs no additional processing or storage of the entities which are returned by the query. However, it also means that you can't update these entities without reattaching them to the tracking graph.
2 - Change ".Select(x => x.First()).ToListAsync();" to ".FirstOrDefaultAsync();". Use .Select only if you want to get some of the properties or if you need a new class filled with the data that you receive.
In this query:
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters
.OrderBy(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName)
.Select(p => p.Last());
}
I had to switch it to this for it to work
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters
.OrderByDescending(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName)
.Select(p => p.FirstOrDefault());
}
I couldn't even use p.First(), to mirror the first query.
Why are there such basic limitations in what's otherwise such a robust ORM system?
That limitation comes down to the fact that eventually it has to translate that query to SQL and SQL has a SELECT TOP (in T-SQL) but not a SELECT BOTTOM (no such thing).
There is an easy way around it though, just order descending and then do a First(), which is what you did.
EDIT:
Other providers will possibly have different implementations of SELECT TOP 1, on Oracle it would probably be something more like WHERE ROWNUM = 1
EDIT:
Another less efficient alternative - I DO NOT recommend this! - is to call .ToList() on your data before .Last(), which will immediately execute the LINQ To Entities Expression that has been built up to that point, and then your .Last() will work, because at that point the .Last() is effectively executed in the context of a LINQ to Objects Expression instead. (And as you pointed out, it could bring back thousands of records and waste loads of CPU materialising objects that will never get used)
Again, I would not recommend this second approach, but it does help illustrate the difference between where and when the LINQ expression is executed.
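As a quick sanity check that the descending-plus-First form really returns the same rows as ascending-plus-Last, here is a small in-memory sketch. The data is made up, and the ordering is done inside each group, which avoids depending on whether an OrderBy placed before GroupBy survives translation to SQL (that part is my assumption, not something stated above).

```csharp
using System;
using System.Linq;

// Made-up stand-in for ServerOnlineCharacters.
var rows = new[]
{
    (RawName: "alice", Seen: new DateTime(2024, 1, 1)),
    (RawName: "bob",   Seen: new DateTime(2024, 1, 2)),
    (RawName: "alice", Seen: new DateTime(2024, 1, 3)),
    (RawName: "bob",   Seen: new DateTime(2024, 1, 4)),
};

// Ascending + Last(): fine in LINQ to Objects, but Last() is the call
// LINQ to Entities cannot translate.
var viaLast = rows
    .GroupBy(r => r.RawName)
    .Select(g => g.OrderBy(r => r.Seen).Last())
    .ToList();

// Descending + First(): same result, and a shape that maps onto TOP 1.
var viaFirst = rows
    .GroupBy(r => r.RawName)
    .Select(g => g.OrderByDescending(r => r.Seen).First())
    .ToList();

Console.WriteLine(viaLast.SequenceEqual(viaFirst)); // prints "True"
```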
Instead of Last(), try this:
model.OrderByDescending(o => o.Id).FirstOrDefault();
Replace Last() with the LINQ expression OrderByDescending(x => x.ID).Take(1).Single().
Something like this would work if you prefer to do it in LINQ:
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters.OrderBy(p => p.ServerStatus.ServerDateTime).GroupBy(p => p.RawName).Select(p => p.OrderByDescending(x => x.Id).Take(1).Single());
}
Yet another way to get the last element, without OrderByDescending and without loading all entities:
dbSet
.Where(f => f.Id == dbSet.Max(f2 => f2.Id))
.FirstOrDefault();
That's because LINQ to Entities (and databases in general) does not support all the LINQ methods (see here for details: http://msdn.microsoft.com/en-us/library/bb738550.aspx)
What you need here is to order your data in such a way that the "last" record becomes "first", and then you can use FirstOrDefault. Note that databases usually don't have such concepts as "first" and "last"; it's not like the most recently inserted record will be "last" in the table.
This method can solve your problem
db.databaseTable.OrderByDescending(obj => obj.Id).FirstOrDefault();
Adding a single function AsEnumerable() before Select function worked for me.
Example:
return context.ServerOnlineCharacters
.OrderByDescending(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName).AsEnumerable()
.Select(p => p.FirstOrDefault());
Ref:
https://www.codeproject.com/Questions/1005274/LINQ-to-Entities-does-not-recognize-the-method-Sys
What is the most efficient way to order a LocalDb table in descending order by four columns? I have a table that tracks a file storage hierarchy. Four folders act like an odometer (one digit for each folder). The table reflects this as a "storage item." I need to find the highest number using all four folders.
Here is the code I am currently using. I am worried that it is not efficient or accurate for a LocalDb database...
public StorageItem GetLastItem()
{
var item = _context.StorageItems.AsNoTracking()
.OrderByDescending(x => x.LevelA) // int
.OrderByDescending(x => x.LevelB) // int
.OrderByDescending(x => x.LevelC) // int
.OrderByDescending(x => x.ItemNumber) // int
.Where(x => !x.AuditDateDeleted.HasValue) // DateTime?
.FirstOrDefault();
// Caching logic here
return item;
}
I don't think it'll be inefficient, but chaining a bunch of OrderByDescendings is probably not what you intended to do. Currently, this should generate a SQL ORDER BY clause of ItemNumber DESC, LevelC DESC, LevelB DESC, LevelA DESC. I think you want to use ThenByDescending...
var item = _context.StorageItems.AsNoTracking()
.Where(x => !x.AuditDateDeleted.HasValue)
.OrderByDescending(x => x.LevelA)
.ThenByDescending(x => x.LevelB)
.ThenByDescending(x => x.LevelC)
.ThenByDescending(x => x.ItemNumber)
.FirstOrDefault();
I also moved the Where clause higher up, although I think the database should be smart enough to optimize that.
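The difference between the two chains is easy to see on a small in-memory sample. This sketch uses LINQ to Objects with invented values rather than LocalDb, but the ordering semantics are the same: each extra OrderByDescending re-sorts by its own key, while ThenByDescending only breaks ties.

```csharp
using System;
using System.Linq;

// Made-up storage items; the "odometer" maximum is (2, 0, 0, 1).
var items = new[]
{
    (LevelA: 1, LevelB: 9, LevelC: 9, ItemNumber: 9),
    (LevelA: 2, LevelB: 0, LevelC: 0, ItemNumber: 1),
    (LevelA: 2, LevelB: 0, LevelC: 0, ItemNumber: 0),
};

// Chained OrderByDescending: the last call (ItemNumber) becomes the primary key.
var chained = items
    .OrderByDescending(x => x.LevelA)
    .OrderByDescending(x => x.LevelB)
    .OrderByDescending(x => x.LevelC)
    .OrderByDescending(x => x.ItemNumber)
    .First();

// ThenByDescending: LevelA stays primary and the others break ties in turn.
var odometer = items
    .OrderByDescending(x => x.LevelA)
    .ThenByDescending(x => x.LevelB)
    .ThenByDescending(x => x.LevelC)
    .ThenByDescending(x => x.ItemNumber)
    .First();

Console.WriteLine($"chained:  A={chained.LevelA} Item={chained.ItemNumber}");
Console.WriteLine($"odometer: A={odometer.LevelA} Item={odometer.ItemNumber}");
```

With this sample the chained version picks the item under LevelA = 1 because it has the largest ItemNumber, which is not the highest position in the folder hierarchy.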