I have the query below, which is tremendously slow. I am new to Entity Framework and I believe it has something to do with Eager Loading, Lazy Loading or Explicit Loading. I need help optimizing the C# statement below.
var queryResult = CurrentSession.Set<SomeType_T>().Include(a => a.SomeType1_T)
.Include(a => a.SomeType1_T.Catalog_Type_T)
.Include(a => a.SomeType1_T.SomeType4_T)
.Include(a => a.SomeType1_T.SomeType2_T)
.Include("SomeType1_T.SomeType2_T.SomeType3_T")
.Include(a => a.SomeType1_T.SomeType4_T.SomeType5_T)
.Include(a => a.SomeType1_T.SomeType5_T)
.Include(a => a.SomeType1_T.Questions_T)
.Include(a => a.SomeType1_T.Questions_T.Question_Type_T)
.Include(a => a.SomeType1_T.Members_T)
.Include(b => b.SomeMasterType_T)
.Include(b => b.SomeMasterType_T.SomeMasterType1_T)
.Include(c => c.SomeType6_T)
.Include(d => d.SomeType7_T)
.Include(d => d.SomeType8_T)
.Include(d => d.SomeType8_T1)
.Where(t => t.SomeType9_T == _MatchThisKey);
You can improve the performance of a query with many includes by splitting it into two or more smaller requests to the database, as shown below. In my experience, you should use at most 2 includes per query; more than that gives really bad performance. Yes, this is ugly, but it gives a very good performance improvement. Try it and see for yourself :)
Note : This is just an example.
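// First round trip: load the user with two of its collections.
var userData = db.Users
    .Include(p => p.UserSkills)
    .Include(p => p.UserIdeas)
    .FirstOrDefault();
// Second round trip: load the remaining collections.
userData = db.Users
    .Include(p => p.UserFriends)
    .Include(p => p.UserFriends1)
    .FirstOrDefault();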
The above brings back smaller result sets by making more than one round trip to the database.
Another option is to use asynchronous loading of your collections if you don't need all your data up front.
For example:
var initialResult = db.Person.Include(c=>c.FirstCollection).First();
var load1 = db.Entry(initialResult).Collection(c=>c.SecondCollection).LoadAsync();
//do all the work you can
await load1;
//continue with more work
You should also consider .AsNoTracking() if you don't plan on editing and saving the entities back to the db. It gives a small performance boost but won't cache entities for future queries.
If you are going to explicitly handle all the collection loading eagerly or in code later on then use these too as they also give a small performance boost.
db.Configuration.LazyLoadingEnabled = false;
db.Configuration.ProxyCreationEnabled = false;
Every Include() call translates to a SQL JOIN, and the number of joins in your example is substantial. If you really need to perform all of them, I'd optimize the indexes by looking at the DB engine's execution plan.
I have a LINQ query like this in EF Core 2.0. It works as it is, but when I upgrade to EF Core 3.0 it always times out. I traced the issue to query = query.Include(x => x.Questions);.
My question: I would like to return the course with filtered questions, for example only Take(10), or with a .Where condition that returns only a certain range rather than all questions.
var query = _courseRepository.Table;
query = query.Where(x => x.Id == id);
query = query.Include(x => x.Questions);
query = query.Include(x => x.CourseYear);
query = query.Include(x => x.CourseSubject);
query = query.Include(x => x.Instructors).ThenInclude(y => y.User);
query = query.Include(x => x.Instructors).ThenInclude(y => y.Course);
query = query.Include(x => x.Instructors).ThenInclude(y => y.CourseClass);
query = query.Include(x => x.CourseSections);
query = query.Include(x => x.CourseSections).ThenInclude(y => y.Lessons);
query = query.Include(x => x.CourseClasses);
query = query.Include(x => x.UserCourses).ThenInclude(y => y.User);
var result = query.FirstOrDefault();
EF Core 3.0 changed the queries generated by .Include(), and you are experiencing the cartesian explosion problem.
Specifically there is the following Red Caution in the Docs now:
Caution
Since version 3.0.0, each Include will cause an additional JOIN to be
added to SQL queries produced by relational providers, whereas
previous versions generated additional SQL queries. This can
significantly change the performance of your queries, for better or
worse. In particular, LINQ queries with an exceedingly high number of
Include operators may need to be broken down into multiple separate
LINQ queries in order to avoid the cartesian explosion problem.
The solution is to execute multiple queries now per the docs.
It's super unfortunate that loading entity graphs, which is common with highly normalized data, performs so poorly, but this is the current state of EF.
See: Loading Related Data and scroll until you see red.
var query = _courseRepository.Table
.Include(x => x.Questions)
.Include(x => x.CourseClasses)
.Include(x => x.CourseYear)
.Include(x => x.CourseSubject);
var course = await query.FirstOrDefaultAsync(x => x.Id == id);
query.Include(x => x.Instructors).ThenInclude(y => y.User).SelectMany(a => a.Instructors).Load();
query.Include(x => x.Instructors).ThenInclude(y => y.Course).SelectMany(a => a.Instructors).Load();
query.Include(x => x.Instructors).ThenInclude(y => y.CourseClass).SelectMany(a => a.Instructors).Load();
query.Include(x => x.CourseSections).ThenInclude(y => y.Lessons).SelectMany(a => a.CourseSections).Load();
query.Include(x => x.UserCourses).ThenInclude(y => y.User).SelectMany(a => a.UserCourses).Load();
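To see why a single joined query degrades so quickly, the row-count arithmetic can be sketched without a database. This is only an illustration under stated assumptions: the class and method names are hypothetical, the collection sizes are made up, and it only models sibling collections hanging off one parent row, where a LEFT JOIN still yields one row for an empty collection.

```csharp
using System;
using System.Linq;

public static class CartesianExplosionSketch
{
    // Rows returned when sibling collections are all JOINed in one query:
    // every row of one collection is paired with every row of the others.
    // An empty collection still contributes one (all-NULL) row via LEFT JOIN.
    public static long SingleQueryRows(params int[] collectionSizes) =>
        collectionSizes.Aggregate(1L, (rows, size) => rows * Math.Max(size, 1));

    // Child rows returned when each collection is loaded by its own query.
    public static long SplitQueryRows(params int[] collectionSizes) =>
        collectionSizes.Sum(size => (long)size);

    public static void Main()
    {
        // Hypothetical sizes: 50 questions, 4 instructors, 10 sections, 30 user courses.
        int[] sizes = { 50, 4, 10, 30 };
        Console.WriteLine($"single joined query: {SingleQueryRows(sizes)} rows");
        Console.WriteLine($"split queries:       {SplitQueryRows(sizes)} rows");
    }
}
```

With these made-up sizes the joined query returns 60000 rows for one course, while the split queries return 94 child rows in total, which is why breaking the query up can help even though it costs extra round trips.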
I have this query:
var mapping = await context.MAPPING_COMPANIES
.Include(x => x.CUSTOMER_INFO)
.SingleOrDefaultAsync(where => where.AMIN_COMPANY_ID == aminCompanyId);
Now the single MAPPING_COMPANY will have a single CUSTOMER_INFO. I also need to include two more entities of CUSTOMER_INFO: MASTER_ADDRESS_TYPE and MASTER_CUSTOMER_STATUS. I need these two also included or flattened with the CUSTOMER_INFO.
How do I do that? I have experimented with more Including statements and combining Selects and even tried out the ThenIncludeBy.EF6 nuget but to no avail.
Have you tried this?
var mapping = await context.MAPPING_COMPANIES
.Include(x => x.CUSTOMER_INFO)
.Include(x => x.CUSTOMER_INFO.MASTER_ADDRESS_TYPE)
.Include(x => x.CUSTOMER_INFO.MASTER_CUSTOMER_STATUS)
.SingleOrDefaultAsync(where => where.AMIN_COMPANY_ID == aminCompanyId);
You'll need to make sure you do not have any Select(), or I think GroupBy(), in there since Include() only works if query shape matches the entity set.
In this query:
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters
.OrderBy(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName)
.Select(p => p.Last());
}
I had to switch it to this for it to work
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters
.OrderByDescending(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName)
.Select(p => p.FirstOrDefault());
}
I couldn't even use p.First(), to mirror the first query.
Why are there such basic limitations in what's otherwise such a robust ORM system?
That limitation comes down to the fact that eventually it has to translate that query to SQL and SQL has a SELECT TOP (in T-SQL) but not a SELECT BOTTOM (no such thing).
There is an easy way around it though, just order descending and then do a First(), which is what you did.
EDIT:
Other providers may implement SELECT TOP 1 differently; on Oracle it would probably be something more like WHERE ROWNUM = 1.
EDIT:
Another less efficient alternative - I DO NOT recommend this! - is to call .ToList() on your data before .Last(), which will immediately execute the LINQ To Entities Expression that has been built up to that point, and then your .Last() will work, because at that point the .Last() is effectively executed in the context of a LINQ to Objects Expression instead. (And as you pointed out, it could bring back thousands of records and waste loads of CPU materialising objects that will never get used)
Again, I would not recommend this second approach, but it does help illustrate the difference between where and when the LINQ expression is executed.
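The equivalence behind the order-descending workaround can be checked in memory with LINQ to Objects. This is a minimal sketch, not the original poster's code: the tuple shape stands in for the entity, the sample rows are invented, and it relies on in-memory behavior (a stable OrderBy, and GroupBy keeping element order within each group) that a SQL translation does not necessarily guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastVersusFirstSketch
{
    // Form that LINQ to Entities rejects: ascending order, then Last() per group.
    public static List<(string RawName, DateTime ServerDateTime)> LatestViaLast(
        IEnumerable<(string RawName, DateTime ServerDateTime)> rows) =>
        rows.OrderBy(r => r.ServerDateTime)
            .GroupBy(r => r.RawName)
            .Select(g => g.Last())
            .OrderBy(r => r.RawName)   // normalize group order for comparison
            .ToList();

    // Translatable form: descending order, then FirstOrDefault() per group.
    public static List<(string RawName, DateTime ServerDateTime)> LatestViaFirst(
        IEnumerable<(string RawName, DateTime ServerDateTime)> rows) =>
        rows.OrderByDescending(r => r.ServerDateTime)
            .GroupBy(r => r.RawName)
            .Select(g => g.FirstOrDefault())
            .OrderBy(r => r.RawName)   // normalize group order for comparison
            .ToList();

    public static void Main()
    {
        var start = new DateTime(2020, 1, 1);
        var rows = new[]
        {
            (RawName: "alice", ServerDateTime: start.AddMinutes(1)),
            (RawName: "bob",   ServerDateTime: start.AddMinutes(2)),
            (RawName: "alice", ServerDateTime: start.AddMinutes(3)),
            (RawName: "bob",   ServerDateTime: start.AddMinutes(4)),
        };
        // Both forms pick the most recent row per name.
        Console.WriteLine(LatestViaLast(rows).SequenceEqual(LatestViaFirst(rows))); // prints "True"
    }
}
```

The timestamps are distinct on purpose; with ties, the two forms may pick different rows within a group.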
Instead of Last(), try this:
model.OrderByDescending(o => o.Id).FirstOrDefault();
Replace Last() by a Linq selector OrderByDescending(x => x.ID).Take(1).Single()
Something like this would work if you prefer to do it in LINQ:
public static IEnumerable<IServerOnlineCharacter> GetUpdated()
{
var context = DataContext.GetDataContext();
return context.ServerOnlineCharacters
    .OrderBy(p => p.ServerStatus.ServerDateTime)
    .GroupBy(p => p.RawName)
    .Select(p => p.OrderByDescending(x => x.Id).Take(1).Single());
}
Yet another way to get the last element, without OrderByDescending and without loading all entities:
dbSet
.Where(f => f.Id == dbSet.Max(f2 => f2.Id))
.FirstOrDefault();
That's because LINQ to Entities (and databases in general) do not support all the LINQ methods (see here for details: http://msdn.microsoft.com/en-us/library/bb738550.aspx).
What you need here is to order your data in such a way that the "last" record becomes "first", and then you can use FirstOrDefault. Note that databases usually don't have such concepts as "first" and "last"; it's not as if the most recently inserted record will be "last" in the table.
This method can solve your problem
db.databaseTable.OrderByDescending(obj => obj.Id).FirstOrDefault();
Adding a single AsEnumerable() call before the Select call worked for me.
Example:
return context.ServerOnlineCharacters
.OrderByDescending(p => p.ServerStatus.ServerDateTime)
.GroupBy(p => p.RawName).AsEnumerable()
.Select(p => p.FirstOrDefault());
Ref:
https://www.codeproject.com/Questions/1005274/LINQ-to-Entities-does-not-recognize-the-method-Sys
I am attempting to pull back a large number of relations from a SQL Server database using Entity Framework for display on a summary web page, and I am finding that the performance of using many Include statements in the query is abysmal.
The requirement is to display all of a single user's data on a page at once. Generally this isn't a huge amount of data, but fetching it does require traversing quite a few EF relations with a query something like this:
var @class = context.Class.Where(a => a.Id.Equals(Id)) // "class" is a C# keyword, hence the @ prefix
.Include(a => a.Teacher.Address)
.Include(a => a.Teacher.Supplies.Notebooks)
.Include(a => a.Teacher.Supplies.Pencils)
.Include(a => a.Teacher.Supplies.Textbooks)
.Include(a => a.Teacher.Supplies.Erasers)
.Include(a => a.Students.Select(d => d.Supplies.Notebooks))
.Include(a => a.Students.Select(d => d.Supplies.Pencils))
.Include(a => a.Students.Select(d => d.Supplies.Textbooks))
.Include(a => a.Students.Select(d => d.Supplies.Erasers))
.Include(a => a.Configuration)
.Include(a => a.Payment.Payer.Address)
.Include(a => a.Payment.PaymentMethod)
.First();
That takes more than 10 seconds to run against a test database that contains minimal data. However, if I do this instead, it takes ~1 second:
var @class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Teacher.Address).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Teacher.Supplies.Notebooks).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Teacher.Supplies.Pencils).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Teacher.Supplies.Textbooks).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Teacher.Supplies.Erasers).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Students.Select(d => d.Supplies.Notebooks)).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Students.Select(d => d.Supplies.Pencils)).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Students.Select(d => d.Supplies.Textbooks)).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Students.Select(d => d.Supplies.Erasers)).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Configuration).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Payment.Payer.Address).First();
@class = context.Class.Where(a => a.Id.Equals(Id)).Include(a => a.Payment.PaymentMethod).First();
Is this really the best way to run a query to fetch all this data or am I doing this completely wrong?
Well, I count 18 joins there, so it's not going to be fast by any means. However, depending on how you're testing this, your results may or may not be significant. In particular, if you're debugging or just running locally, everything is going to be slower. If you're using LocalDb, it's going to be slower than SQL Server. IIS Express is single-threaded, while IIS is multi-threaded, so that has an effect on performance as well.
First, and foremost, you should get something like Glimpse running with your project. This will let you actually separate the time it takes to load the page in general from the time it takes to run your queries, as well as give you visibility into the number and scope of queries being run. However, until you test on an actual production-like machine, with full IIS and SQL Server, you won't really know how this is going to perform.
If it's a real problem, you can look into creating a stored procedure to return all this information. That will be much quicker than anything EF can ever do, as SQL Server can store the execution plan and optimize the queries. If you're finding that Entity Framework is too slow for your purposes in general, you can also investigate alternate ORMs like Dapper. Of course, you'll have a learning curve there, but if your primary focus is performance, you can pretty much always do better than Entity Framework, though you might lose some of the niceties in the process.
Unfortunately, I can't explain why the Include technique is so slow, but this is how I would try to fetch that data:
var classData = context
    .Class
    .Where(a => a.Id.Equals(Id))
    // Project before materializing: Select must run on the query, not on the entity First() returns.
    .Select(a => new
    {
        TA = a.Teacher.Address,
        TN = a.Teacher.Supplies.Notebooks,
        TP = a.Teacher.Supplies.Pencils,
        TT = a.Teacher.Supplies.Textbooks,
        TE = a.Teacher.Supplies.Erasers,
        SN = a.Students.Select(d => d.Supplies.Notebooks),
        SP = a.Students.Select(d => d.Supplies.Pencils),
        ST = a.Students.Select(d => d.Supplies.Textbooks),
        SE = a.Students.Select(d => d.Supplies.Erasers),
        a.Configuration,
        PA = a.Payment.Payer.Address,
        a.Payment.PaymentMethod
    })
    .First();
Does that improve the performance? If it does I'd be interested in the difference between the sql generated by the 2 techniques
How do I include a child of a child entity?
I.e., Jobs have Quotes, which have QuoteItems:
var job = db.Jobs
.Where(x => x.JobID == id)
.Include(x => x.Quotes)
.Include(x => x.Quotes.QuoteItems) // This doesn't work
.SingleOrDefault();
Just to be clearer: I'm trying to retrieve a single Job item and its associated Quotes (one to many), and for each Quote the associated QuoteItems (one Quote can have many QuoteItems).
The reason I'm asking is because in my Quote Index view I'm trying to show the Total of all the Quote items for each Quote by SUMming the Subtotal, but it's coming out as 0. I'm calling the Subtotal like this:
@item.QuoteItem.Sum(p => p.Subtotal)
I believe the reason I have this issue is that my Linq query above isn't retrieving the associated QuoteItems for each Quote.
To get a job and eager load all its quotes and their quoteitems, you write:
var job = db.Jobs
.Include(x => x.Quotes.Select(q => q.QuoteItems))
.Where(x => x.JobID == id)
.SingleOrDefault();
You might need SelectMany instead of Select if QuoteItems is a collection too.
Note to others: the strongly typed Include() method is an extension method, so you need to add using System.Data.Entity; at the top of your file.
The method in the accepted answer doesn't work in .NET Core.
For anyone using .NET Core, while the magic string way does work, the cleaner way to do it would be ThenInclude:
var job = db.Jobs
.Where(x => x.JobID == id)
.Include(x => x.Quotes)
.ThenInclude(x => x.QuoteItems)
.SingleOrDefault();
Source: Work with data in ASP.NET Core Apps | Microsoft Learn
This will do the job (given that we are talking entity framework and you want to fetch child-entities):
var job = db.Jobs
.Include(x => x.Quotes) // include the "Job.Quotes" relation and data
.Include("Quotes.QuoteItems") // include the "Job.Quotes.QuoteItems" relation with data
.Where(x => x.JobID == id) // going on the original Job.JobID
.SingleOrDefault(); // returns the single match or null; throws if more than one row matches
For more information about the Include statement have a look at this: https://learn.microsoft.com/en-us/dotnet/api/system.data.objects.objectquery-1.include
This answer has been getting upvotes throughout the years, so I'd just like to clarify: try https://stackoverflow.com/a/24120209/691294 first. This answer is for those cases where all else fails and you have to resort to a black magic solution (i.e. using magic strings).
This did the trick for me, as @flindeberg said here.
I just added a check that each parent item in the list has children:
List<WCF.DAL.Company> companies = dbCtx.Companies.Where(x=>x.CompanyBranches.Count > 0)
.Include(c => c.CompanyBranches)
.Include("CompanyBranches.Address")
.ToList();