In Entity Framework 4, what is the difference between lazy loading and using the Load() method?
Edit: I have added two 'if' statements:
Lazy Loading:
var query = from c in context.Contacts select c;
foreach ( var contact in query ) {
if ( contact.ID == 5 )
Console.WriteLine( contact.Addresses.City );
}
Load() method:
context.ContextOptions.LazyLoadingEnabled = false;
var query = from c in context.Contacts select c;
foreach ( var contact in query ) {
if ( contact.ID == 5 ) {
contact.Addresses.Load();
Console.WriteLine( contact.Addresses.City );
}
}
Now, with these two 'if' checks, why should I prefer one over the other?
Lazy Loading means that a load will only occur once the object is needed, thus not loading unnecessary data.
When you disable lazy loading, you take responsibility for loading related data yourself by calling Load().
http://en.wikipedia.org/wiki/Lazy_loading
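On a plain ObjectContext, lazy loading is disabled by default, so setting it to false in your first line changes nothing there. Note, however, that contexts generated by the EF 4 designer enable it in their constructor.
When you call Load(), you load all the related objects for that navigation property from the database. (With lazy loading enabled the call is not needed here, which is why the first example works without it.)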
This post on Working with Lazy Loading in EF 4 Code First should also help with understanding how Entity Framework behaves both with and without lazy loading enabled. It also demonstrates that lazy loading is enabled by default in EF4 and shows how to disable it either per context instance or by default for the whole application.
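As a rough illustration of the distinction these answers draw, here is a minimal plain C# sketch that does not use EF at all. The Contact class, its Cities collection, and the fetch delegate are hypothetical stand-ins for an entity, a navigation property, and a database query; EF's real mechanism (proxies or generated code) is more involved.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an entity; this is not how EF implements it.
class Contact
{
    private readonly Func<List<string>> _fetch;   // simulates a database query
    private List<string> _cities;

    public Contact(int id, Func<List<string>> fetch) { Id = id; _fetch = fetch; }

    public int Id { get; }
    public bool LazyLoadingEnabled { get; set; }
    public bool IsLoaded => _cities != null;

    // Explicit loading: the caller decides when the query runs.
    public void Load() { if (!IsLoaded) _cities = _fetch(); }

    public List<string> Cities
    {
        get
        {
            // Lazy loading: the first access runs the query implicitly.
            if (!IsLoaded && LazyLoadingEnabled) Load();
            return _cities ?? new List<string>();
        }
    }
}

static class Program
{
    static void Main()
    {
        int queries = 0;
        Func<List<string>> fetch = () => { queries++; return new List<string> { "Oslo" }; };

        var lazy = new Contact(5, fetch) { LazyLoadingEnabled = true };
        Console.WriteLine(lazy.Cities.Count);      // first access triggers the fetch
        Console.WriteLine(lazy.Cities.Count);      // already loaded, no second fetch

        var manual = new Contact(5, fetch);        // lazy loading left off
        Console.WriteLine(manual.Cities.Count);    // empty: nothing has been loaded yet
        manual.Load();                             // explicit load
        Console.WriteLine(manual.Cities.Count);

        Console.WriteLine(queries);                // one fetch per contact
    }
}
```

Run as a console program, this prints 1, 1, 0, 1, 2. Both styles cost one query for the one contact that is inspected, so the choice is mostly about whether the query should happen implicitly on access or at a point you control.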
I have a very basic example C# program which uses EF6 with DbContext and which contains two classes:
MyTable1 1...n MyTable2
I have the following code:
// Creation of instances and saving them to the database.
using (var context = new EFTestEntities()) {
var myTable1 = new MyTable1() { ID = 1 };
var myTable2 = new MyTable2() { ID = 2 };
myTable1.MyTable2.Add(myTable2);
context.MyTable1.Add(myTable1);
context.SaveChanges();
}
// Getting the above objects in another context.
// This part here is what my question is about.
using (var context = new EFTestEntities()) {
var myTable1 = context.MyTable1.Where(e => e.ID == 1).FirstOrDefault();
var myTable2 = context.MyTable2.Where(e => e.ID == 2).FirstOrDefault();
var myTable2AssignedToMyTable1 = myTable1.MyTable2.FirstOrDefault();
}
My question is about the second part of the code above.
When lazy loading is enabled, I get three database calls, one for each line. The variable myTable2AssignedToMyTable1 then contains myTable2, which is correct.
When lazy loading is disabled, I get two database calls: one for the first line and one for the second. The variable myTable2AssignedToMyTable1 also contains myTable2, without a third database call. This makes sense to me because myTable2 was loaded in line 2.
But: If that third database call is not needed, why is it made when lazy loading is enabled?
When I check the database calls in SQL Server Profiler with and without lazy loading I realize that they are exactly the same (apart from the third call). The data from the third call is the same as in the first or second call. So when the third call is happening, the desired data are already in the context. When lazy loading is disabled, EF realizes it and uses the already loaded data. When lazy loading is enabled it does not realize it and fetches the same data from the database again. Why?
Is it faster than finding the correct instance of MyTable2 in the context?
Or is it faster than always searching for the correct instance in the context first and then making the database call? One can assume that in most cases the desired data are not already in the context. So I guess the EF designers decided not to check the context first, since in most cases it would be a miss anyway.
But these are only my assumptions. Knowing the real reason would be interesting.
Could someone help me clarify the difference between:
var query = awlt.People.Include(p => p.EmailAddresses)
.Where(p => p.LastName.Equals(lastName))
.SelectMany(a => a.EmailAddresses)
.Select(a => a.EmailAddress1);
var query = awlt.People
.Where(p => p.LastName.Equals(lastName))
.SelectMany(a => a.EmailAddresses)
.Select(a => a.EmailAddress1);
I get the same results in both cases without knowing the difference.
Does eager loading require using Include?
Both queries retrieve the related data: the first uses eager loading (and yes, eager loading is achieved with the Include method, as you guessed), and the second uses lazy loading, which is the default. But since your query only returns EmailAddresses because of the Select() and SelectMany() operations, the Include() method doesn't change the behavior. To see when Include() matters, consider the example below.
Some differences between these two ways of loading related entities: eager loading is typically more efficient when you need the related data for all retrieved rows of the primary table. When there are not too many relations, eager loading is also good practice because it reduces further queries to the server. But when you know that you will not need a property right away, lazy loading may be a good choice. Eager loading is also a good choice when your db context will be disposed and lazy loading can no longer take place.
To prove that one is Lazy Loading and one is Eager Loading consider the following code:
public List<Person> GetEmailAddresses()
{
using (yourEntities awlt = new yourEntities())
{
var query = awlt.People
.Where(p => p.LastName.Equals(lastName));
return query.ToList();
}
}
After calling this method, you cannot load the related entities lazily because the db context has been disposed. To prove it, try this:
var query = GetEmailAddresses();
foreach (var item in query.SelectMany(a => a.EmailAddresses).Select(a => a.EmailAddress1))
{
MessageBox.Show(item);
}
And you will get this error:
The ObjectContext instance has been disposed and can no longer be used for operations that require a connection.
But if you change the GetEmailAddresses to use Eager Loading like this:
public List<Person> GetEmailAddresses()
{
using (yourEntities awlt = new yourEntities())
{
var query = awlt.People.Include("EmailAddresses")
.Where(p => p.LastName.Equals(lastName));
return query.ToList();
}
}
Then the code below should work fine:
var query = GetEmailAddresses();
foreach (var item in query.SelectMany(a => a.EmailAddresses).Select(a => a.EmailAddress1))
{
MessageBox.Show(item);
}
So in a situation where your db context will be disposed before the related data is read, eager loading is the better choice.
I don't know about EF 7, but in EF 6 both of those statements produce the same queries to the database and so are essentially the same. There is no lazy loading and no eager loading (in the sense the term is usually used) whatsoever.
You only need to Include properties of the entities you materialize. In the example above you materialize Person.EmailAddresses.EmailAddress1, but you include just Person.EmailAddresses, which has no effect (for more details see for example here).
Consider this sample code (the details do not matter; there is just an Error entity with a Code navigation property):
// note we materialized query
var errors = ctx.Errors.Include(c => c.Code).ToArray();
// no lazy loading happens here - we already loaded all related Codes with Include
var codeIds = errors.Select(c => c.Code.CodeID).ToArray();
And this one:
// no need to include here!
var codeIds = ctx.Errors.Select(c =>c.Code.CodeID).ToArray();
And with include:
// include has no effect here!
var codeIds = ctx.Errors.Include(c => c.Code).Select(c => c.Code.CodeID).ToArray();
What is eager loading? It's when you load additional related data together with an entity by using an Include statement. Here the Include statement has no effect, it simply does nothing, so we cannot call that eager loading.
What is lazy loading? It's when a navigation property is loaded the first time you access it. You do not do this in your examples, so there is no lazy loading either.
Both examples just execute identical queries against the database (after you materialize them by enumeration, `ToArray`, etc.).
The result of the two queries is exactly the same (and the same goes for 'eager' versus 'lazy' loading).
In this case I think the generated queries are also very similar or identical, BUT never trust the queries generated by the EF provider without checking. To see the generated query, you can stop the program at a breakpoint and inspect the query object (hover the mouse over it). That is the query generated by the EF provider.
As for eager loading (the Include statement), it should not be useful in this case, because Include is used to load properties of the output object. Here you are selecting EmailAddress1, so Include could only eager load properties of EmailAddress1 (and avoid lazy queries when those are accessed).
You can find the difference if you look into SQL Server Profiler after the query is run. So in your first case there is only one query going to your database and fetching records from People table as well as EmailAddresses table whereas in the second case it does two queries to database and fetches People first and then EmailAddresses in a second query. Thus the first scenario is called eager loading and the second one lazy loading.
I've started to have a problem where child collections in Entity Framework are not being loaded properly with Lazy Loading.
The most prominent example of this is my Orders object: each Order has one or more Order Lines associated with it (i.e. a list of which products have been ordered and how many). Sometimes, when the program is run, you can open up some orders and all the order lines (for every order) will be blank. Restart the program, and they might reappear. It's pretty intermittent.
I have confirmed that there are no entries in the child collection through logging & debugging:
private ObservableCollection<OrderLine> LoadOrderLines()
{
Log.Debug("Loading {0} order lines...", this.Model.OrderLines.Count);
var result = new ObservableCollection<OrderLine>();
foreach (var orderLine in this.Model.OrderLines)
{
result.Add(orderLine);
}
return result;
}
Sometimes it will say "Loading 0 order lines..." and sometimes "Loading 4 order lines..." for the same order.
I can't use Eager Loading when I load the list of orders because I don't want to load all the order lines for all the orders when only a few of them might ever be opened - I need to keep the loading as fast as possible and only load things as they are needed, hence lazy loading.
It's not only the Orders object that it is happening on, it sometimes happens on other child collections too, but the effect is exactly the same.
Anybody have any idea why EF is doing this and what I can do to fix it? It's a huge problem - I can't have empty order lines in my program when they should be there!
Extra info that may or may not be of use:
This is a WPF MVVM application. The data layer, which is shared with the website, uses a Repository/Unit of Work pattern.
In the OrdersRepository:
public IEnumerable<Order> GetOrders(DateTime fromDate, DateTime toDate, IEnumerable<string> sources, bool? paidStatus, bool? shippedStatus, bool? cancelledStatus, bool? pendingStatus)
{
if (sources == null)
{
sources = this.context.OrderSources.Select(s => s.SourceId);
}
return
this.context.Orders.Where(
o =>
o.OrderDate >= fromDate
&& o.OrderDate < toDate
&& sources.Contains(o.SourceId)
&& (!paidStatus.HasValue || ((o.ReceiptId != null) == paidStatus.Value))
&& (!shippedStatus.HasValue || ((o.ShippedDate != null) == shippedStatus.Value))
&& (!pendingStatus.HasValue || (o.IsPending == pendingStatus.Value))
&& (!cancelledStatus.HasValue || (o.Cancelled == cancelledStatus.Value))).OrderByDescending(
o => o.OrderDate);
}
The OrdersViewModel then loads the orders, creates an orderViewModel for each one and puts them in an ObservableCollection:
var ordersList = this.unitOfWork.OrdersRepository.GetOrders(this.filter).ToList();
foreach (var order in ordersList)
{
var viewModel = this.viewModelProvider.GetViewModel<OrderViewModel, Order>(order);
this.orders.Add(viewModel);
}
Lazy loading is for loading the related entities automatically when you access the navigation property. But in this case you're not doing it automatically, but manually.
To do so, you can disable lazy loading, and use explicit loading, like this:
context.Entry(order).Collection(o => o.OrderLines).Load();
(Besides, using this technique, you can apply filters).
Your problem with lazy loading may be a consequence of a long-lived DbContext that caches the related entities at a given point in time and reuses that cache later without hitting the DB, so the data is outdated. For example, one DbContext finds 3 order lines for an order and caches them. Something else (outside that DbContext) adds 2 new order lines to the order. Then you access the order lines from the first DbContext and get the outdated 3 order lines instead of the 5 that are in the DB. There are several ways in which you could, theoretically, reload or refresh the cached data on the DbContext, but you can get into trouble. You'd be better off using explicit loading as I suggested above. If you look at the docs for the Load method, you can read this:
Loads the collection of entities from the database. Note that entities that already exist in the context are not overwritten with values from the database.
However, the safest option is always to dispose the DbContext and create a new one. (Read the second sentence of the block above).
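The explicit-loading call from this answer, and the filter it mentions in passing, might look roughly like the sketch below. It assumes the DbContext API (EF 4.1 or later, EntityFramework package referenced); the Order, OrderLine, and ShopContext types and the Quantity property are hypothetical and only loosely modelled on the question.

```csharp
using System.Collections.Generic;
using System.Data.Entity;   // DbContext API; also provides the Load() extension
using System.Linq;

// Hypothetical entities and context, for illustration only.
public class Order
{
    public int Id { get; set; }
    public virtual ICollection<OrderLine> OrderLines { get; set; } = new List<OrderLine>();
}

public class OrderLine
{
    public int Id { get; set; }
    public int OrderId { get; set; }
    public int Quantity { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    public DbSet<OrderLine> OrderLines { get; set; }
}

public static class OrderLineLoader
{
    // Loads all lines of one order, but only if they are not marked as loaded yet.
    public static void LoadLines(ShopContext context, Order order)
    {
        var lines = context.Entry(order).Collection(o => o.OrderLines);
        if (!lines.IsLoaded)
        {
            lines.Load();
        }
    }

    // Query() exposes the relationship as an IQueryable, so a filter can be
    // applied before any rows are pulled into the context.
    public static void LoadLargeLines(ShopContext context, Order order, int minQuantity)
    {
        context.Entry(order)
               .Collection(o => o.OrderLines)
               .Query()
               .Where(l => l.Quantity >= minQuantity)
               .Load();
    }
}
```

A filtered load like the second method is usually combined with lazy loading turned off; otherwise touching the collection can still trigger a full load of all lines.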
Given:
public SomeEntity Read(int primaryKey)
{
SomeEntity myEntity;
using (var context = new MyEntities2())
{
context.Configuration.LazyLoadingEnabled = false;//This line is wacky
myEntity = context.SomeEntities.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
if (myEntity == null)
return myEntity;
//Force eager Load...
var bypassDeferredExecution = myEntity.RelatedTable1.ToList();
var bypassDeferredExecution2 = myEntity.RelatedTable2.ToList();
}
return myEntity;
}
If I set LazyLoadingEnabled = false then myEntity.RelatedTable1.Count == 0.
Leave at the default LazyLoadingEnabled = true then myEntity.RelatedTable1.Count == 2.
My understanding is that Lazy Loading and Eager Loading are polar opposites. I forced eager loading. I expect my related table (a cross reference table) to have 2 results whether or not I use lazy loading. So in my mind these results make no sense.
Why does lazy loading impact my results?
You have to use Include to eagerly load related entities:
myEntity = context.SomeEntities
.Include("RelatedTable1")
.Include("RelatedTable2")
.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
Setting lazy loading to false won't cause eager loading to happen automatically.
If you are not using lazy loading, then there needs to be a LINQ to Entities Include method call to identify the (foreign keyed) tables to eagerly load.
A navigation property isn't a query, it's an enumerable collection. You have 2 ways to get it from the DB:
- Lazy loading (it will be loaded on the first access to the property)
- Eager loading (it will be loaded when the main query executes, if you add an Include({propertyName}) method call)
So, if you turn off lazy loading and don't add Include calls to the query, each navigation property will be empty (an empty collection, or a null value for single entities).
The following code should work for your case:
myEntity = context.SomeEntities
.Include("RelatedTable1")
.Include("RelatedTable2")
.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
Lazy loading defers the initialization of an object until it is needed. In this case it will automatically execute a query to the DB to load the object requested.
Eager loading loads a specific set of related objects along with the objects that were explicitly requested in the query.
So in order to use Eager Loading you need to specify the related objects that you want to load.
In EF you can achieve this using the method Include from ObjectQuery.
context.Entity.Include("RelatedObject");
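Putting these answers together for the Read method in the question, a minimal sketch could look like this. It assumes EF6 with the DbContext API; only the names SomeEntity, PrimaryKey, RelatedTable1, RelatedTable2, and MyEntities2 come from the question, while the Related1 and Related2 classes and the SomeEntityReader wrapper are hypothetical. It uses the lambda overload of Include, which is checked at compile time, instead of the string overload.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Data.Entity;   // provides the lambda overload of Include
using System.Linq;

// Hypothetical model; the real one in the question uses a cross-reference table.
public class SomeEntity
{
    [Key]
    public int PrimaryKey { get; set; }
    public virtual ICollection<Related1> RelatedTable1 { get; set; }
    public virtual ICollection<Related2> RelatedTable2 { get; set; }
}

public class Related1 { public int Id { get; set; } }
public class Related2 { public int Id { get; set; } }

public class MyEntities2 : DbContext
{
    public DbSet<SomeEntity> SomeEntities { get; set; }
}

public static class SomeEntityReader
{
    public static SomeEntity Read(int primaryKey)
    {
        using (var context = new MyEntities2())
        {
            // With lazy loading off, nothing is loaded implicitly; the
            // Include calls below are the only source of related rows.
            context.Configuration.LazyLoadingEnabled = false;

            return context.SomeEntities
                .Include(e => e.RelatedTable1)
                .Include(e => e.RelatedTable2)
                .SingleOrDefault(e => e.PrimaryKey == primaryKey);
        }
    }
}
```

Because the related rows arrive with the main query, the returned entity can be used after the context is disposed without relying on lazy loading.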
I am using Entity Framework through a Repositories pattern and have a significant and surprising performance problem. I have done profiling so I have a pretty good idea of what happens, I just don't know what to do about it.
Here is the essence of my code (simplified):
var employee = Repositories.Employees.FirstOrDefault(s => s.EmployeeId == employeeId);
employee.CompanyId = null;
Repositories.Commit();
The middle line (employee.CompanyId = null) takes an astounding amount of time to complete (around 30 seconds). The time is NOT spent on the Commit line.
Through profiling, I have found the reason to be running this part of the auto generated EF code:
if (previousValue != null && previousValue.Employees.Contains(this))
{
previousValue.Employees.Remove(this);
}
That doesn't really help me, but it does confirm that the problem lies in the EF. I would really like to know what to do. I can update the column in other ways (stored procedure) but I would really rather use the EF everywhere.
I cannot easily edit the EF settings, so I would prefer suggestions that do not involve this.
Update
I solved the problem by running SQL directly against the database and then refreshing the object from context to make sure EF would detect this change immediately.
public void SetCompanyNull(Guid employeeId)
{
_ctx.ExecuteStoreCommand("UPDATE Employee SET CompanyId = NULL WHERE EmployeeId = {0}", employeeId);
_ctx.Refresh(RefreshMode.StoreWins, _ctx.Employees.FirstOrDefault(s => s.EmployeeId == employeeId));
}
Update 2
I ALSO solved the problem by disabling lazy loading temporarily.
var lazyLoadDisabled = false;
if (_ctx.ContextOptions.LazyLoadingEnabled)
{
_ctx.ContextOptions.LazyLoadingEnabled = false;
lazyLoadDisabled = true;
}
this.GetEmployeeById(employeeId).CompanyId = null;
this.SaveChanges();
if (lazyLoadDisabled)
{
_ctx.ContextOptions.LazyLoadingEnabled = true;
}
I am really curious about WHY it's so much faster with lazy loading disabled (and which side effects this might have).
This is a problem in the EF POCO Generator template, which causes unexpected lazy loading in some scenarios. The template generates fixup code for navigation properties, so if you change the navigation property on one side, it internally goes to the other end of the changed relation and tries to fix the relation up so that it stays consistent. Unfortunately, if the related object doesn't have its navigation property loaded, this triggers lazy loading.
What you can do:
Modify template and remove all code related to fixup
Remove reverse navigation property (Employees) from your Company
Turn off lazy loading prior to this operation - context.ContextOptions.LazyLoadingEnabled = false
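The third option can be made a little more robust by restoring the previous setting in a finally block, so an exception does not leave lazy loading switched off. A minimal sketch, assuming the EF 4 ObjectContext API in System.Data.Objects; the LazyLoadingScope class and its WithLazyLoadingDisabled method are hypothetical helper names:

```csharp
using System;
using System.Data.Objects;   // EF 4 ObjectContext (System.Data.Entity.dll)

public static class LazyLoadingScope
{
    // Runs the given action with lazy loading turned off, then restores
    // whatever setting the context had before, even if the action throws.
    public static void WithLazyLoadingDisabled(ObjectContext context, Action action)
    {
        bool previous = context.ContextOptions.LazyLoadingEnabled;
        context.ContextOptions.LazyLoadingEnabled = false;
        try
        {
            action();
        }
        finally
        {
            context.ContextOptions.LazyLoadingEnabled = previous;
        }
    }
}

// Possible usage, with names taken from the question:
//
// LazyLoadingScope.WithLazyLoadingDisabled(_ctx, () =>
// {
//     GetEmployeeById(employeeId).CompanyId = null;
//     _ctx.SaveChanges();
// });
```

While the action runs, any navigation property that has not been loaded simply stays empty, so code inside it should not depend on related data being fetched on access.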
Well, it is strange. On the other hand, instead of setting CompanyId to null, you may try removing the employee from the collection. Something like:
var company = Repositories.Companies.Include("Employees").FirstOrDefault(s => s.Employees.Any(q => q.EmployeeId == employeeId));
company.Employees.Remove(company.Employees.SingleOrDefault(l => l.EmployeeId == employeeId));
Repositories.Commit();
You have a problem with automatic change detection.
As pointed out by Ladislav Mrnka, when you change a property involved in a relation, it will try to transfer the change to the related entities.
You can avoid this by disabling automatic "Change Tracking" on your context while you execute that operation.
This explains a similar problem and its solution:
Using DbContext in EF 4.1 Part 12: Automatically Detecting Changes
And this explains the Change Tracking concept in general:
Change Tracking