Could someone help me clarify the difference between:
var query = awlt.People.Include(p => p.EmailAddresses)
.Where(p => p.LastName.Equals(lastName))
.SelectMany(a => a.EmailAddresses)
.Select(a => a.EmailAddress1);
var query = awlt.People
.Where(p => p.LastName.Equals(lastName))
.SelectMany(a => a.EmailAddresses)
.Select(a => a.EmailAddress1);
I get the same results in both cases, and I don't understand the difference.
Does eager loading require using Include?
Both queries retrieve the related data. The first uses eager loading (and yes, eager loading is achieved with the Include method, as you guessed), and the second uses lazy loading, which is the default. But since your query returns only EmailAddresses because of the Select() and SelectMany() operations, the Include() method doesn't change the behavior. The example further down shows a case where Include() does matter.
As for the difference between these two ways of loading related entities: eager loading is typically more efficient when you need the related data for all retrieved rows of the primary table. When there are not too many relations, eager loading is also good practice because it reduces the number of further queries sent to the server. But when you know that you will not need a property right away, lazy loading may be a good choice. Eager loading is also a good choice when your db context will be disposed and lazy loading can no longer take place.
To prove that one is Lazy Loading and one is Eager Loading consider the following code:
public List<Person> GetEmailAddresses()
{
    using (yourEntities awlt = new yourEntities())
    {
        var query = awlt.People
            .Where(p => p.LastName.Equals(lastName));
        return query.ToList();
    }
}
After calling this method, you cannot load the related entities lazily because the db context has been disposed. To prove it, try this:
var query = GetEmailAddresses();
foreach (var item in query.SelectMany(a => a.EmailAddresses).Select(a => a.EmailAddress1))
{
MessageBox.Show(item);
}
And you will get this error:
The ObjectContext instance has been disposed and can no longer be used for operations that require a connection.
But if you change the GetEmailAddresses to use Eager Loading like this:
public List<Person> GetEmailAddresses()
{
    using (yourEntities awlt = new yourEntities())
    {
        var query = awlt.People.Include("EmailAddresses")
            .Where(p => p.LastName.Equals(lastName));
        return query.ToList();
    }
}
Then the code below should work fine:
var query = GetEmailAddresses();
foreach (var item in query.SelectMany(a => a.EmailAddresses).Select(a => a.EmailAddress1))
{
MessageBox.Show(item);
}
So in a situation where your db context will be disposed before the related data is accessed, eager loading is the better choice.
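The disposed-context behavior described above can be reproduced without a database. The following is only a minimal sketch, not Entity Framework itself: FakeContext, Person.Preload, and LoadingDemo are hypothetical names invented for illustration, and the "navigation property" is modeled as a delegate that needs a live context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DbContext: it only tracks whether it is disposed.
public sealed class FakeContext : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;

    // Simulates the query that a lazy navigation property would trigger.
    public List<string> LoadEmails(string lastName)
    {
        if (IsDisposed)
            throw new ObjectDisposedException(nameof(FakeContext));
        return new List<string> { lastName.ToLowerInvariant() + "@example.com" };
    }
}

public sealed class Person
{
    private readonly Func<List<string>> _loader;
    private List<string> _emails;

    public Person(string lastName, Func<List<string>> loader)
    {
        LastName = lastName;
        _loader = loader;
    }

    public string LastName { get; }

    // Lazy: the loader runs on first access, unless Preload() already ran.
    public List<string> EmailAddresses
    {
        get
        {
            if (_emails == null)
                _emails = _loader();
            return _emails;
        }
    }

    // Eager: fetch the related data while the context is still alive.
    public void Preload() => _emails = _loader();
}

public static class LoadingDemo
{
    public static List<Person> GetPeople(bool eager)
    {
        using (var context = new FakeContext())
        {
            var people = new List<Person>
            {
                new Person("Smith", () => context.LoadEmails("Smith"))
            };
            if (eager)
                people.ForEach(p => p.Preload());
            return people;
        } // the context is disposed here
    }

    public static void Main()
    {
        // Eager: the e-mails were loaded before the context was disposed.
        var eagerPeople = GetPeople(eager: true);
        Console.WriteLine(string.Join(", ", eagerPeople.SelectMany(p => p.EmailAddresses)));

        // Lazy: the first access happens after disposal and fails.
        try
        {
            var lazyPeople = GetPeople(eager: false);
            Console.WriteLine(lazyPeople.SelectMany(p => p.EmailAddresses).Count());
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("Lazy load failed: context already disposed.");
        }
    }
}
```

The eager path succeeds because the data was fetched inside the using block; the lazy path fails because the first access happens after the block has ended, which mirrors the ObjectContext error quoted earlier.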
I don't know about EF 7, but in EF 6 both of those statements produce the same database queries and so are essentially the same. There is no lazy loading and no eager loading (in the sense the term is usually used) whatsoever.
You need to Include only properties of the entities you materialize. In the example above you materialize Person.EmailAddresses.EmailAddress1, but you include just Person.EmailAddresses, so the Include has no effect.
Consider this sample code (the details do not matter; there is just an Error entity with a Code navigation property):
// note we materialized query
var errors = ctx.Errors.Include(c => c.Code).ToArray();
// no lazy loading happens here - we already loaded all related Codes with Include
var codeIds = errors.Select(c => c.Code.CodeID).ToArray();
And this one:
// no need to include here!
var codeIds = ctx.Errors.Select(c => c.Code.CodeID).ToArray();
And with include:
// include has no effect here!
var codeIds = ctx.Errors.Include(c => c.Code).Select(c => c.Code.CodeID).ToArray();
What is eager loading? It's when you load additional data related to an entity using the Include statement. Here the Include statement has no effect; it simply does nothing, so we cannot call that eager loading.
What is lazy loading? It's when a navigation property is loaded when you access it for the first time. You do not do this in your examples, so there is no lazy loading either.
Both examples just execute identical queries against the database (after you materialize them by enumeration, ToArray, etc.).
The result of the two queries is exactly the same (and the same goes for 'eager' versus 'lazy' loading).
In this case I think the queries are also very similar or identical, BUT never trust the queries generated by the EF provider. To see the generated query, you can stop the program at a breakpoint and inspect the query object (hover the mouse over it). That is the query generated by the EF provider.
As for eager loading (the Include statement), it should not be useful in this case because it is used to load properties of the output object. Here you are selecting EmailAddress1, so with Include you could eager load properties of EmailAddress1 (and avoid lazy queries when accessing EmailAddress1).
You can find the difference if you look into SQL Server Profiler after the query is run. So in your first case there is only one query going to your database and fetching records from People table as well as EmailAddresses table whereas in the second case it does two queries to database and fetches People first and then EmailAddresses in a second query. Thus the first scenario is called eager loading and the second one lazy loading.
I wanted to use the Select() method, but it seems that Select() doesn't accept the await keyword. So I was asking myself: should I keep using eager loading asynchronously, or is the Select method fine synchronously, and will it do the job as efficiently?
I use Select() to map a very large entity to a DTO. With eager loading I basically mimic Select() by creating a bunch of methods that each include a few relationships, such as GetObjectWithPrice and GetObjectWithPriceAndDate. That's why I was considering Select() instead, but the synchronicity worries me.
EDIT
To answer @AvrohomYisroel, here's what I've been doing with my code so far:
public async Task<IReadOnlyList<Book>> GetAllBooksWithRelatedDataAsync()
{
    var books = await context.Books
        .AsNoTracking()
        .AsSplitQuery()
        .Include(d => d.Price)
        .Include(d => d.Images)
        .Include(d => d.Author)
        .ToListAsync();
    return books;
}
That is how I use eager loading asynchronously. I've been wondering whether Select() can be used the same way in terms of asynchronicity. However, I may be wrong in assuming that Select() is synchronous, because it may work differently than I thought.
.Select(), or projection, is generally the most efficient form of data retrieval if performed on the server, but when we use projections we preclude the use of includes. .Select() changes the shape of the object graph; if you are not changing the shape, just use .Include(). .AsSplitQuery() often gives better performance when you need to include related entities, because it loads each navigation path as an individual query.
.Select() is frequently used in repository patterns to map data models into DTOs. There are other libraries that can simplify this and that call .Select() internally.
As of Core 6, LINQ to Entity Projections also support .AsSplitQuery() which means that performance is now less of a concern when choosing between Eager Loading and Projections.
The point of projecting is to pull back not just the required related navigation entities, but only the specific fields that we need. .Select() is therefore an even more eager form of loading than .Include(), but it allows you to be selective about which fields to load.
But we can't have both .Select() and .Include() in a server expression. In fact, any .Include() expressed before the .Select() will be ignored unless the IQueryable has been loaded into an IEnumerable first.
As to asynchronicity, there is no difference between .Select() and .Include() if you apply them to an IQueryable<T> expression. The following would still be asynchronous:
var books = await context.Books
    .Select(b => new BookDTO {
        ISBN = b.BookNumber,
        Title = b.Title,
        Author = b.Author.Name,
        Price = b.Price,
        Images = b.Images.Select(i => i.Url)
    })
    .AsSplitQuery()
    .ToListAsync();
The only time that .Select() might constrain you to a synchronous context is if your projection uses a function or logic that cannot be converted to a data store (SQL) expression. In this instance EF Core will automatically evaluate your expression to bring the data into memory, and then it will perform the .Select(). At that point you are dealing with IEnumerable<T> and synchronous evaluation.
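The switch from IQueryable<T> to IEnumerable<T> can also be made explicit with AsEnumerable(). The sketch below is only an illustration under stated assumptions: it uses an in-memory array wrapped with AsQueryable() as a stand-in for a DbSet, and Shout is a hypothetical helper that represents logic with no SQL equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProjectionDemo
{
    // Hypothetical helper with no SQL equivalent; a real provider could not translate it.
    public static string Shout(string title) => title.ToUpperInvariant() + "!";

    public static List<string> Titles(IQueryable<string> source)
    {
        return source
            .Where(t => t.Length > 3)   // still IQueryable: part of the expression tree
            .AsEnumerable()             // from here on, LINQ to Objects
            .Select(Shout)              // plain delegate, evaluated synchronously in memory
            .ToList();
    }

    public static void Main()
    {
        // In-memory stand-in for a DbSet (an assumption made for this sketch).
        var books = new[] { "Dune", "Emma", "It" }.AsQueryable();
        Console.WriteLine(string.Join(", ", Titles(books)));
    }
}
```

Everything placed before AsEnumerable() is what a real provider would try to translate and could run asynchronously; everything after it runs as ordinary in-memory code.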
I have two different ways to get data from my SQL database:
var sql = @"Select Exam.Name, Test.TestId, Test.QuestionsCount, Test.Title
FROM Test
INNER JOIN Exam
ON ( Test.ExamId = Exam.ExamId)
WHERE Test.TestStatusId = 1";
var tests1 = db.Database.SqlQuery<TestDTO>(sql).ToList();
var tests2 = await db.Tests
.Include(t => t.Exam)
.Where(t => t.TestStatusId == 1)
.Select(t => new TestDTO
{
ExamName = t.Exam.Name,
Id = t.TestId,
QuestionsCount = t.QuestionsCount,
Title = t.Title
})
.ToListAsync();
I realize the second way seems to be the more "popular", but from a performance point of view is there any difference between the two? In particular, is it possible to have an async version of the first method, or is there likely to be minimal benefit in having that anyway?
These two methods of retrieving data aren't even equal.
In your second method, you are writing a LINQ to Entities query and pulling back a list of Tests that include Exams. Once the query is executed, all of the Tests and Exams will be added to the Entity Framework Change Tracker.
In your first method, you are just using Entity Framework to execute some SQL and convert it to a TestDTO object. None of your entities will be in the change tracker at all.
The first method will probably be faster because you're not involving tracked entities, but the two are not really comparable because they are not doing the same thing. Do you plan on making changes to Tests and Exams and calling SaveChanges on the DbContext? If you do, then you will be doing all of that manually with your first method. I'm not sure why you are asking the question unless you are seeing performance issues with the second query, though. Why not just use a SqlDataReader or DataAdapter at this point if you don't plan on using EF to even query the data?
Given:
public SomeEntity Read(int primaryKey)
{
    SomeEntity myEntity;
    using (var context = new MyEntities2())
    {
        context.Configuration.LazyLoadingEnabled = false; // This line is wacky
        myEntity = context.SomeEntities.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
        if (myEntity == null)
            return myEntity;
        // Force eager load...
        var bypassDeferredExecution = myEntity.RelatedTable1.ToList();
        var bypassDeferredExecution2 = myEntity.RelatedTable2.ToList();
    }
    return myEntity;
}
If I set LazyLoadingEnabled = false then myEntity.RelatedTable1.Count == 0.
Leave at the default LazyLoadingEnabled = true then myEntity.RelatedTable1.Count == 2.
My understanding is that Lazy Loading and Eager Loading are polar opposites. I forced eager loading. I expect my related table (a cross reference table) to have 2 results whether or not I use lazy loading. So in my mind these results make no sense.
Why does lazy loading impact my results?
You have to use Include to eagerly load related entities:
myEntity = context.SomeEntities
.Include("RelatedTable1")
.Include("RelatedTable2")
.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
Setting lazy loading to false won't cause eager loading to happen automatically.
If you are using lazy loading, then there needs to be a LINQ to Entities Include method call to identify the (foreign keyed) tables to eagerly load.
A navigation property isn't a query; it's an enumerable collection. You have two ways to get it from the DB:
- Lazy loading (it will be loaded on the first access to the property)
- Eager loading (it will be loaded when the main query executes, if you add an Include({propertyName}) method call)
So, if you turn off lazy loading and don't add Include methods to the query, each navigation property will be empty (an empty collection, or a null value for single entities).
The following code should work for your case:
myEntity = context.SomeEntities
.Include("RelatedTable1")
.Include("RelatedTable2")
.SingleOrDefault(ct => ct.PrimaryKey == primaryKey);
Lazy loading defers the initialization of an object until it is needed. In this case it will automatically execute a query to the DB to load the object requested.
Eager loading loads a specific set of related objects along with the objects that were explicitly requested in the query.
So in order to use Eager Loading you need to specify the related objects that you want to load.
In EF you can achieve this using the method Include from ObjectQuery.
context.Entity.Include("RelatedObject");
I retrieve a collection with the following query:
var numbers = _betDetailItem.GetBetDetailItems().Where(betDetailItem => betDetailItem.BetDetail.Bet.DateDrawing == resultToCreate.Date && betDetailItem.BetDetail.Bet.Status == 1).Where(condition);
Right there I'm able to access my navigation properties and navigate through the bound data. Note how I actually use them to filter the data.
After I group the results, the navigation properties become null.
var grouped = numbers.GroupBy(p => p.BetDetail.Bet);
//Iterate through the collection created by the Grouping
foreach (IGrouping<Bet, BetDetailItem> group in grouped)
{
var details = group.Key.BetDetails; //This is what doesn't work. BetDetails is a navigation property which was accessible in the previous query.
}
Am I doing something wrong?
You are confusing LINQ to Entities and object operations.
This is LINQ to Entities:
var numbers = _betDetailItem.GetBetDetailItems().Where(betDetailItem => betDetailItem.BetDetail.Bet.DateDrawing == resultToCreate.Date && betDetailItem.BetDetail.Bet.Status == 1).Where(condition);
So is this:
var grouped = numbers.GroupBy(p => p.BetDetail.Bet);
These are object operations:
foreach (IGrouping<Bet, BetDetailItem> group in grouped)
{
var details = group.Key.BetDetails; //This is what doesn't work. BetDetails is a navigation property which was accessible in the previous query.
}
In LINQ to Entities, there is never any need to think about loading related instances. You can always refer to any property of any object. However, at some point, you want to move out of the LINQ to Entities world and into object space, because you want to work with instances of type BetDetail instead of type IQueryable<BetDetail>. This means that the Entity Framework is now required to generate SQL to retrieve data from the database. At that point, it doesn't know which related instances you will be accessing in your code later on. Nothing in your LINQ to Entities query forces the loading of the related Bet. So unless you do something to cause it to be loaded, like use eager loading, explicit loading, or EF 4 lazy loading, it won't be loaded.
Using lazy loading (e.g., in Entity Framework 4, or in another ORM) will make this code appear to function, but it will be unnecessarily slow, due to the large number of database queries generated. A better solution would be to use eager loading or projection. This way there will be only one DB roundtrip.
Once you do a GroupBy(), you're no longer dealing with your entities -- they have been... well, grouped, so the var in var grouped = ... is now of type IEnumerable<IGrouping<.... As such, the methods available on the items in the grouped collection are the methods of the IGrouping<> interface.
You may want to OrderBy() instead of GroupBy(), depending on your need, or you'll need to iterate on two levels: iterate over each group in grouped, and over each member within each of those.
Once you are inside of a particular IGrouping<>, you should have access to the properties for which you are looking.
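The two-level iteration mentioned above might look like the following sketch. It runs on plain in-memory objects (LINQ to Objects), so it says nothing about how a database provider loads navigation properties; BetItem and GroupingDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified item: just a bet id and a played number.
public sealed class BetItem
{
    public BetItem(int betId, int number)
    {
        BetId = betId;
        Number = number;
    }

    public int BetId { get; }
    public int Number { get; }
}

public static class GroupingDemo
{
    public static Dictionary<int, List<int>> NumbersByBet(IEnumerable<BetItem> items)
    {
        var result = new Dictionary<int, List<int>>();

        // Outer level: one IGrouping per distinct key.
        foreach (IGrouping<int, BetItem> group in items.GroupBy(i => i.BetId))
        {
            var numbers = new List<int>();

            // Inner level: the members that share this key.
            foreach (BetItem item in group)
                numbers.Add(item.Number);

            result[group.Key] = numbers;
        }
        return result;
    }

    public static void Main()
    {
        var items = new[] { new BetItem(1, 7), new BetItem(2, 9), new BetItem(1, 3) };
        foreach (var pair in NumbersByBet(items))
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```

The group itself exposes only Key and the enumeration of its members; the original properties are reached through each member inside the inner loop.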
I have my NHibernate mappings set to lazy loading = true.
In my CustomersViewModel I have something like:
foreach (Customer c in _customerRepository)
{
this.Customers.Add(new SingleCustomerViewModel(c));
}
This obviously kills all the lazy loading, since the customers are passed one by one.
How do I get my collections (including subcollections, sub-subcollections, and so forth) of model objects into the corresponding ObservableCollections of my ViewModels to bind to the UI?
This seems to be a common problem, but I found no answer, neither here nor on the Googles ...
I am not sure I completely understand the question.
But I was thinking, why not change your getCustomers method to
IEnumerable<SingleCustomerViewModel> getCustomers()
{
    return from c in _customerRepository select new SingleCustomerViewModel(c);
}
Since LINQ expressions are lazily evaluated, your NHibernate collection won't be initialized until it's actually bound to the UI.
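The deferred evaluation this answer relies on can be observed with a small counter. This is a sketch on in-memory strings, not NHibernate; DeferredDemo, Wrap, and the Conversions counter are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the projection actually ran.
    public static int Conversions;

    // Hypothetical stand-in for "new SingleCustomerViewModel(c)".
    private static string Wrap(string name)
    {
        Conversions++;
        return "VM:" + name;
    }

    // Builds the query only; nothing is converted until someone enumerates it.
    public static IEnumerable<string> ToViewModels(IEnumerable<string> customers)
    {
        return from c in customers select Wrap(c);
    }

    public static void Main()
    {
        var viewModels = ToViewModels(new[] { "Ann", "Bob" });
        Console.WriteLine(Conversions);   // still 0: the query has not run yet

        var list = viewModels.ToList();   // enumeration triggers the projection
        Console.WriteLine(Conversions);   // now 2
    }
}
```

Note that every new enumeration of the returned sequence runs the projection again, so binding code that enumerates more than once may want to materialize the result with ToList() first.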
This is a classic "SELECT N+1" problem: whichever query layer you are using for NHibernate offers you a way to eagerly load the child collections in your initial query to avoid this row-by-row query pattern.
With the LINQ provider, for example:
session.Query<Customer> ()
.FetchMany (c => c.Widgets) // eagerly load child collections
.ThenFetchMany (w => w.Frobbers); // now get grandchild collection
If you're using HQL, just add the fetch keyword to your joins.