Include Path Expression Must Refer To A Navigation Property - c#

I've searched a lot about my problem but didn't find a clear solution. I know that I can't use a LINQ Where clause inside Include, but then I don't understand how I should write this query.
var brands = await _context.Brands
    .Include(x => x.FoodCategories
        .Select(y => y.Products
            .Where(z => z.Sugar)
            .Select(w => w.FileDetail)))
    .ToListAsync();
What I actually want is to apply the Where clause to Products while still getting the entities back in a hierarchy, as in the query above. How can I do that?
I've already tried to adapt answers from several Stack Overflow questions, but I'm not getting the point. Here is my attempt:
var brands = _context.Brands
    .Select(b => new
    {
        b,
        FoodCategories = b.FoodCategories
            .Where(x => x.BrandId == b.BrandId)
            .Select(c => new
            {
                c,
                Products = c.Products
                    .Where(y => y.FoodCategoryId == c.FoodCategoryId &&
                                y.Sugar)
                    .Select(p => new
                    {
                        p,
                        File = p.FileDetail
                    })
            })
    })
    .AsEnumerable()
    .Select(z => z.b)
    .ToList();
But it is not returning all the product items instead of sugar only products.

Why you're only getting sugar products.
"But it is not returning all the product items instead of sugar only products."
Of course it is. Because you're asking it to only give you the sugar products:
var brands = _context.Brands
    .Select(b => new
    {
        b,
        FoodCategories = b.FoodCategories
            .Where(x => x.BrandId == b.BrandId)
            .Select(c => new
            {
                c,
                Products = c.Products
                    .Where(y => y.FoodCategoryId == c.FoodCategoryId
                                && y.Sugar) //HERE!
                    .Select(p => new
                    {
                        p,
                        File = p.FileDetail
                    })
            })
    })
    .AsEnumerable()
    .Select(z => z.b)
    .ToList();
If you want all products, then don't filter on only the ones where Sugar is set to true.
There is a lot of redundant code here.
b.FoodCategories.Where(x => x.BrandId == b.BrandId)
b.FoodCategories already expresses the food categories of this particular brand b. You don't need the Where.
The same applies to
c.Products.Where(y => y.FoodCategoryId == c.FoodCategoryId ... )
Here's an improved version of your (second) snippet:
var brands = _context.Brands
    .Select(b => new
    {
        b,
        FoodCategories = b.FoodCategories
            .Select(c => new
            {
                c,
                Products = c.Products
                    .Select(p => new
                    {
                        p,
                        File = p.FileDetail
                    })
            })
    })
    .AsEnumerable()
    .Select(z => z.b)
    .ToList();
This should make it clearer that the custom Select logic isn't necessary. All you're doing is loading the related entities into properties of the same name. You can simply rely on the existing entities and their relations; there's no reason to define the same relationships again.
The only reasons a custom Select would be desirable here are:
You want to limit the retrieved columns in order to lower the data size (useful for large queries).
You want to selectively load children, not just all related children. Your code suggests that you want this, but then you say "But it is not returning all the product items", so I conclude that you don't want to filter the products on their sugar content.
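To make the second case concrete, here is a minimal in-memory sketch (LINQ to Objects, not Entity Framework) of a projection that keeps the brand/category/product hierarchy but only the sugar products. The Brand, FoodCategory and Product classes and the WithSugarProductsOnly helper are simplified stand-ins made up for illustration; the real entities will differ. Against a real context you would typically project into anonymous types or DTOs instead, since LINQ to Entities may refuse to construct mapped entity types inside a query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the entities in the question.
public class Product
{
    public string Name { get; set; } = "";
    public bool Sugar { get; set; }
}

public class FoodCategory
{
    public string Name { get; set; } = "";
    public List<Product> Products { get; set; } = new List<Product>();
}

public class Brand
{
    public string Name { get; set; } = "";
    public List<FoodCategory> FoodCategories { get; set; } = new List<FoodCategory>();
}

public static class SugarFilterSketch
{
    // Copies the brand -> category -> product hierarchy, keeping only the
    // products flagged as Sugar. Brands and categories are kept even when
    // none of their products match; the input objects are not modified.
    public static List<Brand> WithSugarProductsOnly(IEnumerable<Brand> brands)
    {
        return brands
            .Select(b => new Brand
            {
                Name = b.Name,
                FoodCategories = b.FoodCategories
                    .Select(c => new FoodCategory
                    {
                        Name = c.Name,
                        Products = c.Products.Where(p => p.Sugar).ToList()
                    })
                    .ToList()
            })
            .ToList();
    }
}
```

Calling SugarFilterSketch.WithSugarProductsOnly(brands) returns every brand and category, with each Products list reduced to the sugar products. The filter lives in the projection, which is exactly what Include cannot express.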
Why your Include didn't work.
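Simply put: you cannot use Where statements in includes, at least not with the Include/Select syntax used in this question (filtered includes were only added later, in EF Core 5.0).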
Include statements are based on the structure of the entities, whereas a Where only filters data from a set. One has nothing to do with the other.
And even though you'd think it'd be nice to do something like "include the parent only if they have an active status", that's simply not how Include was designed to work.
Include boils down to "for every [type1], also load their related [type2]". This will be done for every [type1] object that your query will instantiate and it will load every related [type2].
Taking the next step in refactoring the above snippet:
var brands = _context.Brands
    .Include(b => b.FoodCategories)
    .Include(b => b.FoodCategories.Select(fc => fc.Products))
    .Include(b => b.FoodCategories.Select(fc => fc.Products.Select(p => p.FileDetail)))
    .ToList();
The includes give Entity Framework specific instructions:
For every loaded brand, load its related food categories.
For every loaded food category, load its related products.
For every loaded product, load its related file details.
Notice that it does not instruct WHICH brands should be loaded! This is an important distinction to make. The Include statements do not in any way filter the data, they only explain what additional data needs to be retrieved for every entry that will be loaded.
Which entries will be loaded has not been defined yet. By default, you get the whole dataset, but you can apply further filtering using Where statements before you load the data.
Think of it this way:
A restaurant wants every new customer's mother to give permission to serve dessert to the customer. Therefore, the restaurant drafts a rule: "every customer must bring their mother".
This is the equivalent of db.Customers.Include(c => c.Mother).
This does not state which customers are allowed to visit the restaurant. It only states that any customer that visits the restaurant must bring their mother (if they have no mother, they will bring null instead).
Notice how this rule applies regardless of which customers visit the restaurant:
Ladies night: db.Customers.Include(c => c.Mother).Where(c => c.IsFemale)
Parents night: db.Customers.Include(c => c.Mother).Where(c => c.Children.Any())
People whose father is named Bob night: db.Customers.Include(c => c.Mother).Where(c => c.Father.Name == "Bob")
Take note of the third example. Even though you filter on the father, you will only load the mother entity. It's perfectly possible to filter items on related entity values without actually loading the entities themselves (fathers).
You may ask yourself "why Select?". That's a good question, because it's not intuitive here.
Ideally, you'd want to do something like
context.Brands.Include(b => b.FoodCategories.Products.FileDetail)
But this is not possible because of a limitation in the language. FoodCategories is a List<FoodCategory>, which does not have a Products property.
However, FoodCategory itself does have a Products property. This is why Select is used: it allows you to access the properties of the list element's type, rather than the list itself.
Internally, EF is going to deconstruct your Select statement (which is an Expression) and it will figure out which property you want to be loaded. Don't worry too much about how EF works behind the scenes. It's not always pretty.
The Include/Select syntax is not the prettiest. Especially when you drill down multiple levels, it becomes cumbersome to write (and read).
So I suggest you invert your approach (start at the lowest child, drill up to the parent). It reaches the same entities, with two caveats: brands that have no file detail at all are not returned, and a brand appears once per file detail unless you add Distinct. In return, it allows for a neater Include syntax:
var brands = context.FileDetails
    .Include(fd => fd.Product)
    .Include(fd => fd.Product.FoodCategory)
    .Include(fd => fd.Product.FoodCategory.Brand)
    .Select(fd => fd.Product.FoodCategory.Brand)
Now you don't need any nasty Select workaround in order to reference the related types.
Note that, as far as I know, an Include on a longer path also loads the navigation properties along that path, so the last Include alone would be enough. Writing one Include per step just makes each level explicit.
Note that this trick only really works if you have a chain of one-to-many relationships. Many-to-many relationships make it harder to apply this trick. At worst, you'll have to resort to using the Select syntax from the earlier example.
While I am not a fan of the Include methods that take a string parameter (I don't like hardcoded strings that can fail on typos), I do feel it's relevant to mention here that they do not suffer from this issue. If you use the string-based includes, you can do things like:
context.Brands
    .Include("FoodCategories")
    .Include("FoodCategories.Products")
    .Include("FoodCategories.Products.FileDetail")
The parsing logic of the string include method will automatically look for the element inside the List, thereby effectively preventing the ugly syntax.
But there are other reasons why I generally don't advise using string parameters here (they don't update when you rename a property, there's no IntelliSense, and they're very prone to developer error).

Related

context include sometimes returns list items in no particular order

In my ASP.NET Core 2.1 web app I have three models: a Profile has many Invoices, and an Invoice has many InvoiceStatuses.
I retrieve them from the Db e.g.
_context.Invoices
    .Include(st => st.InvoiceStatuses)
    .FirstOrDefault(iv => iv.Id == invoiceId);
or sometimes
_context.Invoices
    .Include(pr => pr.Profile)
    .Include(st => st.InvoiceStatuses)
    .FirstOrDefault(iv => iv.Id == invoiceId);
From this I expect to get a specific invoice and all related InvoiceStatuses in the order in which they were created (essentially the DB index order).
Most of the time this is indeed the case.
However, occasionally, after I add a new Invoice record and its initial invoice status, a few of the invoices have their related InvoiceStatuses list in a random, unexpected order, e.g. index 10, 12, 18, 16.
I can get around this by breaking it down into two queries, one for the invoice and one for its statuses, but I was hoping someone could give some insight into what might be happening.
It would be easier if the problem happened consistently, but if you delete a record (sometimes it needs to be a couple of records), you can then go on and add multiple records before the problem appears again.
I get the same problem when returning all invoices with Invoices.ToList() and including each one's related data, but I was trying to focus on the simplest scenario first.
I have not turned on lazy loading or used the virtual keyword, but I'm not sure whether this would matter.
To continue from my comment....
First/FirstOrDefault should always be done with an OrderBy clause unless you truly don't care which one you will get.
Ordering in general is a display and business logic concern. Entities are a view of data.
In cases where you want to display data in order, you should consider composing view models for the data to display, then use .Select() with the applicable children in the appropriate order. For instance, to select an invoice and list its statuses in the order they were added (assumed here to be the auto-increment Id order):
var invoice = _context.Invoices.OrderBy(x => x.Id)
    .Select(x => new InvoiceViewModel
    {
        Id = x.Id,
        // ... Fields the view needs to know about
        InvoiceStatuses = x.InvoiceStatuses.OrderBy(s => s.Id)
            .Select(s => s.StatusText)
            .ToList()
    }).FirstOrDefault();
This uses the OrderBy on Invoices to find the first applicable invoice (by Id order), then selects the fields we care about into a view model. For the invoice statuses, it orders them by their Id and selects the StatusText, giving the view a list of statuses as strings. Alternatively, you could select an InvoiceStatusViewModel to return the status text, status Id, etc., depending on what your view needs.
Alternatively if you are selecting the data to be consumed on the spot for some business logic, you don't need to declare the view model classes, simply use anonymous types:
var invoice = _context.Invoices.OrderBy(x => x.Id)
    .Select(x => new
    {
        x.Id,
        // ... Fields the view needs to know about
        InvoiceStatuses = x.InvoiceStatuses.OrderBy(s => s.Id)
            .Select(s => new
            {
                s.Id,
                s.StatusText
            })
            .ToList()
    }).FirstOrDefault();
This gives you the data you might need to consume, in order, but as anonymous types you cannot return this data outside of the function scope such as to a view.
The technique of using .Select() to reduce results helps lead to more efficient queries as you can utilize all forms of aggregate methods so that rather than returning everything and then writing logic to iterate over, you can utilize Max, Min, Sum, Any, etc. to compose more efficient queries that run faster, and return less data over the wire.
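As a rough illustration of combining an explicit child ordering with aggregates in a single projection, here is an in-memory sketch (LINQ to Objects, so nothing is translated to SQL). The Invoice and InvoiceStatus classes are cut-down stand-ins, and InvoiceSummary and Summarize are names invented for this example, not part of the asker's code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, cut-down stand-ins for the models in the question.
public class InvoiceStatus
{
    public int Id { get; set; }
    public string StatusText { get; set; } = "";
}

public class Invoice
{
    public int Id { get; set; }
    public List<InvoiceStatus> InvoiceStatuses { get; set; } = new List<InvoiceStatus>();
}

// Hypothetical view model: only what a view would need.
public class InvoiceSummary
{
    public int Id { get; set; }
    public List<string> StatusTexts { get; set; } = new List<string>();
    public int StatusCount { get; set; }
    public int? LatestStatusId { get; set; }
}

public static class InvoiceSummarySketch
{
    // Returns null when no invoice has the given id.
    public static InvoiceSummary Summarize(IEnumerable<Invoice> invoices, int invoiceId)
    {
        return invoices
            .Where(i => i.Id == invoiceId)
            .Select(i => new InvoiceSummary
            {
                Id = i.Id,
                // Explicit ordering: never rely on the order the rows happen to arrive in.
                StatusTexts = i.InvoiceStatuses
                    .OrderBy(s => s.Id)
                    .Select(s => s.StatusText)
                    .ToList(),
                StatusCount = i.InvoiceStatuses.Count,
                // Casting to int? makes Max return null for an empty list instead of throwing.
                LatestStatusId = i.InvoiceStatuses.Max(s => (int?)s.Id)
            })
            .FirstOrDefault();
    }
}
```

The same shape of projection should work against a DbSet, where the ordering and the aggregates would be evaluated by the database rather than in memory.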

Use skip and take inside a LINQ include

I have an object that has a property which is a collection of another object. I would like to load just a subset of the collection property using LINQ.
Here's how I'm trying to do it:
manager = db.Managers
    .Include(m => m.Transactions.Skip((page - 1) * 10).Take(10))
    .Where(m => m.Id == id)
    .FirstOrDefault();
The code above throws an error that says
The Include path expression must refer to a navigation property defined on the type. Use dotted paths for reference navigation properties and the Select operator for collection navigation properties.
Parameter name: path
What is the right way to do this in LINQ? Thanks in advance.
You cannot do this with Include. EF simply doesn't know how to translate that to SQL. But you can do something similar with sub-query.
manager = db.Managers
    .Where(m => m.Id == id)
    .Select(m => new
    {
        Manager = m,
        Transactions = db.Transactions
            .Where(t => t.ManagerId == m.Id)
            // Assumption: t.Id is a usable sort key. LINQ to Entities
            // requires sorted input before Skip.
            .OrderBy(t => t.Id)
            .Skip((page - 1) * 10)
            .Take(10)
    })
    .FirstOrDefault();
This will not return an instance of the Manager class, but it should be easy to modify it to suit your needs.
Also you have two other options:
Load all transactions and then filter in memory. Of course if there are a lot of transactions this might be quite inefficient.
Don't be afraid to make 2 queries in database. This is prime example when that is probably the best route, and will probably be the most efficient way of doing it.
Either way, if you are concerned with performance at all I would advise you to test all 3 approaches and see what is the fastest. And please let us know what were the results!
Sometimes the added complexity of putting everything in a single query is not worth it. I would split this up into two separate queries:
var manager = db.Managers.SingleOrDefault(m => m.Id == id);
var transactions = db.Transactions
    .Where(t => t.ManagerId == id)
    // .OrderBy(...)
    .Skip((page - 1) * 10).Take(10)
    .ToList();
Note that after doing this, manager.Transactions can be used as well to refer to those just-loaded transactions: Entity Framework automatically links loaded entities as long as they're loaded into the same context. Just make sure lazy loading is disabled, to prevent EF from automatically pulling in all other transactions that you specifically tried to filter out.
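The Skip/Take arithmetic used in both answers can be wrapped in a small helper. The sketch below works on in-memory sequences and always sorts before paging; GetPage and its parameters are made-up names for illustration. For a database query you would presumably take an IQueryable<T> and an Expression<Func<T, TKey>> instead, so that the ordering and paging are translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Returns the 1-based page of the source, sorted by orderKey.
    // Sorting first matters: without a defined order, "page 2" is not well defined.
    public static List<T> GetPage<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> orderKey, int page, int pageSize = 10)
    {
        if (page < 1) throw new ArgumentOutOfRangeException(nameof(page));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return source
            .OrderBy(orderKey)
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

For example, PagingSketch.GetPage(transactions, t => t.Id, page) would return up to ten transactions for the requested page, assuming the transactions have an Id to sort on.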

EF7 projection doesnt eager load collections

When selecting entities with Include, all my items get fetched with a single SQL join statement. But when I project them into some other form together with their children, the join is no longer executed; instead, a separate query per row is executed to get the children. How can I prevent this? My goal is to reduce the columns fetched and to reduce the number of queries.
This issue leads me to believe that this should work: https://github.com/aspnet/EntityFramework/issues/599
//executes ONE query as expected
context.Parents.Include(p => p.Children).ToList();
//executes MULTIPLE queries
context.Parents.Include(p => p.Children).Select(p => new {
    Id = p.Id,
    Name = p.Name,
    Children = p.Children.Select(c => new {
        Id = c.Id,
        Name = c.Name
    })
}).ToList();
You are seeing multiple queries sent to the database because EF Core is not yet smart enough to translate navigations in a projection to a JOIN. Here is the issue tracking this feature - https://github.com/aspnet/EntityFramework/issues/4007.
BTW, as previously mentioned by others, Include only works when the entity type is part of the result (i.e. it means "if you end up creating instances of the entity type then make sure this navigation property is populated").
Your problem is here:
Children = p.Children.Select(c => new {
    Id = c.Id,
    Name = c.Name
})
The eager-loading statement Include() only works for queries without projections.
Instead, you can do this:
context.Parents.Include(p => p.Children).AsEnumerable()
    .Select(p => new {
        Id = p.Id,
        Name = p.Name,
        Children = p.Children.Select(c => new {
            Id = c.Id,
            Name = c.Name
        })
    }).ToList();
AsEnumerable() tells EF that all code after it should be executed in memory on the loaded objects and should not be translated into the SQL query.
This is partially fixed for EFCore 2.0.0-preview2 (currently not on nuget)
https://github.com/aspnet/EntityFramework/commit/0075cb7c831bb3618bee4a84c9bfa86a499ddc6c
This is partially addressed by #8584 - if the collection navigation is
not composed on, we re-use include pipeline which creates 2 queries,
rather than N+1. If the collection is filtered however, we still use
the old rewrite and N+1 queries are being issued
I don't know exactly what "composed on" means here, or whether it just means that the child collection cannot be filtered (probably acceptable for most cases), but my guess is that your original query as it stands would now work.

Search and select multiples with where linq/lambda expressions

I currently have the following code:
var FirstNameList = db.Clients.Include(x => x.FirstNames).Include(x => x.Addresses).SelectMany(a => a.FirstNames).Where(x => x.Name.ToLower().Trim() == "Max".ToLower().Trim()).ToList();
I have navigation properties FirstNames and Addresses, which I wish to include in the result.
I use the SelectMany statement because it is the only one that works for me. Kind of. It returns all the FirstNames where the Name equals Max.
What I would like it to do is return all the Clients that have a FirstNames entry whose Name equals Max.
The other way I thought about doing this was to take all the IDs returned in FirstNameList and then query the Clients by those IDs, but then I would be querying the database twice, which seems inefficient.
My question is: is this possible, and if so, how would I go about querying the database to return my Clients?
Kind regards
The following query should give you what you're looking for. You can look within each client's FirstNames and see if any of them are named "max". (In this case, since "max" is a constant you're typing in, I removed the ToLower().Trim() from it)
var clientsNamedMax = db.Clients.Include(x => x.FirstNames).Include(x => x.Addresses).Where(x => x.FirstNames.Any(y => y.Name.ToLower().Trim() == "max")).ToList();
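To show the difference between the two query shapes, here is a small in-memory sketch (LINQ to Objects). The Client and FirstName classes are minimal stand-ins, and the helper names are invented for this example. It normalizes with ToLowerInvariant, which is an in-memory choice; the database query above uses ToLower.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, minimal stand-ins for the entities in the question.
public class FirstName
{
    public string Name { get; set; } = "";
}

public class Client
{
    public int Id { get; set; }
    public List<FirstName> FirstNames { get; set; } = new List<FirstName>();
}

public static class ClientSearchSketch
{
    private static string Normalize(string s)
    {
        return s.Trim().ToLowerInvariant();
    }

    // SelectMany flattens the children, so the result is a list of names
    // and the owning clients are lost.
    public static List<FirstName> MatchingNames(IEnumerable<Client> clients, string name)
    {
        return clients
            .SelectMany(c => c.FirstNames)
            .Where(n => Normalize(n.Name) == Normalize(name))
            .ToList();
    }

    // Where + Any keeps the parents: a client is returned once if at
    // least one of its names matches.
    public static List<Client> ClientsWithName(IEnumerable<Client> clients, string name)
    {
        return clients
            .Where(c => c.FirstNames.Any(n => Normalize(n.Name) == Normalize(name)))
            .ToList();
    }
}
```

With a client that has both "max" and "MAX" among its first names, MatchingNames returns two entries for it, while ClientsWithName returns that client only once.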

Confused by LINQ 2 SQL behaviour when selecting anonymous types

I'm really confused by a behaviour of LINQ I'm seeing and it's causing me a problem.
I wrote a query like:
var reportalerts = pushDB.ReportAlerts
    .Select(p => new {p.Title, p.Url, p.DateStamp})
    .OrderBy(p => p.DateStamp)
    .Take(numResultsPerPage);
This creates the SQL that I'd expect:
SELECT TOP (5) [t0].[Title], [t0].[Url], [t0].[DateStamp]
FROM [dbo].[ReportAlerts] AS [t0]
ORDER BY [t0].[DateStamp]
If I then add an extra property to my anonymous type, the sql generated is radically different:
var reportalerts = pushDB.ReportAlerts
    .Select(p => new {p.Title, p.Url, p.DateStamp, p.Text})
    .OrderBy(p => p.DateStamp)
    .Take(numResultsPerPage);
Becomes:
SELECT TOP (5) [t0].[Title], [t0].[Url], [t0].[DateStamp], [t0].[PushReportAlertID], [t0].[DateOfAlert], [t0].[AlertProductID], [t0].[Description], [t0].[Mpid], [t0].[UIName], [t0].[CustomerDesc], [t0].[ProductArea]
FROM [dbo].[ReportAlerts] AS [t0]
ORDER BY [t0].[DateStamp]
That is it takes every column from the table now. It's like it has decided, this guy is selecting enough columns for me to just go and grab all of them now. This is a problem for me as I want to concat (i.e. UNION ALL) the query with a similar one from another table which has different columns. (before the order by/take) If it would take only the columns that are specified by my anonymous type properties it would not be a problem, but as it takes all the columns and the two tables have differing columns, it fails.
I can solve the problem in various different ways, so I'm not stopped by this, I just want to understand what is happening above, and if there is not a way you can get it to just return the columns you want.
Gar, it's because I'm an idiot.
The problem is that Text is not a property on the table, it is there to satisfy an interface and actually returns another property which is part of the table.
Changing the above to:
var reportalerts = pushDB.ReportAlerts
    .Where(p => subscribedMpids.Contains(p.Mpid))
    .Select(p => new {p.Title, p.Url, p.DateStamp, Text = p.Description})
    .OrderBy(p => p.DateStamp)
    .Take(numResultsPerPage);
Works as expected.
