Use skip and take inside a LINQ include - c#

I have an object that has a property which is a collection of another object. I would like to load just a subset of the collection property using LINQ.
Here's how I'm trying to do it:
manager = db.Managers
.Include(m => m.Transactions.Skip((page - 1) * 10).Take(10))
.Where(m => m.Id == id)
.FirstOrDefault();
The code above throws an error that says
The Include path expression must refer to a navigation property defined on the type. Use dotted paths for reference navigation properties and the Select operator for collection navigation properties. Parameter name: path
What is the right way to do this in LINQ? Thanks in advance.

You cannot do this with Include; EF does not know how to translate that expression to SQL. You can, however, do something similar with a sub-query.
var result = db.Managers
    .Where(m => m.Id == id)
    .Select(m => new
    {
        Manager = m,
        Transactions = db.Transactions
            .Where(t => t.ManagerId == m.Id)
            // .OrderBy(...) // LINQ to Entities requires an ordering before Skip
            .Skip((page - 1) * 10)
            .Take(10)
    })
    .FirstOrDefault();
This will not return an instance of the Manager class; it returns an anonymous object that holds the manager and the requested page of transactions. It should be easy to modify it to suit your needs.
You also have two other options:
Load all transactions and then filter in memory. Of course, if there are a lot of transactions, this might be quite inefficient.
Don't be afraid to make two queries to the database. This is a prime example of a case where that is probably the best route, and it will probably be the most efficient way of doing it.
Either way, if you are concerned with performance at all, I would advise you to test all three approaches and see which is the fastest. And please let us know what the results were!

Sometimes the added complexity of putting everything in a single query is not worth it. I would split this up into two separate queries:
var manager = db.Managers.SingleOrDefault(m => m.Id == id);
var transactions = db.Transactions
.Where(t => t.ManagerId == id)
// .OrderBy(...)
.Skip((page - 1) * 10).Take(10)
.ToList();
Note that after doing this, manager.Transactions can be used as well to refer to those just-loaded transactions: Entity Framework automatically links loaded entities as long as they're loaded into the same context. Just make sure lazy loading is disabled, to prevent EF from automatically pulling in all other transactions that you specifically tried to filter out.
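The paging arithmetic used in both answers can be tried without a database. The following is a minimal sketch that runs against an in-memory list instead of Entity Framework; the tuple shape, the sample data and the pageSize constant are assumptions made for illustration and do not come from the original posts.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for db.Transactions:
// 25 rows for manager 1 and 5 rows for manager 2.
var transactions = Enumerable.Range(1, 30)
    .Select(i => (Id: i, ManagerId: i <= 25 ? 1 : 2))
    .ToList();

const int pageSize = 10;   // the original snippets hard-code 10
int id = 1;                // manager whose transactions we want
int page = 3;              // 1-based page number

var pageOfTransactions = transactions
    .Where(t => t.ManagerId == id)
    .OrderBy(t => t.Id)              // a stable order makes paging deterministic
    .Skip((page - 1) * pageSize)     // page 3 skips the first 20 matching rows
    .Take(pageSize)                  // the last page may hold fewer than pageSize rows
    .ToList();

Console.WriteLine(string.Join(", ", pageOfTransactions.Select(t => t.Id)));
// prints "21, 22, 23, 24, 25"
```

Because page is 1-based, page 1 skips nothing, and a page past the end simply yields an empty list rather than an error.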

Related

How to query based on an previous object result? Entity Framework

I am extremely stuck with getting the right information from the DB. Basically, the problem is that I need to add a where clause to my statement so that it only retrieves the information that is actually needed.
public async Task<IEnumerable<Post>> GetAllPosts(int userId, int pageNumber)
{
var followersIds = _dataContext.Followees.Where(f => f.CaUserId == userId).AsQueryable();
pageNumber *= 15;
var posts = await _dataContext.Posts
.Include(p => p.CaUser)
.Include(p => p.CaUser.Photos)
.Include(c => c.Comments)
.Where(u => u.CaUserId == followersIds.Id) <=== ERROR
.Include(l => l.LikeDet).ToListAsync();
return posts.OrderByDescending(p => p.Created).Take(pageNumber);
}
As you can see, followersIds contains all the required IDs that I need to check against in the posts query. I have tried a foreach, but nothing seems to work here. Can somebody help me with this issue?
The short version is that you can change the line you have marked as an error to something like .Where(u => followersIds.Contains(u.CaUserId)), which returns all entities whose CaUserId is contained in followersIds. For that to compile, followersIds has to be a sequence of IDs rather than of Followee entities, so project it to the ID column first. (You might also need to check the syntax a bit; I'm writing from memory without an IDE open.) This still has the potential to return a much larger dataset than you actually need and to be quite a large query. You are including a lot of linked entities in that query, so maybe you'd be better off using a Select query rather than a plain Where query, which would load only the properties that you need from each entity.
Take a look at this article from Jon Smith, who wrote the book "Entity Framework Core In Action", where he talks about using Select queries and DTOs to get out only what you need. Chances are, you don't need every property of every entity you are asking for in the query above. (Maybe you do, what do I know :p) Using this might help you get something much more efficient for just the dataset you need: more lines of code in the query, but potentially better performance on the back end and a lighter memory footprint.
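The Contains filter described above can be sketched with in-memory collections. The tuple shapes and the names FollowerId, FollowedId and AuthorId are hypothetical stand-ins, since the original post does not show the Followee entity's properties; the point is only that the IDs are projected out first so that Contains compares int with int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: (FollowerId, FollowedId) pairs and (PostId, AuthorId) posts.
var follows = new List<(int FollowerId, int FollowedId)> { (1, 2), (1, 3), (4, 2) };
var posts = new List<(int PostId, int AuthorId)> { (10, 2), (11, 3), (12, 4), (13, 2) };

int userId = 1;

// Project to plain IDs first, so Contains compares int with int.
var followedIds = follows
    .Where(f => f.FollowerId == userId)
    .Select(f => f.FollowedId)
    .ToList();

// Keep only posts written by someone the user follows, newest (highest ID) first.
var feed = posts
    .Where(p => followedIds.Contains(p.AuthorId))
    .OrderByDescending(p => p.PostId)
    .ToList();

Console.WriteLine(string.Join(", ", feed.Select(p => p.PostId)));
// prints "13, 11, 10"
```

Against a real context the same shape would typically be translated into a SQL IN clause or sub-query, but that depends on the provider and version, so treat this only as an illustration of the query shape.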

EF Core 2.1 Group By for Views

I have a view in SQL; let's call it MyCustomView.
If I were to write a simple SQL query to count and sum, I could do something like: SELECT COUNT(*), SUM(ISNULL(ValueA, ValueB)) FROM MyCustomView
Is it possible to translate that query into EF Core? Digging around, I found answers mentioning the use of GroupBy(p => 1) (however, this doesn't seem to work for views), i.e.
context
    .Query<MyCustomView>()
    .GroupBy(p => 1)
    .Select(grp => new { count = grp.Count(), total = grp.Sum(p => p.ValueA ?? p.ValueB) })
The issue I am having is that whenever I attempt to run the query, I get a complaint about having to run the group by on the client. However, if I replace .Query<MyCustomView>() with a DbSet property from the context, the query works fine. So I am guessing it has to do with the fact that I am trying to execute the operation on a view. Is there a way to achieve this behaviour with a view, or am I out of luck again with EF Core :(
Querying views is notoriously slow when they are not indexed. Instead, you can convert the view results into a list first and then query that list. The grouping then runs in memory rather than on the SQL side, which should speed up the overall process; note that ToList() loads every row of the view into memory first.
context
    .Query<MyCustomView>()
    .ToList()
    .GroupBy(p => 1)
    .Select(grp => new { count = grp.Count(), total = grp.Sum(p => p.ValueA ?? p.ValueB) })
I will say, the proper solution (if you can do it) is to index the view.
For anyone who is curious (or until someone else manages to provide an answer), I managed to get it to work by creating a LINQ query like this:
const int a = 1;
context
    .Query<MyCustomView>()
    // For some reason adding the below select lets it execute
    .Select(p => new { p.ValueA, p.ValueB })
    .GroupBy(p => a)
    .Select(grp => new { count = grp.Count(), total = grp.Sum(p => p.ValueA ?? p.ValueB) })
    .First();
Also, according to the EF Core team, this has been sorted in EF Core 3+; unfortunately, I haven't got the luxury of upgrading to 3.
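To see what the constant-key GroupBy computes, here is a small sketch that runs on an in-memory array rather than on a view. The tuple shape and the sample values are assumptions for illustration; only the ValueA and ValueB names come from the question.

```csharp
using System;
using System.Linq;

// Hypothetical rows of the view: ValueA is nullable, ValueB is the fallback.
var rows = new (int? ValueA, int ValueB)[]
{
    (5, 100),
    (null, 7),
    (3, 100),
};

var summary = rows
    .GroupBy(r => 1)   // constant key: every row falls into the same single group
    .Select(g => new
    {
        Count = g.Count(),
        Total = g.Sum(r => r.ValueA ?? r.ValueB)   // mirrors SUM(ISNULL(ValueA, ValueB))
    })
    .First();

Console.WriteLine($"{summary.Count} rows, total {summary.Total}");
// prints "3 rows, total 15"
```

One difference from the SQL aggregate is worth keeping in mind: with no rows at all, GroupBy produces no groups and First() throws, whereas SELECT COUNT(*) still returns a row. FirstOrDefault() is the safer call when the source may be empty.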

Include Path Expression Must Refer To A Navigation Property

I've searched a lot about my problem but didn't find any clear solution. I know that I can't use a Where clause inside Include, but then I don't see how I am supposed to write this query.
var brands = await _context.Brands
.Include(x => x.FoodCategories
.Select(y => y.Products
.Where(z => z.Sugar)
.Select(w => w.FileDetail)))
.ToListAsync();
I want to apply the Where statement to Products, but I still want the entities in a hierarchy like I have here. How can I do it?
I've already tried the approaches from several Stack Overflow answers, but I'm not getting the point. Here is my attempt:
var brands = _context.Brands
.Select(b => new
{
b,
FoodCategories = b.FoodCategories
.Where(x => x.BrandId == b.BrandId)
.Select(c => new
{
c,
Products = c.Products
.Where(y => y.FoodCategoryId == c.FoodCategoryId &&
y.Sugar)
.Select(p => new
{
p,
File = p.FileDetail
})
})
})
.AsEnumerable()
.Select(z => z.b)
.ToList();
But it is not returning all the product items instead of sugar only products.
Why you're only getting sugar products.
"But it is not returning all the product items instead of sugar only products."
Of course it is. Because you're asking it to only give you the sugar products:
var brands = _context.Brands
.Select(b => new
{
b,
FoodCategories = b.FoodCategories
.Where(x => x.BrandId == b.BrandId)
.Select(c => new
{
c,
Products = c.Products
.Where(y => y.FoodCategoryId == c.FoodCategoryId
&& y.Sugar) //HERE!
.Select(p => new
{
p,
File = p.FileDetail
})
})
})
.AsEnumerable()
.Select(z => z.b)
.ToList();
If you want all products; then don't filter on only the ones where Sugar is set to true.
There is a lot of redundant code here.
b.FoodCategories.Where(x => x.BrandId == b.BrandId)
b.FoodCategories already expresses the food categories of this particular brand b. You don't need the Where.
The same applies to
c.Products.Where(y => y.FoodCategoryId == c.FoodCategoryId ... )
Here's an improved version of your (second) snippet:
var brands = _context.Brands
.Select(b => new
{
b,
FoodCategories = b.FoodCategories
.Select(c => new
{
c,
Products = c.Products
.Select(p => new
{
p,
File = p.FileDetail
})
})
})
.AsEnumerable()
.Select(z => z.b)
.ToList();
This should make it clearer that the custom Select logic isn't necessary. All you're doing is loading the related entities into properties of the same name. You can simply rely on the existing entities and their relations, there's no reason to define the same relationship again.
A custom Select would only be desirable here if:
You want to limit the retrieved columns in order to lower the data size (useful for large queries).
You want to selectively load children, not just all related children. Your code suggests that you want this, but then you say "But it is not returning all the product items", so I conclude that you don't want to filter the products on their sugar content.
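The second case, selectively loading children, can be illustrated with a projection over in-memory objects. This is only a sketch: the anonymous types and sample data are made up, and it uses LINQ to Objects rather than Entity Framework, so it shows the shape of the result, not the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical data: each brand has products flagged as containing sugar or not.
var brands = new[]
{
    new { Name = "A", Products = new[] { new { Name = "Cola", Sugar = true }, new { Name = "Water", Sugar = false } } },
    new { Name = "B", Products = new[] { new { Name = "Tea", Sugar = false } } },
};

// Every brand is kept; only the child collection is filtered.
var result = brands
    .Select(b => new
    {
        Brand = b.Name,
        SugarProducts = b.Products.Where(p => p.Sugar).Select(p => p.Name).ToList()
    })
    .ToList();

foreach (var r in result)
    Console.WriteLine($"{r.Brand}: [{string.Join(", ", r.SugarProducts)}]");
// prints "A: [Cola]" and then "B: []"
```

Note that brand B still appears with an empty list: filtering the children inside the projection does not filter the parents. Dropping brands without sugar products would need a separate Where on the brand query.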
Why your Include didn't work.
Simply put: you cannot use Where statements in includes.
Include statements are based on the structure of the entities, whereas a Where only filters data from a set. One has nothing to do with the other.
And even though you'd think it'd be nice to do something like "include the parent only if they have an active status", that's simply not how Include was designed to work.
Include boils down to "for every [type1], also load their related [type2]". This will be done for every [type1] object that your query will instantiate and it will load every related [type2].
Taking the next step in refactoring the above snippet:
var brands = _context.Brands
.Include(b => b.FoodCategories)
.Include(b => b.FoodCategories.Select(fc => fc.Products))
.Include(b => b.FoodCategories.Select(fc => fc.Products.Select(p => p.FileDetail)))
.ToList();
The includes give Entity Framework specific instructions:
For every loaded brand, load its related food categories.
For every loaded food category, load its related products.
For every loaded product, load its related file details.
Notice that it does not instruct WHICH brands should be loaded! This is an important distinction to make. The Include statements do not in any way filter the data, they only explain what additional data needs to be retrieved for every entry that will be loaded.
Which entries will be loaded has not been defined yet. By default, you get the whole dataset, but you can apply further filtering using Where statements before you load the data.
Think of it this way:
A restaurant wants every new customer's mother to give permission to serve dessert to the customer. Therefore, the restaurant drafts a rule: "every customer must bring their mother".
This is the equivalent of db.Customers.Include(c => c.Mother).
This does not state which customers are allowed to visit the restaurant. It only states that any customer that visits the restaurant must bring their mother (if they have no mother, they will bring null instead).
Notice how this rule applies regardless of which customers visit the restaurant:
Ladies night: db.Customers.Include(c => c.Mother).Where(c => c.IsFemale)
Parents night: db.Customers.Include(c => c.Mother).Where(c => c.Children.Any())
People whose father is named Bob night: db.Customers.Include(c => c.Mother).Where(c => c.Father.Name == "Bob")
Take note of the third example. Even though you filter on the father, you will only load the mother entity. It's perfectly possible to filter items on related entity values without actually loading the entities themselves (fathers).
You may ask yourself "why Select?". That's a good question, because it's not intuitive here.
Ideally, you'd want to do something like
context.Brands.Include(b => b.FoodCategories.Products.FileDetail)
But this is not possible because of a limitation in the language. FoodCategories is a List<FoodCategory>, which does not have a Products property.
However, FoodCategory itself does have a Products property. This is why Select is used: it allows you to access the properties of the list element's type, rather than the list itself.
Internally, EF is going to deconstruct your Select statement (which is an Expression) and it will figure out which property you want to be loaded. Don't worry too much about how EF works behind the scenes. It's not always pretty.
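To make "deconstructing an Expression" a little more concrete, here is a minimal sketch of how a lambda passed as an expression tree can be inspected to find the property it refers to. This is not EF's actual implementation; it only shows the general mechanism, using string.Length as a stand-in for a navigation property.

```csharp
using System;
using System.Linq.Expressions;

// Assigning a lambda to Expression<Func<...>> gives a data structure, not compiled code.
Expression<Func<string, int>> path = s => s.Length;

// The body of "s => s.Length" is a member access; its Member describes the property.
var member = (MemberExpression)path.Body;

Console.WriteLine(member.Member.Name);
// prints "Length"
```

A library that receives such an expression can walk it in the same way to collect property names, which is why the lambda passed to Include has to be a plain property path (optionally with Select) rather than arbitrary code such as Where or Skip.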
The Include/Select syntax is not the prettiest. Especially when you drill down multiple levels, it becomes cumbersome to write (and read).
So I suggest you invert your approach (start at the lowest child, drill up to the parent). Technically, it yields the same result, but it allows for a neater Include syntax:
var brands = context.FileDetails
.Include(fd => fd.Product)
.Include(fd => fd.Product.FoodCategory)
.Include(fd => fd.Product.FoodCategory.Brand)
.Select(fd => fd.Product.FoodCategory.Brand)
Now you don't need any nasty Select workaround in order to reference the related types.
Do note that you need to put an Include for every step! You can't just use the last Include and skip the others. EF does not infer that it needs to load multiple relations from a single Include.
Note that this trick only really works if you have a chain of one-to-many relationships. Many-to-many relationships make it harder to apply this trick. At worst, you'll have to resort to using the Select syntax from the earlier example.
While I am not a fan of the Include methods that take a string parameter (I don't like hardcoded strings that can fail on typos), I do feel it's relevant to mention here that they do not suffer from this issue. If you use the string-based includes, you can do things like:
context.Brands
.Include("FoodCategories")
.Include("FoodCategories.Products")
.Include("FoodCategories.Products.FileDetail")
The parsing logic of the string include method will automatically look for the element inside the List, thereby effectively preventing the ugly syntax.
But there are other reasons why I generally don't advise using string parameters here (doesn't update when you rename a property, no intellisense, very prone to developer error)

Include() vs Select() performance

I have a parent entity with a navigation property to a child entity. The parent entity may not be removed as long as there are associated records in the child entity. The child entity can contain hundreds of thousands of records.
I'm wondering what will be the most efficient to do in Entity Framework to do this:
var parentRecord = _context.Where(x => x.Id == request.Id)
.Include(x => x.ChildTable)
.FirstOrDefault();
// check if parentRecord exists
if (parentRecord.ChildTable.Any()) {
// cannot remove
}
or
var parentRecord = _context.Where(x => x.Id == request.Id)
.Select(x => new {
ParentRecord = x,
HasChildRecords = x.ChildTable.Any()
})
.FirstOrDefault();
// check if parentRecord exists
if (parentRecord.HasChildRecords) {
// cannot remove
}
The first query may include thousands of records while the second query will not, however, the second one is more complex.
Which is the best way to do this?
I would say it depends. It depends on which DBMS you're using, on how well the optimizer works, and so on.
A single statement with a JOIN could be far faster than a lot of SELECT statements.
In general, I would say that when you need the rows from your child table, use .Include(). Otherwise, don't include them.
Or, in simple words, just read the data you need.
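The difference between asking "is there any child row?" and loading all child rows can be demonstrated in memory. The sketch below uses a hypothetical iterator in place of the child table and counts how many rows each approach reads; it says nothing about the SQL that Entity Framework would generate, only about why reading just what you need matters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int rowsRead = 0;

// Hypothetical stand-in for a large child table; counts how many rows are actually read.
IEnumerable<int> ChildRows()
{
    for (int i = 0; i < 100_000; i++)
    {
        rowsRead++;
        yield return i;
    }
}

// Any() stops as soon as it has seen one element.
bool hasChildren = ChildRows().Any();
int readByAny = rowsRead;

// Materializing the collection first reads every row before Any() is even called.
rowsRead = 0;
bool hasChildrenViaList = ChildRows().ToList().Any();
int readByList = rowsRead;

Console.WriteLine($"Any(): {readByAny} row(s) read; ToList().Any(): {readByList} row(s) read");
// prints "Any(): 1 row(s) read; ToList().Any(): 100000 row(s) read"
```

Both calls return true, but the first one touches a single row. The second query in the question has the same intent: it asks only whether a child record exists instead of pulling the child rows into memory.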
The answer depends on your database design. Which columns are indexed? How much data is in table?
Include() offloads work to your C# layer, but means a simpler query. It's probably the better choice here, but you should consider extracting the SQL that is generated by Entity Framework and running each version through an optimisation check.
You can output the SQL generated by Entity Framework to your Visual Studio console, as noted here.
This example might create a better SQL query that suits your needs.

fluent nhibernate outer join 2 query

I am using linq-to-nhibernate with fluent (c#), and I want to do the following:
I have two IQueryable interfaces, relations and documentsrelations.
The relations object contains a list of enterprises and the documentsrelations object contains a second list of enterprises.
I want to generate a new list of enterprises that contains the list of enterprises (relations) minus the second list of enterprises (documentsrelations).
In SQL I would try an outer join, but I don't know how to do this here.
**** DECLARATIONS ****
IQueryable<EnterpriseRelation> documentsrelations =
shared_doc.SharedIn.AsQueryable();
var relations = EnterpriseRelationService
.QueryRelationsForEnterprise(LoggedUser.ActiveAsEnterprise)
.Where(x => x.ContactingEnterprise.NIF == LoggedUser.ActiveAsEnterprise.NIF);
relations is also an IQueryable<EnterpriseRelation>.
I tried multiple things, but it always tells me that it is not supported.
Some help?
Thanks!
Supposing your entity primary key is Id, have you tried something like below?
var relations = EnterpriseRelationService
.QueryRelationsForEnterprise(LoggedUser.ActiveAsEnterprise)
.Where(x => x.ContactingEnterprise.NIF == LoggedUser.ActiveAsEnterprise.NIF)
.Where(x => !documentsrelations.Select(dr => dr.Id).Contains(x.Id));
Note:
Using .AsQueryable() as you do in your question is a mistake most of the time. If SharedIn is not actually a queryable instance, it will just be converted to a LINQ-to-Objects queryable that executes in memory, not in the database. In your case there is no need to have it as a queryable anyway, at least with my answer above, unless it is already a queryable awaiting execution in the database.
If my answer fails, try again without your AsQueryable call on SharedIn.
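The "first list minus second list" logic of the answer can be checked on plain collections. This is a minimal sketch with made-up integer IDs standing in for the entity keys; it runs in memory and does not show whether a given NHibernate version can translate the query.

```csharp
using System;
using System.Linq;

// Hypothetical primary keys of the two sets of relations.
var relationIds = new[] { 1, 2, 3, 4, 5 };
var documentRelationIds = new[] { 2, 4 };

// Same shape as the answer: keep the items whose ID is not in the second list.
var remaining = relationIds
    .Where(id => !documentRelationIds.Contains(id))
    .ToList();

// For distinct IDs, Except yields the same elements when working in memory.
var viaExcept = relationIds.Except(documentRelationIds).ToList();

Console.WriteLine(string.Join(", ", remaining));
// prints "1, 3, 5"
```

The negated Contains form is the one used in the answer because it is written against a projected ID column; whether Except is supported by a given LINQ provider is a separate question.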
