Recursive linq results returning duplicates - c#

This question builds off of one I asked last week: Recursive linq to get infinite children. The answer given in that post produced what I needed: a distinct list of Locations and their children based on a parent. We needed to use our own model for Locations, so we created one, and since then I've been getting duplicate results. Our model is very basic:
class LocationModel
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }
}
If you compare it to the entity created by EF, I just cut out all the fields we don't need/use (see link above). So I modified my linq statements to use this new model instead:
DBEntities db = new DBEntities();
public IEnumerable<LocationModel> GetAllChildLocations(int parentId)
{
    var locations = (from l in db.Locations
                     where l.ParentLocationID == parentId ||
                           l.LocationID == parentId
                     select new LocationModel()
                     {
                         LocationID = l.LocationID,
                         ParentLocationID = l.ParentLocationID,
                         LocationName = l.LocationName
                     }).ToList();
    var children = locations.AsEnumerable()
        .Union(db.Locations.AsEnumerable()
            .Where(x => x.ParentLocationID == parentId)
            .SelectMany(y => GetAllChildLocations(y.LocationID)))
        .ToList();
    return children.OrderBy(l => l.LocationName);
}
When I run it, either in Visual Studio or in LinqPad, I now get duplicates. Here's the original code that does not produce duplicates:
public IEnumerable<Location> GetAllChildLocations(int parentId)
{
    var locations = (from l in db.Locations
                     where l.ParentLocationID == parentId ||
                           l.LocationID == parentId
                     select l).ToList();
    var child = locations.AsEnumerable()
        .Union(db.Locations.AsEnumerable()
            .Where(x => x.ParentLocationID == parentId)
            .SelectMany(y => GetAllChildLocations(y.LocationID)))
        .ToList();
    return child;
}
Why is it producing duplicates when I use my own model vs. the generated one from EF? Does it have to do with the auto-generating fields that the EF model has and mine doesn't?

Why is it producing duplicates when I use my own model vs. the generated one from EF?
Because you are using the Enumerable.Union method, which compares elements with the default equality comparer. For a class that does not override Equals and GetHashCode, that means reference equality. The EF DbContext change tracker keeps (tracks) the already loaded entity instances, so an entity with the same PK comes back as the same object even when it is retrieved via separate database queries, and reference equality works. The same cannot be said for the new LocationModel instances created by the query's select operator.
One way to resolve it is to implement GetHashCode and Equals in your LocationModel class. But in general I don't like the implementation of the recursive children retrieval and the usage of Union. There must be a better way, but that is outside the scope of this question (it belongs to the linked one).
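A minimal sketch of that first option, assuming LocationID alone identifies a location (that is an assumption; the post does not say which fields should define equality). With Equals and GetHashCode overridden, Union treats two separately created LocationModel instances with the same LocationID as one element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LocationModel : IEquatable<LocationModel>
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }

    // Assumption: two models describe the same location when their IDs match.
    public bool Equals(LocationModel other) => other != null && LocationID == other.LocationID;

    public override bool Equals(object obj) => Equals(obj as LocationModel);

    // Must be consistent with Equals, so it is derived from the same field.
    public override int GetHashCode() => LocationID.GetHashCode();
}

static class Program
{
    static void Main()
    {
        var first = new[] { new LocationModel { LocationID = 1, LocationName = "HQ" } };
        var second = new[] { new LocationModel { LocationID = 1, LocationName = "HQ" } };

        // Two distinct instances with the same ID: Union keeps a single element.
        Console.WriteLine(first.Union(second).Count()); // prints 1
    }
}
```

An alternative with the same effect would be to leave the model untouched and pass a custom IEqualityComparer<LocationModel> to Union.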
The root of the evil for me is the following condition
    where l.ParentLocationID == parentId || l.LocationID == parentId
which selects both the item and its children at every level of the recursion, so each child is returned once as a child of its parent and again as the root of its own recursive call. Those duplicates are then supposed to be eliminated by the Union method. A good implementation will not generate duplicates at all.
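One way to avoid generating duplicates in the first place, sketched here over an in-memory list rather than the asker's db.Locations: load the rows once, index them by parent, and walk the tree so that each node is emitted exactly once. The helper name GetSelfAndDescendants and the sample data are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LocationModel
{
    public int LocationID { get; set; }
    public int ParentLocationID { get; set; }
    public string LocationName { get; set; }
}

static class Program
{
    // Yields the location with the given id, then all of its descendants, each once.
    static IEnumerable<LocationModel> GetSelfAndDescendants(
        IReadOnlyCollection<LocationModel> all, int rootId)
    {
        var byParent = all.ToLookup(l => l.ParentLocationID);
        var pending = new Stack<LocationModel>(all.Where(l => l.LocationID == rootId));

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            yield return current;

            foreach (var child in byParent[current.LocationID])
            {
                // Skip rows that list themselves as their own parent.
                if (child.LocationID != current.LocationID) pending.Push(child);
            }
        }
    }

    static void Main()
    {
        var all = new List<LocationModel>
        {
            new LocationModel { LocationID = 1, ParentLocationID = 0, LocationName = "HQ" },
            new LocationModel { LocationID = 2, ParentLocationID = 1, LocationName = "East" },
            new LocationModel { LocationID = 3, ParentLocationID = 1, LocationName = "West" },
            new LocationModel { LocationID = 4, ParentLocationID = 2, LocationName = "Lab" },
            new LocationModel { LocationID = 5, ParentLocationID = 0, LocationName = "Other" },
        };

        var result = GetSelfAndDescendants(all, 1).OrderBy(l => l.LocationName).ToList();

        // HQ, East, West and Lab are returned; "Other" is not under HQ.
        Console.WriteLine(result.Count); // prints 4
    }
}
```

Because no node is visited twice, neither Union nor a custom equality implementation is needed. The trade-off is that the whole table is read into memory once, which may or may not be acceptable depending on its size.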

Related

Retrieving elements from a join with into Entity Framework

I have this code:
private IQueryable<Trip> GetTripWithBreaksAndPassengers(DB_Context context, long id)
{
return from t in context.trip
where t.Id == id
join breaks in context.breaks on t.Id equals breaks.tripid into breaksJoin
join drivers in context.drivers on t.Id equals drivers.tripid into driversJoin
select new Trip() { TripBreaks = ?, TripDrivers = ?};
}
For my Trip specified by an id, I want to return a list of Breaks and Drivers.
My Trip object has two fields which are lists, TripBreaks and TripDrivers:
public virtual List<TripBreak> TripBreaks { get; set; }
public virtual List<TripDriver> TripDrivers { get; set; }
I want both of them to be returned as part of a Trip - I am expecting breaksJoin and driversJoin to hold those specific results, but if queried like
TripDrivers = driversJoin.ToList()
it will throw an error.
How should I use those join results to get the elements held?
Explicit joins in an EF project are a code smell.
You should be able to use navigation properties and Include() to get the data you want, something like this.
var result = context.trip.Include(t => t.Breaks).Include(t => t.Drivers).FirstOrDefault(t => t.Id == id);
This will get you all the related entities in one go.
Adjust property names accordingly, since you didn't share your model classes.

Only primitive types do I need to cast?

I am trying to use anonymous types in Entity Framework, but I am getting an error about
Unable to create a constant value
MinQty and MaxQty are int so I don't know if I need to add to Convert.ToInt32?
Unable to create a constant value of type 'Anonymous type'. Only primitive types or enumeration types are supported in this context.
This builds a list object
var listOfLicense = (from l in db.License
select new
{
l.ProductId,
l.MinLicense,
l.MaxLicense
}).tolist();
This is the larger EF query where I am getting the error. Am I missing a cast?
var ShoppingCart = (from sc in db.ShoppingCarts
Select new model.Shoppingchart{
ShoppingCartId= sc.Id,
MinQty = (int)listOfLicense
.Where(mt => (int)mt.ProductId == sc.ProductId)
.Select(mt => (int)mt.MinLicense)
.Min(mt => mt.Value),
MaxQty = (int)listOfLicense
.Where(mt => (int)mt.ProductId == p.ProductId)
.Select(mt =>(int) mt.MaxQty)
.Max(mt => mt.Value)}.tolist();
This builds a list object
var listOfLicense = (from l in db.License
select new
{
l.ProductId,
l.MinLicense,
l.MaxLicense
})
The above example does not build a list of objects. It builds a query to return objects of that anonymous type.
This builds an in-memory list of objects of that type:
var listOfLicense = (from l in db.License
select new
{
l.ProductId,
l.MinLicense,
l.MaxLicense
}).ToList();
Using .ToList() here will execute the query and return a materialized list of the anonymous types. From there, your code may work as expected without the exception. However, this is effectively loading the 3 columns from all rows in your database table, which may be a problem as the system matures and rows are added.
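The difference between a query and a materialized list can be illustrated with plain LINQ to Objects, which defers execution in the same way. This is only a sketch with made-up numbers; it does not involve EF or the asker's License table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    static void Main()
    {
        var licenses = new List<int> { 5, 10 };

        // A query: nothing has been evaluated yet.
        IEnumerable<int> query = licenses.Where(x => x > 3);

        // ToList() runs the query now and stores the results.
        List<int> snapshot = query.ToList();

        licenses.Add(20);

        Console.WriteLine(query.Count());  // prints 3: the query is re-evaluated against the source
        Console.WriteLine(snapshot.Count); // prints 2: the list was fixed when ToList() ran
    }
}
```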
The error you are getting isn't a casting issue; it is a translation issue. Because your initial query is still just an EF query (IQueryable), any further querying against it needs to conform to EF's limitations: EF has to be able to translate what your expressions select into SQL. In your case, your real code breaks those rules.
Generally it is better to let EF work with the IQueryable rather than materializing an entire list in memory. To accomplish that, though, we'd need to see either the real code or a minimal reproducible example.
This code:
MinQty = (int)listOfLicense
.Where(mt => (int)mt.ParentProductId == p.ProductId)
.Select(mt => (int)mt.MinLicense)
.Min(mt => mt.Value),
... does not fit with the above anonymous type as there is no correlation between what mt.ParentProductId is in relation to the anonymous type. (p seems to be associated with that type, not mt so there looks to be a lot of Query code missing from your example.)
Edit: based on your updated example:
var ShoppingCart = (from sc in db.ShoppingCarts
Select new model.Shoppingchart{
ShoppingCartId= sc.Id,
MinQty = (int)listOfLicense
.Where(mt => (int)mt.ProductId == sc.ProductId)
.Select(mt => (int)mt.MinLicense)
.Min(mt => mt.Value),
MaxQty = (int)listOfLicense
.Where(mt => (int)mt.ProductId == p.ProductId)
.Select(mt =>(int) mt.MaxQty)
.Max(mt => mt.Value)}.ToList();
It may be possible to build something like this into a single query expression depending on the relationships between ShoppingCart, Product, and Licence. It almost looks like "Licence" really refers to a "Product" which contains a min and max quantity that you're interested in.
Assuming a structure like:
public class Product
{
[Key]
public int ProductId { get; set; }
public int MinQuantity { get; set; }
public int MaxQuantity { get; set; }
// ...
}
// Here lies a question on how your shopping cart to product relationship is mapped. I've laid out a many-to-many relationship using ShoppingCartItems
public class ShoppingCart
{
[Key]
public int ShoppingCartId { get; set; }
// ...
public virtual ICollection<ShoppingCartItem> ShoppingCartItems { get; set; } = new List<ShoppingCartItem>();
}
public class ShoppingCartItem
{
    // Composite key: the column order is set through the Order property.
    [Key, Column(Order = 0), ForeignKey("ShoppingCart")]
    public int ShoppingCartId { get; set; }
    public virtual ShoppingCart ShoppingCart { get; set; }
    [Key, Column(Order = 1), ForeignKey("Product")]
    public int ProductId { get; set; }
    public virtual Product Product { get; set; }
}
With something like this, to get shopping carts with their product min and max quantities:
var shoppingCarts = db.ShoppingCarts
    .Select(sc => new model.ShoppingCart
    {
        ShoppingCartId = sc.ShoppingCartId,
        Products = sc.ShoppingCartItems
            .Select(sci => new model.Product
            {
                ProductId = sci.ProductId,
                // The quantities live on Product, reached through the navigation property.
                MinQuantity = sci.Product.MinQuantity,
                MaxQuantity = sci.Product.MaxQuantity
            }).ToList()
    }).ToList();
This would provide a list of Shopping Carts with each containing a list of products with their respective min/max quantities.
If you also wanted a Lowest min quantity and highest max quantity across all products in a cart:
var shoppingCarts = db.ShoppingCarts
    .Select(sc => new model.ShoppingCart
    {
        ShoppingCartId = sc.ShoppingCartId,
        Products = sc.ShoppingCartItems
            .Select(sci => new model.Product
            {
                ProductId = sci.ProductId,
                MinQuantity = sci.Product.MinQuantity,
                MaxQuantity = sci.Product.MaxQuantity
            }).ToList(),
        // Aggregates across all products in the cart.
        OverallMinQuantity = sc.ShoppingCartItems
            .Min(sci => sci.Product.MinQuantity),
        OverallMaxQuantity = sc.ShoppingCartItems
            .Max(sci => sci.Product.MaxQuantity),
    }).ToList();
Though I'm not sure how practical a figure like that might be in relation to a shopping cart structure. In any case, with navigation properties set up for the relationship between your entities, EF should be perfectly capable of building an IQueryable query for the data you want to retrieve without resorting to pre-fetching lists. One issue with pre-fetching and re-introducing those lists into further queries is that there is a maximum number of rows that EF can handle. As with SQL IN clauses, there is a maximum number of items that can be parsed from a set.
In any case, hopefully this gives you some ideas to try in order to get to the figures you want.

How to fetch a list from the database based on another list of a different object type?

I have these two models:
public class Product
{
public int Id { get; set; }
public int ProductGroupId { get; set; }
public int ProductGroupSortOrder { get; set; }
// ... some more properties
public ICollection<ProductInCategory> InCategories { get; set; }
}
public class ProductInCategory
{
public int Id { get; set; }
public int ProductId { get; set; }
public int ProductCategoryId { get; set; }
public int SortOrder { get; set; }
// Nav.props.:
public Product Product { get; set; }
public ProductCategory ProductCategory { get; set; }
}
Some of the Products are grouped together via the property ProductGroupId, and I want to be able to remove whole groups of Products from ProductInCategory in a single Db-query.
The controller method receives a product_id and a category_id, not a ProductGroupId.
For a single Product I have been using this query to remove it from the category:
ProductInCategory unCategorize = await _context.ProductsInCategories
.Where(pic => pic.ProductId == product_id && pic.ProductCategoryId == category_id)
.FirstOrDefaultAsync();
and then:
_context.Remove(unCategorize);
await _context.SaveChangesAsync();
Now, if I have a List<Product> that I want to remove from ProductsInCategories, what would the query look like?
I have tried this, but it fails on the .Any()-bit:
Product product = await _context.Products
.Where(p => p.Id == product_id)
.FirstOrDefaultAsync();
List<Product> products = await _context.Products
.Where(g => g.ProductGroupId == product.ProductGroupId)
.ToListAsync();
List<ProductInCategory> unCategorize = await _context.ProductsInCategories
.Where(pic => pic.ProductId == products.Any(p => p.Id)
&& pic.ProductCategoryId == category_id)
.ToListAsync();
The controller method receives a product_id and a category_id, not a ProductGroupId
The first question is why the method receives product_id when it needs to do something with ProductGroupId.
This smells of bad design, but anyway, let's first translate the product_id to the desired ProductGroupId (this will cost us an additional db query):
int? productGroupId = await _context.Products
.Where(p => p.Id == product_id)
.Select(p => (int?)p.ProductGroupId)
.FirstOrDefaultAsync();
if (productGroupId == null)
{
// Handle non existing product_id
}
The rest is simply a matter of accessing the navigation property inside the LINQ to Entities query, which EF Core translates to the appropriate join in the generated SQL query. No intermediate Product list is needed.
List<ProductInCategory> unCategorize = await _context.ProductsInCategories
    .Where(pic => pic.Product.ProductGroupId == productGroupId
        && pic.ProductCategoryId == category_id) // restrict to the category being cleared
    .ToListAsync();
Try changing to the following:
List<ProductInCategory> unCategorize = await _context.ProductsInCategories
    .Where(pic => products.Select(p => p.Id).Contains(pic.ProductId)
        && pic.ProductCategoryId == category_id)
    .ToListAsync();
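The Contains approach can be sketched against in-memory lists as follows. Projecting the products to a plain list of ids first keeps the captured value primitive, which is the shape EF providers typically translate into a SQL IN clause. The sample data and the local variable names are hypothetical, and this version runs with LINQ to Objects rather than against a DbContext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Product
{
    public int Id { get; set; }
    public int ProductGroupId { get; set; }
}

class ProductInCategory
{
    public int Id { get; set; }
    public int ProductId { get; set; }
    public int ProductCategoryId { get; set; }
}

static class Program
{
    static void Main()
    {
        // Products belonging to the group that should be removed from the category.
        var products = new List<Product>
        {
            new Product { Id = 1, ProductGroupId = 7 },
            new Product { Id = 2, ProductGroupId = 7 },
        };

        var productsInCategories = new List<ProductInCategory>
        {
            new ProductInCategory { Id = 10, ProductId = 1, ProductCategoryId = 3 },
            new ProductInCategory { Id = 11, ProductId = 2, ProductCategoryId = 3 },
            new ProductInCategory { Id = 12, ProductId = 2, ProductCategoryId = 4 },
            new ProductInCategory { Id = 13, ProductId = 9, ProductCategoryId = 3 },
        };

        int categoryId = 3;

        // Extract the ids once, then test membership with Contains.
        List<int> productIds = products.Select(p => p.Id).ToList();

        var unCategorize = productsInCategories
            .Where(pic => productIds.Contains(pic.ProductId)
                && pic.ProductCategoryId == categoryId)
            .ToList();

        Console.WriteLine(unCategorize.Count); // prints 2 (rows 10 and 11)
    }
}
```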
I could suggest a code fix for what you want, but there's a better solution: not loading the data to begin with.
In your code example, you are loading the data from the database, before then telling EF to delete the loaded items. That's not efficient. There should be no reason to load the data, you should be able to simply execute the query without needing to load data.
As far as I'm aware, Entity Framework is not capable of "conditional deletion" (for lack of a better name), for example:
DELETE FROM People WHERE Name = 'Bob'
If you want to delete items based on a particular column value (other than the entity's ID), you can't rely on Entity Framework unless you want to load the data (which eats performance).
There are two better options here:
1. Execute the SQL query yourself
context.Database.ExecuteSqlCommand(
"DELETE FROM Products WHERE ProductGroupId = " + product.ProductGroupId
);
This is how I've always done it.
Sidenote: I'm expecting comments about SQL injection. To be clear: there is no danger of SQL injection here, as product.ProductGroupId is not a string, and its value is controlled by the developer, not the end user.
Nonetheless, I do agree that using SQL parameters is good practice. But in this answer, I wanted to provide a simple example to showcase how to execute a string containing SQL.
2. Find a library that enables you to delete without loading.
I only stumbled on this when googling just now. Entity Framework Extensions seems to have implemented the conditional delete feature:
context.Customers.Where(x => x.ID == userId).DeleteFromQuery();
In your case, that would be:
_context.Products.Where(g => g.ProductGroupId == product.ProductGroupId).DeleteFromQuery();
Sidenote:
I've always used Code First, and EF has always generated cascaded deletes for me automatically. Therefore, when you delete the parent, its children get deleted as well. I'm not sure if your database has cascaded deletes, but I am assuming default EF behavior (according to my experience).
The products list is probably null or empty after the previous LINQ query. Add a condition to check for that, and compare the ids inside Any():
.Where(pic => products != null && products.Any(p => p.Id == pic.ProductId)
    && pic.ProductCategoryId == category_id)

Class with multiple List Properties of same type but with different restrictions

Here's my problem: I have a class that has 2 list properties of the same class type (but with different restrictions on how they are filled), let's say:
public class Team
{
    [Key]
    public int IDTeam { get; set; }
    public string TeamName { get; set; }
    public List<Programmer> Members { get; set; }
    public List<Programmer> Leaders { get; set; }
    public void LoadLists(MyProjectDBContext db)
    {
        // Members: programmers on this team with no recorded experience.
        this.Members = db.Programmers.Where(p => p.IDTeam == this.IDTeam
            && (p.Experience == "" || p.Experience == null)).ToList();
        // Leaders: programmers on this team with some experience.
        this.Leaders = db.Programmers.Where(p => p.IDTeam == this.IDTeam
            && (p.Experience != null && p.Experience != "")).ToList();
    }
}
public class Programmer
{
[Key]
public int IDProgrammer { get; set; }
[ForeignKey("Team")]
public int IDTeam { get; set; }
public virtual Team Team { get; set; }
public string Name { get; set; }
public string Experience { get; set; }
}
At some point, I need to take a list of Teams, with its members and leaders, and for this I would assume something like:
return db.Teams
    .Include(m => m.Members.Where(p => p.Experience == "" || p.Experience == null))
    .Include(l => l.Leaders.Where(p => p.Experience != null && p.Experience != ""))
    .OrderBy(t => t.TeamName)
    .ToList();
And, of course, in this case I would be assuming wrong (because it's not working at all).
Any ideas on how to achieve that?
EDIT: To clarify a bit more, the 2 list properties of the team class should be filled according to:
1 - Members attribute - Should include all related programmers with no experience (programmer.Experience == null or "");
2 - Leaders attribute - Should include all related programmers with any experience (programmer.Experience != null and != "");
EDIT 2: Here's the MyProjectDbContext declaration:
public class MyProjectDBContext : DbContext
{
public DbSet<Team> Teams { get; set; }
public DbSet<Programmer> Programmers { get; set; }
}
You are talking about Entity Framework (LINQ to Entities), right? If so, Include() is a LINQ to Entities method that includes a related entity or collection in the result set. I think you should place the Where() outside of the Include().
On this topic you'll find some examples on how to use the Include() method.
So I suggest adding the Include() calls first, to include the relations "Members" and "Leaders", and then applying your Where statement (this can be done with a single Where()).
return db.Teams
    .Include("Members")
    .Include("Leaders")
    .Where(t => t.Members.Any(m => string.IsNullOrWhiteSpace(m.Experience)) ... )
What is unclear to me is your where criteria and your use case overall, as you are talking about getting a list of Teams with Leaders and Members. My example above will return a list of Teams that match the Where() statement. You can loop through it, and within that loop you can list each team's members and leaders, if that is the use case.
An alternative is something like this:
return db.Programmers
    .Where(m => string.IsNullOrWhiteSpace(m.Experience))
    .GroupBy(m => m.Team)
This gets you the programmers with no experience, grouped by Team. You can loop over the groups (Teams) and, within each group, over its members. Because GroupBy produces one group per team, each team appears only once.
Hope this helps. If you need more detailed code samples, it would help to understand your requirements better, so maybe you can say a few more words about what you expect from the query.
Update:
Just read your edits, which sound interesting. I don't think you can do this all in one LINQ to Entities statement. Personally I would do it in the getters of the properties Members and Leaders, which would each run their own query (as read-only properties). To get good performance with large amounts of data, I would even do it with SQL views on the DB itself. But this depends a little on the context in which "Members" and "Leaders" are used (how frequently, etc.).
Update 2:
Using a single query to get a table of teams with sublists for members and leaders, I would query "Programmers" and group them, nested, by Team and Experience. The result is then a list of groups (= Teams) containing groups (experienced/non-experienced) with Programmers in them. The final table can then be built with three nested foreach statements. See here for some grouping examples (see the example "GroupBy - Nested").
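The nested grouping described above might look roughly like this. It is a sketch over an in-memory list with hypothetical sample data, using LINQ to Objects instead of db.Programmers, and it groups by the IDTeam key rather than by the Team entity to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Programmer
{
    public int IDProgrammer { get; set; }
    public int IDTeam { get; set; }
    public string Name { get; set; }
    public string Experience { get; set; }
}

static class Program
{
    static void Main()
    {
        var programmers = new List<Programmer>
        {
            new Programmer { IDProgrammer = 1, IDTeam = 1, Name = "Ann", Experience = "5 years" },
            new Programmer { IDProgrammer = 2, IDTeam = 1, Name = "Bob", Experience = "" },
            new Programmer { IDProgrammer = 3, IDTeam = 1, Name = "Cy", Experience = null },
            new Programmer { IDProgrammer = 4, IDTeam = 2, Name = "Dee", Experience = "2 years" },
        };

        // Outer group: one per team. Inner group: experienced (true) or not (false).
        var teams = programmers
            .GroupBy(p => p.IDTeam)
            .Select(team => new
            {
                TeamId = team.Key,
                ByExperience = team.GroupBy(p => !string.IsNullOrEmpty(p.Experience))
            });

        // Three nested loops: teams, experience groups, programmers.
        foreach (var team in teams)
        {
            Console.WriteLine($"Team {team.TeamId}");
            foreach (var group in team.ByExperience)
            {
                Console.WriteLine(group.Key ? "  Leaders" : "  Members");
                foreach (var programmer in group)
                {
                    Console.WriteLine($"    {programmer.Name}");
                }
            }
        }
    }
}
```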
Whenever you fetch entities, they will be stored in the context -- regardless of the form they are "selected" in. That means you can fetch the teams along with all the necessary related entities into an anonymous type, like this:
var teams =
(from team in db.Teams
select new {
team,
relatedProgrammers = team.Programmers.Where(
[query that gets all leaders OR members])
}).ToList().Select(x => x.team);
It looks like we're throwing away the relatedProgrammers field here, but those Programmer entities are still in memory. So, when you execute this:
foreach (var team in teams) team.LoadLists(db);
...it will populate the lists from the programmers that were already fetched, without querying the database again (assuming db is the same context instance as above).
Note: I haven't tested this myself. It's based on a similar technique shown in this answer.
EDIT - Actually, it looks like your "leaders" and "members" cover all programmers associated with a team, so you should be able to just do Teams.Include(t => t.Programmers) and then LoadLists.

Query that returns Entity and child collection.. child collection sorted?

How can I create a query that will return a single entity along with a collection, and have the collection sorted? I don't care if it's two calls to the DB. I just need the end result to be an Entity type and not an anonymous type:
var category = context.Categories
    .Where(c => c.categoryID == categoryID)
    .Select(c => new { c, products = c.Products.OrderByDescending(p => p.priceDate) })
    .First();
But I'd like the above to either return or cast to a Category with the Products collection, as opposed to an anonymous type. Thanks!
Could you just put that into the category class? Something like this:
public partial class Category
{
    // Read-only view of Products, newest price date first.
    public IList<Product> ProductsByPriceDate
    {
        get
        {
            return this.Products.OrderByDescending(p => p.priceDate).ToList();
        }
    }
}
If you have the correct foreign key relationships in your database, then a simple select for your category should suffice, and you can then order its products like so:
Category category = context.Categories.Where(c => c.categoryID == categoryID).First();
List<Product> sortedProducts = category.Products.OrderBy(...).ToList();
When your FK relationships are set up, getting the top-level object should retrieve its children too (I haven't worked with LINQ to Entities, only with LINQ to SQL, but I can't imagine it being different in this respect).
I think the above should work...
