Linq EF Split Parent into multiple Parents - c#

Using Entity Framework to query a database with a Parent table and Child table with a 1-n relationship:
public class Parent {
public int id { get; set; }
public IList<Child> Children { get; set; }
}
public class Child {
public int id { get; set; }
}
Using EF, here's a quick sample query:
var parents = context.Parents;
Which returns:
parent id = 1, children = { (id = 1), (id = 2), (id = 3) }
What we need is for this to flatten into a 1-1 relationship, but as a list of parents with a single child each:
parent id = 1, children = { (id = 1) }
parent id = 1, children = { (id = 2) }
parent id = 1, children = { (id = 3) }
We're using an OData service layer which hits EF. So performance is an issue -- don't want it to perform a ToList() or iterate the entire result for example.
We've tried several different things, and the closest we can get is creating an anonymous type like such:
var results = from p in context.Parents
              from c in p.Children
              select new { Parent = p, Child = c };
But this isn't really what we're looking for. It creates an anonymous type of parent and child, not parent with child. So we can't return an IEnumerable<Parent> any longer, but rather an IEnumerable<anonymous>. The anonymous type isn't working with our OData service layer.
Also tried with SelectMany and got 3 results, but all of Children which again isn't quite what we need:
context.Parents.SelectMany(p => p.Children)
Is what we're trying to do possible? With the sample data provided, we'd want 3 rows returned -- representing a List each with a single Child. When normally it returns 1 Parent with 3 Children, we want the Parent returned 3 times with a single child each.

Your requirements are unusual: EF and LINQ are designed to return object graphs, not the repeated parent rows that a SQL join produces. But you know your requirements better than we do and we don't have the whole picture, so I will try to answer your question, hoping I understood it correctly.
If like you said, your problem is that IEnumerable<anonymous> doesn't work with your OData service layer, then create a class for the relationship:
public class ParentChild {
public Parent Parent { get; set; }
public Child Child { get; set; }
}
And then you can use it in your LINQ query:
var results = from p in context.Parents
              from c in p.Children
              select new ParentChild { Parent = p, Child = c };
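To see the shape this produces, here is a minimal in-memory sketch using LINQ to Objects rather than EF. The FlattenDemo class and its Flatten method are hypothetical names introduced for illustration; Parent, Child and ParentChild mirror the classes above. The SelectMany overload with a result selector is the method-syntax equivalent of the two from clauses, and it stays deferred, so nothing is enumerated until the caller iterates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Parent
{
    public int id { get; set; }
    public IList<Child> Children { get; set; } = new List<Child>();
}

public class Child
{
    public int id { get; set; }
}

public class ParentChild
{
    public Parent Parent { get; set; }
    public Child Child { get; set; }
}

// Hypothetical helper, not part of EF or OData.
public static class FlattenDemo
{
    // Produces one ParentChild row per (parent, child) pair.
    // Deferred execution: no ToList() and no eager iteration happens here.
    public static IEnumerable<ParentChild> Flatten(IEnumerable<Parent> parents) =>
        parents.SelectMany(
            p => p.Children,
            (p, c) => new ParentChild { Parent = p, Child = c });

    public static void Main()
    {
        // Sample data from the question: one parent with three children.
        var parents = new List<Parent>
        {
            new Parent
            {
                id = 1,
                Children = { new Child { id = 1 }, new Child { id = 2 }, new Child { id = 3 } }
            }
        };

        foreach (var row in Flatten(parents))
            Console.WriteLine($"parent id = {row.Parent.id}, child id = {row.Child.id}");
    }
}
```

With the sample data this yields three rows, each pairing parent 1 with one of its children. Whether an OData layer can expose ParentChild as an entity type depends on its configuration, which this sketch does not cover.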

Related

Entity Framework Core: load full hierarchy

Having a self referencing table, with a ParentId attribute which holds the id of the parent record, what can I do so that using ef I will load into each parent its children.
What I want is to transform this cte which will return the full hierarchy as a collection.
var queryString = @"
;WITH cte AS (
SELECT * FROM [dbo].[Folders] _f WHERE _f.[Id] = @id
UNION ALL
SELECT _c.* FROM [dbo].[Folders] _c
INNER JOIN cte _cte
ON _cte.[Id] = _c.[ParentFolderId]
)
SELECT * FROM cte";
return await this.Entities.FromSql(new RawSqlString(queryString), new SqlParameter("id", id)).ToListAsync();
into something that will somehow load the hierarchy of children into their parents, keeping at the same time the performance of one trip to db.
class Folder
{
public int Id { get; set; }
public int? FolderId { get; set; }
public Folder Folder { get; set; }
public IEnumerable<Folder> Children { get; set; }
}
Hierarchy example
- Main (Id: 1 / ParentId: null)
- C1 (2/1)
- C11 (4/2)
- C111 (7/4)
- C12 (5/2)
- C2 (3/1)
- C21 (6/3)
- C211 (8/6)
Configured relation
builder.Ignore(prop => prop.Folder);
builder.HasOne(prop => prop.Folder).WithMany(prop => prop.Children).HasForeignKey(fk => fk.FolderId);
If you want the entire hierarchy in one query, that's easy. Just retrieve all the Folders, and if change tracking is enabled EF will fix up all the relationships. I.e., if you just run
var folders = db.Set<Folder>().ToList();
you'll have the whole hierarchy with all the navigation properties populated.
You can get the whole hierarchy with this query:
var hierarchy = db.Set<Folder>().Include(f => f.Children).ToList();
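To illustrate why loading the flat rows is enough, here is a plain LINQ to Objects sketch of the kind of stitching that relationship fix-up performs. It is an illustration only, not EF's actual implementation. FolderNode, HierarchyDemo and BuildTree are hypothetical names, and the ParentId/Parent properties are assumed stand-ins for the foreign key and navigation property in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Folder entity.
public class FolderNode
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public FolderNode Parent { get; set; }
    public List<FolderNode> Children { get; } = new List<FolderNode>();
}

public static class HierarchyDemo
{
    // Links every row to its parent using a dictionary keyed by Id and
    // returns the rows that have no parent in the set (the roots).
    public static List<FolderNode> BuildTree(IEnumerable<FolderNode> rows)
    {
        var all = rows.ToList();
        var byId = all.ToDictionary(f => f.Id);
        var roots = new List<FolderNode>();

        foreach (var folder in all)
        {
            if (folder.ParentId is int parentId && byId.TryGetValue(parentId, out var parent))
            {
                folder.Parent = parent;
                parent.Children.Add(folder);
            }
            else
            {
                roots.Add(folder);
            }
        }

        return roots;
    }
}
```

Calling HierarchyDemo.BuildTree on the rows from the hierarchy example would return a single root (Main) whose Children collections are populated at every level, using one pass over the flat list.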

EF Core - Enforce priority in executing commands in a transaction

I want to delete two sets of data in a database, using EF Core.
All code is hypothetical.
Data models:
class Parent
{
public int Id { get; set; }
}
class Child
{
public int Id { get; set; }
public int ParentId { get; set; }
public virtual Parent Parent { get; set; }
public bool Flag { get; set; }
}
Let's assume I want to delete all [Child] records with (ParentId=100) and (flag=false), after that if (child.ParentId=100).length=0 then delete the parent itself too.
So, here is the service class:
class Service
{
public void Command(int parentId)
{
Parent parent = GetParent(parentId);
List<Child> children = GetChildren(parent);
List<Child> toDelete = children.Where(x => !x.Flag).ToList();
foreach(var child in toDelete)
{
var entry = DbContext.Entry(child);
entry.State = EntityState.Deleted;
}
List<Child> remainChildren = children.Where(x => x.Flag).ToList();
if (!remainChildren.Any())
{
var entry = DbContext.Entry(parent );
entry.State = EntityState.Deleted;
}
SaveChanges();
}
}
I have multiple scenarios that call the Service.Command method.
Because I call SaveChanges() only once, I assume that all delete operations will be executed in a single transaction, and of course they would be in this order:
Delete child records
Delete parent
but EF sends the queries to the database like this:
Delete parent
Delete child records
Obviously this throws a foreign key exception.
Is there any way to force EF Core to execute the queries in the order in which I wrote the code?
Set the parent child relationship to cascade delete at the DB level.
Query the needed data in one hit...
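// Assumes Parent has a Children navigation property (not shown in the question's model).
var data = context.Parents.Where(p => p.Id == parentId)
    .Select(p => new
    {
        Parent = p,
        // Per the question, the children to delete are those with Flag == false.
        ChildrenToRemove = p.Children.Where(c => !c.Flag).ToList(),
        HasRemainingChildren = p.Children.Any(c => c.Flag)
    }).Single();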
Then it's just a matter of inspecting the data and acting accordingly. If there are no remaining children, delete the parent and let cascade take care of it. Otherwise, just delete the children from the context.
if(!data.HasRemainingChildren)
context.Parents.Remove(data.Parent);
else
context.Children.RemoveRange(data.ChildrenToRemove);
For big entities you can further optimize this by selecting just the IDs then associating them to new Entity instances, attach them to a fresh DbContext, and then issue the Remove/RemoveRange calls. This option is an optimization for dealing with large numbers of items, or "big" entities that would otherwise result in a lot of data across the wire.

GetAllWithChildren() performance issue

I used SQLite-Net Extensions in the following code to retrieve 1000 rows with their child relationships from an SQLite database:
var list =
SQLiteNetExtensions.Extensions.ReadOperations.GetAllWithChildren<DataModel>(connection);
The problem is that performance is poor, because GetAllWithChildren() returns a List rather than an Enumerable. Is there any way to load the records into an Enumerable using SQLite-Net Extensions?
For now I use the Table() method from SQLite-Net, which loads the fetched rows into an Enumerable, but I don't want to use it because it does not understand the relationships and does not load the child entities at all.
GetAllWithChildren suffers from the N+1 problem, and in your specific scenario it performs especially badly. It's not clear from your question what you're trying to do, but you could try these solutions:
Use the filter parameter in GetAllWithChildren:
Instead of loading all the objects into memory and then filtering, you can use the filter parameter, which internally performs a Table<T>().Where(filter) query that SQLite-Net converts to a SELECT ... WHERE clause, so it's very efficient:
var list = connection.GetAllWithChildren<DataModel>(d => d.Name == "Jason");
Perform the query and then load the relationships
If you look at the GetAllWithChildren code you'll realize that it just performs the query and then loads the existing relationships. You can do that by yourself to avoid automatically loading unwanted relationships:
// Load elements from database
var list = connection.Table<DataModel>().Where(d => d.Name == "Jason").ToList();
// Iterate elements and load relationships
foreach (DataModel element in list) {
    connection.GetChildren(element, recursive: false);
}
Load relationships manually
To completely work around the N+1 problem you can manually fetch relationships using a Contains filter with the foreign keys. This depends heavily on your entity model, but would look like this:
// Load elements from database
var list = connection.Table<DataModel>().Where(d => d.Name == "Jason").ToList();
// Get list of dependency IDs
var dependencyIds = list.Select(d => d.DependencyId).ToList();
// Load all dependencies from database in a single query
var dependencies = connection.Table<Dependency>().Where(d => dependencyIds.Contains(d.Id)).ToList();
// Assign relationships back to the elements
foreach (DataModel element in list) {
    element.Dependency = dependencies.FirstOrDefault(d => d.Id == element.DependencyId);
}
This solution solves the N+1 problem, because it performs only two database queries.
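The assignment loop above scans the dependency list once per element. For larger result sets, a dictionary keyed by Id makes each lookup constant time. The sketch below works purely in memory and makes no SQLite-Net calls; DataModel and Dependency are minimal assumed shapes, and RelationshipLoader.Assign is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal assumed shapes of the entities used in the answer.
public class Dependency
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DataModel
{
    public int Id { get; set; }
    public int DependencyId { get; set; }
    public Dependency Dependency { get; set; }
}

// Hypothetical helper, not part of SQLite-Net Extensions.
public static class RelationshipLoader
{
    // Assigns each element its dependency using a dictionary lookup
    // instead of a FirstOrDefault scan per element.
    public static void Assign(IEnumerable<DataModel> elements, IEnumerable<Dependency> dependencies)
    {
        var byId = dependencies.ToDictionary(d => d.Id);
        foreach (var element in elements)
        {
            // Leaves Dependency null when no matching row was loaded.
            byId.TryGetValue(element.DependencyId, out var dependency);
            element.Dependency = dependency;
        }
    }
}
```

The two lists would come from the two queries shown above; only the in-memory matching step changes.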
Another method to load relationships manually
Imagine we have these classes:
public class Parent
{
[PrimaryKey, AutoIncrement] public int Id { get; set; }
public string Name { get; set; }
public List<Child> children { get; set; }
public override bool Equals(object obj)
{
    // Compare by Id; the pattern match also handles null and foreign types.
    return obj is Parent other && Id.Equals(other.Id);
}
public override int GetHashCode()
{
return Id.GetHashCode();
}
}
and
public class Child
{
[PrimaryKey, AutoIncrement] public int Id { get; set; }
public string Name { get; set; }
public int ParentId { get; set; }
}
Note that these classes have a one-to-many relation. An inner join between them, done in memory, would then be:
var parents = databaseSync.Table<Parent>().ToList();
var children = databaseSync.Table<Child>().ToList();
List<Parent> parentsWithChildren = parents.GroupJoin(children, parent => parent.Id, child => child.ParentId,
(parent, children1) =>
{
parent.children = children1.ToList();
return parent;
}).Where(parent => parent.children.Any()).ToList();

Linq query which retrieved parent/child of children with a collection

I have a self referencing Category class from which I would like to retrieve parent categories and all corresponding children if it has at least one child category and has at least 1 or more activities (ICollection<Activity>) in the collection.
This would also go for children of children as these should only be returned if there are children categories with at least 1 or more activities.
If there are no child categories with at least 1 or more activities the parent or child Category should not be returned.
The query should return the parent Category as an actual Category object and not just the CategoryId. Is this possible?
public class Category
{
public int CategoryId { get; set; }
public string Name { get; set; }
public int? ParentId { get; set; }
public virtual Category Parent { get; set; }
public virtual ICollection<Category> Children { get; set; }
public virtual ICollection<Activity> Activities { get; set; }
}
UPDATE 1
The query which partially works:
var categories = _db.Categories
.Where(x => x.Parent != null && x.Activities.Count > 0)
.GroupBy(x => x.ParentId)
.Select(g => new { Parent = g.Key, Children = g.ToList() }).ToList();
Let's start a bit smaller, since the query you are looking to create is somewhat complex, and build it from the bottom up. First, you want to eliminate categories that do not have any child categories with at least one activity. Let's make a predicate that, at a single level, returns true for categories that should be included and false for those that should be excluded. We will do this in two stages, typing the predicates as Func<Category, bool> because LINQ's Any and Where accept a Func rather than a Predicate<T>. First, a predicate that returns true for categories that have activities:
Func<Category, bool> hasActivities = cat => cat.Activities.Any();
Second, a predicate that returns true for categories with child categories that have activities:
Func<Category, bool> hasChildWithActivities =
    parentCat => parentCat.Children.Any(hasActivities);
Now let's create the filter that will prune a given Category's descendants. To do this, we will create a Func that takes a parent Category, applies the logic recursively and returns the updated Category. The delegate has to be declared before the lambda is assigned so that the lambda can refer to itself, and the result needs ToList() because Children is an ICollection<Category>:
Func<Category, Category> getFilteredCategory = null;
getFilteredCategory = parentCat =>
{
    parentCat.Children = parentCat.Children
        .Where(hasChildWithActivities)
        .Select(getFilteredCategory)
        .ToList();
    return parentCat;
};
Note that this is equivalent to:
Func<Category, Category> getFilteredCategory = null;
getFilteredCategory = delegate(Category parentCat)
{
    parentCat.Children = parentCat.Children
        .Where(hasChildWithActivities)
        .Select(getFilteredCategory)
        .ToList();
    return parentCat;
};
In your OP, you mentioned that you wanted to filter parents as well. You can use this same logic on the parents by traversing up to the top level and running this query, or by creating a separate query with "joins" or more complex "select" statements. IMHO, the latter would likely be messy and I would advise against it. If you need to apply the logic to parents as well, then first traverse up the tree. Either way, this should give you a good start.
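Traversing up to the top level, as suggested above, could be sketched like this. The Category class here is a trimmed-down assumption that keeps only the members the walk needs, and CategoryTree.GetRoot is a hypothetical helper name. It assumes the Parent references are already loaded and contain no cycles.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down stand-in for the Category entity in the question.
public class Category
{
    public int CategoryId { get; set; }
    public string Name { get; set; }
    public Category Parent { get; set; }
    public ICollection<Category> Children { get; set; } = new List<Category>();
}

// Hypothetical helper for illustration.
public static class CategoryTree
{
    // Follows Parent references until a category without a parent is reached.
    public static Category GetRoot(Category category)
    {
        var current = category;
        while (current.Parent != null)
        {
            current = current.Parent;
        }
        return current;
    }
}
```

The root returned by GetRoot is the category you would pass to the filtering Func so that the whole tree is pruned from the top down.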
Let me know if you have any questions. Good luck and happy coding! :)

Avoiding duplicates in hierarchical parent-child relational collection

I am looking to write linq statement for a simple scenario of collections. I am trying to avoid duplicate items in collection based on parent child relationship. The data structure and sample code is below
public class Catalog
{
public int CatalogId { get; set; }
public int ParentCatalogId { get; set; }
public string CatalogName { get; set; }
}
public class Model
{
public int CatalogId { get; set; }
public string ItemName { get; set; }
...
}
List<Catalog> Catalogs : Contains the complete list of parent child relations to any level of all the catalogs and the root one with ParentCatalogid=null
List<Model> CollectionA : Contains all the items of child as well as parent catalog for a specific catalogId (till its root).
I need to create a CollectionB from CollectionA that contains the items of the provided catalogId plus the items of all its parents, such that if an item is present in a child catalog, the same item is ignored in the parent catalog. That way there won't be any duplicate items when the same item is available in both a child and a parent.
In terms of code I am trying to achieve something like this
while (catalogId!= null)
{
CollectionB.AddRange(
CollectionA.Where(x => x.CatalogId == catalogId &&
!CollectionB.Select(y => y.ItemName).Contains(x.ItemName)));
// Starting from child to parent and ignoring items that are already in CollectionB
catalogId = Catalogs.
Where(x => x.CatalogId == catalogId).
Select(x => x.ParentCatalogId).
FirstOrDefault();
}
I know that the Contains clause in the statement above will not work; I only included it to explain what I am trying to do. I could do this with a foreach loop, but I would like to use LINQ, and I am looking for the correct LINQ statement. The sample data is given below, and I would really appreciate some help.
Catalog
ID  ParentId  CatalogName
1   null      CatalogA
2   1         Catalogb
3   1         CatalogC
4   2         CatalogD
5   4         CatalogE
CollectionA
CatalogId  ItemName
5          ItemA
5          ItemB
4          ItemA
4          ItemC
2          ItemA
2          ItemC
1          ItemD
Expected output
CollectionB
CatalogId  ItemName
5          ItemA
5          ItemB
4          ItemC
1          ItemD
LINQ is not designed to traverse hierarchical data structures, as has already been discussed in:
Walking a hierarchy table with Linq
Recursive Hierarchy - Recursive Query using Linq
But if you can get the hierarchy of catalogs from child to root, then the problem can be solved with a join followed by a distinct on a particular property (see LINQ's Distinct() on a particular property):
var modelsForE = (from catalog in flattenedHierarchyOfCatalogE
join model in models
on catalog.CatalogId equals model.CatalogId
select model).
GroupBy(model => model.ItemName).
Select(modelGroup => modelGroup.First()).
Distinct();
Or even better - adapt Jon Skeet's answer for distinct.
It solves the duplicates problem but leaves us with another question : How to get flattenedHierarchyOfCatalogE?
PURE LINQ SOLUTION:
It is not an easy task, but not exactly impossible with pure LINQ. Adapting How to search Hierarchical Data with Linq we get:
public static class LinqExtensions
{
public static IEnumerable<T> Flatten<T>(this T source, Func<T, IEnumerable<T>> selector)
{
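// Yield the source first so the result runs from child to root;
// the GroupBy(...).First() de-duplication relies on that order.
return new[] { source }
    .Concat(selector(source).SelectMany(c => Flatten(c, selector)));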
}
}
//...
var catalogs = new Catalog[]
{
new Catalog(1, 0, "CatalogA"),
new Catalog(2, 1, "Catalogb"),
new Catalog(3, 1, "CatalogC"),
new Catalog(4, 2, "CatalogD"),
new Catalog(5, 4, "CatalogE")
};
var models = new Model[]
{
new Model(5, "ItemA"),
new Model(5, "ItemB"),
new Model(4, "ItemA"),
new Model(4, "ItemC"),
new Model(2, "ItemA"),
new Model(2, "ItemC"),
new Model(1, "ItemD")
};
var catalogE = catalogs.SingleOrDefault(catalog => catalog.CatalogName == "CatalogE");
var flattenedHierarchyOfCatalogE = catalogE.Flatten((source) =>
catalogs.Where(catalog =>
catalog.CatalogId == source.ParentCatalogId));
And then feed flattenedHierarchyOfCatalogE into the query from the beginning of this answer.
WARNING: I have added constructors to your classes, so the previous snippet may fail to compile in your project:
public Catalog(Int32 catalogId, Int32 parentCatalogId, String catalogName)
{
this.CatalogId = catalogId;
this.ParentCatalogId = parentCatalogId;
this.CatalogName = catalogName;
} //...
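Putting the pieces together, here is a self-contained sketch of the same child-to-root idea that swaps the recursive Flatten for an iterative walk over a dictionary. CatalogItems, SelfAndAncestorIds and Resolve are hypothetical names, the nullable ParentCatalogId (null marks the root) is an assumption, and the walk assumes the catalog data contains no cycles. Because the walk starts at the child, GroupBy followed by First keeps the item from the catalog closest to the child.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Catalog
{
    public int CatalogId { get; set; }
    public int? ParentCatalogId { get; set; }   // assumption: null marks the root
    public string CatalogName { get; set; }
}

public class Model
{
    public int CatalogId { get; set; }
    public string ItemName { get; set; }
}

// Hypothetical helper for illustration.
public static class CatalogItems
{
    // Yields the ids from the given catalog up to the root, child first.
    public static IEnumerable<int> SelfAndAncestorIds(
        IReadOnlyDictionary<int, Catalog> byId, int catalogId)
    {
        int? current = catalogId;
        while (current is int id && byId.TryGetValue(id, out var catalog))
        {
            yield return id;
            current = catalog.ParentCatalogId;
        }
    }

    // Collects items from the catalog and its ancestors, keeping the first
    // occurrence of each ItemName (the one nearest to the starting catalog).
    public static List<Model> Resolve(
        IEnumerable<Catalog> catalogs, IEnumerable<Model> models, int catalogId)
    {
        var byId = catalogs.ToDictionary(c => c.CatalogId);
        var itemsByCatalog = models.ToLookup(m => m.CatalogId);

        return SelfAndAncestorIds(byId, catalogId)
            .SelectMany(id => itemsByCatalog[id])
            .GroupBy(m => m.ItemName)
            .Select(g => g.First())
            .ToList();
    }
}
```

With the sample data from the question and catalogId 5, Resolve returns (5, ItemA), (5, ItemB), (4, ItemC), (1, ItemD), which matches the expected CollectionB.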
SOMETHING TO CONSIDER
There is nothing wrong with the previous solution (well, personally I might have used something with less extensive use of LINQ, like Recursive Hierarchy - Recursive Query using Linq), but whichever solution you pick, you may run into one problem: it works, but it doesn't use any optimized data structures; it is just a direct search and selection. If your catalogs grow and queries execute more often, performance may become a problem.
But even if performance is not a problem, the ease of use of your classes is. Ids and foreign keys are good for relational databases but very unwieldy in OO systems. You may want to consider an object-relational mapping for your classes (or creating wrappers, or mirrors, for them) that would look something like:
public class Catalog
{
public Catalog Parent { get; set; }
public IEnumerable<Catalog> Children { get; set; }
public string CatalogName { get; set; }
}
public class Model
{
public Catalog Catalog { get; set; }
public string ItemName { get; set; }
}
Such classes are far more self-contained and much easier to use, and their hierarchies are easier to traverse. I don't know whether your system is database-driven or not, but you can nonetheless take a look at some object-relational mapping examples and technologies.
P.S.: LINQ is not a universal tool in the .NET arsenal. No doubt it is a very useful tool, applicable in a multitude of situations, but not in every one. And if a tool cannot help you solve a problem, then it should be either modified or put aside for a moment.
You are most likely looking for SelectMany() extension. A short example of how it can be used to select all the children for comparison (to avoid duplicates) is below:
var col = new[] {
new { name = "joe", children = new [] {
new { name = "billy", age=1 },
new { name = "sally", age=4 }
}},
new { name = "bob", children = new [] {
new { name = "megan", age=10 },
new { name = "molly", age=7 }
}}
};
// Dump() is a LINQPad extension method that prints the sequence.
col.SelectMany(c => c.children).Dump("kids");
For more information there are a few questions on stack overflow about this extension and of course you can read the actual msdn documentation
