I am projecting LINQ to SQL results to strongly typed classes: Parent and Child. The performance difference between these two queries is large:
Slow Query - logging from the DataContext shows that a separate call to the db is being made for each parent
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = (from c in childtable
where c.parentid = p.id
select c).ToList()
}
return q.ToList() //SLOW
Fast Query - logging from the DataContext shows a single db hit query that returns all required data
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = from c in childtable
where c.parentid = p.id
select c
}
return q.ToList() //FAST
I want to force LINQ to use the single-query style of the second example, but populate the Parent classes with their Children objects directly. otherwise, the Children property is an IQuerierable<Child> that has to be queried to expose the Child object.
The referenced questions do not appear to address my situation. using db.LoadOptions does not work. perhaps it requires the type to be a TEntity registered with the DataContext.
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Parent>(p => p.Children);
db.LoadOptions = options;
Please Note: Parent and Child are simple types, not Table<TEntity> types. and there is no contextual relationship between Parent and Child. the subqueries are ad-hoc.
The Crux of the Issue: in the 2nd LINQ example I implement IQueriable statements and do not call ToList() function and for some reason LINQ knows how to generate one single query that can retrieve all the required data. How do i populate my ad-hoc projection with the actual data as is accomplished in the first query? Also, if anyone could help me better-phrase my question, I would appreciate it.
It's important to remember that LINQ queries rely in deferred execution. In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't actually executed them to get the results of those queries. If you were to iterate the list, and then iterate the Children collection of each item you'd see it taking as much time as the first query.
Your query is also inherently very inefficient. You're using a nested query in order to represent a Join relationship. If you use a Join instead the query will be able to be optimized appropriately by both the query provider as well as the database to execute much more quickly. You may also need to adjust the indexes on your database to improve performance. Here is how the join might look:
var q = from p in parenttable
join child in childtable
on p.id equals child.parentid into children
select new Parent()
{
id = p.id,
Children = children.ToList(),
}
return q.ToList() //SLOW
The fastest way I found to accomplish this is to do a query that returns all the results then group all the results. Make sure you do a .ToList() on the first query, so that the second query doesn't do many calls.
Here r should have what you want to accomplish with only a single db query.
var q = from p in parenttable
join c in childtable on p.id equals c.parentid
select c).ToList();
var r = q.GroupBy(x => x.parentid).Select(x => new { id = x.Key, Children=x });
You must set correct options for your data load.
options.LoadWith<Document>(d => d.Metadata);
Look at this
P.S. Include for the LINQToEntity only.
The second query is fast precisely because Children is not being populated.
And the first one is slow just because Children is being populated.
Choose the one that fits your needs best, you simply can't have their features together!
EDIT:
As #Servy says:
In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't actually executed them to get the results of those queries. If you were to iterate the list, and then iterate the Children collection of each item you'd see it taking as much time as the first query.
Related
Need to select data in entity framework but need to filter on the childrent and grandchildren
i have 4 tables. Parent -> Child -> GrandChild -> GreatGrandChild I want to return all the parents but filter on child and greatgrandchildren.
in other words (for example)
SELECT Parent.*
FROM Parent
INNER JOIN Child
INNER JOIN Grandchild
INNER JOIN GreatGrandChild
WHERE child.Column5 = 600 AND
GreatGrandChild.Column3 = 1000
it cant be anomymous type because i need to update the data and saveChanges to the db.
using vs 2010 and EF 4.0
Using linq you should need something like this.
var q = from q1 in dbContext.Parent
join q2 in dbContext.Children
on q1.key equals q2.fkey
join q3 in ........
where q4.col1 == 3000
select q1;
This query should do what you want. Yes, it is a bit of a mess because it is so deeply nested.
var result = context.Parent
.Where(parent => parent.Child
.Any(child => (child.Column5 == 600) &&
child.GrandChild
.Any(grandchild => grandchild.GreatGrandChild
.Any(greatgrandchild => greatgrandchild.Column3 == 1000))));
Your table structure - if your example is not just for illustration, leads me to think you may want to think more about your model here (i.e. are children separate entity types or should they be a defined relationship?)
What you describe is a simple joins and where clause though, and is written essentially the same way: assuming you are returning DBSet from your DBContext:
_context.Parents.Join(context.Child, p=>p.Parent.ID, c=>c.ParentID)
.Join(...Grandchild...).Where(o=>o.Column5=600)
.Join(...GreatGrandChild...).Where(o=>o.Column3=1000)
EDIT to get back the strongly typed entities you might need to do something like:
var greatgrandchildren = context.GreatGrandchildren.Where(o=>o.Column3=1000).ToList();
var grandchildren = context.Grandchildren.Where(o=>o.Column3=600 and greatgrandchildren.contains(o)).ToList();
var children = context.Children.Where(o=>grandchildren.Contains(o)).ToList();
var parents = context.Parent(o=>children.Contains(o).ToList();
My syntax might be off, and someone can add, can I avoid .ToList() to prevent roundtrips until the last call?
EDIT: forgot to say I'm using Fluent NHibernate, even though the tag could hint about it anyway.
I have these entity classes:
class OuterLevel
{
ICollection<MidLevel> mid_items;
... other properties
}
class MidLevel
{
OuterLevel parent;
Inner1 inner1;
Inner2 inner2;
... other properties
}
class Inner1
{
int id;
string description;
}
class Inner2
{
int id;
string description;
}
I need to build a Linq query that returns a list of OuterLevel objects with all children populated properly.
Supposing all mappings are correct and working, the hard part I'm finding here is that the resulting query should be something like
SELECT * FROM OuterLevelTable OLT INNER JOIN MidLevelTable MLT ON (MLT.parentID = OLT.ID) INNER JOIN
Inner1Table ON (MLT.Inner1ID = Inner1Table.ID) INNER JOIN
Inner2Table ON (MLT.Inner2ID = Inner2Table.ID)
WHERE (Inner1Table.someproperty1 = somevalue1) AND (Inner2Table.someproperty2 = somevalue2)
The main problem is that two joins start from MidLevel object downward the hierarchy, so I cannot figure out which Fetch and FetchMany combination can be used without having the resulting query join two times the MidLevelTable, such as the following does:
return All().FetchMany(x => x.mid_items).ThenFetch(x => x.inner1).FetchMany(x => x.mid_items).ThenFetch(x => x.inner2);
I would like to return a IQueryable that can be further filtered, so I would prefer avoiding Query and QueryOver.
Thanks in advance,
Mario
what you want is not possible because when you filter on the joined tables the resulting records are not enough to populate the collections anyway. You better construct the query in one place to further tune it or set the collection batch size to get down SELECT N+1.
I have a fairly complicated join query that I use with my database. Upon running it I end up with results that contain an baseID and a bunch of other fields. I then want to take this baseID and determine how many times it occurs in a table like this:
TableToBeCounted (One to Many)
{
baseID,
childID
}
How do I perform a linq query that still uses the query I already have and then JOINs the count() with the baseID?
Something like this in untested linq code:
from k in db.Kingdom
join p in db.Phylum on k.KingdomID equals p.KingdomID
where p.PhylumID == "Something"
join c in db.Class on p.PhylumID equals c.PhylumID
select new {c.ClassID, c.Name};
I then want to take that code and count how many orders are nested within each class. I then want to append a column using linq so that my final select looks like this:
select new {c.ClassID, c.Name, o.Count()}//Or something like that.
The entire example is based upon the Biological Classification system.
Assume for the example that I have multiple tables:
Kingdom
|--Phylum
|--Class
|--Order
Each Phylum has a Phylum ID and a Kingdom ID. Meaning that all phylum are a subset of a kingdom. All Orders are subsets of a Class ID. I want to count how many Orders below to each class.
select new {c.ClassID, c.Name, (from o in orders where o.classId == c.ClassId select o).Count()}
Is this possible for you? Best I can do without knowing more of the arch.
If the relationships are as you describe:
var foo = db.Class.Where(c=>c.Phylum.PhylumID == "something")
.Select(x=> new { ClassID = x.ClassID,
ClassName = x.Name,
NumOrders= x.Order.Count})
.ToList();
Side question: why are you joining those entities? Shouldn't they naturally be FK'd, thereby not requiring an explicit join?
I'm trying to filter down the results returned by EF into only those relevant - in the example below to those in a year (formattedYear) and an ordertype (filtOrder)
I have a simple set of objects
PEOPLE 1-M ORDERS 1-M ORDERLINES
with these relationships already defined in the Model.edmx
in SQL I would do something like...
select * from PEOPLE inner join ORDERS on ORDERS.PEOPLE_RECNO=PEOPLE.RECORD_NUMBER
inner join ORDERLINE on ORDERLINE.ORDER_RECNO=ORDERS.RECORD_NUMBER
where ORDERLINE.SERVICE_YEAR=#formattedYear
and ORDERS.ORDER_KEY=#filtOrder
I've tried a couple of approaches...
var y = _entities.PEOPLE.Include("ORDERS").Where("it.ORDERS.ORDER_KEY=" + filtOrder.ToString()).Include("ORDERLINEs").Where("it.ORDERS.ORDERLINEs.SERVICE_YEAR='" + formattedYear + "'");
var x = (from hp in _entities.PEOPLE
join ho in _entities.ORDERS on hp.RECORD_NUMBER equals ho.PEOPLE_RECNO
join ol in _entities.ORDERLINEs on ho.RECORD_NUMBER equals ol.ORDERS_RECNO
where (formattedYear == ol.SERVICE_YEAR) && (ho.ORDER_KEY==filtOrder)
select hp
);
y fails with ORDER_KEY is not a member of transient.collection...
and x returns the right PEOPLE but they have all of their orders attached - not just those I am after.
I guess I'm missing something simple ?
Imagine you have a person with 100 orders. Now you filter those orders down to 10. Finally you select the person who has those orders. Guess what? The person still has 100 orders!
What you're asking for is not the entity, because you don't want the whole entity. What you seem to want is a subset of the data from the entity. So project that:
var x = from hp in _entities.PEOPLE
let ho = hp.ORDERS.Where(o => o.ORDER_KEY == filtOrder
&& o.ORDERLINES.Any(ol => ol.SERVICE_YEAR == formattedYear))
where ho.Any()
select new
{
Id = hp.ID,
Name = hp.Name, // etc.
Orders = from o in ho
select new { // whatever
};
I am not exactly sure what your question is but the following might be helpful.
In entity framework if you want to load an object graph and filter the children then you might first do a query for the child objects and enumerate it (i.e. call ToList()) so the childern will be fetched in memory.
And then when you fetch the parent objects (and do not use .include) enitity framework will able to construct the graph on its own (but note that you might have to disable lazy loading first or it will take long to load).
here is an example (assuming your context is "db"):
db.ContextOptions.LazyLoadingEnabled = false;
var childQuery = (from o in db.orders.Take(10) select o).ToList();
var q = (from p in db.people select p).ToList();
Now you will find that every people object has ten order objects
EDIT: I was in a hurry when I wrote the sample code, and as such I have not tested it yet, and I probably went wrong by claiming that .Take(10) will bring back ten orders for every people object, instead I believe that .Take(10) will bring back only ten overall orders when lazy loading is disabled, (and for the case when lazy loading is enabled I have to actually test what the result will be) and in order to bring back ten orders for every people object you might have to do more extensive filtering.
But the idea is simple, you first fetch all children objects and entity framework constructs the graph on its own.
I have 5 tables in a L2S Classes dbml : Global >> Categories >> ItemType >> Item >> ItemData. For the below example I have only gone as far as itemtype.
//cdc is my datacontext
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Global>(p => p.Category);
options.AssociateWith<Global>(p => p.Category.OrderBy(o => o.SortOrder));
options.LoadWith<Category>(p => p.ItemTypes);
options.AssociateWith<Category>(p => p.ItemTypes.OrderBy(o => o.SortOrder));
cdc.LoadOptions = options;
TraceTextWriter traceWriter = new TraceTextWriter();
cdc.Log = traceWriter;
var query =
from g in cdc.Global
where g.active == true && g.globalid == 41
select g;
var globalList = query.ToList();
// In this case I have hardcoded an id while I figure this out
// but intend on trying to figure out a way to include something like globalid in (#,#,#)
foreach (var g in globalList)
{
// I only have one result set, but if I had multiple globals this would run however many times and execute multiple queries like it does farther down in the hierarchy
List<Category> categoryList = g.category.ToList<Category>();
// Doing some processing that sticks parent record into a hierarchical collection
var categories = (from comp in categoryList
where comp.Type == i
select comp).ToList<Category>();
foreach (var c in categories)
{
// Doing some processing that stick child records into a hierarchical collection
// Here is where multiple queries are run for each type collection in the category
// I want to somehow run this above the loop once where I can get all the Items for the categories
// And just do a filter
List<ItemType> typeList = c.ItemTypes.ToList<ItemType>();
var itemTypes = (from cat in TypeList
where cat.itemLevel == 2
select cat).ToList<ItemType>();
foreach (var t in itemTypes)
{
// Doing some processing that stick child records into a hierarchical collection
}
}
}
"List typeList = c.ItemTypes.ToList();"
This line gets executed numerous times in the foreach, and a query is executed to fetch the results, and I understand why to an extent, but I thought it would eager load on Loadwith as an option, as in fetch everything with one query.
So basically I would have expected L2S behind the scenes to fetch the "global" records in one query, take any primary key values, get the "category" children using one one query. Take those results and stick them into collections linked to the global. Then take all the category keys and excute one query to fetch the itemtype children and link those into their associated collections. Something on the order of (Select * from ItemTypes Where CategoryID in ( select categoryID from Categories where GlobalID in ( #,#,# ))
I would like to know how to properly eager load associated children with minimal queries and possibly how to accomplish my routine generically not knowing how far down I need to build the hierarchy, but given a parent entity, grab all the associated child collections and then do what I need to do.
Linq to SQL has some limitations with respect to eager loading.
So Eager Load in Linq To SQL is only
eager loading for one level at a time.
As it is for lazy loading, with Load
Options we will still issue one query
per row (or object) at the root level
and this is something we really want
to avoid to spare the database. Which
is kind of the point with eager
loading, to spare the database. The
way LINQ to SQL issues queries for the
hierarchy will decrease the
performance by log(n) where n is the
number of root objects. Calling ToList
won't change the behavior but it will
control when in time all the queries
will be issued to the database.
For details see:
http://www.cnblogs.com/cw_volcano/archive/2012/07/31/2616729.html
I am sure this could be done better, but I got my code working with minimal queries. One per level. This is obviously not really eager loading using L2S, but if someone knows the right way I would like to know for future reference.
var query =
from g in cdc.Global
where g.active == true && g.globalId == 41
select g;
var globalList = query.ToList();
List<Category> categoryList = g.category.ToList<Category>();
var categoryIds = from c in cdc.Category
where c.globalId == g.globalId
select c.categoryId;
var types = from t in cdc.ItemTypes
where categoryIds.Any(i => i == t.categoryId)
select t;
List<ItemType> TypeList = types.ToList<ItemType>();
var items = from i in cdc.Items
from d in cdc.ItemData
where i.ItemId == d.ItemId && d.labelId == 1
where types.Any(i => i == r.ItemTypes)
select new
{
i.Id,
// A Bunch of more fields shortened for berevity
d.Data
};
var ItemList = items.ToList();
// Keep on going down the hierarchy if you need more child results
// Do your processing psuedocode
for each item in list
filter child list
for each item in child list
.....
//
Wouldn't mind knowing how to do this all using generics and a recursive method given the top level table