I have "static" readonly entities which I simply load with QueryOver<T>().List<T>(). All their properties are not "lazy". So some of them have N+1 problem.
I tried to use Future to avoid N+1. But it looks like then NH considers entity properties as "lazy". And when I access them it even reloads entities from db one by one (leading to the same N+1 situation) despite that all entities were preloaded preliminary and should be cached in the session 1st level cache. Here is the code how I'm doing it:
var futures = new List<IEnumerable>();
futures.Add(s.QueryOver<DailyBonus>().Future<DailyBonus>());
futures.Add(s.QueryOver<DailyBonusChestContent>().Future<DailyBonusChestContent>());
// ... other entities ...
// all queries should be sent with first enumeration
// but I want to ensure everything is loaded
// before using lazy properties
foreach (IEnumerable future in futures)
{
if (future.Cast<object>().Any(x => false)) break;
}
// now everything should be in cache, right?
// so I can travel the whole graph without accessing db?
Serializer.Serialize(ms, futures); // wow, N+1 here!
I checked this behavior using hibernatingrhinos profiler.
So what is going on wrong here?
The only correct way of using futures for loading entity with collections is with Fetch which means joins for each query:
var q = session.Query<User>().Where(x => x.Id == id);
var lst = new List<IEnumerable>
{
q.FetchMany(x => x.Characters).ToFuture(),
q.Fetch(x=>x.UpdateableData).ToFuture(),
session.QueryOver<User>().Where(x => x.Id == id)
.Fetch(x=>x.Characters).Eager
.Fetch(x => x.Characters.First().SmartChallengeTrackers).Eager
.Future()
};
var r = session.QueryOver<User>().Where(x => x.Id == id)
.TransformUsing(Transformers.DistinctRootEntity)
.Future();
foreach (IEnumerable el in lst)
{
foreach (object o in el)
{
}
}
return r.ToArray();
It's still better than join everything in a one query - NHibernate won't have to parse thousands of rows introduced by join x join x join x join...
You can add normal select queries (without Fetch) to to the same batch but they won't be used for retrieving collections on another entity.
Related
I'm trying to load some user info in code, but EF returns null.
foreach (var user in allAutoUsers)
{
Wallet wallet = db.CabinetUsers
.Find(user.IdCabinetUser)?
.Wallets
.FirstOrDefault(x => x.TypeCurrency == currency);
}
Variable user reporting 1 wallet, but when I'm trying to get it in code above it returns null.
Are there some ways to solve this problem?
Read more into Linq expressions rather than relying on Find. You can be running into issues if your relationships between entities are not defined as virtual which would prevent EF from lazy loading them.
The issue with using .Find() is that it will return the entity if it exists, however, attempting to access any related property that the DbContext isn't already aware of will require a lazy load call. This can be easily missed if lazy loading is disabled, or the member isn't virtual, and it can be a performance issue when it is enabled.
Instead, Linq can allow you to query through the object graph directly to get what you want:
foreach (var user in allAutoUsers)
{
Wallet wallet = db.CabinetUsers
.Where(x => x.IdCabinetUser == user.IdCabinetUser)
.SelectMany(x => x.Wallets)
.SingleOrDefault(x => x.TypeCurrency == currency);
// do stuff with wallet...
}
This assumes that there will be only 1 wallet for the specified currency per user. When expecting 0 or 1, use SingleOrDefault. Use FirstOrDefault only when you are expecting 0 or many, want the "first" one and have specified an OrderBy clause to ensure the first item is predictable.
This will result in a query per user. To accomplish with 1 query for all users:
var userIds = allAutoUsers.Select(x => x.IdCabinetUser).ToList();
var userWallets = db.CabinetUsers
.Where(x => userIds.Contains(x.IdCabinetUser))
.Select(x => new
{
x.IdCabinetUser,
Wallet = x.SelectMany(x => x.Wallets)
.SingleOrDefault(x => x.TypeCurrency == currency);
}).ToList();
From this I would consider expanding the wallets SelectMany with a Select for the details in the Wallet you actually care about rather than a reference to the entire wallet entity. This has the benefit of speeding up the query, reducing memory use, and avoids the potential for lazy load calls tripping things up if Wallet references any other entities that get touched later on.
For example if you only need the IdWallet, WalletName, TypeCurrency, and Balance:
// replace this line from above...
Wallet = x.SelectMany(x => x.Wallets)
// with ...
Wallet = x.SelectMany(x => x.Wallets.Select(w => new
{
w.IdWallet,
w.WalletName,
w.TypeCurrency,
w.Ballance
}) // ...
From there you can foreach to your heart's content without extra queries:
foreach ( var userWallet in userWallets)
{
// do stuff with userWallet.Wallet and userWallet.IdCabinetUser.
}
If you want to return the wallet details to a calling method or view or such, then you cannot use an anonymous type for that ( new { } ). Instead you will need to define a simple class for the data you want to return and use Select into that. I.e. new WalletDTO { IdWallet = w.IdWallet, //... } Even if you use Entities, it is recommended to reduce these into DTOs or ViewModels rather than returning entities. Entities should not "live" longer than the DbContext that spawned them otherwise you get all kinds of nasty behaviour popping up like ObjectDisposedException and serialization exceptions.
I have to put a complex query on your database. But the query ends at 8000 ms. Do I do something wrong? I use .net 1.1 and Entity Framework core 1.1.2 version.
var fol = _context.UserRelations
.Where(u => u.FollowerId == id && u.State == true)
.Select(p => p.FollowingId)
.ToArray();
var Votes = await _context.Votes
.OrderByDescending(c => c.CreationDate)
.Skip(pageSize * pageIndex)
.Take(pageSize)
.Where(fo => fol.Contains(fo.UserId))
.Select(vote => new
{
Id = vote.Id,
VoteQuestions = vote.VoteQuestions,
VoteImages = _context.VoteMedias.Where(m => m.VoteId == vote.Id)
.Select(k => k.MediaUrl.ToString()),
Options = _context.VoteOptions.Where(m => m.VoteId == vote.Id).Select( ques => new
{
OptionsID = ques.Id,
OptionsName = ques.VoteOption,
OptionsCount = ques.VoteRating.Count(cout => cout.VoteOptionsId == ques.Id),
}),
User = _context.Users.Where(u => u.Id == vote.UserId).Select(usr => new
{
Id = usr.Id,
Name = usr.UserProperties.Where(o => o.UserId == vote.UserId).Select(l => l.Name.ToString())
.First(),
Surname = usr.UserProperties.Where(o => o.UserId == vote.UserId)
.Select(l => l.SurName.ToString()).First(),
ProfileImage = usr.UserProfileImages.Where(h => h.UserId == vote.UserId && h.State == true)
.Select(n => n.ImageUrl.ToString()).First()
}),
NextPage = nextPage
}).ToListAsync();
Have a look at the SQL queries you generate to the server (and results of this queries). For SQL Server the best option is SQL Server Profiler, there are ways for other servers too.
you create two queries. First creates fol array and then you pass it into the second query using Contains. Do you know how this works? You probably generate query with as many parameters as many items you have in the array. It is neither pretty or efficient. It is not necessary here, merge it into the main query and you would have only one parameter.
you do paginating before filtering, is this really the way it should work? Also have a look at other ways of paginating based on filtering by ids rather than simple skipping.
you do too much side queries in one query. When you query three sublists of 100 items each, you do not get 300 rows. To get it in one query you create join and get actually 100*100*100 = 1000000 rows. Unless you are sure the frameworks can split it into multiple queries (probably can not), you should query the sublists in separate queries. This would be probably the main performance problem you have.
please use singular to name tables, not plural
for performance analysis, indexes structure and execution plan are vital information and you can not really say much without them
As noted in the comments, you are potentially executing 100, 1000 or 10000 queries. For every Vote in your database that matches the first result you do 3 other queries.
For 1000 votes which result from the first query you need to do 3000 other queries to fetch the data. That's insane!
You have to use EF Cores eager loading feature to fetch this data with very few queries. If your models are designed well with relations and navigation properties its easy.
When you load flat models without a projection (using .Select), you have to use .Include to tell EF Which other related entities it should load.
// Assuming your navigation property is called VoteMedia
await _context.Votes.
.Include(vote => vote.VoteMedia)
...
This would load all VoteMedia objects with the vote. So extra query to get them is not necessary.
But if you use projects, the .Include calls are not necessary (in fact they are even ignored, when you reference navigation properties in the projection).
// Assuming your navigation property is called VoteMedia
await _context.Votes.
.Include(vote => vote.VoteMedia)
...
.Select( vote => new
{
Id = vote.Id,
VoteQuestions = vote.VoteQuestions,
// here you reference to VoteMedia from your Model
// EF Core recognize that and will load VoteMedia too.
//
// When using _context.VoteMedias.Where(...), EF won't do that
// because you directly call into the context
VoteImages = vote.VoteMedias.Where(m => m.VoteId == vote.Id)
.Select(k => k.MediaUrl.ToString()),
// Same here
Options = vote.VoteOptions.Where(m => m.VoteId == vote.Id).Select( ques => ... );
}
Consider following LINQ query:
var item = (from obj in _db.SampleEntity.Include(s => s.NavProp1)
select new
{
ItemProp1 = obj,
ItemProp2 = obj.NavProp2.Any(n => n.Active)
}).SingleOrDefault();
This runs as expected, but item.ItemProp1.NavProp1 is NULL.
As it explains here this is because of the query actually changes after using Include(). but the question is what is the solution with this situation?
Edit:
When I change the query like this, every things works fine:
var item = (from obj in _db.SampleEntity.Include(s => s.NavProp1)
select obj).SingleOrDefault();
Regarding to this article I guess what the problem is... but the solution provided by author not working in my situation (because of using anonymous type in final select rather than entity type).
As you mentioned, Include is only effective when the final result of the query consists of the entities that should include the Include-d navigation properties.
So in this case Include has effect:
var list = _db.SampleEntity.Include(s => s.NavProp1).ToList();
The SQL query will contain a JOIN and each SampleEntity will have its NavProp1 loaded.
In this case it has no effect:
var list = _db.SampleEntity.Include(s => s.NavProp1)
.Select(s => new { s })
.ToList();
The SQL query won't even contain a JOIN, EF completely ignores the Include.
If in the latter query you want the SampleEntitys to contain their NavProp1s you can do:
var list = _db.SampleEntity
.Select(s => new { s, s.NavProp1 })
.ToList();
Now Entity Framework has fetched SampleEntitys and NavProp1 entities from the database separately, but it glues them together by a process called relationship fixup. As you see, the Include is not necessary to make this happen.
However, if Navprop1 is a collection, you'll notice that...
var navprop1 = list.First().s.Navprop1;
...will still execute a query to fetch Navprop1 by lazy loading. Why is that?
While relationship fixup does fill Navprop1 properties, it doesn't mark them as loaded. This only happens when Include loaded the properties. So now we have SampleEntity all having their Navprop1s, but you can't access them without triggering lazy loading. The only thing you can do to prevent this is
_db.Configuration.LazyLoadingEnabled = false;
var navprop1 = list.First().s.Navprop1;
(or by preventing lazy loading by disabling proxy creation or by not making Navprop1 virtual.)
Now you'll get Navprop1 without a new query.
For reference navigation properties this doesn't apply, lazy loading isn't triggered when it's enabled.
In Entity Framework core, things have changed drastically in this area. A query like _db.SampleEntity.Include(s => s.NavProp1).Select(s => new { s }) will now include NavProp1 in the end result. EF-core is smarter in looking for "Includable" entities in the end result. Therefore, we won't feel inclined to shape a query like Select(s => new { s, s.NavProp1 }) in order to populate the navigation property. Be aware though, that if we use such a query without Include, lazy loading will still be triggered when s.NavProp1 is accessed.
I know this will probably get a few laughs, but don't forget the obvious like i just did. The row in the database didn't actually have a foreign key reference! I should have checked the dam data first before thinking EF Include wasn't working! Grrr. 30 minutes of my life I won't get back.
If your model is defined properly it should work without any problems.
using System.Data.Entity;
var item = _db.SampleEntity
.Include(p => p.NavigationProperty)
.Select(p => new YourModel{
PropertyOne = p.Something,
PropertyTwo = p.NavigationProperty.Any(x => x.Active)
})
.SingleOrDefault(p => p.Something == true);
How did you find that item.ItemProp1.NavProp1 is null. EF uses proxies to load all required properties when you try to access it.
What about
var item = (from obj in _db.SampleEntity.Include(s => s.NavProp1)
select obj).SingleOrDefault();
Assert.IsNotNull(obj.NavProp1);
Assert.IsNotNull(obj.NavProp2);
You can also try with
var item = (from obj in _db.SampleEntity.Include(s => s.NavProp1)
select new
{
ItemProp1 = obj,
NavProp1 = obj.NavProp1,
ItemProp2 = obj.NavProp2.Any(n => n.Active)
}).SingleOrDefault();
Assert.IsNotNull(item.NavProp1)
Of course I assume that you don't have any problems with EF navigation property mappings.
I have two entities, Class and Student, linked in a many-to-many relationship.
When data is imported from an external application, unfortunately some classes are created in duplicate. The 'duplicate' classes have different names, but the same subject and the same students.
For example:
{ Id = 341, Title = '10rs/PE1a', SubjectId = 60, Students = { Jack, Bill, Sarah } }
{ Id = 429, Title = '10rs/PE1b', SubjectId = 60, Students = { Jack, Bill, Sarah } }
There is no general rule for matching the names of these duplicate classes, so the only way to identify that two classes are duplicates is that they have the same SubjectId and Students.
I'd like to use LINQ to detect all duplicates (and ultimately merge them). So far I have tried:
var sb = new StringBuilder();
using (var ctx = new Ctx()) {
ctx.CommandTimeout = 10000; // Because the next line takes so long!
var allClasses = ctx.Classes.Include("Students").OrderBy(o => o.Id);
foreach (var c in allClasses) {
var duplicates = allClasses.Where(o => o.SubjectId == c.SubjectId && o.Id != c.Id && o.Students.Equals(c.Students));
foreach (var d in duplicates)
sb.Append(d.LongName).Append(" is a duplicate of ").Append(c.LongName).Append("<br />");
}
}
lblResult.Text = sb.ToString();
This is no good because I get the error:
NotSupportedException: Unable to create a constant value of type 'TeachEDM.Student'. Only primitive types ('such as Int32, String, and Guid') are supported in this context.
Evidently it doesn't like me trying to match o.SubjectId == c.SubjectId in LINQ.
Also, this seems a horrible method in general and is very slow. The call to the database takes more than 5 minutes.
I'd really appreciate some advice.
The comparison of the SubjectId is not the problem because c.SubjectId is a value of a primitive type (int, I guess). The exception complains about Equals(c.Students). c.Students is a constant (with respect to the query duplicates) but not a primitive type.
I would also try to do the comparison in memory and not in the database. You are loading the whole data into memory anyway when you start your first foreach loop: It executes the query allClasses. Then inside of the loop you extend the IQueryable allClasses to the IQueryable duplicates which gets executed then in the inner foreach loop. This is one database query per element of your outer loop! This could explain the poor performance of the code.
So I would try to perform the content of the first foreach in memory. For the comparison of the Students list it is necessary to compare element by element, not the references to the Students collections because they are for sure different.
var sb = new StringBuilder();
using (var ctx = new Ctx())
{
ctx.CommandTimeout = 10000; // Perhaps not necessary anymore
var allClasses = ctx.Classes.Include("Students").OrderBy(o => o.Id)
.ToList(); // executes query, allClasses is now a List, not an IQueryable
// everything from here runs in memory
foreach (var c in allClasses)
{
var duplicates = allClasses.Where(
o => o.SubjectId == c.SubjectId &&
o.Id != c.Id &&
o.Students.OrderBy(s => s.Name).Select(s => s.Name)
.SequenceEqual(c.Students.OrderBy(s => s.Name).Select(s => s.Name)));
// duplicates is an IEnumerable, not an IQueryable
foreach (var d in duplicates)
sb.Append(d.LongName)
.Append(" is a duplicate of ")
.Append(c.LongName)
.Append("<br />");
}
}
lblResult.Text = sb.ToString();
Ordering the sequences by name is necessary because, I believe, SequenceEqual compares length of the sequence and then element 0 with element 0, then element 1 with element 1 and so on.
Edit To your comment that the first query is still slow.
If you have 1300 classes with 30 students each the performance of eager loading (Include) could suffer from the multiplication of data which are transfered between database and client. This is explained here: How many Include I can use on ObjectSet in EntityFramework to retain performance? . The query is complex because it needs a JOIN between classes and students and object materialization is complex as well because EF must filter out the duplicated data when the objects are created.
An alternative approach is to load only the classes without the students in the first query and then load the students one by one inside of a loop explicitely. It would look like this:
var sb = new StringBuilder();
using (var ctx = new Ctx())
{
ctx.CommandTimeout = 10000; // Perhaps not necessary anymore
var allClasses = ctx.Classes.OrderBy(o => o.Id).ToList(); // <- No Include!
foreach (var c in allClasses)
{
// "Explicite loading": This is a new roundtrip to the DB
ctx.LoadProperty(c, "Students");
}
foreach (var c in allClasses)
{
// ... same code as above
}
}
lblResult.Text = sb.ToString();
You would have 1 + 1300 database queries in this example instead of only one, but you won't have the data multiplication which occurs with eager loading and the queries are simpler (no JOIN between classes and students).
Explicite loading is explained here:
http://msdn.microsoft.com/en-us/library/bb896272.aspx
For POCOs (works also for EntityObject derived entities): http://msdn.microsoft.com/en-us/library/dd456855.aspx
For EntityObject derived entities you can also use the Load method of EntityCollection: http://msdn.microsoft.com/en-us/library/bb896370.aspx
If you work with Lazy Loading the first foreach with LoadProperty would not be necessary as the Students collections will be loaded the first time you access it. It should result in the same 1300 additional queries like explicite loading.
I have 5 tables in a L2S Classes dbml : Global >> Categories >> ItemType >> Item >> ItemData. For the below example I have only gone as far as itemtype.
//cdc is my datacontext
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Global>(p => p.Category);
options.AssociateWith<Global>(p => p.Category.OrderBy(o => o.SortOrder));
options.LoadWith<Category>(p => p.ItemTypes);
options.AssociateWith<Category>(p => p.ItemTypes.OrderBy(o => o.SortOrder));
cdc.LoadOptions = options;
TraceTextWriter traceWriter = new TraceTextWriter();
cdc.Log = traceWriter;
var query =
from g in cdc.Global
where g.active == true && g.globalid == 41
select g;
var globalList = query.ToList();
// In this case I have hardcoded an id while I figure this out
// but intend on trying to figure out a way to include something like globalid in (#,#,#)
foreach (var g in globalList)
{
// I only have one result set, but if I had multiple globals this would run however many times and execute multiple queries like it does farther down in the hierarchy
List<Category> categoryList = g.category.ToList<Category>();
// Doing some processing that sticks parent record into a hierarchical collection
var categories = (from comp in categoryList
where comp.Type == i
select comp).ToList<Category>();
foreach (var c in categories)
{
// Doing some processing that stick child records into a hierarchical collection
// Here is where multiple queries are run for each type collection in the category
// I want to somehow run this above the loop once where I can get all the Items for the categories
// And just do a filter
List<ItemType> typeList = c.ItemTypes.ToList<ItemType>();
var itemTypes = (from cat in TypeList
where cat.itemLevel == 2
select cat).ToList<ItemType>();
foreach (var t in itemTypes)
{
// Doing some processing that stick child records into a hierarchical collection
}
}
}
"List typeList = c.ItemTypes.ToList();"
This line gets executed numerous times in the foreach, and a query is executed to fetch the results, and I understand why to an extent, but I thought it would eager load on Loadwith as an option, as in fetch everything with one query.
So basically I would have expected L2S behind the scenes to fetch the "global" records in one query, take any primary key values, get the "category" children using one one query. Take those results and stick them into collections linked to the global. Then take all the category keys and excute one query to fetch the itemtype children and link those into their associated collections. Something on the order of (Select * from ItemTypes Where CategoryID in ( select categoryID from Categories where GlobalID in ( #,#,# ))
I would like to know how to properly eager load associated children with minimal queries and possibly how to accomplish my routine generically not knowing how far down I need to build the hierarchy, but given a parent entity, grab all the associated child collections and then do what I need to do.
Linq to SQL has some limitations with respect to eager loading.
So Eager Load in Linq To SQL is only
eager loading for one level at a time.
As it is for lazy loading, with Load
Options we will still issue one query
per row (or object) at the root level
and this is something we really want
to avoid to spare the database. Which
is kind of the point with eager
loading, to spare the database. The
way LINQ to SQL issues queries for the
hierarchy will decrease the
performance by log(n) where n is the
number of root objects. Calling ToList
won't change the behavior but it will
control when in time all the queries
will be issued to the database.
For details see:
http://www.cnblogs.com/cw_volcano/archive/2012/07/31/2616729.html
I am sure this could be done better, but I got my code working with minimal queries. One per level. This is obviously not really eager loading using L2S, but if someone knows the right way I would like to know for future reference.
var query =
from g in cdc.Global
where g.active == true && g.globalId == 41
select g;
var globalList = query.ToList();
List<Category> categoryList = g.category.ToList<Category>();
var categoryIds = from c in cdc.Category
where c.globalId == g.globalId
select c.categoryId;
var types = from t in cdc.ItemTypes
where categoryIds.Any(i => i == t.categoryId)
select t;
List<ItemType> TypeList = types.ToList<ItemType>();
var items = from i in cdc.Items
from d in cdc.ItemData
where i.ItemId == d.ItemId && d.labelId == 1
where types.Any(i => i == r.ItemTypes)
select new
{
i.Id,
// A Bunch of more fields shortened for berevity
d.Data
};
var ItemList = items.ToList();
// Keep on going down the hierarchy if you need more child results
// Do your processing psuedocode
for each item in list
filter child list
for each item in child list
.....
//
Wouldn't mind knowing how to do this all using generics and a recursive method given the top level table