Does EF automatically load many to many references collections - c#

Imagine we have the following db structure
Organization
{
Guid OrganizationId
//....
}
User
{
Guid UserId
}
OrganizationUsers
{
Guid OrganizationId
Guid UserId
}
When the EDMX generates the classes, it abstracts OrganizationUsers away into a many-to-many navigation property, so no POCO class is generated for it.
Say I'm loading data from my context, but to avoid a Cartesian product I don't use an Include; I make two separate queries instead.
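using (var context = new EntitiesContext())
{
    // Query the entity set (assumed to be named Organizations), not the context itself.
    var organizationsQuery = context.Organizations.Where(FilterByParent);
    var organizations = organizationsQuery.ToList();
    // Load() returns void; it only pulls the users into the change tracker.
    organizationsQuery.SelectMany(x => x.Users).Load();
}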
Is it safe to assume that the connected entities are loaded?
Would this make any difference if I loaded the users directly from the DBSet?

From the database's point of view:
Is it safe to assume that the connected entities are loaded?
Yes, it's safe: the organizations are tracked by the EF change tracker first, and when Load is called in the next statement, EF knows the results should be attached to those tracked entities.
Would this make any difference if I loaded the users directly from the DBSet?
In fact, using Load this way does nothing better than Include!
If you use Include, EF translates it to a LEFT JOIN; if you use Load as above, it is translated to an INNER JOIN; and if you fetch the Users directly by their IDs using the Contains method, it is translated to IN on the SQL side.
In the Load and Contains cases you execute two queries (two passes) against SQL Server, whereas Include does the work in one pass, so overall Include should outperform your approach.
You can compare these approaches yourself using the SQL Profiler tool.
Update:
Based on the comments, I realized that Johnny's main concern is simply having an OrganizationUsers object. So I suggest changing your approach from Database First to Code First, where this object can exist explicitly.
Another approach that might work is customizing the T4 template, which seems harder but is not impossible.

Related

SingleOrDefault and FirstOrDefault returning cached data

Some previous code I had written used the Find() method to retrieve single entities by their primary key:
return myContext.Products.Find(id);
This worked great because I had this code tucked into a generic class, and each entity had a different field name as its primary key.
But I had to replace the code because I noticed that it was returning cached data, and I need it to return data from the database each call. Microsoft's documentation confirmed this is the behavior of Find().
So I changed my code to use SingleOrDefault or FirstOrDefault. I haven't found anything in documentation that states these methods return cached data.
Now I am executing these steps:
1. Save an entity via EF.
2. Execute an UPDATE statement in SSMS to update the recently saved record's Description field.
3. Retrieve the entity into a new entity variable using SingleOrDefault or FirstOrDefault.
The entities being returned still have the old value in the Description field.
I have run a SQL trace, and verified that the data is being queried during step 3. This baffles me - if EF is making a round trip to the database, why is it returning cached data?
I've searched online, and most answers apply to the Find() method. Furthermore, they suggest some solutions that are merely workarounds (dispose the DbContext and instantiate a new one) or solutions that won't work for me (use the AsNoTracking() method).
How can I retrieve my entities from the database and bypass the EF cache?
The behaviour you're seeing is described in Microsoft's How Queries Work article under point 3:
For each item in the result set
a. If this is a tracking query, EF checks if the data represents an entity already in the change tracker for the context instance
If so, the existing entity is returned
It's described a little better in this blog post:
It turns out that Entity Framework uses the Identity Map pattern. This means that once an entity with a given key is loaded in the context’s cache, it is never loaded again for as long as that context exists. So when we hit the database a second time to get the customers, it retrieved the updated 851 record from the database, but because customer 851 was already loaded in the context, it ignored the newer record from the database (more details).
All of this is saying that if you make a query, it checks the primary key first to see if it already has it in the cache. If so, it uses what's in the cache.
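To make the identity map behaviour concrete, here is a minimal sketch in plain C# with no EF involved. ToyContext and ProductRow are hypothetical stand-ins invented for illustration; they only mimic the "return the already tracked instance" rule described above and are not how EF is actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a row as it comes back from the database.
public sealed class ProductRow
{
    public int Id { get; set; }
    public string Description { get; set; }
}

// Hypothetical toy "context" that keeps one tracked instance per primary key.
public sealed class ToyContext
{
    private readonly Dictionary<int, ProductRow> _tracked = new Dictionary<int, ProductRow>();

    // Called once per row of a query result, mimicking a tracking query.
    public ProductRow Materialize(ProductRow fromDatabase)
    {
        if (_tracked.TryGetValue(fromDatabase.Id, out var existing))
        {
            // Key already tracked: the fresh values are discarded.
            return existing;
        }

        _tracked[fromDatabase.Id] = fromDatabase;
        return fromDatabase;
    }
}

public static class IdentityMapDemo
{
    // Returns the description seen by the second "query".
    public static string Run()
    {
        var context = new ToyContext();

        // First query: the row is materialized and tracked.
        context.Materialize(new ProductRow { Id = 851, Description = "old" });

        // The row is changed outside the context (for example by an UPDATE in SSMS),
        // then queried again through the same context.
        var second = context.Materialize(new ProductRow { Id = 851, Description = "new" });

        return second.Description;
    }
}
```

In this sketch the second call still reaches the "database" (a fresh ProductRow is passed in), yet the caller gets the instance that was tracked first, which matches what the SQL trace in the question showed.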
How do you avoid it? The first is to make sure you're not keeping your DbContext object alive too long. DbContext objects are only designed to be used for one unit of work. Bad things happen if you keep it around too long, like excessive memory consumption.
Do you need to retrieve data to display to the user? Create a DbContext to get the data and discard that DbContext.
Do you need to update a record? Create a new DbContext, update the record and discard that DbContext.
This is why, when you use EF Core with dependency injection in ASP.NET Core, it is created with a scoped lifetime, so any DbContext object only lives for the life of one HTTP request.
In the rare case you really do need to get fresh data for a record you already have an object for, you can use EntityEntry.Reload()/EntityEntry.ReloadAsync like this:
myContext.Entry(myProduct).Reload();
That doesn't help you if you only know the ID though.
If you really really need to reload an entity that you only have the ID for, you could do something weird like this:
private Product GetProductById(int id) {
    //check if it's in the cache already
    var cachedEntity = myContext.ChangeTracker.Entries<Product>()
        .FirstOrDefault(p => p.Entity.Id == id);
    if (cachedEntity == null) {
        //not in cache - get it from the database
        return myContext.Products.Find(id);
    } else {
        //we already have it - reload it
        cachedEntity.Reload();
        return cachedEntity.Entity;
    }
}
But again, this should only be used in limited cases, after you've already addressed any long-lived DbContext objects, because unwanted caching isn't the only consequence of keeping a context around.
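OK, I had the same problem and finally found the answer.
You're doing everything right; that's just how EF works.
You can use .AsNoTracking() for your purposes. Find() is not available after AsNoTracking(), so filter on the key instead (Id is assumed to be the key property here):
return myContext.Products.AsNoTracking().SingleOrDefault(p => p.Id == id);
Make sure you have added using Microsoft.EntityFrameworkCore; at the top.
It works like magic.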

ASP.NET Core 2 Identity/Entity Framework - How to get the custom properties within the User class?

Currently I do obtain the user with:
ApplicationUser currentUser = await _userManager.GetUserAsync(User);
but I found that in this way it doesn't hold custom properties, for example:
public virtual UserImage UserImage { get; set; }
so every time that I need to get such property I write a method to get from the db with entity-framework, like:
public async Task<UserImage> GetUserImage(string userId) =>
await _dBcontext.UserImage.SingleOrDefaultAsync(u => u.ApplicationUserId == userId);
I would like to cache all the user properties within the application (on the server, not in a cookie) just by calling await _userManager.GetUserAsync(User);
Is there such a way?
I'm going to assume that you're actually using Entity Framework Core, even though your question is tagged with just entity-framework. The reason is that what you have would just naturally work with Entity Framework, whereas, it will definitely not work at all with Entity Framework Core.
The key difference between the two is lazy-loading: EF Core did not support it at all before version 2.1, and from 2.1 onward it is opt-in only. With EF virtual navigation properties, a proxy class is dynamically created that derives from your entity class. The navigation properties are then dynamically overridden to add EF's lazy-loading logic to the getters. This causes an access of the property getter to invoke said lazy-loading logic and issue a query to the database to materialize the related entity or entities.
When EF Core is not lazy-loading, none of this occurs. As a result, unless you eagerly or explicitly load the relationship, it remains null. However, lazy-loading is a bad idea in the first place. It can lead to huge inefficiencies such as the 1+N query issue, where, for example, you iterate over a list and end up issuing a query per item in the list to materialize some relationship on that item. If you have a lot of items, you can end up issuing a ton of queries, particularly if there are other relationships involved further down the tree. Say, for example, that you have a list of items with a related entity, and that related entity itself has a related entity you need to access. Now you're issuing even more queries to fetch that related entity each time. It can get out of control very quickly.
Long and short, it's far better to eagerly load the relationships you need. That will actually cause JOINs to be issued in the initial query to fetch all the relationships at the same time, in just that one query. Short of that, explicit loading is still superior, as at least you are then aware of the specific queries you are issuing and can clearly see if things start to get out of hand.
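The 1+N arithmetic can be illustrated with a small sketch that just counts round trips. FakeDatabase and its methods are hypothetical names made up for this example; nothing here calls EF, it only imitates "one query per item" versus "one query with everything".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory "database" that counts how many queries it receives.
public sealed class FakeDatabase
{
    private readonly Dictionary<int, string> _imagesByUserId;

    public int QueryCount { get; private set; }

    public FakeDatabase(int userCount)
    {
        _imagesByUserId = Enumerable.Range(1, userCount)
            .ToDictionary(id => id, id => "image-" + id);
    }

    // One query that returns only the users.
    public List<int> LoadUserIds()
    {
        QueryCount++;
        return _imagesByUserId.Keys.ToList();
    }

    // One query per user, as a lazy-loaded navigation property would issue.
    public string LoadImageForUser(int userId)
    {
        QueryCount++;
        return _imagesByUserId[userId];
    }

    // One query that returns users together with their images (the eager, joined shape).
    public Dictionary<int, string> LoadUsersWithImages()
    {
        QueryCount++;
        return new Dictionary<int, string>(_imagesByUserId);
    }
}

public static class QueryCountDemo
{
    // Lazy style: 1 query for the list plus 1 query per item.
    public static int LazyStyle(int userCount)
    {
        var db = new FakeDatabase(userCount);
        foreach (var id in db.LoadUserIds())
        {
            db.LoadImageForUser(id);
        }
        return db.QueryCount;
    }

    // Eager style: everything arrives in a single query.
    public static int EagerStyle(int userCount)
    {
        var db = new FakeDatabase(userCount);
        db.LoadUsersWithImages();
        return db.QueryCount;
    }
}
```

With 100 users, LazyStyle issues 101 queries and EagerStyle issues 1. A second level of related entities would multiply the lazy count again.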
UserManager, however, does not give you any opportunity to do eager loads. As a result, if you use it to get the user, your only option is an explicit load of the related entity. That's not necessarily a bad thing, though, as it's only one additional query.
var currentUser = await _userManager.GetUserAsync(User);
await _dbContext.Entry(currentUser).Reference(u => u.UserImage).LoadAsync();
Now, you can access the related image.
Alternatively, you can query the user from the context, instead, and then eagerly load the image at the same time:
var userId = _userManager.GetUserId(User);
var currentUser = await _dbContext.Users.Include(u => u.UserImage).SingleOrDefaultAsync(u => u.Id == userId);
That will issue just one query with a join on the image relationship.
Have a look at this thread: asp.net forum
Rules for lazy loading:
context.Configuration.ProxyCreationEnabled should be true.
context.Configuration.LazyLoadingEnabled should be true.
Navigation property should be defined as public, virtual. Context will NOT do lazy loading if the property is not defined as virtual.
I hope this will help ;)

How to implement "free-form" relation in Entity Framework 6

I'm using EF 6 to work with a somewhat shoddily constructed database. I'm using a code-first model.
A lot of the logical relations there aren't implemented correctly using keys, but use various other strategies (such as character-separated IDs or strings) that were previously manipulated using complex SQL queries.
(Changing the schema is not an option)
I really want to capture those relations as properties. It's possible to do this by using explicit queries instead of defining actual relations using the fluent/attribute syntax.
I'm planning to do this by having IQueryable<T> properties that perform a query. For example:
partial class Product {
    public IQueryable<tblCategory> SubCategories {
        get {
            //SubCategoriesID is a string like "1234, 12351, 12" containing a list of IDs.
            var ids = SubCategoriesID.Split(',').Select(x => int.Parse(x.Trim()));
            return from category in this.GetContext().tblCategories
                   where ids.Contains(category.CategoryID)
                   select category;
        }
    }
}
(The GetContext() method is an extension method that somehow acquires an appropriate DbContext)
However, is there a better way to do this that I'm not familiar with?
Furthermore, if I do do this, what's the best way of getting the DbContext for the operation? It could be:
Just create a new one. I'm a bit leery of doing this, since I don't know much about how they work.
Use some tricks to get the context that was used to create this specific instance.
Do something else?
First, I would recommend not returning an IQueryable, as that retains a relationship to the original DbContext. Instead, I'd call ToList on the query and return the result as an IEnumerable<tblCategory>.
Try not to keep DbContext instances hanging around; there's a lot of state management baked into them, and since they are not thread-safe you don't want to have multiple threads hitting the same instance. The pattern I personally tend to follow on data access methods is to use a new DbContext in a using block:
using (var ctx = new YourDbContextTypeHere()) {
    return (from category in ctx.tblCategories
            where ids.Contains(category.CategoryID)
            select category).ToList();
}
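The ids value used above comes from the question's comma-separated SubCategoriesID column. A minimal sketch of parsing that string once, up front, into a concrete list might look like this. IdListParser is a hypothetical helper name, and silently skipping malformed entries is an assumption; throwing may suit your data better.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for strings such as "1234, 12351, 12".
public static class IdListParser
{
    public static List<int> Parse(string raw)
    {
        var ids = new List<int>();
        if (string.IsNullOrWhiteSpace(raw))
        {
            // A null or empty column simply means "no related ids".
            return ids;
        }

        foreach (var part in raw.Split(','))
        {
            // Assumption: malformed or empty entries are skipped rather than treated as errors.
            if (int.TryParse(part.Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out var id))
            {
                ids.Add(id);
            }
        }

        return ids;
    }
}
```

Calling IdListParser.Parse(SubCategoriesID) before opening the context keeps the string handling out of the query, so the query only sees a plain list of integers.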
Beware that .Contains() on an in-memory list of IDs can be very slow in EF, especially when the list is large, so try to avoid it. I'd use a subquery instead, such as
var subcategories = context.SubCategories.Where(...);
var categories = context.Categories.Where(c => subcategories.Select(s => s.Id).Contains(c.CategoryId));
In this setup, the IDs never have to be loaded into memory and sent back to the server; the whole filter runs inside the database, and the query will be fast.

Entity Framework with multiple databases

Is there any way to map multiple SQL Server databases in a single EF context? For instance, I'm trying to do something like this:
select order from context.Orders
where context.Users.Any(user => user.UserID == order.UserID)
And I'd like to get generated SQL along the lines of:
select .. from store.dbo.order where userID in
(select userID from authentication.dbo.user)
and note that the database names are different - store in one place, authentication in the other.
I've found a few articles that deal with multiple schema ('dbo' in this case), but none dealing with multiple database names.
As a potential workaround, you could create a view of the table from the second database in the first database and point your mappings to the view.
I'm pretty sure this isn't possible. The context derives from DbContext.
A DbContext instance represents a combination of the Unit Of Work and Repository patterns such that it can be used to query from a database and group together changes that will then be written back to the store as a unit. DbContext is conceptually similar to ObjectContext.
Configuration (connection string, schema, etc) for a DbContext is specific to a single database.
It's not possible. A context is scoped to a single database, and allowing this would probably be bad practice. It could cause developers to forget that they are dealing with two databases and to overlook the performance implications that come with that.
I imagine you should still be able to use two contexts and write elegant code.
var userIds = AuthContext.Users
    .Where(user => user.Name == "Bob")
    .Select(user => user.UserId)
    .ToList();
var orders = StoreContext.Orders
    .Where(order => userIds.Contains(order.UserId))
    .ToList();
First execute the query on the authentication database context, so that its results can be passed as parameters to the second query.

EF4 update a value for all rows in a table without doing a select

I need to reset a boolean field in a specific table before I run an update.
The table could have 1 million or so records, and I'd prefer not to have to do a select before the update, as it's taking too much time.
Basically what I need in code is to produce the following in TSQL
update tablename
set flag = false
where flag = true
I have something close to what I need here: http://www.aneyfamily.com/terryandann/post/2008/04/Batch-Updates-and-Deletes-with-LINQ-to-SQL.aspx
I have yet to implement it, and was wondering if there is a more standard way.
To keep within the restrictions we have for this project, we can't use SPROCs or write T-SQL directly in an ExecuteStoreCommand call on the context, which I believe you can do.
I'm aware that what I need to do may not be directly supported in EF4 and we may need to look at a SPROC for the job [in the total absence of any other way] but I just need to explore fully all possibilities first.
In an EF ideal world the call above to update the flag would be possible or alternatively it would be possible to get the entity with the id and the boolean flag only minus the associated entities and loop through the entity and set the flag and do a single SaveChanges call, but that may not be the way it works.
Any ideas,
Thanks in advance.
Liam
I would go to the stakeholder who introduced the restriction on using SQL or stored procedures directly and present these facts:
Updates in an ORM (like Entity Framework) work this way: you load the object, you modify it, you save it. That is the only valid way.
Obviously, in your case that would mean loading 1M entities and executing 1M updates separately (EF has no command batching; each command runs in its own round trip to the database), which is usually an absolutely useless solution.
The example you provided looks very interesting, but it is for LINQ to SQL, not for Entity Framework. Unless you implement it, you can't be sure that it will work for EF, because the infrastructure in EF is much more complex. So you could spend several man-days on this without any result, and that should be approved by the stakeholder.
A solution with a stored procedure or direct SQL will take you a few minutes and it will simply work.
In both solutions you will have to deal with another problem. If you already have materialized entities and you run such a command (via the mentioned extension or via SQL), the changes will not be mirrored in the already loaded entities; you will have to iterate over them and set the flag.
Both scenarios break the unit of work, because some data changes are executed before the unit of work is completed.
It is all about using the right tool for the right requirement.
By the way, loading of related tables can be avoided. It is just about the query you run. Do not use Include and do not access navigation properties (in case of lazy loading), and you will not load the relations.
It is possible to select only the Id (via projection), create a dummy entity (setting only the Id and the new Flag value) and execute updates of just the flag, but that will still execute up to 1M updates.
using (var myContext = new MyContext(connectionString))
{
    // Project only the keys of the rows whose flag must be reset.
    var query = from o in myContext.MyEntities
                where o.Flag == true
                select o.Id;
    foreach (var id in query)
    {
        // Stub entity: only the key and the new flag value are set.
        var entity = new MyEntity
        {
            Id = id,
            Flag = false
        };
        // Attach through the entity set; ObjectContext.Attach expects an entity that already has an EntityKey.
        myContext.MyEntities.Attach(entity);
        myContext.ObjectStateManager.GetObjectStateEntry(entity).SetModifiedProperty("Flag");
    }
    myContext.SaveChanges();
}
Moreover, it will only work in an empty object context (or at least one where no entity from the updated table is attached). So in some scenarios, running this before other updates will require two ObjectContext instances, which means manually sharing a DbConnection or using two database connections and, where transactions are involved, a distributed transaction and another performance hit.
Make a new EF model, and only add the one Table you need to make the update on. This way, all of the joins don't occur. This will greatly speed up your processing.
ObjectContext.ExecuteStoreCommand ( _
commandText As String, _
ParamArray parameters As Object() _
) As Integer
http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.executestorecommand.aspx
Edit
Sorry, did not read the post all the way.
