EF Core fix-up when querying subset of columns - c#

From the documentation:
Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.
Entities setup:
public class Page{
public Page () {
Event = new HashSet<Event>();
}
[Key]
public int Id { get; set; }
public string Title { get; set; }
public string Content { get; set; } // don't want to retrieve, too large
public ICollection<Event> Event { get; set; }
}
public class Event{
[Key]
public int Id { get; set; }
public string Name { get; set; }
public string Type { get; set; }
public Page Page { get; set; }
}
The context is set up with a One-To-Many relationship.
These are the queries I run, one after the other:
var pages = _dbContext.Page.Select(page => new Page
{
Id = page.Id,
Title = page.Title
}).ToList();
var events = _dbContext.Event.ToList();
I expect each Page to have the Events collection populated (and vice-versa for Event with the Page reference), but the fix-up doesn't happen (Page in Event is null, and Event in Page is null).
If I replace the first query by this, then the fix-up works:
var pages = _dbContext.Page.ToList();
So it seems that with projection the fix-up doesn't happen. The reason I split this in 2 queries was to avoid using something like Include which would make a huge join and duplicate plenty of data.
Is there any way around that? Do I need to do the fix-up manually myself?

When you project into a new type yourself in the query, EF Core does not track the objects coming out of that query, even if their type is an entity type that is part of the model. This is by design.
Since in your case the Pages are not being tracked, the Events have nothing to fix up against. Hence you see null navigation properties.
This behavior was the same in the previous version (EF6). The main reason for not tracking is that, as in your case, you are creating a new Page without loading Content. If EF tracked the new entity, it would have Content set to null (default(string)). If you then marked the whole entity as modified, SaveChanges would end up saving a null value in the Content column in the database, which would cause data loss. Because a minor mistake could cause a major problem such as data loss, EF Core does not track such projected entities by default. Another reason is weak entity types (or complex types in EF6), which share a CLR type with other entities but are uniquely identified through the parent type: if you project out such an entity, EF Core cannot figure out which entity type it is without the parent information.
You could put those entities in the change tracker by calling the Attach method, which will cause fix-up and give you the desired behavior. Be careful not to save them.
In general the scenario you want is useful. This issue is tracking support for that in EF Core.

I don't think that should work. Did you verify that this behavior worked in previous versions of Entity Framework? Since you aren't pulling out the full entity, only some of its properties, and then passing them into a new Page, you are essentially just selecting properties and creating a new object.
If you would like the pages to be attached, you can manually call the Attach method after selecting them:
var pages = _dbContext.Page.Select(page => new Page
{
Id = page.Id,
Title = page.Title
}).ToList();
pages.ForEach(p => _dbContext.Page.Attach(p));
Keep in mind that if these entities are later marked as modified and you call SaveChanges, you will lose the unloaded properties, so only use this in read-only (Get) methods.
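If attaching partially loaded entities feels too risky, the fix-up can also be done by hand after both queries have run. The following is only a minimal sketch, not what EF Core does internally. It assumes that Event exposes an explicit PageId foreign key property, which the model in the question does not show, and ManualFixUp.Link is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Page
{
    public int Id { get; set; }
    public string Title { get; set; }
    public ICollection<Event> Event { get; set; } = new HashSet<Event>();
}

public class Event
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int PageId { get; set; }   // assumption: explicit foreign key property
    public Page Page { get; set; }
}

public static class ManualFixUp
{
    // Wires up both sides of the Page/Event relationship in memory.
    public static void Link(IEnumerable<Page> pages, IEnumerable<Event> events)
    {
        var pagesById = pages.ToDictionary(p => p.Id);
        foreach (var ev in events)
        {
            // Events whose page was not loaded are left untouched.
            if (pagesById.TryGetValue(ev.PageId, out var page))
            {
                ev.Page = page;
                page.Event.Add(ev);
            }
        }
    }
}
```

Because nothing is tracked here, a later SaveChanges cannot overwrite the Content column by accident.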

Related

EF Core - How can I retrieve associated entities of an aggregate root?

In my app I am trying to follow DDD whilst modelling a simple fund account.
The classes are
FundAccount
public Guid Id { get; set; }
private ICollection<AccountTransaction> _transactions = new List<AccountTransaction>();
public IReadOnlyCollection<AccountTransaction> Transactions => _transactions
.OrderByDescending(transaction => transaction.TransactionDateTime).ToList();
AccountTransaction
public Guid Id { get; set; }
public decimal Amount { get; set; }
I am trying to retrieve the fund account from the database with the transactions included with the following:
var fundAccount = await _context.FundAccounts
.Include(a => a.Transactions)
.SingleAsync(a => a.Id == ...);
When I retrieve the FundAccount (which has transactions in the database), Transactions contains 0 AccountTransaction items.
Can anyone see what I need to do here?
First, when using "domain logic" in the entity data model (which you shouldn't, but that's another story), make sure to configure EF Core to use backing fields instead of properties (the default), by adding the following to the db context OnModelCreating override:
modelBuilder.UsePropertyAccessMode(PropertyAccessMode.Field);
This, by the way, has been recognized as an issue and will be fixed in version 3.0 - see Breaking Changes - Backing fields are used by default. But currently you have to include the above code.
Second, you have to change the backing field type to be compatible with the property type.
In your case, ICollection<AccountTransaction> is not compatible with IReadOnlyCollection<AccountTransaction>, because the former does not inherit the latter for historical reasons. But List<T> (and other collection classes) implement both interfaces, and since this is what you use to initialize the field, simply use it as the field type:
private List<AccountTransaction> _transactions = new List<AccountTransaction>();
With these two modifications in place, the collection navigation property will be correctly loaded by EF Core.
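Putting the two changes together, the entity could look roughly like the sketch below. It is plain C# without the EF Core configuration, so the UsePropertyAccessMode call shown above is still needed in OnModelCreating. The AddTransaction method is a hypothetical addition for illustration and is not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AccountTransaction
{
    public Guid Id { get; set; }
    public decimal Amount { get; set; }
    public DateTime TransactionDateTime { get; set; }
}

public class FundAccount
{
    public Guid Id { get; set; }

    // List<T> implements both ICollection<T> and IReadOnlyCollection<T>,
    // so the field type is compatible with the property type below.
    private List<AccountTransaction> _transactions = new List<AccountTransaction>();

    // Read-only view, newest transaction first.
    public IReadOnlyCollection<AccountTransaction> Transactions => _transactions
        .OrderByDescending(transaction => transaction.TransactionDateTime).ToList();

    // Hypothetical domain method: the only way callers can add a transaction.
    public void AddTransaction(AccountTransaction transaction)
    {
        _transactions.Add(transaction);
    }
}
```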

When building a simple ORM for a hierarchy of POCO objects based on a database, who should 'own' the data?

I've been struggling with a design of my data-driven app for some time right now and keep going back and forth on my implementation, causing a lot of wheel-spin without much actual movement. Here's my problem.
I have a self-referencing table in my database representing a hierarchy of categories. They can recurse infinitely. The root categories have no parent and each category, root or otherwise, can have zero or more subcategories.
If this were a pure POCO-based design without the database backing, I would just do this...
public class CategoryCollection : ObservableCollection<Category>
{
}
public class Category
{
public string Name { get; set; }
public CategoryCollection Subcategories = new CategoryCollection();
}
public CategoryCollection RootCategories = new CategoryCollection();
Then I'd stuff the RootCategories with the root instances and build out the hierarchy from there.
I may even extend it to have a parent so I can traverse up and down the hierarchy as needed.
public class Category
{
public string Name { get; set; }
public Category? ParentCategory { get; set; }
public CategoryCollection Subcategories = new CategoryCollection();
}
However, here's where things start to get confusing. Who is responsible for ensuring the parent is in-sync with the collection? Who 'owns' that value? For instance, a new instance that hasn't yet been inserted into the RootCategories collection would also have 'null' for its parent, but it's not a root category until it's actually added.
You could invert this and only have a parent, but then you would have to do LINQ statements or similar to query who the subcategories are for any given parent.
Instead of a parent, I've also considered having an OwningCollection, then having the OwningCollection have a parent, like so...
public class CategoryCollection : ObservableCollection<Category>
{
public Category? ParentCategory { get; set; }
}
public class Category
{
public string Name { get; set; }
public CategoryCollection? OwningCollection { get; set; }
public CategoryCollection Subcategories = new CategoryCollection();
}
While this addresses the Root Category issue (it's not part of the hierarchy if OwningCollection is null) this moves ownership of the parent to the collection, not the child, which kind of makes things cleaner as if you remove it from the collection, you've essentially removed its parent, but this doesn't stop someone from adding the category to multiple collections meaning multiple parents! So that won't work and led me back to the Parent being the gospel.
Going down that road, I made the actual setter of the Parent property take care of adding and removing membership of the child collections which seemed like the right approach there.
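A parent setter that maintains collection membership might look roughly like this. It is only a sketch of the idea just described, with no database involved. Exposing the children through a ReadOnlyObservableCollection is an assumption made here so that the setter is the single way to change membership, and the sketch does not guard against cycles.

```csharp
#nullable enable
using System.Collections.ObjectModel;

public class Category
{
    private Category? _parent;
    private readonly ObservableCollection<Category> _subcategories =
        new ObservableCollection<Category>();

    public Category()
    {
        // Observable for the UI, but not directly modifiable from outside.
        Subcategories = new ReadOnlyObservableCollection<Category>(_subcategories);
    }

    public string Name { get; set; } = "";

    public ReadOnlyObservableCollection<Category> Subcategories { get; }

    public Category? ParentCategory
    {
        get => _parent;
        set
        {
            if (ReferenceEquals(_parent, value)) return;
            _parent?._subcategories.Remove(this);   // leave the old parent's collection
            _parent = value;
            _parent?._subcategories.Add(this);      // join the new parent's collection
        }
    }
}
```

With this shape a category can only ever sit in one parent's collection, which addresses the multiple-parents problem of the OwningCollection variant.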
But now throw in a database for storage and even that falls apart.
The table in the database is obviously the 'gospel' of the data, and as such, the model should always follow what the database says. As such, you can't simply set the parent like you could above, as then either the model needs to update the database, or someone listening to the model does, and depending on whether or not that db call succeeds or fails, then you need to update collection membership.
Ok, so how about removing the collections from the model and instead make them simple POCOs which are sent and received from the DAL/Storage layer via services. Now the POCOs match the database directly. Add simple caching and you can guarantee a single instance per ID.
var categoryCache = new CategoryCache();
public class Category
{
public int? Id { get; set; }
public string Name { get; set; }
public Category? ParentCategory { get; set; }
}
Requests to the DAL now simply return either a Category or an IEnumerable, like so...
var category = DAL.GetCategoryWithId(123); // Returns Category
var subcategories = DAL.GetSubcategoriesWithParentId(123); // Returns IEnumerable<Category>
Internally, they both return row(s) from the database and either update existing instances in the cache (weakly stored) or add new ones to the cache, ensuring fresh data whenever it's requested.
But this now breaks down when you actually need to show the hierarchy updating in the UI, because there are no longer any owning observable collections! If something gets deleted from the database, and thus the cache, the UI can still be hanging on to that instance in a TreeView or a list on screen, and the model can't know every place that item is used.
So it's back to implementing the collections in the POCOs, and you're right back to the problem of whose responsibility it is to update those collections. Do you allow inserting and removal directly, or do you perform all actions via service calls and they alone can update parents and collections?
OR...
Is the database not the gospel of the data, but rather what's in memory is the gospel, and the database shouldn't do anything except store things by ID. It means you'd have to remove constraints and such in the DB (such as ensuring names are unique at each level, etc.) and enforce that in the model level, but that doesn't stop someone from changing the data out from underneath you, meaning you now have to re-sync memory with the DB or vice-versa.
That's why I'm leaning more towards the 'Services are gospel' approach and everything on the model is otherwise read-only to all on the outside.
Even so, I'd still love to hear the community's thoughts on how I can improve this design.

Prevent EF default behaviour of saving duplicates via populated navigation properties

Imagine this pair of entities in Entity Framework:
public class Price
{
public virtual Document Document { get; set; }
public int DocumentId { get; set; }
//stuff
}
public class Document
{
public int DocumentId { get; set; }
//stuff
}
It's well known that if you populate the Document object in this pairing it can result in duplicates of existing objects, as explained here: Entityframework duplicating when calling savechanges - the solution is to only populate the key field before saving.
However, consider this situation in creating new objects.
Price price = repository.GetPriceById(1);
Document doc = new Document();
Right now that document has no Id, because the DocumentId field is an IDENTITY and it hasn't been sent to the database - it's just a virtual in-memory object. I can't get an Id for it unless I save it, and I don't want to do that at this point: the requirement is for a one-button save, not partial saves as the code works through. So, if I want to attach it to the Price object, I therefore have no way of making that association other than to assign it directly to the Document property. If I save it in that state, I'll get a duplicate.
So when I save it, I'm forced to do this:
repository.UpdatePrice(price.Document);
repository.SaveChanges();
price.DocumentId = price.Document.DocumentId;
price.Document = null;
repository.SaveChanges();
In which the set to null just seems ridiculous: there's no obvious reason for doing so and it feels like a future maintenance issue in waiting. All the more so because the one-click-save requirement means we have this problem all over the codebase. Is there any other way to deal with this issue?

What's the fundamental concept of EF that I'm missing?

I'm sure I am misunderstanding something fundamental about how EF5 works.
In a [previous question] I asked about how to pass values between actions in an ASP.NET MVC application and it was suggested I could use TempData as a mechanism to pass around data (in my case I've gone for the POCOs that represent my data model in EF).
My controllers in MVC are not aware of any persistence mechanism within EF. They make use of a service layer which I've called "Managers" to perform common tasks on my POCOs and read/persist them to the underlying datastore.
I'm writing a workflow to allow an "employee" of my site to cancel a "LeaveRequest". In terms of controllers and actions, there's an HttpGet action "CancelLeaveRequest" which takes the ID of the LeaveRequest in question, retrieves the LeaveRequest through the service layer, and displays some details, a warning and a confirm button. Before the controller returns the relevant View, it commits the LeaveRequest entity into TempData ready to be picked up in the next step...
The confirm button causes an HttpPost to "LeaveRequest" which then uses the LeaveRequest from TempData and a call down to the service layer to make changes to the LeaveRequest and save them back to the database with EF.
Each instance of a manager class in my code has its own EF DBContext. The controllers in MVC instantiate a manager and dispose of it within the page lifecycle. Thus, the LeaveRequest is retrieved using one instance of a DBContext, and changes are made and submitted via another instance.
My understanding is that the entity becomes "detached" when the first DBContext falls out of scope. So, when I try to commit changes against the second DBContext, I have to attach the entity to the context using DBContext.LeaveRequests.Attach()? There is an added complication that I need to use an "Employee" entity to note which employee cancelled the leave request.
My code in the service layer for cancelling the leave request reads as follows.
public void CancelLeaveRequest(int employeeId, LeaveRequest request)
{
_DBContext.LeaveRequests.Attach(request);
request.State = LeaveRequestApprovalState.Cancelled;
request.ResponseDate = DateTime.Now;
using (var em = new EmployeesManager())
{
var employee = em.GetEmployeeById(employeeId);
request.Responder = employee;
_DBContext.Entry(request.Responder).State = System.Data.EntityState.Unchanged;
}
_CommitDatabaseChanges();
}
You can see that I retrieve an Employee entity from the EmployeesManager and assign this employee as the responder to the leave request.
In my test case, the "responder" to the Leave Request is the same employee as the "requestor", another property on Leave Request. The relationships are many-to-one between leave requests and a requesting employee, and many-to-one between leave requests and a responding employee.
When my code runs in its present state, I get the following error:
AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges.
I suspect this is because EF thinks it knows about the employee in question already. The line that fails is:
_DBContext.Entry(request.Responder).State = System.Data.EntityState.Unchanged;
However, if I remove this line and don't try to be clever by telling EF not to change my employee object, the leave request gets cancelled as expected but some very strange things happen to my Employees.
Firstly, the employee who made/responded to the request is duplicated. Then, any navigation properties (like "Manager", a many-to-one relationship between an Employee and other Employees) seem to get duplicated too. I can understand that the duplication of the Manager property on Employee is because I am loading the Manager object graph in as part of GetEmployeeById, and I think I understand that the original Employee is being duplicated because, as far as the LeaveRequest DBContext is concerned, it has just appeared out of nowhere (I retrieved the Employee through a different DBContext). However, assuming those two points are correct, I'm at a loss as to how I can a) prevent the Employee and its associated object graph from being duplicated in the database and b) ensure the modified LeaveRequest is persisted correctly (which it seems to stop doing with various combinations of attaching, changing state to modified etc. on the employee and leave request).
Please can someone highlight the error of my ways?
My LeaveRequest entity:
public class LeaveRequest
{
public LeaveRequest()
{
HalfDays = new List<LeaveRequestHalfDay>();
}
public int CalculatedHalfDaysConsumed { get; set; }
public Employee Employee { get; set; }
public virtual ICollection<LeaveRequestHalfDay> HalfDays { get; set; }
public int LeaveRequestId { get; set; }
public DateTime RequestDate { get; set; }
public int ResponderId { get; set; }
public virtual Employee Responder { get; set; }
public DateTime? ResponseDate { get; set; }
public LeaveRequestApprovalState State { get; set; }
public LeaveRequestType Type { get; set; }
public ICollection<LeaveRequest> ChildRequests { get; set; }
public LeaveRequest ParentRequest { get; set; }
}
The "Employee" field (of type Employee...) is the person who submitted the request. The "Responder" is potentially a different, but could be the same, employee.
You should change your navigation properties to this:
public int ResponderId {get;set;}
public virtual Employee Responder { get; set; }
This scalar property will be auto-mapped to the navigation property by EF. Next you can simply do the following (and you don't need the Unchanged state):
var employee = em.GetEmployeeById(employeeId);
request.ResponderId = employee.Id;
See also this article about relationships in EF.

Entity Framework - Code First saving many to many relation

I have two classes:
public class Company
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<User> Users { get; set; }
}
public class User
{
public int Id { get; set; }
public string Email { get; set; }
public virtual ICollection<Company> Companies { get; set; }
}
In my MVC application, the controller gets a new Company from a post. I want to add the current user to the created Company, with something like this:
User user = GetCurrentLoggedUser();
//company.Users = new ICollection<User>(); // Users is null :/
company.Users.Add(user); // NullReferenceException
companyRepository.InsertOrUpdate(company);
companyRepository.Save();
How should it look to work properly? I haven't got that far yet, but after adding the user to the collection I expect problems with saving it to the database. Any tips on how it should look would be appreciated.
Use this approach:
public class Company
{
public int Id { get; set; }
public string Name { get; set;}
private ICollection<User> _users;
public ICollection<User> Users
{
get
{
return _users ?? (_users = new HashSet<User>());
}
set
{
_users = value;
}
}
}
HashSet is better than other collections if you also override Equals and GetHashCode in your entities. It will handle duplicates for you. Lazy collection initialization is also better. I don't remember it exactly, but I think I had some problems in one of my first EF test applications when I initialized the collection in the constructor and also used dynamic proxies for lazy loading and change tracking.
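As a rough illustration of the duplicate handling, here is a minimal sketch in plain C# with no EF involved. Basing equality on Id alone is an assumption made for this example. It only behaves well once the key has been assigned and does not change while the entity sits in the set, so new entities that all still have Id 0 would be treated as equal.

```csharp
using System.Collections.Generic;

public class User
{
    public int Id { get; set; }
    public string Email { get; set; }

    // Assumption for this sketch: two User objects are the same entity
    // when their keys match.
    public override bool Equals(object obj)
    {
        return obj is User other && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public class Company
{
    public int Id { get; set; }
    public string Name { get; set; }

    private ICollection<User> _users;

    // Lazily created HashSet: adding an equal User twice keeps one entry.
    public ICollection<User> Users
    {
        get { return _users ?? (_users = new HashSet<User>()); }
        set { _users = value; }
    }
}
```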
There are two types of entities: detached and attached. An attached entity is already tracked by the context. You usually get an attached entity from a LINQ-to-Entities query (calling Create on a DbSet only gives you a new instance, which is not tracked until you call Add or Attach). A detached entity is not tracked by the context, but once you call Attach or Add on the set to attach this entity, all related entities will be attached / added as well. The only problem you have to deal with when working with detached entities is the case where a related entity already exists in the database and you only want to create a new relation.
The main rule you must understand is the difference between the Add and Attach methods:
Add will attach all detached entities in graph as Added => all related entities will be inserted as new ones.
Attach will attach all detached entities in graph as Unchanged => you must manually say what has been modified.
You can manually set the state of any attached entity by using:
context.Entry<TEntity>(entity).State = EntityState....;
When working with detached many-to-many relations, you usually must use these techniques to build only the relations instead of inserting duplicate entities into the database.
In my own experience, working with detached entity graphs is very hard, especially after deleting relations. Because of that, I always load entity graphs from the database and manually merge changes into the attached graphs, which are able to fully track all changes for me.
Be aware that you can't mix entities from different contexts. If you want to attach an entity from one context to another, you must first explicitly detach the entity from the first one. I hope you can do it by setting its state to Detached in the first context.
In your constructor for the Company entity you can create an empty collection on the Users property.
public class Company
{
public Company() {
Users = new Collection<User>();
}
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<User> Users { get; set; }
}
As far as saving to the database is concerned, I asked a related question a few days ago and was assured that Entity Framework is able to track the changes made to related entities. Read up on that here:
Are child entities automatically tracked when added to a parent?
