Prevent EF default behaviour of saving duplicates via populated navigation properties - c#

Imagine this pair of entities in Entity Framework:
public class Price
{
public virtual Document Document { get; set; }
public int DocumentId { get; set; }
//stuff
}
public class Document
{
public int DocumentId { get; set; }
//stuff
}
It's well known that if you populate the Document object in this pairing it can result in duplicates of existing objects, as explained here: Entityframework duplicating when calling savechanges - the solution is to only populate the key field before saving.
However, consider this situation in creating new objects.
Price price = repository.GetPriceById(1);
Document doc = new Document();
Right now that document has no Id, because the DocumentId field is an IDENTITY and it hasn't been sent to the database - it's just a virtual in-memory object. I can't get an Id for it unless I save it, and I don't want to do that at this point: the requirement is for a one-button save, not partial saves as the code works through. So, if I want to attach it to the Price object, I therefore have no way of making that association other than to assign it directly to the Document property. If I save it in that state, I'll get a duplicate.
So when I save it, I'm forced to do this:
repository.UpdatePrice(price.Document);
repository.SaveChanges();
price.DocumentId = price.Document.DocumentId;
price.Document = null;
repository.SaveChanges();
Setting the property to null here just seems ridiculous: there's no obvious reason for doing so, and it feels like a future maintenance issue in waiting. All the more so because the one-click-save requirement means we have this problem all over the codebase. Is there any other way to deal with this issue?
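For reference, a minimal sketch of the single-save case. It assumes the Price was loaded by, and is still tracked by, the same DbContext that performs the save, which may not be how the repository in the question works. PricingContext, PriceService and the PriceId key are hypothetical; the entity classes otherwise follow the question. Under that assumption, EF6 treats a new object reached through a tracked entity's navigation property as Added, inserts it once, and writes the generated key into the foreign key:

```csharp
using System.Data.Entity;

public class Document
{
    public int DocumentId { get; set; }
}

public class Price
{
    public int PriceId { get; set; }              // assumed key; not shown in the question
    public int DocumentId { get; set; }
    public virtual Document Document { get; set; }
}

// Hypothetical context; the question hides its context behind a repository.
public class PricingContext : DbContext
{
    public DbSet<Price> Prices { get; set; }
    public DbSet<Document> Documents { get; set; }
}

public static class PriceService
{
    public static void AssignNewDocument(PricingContext context, int priceId)
    {
        // Find returns an entity that is tracked by this context (or null).
        Price price = context.Prices.Find(priceId);
        if (price == null)
        {
            return;
        }

        // A brand-new Document: no key yet, not tracked yet.
        price.Document = new Document();

        // SaveChanges runs change detection, discovers the new Document through
        // the navigation property and marks it Added. EF inserts it, reads the
        // IDENTITY value back and then stores it in price.DocumentId.
        context.SaveChanges();
    }
}
```

If the Price or the Document comes from a different context instance, this does not apply as written, and the entity has to be attached in a known state first.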

Related

EF Core fix-up when querying subset of columns

From the documentation:
Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.
Entities setup:
public class Page{
public Page () {
Event = new HashSet<Event>();
}
[Key]
public int Id { get; set; }
public string Title { get; set; }
public string Content { get; set; } // don't want to retrieve, too large
public ICollection<Event> Event { get; set; }
}
public class Event{
[Key]
public int Id { get; set; }
public string Name { get; set; }
public string Type { get; set; }
public Page Page { get; set; }
}
The context is set up with a One-To-Many relationship.
These are the queries I run, one after the other:
var pages = _dbContext.Page.Select(page => new Page
{
Id = page.Id,
Title = page.Title
}).ToList();
var events = _dbContext.Event.ToList();
I expect each Page to have the Events collection populated (and vice-versa for Event with the Page reference), but the fix-up doesn't happen (Page in Event is null, and Event in Page is null).
If I replace the first query by this, then the fix-up works:
var pages = _dbContext.Page.ToList();
So it seems that with projection the fix-up doesn't happen. The reason I split this in 2 queries was to avoid using something like Include which would make a huge join and duplicate plenty of data.
Is there any way around that? Do I need to do the fix-up manually myself?
When you project into a new instance yourself in the query, EF Core does not track the objects coming out of the query, even if their type is an entity type that is part of the model. This is by design.
Since the Pages are not tracked in your case, the Events have nothing to fix up against. Hence you are seeing null navigation properties.
This behavior was the same in the previous version (EF6). The main reason for not tracking is that, as in your case, you are creating a new Page without loading Content. If the new entity were tracked, it would have Content set to null (default(string)). If you then marked the whole entity as modified, SaveChanges would end up writing that null value into the Content column in the database, which would cause data loss. Because a minor mistake could cause a major problem such as data loss, EF Core does not track such entities by default. Another reason is weak entity types (or complex types in EF6), which share a CLR type with other entities but are uniquely identified through the parent type: if you project out such an entity, EF Core cannot figure out which entity type it is without the parent information.
You could put those entities into the change tracker by calling the Attach method, which will cause fix-up and give you the desired behavior. Be careful not to save them.
In general, the scenario you want is useful. This issue is tracking support for that in EF Core.
I don't think that should work. Did you verify that this behavior worked in previous versions of Entity Framework? Since you aren't pulling out the full entity, only some of its properties, and are then passing them into a new object, you are essentially just selecting properties and creating a new, untracked entity.
If you would like it to be attached, you can manually call the Attach method after selecting your pages:
var pages = _dbContext.Page.Select(page => new Page
{
Id = page.Id,
Title = page.Title
}).ToList();
pages.ForEach(p => _dbContext.Page.Attach(p));
Keep in mind that if you mark these entities as modified and call SaveChanges after this, you will overwrite the unloaded properties, so only use this in read-only (get) methods.

Entity Framework - Non Key Relationships

Problem
I have a situation whereby I need to use Entity Framework 6, Code First, with a legacy database structure which cannot be changed. The database has a very generic table which stores text based data alongside some non key data which can be used to relate the record back to another table.
To illustrate:
Assume the Notes table has a model as follows:
[Table("Notes")]
public class Notes
{
[Key]
public int RecordId { get; set; }
[Required]
public string RelatedTableName { get; set; }
[Required]
public int RelatedTableRecordId { get; set; }
[Required]
public string NotesText { get; set; }
}
I then have another model which could look like so:
[Table("Drivers")]
public class Drivers
{
[Key]
public int RecordId { get; set; }
[Required]
public string DriverName { get; set; }
public ICollection<Notes> DriverNotes { get; private set; }
}
There is no foreign key which links the tables. The Drivers table is linked to the Notes table by way of the RelatedTableName and RelatedTableRecordId fields.
I do not have a problem reading data from the database and hydrating the models using entity framework.
The problem I have is that I want to be able to save a new Driver and its newly created Notes in one transaction and have the RelatedTableRecordId field set to the primary key of the Driver.
If a foreign key existed entity framework would know to back fill the property but in this case it doesn't know about the relationship.
Key Points
Database Structure must not change.
Must use Entity Framework 6 Code First
Must be able to use an Execution Strategy.
Require a relationship between non key fields.
Need to be able to persist all data in a single transaction.
What I've Tried
I had a similar issue with Audit type data and solved it by doing something similar to the following (note that this is very pseudo here):
public override int SaveChanges()
{
    int changes = 0;
    //Disable the current execution strategy as the default ones do not support user instantiated transactions.
    this.ContextConfiguration.SuspendExecutionStrategy();
    try
    {
        //Wrap a whole transaction inside an execution strategy so that auditing can be combined with regular saving of changes.
        this.ExecutionStrategy.Execute(
            () =>
            {
                using (var transaction = this.Database.BeginTransaction())
                {
                    //Reset the change count so that it doesn't increase each time the transaction fails.
                    changes = 0;
                    //Remove any audit records created by previous failed transactions.
                    this.AuditTableChanges.Local.Clear();
                    //Evaluate the change tracker to identify entities which will potentially require an audit trail.
                    var insertedEntities = this.ChangeTracker.Entries().Where(entryEntity => entryEntity.State == EntityState.Added).ToList();
                    //Save all changes to get identities.
                    changes = base.SaveChanges();
                    //Create the audit trail for inserted entities. This step must occur after the initial call to SaveChanges() so that the identities are set.
                    foreach (DbEntityEntry entryEntity in insertedEntities)
                    {
                        //For each inserted record, get the audit record entries and add them
                        foreach (AuditTableChange auditTableChange in GetAuditRecords(entryEntity, AuditTableChangeType.Insert).Result)
                            this.AuditTableChanges.Add(auditTableChange);
                    }
                    //Save the audit trail for inserted entities.
                    changes += base.SaveChanges();
                    //Commit all changes to the database
                    transaction.Commit();
                }
            });
    }
    finally
    {
        //Re-enable the execution strategy so that other calls can benefit from the retry policy.
        this.ContextConfiguration.UnSuspendExecutionStrategy();
    }
    return changes;
}
This worked fine for the Audit data as the implementation was hidden away in the framework. I do not want my development team to have to do all of the above each time they persist records.
In its simplistic form this is as much as I'd want people to be doing:
public void CreateDriver()
{
using (MyContext context = new MyContext())
{
Drivers driver = new Drivers();
driver.DriverName = "Joe Bloggs";
Notes driverNote = new Notes();
driverNote.RelatedTableName = "Drivers";
driverNote.NotesText = "Some very long text";
driver.DriverNotes.Add(driverNote);
context.Drivers.Add(driver);
context.SaveChanges();
}
}
In a way I want a foreign key which exists in code but not in the database so that entity framework knows to fill in the RelatedTableRecordId field. I've read some articles on hacking the EDMX but this project is purely Code First only.
There are older questions on stack overflow which are similar but relate to older versions of entity framework and don't help much or have as much detail as the above.
I'm hoping that someone may have experienced a similar problem and has an answer which may involve perhaps some custom mapping/metadata or some overrides to entity framework logic.
Any help would be appreciated.
Thanks,
Greg

What's the fundamental concept of EF that I'm missing?

I'm sure I am misunderstanding something fundamental about how EF5 works.
In a [previous question] I asked about how to pass values between actions in an ASP.NET MVC application and it was suggested I could use TempData as a mechanism to pass around data (in my case I've gone for the POCOs that represent my data model in EF).
My controllers in MVC are not aware of any persistence mechanism within EF. They make use of a service layer which I've called "Managers" to perform common tasks on my POCOs and read/persist them to the underlying datastore.
I'm writing a workflow to allow an "employee" of my site to cancel a "LeaveRequest". In terms of controllers and actions, there's an HttpGet action "CancelLeaveRequest" which takes the ID of the LeaveRequest in question, retrieves the LeaveRequest through the service layer, and displays some details, a warning and a confirm button. Before the controller returns the relevant View, it commits the LeaveRequest entity into TempData ready to be picked up in the next step...
The confirm button causes an HttpPost to "LeaveRequest" which then uses the LeaveRequest from TempData and a call down to the service layer to make changes to the LeaveRequest and save them back to the database with EF.
Each instance of a manager class in my code has its own EF DBContext. The controllers in MVC instantiate a manager and dispose of it within the page lifecycle. Thus, the LeaveRequest is retrieved using one instance of a DBContext, and changes are made and submitted via another instance.
My understanding is that the entity becomes "detached" when the first DBContext falls out of scope. So, when I try to commit changes against the second DBContext, I have to attach the entity to the context using DBContext.LeaveRequests.Attach()? There is an added complication that I need to use an "Employee" entity to note which employee cancelled the leave request.
My code in the service layer for cancelling the leave request reads as follows.
public void CancelLeaveRequest(int employeeId, LeaveRequest request)
{
_DBContext.LeaveRequests.Attach(request);
request.State = LeaveRequestApprovalState.Cancelled;
request.ResponseDate = DateTime.Now;
using (var em = new EmployeesManager())
{
var employee = em.GetEmployeeById(employeeId);
request.Responder = employee;
_DBContext.Entry(request.Responder).State = System.Data.EntityState.Unchanged;
}
_CommitDatabaseChanges();
}
You can see that I retrieve an Employee entity from the EmployeesManager and assign this employee as the responder to the leave request.
In my test case, the "responder" to the Leave Request is the same employee as the "requestor", another property on Leave Request. The relationships are many-to-one between leave requests and a requesting employee, and many-to-one between leave requests and a responding employee.
When my code runs in its present state, I get the following error:
AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges.
I suspect this is because EF thinks it knows about the employee in question already. The line that fails is:
_DBContext.Entry(request.Responder).State = System.Data.EntityState.Unchanged;
However, if I remove this line and don't try to be clever by telling EF not to change my employee object, the leave request gets cancelled as expected but some very strange things happen to my Employees.
Firstly, the employee who made/responded to the request is duplicated. Then, any navigation properties (like "Manager", a many-to-one relationship between an Employee and other Employees) seem to get duplicated too. I can understand that the duplication of the Manager property on Employee is because I am loading the Manager object graph in as part of GetEmployeeById, and I think I understand that the original Employee is being duplicated because, as far as the LeaveRequest DBContext is concerned, it has just appeared out of nowhere (I retrieved the Employee through a different DBContext). However, assuming those two points are correct, I'm at a loss as to how I can a) prevent the Employee and its associated object graph from being duplicated in the database and b) ensure the modified LeaveRequest is persisted correctly (which it seems to stop doing with various combinations of attaching, changing state to modified, etc. on the employee and leave request).
Please can someone highlight the error of my ways?
My LeaveRequest entity:
public class LeaveRequest
{
public LeaveRequest()
{
HalfDays = new List<LeaveRequestHalfDay>();
}
public int CalculatedHalfDaysConsumed { get; set; }
public Employee Employee { get; set; }
public virtual ICollection<LeaveRequestHalfDay> HalfDays { get; set; }
public int LeaveRequestId { get; set; }
public DateTime RequestDate { get; set; }
public int ResponderId { get; set; }
public virtual Employee Responder { get; set; }
public DateTime? ResponseDate { get; set; }
public LeaveRequestApprovalState State { get; set; }
public LeaveRequestType Type { get; set; }
public ICollection<LeaveRequest> ChildRequests { get; set; }
public LeaveRequest ParentRequest { get; set; }
}
The "Employee" field (of type Employee...) is the person who submitted the request. The "Responder" is potentially a different, but could be the same, employee.
You should change your navigation properties to this:
public int ResponderId {get;set;}
public virtual Employee Responder { get; set; }
This scalar property will be auto-mapped to the navigation property by EF. Next you can simply do the following (and you don't need the Unchanged state):
var employee = em.GetEmployeeById(employeeId);
request.ResponderId = employee.Id;
See also this article about relationships in EF.
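Put together, the service method might look like the sketch below. It assumes the manager's context is a DbContext exposing a LeaveRequests set, and it trims the entity down to the properties involved; LeaveContext, LeaveRequestsManager and the enum values are hypothetical. Because only the scalar ResponderId is set, no Employee instance loaded by another context ever enters this context's object graph.

```csharp
using System;
using System.Data.Entity;

// Enum members are assumed for this sketch; only Cancelled appears in the question.
public enum LeaveRequestApprovalState { Pending, Approved, Cancelled }

// Trimmed version of the entity from the question.
public class LeaveRequest
{
    public int LeaveRequestId { get; set; }
    public int ResponderId { get; set; }
    public DateTime? ResponseDate { get; set; }
    public LeaveRequestApprovalState State { get; set; }
}

// Hypothetical context name.
public class LeaveContext : DbContext
{
    public DbSet<LeaveRequest> LeaveRequests { get; set; }
}

public class LeaveRequestsManager : IDisposable
{
    private readonly LeaveContext _DBContext = new LeaveContext();

    public void CancelLeaveRequest(int employeeId, LeaveRequest request)
    {
        // Attach tracks the detached request as Unchanged.
        _DBContext.LeaveRequests.Attach(request);

        request.State = LeaveRequestApprovalState.Cancelled;
        request.ResponseDate = DateTime.Now;

        // Set the foreign key only; no Employee object is needed here.
        request.ResponderId = employeeId;

        // Change detection picks up the modified properties and issues an UPDATE.
        _DBContext.SaveChanges();
    }

    public void Dispose()
    {
        _DBContext.Dispose();
    }
}
```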

Modifying specific item in nested collection in RavenDB

I have something that looks like the following document structure:
public class Document {
public int Id { get; set; }
public string Name { get; set; }
public List<Property> Properties { get; set; }
}
public class Property {
public int Id { get; set; }
public string Name { get; set; }
}
Now, querying and modifying Documents is easy. But I need to access specific Property-instances in my app, and it seems that they won't automatically get an ID like the root document does. And it seems this is by design in RavenDB.
It might be me being stuck in the relational world, but what I'd like to do is basically retrieve the correct document, then get the right property, modify it and save the document again.
from property in document.Properties
where property.Id == someId
select property
...which will obviously not work as long as
RavenDB does not auto-set the Id field or
I don't make any ID-generating mechanism myself
Am I heading completely the wrong way, or does what I'm trying to do make sense? Should I move the Properties out to being a root node and make some sort of reference to them in Document? Or should I just do something like this when inserting properties:
Retrieve the document with the list of properties
Get Properties[last]'s ID
Add 1 and insert new ID myself in new properties
?
This would, however, require at least two requests to the database (one to get the existing properties, one to save the changes), which just seems dirty and unnecessary for such a seemingly simple task.
I've found a lot of sort-of-similar posts, but none of them really answers this AFAIK.
Check to see how we do that in RaccoonBlog:
https://github.com/ayende/RaccoonBlog/blob/master/RaccoonBlog.Web/Infrastructure/Tasks/AddCommentTask.cs
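One way to avoid the extra round-trip is to keep a counter inside the parent document, so the next identifier travels with the document that has already been loaded. The sketch below only illustrates that idea; it is not copied from the linked code, and LastPropertyId, AddProperty and FindProperty are names invented here.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Property
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Document
{
    public Document()
    {
        Properties = new List<Property>();
    }

    public int Id { get; set; }
    public string Name { get; set; }
    public List<Property> Properties { get; set; }

    // Stored as part of the document, so it is saved and loaded with it.
    public int LastPropertyId { get; set; }

    // Assigns the next local Id and appends the property to the list.
    public Property AddProperty(string name)
    {
        var property = new Property { Id = ++LastPropertyId, Name = name };
        Properties.Add(property);
        return property;
    }

    // Returns the property with the given local Id, or null if there is none.
    public Property FindProperty(int propertyId)
    {
        return Properties.FirstOrDefault(p => p.Id == propertyId);
    }
}
```

With this shape, the document is loaded once, AddProperty or FindProperty is called on it, and saving the document persists both the list and the counter. Because the counter only ever increases, an Id is not reused after a property is removed.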

Entity Framework - Code First saving many to many relation

I have two classes:
public class Company
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<User> Users { get; set; }
}
public class User
{
public int Id { get; set; }
public string Email { get; set; }
public virtual ICollection<Company> Companies { get; set; }
}
In my MVC application, the controller receives a new Company from a POST. I want to add the current user to the created Company, something like this:
User user = GetCurrentLoggedUser();
//company.Users = new ICollection<User>(); // Users is null :/
company.Users.Add(user); // NullReferenceException
companyRepository.InsertOrUpdate(company);
companyRepository.Save();
What should this look like to work properly? I haven't got that far yet, but after adding the user to the collection I expect problems with saving it to the database. Any tips on how it should look would be appreciated.
Use this approach:
public class Company
{
public int Id { get; set; }
public string Name { get; set;}
private ICollection<User> _users;
public ICollection<User> Users
{
get
{
return _users ?? (_users = new HashSet<User>());
}
set
{
_users = value;
}
}
}
HashSet is better than other collections if you also override Equals and GetHashCode in your entities: it will handle duplicates for you. Lazy collection initialization is also better. I don't remember it exactly, but I think I had some problems in one of my first EF test applications when I initialized the collection in the constructor and also used dynamic proxies for lazy loading and change tracking.
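To illustrate the point about HashSet, a small sketch follows. It assumes equality is based on the Id property, which is a choice made for this example and only behaves well once the entity has a real key value that no longer changes.

```csharp
using System.Collections.Generic;

public class User
{
    public int Id { get; set; }
    public string Email { get; set; }

    // Assumption for this sketch: two User objects are equal when their Ids match.
    public override bool Equals(object obj)
    {
        var other = obj as User;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public class Company
{
    private ICollection<User> _users;

    public int Id { get; set; }
    public string Name { get; set; }

    // Created on first access, so Users is never null.
    public ICollection<User> Users
    {
        get { return _users ?? (_users = new HashSet<User>()); }
        set { _users = value; }
    }
}
```

Adding two User instances with the same Id to company.Users leaves a single element in the set, because the second Add is silently ignored.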
There are two types of entities: detached and attached. An attached entity is already tracked by the context. You usually get an attached entity from a LINQ to Entities query. (Calling Create on the DbSet gives you a new instance, but that instance is not added or attached until you do so yourself.) A detached entity is not tracked by the context, but once you call Attach or Add on the set to attach this entity, all related entities will be attached / added as well. The only problem you have to deal with when working with detached entities is the case where a related entity already exists in the database and you only want to create a new relation.
The main rule you must understand is the difference between the Add and Attach methods:
Add will attach all detached entities in graph as Added => all related entities will be inserted as new ones.
Attach will attach all detached entities in graph as Unchanged => you must manually say what has been modified.
You can manually set state of any attached entity by using:
context.Entry<TEntity>(entity).State = EntityState....;
When working with detached many-to-many relations, you usually must use these techniques to build only the relations instead of inserting duplicate entities into the database.
In my own experience, working with detached entity graphs is very hard, especially after deleting relations. Because of that I always load entity graphs from the database and manually merge changes into the attached graphs, which are able to fully track all changes for me.
Be aware that you can't mix entities from different contexts. If you want to attach an entity from one context to another, you must first explicitly detach it from the first one. I hope you can do it by setting its state to Detached in the first context.
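For the scenario in the question (a new Company linked to an existing User), those rules might be applied as in the sketch below. CompanyContext and CompanyService are hypothetical names, the entities are simplified, and the code assumes the user instance is not currently tracked by any other context.

```csharp
using System.Collections.Generic;
using System.Data.Entity;

public class User
{
    public int Id { get; set; }
    public string Email { get; set; }
    public virtual ICollection<Company> Companies { get; set; }
}

public class Company
{
    public Company()
    {
        Users = new HashSet<User>();
    }

    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<User> Users { get; set; }
}

// Hypothetical context name.
public class CompanyContext : DbContext
{
    public DbSet<Company> Companies { get; set; }
    public DbSet<User> Users { get; set; }
}

public static class CompanyService
{
    public static void CreateCompanyFor(CompanyContext context, Company company, User existingUser)
    {
        // Attach first: the user is tracked as Unchanged and will not be inserted again.
        context.Users.Attach(existingUser);

        // Build the relation between the new company and the existing user.
        company.Users.Add(existingUser);

        // Add marks the entities that are not yet tracked (here: the company) as Added.
        context.Companies.Add(company);

        // Inserts the company and the join-table row; the user row is left untouched.
        context.SaveChanges();
    }
}
```

Calling Add on the company without attaching the user first would instead mark the user as Added and insert a second copy of it.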
In your constructor for the Company entity you can create an empty collection on the Users property.
public class Company
{
public Company() {
Users = new Collection<User>();
}
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<User> Users { get; set; }
}
As far as saving to the database is concerned, I asked a related question a few days ago and was assured that Entity Framework is able to track the changes made to related entities. Read up on that here:
Are child entities automatically tracked when added to a parent?
