I have the below code that fetches the data from service and attaches to the objects that was loaded from local database by EntityFramework 6!
We had the UserJob details before in our local database, However we decided to move it to the service later and migrated all the UserJobs to new micro service.
We are in the middle of this move and now we are saving the User jobs in both local database and in the new service!
When we read from micro service and update the local database as well, the code treats its a new job, since the JobId is no longer available in new service, but we have same creation date and user id to find the JobId available in local database! I set it now correctly.
private void IncludeUserJobOnUser(UserJob userJob, MyDBContext db)
{
var aspnetUser = new EFDataReader().GetUser(userJob.userId, db);
if (userJob != null && aspnetUser != null)
{
userJob.JobId = new EFDataReader().GetUserJobId(userJob.userId, userJob.CreationDate, db);
DbEntityEntry entry = db.Entry(userJob);
entry.State = EntityState.Unchanged;
if (aspnetUser.UserJobs.Any(x => x.JobId != userJob.JobId)) // Here is the problem
{
aspnetUser.UserJobs.Add(userJob);
}
userJob.AspNetUser = aspnetUser;
}
}
PROBLEM:
I am trying to set the navigational property with the objects retrieved from the microservice. However, EF6 always hits the database and populate them from local database instead adding the entity came from microservice.
Any idea how to solve the above?
If you mean that by accessing "aspnetUser.UserJobs" you are seeing a query kicked off to load the UserJobs as part of the Any() check, that would be Lazy Loading. If the UserJobs were already eager loaded when the data reader loaded the user then this wouldn't happen. If the user hasn't loaded the jobs, how would the code know if the user had that job or not?
Code like this can be dangerous at runtime if not careful because you are passing entity references around to be associated with other entities where the scope of the DbContext that may, or may not still be tracking these entities is not clearly defined. Provided that "db" DbContext instance is the single overarching DbContext for retrieving all entities (including UserJob) are loaded from or associated to then you should be ok, but when I read things like "objects retrieved from the microservice" that sends up a bit of a red flag as Microservices would be scoping their own DbContext so returning entities from one would not be known by the current DbContext whose entities you are associating such results with.
I'm kinda confused about recognizing a disconnected scenario and a connected scenario, I've searched the internet but I couldn't find any real answer to my questions, I'm kinda confused about entities tracking system, connected and disconnected scenarios, when should I use the Attach method and also in differences between using the Entry(entity).State = EntityState.Deleted and Remove(entity) method, and while I was searching about the last one, most of the time, they were thought identical, but it didn't match with the test that I did and what I expected
I just made a simple console app to test the differences, and how it works is that I make a person completely outside of the context instantiation scope and then pass it to the AddPerson method, because I think this makes a disconnected scenario, right? because the Remove method will complain about why I haven't attached the entity first, so I think that tells us that we're in a disconnected scenario, I'm not sure tho
This is the app:
class Program
{
static void Main(string[] args)
{
Person person = new Person()
{
PersonID = 1,
Name = "John",
Family = "Doe"
};
using (var context = new MyContext())
{
// Why this one requires attaching but the code below doesn't
context.Person.Attach(person);
context.Person.Remove(person);
context.SaveChanges();
// This method of deleting works fine without the entity being attached
context.Entry(person).State = EntityState.Deleted;
context.SaveChanges();
var people = context.Person.ToList();
foreach (var p in people)
{
Console.WriteLine($"PersonID: {p.PersonID} | Name: {p.Name} | Family: {p.Family}");
}
}
Console.ReadKey();
}
}
so for the Remove method, I have to Attach the entity first, otherwise, it will throw an exception, BUT when I use the Entry(person).state = EntityState.Deleted without attaching it, it works fine, and deletes the person, now why is that, isn't this a big difference? why is it not said anywhere, I've read some websites and some other similar questions on Stackoverflow too, but this wasn't said anywhere, and for the most part, these two were presumed to be the same, and do the same thing, yes they both delete the entity, but how can we describe what happened in this test, isn't this a difference between these two?
I have two questions but I think they're related to each other, so I'm just going to ask both of them here:
When does exactly a disconnected scenario happen, and how can I recognize it, does it depend on the scope of the context instantiation, or on retrieving the entity directly from the context and then modifying it (with no need to attach it), or using an entity from outside of the context (like passing it from another scope to our context as a parameter, as I did in my test)?
Why does the Remove method requires attaching but the EntityState.Deleted doesn't, but they're presumed identical? why should I even bother to attach the entity first, while setting the state to deleted works without needing to attach, so When to use each of them?
Basically, The way I assume that how all these work (with my current understanding of Entity Framework which is probably wrong) is that when you're in a disconnected scenario, you have to attach your entity first, but then setting the state to EntityState.Deleted doesn't need attaching, so then why does the Remove method exists at all, we could use the other way of deleting all the time.
EDIT:
Based on the second code block in the accepted answer, I wrote this test, to figure out how it's working, you said that the otherPersonReference is equal to having a Attach(Person) but when I first attach the person and try to use EntityState.Deleted It works then too, and it'll delete it, but you said that it would fail, I'm a little confused :s
class Program
{
static void Main(string[] args)
{
Person person = new Person()
{
PersonID = 3,
Name = "John",
Family = "Doe"
};
using (var context = new MyContext())
{
//var pr = context.Person.Single(p => p.PersonID == 3);
context.Person.Attach(person);
context.Entry(person).State = EntityState.Deleted;
context.SaveChanges();
}
Console.ReadKey();
}
}
if I uncomment the pr variable line and then comment the context.Person.Attach(person) then setting the EntityState to Deleted would fail and it'll throw an exception as expected
Setting context.Entry(person).State tells EF to start tracking the "person" instance if it isn't already tracking it. You would get an error if the DbContext was already tracking an instance for the same record.
For example, you can try the following:
var person = new Person { Id = 100 }; // assume an existing record with ID = 100;
using (var context = new AppDbContext())
{
context.Entry(person).State = EntityState.Deleted;
context.SaveChanges();
}
This works as you expect... However, if you were to have code that did this:
var person = new Person { Id = 100 }; // assume an existing record with ID = 100;
using (var context = new AppDbContext())
{
var otherPersonReference = context.Persons.Single(x => x.Id == 100);
context.Entry(person).State = EntityState.Deleted;
context.SaveChanges();
}
Your attempt to use context.Entry(person).State = EntityState.Deleted; would fail because the context is now already tracking an entity with that ID. It's the same behaviour as if you were to try and call Attach(person).
When dealing with short-lived DbContexts (such as when using using() blocks) and single entity operations, it can be reasonably safe to work with detached entity references, but this will get a lot more "iffy" once you start dealing with multiple possible entity references (I.e. working with lists or objects sharing references etc.) and/or calls across a DbContext which may already be tracking entity references from previous operations / iterations.
Edit: Working with detached references can be problematic and you need to take extra care when doing so. My general recommendation is to avoid it wherever possible. The approach I recommend when dealing with entities is that you should never pass an entity outside of the scope of the DbContext that read it. This means leveraging a ViewModel or DTO to represent entity-sourced details outside the scope of the DbContext. A detached EF Entity
can certainly work, but with a DTO it is explicitly clear that the data cannot be confused with a tracked entity. When it comes to performing operations like a Delete, you only really need to pass the ID.
For example, leveraging Automapper to help translate between DTOs and entities:
PersonDTO AddPerson(PersonDTO details)
{
if(details == null)
throw new ArgumentNullException("details");
using (var context = new AppDbContext())
{
// TODO: Add validations such as verifying unique name/dob etc.
var person = Mapper.Map<Person>(details); // Creates a new Person.
context.Persons.Add(person);
context.SaveChanges();
details.PersonId = person.PersonId; // After SaveChanges we can retrieve the new row's ID.
return details;
}
}
PersonDTO UpdatePerson(PersonDTO details)
{
if(details == null)
throw new ArgumentNullException("details");
using (var context = new AppDbContext())
{
var existingPerson = context.Persons.Single(x => x.PersonId == details.PersonId); // Throws if we pass an invalid PersonId.
Mapper.Map(details, existingPerson); // copies values from our DTO into Person. Mapping is configured to only copy across allowed values.
context.SaveChanges();
return Mapper.Map<PersonDTO>(existingPerson); // Return a fresh, up to date DTO of our data record.
}
}
void DeletePerson(int personId)
{
using (var context = new AppDbContext())
{
var existingPerson = context.Persons.SingleOrDefault(x => x.PersonId == details.PersonId);
if (existingPerson == null)
return; // Nothing to do.
// TODO: Verify whether the current user should be able to delete this person or not. (I.e. based on the state of the person, is it in use, etc.)
context.Persons.Remove(existingPerson);
context.SaveChanges();
}
}
In this example a Person entity does not ever leave the scope of a DbContext. The trouble with detached entities is that whenever passing an entity around to other methods and such, those methods might assume they are working with attached, complete or complete-able (i.e. through lazy loading) entities. Was the entity loaded from a DbContext that is still "alive" so if if the code wants to check person.Address that data is either eager loaded and available, or lazy-loadable? vs. #null which could mean the person does not have an address, or that without a DbContext or lazy loading we cannot determine whether it does or not. As a general rule if a method is written to accept an entity, it should always expect to have a complete, or complete-able version of that entity. Not a detached "maybe complete, maybe not" instance, not a "new"ed up instance of a class that has some arbitrary values populated, (rather than an entity representing a data row) and not a deserialized block of JSON coming from a web client. All of those can be typed as a "Person" entity, but not a Person entity.
Edit 2: "Complete" vs. "Complete-able"
A Complete entity is an entity that has all related entities eager loaded. Any method that accepts a Person should be able to access any property, including navigation properties, and receive the true value. If the Person has an Address, then a #null address should only ever mean that person does not have an address (if that is valid), not "that person does not have an address, or it just wasn't loaded." This also goes for cases where you might have a method that accepts an entity, which you haven't loaded, but want to substitute with a entity class populated with an ID and whatever data you might have on hand. That incomplete "entity" could find itself sent to other methods that expect a more complete entity. Methods should never need to guess at what they receive.
A Complete-able entity is an entity where any related entities within that entity can be lazy loaded if accessed. The consuming method doesn't need to determine whether properties are available or not, it can access Person.Address and it will always get an Address if that person is supposed to have one, whether the caller remembered to eager load it or not.
Where methods are using tightly scoped DbContexts (using()) if you return an entity then there is no way that you can guarantee later down the call-chain that this entity is complete-able. Today you can make the assurance that all properties are eager-loaded, but tomorrow a new relationship could be added leaving a navigation property somewhere within the object graph that might not be remembered to be eager-loaded.
Eager loading is also expensive, given to ensure an entity is "complete", everything needs to be loaded, whether the consumers ever need it or not. Lazy Loading was introduced to facilitate this, however, in many cases this is extremely expensive leading to a LOT of chatter with the database and the introduction of performance costs when the model evolves. Elements like serialization (a common problem in web applications) touch every property by default leading to numerous lazy load calls for every entity sent.
DTOs/ViewModels are highly recommended when data needs to leave the scope of a DbContext as it ensures only the data a consumer needs is loaded, but equally importantly, as a model may evolve, you avoid lazy loading pitfalls. Serializing a DTO rather than an Entity will ensure those new relationships don't come into play until a DTO is updated to actually need that data.
How would you Upsert without select? the upsert would be a collection of entities received by a method which contains DTOs that may not be available in the database so you can NOT use attach range for example.
One way theoretically is to load the ExistingData partially with a select like dbContext.People.Where(x => x exists in requested collection).Select(x => new Person { Id = x.Id, State = x.State }).ToList() which just loads a part of the entity and not the heavy parts. But here if you update one of these returned entityItems from this collection it will not update because of the new Person its not tracking it and you also cannot say dbContext.Entry<Person>(person).State = Modified because it will throw an error and will tell you that ef core is already "Tracking" it.
So what to do.
One way would be to detach all of them from the ChangeTracker and then do the state change and it will do the update but not just on one field even if you say dbContext.Entry<Person>(person).Property(x => x.State).Modified = true. It will overwrite every fields that you haven't read from the database to their default value and it will make a mess in the database.
The other way would be to read the ChangeTracker entries and update them but it will also overwrite and it will consider like everything is chanaged.
So techinically I don't know how ef core can create the following SQL,
update People set state = 'Approved' where state != 'Approved'
without updating anything else. or loading the person first completely.
The reason for not loading your data is that you may want to update like 14000 records and those records are really heavy to load because they contain byte[] and have images stored on them for example.
BTW the lack of friendly documentation on EFCore is a disaster compare to Laravel. Recently it has cost us the loss of a huge amount of data.
btw, the examples like the code below will NOT work for us because they are updating one field which they know that it exists in database. But we are trying to upsert a collection which some of those DTOs may not be available in the database.
try
{
using (var db = new dbContext())
{
// Create new stub with correct id and attach to context.
var entity = new myEntity { PageID = pageid };
db.Pages.Attach(entity);
// Now the entity is being tracked by EF, update required properties.
entity.Title = "new title";
entity.Url = "new-url";
// EF knows only to update the propeties specified above.
db.SaveChanges();
}
}
catch (DataException)
{
// process exception
}
Edit: The used ef core version is #3.1.9
Fantastic, I found the solution (You need to also take care about your unit tests).
Entityframework is actually working fine it can be just a lack of experience which I'm documenting here in case anyone else got into the same issue.
Consider that we have an entity for Person which has a profile picture saved as Blob on it which causes that if you do something like the following for let's say 20k people the query goes slow even when you've tried to have enough correct index on your table.
You want to do this query to update these entities based on a request.
var entityIdsToUpdate = request.PeopleDtos.Select(p => p.Id);
var people = dbContext.People.Where(x => entityIdsToUpdate.Contains(x.Id)).ToList();
This is fine and it works perfectly, you will get the People collection and then you can update them based on the given data.
In these kind of updates you normally will not need to update images even if you do, then you need to increase the `TimeOut1 property on your client but for our case we did not need to update the images.
So the above code will change to this.
var entityIdsToUpdate = request.PeopleDtos.Select(p => p.Id);
var people = dbContext.People
.Select(p => new Person {
Id = p.Id,
Firstname = p.Firstname,
Lastname = p.Lastname,
//But no images to load
})
.Where(p => entityIdsToUpdate.Contains(p.Id)).ToList();
But then with this approach, EntityFramework will lose the track of your entities.
So you need to attach it like this and I will tell you how NOT to attach it.
This is the correct way for a collection
dbContext.People.AttachRange(people); //These are the people you've already queried
Now DO NOT do this, you may want to do this because you get an error from the first one from EntityFramework which says the entity is already being tracked, trust it because it already is. I will explain after the code.
//Do not do this
foreach(var entry in dbContext.ChangeTracker.Entries())
{
entry.State = EntityState.Detached;
}
//and then on updating a record you may write the following to attach it back
dbContext.Entry(Person).State = EntityState.Modified;
The above code will cause EntityFramework not to follow the changes on the entities anymore and by the last line you will tell it literally everything edited or not edited is changed and will cause you to LOSE your unedited properties like the "image".
Note: Now what can u do by mistake that even messes up the correct approach.
Well since you are not loading your whole entity, you may assume that it is still fine to assign values to the unloaded ones even if the value is not different than the one in the database. This causes entity framework to assume that something is changed and if you are setting a ModifiedOn on your records it will change it for no good reason.
And now about testing:
While you test, you may get something out from database and create a dto from that and pass the dto with the same dbContext to your SystemUnderTest the attach method will throw an error here which says this entity is already bein tracked because of that call in your test method. The best way would be create a new dbContext for each process and dispose them after you are done with them.
BTW in testing it may happen that with the same dbContext you update an entity and after the test you want to fetch if from the database. Please take note that this one which is returning to you is the "Cached" one by EntityFramework and if you have fetched it in the first place not completely like just with Select(x => ) then you will get some fields as null or default value.
In this case you should do DbContext.Entry(YOUR_ENTRY).Reload().
It is a really complete answer it may not directly be related to the question but all of the things mentioned above if you don't notice them may cause a disaster.
I am trying to attach an object that exists in the database to new DbContext, modify a single value (property: SubmissionGrpID, which is a foreign key) and then save the changes to the database. I am trying to do this without having to load the whole entity into the DbContext from the database.
I have already seen the question: How to update only one field using Entity Framework? but the answers did not help me.
The property: ID in the PnetProdFile table is the only column used in the primary key for that table. The code I am using is below:
using (SomeEntities db = new SomeEntities())
{
foreach (var id in FileIds)
{
PnetProdFile file = new PnetProdFile();
file.ID = id; // setting the primary key
file.SubmissionGrpID = 5; // setting a foreign key
db.PnetProdFiles.Attach(file);
var entry = db.Entry(file);
entry.Property(e => e.SubmissionGrpID).IsModified = true;
}
db.SaveChanges();
}
When I run the code, I get the following exception:
Validation failed for one or more entities. See 'EntityValidationErrors' property for more details.
Then, after adding in the following line of code before I SaveChanges():
var errors = db.GetValidationErrors();
I get all the Not Null fields as required values for the PnetProdFile table in the Validation Errors. One way to prevent validation errors is to set
db.Configuration.ValidateOnSaveEnabled = false;
but then that prevents validation on the entire entity, even on properties that could have been updated.
So, to summarise, I am trying to update an object which already exists in the database without loading the related record from the database, and only get validation errors on properties that have been changed. How can this be done?
By default EF validates all Added and Modified entities. You can modify this behavior by overwriting the DbContext.ShouldValidateEntity method where you would return false for entities you don't want to be validated.
Attached rather than loaded/read Objects default to UnChanged
If you are using Rowversion/Timestamp optimistic locking then you must read the object first.
But if you have a vanilla object....
Context.Entry(poco).State = state; // tell ef what is going on.
Context.SaveChanges();
// Detached = 1,
// Unchanged = 2,
// Added = 4,
// Deleted = 8,
// Modified = 16,
There is also the option "AddOrdUpdate"
Context.Set<TPoco>().AddOrUpdate(poco);
Context.SaveChanges();
This option was originally intended for seed type scenarios.
People seem to be using it more in everyday cases.
I only use it for seeding. Do a quick search on pros and cons of AddOrdUpdate before
you use it in everyday production scenarios.
More on working with detached entities here
I am using EF 4.5 and have to fetch some records and update a flag of the fetched record. Can I do this in same context?
using (var context = new MyDBEntities())
{
var qry = from a in context.Table1
select a;
foreach(var item in qry)
{
// Logic to fill custom entity (DTO) from qry
item.Fetched = 2; // Changing the status of fetched records in DB
context.Table1.Add(item);
}
context.SaveChanges();
}
You can. You have to, even. The context must have the objects you modify in its change tracker. That's what happens in the query (qry). Then you modify them, the change tracker detects changes when SaveChanges() is called and update statements are generated and send to the database.
The only thing is, you should not do
context.Table1.Add(item);
because that creates new Table1 records, it does not update the existing ones. Just remove the statement.
(By the way, there are other ways to make a context start change tracking, viz. attaching or adding objects to a context).