What persistence behavior should I expect in this scenario - c#

I have a large block of code with several nested loops which ultimately needs to update many records in the database. I am trying to minimize the number of SaveChanges() calls to Entity Framework. Also note that we are using the repository pattern.
Essentially I'm iterating a collection and, on each iteration, need to update both the item from the collection and another object that is retrieved from the db based on that item.
Sample code:
foreach (var outer in outerList)
{
    obj = unit.GetRepository<MyObj>().Get(s =>
        s.id == myId
    ).SingleOrDefault();
    obj.value += outer.value;
    outer.objId = obj.objId;
    unit.GetRepository<MyOuterObj>().Update(outer);
    unit.GetRepository<MyObj>().Update(obj);
}
unit.Save();
The call to Update() performs the following:
public virtual void Update(T entityToUpdate)
{
    if (entityToUpdate is AuditModelBase)
    {
        var model = entityToUpdate as AuditModelBase;
        model.UpdatedAt = DateTime.UtcNow;
    }
    DbSet.Attach(entityToUpdate);
    Context.Entry(entityToUpdate).State = EntityState.Modified;
}
And the call to Save() of course performs the following:
_context.SaveChanges();
So my question is: as I'm reassigning obj to a different value each time through the loop, do I need to call Save() inside the foreach loop in order for all instances of "obj" to persist? Or does DbSet.Attach(obj) ensure that each individual instance is updated regardless of what I do with the object in my loop?
Or perhaps a better way to ask this is:
Given that it looks like Attach() is pass-by-reference, so that only my last obj would be updated, what are the best practices with EF for accomplishing this sort of thing (excluding the option of straight calls to SQL)?

I don't think you have to worry. Once you call Attach on your object, it doesn't matter whether you keep a reference to it yourself or not; it should be fine.
The thing to keep in mind is that your object lives as long as it's referenced by something. So it stands to reason that calling Attach causes your DbSet to reference your object, keeping it alive even after you no longer reference it yourself.
However, the best thing to do, in my opinion, is to just give it a try and see what happens!
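To make the reference argument concrete, here is a minimal sketch that does not use Entity Framework at all. FakeTracker is a hypothetical stand-in for the change tracker (an assumption for illustration, not EF's actual implementation); it only shows that objects handed to a collection stay reachable after the loop variable is reassigned.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an entity class.
public class MyObj
{
    public int Id { get; set; }
    public int Value { get; set; }
}

// Hypothetical stand-in for a change tracker: it simply stores references.
public class FakeTracker
{
    private readonly List<MyObj> _tracked = new List<MyObj>();

    public void Attach(MyObj entity)
    {
        // List<T>.Contains uses reference equality here because MyObj
        // does not override Equals.
        if (!_tracked.Contains(entity)) _tracked.Add(entity);
    }

    public int TrackedCount => _tracked.Count;

    public int SumOfValues()
    {
        int sum = 0;
        foreach (var e in _tracked) sum += e.Value;
        return sum;
    }
}

public static class TrackerDemo
{
    public static FakeTracker Run()
    {
        var tracker = new FakeTracker();
        MyObj obj;
        for (int id = 1; id <= 3; id++)
        {
            obj = new MyObj { Id = id, Value = id * 10 }; // variable reassigned each pass
            tracker.Attach(obj);                          // tracker keeps its own reference
        }
        return tracker;
    }
}
```

Calling TrackerDemo.Run() leaves three distinct objects in the tracker, even though only the last one is still referenced by obj when the loop ends.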

Related

Appropriate update of entity while checking if it exists in EF Core

I have the following method updating an entity. The only beef I had was that when a non-existing ID was provided, I got a harsh exception.
public bool Update(Thing thing)
{
    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
So I added a check to control the exception thrown (plus some nice logging and other facilitation). Eventually, I plan to skip the throwing up entirely.
public bool UpdateWithCheck(Thing thing)
{
    Thing target = Context.Things.SingleOrDefault(a => a.Id == thing.Id);
    if (target == null)
        throw new CustomException($"No thing with ID {thing.Id}.");
    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
Now, this doesn't work, because the entity is already being tracked. I have several options to handle that.
Change to Context.Things.Where(...).AsNoTracking().
Explicitly set the updated fields in target and save it.
Horse around with entity states and tamper with the tracker.
Remove the present one and add the new one.
I can't decide which is the best practice. Googling gave me the default examples that do not contain the check for pre-existing status in the same operation.
The exception occurs because, by loading the entity from the Context to check whether it exists, you now have a tracked reference. When you go to update the detached reference, EF will complain that an instance is already tracked.
The simplest work-around would be:
public bool UpdateWithCheck(Thing thing)
{
    bool doesExist = Context.Things.Any(a => a.Id == thing.Id);
    if (!doesExist)
        throw new CustomException($"No thing with ID {thing.Id}.");
    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
However, there are two problems with this approach. Firstly, because we don't know the scope of the DbContext instance and can't guarantee the order in which methods are called, it is possible that the DbContext instance has already loaded and is tracking that instance of the thing. This can manifest as seemingly intermittent errors. The proper way to guard against that would be something like:
public bool UpdateWithCheck(Thing thing)
{
    bool doesExist = Context.Things.Any(a => a.Id == thing.Id);
    if (!doesExist)
        throw new CustomException($"No thing with ID {thing.Id}.");
    Thing existing = Context.Things.Local.SingleOrDefault(a => a.Id == thing.Id);
    if (existing != null)
        Context.Entry(existing).State = EntityState.Detached;
    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
This checks the local tracking cache for any loaded instances and, if one is found, detaches it. The risk here is that any modifications that haven't been persisted in those tracked references will be discarded, and any references floating around that other code assumed were attached will now be detached.
The second significant issue is with using Update(). When detached entities are passed around, there is a risk that data you don't intend to change gets updated. Update() marks all columns as modified, whereas a client is typically expected to update only a subset of them. EF can be configured to check row versions or timestamps on entities against the database before updating, when your database is set up to support them (for example, a rowversion column in SQL Server). This can help guard against stale overwrites, but it still allows unexpected tampering.
As you've already figured out, the better approach is to avoid passing detached entities around and instead use dedicated DTOs. This avoids potential confusion about which objects represent view/consumer state vs. data state. By explicitly copying the values across from the DTO to the entity, or configuring a mapper to copy supported values, you also protect your system from unexpected tampering and potential stale overwrites. One consideration with this approach is that you should guard updates to avoid unconditionally overwriting data with potentially stale data, by ensuring your Entity and DTO have a RowVersion/Timestamp to compare. Before copying from the DTO to the freshly loaded Entity, compare the versions. If they match, nothing has changed in the data row since you fetched it and composed your DTO. If they differ, someone else has updated the underlying data row since the DTO was read, so your modifications are against stale data. From there, take an appropriate action such as discarding the changes, overwriting, merging, logging the fact, etc.
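The version check described above could be sketched roughly as follows. This is an illustration under assumptions, not code from the question: ThingDto, the Name property, the RowVersion byte arrays and the TryApply helper are all hypothetical names, and the EF calls that would load the entity and call SaveChanges() are deliberately left out so the snippet stands alone.

```csharp
using System;
using System.Linq;

// Hypothetical entity with a concurrency token.
public class Thing
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}

// Hypothetical DTO carrying only what the client may change,
// plus the row version it was composed from.
public class ThingDto
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public byte[] RowVersion { get; set; } = Array.Empty<byte>();
}

public static class ThingUpdater
{
    // Copies allowed values from the DTO onto the freshly loaded entity.
    // Returns false (and copies nothing) when the DTO is stale.
    public static bool TryApply(ThingDto dto, Thing entity)
    {
        if (dto.Id != entity.Id)
            throw new ArgumentException("DTO and entity refer to different records.");

        // Different versions mean the row changed after the DTO was read.
        if (!dto.RowVersion.SequenceEqual(entity.RowVersion))
            return false;

        entity.Name = dto.Name; // copy only the fields the client may update
        return true;
    }
}
```

In a real update method the entity passed to TryApply would be the tracked instance loaded from the context, so a subsequent SaveChanges() would persist only the copied fields; a false result is where the discard, merge or log decision would go.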
Just alter the properties of target and call SaveChanges(); remove the Update call. I'd say the typical use case these days is for the input thing to not actually be a Thing but a ThingViewModel, ThingDto or some other variation on the theme of "an object that carries enough data to identify and update a Thing but isn't actually a DB entity". To that extent, if the notion of updating properties of Thing from ThingViewModel by hand bores you, you can look at a mapper (AutoMapper is probably the best known, but there are many others) to do the copying for you, or even set you up with a new Thing if you decide to turn this method into an Upsert.

EF attaching object to context in foreach working only first time

I have a list of objects that I need to attach to the Context in order to track changes and, afterward, save them, but the foreach iterating the items executes only once; after that the method ends.
I'm certain that those items already exist in the database.
I have tried both calling the .Attach() method and setting the Entry state to Unchanged.
protected override Task SetViewModelPropertiesAsync()
{
    SelectedItems.ForEach(l =>
    {
        //Context.Pap_Pedido_ODP.Attach(l);
        Context.Entry(l).State = System.Data.Entity.EntityState.Unchanged;
        // After the first iteration the method ends
    });
    return base.SetViewModelPropertiesAsync();
}
I expect all the items to be added to the context, but after the first iteration of the foreach the method exits and execution continues with the next method, without even throwing an exception.
EDIT:
There is more code after the foreach that is being skipped when I either call Attach or set the EntityState.
If I comment out both, the code executes correctly.
The behaviour does sound like an exception is being thrown. This is IMO a huge red-flag about List<T>.ForEach() and the main reason I never use it. If you alter your code to:
foreach (var item in SelectedItems)
{
    Context.Pap_Pedido_ODP.Attach(item);
    Context.Entry(item).State = System.Data.Entity.EntityState.Unchanged;
}
... you should at least now see the exception(s) that are blocking your code. Attaching/detaching entities between contexts is messy and there are very, very few scenarios where I personally can ever justify it. You are dealing with references to an entity. This means that:
1. item must not already be associated to any other context.
2. Context must not already have another entity tracked with the same PK as item.
Point #1 will hinder you because any code returning an entity that "might" later be attached to another context will need to detach it first, or otherwise load that entity AsNoTracking. Passing a tracked entity to your method from somewhere will break your code.
Point #2 will hinder you because even if the entity passed is detached, if your context happens to already know about that entity via another reference, you essentially have to discard the untracked entity and use the reference to the tracked instance. This means that before attaching any entity you need to check the DbSet's Local collection for a matching entity.
Only if the entity isn't tracked, and the context does not have a tracked entity with the same PK, can you attach it.
If your code is not breaking on an exception and you are debugging, make sure your debug exception handling is set to break on all exceptions. Alternatively you can pop a try/catch block with a breakpoint in the catch to inspect the exception.
Edit: To check instances
foreach (var item in SelectedItems)
{
    if (Context.Pap_Pedido_ODP.Local.Contains(item))
    {
        // This exact instance is already associated to the Context.
        // We shouldn't need to copy anything across or do anything...
    }
    else
    {
        var existingItem = Context.Pap_Pedido_ODP.Local.SingleOrDefault(x => x.Id == item.Id);
        if (existingItem != null)
        {
            // A different instance matching this one already exists in the context.
            // Here, if item represents changes, we would need to copy changes across to existingItem...
        }
        else
        {
            // Item is not associated, safe to attach.
            Context.Pap_Pedido_ODP.Attach(item);
            // ...
        }
    }
}
Now it doesn't end there. If "item" contains references to other entities, each and every one will be attached automatically along with it. This can cause problems if some of them have already been associated to the context. That can happen when the DbContext is too long-lived, or when multiple copies of the same referenced entity are passed back.
For instance, say I have a set of Orders being saved, and Orders contain references to Customers. Two orders have a reference to the same customer. When I attach Order #1, its Customer #1 is now associated to the context. When I try to attach Order #2, its instance of Customer #1 is a different instance from the one on Order #1, so attaching Order #2 will generate an error. When dealing with detached entities, you need to take steps to ensure that all objects in the graph that refer to the same record use the same object instance. When you loaded the data from EF, these would be the same object reference, but if you feed them to a serializer/deserializer you will get 2 identical copies as separate references. You cannot simply re-attach those object references.
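One way to make every order in a deserialized graph share a single Customer instance per key is sketched below. The Order and Customer classes and the GraphNormalizer helper are hypothetical names introduced for illustration, and the sketch only covers a single reference level; a real graph may need the same treatment for every navigation property, plus the Local check shown earlier before attaching.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entities for illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class Order
{
    public int Id { get; set; }
    public Customer Customer { get; set; }
}

public static class GraphNormalizer
{
    // Rewrites Order.Customer so that orders pointing at the same customer
    // record (same Id) all reference one shared Customer instance.
    public static void ShareCustomerInstances(IEnumerable<Order> orders)
    {
        var canonical = new Dictionary<int, Customer>();
        foreach (var order in orders)
        {
            if (order.Customer == null) continue;

            if (canonical.TryGetValue(order.Customer.Id, out var existing))
                order.Customer = existing;                        // reuse the first instance seen
            else
                canonical.Add(order.Customer.Id, order.Customer); // first occurrence wins
        }
    }
}
```

After this pass, two orders that arrived with separate copies of Customer #1 hold the same reference, which is the precondition the paragraph above describes for attaching them to one context.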
Unfortunately there's no really simple answer I can offer to make it easier, but this is why serializing and deserializing entities can be a terrible idea, and even detaching/attaching instances can be a pain.

Entity Framework - Loop then call SaveChanges

Would SaveChanges save all the changes made to each debt? Or only the last one, because the reference is lost when I reassign debt on each iteration?
List<DTO.ClientDebt> ClientDebtList = Business.Generic.GetAll<DTO.ClientDebt>();
foreach (var oClienteDeuda in oClienteDeudaSyncList) //oClienteDeudaSyncList is a list of debts
{
    DTO.ClientDebt debt = ClientDebtList.Where(x => x.ClienteId == oClienteDeuda.ClienteId && x.NumeroComprobante == oClienteDeuda.NumeroComprobante).FirstOrDefault();
    debt.Active = oClienteDeuda.Active ? 1 : 0;
}
Data.Generic.SaveChanges();
The way you have done it is the correct way when dealing with foreach loops. You can call SaveChanges() inside the loop too, but it will degrade the performance of the operation heavily, so always call SaveChanges() after the foreach loop. The SaveChanges() method persists modifications made to all entities attached to the context, so you don't need to worry about the reference changes. It works as a unit of work: either everything is saved or nothing is.
Note: SaveChanges() operates within a transaction. SaveChanges() will roll back that transaction and throw an exception if any of the dirty ObjectStateEntry objects cannot be persisted.
The changes to entities are monitored internally, so none of your changes should be lost as long as you did not tell Entity Framework to skip change tracking (e.g. by using AsNoTracking() in your query).

LinQ optimization

Here is a piece of code:
void MyFunc(List<MyObj> objects)
{
    MyFunc1(objects);
    foreach (MyObj obj in objects.Where(obj1 => obj1.Good))
    {
        // Do Action With Good Object
    }
}
void MyFunc1(List<MyObj> objects)
{
    int iGoodCount = objects.Where(obj1 => obj1.Good).Count();
    BeHappy(iGoodCount);
    // do other stuff with 'objects' collection
}
Here we see that collection is analyzed twice and each time the value of 'Good' property is checked for each member: 1st time when calculating count of good objects, 2nd - when iterating through all good objects.
It is desirable to have that optimized, and here is a straightforward solution:
before the call to MyFunc1, create an additional temporary collection of good objects only (goodObjects; it can be an IEnumerable);
get the count of these objects and pass it as an additional parameter to MyFunc1;
in the 'MyFunc' method iterate not through 'objects.Where(...)' but through the 'goodObjects' collection.
Not a bad approach (as far as I can see), but an additional variable has to be created in the 'MyFunc' method and an additional parameter has to be passed.
Question: is there any out-of-the-box LINQ functionality that caches the result of the first Where().Count(), remembers the processed collection, and reuses it in the next iteration?
Any thoughts are welcome.
Thanks.
No, LINQ queries are not optimized in this way (what you describe is similar to the way SQL Server reuses a query execution plan). LINQ does not (and, for practical purposes, cannot) know enough about your objects in order to optimize this way. As far as it knows, your collection has changed (or is entirely different) between the two calls.
You're obviously aware of the ability to persist your query into a new List<T>, but apart from that there's really nothing that I can recommend without knowing more about your class and where else MyFunc is used.
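A small sketch can show the difference between re-running the query and persisting it into a List<T>. The DeferredDemo class and its counters are hypothetical, added only to count how often the predicate runs; the LINQ calls themselves (Where, Count, ToList) are the standard System.Linq ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the predicate ran for the lazy query
    // and for the query materialized once with ToList().
    public static (int lazyCalls, int cachedCalls) Run()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

        int lazyCalls = 0;
        IEnumerable<int> lazy = numbers.Where(n => { lazyCalls++; return n % 2 == 0; });
        lazy.Count();                 // first pass over the source
        foreach (var n in lazy) { }   // second pass: the predicate runs again

        int cachedCalls = 0;
        List<int> cached = numbers.Where(n => { cachedCalls++; return n % 2 == 0; }).ToList();
        int cachedCount = cached.Count;   // a property read, no re-evaluation
        foreach (var n in cached) { }     // iterates the stored results only

        return (lazyCalls, cachedCalls + (cachedCount - cachedCount));
    }
}
```

With six source elements the lazy query evaluates the predicate twelve times (six per enumeration), while the materialized list evaluates it six times in total.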
As long as MyFunc1 doesn't need to modify the list by adding/removing objects, this will work.
void MyFunc(List<MyObj> objects)
{
    ILookup<bool, MyObj> objLookup = objects.ToLookup(obj1 => obj1.Good);
    MyFunc1(objLookup[true]);
    foreach (MyObj obj in objLookup[true])
    {
        //..
    }
}
void MyFunc1(IEnumerable<MyObj> objects)
{
    //..
}

Help me to understand my `Using` and `DataContext`

Can someone please explain the below to me. First is how I call the method and the second bit is the LINQ Method.
My curiosity stems from the fact that I get a context error if I un-comment the using portion.
Why? I apparently do not fully understand using and contexts, and I would like to understand them better.
Guid workerID = new Guid(new ConnectDAL.DAL.Security().GetUserIDByUserLogin(HUD.CurrentUser));
var myMembers = BLLCmo.GetAllMembers(workerID);
if (myMembers.Rows.Count != 0)
{
    dgvMyMembers.DataSource = myMembers;
}
else
{
    var allMembers = BLLCmo.GetAllMembers();
    dgvMyMembers.DataSource = allMembers;
}
internal static CmoDataContext context = new CmoDataContext();
public static DataTable GetAllMembers()
{
    DataTable dataTable;
    //using (context)
    //{
    var AllEnrollees = from enrollment in context.tblCMOEnrollments
                       select new
                       {
                           enrollment.ADRCReferralID,
                           enrollment.ClientID,
                           enrollment.CMONurseID,
                           enrollment.CMOSocialWorkerID,
                           enrollment.DisenrollmentDate,
                           enrollment.DisenrollmentReasonID,
                           enrollment.EconomicSupportWorkerID,
                           enrollment.EnrollmentDate
                       };
    dataTable = AllEnrollees.CopyLinqToDataTable();
    //}
    return dataTable;
}
"using" blocks automatically dispose of the object you're using. Since you didn't give further details on what the exact error is, I'm betting it's related to the fact that the "using" will dispose of your "context", and then later you try to use your context again.
Data Contexts should be used atomically. They're already internally coded to be efficient that way, and there's usually no justifiable reason to have one as long-running as you do. The reason most samples you see use a "using" is that they initialize the data context immediately before the using (or in it) and then don't try to reference the disposed context.
As a final note, disposing of objects causes them to release their internal resources (such as open connections, cached data, etc.).
//Our context exists right now ... unless we've already called this method since the app started ;)
var myMembers = BLLCmo.GetAllMembers(workerID); // Context is disposed at the end of this call
if (myMembers.Rows.Count != 0)
{
    dgvMyMembers.DataSource = myMembers; //No prob, we didn't call our function again
}
else
{
    var allMembers = BLLCmo.GetAllMembers(); // Oops, our context was disposed of earlier
    dgvMyMembers.DataSource = allMembers;
}
You get an error if you use using because the context is disposed the second time it's called by GetAllMembers().
If you need to dispose of the context, I suggest you create one on the fly in GetAllMembers() as opposed to having a static context.
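The difference between wrapping a shared static context in "using" and creating one per call can be sketched without LINQ to SQL. FakeContext is a hypothetical disposable that stands in for CmoDataContext (an assumption for illustration); it throws ObjectDisposedException once disposed, which is the same general failure mode as the error described in the question.

```csharp
using System;

// Hypothetical stand-in for a data context.
public sealed class FakeContext : IDisposable
{
    private bool _disposed;

    public int CountMembers()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(FakeContext));
        return 3; // pretend query result
    }

    public void Dispose() => _disposed = true;
}

public static class MemberQueries
{
    private static readonly FakeContext Shared = new FakeContext();

    // Mirrors the failing pattern: the shared instance is disposed
    // at the end of the first call, so the second call throws.
    public static int CountWithSharedContext()
    {
        using (Shared)
        {
            return Shared.CountMembers();
        }
    }

    // Suggested pattern: each call creates and disposes its own context.
    public static int CountWithOwnContext()
    {
        using (var context = new FakeContext())
        {
            return context.CountMembers();
        }
    }
}
```

CountWithOwnContext() can be called any number of times, whereas CountWithSharedContext() works once and throws on every later call.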
Check out the documentation of IDisposable and using.
Here's a link to an article that might help you with Lifetime Management of DataContext.
I have had this problem and did not understand it either at that time. I just removed the using and it worked. The problem was Lazy Loading. The DataContext gave me an entity, but later I tried accessing a property of a parent entity (in the sense of a Foreign Key). Because this parent entity was not loaded the first time, it tried to get it but the DataContext was gone. So I used a DataLoadOptions. If I knew I needed a related entity, I loaded it with the original entity.
Ex: You ask for an invoice to your datacontext, but later you want to access the client's name, like in invoice.Client.Name. The client has not been loaded, so the name is not available.
DataLoadOptions are also important for performance: if you need this related entity in a loop and do not pre-load the child (or parent) entity, you'll go back to the DB as many times as you loop.
