NHibernate re-using identifiers when using session.SaveOrUpdate - c#

I have a client who's been running a program of mine for a couple of years now who started getting strange errors.
My application throws exceptions with this in it:
Cannot insert duplicate key in object
Since I hadn't touched the code in literally years, this confused me a lot.
And it only happened sometimes.
After A LOT of debugging and pulling out my hair, I figured out what is happening.
In the code that adds items to the database, I call session.SaveOrUpdate.
I can't recall a specific reason I chose this over the expected session.Save method, but let's continue. (I am changing this for the client's code though.)
So what seems to be happening is that SaveOrUpdate is re-using existing objects' IDs and completely overwriting the existing items. My code throws an error, but the new item is saved to the DB and there is no trace of the original record any longer. In my NHibernate mapping documents I am using the hilo generator for object IDs.
I am guessing this is only happening because there are now enough items in the DB to make the IDs restart or something, I don't know. I do have an audit table that has/had A LOT of records in it, tens of thousands. But I truncated that table to make backups smaller. (Could this have caused the problem?)
I'm trying to find out if anyone can conclusively state whether SaveOrUpdate does for some reason re-use existing IDs, or why changing the call to just Save works now. If this is a known issue I will sleep easy; if not, I need to debug further to see if there isn't still some situation where my client will lose data.
My code is running NHibernate 3.3.3.4000, which was the latest version when I wrote this app.
Update 1
Session.Save is also re-using IDs.
I keep getting duplicate key errors when inserting new records, but not every time, only sometimes. So it's quite random, which makes it hard to debug.

NHibernate users have requested a general purpose method that either saves a transient instance by generating a new identifier or updates the persistent state associated with its current identifier. The SaveOrUpdate() method now implements this functionality.
(http://nhibernate.info/doc/nh/en/index.html#manipulatingdata-updating-detached)
Based on this, your hypothesis regarding the behaviour of SaveOrUpdate() would stand if NH allocated the object's key before testing whether it's 'transient' or 'persistent', i.e. the key generator allocates a key that happens to be in use already, and the save-or-update logic then favours update because it determines the object is 'persistent'. I would be surprised if this is what's actually happening, as it seems quite a basic mistake to make.
If you enable logging, you'll be able to determine whether this is actually the case or not.
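If you don't already have logging wired up, a minimal sketch (assuming configuration is loaded from hibernate.cfg.xml) is to switch on SQL echoing, which shows whether SaveOrUpdate issues an INSERT or an UPDATE and which identifier the hilo generator handed out:

// Sketch: echo the generated SQL to the console. A log4net appender on the
// "NHibernate.SQL" logger gives the same output in a log file.
var cfg = new NHibernate.Cfg.Configuration().Configure();
cfg.SetProperty(NHibernate.Cfg.Environment.ShowSql, "true");
cfg.SetProperty(NHibernate.Cfg.Environment.FormatSql, "true");
var sessionFactory = cfg.BuildSessionFactory();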

I spent many hours trying to figure out this issue, but I sort of gave up in the end.
My solution, which seems to work so far, was to change the NHibernate id generator class from hilo to native.
This did require me to export and re-import all my data so that I could rebuild the tables, however, so it may not be a great solution for others who find this post, unless they change the identity on the tables manually.
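For anyone making the same change, the mapping edit itself is small; here is a sketch against a hypothetical Item mapping in hbm.xml (the painful part is migrating the existing tables, as noted above):

<!-- before: hilo, identifiers handed out from a shared hi/lo table -->
<id name="Id" column="Id">
  <generator class="hilo" />
</id>

<!-- after: native, i.e. an IDENTITY column on SQL Server -->
<id name="Id" column="Id">
  <generator class="native" />
</id>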

Related

Desktop application which can work offline when no connectivity with SQL Server

I am designing a WPF desktop application and using Entity Framework Code First to create and use a SQL Server database. My database will be hosted on one server machine and will be running 24/7.
I want to provide a feature where you can modify data offline (when you have no connectivity with the SQL Server DB) and save it somehow. And whenever your application finds a connection with SQL Server again, all changes can be moved to the SQL Server DB.
Is there any way to achieve this by using Entity Framework?
I want to emphasize that I am using Entity Framework. Is this type of functionality already implemented by EF? Or do I have to do it manually, for example write the changes to the file system and then merge them into the DB later?
You could figure out the specific exceptions that are generated when the SQL Server connection is lost, and embed your calls in try-catch blocks. If the server is offline, then in your catch block, pass the entity to a method that serializes the entity to JSON and saves it to the hard drive in a special directory or something. On your next successful query, check that directory to see if there are any saved entities that need to be saved.
Be specific with your catches - you don't want unrelated exceptions to trigger this code.
Some things to keep in mind - what if somebody else changed the data in the meantime? Are you intending to overwrite those changes? How did you get the data which needs to be saved in the first place if you are offline?
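A rough sketch of that pattern (all names hypothetical, assuming EF6 and Newtonsoft.Json; real code would also need ordering, retries, and the conflict handling mentioned above):

using System;
using System.Data.Entity.Migrations; // AddOrUpdate extension
using System.Data.SqlClient;
using System.IO;
using Newtonsoft.Json;

public static class OfflineQueue
{
    const string PendingDir = @"C:\MyApp\Pending";

    public static void SaveOrQueue(MyDbContext db, Customer entity)
    {
        try
        {
            db.SaveChanges();
        }
        catch (Exception ex)
        {
            // Be specific: only queue when the root cause is a connectivity failure.
            if (!(ex.GetBaseException() is SqlException)) throw;
            Directory.CreateDirectory(PendingDir);
            var file = Path.Combine(PendingDir, Guid.NewGuid() + ".json");
            File.WriteAllText(file, JsonConvert.SerializeObject(entity));
        }
    }

    // Call after the next successful query to push queued entities to the server.
    public static void ReplayPending(MyDbContext db)
    {
        foreach (var file in Directory.GetFiles(PendingDir, "*.json"))
        {
            var entity = JsonConvert.DeserializeObject<Customer>(File.ReadAllText(file));
            db.Customers.AddOrUpdate(entity); // insert or overwrite by primary key
            db.SaveChanges();
            File.Delete(file);
        }
    }
}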
As long as you have all the data loaded into the DbContext/ObjectContext, you're free to amend that data any way you want. Only when SaveChanges() is invoked is the connection really needed.
However, if you're going to load everything into the context, you seem to be reimplementing DataSet functionality, which, in addition, allows for XML serialization/deserialization of the changes, so the changes can even be saved between sessions.
Not as trendy as EF, though :)
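For reference, the DataSet change persistence mentioned above looks roughly like this (names hypothetical); the diffgram keeps per-row states, so pending inserts/updates/deletes survive an application restart:

using System.Data;

var ds = new DataSet("Offline");
// ... fill the tables from the server while connected, then edit offline ...

// Persist pending changes, including row states, between sessions:
ds.WriteXml("pending.xml", XmlWriteMode.DiffGram);

// Later, restore the edits and push them to the server once it's reachable:
ds.ReadXml("pending.xml", XmlReadMode.DiffGram);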
While I have never tried this with SQL-based data I have done it in the past with filesystem-based data and it's a major can of worms.
First, you have to have some means of indicating what data needs to be stored locally so that it will be available when you're offline. This will need to be updated either all the time or before you head out--and that can involve a lot of data transfer.
Second, once you're back online there's a lot of conflict resolution that must be done. If there's a realistic chance that someone else might have changed the data while you were out you need some way of detecting the conflict and prompting the user as to what to do in that situation. This almost certainly requires a system that keeps a detailed edit trail on every unit of data that could reasonably be updated.
In my situation I was very fortunate in that it was virtually certain that if the remote user edited file [x], overwriting the system copy was the right thing to do. Remote users would only be carrying the files that pertained to their projects, so conflicts should never happen. Thus the writeback was simply based on timestamps, nothing more. Data which people in the field would not normally need to modify was handled by not even looking at it; modified files were simply copied from the system to the laptop.
This leaves the middle step--saving the pending writes. I disagree with Elemental Pete's answer in this regard--simply serializing them and saving the result does not work, because what happens when you read that data back in again? You see the old copy, not the changed copy!
My approach to this was a local store of all relevant data that was accessed exactly like the main system data was, all reads and writes worked normally.
Something a lot fancier might be needed if you have data that needs transactions involved.
Note that we also hit a nasty human problem: the update process took several minutes (note: >10y ago) simply analyzing what needed to be done, not counting any actual copy time. The result was people bypassing it when they thought they could. Sometimes they thought wrong, oops!

How should I handle a potentially large number of edits in entity framework?

I'm using .NET 4.5.1 with EF 6.0.2 and db-first.
The use case is something like this:
Roughly 50k entities are loaded
A set of these entities are displayed for the user, others are required for displaying the items correctly
The user may perform heavy actions on the entities, meaning the user chooses to perform one action which cascades to actually affect potentially hundreds of entities.
The changes are saved back to database.
The question, then, is what is the best way to handle this? So far I've come up with 2 different solutions, but don't really like either:
Create a DbContext at step 1. Keep it around during the whole process, then finally save changes. The reason I don't necessarily like this is that the process might take hours, and as far as I know, DbContexts should not be preserved for this long.
Create a DbContext at step 1. Discard it right after. At step 4, create a new DbContext, attach the modified entities to it and save changes. The big problem I see with this approach is: how do I figure out which entities have actually been changed? Do I need to build a ChangeTracker of my own to be able to do this?
So is there a better alternative for handling this, or should I use one of the solutions above (perhaps with some changes)?
I would go with option number 1 - use a DbContext for the entire process.
The problem I have is with the assertion that the process might take hours. I don't think this is something you want to do. Imagine what happens when your user has been editing the data for 3 hours and then faces a power blackout before clicking the final save. You'll have users running after you with pitchforks.
You're also facing a lot of concurrency issues - what if two users perform the same lengthy process at once? Handling collisions after a few hours of work is going to be a problem, especially if you tell users changes they've made hours ago can't be saved. Pitchforks again.
So, I think you should go with number 3 - save incremental changes of the editing process, so the user's work isn't lost if something bad happens, and so that you can handle collisions if two users are updating the data at the same time.
You would probably want to keep the incremental changes in a separate place, not your main tables, because the business change hasn't been finalized yet.
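A sketch of what that separate place could look like (schema entirely hypothetical; the point is that each user action becomes one small, immediately committed row):

// Staging entity: one row per user action, kept out of the main tables.
public class PendingEdit
{
    public int Id { get; set; }
    public string UserName { get; set; }
    public string EntityType { get; set; }
    public int EntityId { get; set; }
    public string NewValuesJson { get; set; } // serialized changed properties
    public DateTime CreatedUtc { get; set; }
}

// Append after every action - a tiny transaction, so hours of work
// can never be lost in one blackout.
void RecordEdit(MyDbContext db, string user, string type, int id, string newValuesJson)
{
    db.PendingEdits.Add(new PendingEdit
    {
        UserName = user,
        EntityType = type,
        EntityId = id,
        NewValuesJson = newValuesJson,
        CreatedUtc = DateTime.UtcNow
    });
    db.SaveChanges();
}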
and as far as I know, DbContexts should not be preserved for this long.
Huh?
There is nothing in a DbContext about not preserving it. You may get problems with other people having already edited the item, but that is an inherent architectural problem - generally it is not advisable to use optimistic AND pessimistic locking in a "multi hour edit marathon".
The only sensible approach if you have editing over hours is using your own change tracker and applying proper logic when changes collide - and/or using a logical locking mechanism (a flag in the database).
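The flag-in-the-database idea can be as simple as this sketch (column names hypothetical; a real implementation would take the lock atomically, e.g. with a conditional UPDATE, and expire stale locks):

// Try to take a logical lock before starting a long edit session.
var item = db.Items.Single(i => i.Id == itemId);
if (item.LockedBy != null && item.LockedBy != currentUser)
    throw new InvalidOperationException("Record is being edited by " + item.LockedBy);
item.LockedBy = currentUser;
item.LockedAtUtc = DateTime.UtcNow;
db.SaveChanges();

// ... the multi-hour edit marathon ...

item.LockedBy = null; // release on save or cancel
db.SaveChanges();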

Entity Framework POCO long-term change tracking

I'm using .NET Entity Framework 4.1 with the code-first approach to effectively solve the following problem, here simplified.
There's a database table with tens of thousands of entries.
Several users of my program need to be able to
View the (entire) table in a GridRow, which implies that the entire table has to be downloaded.
Modify values of any random row; changes are frequent but need not be persisted immediately. It's expected that different users will modify different rows, but this is not always true. Some loss of changes is permitted, as users will most likely update the same rows to the same values.
On occasion add new rows.
Sounds simple enough. My initial approach was to use a long-running DbContext instance. This one DbContext was supposed to track changes to the entities, so that when SaveChanges() is called, most of the legwork is done automatically. However, many have pointed out that this is not an optimal solution in the long run, notably here. I'm still not sure I understand the reasons, and I don't see what a unit of work is in my scenario either. The user chooses herself when to persist changes, and let's say that the client always wins for simplicity. It's also important to note that objects that have not been touched don't overwrite any data in the database.
Another approach would be to track changes manually or use objects that track changes for me, however I'm not too familiar with such techniques, and I would welcome a nudge in the right direction.
What's the correct way to solve this problem?
I understand that this question is a bit wishy-washy, but think of it as more fundamental. I lack fundamental understanding about how to solve this class of problems. It seems to me that long living DbContext is the right way, but knowledgeable people tell me otherwise, which leads me to confusion and imprecise questions.
EDIT1
Another point of confusion is the existence of the Local property on the DbSet<> object. It invites me to use a long-running context, as another user has posted here.
The problem with a long-running context is that it doesn't refresh data - I discussed those problems more here. So if your user opens the list and modifies data for half an hour, she doesn't know about changes made in the meantime. But in the case of WPF, if your business action is:
Open the list
Do as many actions as you want
Trigger saving changes
Then this whole flow is one unit of work and you can use a single context instance for it. If you have a scenario where the last edit wins, you should not have problems with this, unless somebody else deletes the record the current user is editing. Additionally, after saving or cancelling changes you should dispose the current context and load the data again - this will ensure that you really have fresh data for the next unit of work.
The context offers some features to refresh data, but it only refreshes data previously loaded (without relations), so for example new unsaved records will still be included.
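In code, the open/edit/save/dispose cycle described above is just (context name hypothetical):

using (var ctx = new MyDbContext())
{
    var items = ctx.Items.ToList(); // 1. open the list
    // 2. the user edits the loaded entities for as long as needed
    ctx.SaveChanges();              // 3. persist; last edit wins
} // 4. dispose, so the next unit of work starts with fresh data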
Perhaps you can also read about MS Sync framework and local data cache.
Sounds to me like your users could have a (cached) copy of the data for an indefinite period of time. The longer the users are using cached data, the greater the odds that they could become disconnected from the database connection in DbContext. My guess is EF doesn't handle this well and you probably want to deal with that (e.g. occasionally connected architecture). I would expect implementing that may solve many of your issues.

SaveChanges doesn't save changes

I have an application that loads all the data as expected using EF, however, when it comes to saving, I can't get it to work at all.
I've started off simple, by just using a value from a combobox to alter 1 field in the database. When the value is changed, it executes
this.t.Incident.AssignedTeamID = (int)this.cbTeam.SelectedValue;
I've also confirmed that this changed the EntityState to Modified and that the value is what I expect it to be. Despite this, calling
hdb.SaveChanges();
doesn't save anything back to the database. I know it's probably something simple I'm missing, but I cannot find out what that is at all.
Update:
Adding hdb.context.Attach(this.t.Incident); before using SaveChanges results in an InvalidOperationException stating "An entity object cannot be referenced by multiple instances of IEntityChangeTracker."
If it makes any difference, this is a desktop application, not a web application
Most likely, since you're working with a web app, you have a problem with a disconnected object context. With all ORMs, you must go through an attach process to update an entity. SaveChanges will never work on both sides of the request/response.
Thank you to everybody who posted here. The answer was quite simple after reading these details.
What I needed to do, as Damien commented on the original question, was to ensure it was all loaded from the same class.
I had been creating a private instance of the DB context whenever needed, without really thinking. This was fine, it loaded the data as I expected, but meant that I would have around 3 different instances of the database context loaded via different classes.
Essentially, I was trying to save the object from a different class with a different instance of the database context. Moving the save method back to the class the object was created from (presumably like it should have always been) resolved the issue.
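In other words, the load and the save must go through the same context instance. Something like this hypothetical repository shape avoids the "multiple instances of IEntityChangeTracker" trap:

public class IncidentRepository
{
    // One context for the whole load-edit-save workflow, instead of a
    // private context newed up inside every class that needs data.
    private readonly HelpdeskContext db = new HelpdeskContext();

    public Incident Load(int id)
    {
        return db.Incidents.Find(id); // entity is tracked by this context
    }

    public void Save()
    {
        db.SaveChanges(); // same change tracker that loaded the entity
    }
}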

How does NHibernate determine whether to Insert or Update a record?

When using Session.SaveOrUpdate(myEntity), how does NHibernate decide whether to insert a new record or update an existing one?
I am having trouble whilst saving one object in a S#arp project. It is retrieved from storage, then stored in session state for a couple of web requests, then saved back to the database with one property changed (not the S#arp [DomainSignature]).
I have, at runtime, compared the object that is about to be persisted with a freshly retrieved version straight from the database using the Equals() method and that returns true. However, the object still ends up creating a new row in the database.
Elsewhere in the application this is working fine but I am hoping for a pointer on how NHib is working this out.
Basically, SaveOrUpdate() looks at the identifier. If the identifier is present (i.e. not the mapped unsaved-value), it will update the record in the database. If the identifier is not present, it will insert a new record.
However, it sounds like you might have something funky going on in your session. You might want to try SaveOrUpdateCopy() to see if this solves your issue.
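For context, the "identifier is present" test is driven by the unsaved-value in the mapping; here is a sketch of the relevant hbm.xml fragment (names hypothetical):

<!-- If Id equals unsaved-value (0 here), SaveOrUpdate treats the object as
     transient and INSERTs it; otherwise it is considered persistent and
     UPDATEd. Objects already associated with the session are treated as
     persistent regardless. -->
<id name="Id" column="Id" unsaved-value="0">
  <generator class="native" />
</id>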
