Linq2Sql: Random double-insert bug - a real poser


OK, so you think you're a real debugger? Try this one:
I've got a Linq2Sql project where suddenly we've discovered that occasionally, seemingly randomly, we get double-inserts of the same row.
I have subclassed DataContext and overridden SubmitChanges. Inside there I have made a call to GetChangeSet(), both before and after the call to base.SubmitChanges(), and used a text file logger to record the objects in the Inserts collection. Plus I hang on to a reference to the inserted objects long enough to record their autonumbered ID.
When the double-insert happens, I see in the DB that instead of one row each being inserted into MyTableA and MyTableB, there are two each. SQL Profiler shows four insert statements, one after the other in rapid succession:
insert into MyTableA(...
insert into MyTableB(...
insert into MyTableA(...
insert into MyTableB(...
I check in the debug log, and there are only two objects in the Inserts collection: one of MyClassA and one of MyClassB. After the call to base.SubmitChanges(), the changeset is empty (as it should be). And the autonumber IDs show the larger value of the newly inserted rows.
Another piece of useful information: the bug never happens when stepping through in debug mode; only when you run without breakpoints. This makes me suspect it's something to do with the speed of execution.
We have been using the same DataContext subclass for over a year now, and we've never seen this kind of behavior before in our product. It only happens with MyClassA and MyClassB.
To summarize:
From the debug log, everything looks like it's working correctly.
On SQL Profiler, you can see that a double-insert is happening.
This behavior happens frequently but unpredictably, only to the two classes mentioned, excepting that it never happens when stepping through code in debug mode.
EDIT - New information:
Inside my DataContext subclass, I have the following code:
try {
    base.SubmitChanges(ConflictMode.ContinueOnConflict);
} catch (ChangeConflictException) {
    // Automerge database values for members that client has not modified.
    foreach (ObjectChangeConflict occ in ChangeConflicts) {
        occ.Resolve(RefreshMode.KeepChanges);
    }
}
// Submit succeeds on second try.
base.SubmitChanges(ConflictMode.FailOnFirstConflict);
MyTableA and MyTableB both have a mandatory foreign key OtherTableID referencing OtherTable. The double insert happens when a ChangeConflictException happens during an update of the common parent table OtherTable.
We're on the scent, now...

When I've had a problem like this before it is usually down to multiple threads executing the same code at the same time.
Have you tried using the lock statement to make sure the insert is only executed by a single thread at a time?
MSDN Lock
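A minimal sketch of that suggestion, assuming the double insert really does come from two threads reaching the same insert path. SerializedInserter, InsertOnce and the in-memory list are hypothetical stand-ins for the real DataContext work; the point is only that the check and the insert happen inside one lock block.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the code path that performs the insert.
public sealed class SerializedInserter
{
    private readonly object _gate = new object();
    private readonly List<string> _inserted = new List<string>();

    public int Count
    {
        get { lock (_gate) { return _inserted.Count; } }
    }

    // Returns true if this call inserted the key, false if it was already there.
    public bool InsertOnce(string key)
    {
        // Only one thread at a time can run the check-then-insert sequence.
        lock (_gate)
        {
            if (_inserted.Contains(key))
            {
                return false;
            }
            _inserted.Add(key);
            return true;
        }
    }
}
```

If the duplicates have a different cause, as the later posts in this thread suggest, the lock will not remove them, but it is a cheap way to rule threading in or out.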

Looks like it's a BUG in Linq2Sql! Here's a repeatable experiment for you:
using (var db1 = new MyDataContext()) {
    var obj1 = db1.MyObjects.Single(x => x.ID == 1);
    obj1.Field1 = 123;
    obj1.RelatedThingies.Add(new RelatedThingy {
        Field1 = 456,
        Field2 = "def",
    });
    // A second context changes the same row, which forces a conflict on db1.
    using (var db2 = new MyDataContext()) {
        var obj2 = db2.MyObjects.Single(x => x.ID == 1);
        obj2.Field2 = "abc";
        db2.SubmitChanges();
    }
    try {
        db1.SubmitChanges(ConflictMode.ContinueOnConflict);
    } catch (ChangeConflictException) {
        foreach (ObjectChangeConflict occ in db1.ChangeConflicts) {
            occ.Resolve(RefreshMode.KeepChanges);
        }
    }
    db1.SubmitChanges(ConflictMode.FailOnFirstConflict);
}
Result: MyObject record with ID = 1 gets updated, Field1 value is 123 and Field2 value is "abc". And there are two new, identical records inserted to RelatedThingy, with MyObjectID = 1, Field1 = 456 and Field2 = "def".
Explain THAT!
UPDATE: After logging this on Microsoft Connect, the nice folks at MS asked me to put together a little demo project highlighting the bug. And wouldn't you know it - I couldn't reproduce it. It seems to be connected to some weird idiosyncrasy of my project. Don't have time to investigate further, and I found a workaround, anyway...

FWIW, we recently found this problem with our retry logic for SubmitChanges. We were doing an InsertAllOnSubmit. When a ChangeConflictException occurred, we would retry with a Resolve(RefreshMode.KeepChanges,true) on each ObjectChangeConflict.
We redid the work a different way (retry logic to re-perform the entire transaction) and that seems to fix the problem.
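A rough sketch of that "re-perform the entire transaction" approach, not the poster's actual code. Retry.Run is a hypothetical helper; in a LINQ to SQL project the delegate would create a fresh DataContext, load the objects, apply the changes and call SubmitChanges, and TException would be ChangeConflictException.

```csharp
using System;

public static class Retry
{
    // Runs unitOfWork up to maxAttempts times and returns the attempt that succeeded.
    // Only TException triggers a retry; the last failure is rethrown to the caller.
    public static int Run<TException>(Action unitOfWork, int maxAttempts)
        where TException : Exception
    {
        if (unitOfWork == null) throw new ArgumentNullException(nameof(unitOfWork));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // The delegate must redo ALL of the work from scratch,
                // so no pending inserts survive from a failed attempt.
                unitOfWork();
                return attempt;
            }
            catch (TException) when (attempt < maxAttempts)
            {
                // Conflict: drop everything and go around again.
            }
        }
    }
}
```

Because each attempt starts from a clean state, nothing is left in a change set between attempts, which is the property the in-place Resolve-and-resubmit pattern does not give you.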

Related

Is there a simple way to check whether a SQL statement will update the database?

I have a chunk of code that does something like this:
if (sqlStatement.WillUpdateDatabase)
    DoThing1();
else
    DoThing2();
Currently WillUpdateDatabase is implemented as
public bool WillUpdateDatabase {
    get { return !statementText.StartsWith("SELECT"); }
}
This catches the majority of cases, but it gets more complicated with SELECT ... INTO .... And there are possibly a few other cases that I might need to take into account.
Just to be clear: this is not to implement any type of security. There are other systems that check for SQL injection attacks, this bit of code just needs to make a choice whether to do thing1 or thing2.
This seems like it should be a solved problem. Is there an industry standard way to do this reliably?
Update/clarification: Something like UPDATE Table1 SET Column1 = 'a' WHERE 1 = 2 should be treated as an update.

As many others have commented, this really is a nasty problem, and inspecting the SQL isn't ever going to cut it, because you'll practically end up writing an entire SQL parser (and that really would be reinventing the wheel). You'll probably have to create a database user that has only read permissions on all tables, then actually execute the query you want to test as that read-only user and catch the cases where it fails because of a permission violation (rather than a SQL syntax error, etc.).

If the only statements that change data in your case begin with UPDATE or INSERT, one way to test whether your query will try to update is the following (DELETE and any other write statements you use would need the same treatment):
public bool WillUpdateDatabase {
    get { return statementText.StartsWith("UPDATE") || statementText.StartsWith("INSERT"); }
}
But you cannot know whether the query will actually change any rows in the table. For example, if your query is UPDATE persons SET age = 25 WHERE name = 'John' and there is no row with the name John, the query will try to update but will change nothing, because there is nothing to update.
EDIT
Thanks to @NikBo and @Gusman, I replaced Contains with StartsWith to avoid the issue Nik Bo explained in the comments below.
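A slightly more defensive version of the same first-keyword heuristic, as a sketch only. SqlHeuristics, LooksLikeWrite and the keyword list are illustrative assumptions, not a standard API. It ignores leading whitespace and letter case, but it still misses SELECT ... INTO, statements hidden behind comments or CTEs, and stored procedure calls, so it is not a reliable classifier.

```csharp
using System;

public static class SqlHeuristics
{
    // Illustrative list; extend it for the statement types you actually issue.
    private static readonly string[] WriteKeywords =
    {
        "INSERT", "UPDATE", "DELETE", "MERGE", "TRUNCATE", "CREATE", "ALTER", "DROP"
    };

    // True when the first word of the statement is one of the write keywords.
    public static bool LooksLikeWrite(string statementText)
    {
        if (statementText == null) throw new ArgumentNullException(nameof(statementText));

        string trimmed = statementText.TrimStart();
        foreach (string keyword in WriteKeywords)
        {
            bool startsWithKeyword =
                trimmed.StartsWith(keyword, StringComparison.OrdinalIgnoreCase);
            // Require a word boundary so that e.g. "UPDATES_VIEW" does not match.
            bool atWordBoundary =
                trimmed.Length == keyword.Length ||
                (trimmed.Length > keyword.Length && char.IsWhiteSpace(trimmed[keyword.Length]));

            if (startsWithKeyword && atWordBoundary)
            {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, UPDATE Table1 SET Column1 = 'a' WHERE 1 = 2 is treated as an update regardless of how many rows it touches, which matches the clarification in the question.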

SubmitChanges call not updating data

For the last few hours I've been trying to find out why I can't update the data in the db using the SubmitChanges method.
I can retrieve data normally, but when I call SubmitChanges the call runs for 5+ minutes and then continues without any error. When I check the db, nothing has been updated.
I did some research beforehand, and other posts suggested checking whether a primary key has been declared, but it has.
This is the code I'm using:
SitesDB sitesDB = new SitesDB(SqlConnection);
Site site = sitesDB.GetSite(ProgramArgs.SiteID);
var records = DB.records
    .Join(DB.Locations.Where(l => l.SiteID == ProgramArgs.SiteID),
        s => Convert.ToInt32(s.store_id),
        l => l.LocationID,
        (s, l) => s)
    .OrderBy(s => s.survey_date_utc);
foreach (var record in records)
{
    record.date_local = ConvertTimeFromUTC(date_utc, site.TimeZoneID);
    DB.SubmitChanges();
}

You should profile the database to see what is happening in SQL.
Are you sure it is the SubmitChanges call? Since SubmitChanges() is inside your foreach, it submits one record at a time. That should not be much of a performance problem unless you have a very large table, lots of indexes, and so on. You may want to move the call outside the foreach. However, that will still issue one update per record, so it will not be much faster.
Your problem might occur before SubmitChanges. LINQ to SQL uses deferred execution by default, which means your select is only executed in the database once you reach
foreach (var record in records)
So my guess is that the query is your performance problem, not so much SubmitChanges.
To check this in the debugger, put a breakpoint on the first line inside the foreach (record.date_local = ...) and see whether you actually get there.
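Deferred execution is easy to see with plain LINQ to Objects. The sketch below uses an in-memory array rather than a database (DeferredExecutionDemo is a made-up name), but the principle is the same: defining the query runs nothing, and the work happens when the foreach starts pulling items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the filter had run before and after the foreach.
    public static (int beforeEnumeration, int afterEnumeration) Run()
    {
        int evaluations = 0;
        int[] source = { 1, 2, 3, 4 };

        // Defining the query does not invoke the lambda.
        IEnumerable<int> query = source.Where(n =>
        {
            evaluations++;
            return n % 2 == 0;
        });
        int before = evaluations;

        // Enumerating the query is what actually executes it.
        int matches = 0;
        foreach (int n in query)
        {
            matches++;
        }
        int after = evaluations;

        return (before, after);
    }
}
```

Here the counter is still 0 after the query is defined and 4 once the loop has finished, so a slow query shows up as a pause on the foreach line itself.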

Multiple operations under linq2Entities

I've been using Linq2Entities for quite some time now on small-scale programs, so my queries are usually basic (return elements with a certain value in a specific column, add/update an element, remove an element, ...).
Now I'm moving to a larger-scale program and, given my experience, went with Linq2Entities. Everything was fine until I had to remove a large number of elements.
In Linq2Entities I can only find a remove method that takes one entity, so to remove 1000 elements I first need to retrieve them, then remove them one by one, then call SaveChanges.
Either I am missing something, or a) this is really bad performance-wise and b) I would be better off making this a procedure in the database.
Am I right in my assumptions? Should massive removals / additions be made as procedures only? Or is there a way to do this efficiently with Linq2Entities that I am missing?

If you have the primary key values, you can avoid loading the entities by creating the objects manually, setting their key(s), attaching them to the context, and setting their state to Deleted.
var ids = new List<Guid>(); // fill with the primary keys of the rows to delete
foreach (var id in ids)
{
    // Attach a stub entity that carries only the key, then mark it as deleted.
    Employee employee = new Employee();
    employee.Id = id;
    var entityEntry = context.Entry(employee);
    entityEntry.State = EntityState.Deleted;
}
context.SaveChanges();

Entity Framework: Different ObjectContext object error with LINQ to Entities assignment

I keep getting the following InvalidOperationException:
The relationship between the two objects cannot be defined because
they are attached to different ObjectContext objects.
when trying to do the following code:
newCorr.ReqCode = (from req in context.ReqCodeSet
                   where req.Code.Equals(requirement.Code)
                   select req).FirstOrDefault();
Just before this line, I am doing the following:
foreach (Requirement requirement in myInformation.Reqs)
{
    MyHwReqCorr newCorr = new MyHwReqCorr();
    newCorr.HwItem = Dictionaries.Instance.HwIdHwRecordDictionary[requirement.Id];
So what I'm doing is iterating over the myInformation.Reqs list, creating a new instance of MyHwReqCorr, setting the HwItem to an item that was stored in a dictionary earlier on, and then setting the ReqCode with a LINQ query that looks in a table for a req code matching the one I'm passing in. Any help would be greatly appreciated. Any info you need, I'd be happy to provide.
EDIT: Right before I call this foreach, I can call this (as testing to verify that I can access the db):
List<ReqCode> reqCodeList = (from req in context.ReqCodeSet select req).ToList();
And I never get any errors with that. But when I try to set an item from that list, using the Where extension method like this:
newCorr.ReqCode = reqCodeList.Where(t=>t.Code == requirement.Code).FirstOrDefault();
or using a dictionary, as is done for newCorr.HwItem, I get the main error.
EDIT2: I have also noticed something weird happening: When I initially run, with any setup (my original or the variable method or the method Rony posted), it works. But any subsequent run, meaning if I stop debugging and start debugging again, it fails with that error. Only when I kill all instances of excel (which is running in the background generating a log for viewing later on) and wait about 2-3 minutes, does it work again and then follows the same situation as before...passing the first time, failing immediate subsequent times.
EDIT3: It's definitely not Excel related as I prevented Excel from starting and I still get that error. But I did notice that if I wait some time, and try again, it works....sometimes.

Are you retrieving all items on the same thread/context? Try retrieving the items on the same thread.

newCorr.ReqCode = (from req in context.ReqCodeSet
                   where req.Code == requirement.Code
                   select req).FirstOrDefault();
OR
newCorr.ReqCode = context.ReqCodeSet
    .Where(r => r.Code == requirement.Code)
    .FirstOrDefault();

IEnumerator seems to be affecting all objects, and not one at a time

Hey, I am trying to alter an attribute of an object by setting it to the value of that same attribute stored in another table. There is a one-to-many relationship between the two: the product is the "one" side and the versions are the "many" side. Right now, both of the methods I have tried set all the returned products equal to the final version object, so in this case they all end up the same. I am not sure where the issue lies. Here are my two code snippets; both yield the same result.
int x = 1;
IEnumerator<Product> ie = productQuery.GetEnumerator();
while (ie.MoveNext())
{
    ie.Current.RSTATE = ie.Current.Versions.First(o => o.VersionNumber == x).RSTATE;
    x++;
}
and
foreach (var product in productQuery)
{
    product.RSTATE = product.Versions.Single(o => o.VersionNumber == x).RSTATE;
    x++;
}
The versions table holds information for previous products, each is distinguished by the version number. I know that it will start at 1 and go until it reaches the current version, based on my query returning the proper number of products.
Thanks for any advice.

It looks like you're creating a closure over the variable x in the lambda expression, but it's a little weird that you're having issues with it because you're executing the lambda expression right away - there's no delayed execution effect here that would normally be the source of problems with a closure.
Still, there's one way to test if it's a closure causing the problem — try taking a copy of the x variable inside the loop and see if that fixes the problem, like so:
foreach (var product in productQuery)
{
    int y = x;
    product.RSTATE = product.Versions.Single(o => o.VersionNumber == y).RSTATE;
    x++;
}
Also, I suspect you could avoid the whole loop (and therefore the issue) with a .Select() projection, but because your product object was designed to be mutable it will be a little tricky.
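For reference, this is the kind of delayed-execution closure problem that the copy-the-variable trick is meant to fix. It is a standalone sketch with made-up names (ClosureDemo, CaptureShared, CaptureCopy) and has nothing to do with the Product/Versions model: the lambdas are stored first and only invoked after the loop has finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Every lambda captures the same variable x, so all of them see its final value.
    public static List<int> CaptureShared()
    {
        var readers = new List<Func<int>>();
        int x = 1;
        for (int i = 0; i < 3; i++)
        {
            readers.Add(() => x);
            x++;
        }
        // The lambdas run only now, after x has reached 4.
        return readers.Select(read => read()).ToList();
    }

    // Each iteration gets its own copy y, so each lambda keeps its own value.
    public static List<int> CaptureCopy()
    {
        var readers = new List<Func<int>>();
        int x = 1;
        for (int i = 0; i < 3; i++)
        {
            int y = x;
            readers.Add(() => y);
            x++;
        }
        return readers.Select(read => read()).ToList();
    }
}
```

CaptureShared yields 4, 4, 4 while CaptureCopy yields 1, 2, 3. In the question's loop the lambda is invoked immediately inside Single, so this effect would not be expected there, which is why the copy is only a diagnostic step.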

Ok, so apparently it is altering all of them because product has a unique Identity column, so when I modify one instance of it, it changes them all. So, looks like it is back to the drawing board.
