Transactions - Avoid collisions on insert - C#

I am using EF6 in my ASP.NET application, and I have a problem that is a bit annoying and that I can't seem to figure out a good solution for.
My code looks like this:
using (var scope = TransactionScopeUtil.ReadCommittedMaxTimeoutRequired())
{
    bool hasConflict = await BookingService.HasConflictAsync(newBooking);
    if (!hasConflict)
    {
        await BookingRepository.InsertAsync(newBooking);
        return Json(AjaxPayload.Success());
    }
    else
    {
        return Json(AjaxPayload.Error());
    }
}
// The transaction scope builder:
public static TransactionScope ReadCommittedMaxTimeoutRequired()
{
    return new TransactionScope(TransactionScopeOption.Required, new TransactionOptions()
    {
        IsolationLevel = IsolationLevel.ReadCommitted,
        Timeout = TransactionManager.MaximumTimeout
    }, TransactionScopeAsyncFlowOption.Enabled);
}
The problem is, if two clients push the same booking time, a conflict should be registered, and one of the calls should return a message that the timeslot is already booked. But it doesn't if they hit the server at exactly the right moment (within the same few milliseconds): both bookings are saved without a problem.
I can fix this with a Serializable, hardcore locking scope, but I am sure there is a better way and I'm just too blind to see it?
What are best practices in situations like this?

if two clients push the same booking time, a conflict should be registered
If I understand correctly, you don't want to prevent two bookings at the same time. (You told Stefan a "superuser" could force one.) You just want to register a conflict?
It's easily done, but you have to use the database. At least, there has to be some arbiter of truth, some single place where there's only one time and one final understanding of the state of things. Usually, that's the database. The logic looks like this:
insert into T values (X, time, priority = 1) where X not in T
if rows_affected = 1
    hurrah
else
    while rows_affected < 1
        priority = max(priority) + 1
        insert into T values (X, time, priority) where (X, priority) not in T
    register conflict, you are $priority in line
Convert that to SQL or whatever you're using, pass in {X, time, priority} as parameters, and you're done.
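As a rough illustration only, here is one way that conditional insert might look from EF6 with a raw SQL command, assuming a DbContext instance named context and System.Data.SqlClient for SqlParameter; the Bookings table and the RoomId/StartTime columns are hypothetical names, not taken from the original code:
int rowsAffected = await context.Database.ExecuteSqlCommandAsync(
    @"INSERT INTO Bookings (RoomId, StartTime)
      SELECT @roomId, @startTime
      WHERE NOT EXISTS (SELECT 1 FROM Bookings
                        WHERE RoomId = @roomId AND StartTime = @startTime);",
    new SqlParameter("@roomId", newBooking.RoomId),
    new SqlParameter("@startTime", newBooking.StartTime));

// rowsAffected == 1 -> hurrah, the slot was free and is now booked.
// rowsAffected == 0 -> someone beat us to it; register the conflict.
return rowsAffected == 1 ? Json(AjaxPayload.Success()) : Json(AjaxPayload.Error());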
By the way, in case it helps, this approach has a name: optimistic concurrency. With luck, that term might turn up in the documentation for your environment.

Related

Event sourcing incremental int id

I have looked at a lot of event sourcing tutorials, and all of them use simple demos to focus on the tutorial's topic (event sourcing).
That's fine until you hit something in a real-world application that isn't covered in one of these tutorials :)
I hit something like this.
I have two databases, one event-store and one projection-store (Read models)
All aggregates have a GUID Id, which was 100% fine until now.
Now I created a new JobAggregate and a Job Projection.
And it's required by my company to have a unique incremental int64 Job Id.
Now I'm looking stupid :)
An additional issue is that a job is created multiple times per second!
That means the method to get the next number has to be really safe.
In the past (without ES) I had a table, defined the PK as auto increment int64, save Job, DB does the job to give me the next number, done.
But how can I do this within my Aggregate or command handler?
Normally the projection job is created by the event handler, but that's too late in the process, because the aggregate should already have the int64. (For replaying the aggregate on an empty DB and keeping the same Aggregate Id -> Job Id relation.)
How should I solve this issue?
Kind regards
In the past (without ES) I had a table, defined the PK as auto increment int64, save Job, DB does the job to give me the next number, done.
There's one important thing to notice in this sequence, which is that the generation of the unique identifier and the persistence of the data into the book of record both share a single transaction.
When you separate those ideas, you are fundamentally looking at two transactions -- one that consumes the id, so that no other aggregate tries to share it, and another to write that id into the store.
The best answer is to arrange that both parts are part of the same transaction -- for example, if you were using a relational database as your event store, then you could create an entry in your "aggregate_id to long" table in the same transaction as the events are saved.
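For illustration only, a minimal ADO.NET sketch of that single-transaction idea, assuming a SQL Server event store with an Events table and a JobNumbers mapping table whose primary key is a bigint identity; connectionString, aggregateId and serializedEvent are placeholders, and every table and column name here is hypothetical:
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    {
        // Reserve the next int64 in the "aggregate_id to long" mapping table...
        var idCommand = new SqlCommand(
            "INSERT INTO JobNumbers (AggregateId) VALUES (@a); " +
            "SELECT CAST(SCOPE_IDENTITY() AS bigint);",
            connection, transaction);
        idCommand.Parameters.AddWithValue("@a", aggregateId);
        long jobNumber = (long)idCommand.ExecuteScalar();

        // ...and append the event in the SAME transaction.
        var insertCommand = new SqlCommand(
            "INSERT INTO Events (AggregateId, JobNumber, Payload) VALUES (@a, @n, @p);",
            connection, transaction);
        insertCommand.Parameters.AddWithValue("@a", aggregateId);
        insertCommand.Parameters.AddWithValue("@n", jobNumber);
        insertCommand.Parameters.AddWithValue("@p", serializedEvent);
        insertCommand.ExecuteNonQuery();

        transaction.Commit(); // both the mapping row and the event persist, or neither does
    }
}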
Another possibility is to treat the "create" of the aggregate as a Prepare followed by a Created; with an event handler that responds to the prepare event by reserving the long identifier post facto, and then sends a new command to the aggregate to assign the long identifier to it. So all of the consumers of Created see the aggregate with the long assigned to it.
It's worth noting that you are assigning what is effectively a random long to each aggregate you are creating, so you better dig in to understand what benefit the company thinks it is getting from this -- if they have expectations that the identifiers are going to provide ordering guarantees, or completeness guarantees, then you had best understand that going in.
There's nothing particularly wrong with reserving the long first; depending on how frequently the save of the aggregate fails, you may end up with gaps. For the most part, you should expect to be able to maintain a small failure rate (ie - you check to ensure that you expect the command to succeed before you actually run it).
In a real sense, the generation of unique identifiers falls under the umbrella of set validation; we usually "cheat" with UUIDs by abandoning any pretense of ordering and pretending that the risk of collision is zero. Relational databases are great for set validation; event stores maybe not so much. If you need unique sequential identifiers controlled by the model, then your "set of assigned identifiers" needs to be within an aggregate.
The key phrase to follow is "cost to the business" -- make sure you understand why the long identifiers are valuable.
Here's how I'd approach it.
I agree with the idea of an Id generator that produces the "business Id" but not the "technical Id".
The core here is to have an application-level JobService that deals with all the infrastructure services and orchestrates what is to be done.
Controllers (web controllers or command lines) directly consume the JobService at the application level to control/command the state change.
The code is PHP-like pseudocode, but the point is the architecture and the process, not the syntax. Adapt it to C# and it is the same.
Application level
class MyNiceWebController
{
public function createNewJob( string $jobDescription, xxxx $otherData, ApplicationJobService $jobService )
{
$projectedJob = $jobService->createNewJobAndProject( $jobDescription, $otherData );
$this->doWhateverYouWantWithYourAleadyExistingJobLikeForExample301RedirectToDisplayIt( $projectedJob );
}
}
class MyNiceCommandLineCommand
{
private $jobService;
public function __construct( ApplicationJobService $jobService )
{
$this->jobService = $jobService;
}
public function createNewJob()
{
$jobDescription = // Get it from the command line parameters
$otherData = // Get it from the command line parameters
$projectedJob = $this->jobService->createNewJobAndProject( $jobDescription, $otherData );
// print, echo, console->output... confirmation with Id or print the full object.... whatever with ( $projectedJob );
}
}
class ApplicationJobService
{
// In application level because it just serves the first-level request
// to controllers, commands, etc but does not add "domain" logic.
private $application;
private $jobIdGenerator;
private $jobEventFactory;
private $jobEventStore;
private $jobProjector;
public function __construct( Application $application, JobBusinessIdGeneratorService $jobIdGenerator, JobEventFactory $jobEventFactory, JobEventStoreService $jobEventStore, JobProjectorService $jobProjector )
{
$this->application = $application; // I like to log which application execution run is responsible for each domain effect; I can then trace IPs, cookies, etc. by crossing data from another data lake.
$this->jobIdGenerator = $jobIdGenerator;
$this->jobEventFactory = $jobEventFactory;
$this->jobEventStore = $jobEventStore;
$this->jobProjector = $jobProjector;
}
public function createNewJobAndProject( string $jobDescription, xxxx $otherData ) : Job
{
$applicationExecutionId = $this->application->getExecutionId();
$businessId = $this->jobIdGenerator->getNextJobId();
$jobCreatedEvent = $this->jobEventFactory->createNewJobCreatedEvent( $applicationExecutionId, $businessId, $jobDescription, $otherData );
$this->jobEventStore->storeEvent( $jobCreatedEvent ); // Throws an exception if it fails, so no projection will be invoked if the event was not created.
$entityId = $jobCreatedEvent->getId();
$projectedJob = $this->jobProjector->project( $entityId );
return $projectedJob;
}
}
Note: if projecting is too expensive for synchronous projection just return the Id:
// ...
$entityId = $jobCreatedEvent->getId();
$this->jobProjector->enqueueProjection( $entityId );
return $entityId;
}
}
Infrastructure level (common to various applications)
class JobBusinessIdGenerator implements DomainLevelJobBusinessIdGeneratorInterface
{
// In infrastructure because it accesses persistence layers.
// In the creator, get persistence objects and so... database, files, whatever.
public function getNextJobId() : int
{
$this->lockGlobalCounterMaybeAtDatabaseLevel();
$current = $this->persistence->getCurrentJobCounter();
$next = $current + 1;
$this->persistence->setCurrentJobCounter( $next );
$this->unlockGlobalCounterMaybeAtDatabaseLevel();
return $next;
}
}
Domain Level
class JobEventFactory
{
// It's in this factory that we create the entity Id.
private $idGenerator;
public function __construct( EntityIdGenerator $idGenerator )
{
$this->idGenerator = $idGenerator;
}
public function createNewJobCreatedEvent( Id $applicationExecutionId, int $businessId, string $jobDescription, xxxx $otherData ) : JobCreatedEvent
{
$eventId = $this->idGenerator->createNewId();
$entityId = $this->idGenerator->createNewId();
// The only place where we allow "new" is in the factories. No other places should do a "new" ever.
$event = new JobCreatedEvent( $eventId, $entityId, $applicationExecutionId, $businessId, $jobDescription, $otherData );
return $event;
}
}
If you do not like the factory creating the entityId (it can look ugly to some eyes), just pass it in as a parameter with a specific type, and push the responsibility of creating a fresh one (never reusing an existing one) to some other intermediate service (never the application service).
Nevertheless, if you do so, take care: what if a "silly" service creates two JobCreatedEvents with the same entity Id? That would really be ugly. In the end, creation only occurs once, and the Id is created at the very core of the creation of the JobCreatedEvent (redundant redundancy intended). Your choice anyway.
Other classes...
class JobCreatedEvent;
class JobEventStoreService;
class JobProjectorService;
Things that do not matter in this post
We could discuss at length whether the projectors should live at the infrastructure level, global to the multiple applications calling them... or even in the domain (as I need "at least" one way to read the model), or whether they belong more to the application (maybe the same model is read in 4 different ways by 4 different applications, each with its own projectors)...
We could discuss at length where the side-effects are triggered, whether implicitly in the event store or at the application level (I've not called any side-effects processor == event listener). I think of side-effects as living in the application layer, as they depend on infrastructure...
But all this... is not the topic of this question.
I don't care about all those things for this "post". Of course they are not negligible topics and you will have your own strategy for them, and you have to design all this very carefully. But here the question was where to create the auto-incremental Id coming from a business requirement, and doing all those projectors (sometimes called calculators) and side-effects (sometimes called reactors) in a "clean-code" way here would blur the focus of this answer. You get the idea.
Things that I care in this post
What I care is that:
If the experts want an "autonumeric", then it's a domain requirement, and therefore it's a property at the same level of definition as "description" or "other data".
The fact that they want this property does not conflict with the fact that all entities have an "internal id" in whatever format the coder chooses, be it a UUID, a SHA-1 or whatever.
If you need sequential ids for that property, you need a "supplier of values", AKA the JobBusinessIdGeneratorService, which has nothing to do with the "entity Id" itself.
That Id generator is responsible for ensuring that once the number has been incremented, it is synchronously persisted before being returned to the client, so it is impossible to return the same id twice after a failure.
Drawbacks
There's a sequence-leak you'll have to deal with:
If the Id generator points to 4007, the next call to getNextJobId() will increment it to 4008, persist the pointer to "current = 4008" and then return.
If for some reason the creation and persistence fails, then the next call will give 4009. We then will have a sequence of [ 4006, 4007, 4009, 4010 ], with 4008 missing.
That is because, from the generator's point of view, 4008 was actually used; as a generator, it does not know what you did with it, the same as if you had a dummy loop that extracts 100 numbers.
Never compensate with a ->rollback() in the catch of a try / catch block, because that can create concurrency problems: if you get 4008 and another process gets 4009, and then the first process fails, the rollback will break the sequence. Just assume that on failure the Id was simply consumed, and do not blame the generator. Blame whoever failed.
I hope it helps!
#SharpNoizy, very simple.
Create your own Id generator. Say an alphanumeric string, for example "DB3U8DD12X", that gives you billions of possibilities. Now, what you want to do is generate these ids in sequential order by giving each character an ordered value...
0 - 0
1 - 1
2 - 2
.....
10 - A
11 - B
Get the idea? So, what you do next is create a function that will increment each index of your "D74ERT3E4" string using that mapping.
So, "R43E4D", "R43E4E", "R43E4F", "R43E4G"... get the idea?
Then, when your application loads, you look at the database and find the latest Id generated. Then you load into memory the next 50,000 combinations (in case you want super speed) and create a static class/method that is going to give you that value back.
Aggregate.Id = IdentityGenerator.Next();
This way you have control over the generation of your IDs, because that's the only class that has that power.
I like this approach because it is more "readable" when using it in your web API, for example. GUIDs are hard (and tedious) to read, remember, etc.
GET api/job/DF73 is way better to remember than api/job/XXXX-XXXX-XXXXX-XXXX-XXXX
Does that make sense?
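For illustration, a minimal C# sketch of that incrementing idea, treating the id as a base-36 counter (digits 0-9 then A-Z). It is a stateless helper that computes the successor of a given id; the stateful IdentityGenerator.Next() in the answer would wrap something like this plus the preloaded batch of values. All names here are hypothetical:
public static class AlphanumericId
{
    private const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // "R43E4D" -> "R43E4E", "R43E4Z" -> "R43E50", "ZZ" -> "100"
    public static string Next(string current)
    {
        var chars = current.ToCharArray();
        for (int i = chars.Length - 1; i >= 0; i--)
        {
            int value = Alphabet.IndexOf(chars[i]);
            if (value < Alphabet.Length - 1)
            {
                chars[i] = Alphabet[value + 1]; // bump this position and stop
                return new string(chars);
            }
            chars[i] = Alphabet[0];             // carry into the next position to the left
        }
        return Alphabet[1] + new string(chars); // every position overflowed, so grow the string
    }
}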

Can an async function run at the same time as another instance of it is running?

So I have a function like this in a singleton service that is injected into a controller.
public async Task<ResponseModel> Put(BoardModel request)
{
var board = await dbService.GetBoardAsync(request.UserId, request.TargetId, request.Ticker);
// Update the model
// ...
var response = await dbService.SetBoardAsync(request.UserId, request.TargetId, request.Ticker, request);
return new ResponseModel
{
ResponseStatus = response.Successful(replacements: 1) ? ResponseStatus.Success : ResponseStatus.Failure
};
}
What I'm worried about is race conditions, say if two instances of the function are running at the same time, and one overwrites the entry in the db.
Is this possible? There's a very small chance of it happening, but I'm still a bit worried.
Thanks!
Yes, assuming your server has more than one thread (which is true of any production-capable web server), two or more threads can be simultaneously running the same block of code. The typical way to handle this type of situation is with optimistic concurrency. What that means is that EF will attempt to save the record (optimistically assuming it will be able to without issue), and if the record has been modified before it got to it, it will throw an exception (specifically OptimisticConcurrencyException). You can see the ASP.NET getting started article for a walkthrough on how to set it up. Essentially, it just involves adding a rowversion column to your database table(s). Each time the row is updated, the value of that column changes. Therefore, EF can compare the value on the record it's trying to update with what's currently in the table. If they're the same, it can continue updating. If not, then something else modified the record and it stops the update. By catching the exception that's thrown, you can then respond appropriately by reloading the data and trying another update.
It's highly unlikely you would end up hitting a concurrency issue multiple times, but just in case, I would recommend using something like Polly (NuGet) to handle the exception. Among other things, it allows you to retry a set number of times or even forever, until no exception is raised. This ensures that the record eventually gets updated, even if there were multiple concurrency conflicts.
Policy
.Handle<OptimisticConcurrencyException>()
.RetryForever((exception, context) => {
// resolve concurrency issue
// See: https://msdn.microsoft.com/en-us/data/jj592904.aspx
})
.Execute(() => {
db.SaveChanges();
});

How to avoid convoluted logic for custom log messages in code?

I know the title is a little too broad, but I'd like to know how to avoid (if possible) this piece of code I've just coded on a solution of ours.
The problem started when this code resulted in not enough log information:
...
var users = [someRemotingProxy].GetUsers([someCriteria]);
try
{
var user = users.Single();
}
catch (InvalidOperationException)
{
logger.WarnFormat("Either there are no users corresponding to the search or there are multiple users matching the same criteria.");
return;
}
...
We have business logic in a module of ours that requires there to be a single 'User' matching some criteria. It turned out that, when problems started showing up, this little 'inconclusive' message was not enough for us to properly know what had happened, so I coded this method:
private User GetMappedUser([searchCriteria])
{
    var users = [remotingProxy]
        .GetUsers([searchCriteria])
        .ToList();
    switch (users.Count())
    {
        case 0:
            log.Warn("No user exists with [searchCriteria]");
            return null;
        case 1:
            return users.Single();
        default:
            log.WarnFormat("{0} users [{1}] have been found",
                users.Count(),
                String.Join(", ", users));
            return null;
    }
}
And then called it from the main code like this:
...
var user = GetMappedUser([searchCriteria]);
if (user == null) return;
...
The first odd thing I see there is the switch statement over the .Count() on the list. This seems very strange at first, but somehow ended up being the cleaner solution. I tried to avoid exceptions here because these conditions are quite normal, and I've heard that it is bad to try and use exceptions to control program flow instead of reporting actual errors. The code was throwing the InvalidOperationException from Single before, so this was more of a refactor on that end.
Is there another approach to this seemingly simple problem? It seems to be kind of a Single Responsibility Principle violation, with the logs in between the code and all that, but I fail to see a decent or elegant way out of it. It's even worse in our case because the same steps are repeated twice, once for the 'User' and then for the 'Device', like this:
Get unique user
Get unique device of unique user
For both operations, it is important to us to know exactly what happened, what users/devices were returned in case it was not unique, things like that.
#AntP hit upon the answer I like best. I think the reason you are struggling is that you actually have two problems here. The first is that the code seems to have too much responsibility. Apply this simple test: give this method a simple name that describes everything it does. If your name includes the word "and", it's doing too much. When I apply that test, I might name it "GetUsersByCriteriaAndValidateOnlyOneUserMatches()." So it is doing two things. Split it up into a lookup function that doesn't care how many users are returned, and a separate function that evaluates your business rule regarding "I can handle only one user returned".
You still have your original problem, though, and that is the switch statement seems awkward here. The strategy pattern comes to mind when looking at a switch statement, although pragmatically I'd consider it overkill in this case.
If you want to explore it, though, think about creating a base "UserSearchResponseHandler" class, and three sub classes: NoUsersReturned; MultipleUsersReturned; and OneUserReturned. It would have a factory method that would accept a list of Users and return a UserSearchResponseHandler based on the count of users (encapsulating the logic of the switch inside the factory.) Each handler method would do the right thing: log something appropriate then return null, or return a single user.
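A compact sketch of that shape, with hypothetical class names and assuming a log4net-style ILog, purely to make the structure concrete:
public abstract class UserSearchResponseHandler
{
    // Factory method: encapsulates the "switch on count" decision.
    public static UserSearchResponseHandler For(IList<User> users)
    {
        switch (users.Count)
        {
            case 0: return new NoUsersReturned();
            case 1: return new OneUserReturned(users[0]);
            default: return new MultipleUsersReturned(users);
        }
    }

    public abstract User Handle(ILog log);
}

public class NoUsersReturned : UserSearchResponseHandler
{
    public override User Handle(ILog log)
    {
        log.Warn("No user matched the search criteria.");
        return null;
    }
}

public class OneUserReturned : UserSearchResponseHandler
{
    private readonly User user;
    public OneUserReturned(User user) { this.user = user; }
    public override User Handle(ILog log) { return user; }
}

public class MultipleUsersReturned : UserSearchResponseHandler
{
    private readonly IList<User> users;
    public MultipleUsersReturned(IList<User> users) { this.users = users; }
    public override User Handle(ILog log)
    {
        log.WarnFormat("{0} users [{1}] have been found", users.Count, String.Join(", ", users));
        return null;
    }
}
The caller then collapses to something like UserSearchResponseHandler.For(users).Handle(log).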
The main advantage of the Strategy pattern comes when you have multiple needs for the data it identifies. If you had switch statements buried all over your code that all depended on the count of users found by a search, then it would be very appropriate. The factory can also encapsulate substantially more complex rules, such as "user.count must = 1 AND the user[0].level must = 42 AND it must be a Tuesday in September". You can also get really fancy with a factory and use a registry, allowing for dynamic changes to the logic. Finally, the factory nicely separates the "interpreting" of the business rule from the "handling" of the rule.
But in your case, probably not so much. I'm guessing you likely have only the one occurrence of this rule, it seems pretty static, and it's already appropriately located near the point where you acquired the information it's validating. While I'd still recommend splitting out the search from the response parser, I'd probably just use the switch.
A different way to consider it would be with some Goldilocks tests. If it's truly an error condition, you could even throw:
if (users.Count() < 1)
{
    throw new TooFewUsersReturnedError();
}
if (users.Count() > 1)
{
    throw new TooManyUsersReturnedError();
}
return users[0]; // just right
How about something like this?
public class UserResult
{
public string Warning { get; set; }
public IEnumerable<User> Result { get; set; }
}
public UserResult GetMappedUsers(/* params */) { }
public void Whatever()
{
var users = GetMappedUsers(/* params */);
if (!String.IsNullOrEmpty(users.Warning))
log.Warn(users.Warning);
}
Switch to a List<string> of warnings if required. This treats your GetMappedUsers method more like a service that returns some data and some metadata about the result, which allows you to delegate your logging to the caller - where it belongs - so your data access code can get on with just doing its job.
Although, to be honest, in this scenario I would prefer simply to return a list of user IDs from GetMappedUsers and then use users.Count to evaluate your "cases" in the caller and log as appropriate.

Using Transactions or Locking in Entity Framework to ensure proper operation

I am fairly new to EF and SQL in general, so I could use some help clarifying this point.
Let's say I have a table "wallet" (and EF code first object Wallet) that has an ID and a balance. I need to do an operation like this:
if(wallet.balance > 100){
doOtherChecksThatTake10Seconds();
wallet.balance -= 50;
context.SaveChanges();
}
As you can see, it checks to see if a condition is valid, then if so it has to do a bunch of other operations first that take a long time (in this exaggerated example we say 10 seconds), then if that passes it subtracts $50 from the wallet and saves the new data.
The issue is, there are other things happening that can change the wallet balance at any time (this is a web application). If this happens:
wallet.balance = 110;
this operation passes its "if" check because wallet.balance (110) > 100
while it's doing the "doOtherChecksThatTake10Seconds()", a user transfers $40 out of their wallet
now wallet.balance = 70
"doOtherChecksThatTake10Seconds()" finishes, subtracts 50 from wallet.balance, and then saves the context with the new data.
In this case, the check of wallet.balance > 100 is no longer true, but the operation still happened because of the delay. I need to find a way of locking the table and not releasing it until the entire operation is finished, so nothing gets edited during. What is the most effective way to do this?
It should be noted that I have tried putting this operation within a TransactionScope(). I am not sure whether that has the intended effect or not, but I did notice it started causing a lot of deadlocks with an entirely different database operation that is running.
Use optimistic concurrency: http://msdn.microsoft.com/en-us/data/jj592904
//Object Property:
public byte[] RowVersion { get; set; }
//Object Configuration:
Property(p => p.RowVersion).IsRowVersion().IsConcurrencyToken();
This allows dirty reads, BUT when you go to update the record the system checks that the rowversion hasn't changed in the meantime; the update fails if someone has changed the record.
The rowversion is maintained by the DB each time a record changes.
This is out-of-the-box EF optimistic locking.
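A minimal sketch of what handling that failure can look like: with an EF6 DbContext the conflict surfaces as a DbUpdateConcurrencyException. The Wallet/Balance names follow the question, and the single reload-and-retry shown is just one possible policy:
try
{
    wallet.Balance -= 50;
    context.SaveChanges();
}
catch (DbUpdateConcurrencyException ex)
{
    // Someone else changed the row between our read and our update.
    var entry = ex.Entries.Single();
    entry.Reload();                          // refresh the entity with current database values
    var current = (Wallet)entry.Entity;
    if (current.Balance > 100)               // re-check the business rule before retrying
    {
        current.Balance -= 50;
        context.SaveChanges();               // a real implementation would loop or give up after N attempts
    }
}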
You can use TransactionScope.
Import the namespace
using System.Transactions;
and use it like below:
public string InsertBrand()
{
    try
    {
        using (TransactionScope transaction = new TransactionScope())
        {
            // Do your operations here
            transaction.Complete();
            return "Mobile Brand Added";
        }
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
}
Another approach could be to use one or more internal queues and have each queue consumed by only one thread (the producer-consumer pattern). I use this approach in a booking system; it works quite well and is very easy.
In my case I have multiple queues (one for each 'product') that are created and deleted dynamically, and multiple consumers, where only one consumer can be assigned to a given queue. This also allows handling higher concurrency. In a high-concurrency scenario with hundreds of thousands of users, you could also use separate servers and queues such as MSMQ to handle this.
There might be a problem with this approach in a ticket system where a lot of users want a ticket for the same concert, or in a shopping system when a new "Harry Potter" is released, but I don't have those scenarios.
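As a rough sketch of that pattern in C# (not the answerer's actual system), a single consumer draining a BlockingCollection serializes all work for one queue; the BookingRequest type and ProcessBooking method are placeholders:
// One queue, one consumer thread: requests for this "product" are handled one at a time,
// so the conflict check and the insert can never interleave.
var queue = new BlockingCollection<BookingRequest>();

var consumer = Task.Run(() =>
{
    foreach (var request in queue.GetConsumingEnumerable())
    {
        ProcessBooking(request);   // check for conflicts and insert, sequentially
    }
});

// Producers (e.g. web requests) simply enqueue:
queue.Add(new BookingRequest(/* ... */));

// On shutdown:
queue.CompleteAdding();
consumer.Wait();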

Multi threading C# application with SQL Server database calls

I have a SQL Server database with 500,000 records in table main. There are also three other tables called child1, child2, and child3. The many to many relationships between child1, child2, child3, and main are implemented via the three relationship tables: main_child1_relationship, main_child2_relationship, and main_child3_relationship. I need to read the records in main, update main, and also insert into the relationship tables new rows as well as insert new records in the child tables. The records in the child tables have uniqueness constraints, so the pseudo-code for the actual calculation (CalculateDetails) would be something like:
for each record in main
{
find its child1 like qualities
for each one of its child1 qualities
{
find the record in child1 that matches that quality
if found
{
add a record to main_child1_relationship to connect the two records
}
else
{
create a new record in child1 for the quality mentioned
add a record to main_child1_relationship to connect the two records
}
}
...repeat the above for child2
...repeat the above for child3
}
This works fine as a single threaded app. But it is too slow. The processing in C# is pretty heavy duty and takes too long. I want to turn this into a multi-threaded app.
What is the best way to do this? We are using Linq to Sql.
So far my approach has been to create a new DataContext object for each batch of records from main and use ThreadPool.QueueUserWorkItem to process it. However, these batches are stepping on each other's toes, because one thread adds a record and then the next thread tries to add the same one and ... I am getting all kinds of interesting SQL Server deadlocks.
Here is the code:
int skip = 0;
List<int> thisBatch;
Queue<List<int>> allBatches = new Queue<List<int>>();
do
{
thisBatch = allIds
.Skip(skip)
.Take(numberOfRecordsToPullFromDBAtATime).ToList();
allBatches.Enqueue(thisBatch);
skip += numberOfRecordsToPullFromDBAtATime;
} while (thisBatch.Count() > 0);
while (allBatches.Count() > 0)
{
RRDataContext rrdc = new RRDataContext();
var currentBatch = allBatches.Dequeue();
lock (locker)
{
runningTasks++;
}
System.Threading.ThreadPool.QueueUserWorkItem(x =>
ProcessBatch(currentBatch, rrdc));
lock (locker)
{
while (runningTasks > MAX_NUMBER_OF_THREADS)
{
Monitor.Wait(locker);
UpdateGUI();
}
}
}
And here is ProcessBatch:
private static void ProcessBatch(
List<int> currentBatch, RRDataContext rrdc)
{
var topRecords = GetTopRecords(rrdc, currentBatch);
CalculateDetails(rrdc, topRecords);
rrdc.Dispose();
lock (locker)
{
runningTasks--;
Monitor.Pulse(locker);
};
}
And
private static List<Record> GetTopRecords(RecipeRelationshipsDataContext rrdc,
List<int> thisBatch)
{
List<Record> topRecords;
topRecords = rrdc.Records
.Where(x => thisBatch.Contains(x.Id))
.OrderBy(x => x.OrderByMe).ToList();
return topRecords;
}
CalculateDetails is best explained by the pseudo-code at the top.
I think there must be a better way to do this. Please help. Many thanks!
Here's my take on the problem:
When using multiple threads to insert/update/query data in SQL Server, or any database, deadlocks are a fact of life. You have to assume they will occur and handle them appropriately.
That's not to say we shouldn't attempt to limit the occurrence of deadlocks. It's easy to read up on the basic causes of deadlocks and take steps to prevent them, but SQL Server will always surprise you :-)
Some reasons for deadlocks:
Too many threads - try to limit the number of threads to a minimum, but of course we want more threads for maximum performance.
Not enough indexes. If selects and updates aren't selective enough SQL will take out larger range locks than is healthy. Try to specify appropriate indexes.
Too many indexes. Updating indexes causes deadlocks, so try to reduce indexes to the minimum required.
Transaction isolation level too high. The default isolation level when using .NET's TransactionScope is Serializable, whereas the SQL Server default is Read Committed. Reducing the isolation level can help a lot (if appropriate, of course); see the sketch just below this list.
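For example, a TransactionScope that explicitly asks for Read Committed instead of the Serializable default might look like this (sketch only):
var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,   // instead of the Serializable default
    Timeout = TransactionManager.DefaultTimeout
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... do the Linq to SQL work here ...
    scope.Complete();
}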
This is how I might tackle your problem:
I wouldn't roll my own threading solution; I would use the Task Parallel Library. My main method would look something like this:
using (var dc = new TestDataContext())
{
// Get all the ids of interest.
// I assume you mark successfully updated rows in some way
// in the update transaction.
List<int> ids = dc.TestItems.Where(...).Select(item => item.Id).ToList();
var problematicIds = new ConcurrentBag<ErrorType>(); // thread-safe, since parallel iterations write to it
// Either allow the TaskParallel library to select what it considers
// as the optimum degree of parallelism by omitting the
// ParallelOptions parameter, or specify what you want.
Parallel.ForEach(ids, new ParallelOptions {MaxDegreeOfParallelism = 8},
id => CalculateDetails(id, problematicIds));
}
Execute the CalculateDetails method with retries for deadlock failures
private static void CalculateDetails(int id, ConcurrentBag<ErrorType> problematicIds)
{
try
{
// Handle deadlocks
DeadlockRetryHelper.Execute(() => CalculateDetails(id));
}
catch (Exception e)
{
// Too many deadlock retries (or other exception).
// Record so we can diagnose problem or retry later
problematicIds.Add(new ErrorType(id, e));
}
}
The core CalculateDetails method
private static void CalculateDetails(int id)
{
// Creating a new DataContext is not expensive.
// No need to create outside of this method.
using (var dc = new TestDataContext())
{
// TODO: adjust IsolationLevel to minimize deadlocks
// If you don't need to change the isolation level
// then you can remove the TransactionScope altogether
using (var scope = new TransactionScope(
TransactionScopeOption.Required,
new TransactionOptions {IsolationLevel = IsolationLevel.Serializable}))
{
TestItem item = dc.TestItems.Single(i => i.Id == id);
// work done here
dc.SubmitChanges();
scope.Complete();
}
}
}
And of course my implementation of a deadlock retry helper
public static class DeadlockRetryHelper
{
private const int MaxRetries = 4;
private const int SqlDeadlock = 1205;
public static void Execute(Action action, int maxRetries = MaxRetries)
{
if (HasAmbientTransaction())
{
// Deadlock blows out containing transaction
// so no point retrying if already in tx.
action();
return;
}
int retries = 0;
while (retries < maxRetries)
{
try
{
action();
return;
}
catch (Exception e)
{
if (IsSqlDeadlock(e))
{
retries++;
// Delay subsequent retries - not sure if this helps or not
Thread.Sleep(100 * retries);
}
else
{
throw;
}
}
}
action();
}
private static bool HasAmbientTransaction()
{
return Transaction.Current != null;
}
private static bool IsSqlDeadlock(Exception exception)
{
if (exception == null)
{
return false;
}
var sqlException = exception as SqlException;
if (sqlException != null && sqlException.Number == SqlDeadlock)
{
return true;
}
if (exception.InnerException != null)
{
return IsSqlDeadlock(exception.InnerException);
}
return false;
}
}
One further possibility is to use a partitioning strategy
If your tables can naturally be partitioned into several distinct sets of data, then you can either use SQL Server partitioned tables and indexes, or you could manually split your existing tables into several sets of tables. I would recommend using SQL Server's partitioning, since the second option would be messy. Also built-in partitioning is only available on SQL Enterprise Edition.
If partitioning is possible for you, you could choose a partition scheme that breaks your data into, let's say, 8 distinct sets. Then you could use your original single-threaded code, but have 8 threads, each targeting a separate partition. Now there won't be any deadlocks (or at least only a minimal number of them).
I hope that makes sense.
Overview
The root of your problem is that the L2S DataContext, like the Entity Framework's ObjectContext, is not thread-safe. As explained in this MSDN forum exchange, support for asynchronous operations in the .NET ORM solutions is still pending as of .NET 4.0; you'll have to roll your own solution, which as you've discovered isn't always easy to do when your framework assumes single-threadedness.
I'll take this opportunity to note that L2S is built on top of ADO.NET, which itself fully supports asynchronous operation - personally, I would much prefer to deal directly with that lower layer and write the SQL myself, just to make sure that I fully understood what was transpiring over the network.
SQL Server Solution?
That being said, I have to ask - must this be a C# solution? If you can compose your solution out of a set of insert/update statements, you can just send over the SQL directly and your threading and performance problems vanish.* It seems to me that your problems are related not to the actual data transformations to be made, but center around making them performant from .NET. If .NET is removed from the equation, your task becomes simpler. After all, the best solution is often the one that has you writing the smallest amount of code, right? ;)
Even if your update/insert logic can't be expressed in a strictly set-relational manner, SQL Server does have a built-in mechanism for iterating over records and performing logic - while they are justly maligned for many use cases, cursors may in fact be appropriate for your task.
If this is a task that has to happen repeatedly, you could benefit greatly from coding it as a stored procedure.
*of course, long-running SQL brings its own problems like lock escalation and index usage that you'll have to contend with.
C# Solution
Of course, it may be that doing this in SQL is out of the question - maybe your code's decisions depend on data that comes from elsewhere, for example, or maybe your project has a strict 'no-SQL-allowed' convention. You mention some typical multithreading bugs, but without seeing your code I can't really be helpful with them specifically.
Doing this from C# is obviously viable, but you need to deal with the fact that a fixed amount of latency will exist for each and every call you make. You can mitigate the effects of network latency by using pooled connections, enabling multiple active result sets, and using the asynchronous Begin/End methods for executing your queries. Even with all of those, you will still have to accept that there is a cost to shipping data from SQL Server to your application.
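To make that concrete, pooling and MARS are plain connection-string settings, and the Begin/End (APM) methods live on SqlCommand; the connection string values below are placeholders:
// Pooling is on by default; MARS and (pre-.NET 4.5) asynchronous processing must be opted into.
const string connectionString =
    "Data Source=.;Initial Catalog=MyDb;Integrated Security=True;" +
    "MultipleActiveResultSets=True;Pooling=True;Asynchronous Processing=True";

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    var command = new SqlCommand("SELECT Id FROM main", connection);

    IAsyncResult asyncResult = command.BeginExecuteReader();   // kick off the query
    // ... do other useful work here ...
    using (SqlDataReader reader = command.EndExecuteReader(asyncResult))
    {
        while (reader.Read()) { /* consume rows */ }
    }
}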
One of the best ways to keep your code from stepping all over itself is to avoid sharing mutable data between threads as much as possible. That would mean not sharing the same DataContext across multiple threads. The next best approach is to lock critical sections of code that touch the shared data - lock blocks around all DataContext access, from the first read to the final write. That approach might just obviate the benefits of multithreading entirely; you can likely make your locking more fine-grained, but be ye warned that this is a path of pain.
Far better is to keep your operations separate from each other entirely. If you can partition your logic across 'main' records, that's ideal - that is to say, as long as there aren't relationships between the various child tables, and as long as one record in 'main' doesn't have implications for another, you can split your operations across multiple threads like this:
private IList<int> GetMainIds()
{
using (var context = new MyDataContext())
return context.Main.Select(m => m.Id).ToList();
}
private void FixUpSingleRecord(int mainRecordId)
{
using (var localContext = new MyDataContext())
{
var main = localContext.Main.FirstOrDefault(m => m.Id == mainRecordId);
if (main == null)
return;
foreach (var childOneQuality in main.ChildOneQualities)
{
// If child one is not found, create it
// Create the relationship if needed
}
// Repeat for ChildTwo and ChildThree
localContext.SaveChanges();
}
}
public void FixUpMain()
{
var ids = GetMainIds();
foreach (var id in ids)
{
var localId = id; // Avoid closing over an iteration member
ThreadPool.QueueUserWorkItem(delegate { FixUpSingleRecord(localId); });
}
}
Obviously this is as much a toy example as the pseudocode in your question, but hopefully it gets you thinking about how to scope your tasks such that there is no (or minimal) shared state between them. That, I think, will be the key to a correct C# solution.
EDIT Responding to updates and comments
If you're seeing data consistency issues, I'd advise enforcing transaction semantics - you can do this by using a System.Transactions.TransactionScope (add a reference to System.Transactions). Alternately, you might be able to do this on an ADO.NET level by accessing the inner connection and calling BeginTransaction on it (or whatever the DataConnection method is called).
You also mention deadlocks. That you're battling SQL Server deadlocks indicates that the actual SQL queries are stepping on each other's toes. Without knowing what is actually being sent over the wire, it's difficult to say in detail what's happening and how to fix it. Suffice to say that SQL deadlocks result from SQL queries, and not necessarily from C# threading constructs - you need to examine what exactly is going over the wire. My gut tells me that if each 'main' record is truly independent of the others, then there shouldn't be a need for row and table locks, and that Linq to SQL is likely the culprit here.
You can get a dump of the raw SQL emitted by L2S in your code by setting the DataContext.Log property to a TextWriter, e.g. Console.Out. Though I've never personally used it, I understand that LINQPad offers L2S facilities and you may be able to get at the SQL there, too.
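For example (a sketch, reusing the RRDataContext from the question):
using (var rrdc = new RRDataContext())
{
    rrdc.Log = Console.Out;   // every SQL statement L2S generates is written here
    // ... run the batch you want to inspect ...
}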
SQL Server Management Studio will get you the rest of the way there - using the Activity Monitor, you can watch for lock escalation in real time. Using the Query Analyzer, you can get a view of exactly how SQL Server will execute your queries. With those, you should be able to get a good notion of what your code is doing server-side, and in turn how to go about fixing it.
I would recommend moving all the XML processing into the SQL server, too. Not only will all your deadlocks disappear, but you will see such a boost in performance that you will never want to go back.
It will be best explained by an example. In this example I assume that the XML blob already is going into your main table (I call it closet). I will assume the following schema:
CREATE TABLE closet (id int PRIMARY KEY, xmldoc ntext)
CREATE TABLE shoe(id int PRIMARY KEY IDENTITY, color nvarchar(20))
CREATE TABLE closet_shoe_relationship (
closet_id int REFERENCES closet(id),
shoe_id int REFERENCES shoe(id)
)
And I expect that your data (main table only) initially looks like this:
INSERT INTO closet(id, xmldoc) VALUES (1, '<ROOT><shoe><color>blue</color></shoe></ROOT>')
INSERT INTO closet(id, xmldoc) VALUES (2, '<ROOT><shoe><color>red</color></shoe></ROOT>')
Then your whole task is as simple as the following:
INSERT INTO shoe(color) SELECT DISTINCT CAST(CAST(xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) AS color from closet
INSERT INTO closet_shoe_relationship(closet_id, shoe_id) SELECT closet.id, shoe.id FROM shoe JOIN closet ON CAST(CAST(closet.xmldoc AS xml).query('//shoe/color/text()') AS nvarchar) = shoe.color
But given that you will do a lot of similar processing, you can make your life easier by declaring your main blob as XML type, and further simplifying to this:
INSERT INTO shoe(color)
SELECT DISTINCT CAST(xmldoc.query('//shoe/color/text()') AS nvarchar)
FROM closet
INSERT INTO closet_shoe_relationship(closet_id, shoe_id)
SELECT closet.id, shoe.id
FROM shoe JOIN closet
ON CAST(xmldoc.query('//shoe/color/text()') AS nvarchar) = shoe.color
There are additional performance optimizations possible, like pre-computing repeatedly invoked XPath results into a temporary or permanent table, or converting the initial population of the main table into a BULK INSERT, but I don't expect that you will really need those to succeed.
SQL Server deadlocks are normal and to be expected in this type of scenario - MS's recommendation is that these should be handled on the application side rather than the DB side.
However, if you do need to make sure that a stored procedure is only called once, then you can use a SQL mutex via sp_getapplock. Here's an example of how to implement this:
BEGIN TRAN

DECLARE @mutex_result int;

EXEC @mutex_result = sp_getapplock @Resource = 'CheckSetFileTransferLock',
                                   @LockMode = 'Exclusive';

IF (@mutex_result < 0)
BEGIN
    ROLLBACK TRAN
END

-- do some stuff

EXEC @mutex_result = sp_releaseapplock @Resource = 'CheckSetFileTransferLock'

COMMIT TRAN
This may be obvious, but looping through each tuple and doing your work in your servlet container involves a lot of per-record overhead.
If possible, move some or all of that processing to the SQL server by rewriting your logic as one or more stored procedures.
If
You don't have a lot of time to spend on this issue and need to fix it right now
You are sure that your code is written so that different threads will NOT modify the same record
You are not afraid
Then ... you can just add WITH (NOLOCK) to your queries so that MSSQL doesn't take the locks.
Use it with caution :)
But anyway, you didn't tell us where the time is lost (in the single-threaded version). If it's in the code, I'd advise you to write everything directly in the DB to avoid continuous data exchange. If it's in the DB, I'd advise checking indexes (too many?), I/O, CPU, etc.
