Performing an "atomic" "IncreaseIf" operation on a database - C#

I need to perform an atomic operation: check the value of a field on an Entity Framework model and increase it if its value is 0.
I thought about transactions, something like:
bool controlPassed = false;
using (TransactionScope scope = new TransactionScope())
{
    var model = ...ModelEntities.FirstOrDefault(...);
    if (model.field == 0)
    {
        ++model.field;
        ...SaveChanges();
        controlPassed = true;
    }
    scope.Complete();
}
if (controlPassed)
{
    ...
    using (TransactionScope scope = new TransactionScope())
    {
        --model.field;
        ...SaveChanges();
        scope.Complete();
    }
}
Of course, everything in try catch and so on.
My question is: how would this work?
It is really hard to test.
I have a multithreaded application.
Is there a possibility that two or more threads would pass the control check (see that field == 0 and increase it)?
What would be blocked in the database (the whole database, the table, the row, the field)?
I can't let two or more threads be in the controlPassed section simultaneously.

Is there a possibility, that two or more threads would pass control
(check that field == 0 and increase it)?
You have a Serializable transaction (the default for TransactionScope). It means that two threads can both read field == 0, but immediately after that a deadlock happens: the transaction for the first thread holds a shared lock on the field and the transaction for the second thread holds another shared lock on the same field. Neither transaction can upgrade its lock to exclusive to save changes, because each is blocked by the shared lock held by the other. I think the same would happen for the RepeatableRead isolation level.
If you change the isolation level to ReadCommitted (the default for SaveChanges without TransactionScope when using MS SQL Server), the answer is again yes, but this time without a deadlock, because EF uses plain selects without any table hints - that means no lock on the record is held once the select completes. Only SaveChanges (the modification) locks records until the transaction commits or rolls back.
To lock the record with a select in a ReadCommitted transaction you must use a native SQL query with the UPDLOCK table hint (which locks the record for update during the select and holds the lock until the transaction ends). EF queries do not support table hints.
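For illustration, a minimal sketch of that approach, assuming a DbContext-style context (on an ObjectContext you would use ExecuteStoreQuery/ExecuteStoreCommand instead); the table name Models, the column Field and the modelId variable are placeholders, not names from the question:
bool controlPassed = false;
using (var scope = new TransactionScope())
{
    // UPDLOCK makes the select take an update lock that is held until the transaction
    // ends, so a second thread running the same select blocks here instead of also
    // seeing Field == 0.
    var current = context.Database.SqlQuery<int>(
        "SELECT Field FROM Models WITH (UPDLOCK) WHERE Id = @p0", modelId).Single();

    if (current == 0)
    {
        context.Database.ExecuteSqlCommand(
            "UPDATE Models SET Field = Field + 1 WHERE Id = @p0", modelId);
        controlPassed = true;
    }

    scope.Complete();
}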
Edit: I wrote a long article about pessimistic concurrency which describes why your solution doesn't work and what must be changed to make it work correctly.

Related

C# performing bulk update on table from multiple threads without a deadlock

I have written the following piece of code:
public void BulkUpdateItems(List<Items> items)
{
    var bulk = new BulkOperations();
    using (var trans = new TransactionScope())
    {
        using (SqlConnection conn = new SqlConnection(@"connstring"))
        {
            bulk.Setup()
                .ForCollection(items)
                .WithTable("Items")
                .AddColumn(x => x.QuantitySold)
                .BulkUpdate()
                .MatchTargetOn(x => x.ItemID)
                .Commit(conn);
        }
        trans.Complete();
    }
}
This uses the SQLBulkTools library. The problem is that when I run this procedure from multiple threads at a time, I run into deadlocks.
The error states that a certain process ID was deadlocked or something like that.
Is there any alternative way to perform a bulk update of one table from multiple threads efficiently?
Can someone help me out?
I don't know much about that API but a quick read suggests a few things you could try. I would try them in the order listed.
Use a smaller batch size, and/or set the batch timeout higher. This will let each thread take turns.
Use a temporary table. This will allow the threads to work independently.
Set the options to use a table lock. If you lock the whole table, different threads won't be able to lock different rows, so you shouldn't get any deadlocks. (A rough sketch combining this and the temp-table idea follows below.)
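A rough sketch of those last two ideas combined, using plain SqlBulkCopy and a temp staging table rather than SqlBulkTools; the Items/ItemID/QuantitySold names come from the question, everything else is illustrative:
public void BulkUpdateItems(List<Items> items, string connString)
{
    using (var conn = new SqlConnection(connString))
    {
        conn.Open();
        using (var tran = conn.BeginTransaction())
        {
            // stage the incoming rows in a temp table private to this connection
            using (var create = new SqlCommand(
                "CREATE TABLE #Stage (ItemID int PRIMARY KEY, QuantitySold int);", conn, tran))
                create.ExecuteNonQuery();

            var table = new DataTable();
            table.Columns.Add("ItemID", typeof(int));
            table.Columns.Add("QuantitySold", typeof(int));
            foreach (var item in items)
                table.Rows.Add(item.ItemID, item.QuantitySold);

            using (var bulk = new SqlBulkCopy(conn, SqlBulkCopyOptions.TableLock, tran))
            {
                bulk.DestinationTableName = "#Stage";   // TableLock here only speeds up the staging load
                bulk.WriteToServer(table);
            }

            // one set-based update; TABLOCKX on Items stops two writers interleaving row locks
            using (var update = new SqlCommand(
                @"UPDATE t SET t.QuantitySold = s.QuantitySold
                  FROM Items t WITH (TABLOCKX)
                  JOIN #Stage s ON s.ItemID = t.ItemID;", conn, tran))
                update.ExecuteNonQuery();

            tran.Commit();
        }
    }
}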
The deadlock message is coming from SQL Server - it means that one of your connections is waiting on a resource locked by another, and that second connection is waiting on a resource held on the first.
If you are trying to update the same table, you are likely running into a simple SQL locking issue that has nothing to do with C#. You need to think more thoroughly about the implications of doing a bulk update on multiple threads; it's probably (depending on the percentage of the table you are updating) better to do this on a single connection and use a queue-style mechanism to de-conflict the individual calls.
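One simple reading of that queue suggestion, assuming the BulkUpdateItems method from the question: let the worker threads only enqueue batches, and have a single consumer perform the bulk updates one at a time.
// producers just add batches; one background task drains the queue serially
var pending = new BlockingCollection<List<Items>>();

var writer = Task.Run(() =>
{
    foreach (var batch in pending.GetConsumingEnumerable())
        BulkUpdateItems(batch);   // now only ever called from one thread at a time
});

// on a worker thread:  pending.Add(batchOfItems);
// on shutdown:         pending.CompleteAdding(); writer.Wait();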
Try
lock (_sync)
{
    ....
}
where _sync is a static object shared by the threads. What this will do is: while one thread is executing the code within the curly braces, any other thread will wait until the first one is finished. That way, only one thread executes the block at a time.

Deadlock with only ONE Resource and Isolation Level Serializable...?

I use Entity Framework to process long-running tasks (10-30 secs on average). I have many worker instances, and each worker fetches the next task id from a database table and with that gets the work description for that id.
Of course, the access to the task table must be serialized so that each request from a worker gets a new id. I thought this would do it:
static int? GetNextDetailId()
{
    int? id = null;
    using (var ctx = Context.GetContext())
    using (var tsx = ctx.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
    {
        var obj = ctx.DbsInstrumentDetailRaw
                     .Where(x => x.ProcessState == ProcessState.ToBeProcessed)
                     .FirstOrDefault();
        if (obj != null)
        {
            id = obj.Id;
            obj.ProcessState = ProcessState.InProcessing;
            ctx.SaveChanges();
        }
        tsx.Commit();
    }
    return id;
} // GetNextDetailId
Unfortunately when I run it with 10 workers I nearly immediately get
Transaction (Process ID 65) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
I do not have any explanation for this behavior. I know deadlock situations: we have two or more resources and two or more processes that try to acquire the resources in different orders. But here we only have one resource! All I want is for the processes to have sequential access to this resource. So if A has a transaction open, B should simply wait until A commits or rolls back. That does not seem to happen here.
Can someone please
shed some light on what is going on here, to educate me.
Give a ( "THE?" ) solution to the problem. I assume this problem must be very common in programming.
Thanks
Martin
You can verify this using SQL Profiler to sniff the SQL statements that are executed on your SQL Server, but the issue is likely that, even though you are inside a transaction with the isolation level set to Serializable, the select does not take an exclusive lock. So two threads read the same row at the same time, and both then try to update it.
The best advice I've seen is that, if you need to control locking at this level, execute a stored procedure or SQL instead of attempting to use LINQ.
Locking a table with a select in Entity Framework
shed some light on what is going on here, to educate me.
ctx.DbsInstrumentDetailRaw.Where ... acquires a shared lock on the table. With serializable isolation level, this lock is held until the transaction is committed or rolled back.
ctx.SaveChanges() needs an exclusive lock to update the row.
Two or more transactions can simultaneously get a shared lock in step 1, but then none of them can get an exclusive lock in step 2. Deadlock.
Give a ( "THE?" ) solution to the problem.
I can think of 2 ways to solve this problem.
Change the order of operations: update one row, then return it. You will have to use a stored procedure to do it in EF (a sketch follows after this list).
Use a lower isolation level (e.g. repeatable read) and optimistic concurrency. You will not get deadlocks (shared locks will be released immediately after select). When 2 workers try to update the same row, one of them will get a concurrency exception.
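As a rough illustration of the first option, here is a raw SQL query standing in for the stored procedure, assuming EF6's Database.SqlQuery and the enum values from the question; because a single UPDATE both claims and returns the row, two workers can never get the same id:
static int? ClaimNextDetailId()
{
    using (var ctx = Context.GetContext())
    {
        // EF maps the extra arguments to @p0 and @p1 positionally
        return ctx.Database.SqlQuery<int?>(
            @"UPDATE TOP (1) InstrumentDetailRaw
              SET ProcessState = @p0
              OUTPUT inserted.Id
              WHERE ProcessState = @p1;",
            (int)ProcessState.InProcessing,
            (int)ProcessState.ToBeProcessed).FirstOrDefault();
    }
}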
OK, I now do this:
select 1 from InstrumentDetailRaw with (tablockx, holdlock) where 0 = 1
at the beginning of my transaction, which, according to this post:
Locking a table with a select in Entity Framework
does the trick. No deadlocks with 10 workers running for hours now.
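For reference, a sketch of where that statement sits, assuming the GetNextDetailId method from the question and EF6's ExecuteSqlCommand:
using (var ctx = Context.GetContext())
using (var tsx = ctx.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
{
    // exclusive table lock taken up front; other workers block here until this commits
    ctx.Database.ExecuteSqlCommand(
        "select 1 from InstrumentDetailRaw with (tablockx, holdlock) where 0 = 1");

    // ... the original Where / SaveChanges logic runs here, now serialized across workers ...

    tsx.Commit();
}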

Acquiring lock - C# SQL server

Database : SQL server 2005
Programming language : C#
I have a method that does some processing with the User object passed to it. I want to control the way this method behaves when it is called by multiple threads with the same user object. I have implemented simple locking that makes use of the database. I can't use C#'s lock statement because this method is in an API that will be deployed on different machines, but the database is centralized.
The following code shows what I have (exception handling omitted for clarity):
void Process(User user)
{
    using (var transaction = BeginTransaction())
    {
        if (LockUser())
        {
            try
            {
                /* Other processing code */
            }
            finally
            {
                UnLockUser();
            }
        }
    }
}
LockUser() inserts a new entry into a database table. This table has a unique constraint on the user id, so when a second thread tries to insert the same data, the constraint is violated and an exception is thrown. LockUser() catches it and returns false. UnLockUser() just deletes the entry from the lock table.
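A minimal sketch of what those two methods might look like, assuming a lock table named Locks with a unique UserId column; the table name and the error numbers 2627/2601 (unique constraint / unique index violation) are illustrative, not from the question:
bool LockUser(SqlConnection conn, SqlTransaction tx, int userId)
{
    try
    {
        using (var cmd = new SqlCommand(
            "INSERT INTO Locks (UserId) VALUES (@userId)", conn, tx))
        {
            cmd.Parameters.AddWithValue("@userId", userId);
            cmd.ExecuteNonQuery();
        }
        return true;                                  // this caller owns the lock
    }
    catch (SqlException ex)
    {
        if (ex.Number == 2627 || ex.Number == 2601)   // duplicate key
            return false;                             // someone else already holds it
        throw;
    }
}

void UnLockUser(SqlConnection conn, SqlTransaction tx, int userId)
{
    using (var cmd = new SqlCommand(
        "DELETE FROM Locks WHERE UserId = @userId", conn, tx))
    {
        cmd.Parameters.AddWithValue("@userId", userId);
        cmd.ExecuteNonQuery();
    }
}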
Note: Please don't consider the possibility of the lock not getting deleted correctly. We have a SQL job that cleans up items that have been locked for a long time.
Question
Consider two threads executing this method at the same time, both of which have started the transaction. Since the transaction is committed only after all the processing logic, will the transaction started on thread 2 see the data inserted by thread 1 into the lock table?
Is this locking logic OK? Do you see any problems with this approach?
If the acquisition of the lock - by virtue of inserting an entry into the database table - is part of the same transaction then either all or none of the changes of that transaction will become visible to the second thread. This is true for the default isolation level (ReadCommitted).
In other words: Whichever thread has a successful commit of that single transaction has also successfully acquired the lock (= inserted successfully the entry into the database).
Your code example is missing the handling of Commit()/Rollback(); make sure you consider this as part of your implementation.
It depends on the transaction isolation level that you use.
The default isolation (ReadCommitted) level assures that other connections cannot see the uncommitted changes that a connection is making.
When executing your SQL statement, you can explicitly acquire a lock by using locking hints.
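For example, a select with locking hints on the user row itself would serialize two callers without the extra lock table. This is only a sketch, assuming a Users table and an already-open connection and transaction:
// UPDLOCK + HOLDLOCK: the first transaction to run this keeps an update lock on the
// row until it commits, so a second caller issuing the same select blocks here.
using (var cmd = new SqlCommand(
    "SELECT UserId FROM Users WITH (UPDLOCK, HOLDLOCK) WHERE UserId = @userId",
    connection, transaction))
{
    cmd.Parameters.AddWithValue("@userId", user.Id);
    cmd.ExecuteScalar();
}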

LINQ2SQL performance with transactions

I'm having a major performance issue with LINQ2SQL and transactions. My code does the following using IDE generated LINQ2SQL code:
Run a stored proc checking for an existing record
Create the record if it doesn't exist
Run a stored proc that wraps its own code in a transaction
When I run the code with no transaction scope, I get 20 iterations per second. As soon as I wrap the code in a transaction scope, it drops to 3-4 iterations per second. I don't understand why the addition of a transaction at the top level reduces the performance by so much. Please help?
Pseudo stored proc with transaction:
begin transaction
update some_table_1;
insert into some_table_2;
commit transaction;
select some, return, values
Pseudo LINQ code without transaction:
var db = new SomeDbContext();
var exists = db.RecordExists(some arguments);
if (!exists)
{
    var record = new SomeRecord
    {
        // Assign property values
    };
    db.RecordsTable.InsertOnSubmit(record);
    db.SubmitChanges();
    var result = db.SomeStoredProcWithTransactions();
}
Pseudo LINQ code with transaction:
var db = new SomeDbContext();
var exists = db.RecordExists(some arguments);
if (!exists)
{
    using (var ts = new TransactionScope())
    {
        var record = new SomeRecord
        {
            // Assign property values
        };
        db.RecordsTable.InsertOnSubmit(record);
        db.SubmitChanges();
        var result = db.SomeStoredProcWithTransactions();
        ts.Complete();
    }
}
I know the transaction isn't being escalated to the DTC because I've disabled the DTC. SQL Profiler shows that several of the queries take much longer with the TransactionScope enabled, but I'm not sure why. The queries involved are very short-lived and I have indexes that I've verified are being used. I'm unable to determine why the addition of a parent transaction causes so much degradation in performance.
Any ideas?
EDIT:
I've traced the problem to the following query within the final stored procedure:
if exists
(
    select * from entries where
        ProfileID = @ProfileID and
        Created >= @PeriodStart and
        Created < @PeriodEnd
) set @Exists = 1;
If I add with(nolock) as shown below, the problem disappears.
if exists
(
    select * from entries with(nolock) where
        ProfileID = @ProfileID and
        Created >= @PeriodStart and
        Created < @PeriodEnd
) set @Exists = 1;
However, I'm concerned that doing so may cause problems down the road. Any advice?
One big thing changes as soon as you get a transaction: the isolation level. Is your database under heavy contention? If so: by default a TransactionScope uses the highest, "serializable", isolation level, which involves read locks, key-range locks, etc. If it can't acquire those locks immediately, it will slow down while it is blocked. You could investigate by reducing the isolation level of the transaction (via the constructor). For example (but pick your own isolation level):
using (var tran = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
{
    // code
    tran.Complete();
}
However, picking an isolation level is... tricky; serializable is the safest (hence the default). You can also use granular hints (but not via LINQ-to-SQL) such as NOLOCK and UPDLOCK to help control locking of specific tables.
You could also investigate whether the slowdown is due to trying to talk to DTC. Enable DTC and see if it speeds up. The LTM is good, but I've seen composite operations to a single database escalate to DTC before...
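If you do want a hint on that one existence check, one workaround (sketched here, not part of the answer above) is to run it as raw SQL through the DataContext, since a generated LINQ-to-SQL query cannot carry table hints; the {0}-style placeholders are the DataContext.ExecuteQuery parameter convention:
var exists = db.ExecuteQuery<int>(
    @"select count(*) from entries with (nolock)
      where ProfileID = {0} and Created >= {1} and Created < {2}",
    profileId, periodStart, periodEnd).Single() > 0;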
Although you are using a single datacontext, your code sample is likely to use more than one connection and that will escalate your transaction to a distributed transaction.
Try initializing your datacontext with an explicit db connection, or call db.Connection.Open() right after creating the datacontext. That removes the overhead of distributed transactions...
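A sketch of that suggestion against the pseudo code from the question; SomeDbContext is the generated DataContext, which also accepts an existing connection:
using (var conn = new SqlConnection(connectionString))
using (var db = new SomeDbContext(conn))
{
    db.Connection.Open();   // one connection for everything inside the scope
    using (var ts = new TransactionScope())
    {
        // ... the same RecordExists / InsertOnSubmit / stored proc calls as before ...
        ts.Complete();
    }
}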
Does the Stored Procedure you call participate in the ambient (parent) transaction? - that is the question.
It's likely that the Stored Procedure participates in the ambient transaction, which is causing the degradation. There's an MSDN article here discussing how they interrelate.
From the article:
"When a TransactionScope object joins an existing ambient transaction, disposing of the scope object may not end the transaction, unless the scope aborts the transaction. If the ambient transaction was created by a root scope, only when the root scope is disposed of, does Commit get called on the transaction. If the transaction was created manually, the transaction ends when it is either aborted, or committed by its creator."
There's also a serious-looking document on nested transactions, which looks directly applicable, located on MSDN here.
Note:
"If TransProc is called when a transaction is active, the nested transaction in TransProc is largely ignored, and its INSERT statements are committed or rolled back based on the final action taken for the outer transaction."
I think that explains the difference in performance - it's essentially the cost of maintaining the parent transaction. Kristofer's suggestion may help to reduce the overhead.

How does TransactionScope roll back transactions?

I'm writing an integration test where I insert a number of objects into a database and then check whether my method retrieves those objects.
My connection to the database is through NHibernate...and my usual method of creating such a test would be to do the following:
NHibernateSession.BeginTransaction();
//use nhibernate to insert objects into database
//retrieve objects via my method
//verify actual objects returned are the same as those inserted
NHibernateSession.RollbackTransaction();
However, I've recently found out about TransactionScope which apparently can be used for this very purpose...
Some example code I've found is as follows:
public static int AddDepartmentWithEmployees(Department dept)
{
    int res = 0;
    DepartmentAdapter deptAdapter = new DepartmentAdapter();
    EmployeeAdapter empAdapter = new EmployeeAdapter();
    using (TransactionScope txScope = new TransactionScope())
    {
        res += deptAdapter.Insert(dept.DepartmentName);
        // Custom method made to return the Department ID
        // after inserting the department ("identity column")
        dept.DepartmentID = deptAdapter.GetInsertReturnValue();
        foreach (Employee emp in dept.Employees)
        {
            emp.EmployeeDeptID = dept.DepartmentID;
            res += empAdapter.Insert(emp.EmployeeName, emp.EmployeeDeptID);
        }
        txScope.Complete();
    }
    return res;
}
I believe that if I don't include the line txScope.Complete(), the inserted data will be rolled back. But I don't understand how that is possible... how does the txScope object keep track of the deptAdapter and empAdapter objects and their transactions on the database?
I feel like I'm missing a piece of information here... am I really able to replace my BeginTransaction() and RollbackTransaction() calls by surrounding my code with a TransactionScope?
If not, how then does TransactionScope work to roll back transactions?
Essentially, TransactionScope doesn't track your adapters; it tracks database connections. When you open a DB connection, the connection checks whether there is an ambient transaction (a TransactionScope) and, if so, enlists with it. Caution: if there is more than one connection to the same SQL Server, this will escalate to a distributed transaction.
Because you're using a using block, you are ensuring Dispose will be called even if an exception occurs. So if Dispose runs before txScope.Complete() has been called, the TransactionScope tells the enlisted connections (or the DTC) to roll back their transactions.
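A small sketch of that behaviour with plain ADO.NET; the connection string and the insert target are placeholders:
using (var scope = new TransactionScope())
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();   // the connection sees the ambient transaction and enlists in it
    using (var cmd = new SqlCommand("insert into Department (DepartmentName) values ('Test')", conn))
        cmd.ExecuteNonQuery();

    // no scope.Complete() here, so disposing the scope tells the enlisted
    // connection to roll the insert back
}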
The TransactionScope class works with the Transaction class, which is thread-specific.
When the TransactionScope is created, it checks to see if there is a Transaction for the thread; if one exists then it uses that, otherwise, it creates a new one and pushes it onto the stack.
If it uses an existing one, it just increments a counter for releases (since you have to call Dispose on it). On the last release, if the Transaction was not committed, it rolls back all the work.
As for why classes seem to magically know about transactions, that is left as an implementation detail for those classes that wish to work with this model.
When you create your deptAdapter and empAdapter instances, they check to see if there is a current transaction on the thread (the static Current property on the Transaction class). If there is, they register themselves with the Transaction to take part in the commit/rollback sequence (which the Transaction controls, and may propagate to various transaction coordinators, such as kernel, distributed, etc.).
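To make that last point concrete, here is roughly how any class can register itself with the ambient transaction; this is a minimal volatile enlistment and nothing here is specific to the adapters in the example:
using System;
using System.Transactions;

class LoggingParticipant : IEnlistmentNotification
{
    public void Prepare(PreparingEnlistment e) { e.Prepared(); }
    public void Commit(Enlistment e)           { Console.WriteLine("committed");   e.Done(); }
    public void Rollback(Enlistment e)         { Console.WriteLine("rolled back"); e.Done(); }
    public void InDoubt(Enlistment e)          { e.Done(); }
}

// somewhere inside work done under a TransactionScope:
var ambient = Transaction.Current;   // null when no scope is active
if (ambient != null)
    ambient.EnlistVolatile(new LoggingParticipant(), EnlistmentOptions.None);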
