High-concurrency upserts cause deadlocks and query times of >30s - C#

I'm working on an application which uses EF6 for most database operations, and for the most part the database work is non-critical and isn't under a lot of load. There is one exception to this rule: we have a stream of up to roughly 100 events/s coming in, each of which needs to insert or update a row in a specific table based on a column value.
I'm a little weak with SQL in general, but I've written this query to insert or update and return the id of the element:
DECLARE @Id [int];
MERGE {tableName} WITH (UPDLOCK) AS target
USING (SELECT @MatchName AS id) AS source
ON source.id = target.MatchColumn
WHEN MATCHED
THEN UPDATE SET @Id = target.Id, ...
WHEN NOT MATCHED
THEN INSERT (...) VALUES (...);
IF @Id IS NULL
BEGIN
SELECT @Id = CAST(SCOPE_IDENTITY() AS [int]);
END;
SELECT @Id;
It runs inside an (EF) serializable transaction; it is the only thing that executes in an explicit transaction and the only code that updates this table (other code only reads it). If a transaction is rolled back by the database (EF throws an exception), it is retried immediately, up to 3 times.
The problem is that under higher load, so many things are trying to update this table at once that queries against it can start to take 30+ seconds (queries against other tables remain fine). I was under the impression that, even in a serializable transaction, this would only lock the rows selected by the merge's matching expression, and that it should therefore be a relatively quick operation.
I've been doing some research over the past few days; some people suggest that a HOLDLOCK hint alone is sufficient at the default isolation level, while others claim that a serializable transaction is necessary or you risk data integrity problems.
I was hoping someone could explain why the deadlocks and long waits might be happening, and in detail what locking strategy is optimal in this scenario.

By default, merge acquires updlocks, so with (updlock) is not doing anything for you. Changing your updlock to holdlock (or serializable) and running the statement in a transaction will guarantee that the proper locks are acquired and held for the duration of the operation.
To prevent concurrent sessions from inserting data with the same key, an incompatible lock must be acquired to ensure only one session can read the key and that lock must be held until the transaction completes.
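For illustration, here is a minimal sketch of the asker's upsert with the hint swapped to holdlock. The table dbo.Targets, its Id identity column, MatchColumn, and UpdatedAt are hypothetical stand-ins for the {tableName} placeholder, and an OUTPUT clause (which fires for both branches of a MERGE) is used in place of the SCOPE_IDENTITY() step:
DECLARE @MatchName nvarchar(100) = N'example';  -- hypothetical parameter
DECLARE @Ids TABLE (Id int);
-- HOLDLOCK (a synonym for SERIALIZABLE) keeps the key-range locks taken
-- during the matching phase until the transaction ends, closing the race.
MERGE dbo.Targets WITH (HOLDLOCK) AS target
USING (SELECT @MatchName AS id) AS source
    ON source.id = target.MatchColumn
WHEN MATCHED THEN
    UPDATE SET UpdatedAt = SYSUTCDATETIME()
WHEN NOT MATCHED THEN
    INSERT (MatchColumn, UpdatedAt) VALUES (source.id, SYSUTCDATETIME())
OUTPUT inserted.Id INTO @Ids (Id);  -- captures the Id from either branch
SELECT Id FROM @Ids;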
UPSERT Race Condition With Merge - Dan Guzman
Do you need to use set transaction isolation level serializable?
In this instance I don't think explicitly setting the transaction isolation level to serializable would make any difference, as long as the above code is all that runs in the transaction. merge acquires update locks by default, and with (holdlock) those key locks are held until the transaction completes; since update locks are incompatible with one another, only one session at a time can read a given key, which solves the race-condition issue as explained by Dan Guzman in the referenced article and excerpt above.
with (holdlock) is a table hint. Table hints override the default behavior of the statement.
If there were other statements in your transaction, then those would be affected by the transaction isolation level in effect, whether the database default or an explicitly set set transaction isolation level (which is session level), unless overridden with a table hint.
The most granular setting wins (sketch below):
lowest: Database (default is read committed)
middle: Session (set transaction isolation level ...)
highest: Table Hint (with (updlock, serializable))
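A hedged sketch of that precedence, using hypothetical dbo.MyTable, MatchColumn, and @MatchName names: the session stays at the read committed default, but the hint makes this one statement behave as serializable:
DECLARE @MatchName nvarchar(100) = N'example';  -- hypothetical
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- session-level setting
BEGIN TRANSACTION;
-- The table hint overrides the session setting, for this statement only.
SELECT Id
FROM dbo.MyTable WITH (UPDLOCK, SERIALIZABLE)
WHERE MatchColumn = @MatchName;
-- Any other statement here still runs at plain read committed.
COMMIT TRANSACTION;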
More on Transaction Isolation Levels:
SQL Server Isolation Levels: A Series - Paul White
Questions About T-SQL Transaction Isolation Levels You Were Too Shy to Ask - Robert Sheldon

Related

What type of SQL locks could be involved in the following queries executed inside of a TransactionScope which is ReadCommited?

So we have four statements (changed for the purposes of the question).
They are run inside a TransactionScope (ReadCommitted) and multiple processes could be calling the same stored procedure containing these statements at once:
SELECT @BathroomId = B.BathroomId
FROM Bathrooms B
WHERE B.BathroomSuite = @BathroomSuite AND B.SuiteIsAvailable = 1
(No indexes used at all)
SELECT @OrderReceiptId = O.OrderReceiptId
FROM Order O
WHERE O.OrderId = @OrderId
(Clustered Index)
IF ISNULL(@OrderReceiptId, -1) = -1
BEGIN
INSERT INTO [dbo].[OrderReceipt]
.....
(Clustered index on PK)
UPDATE Order
SET OrderReceiptId = SCOPE_IDENTITY()
WHERE OrderId = @OrderId
From my limited reading, I understand that only row locks should be used for the SELECTs (therefore limiting the impact of contention on these tables).
But, then, what lock(s) would be used for the INSERT/UPDATE and then what impact does this have on other processes vying with the overarching transaction?
Are we effectively gating these tables until the transaction has completed? Or just some of them, i.e. just those touched by the INSERT and the UPDATE, since in my mind the transaction really only relates to the INSERT and UPDATE (you can't roll back a SELECT, for example)?
Are other contending processes going to have to wait until the transaction completes (which is certainly not unreasonable, I think)?
The relationship between database locking and transactions is somewhat fuzzy to me, as is how this affects multiple callers of the stored procedure containing these statements.
N.B. Please ignore dodgy relationship between Order and OrderReceipt, it is definitely sub-optimal.
I think I am conflating Transaction as a locking mechanism (sort of like a thread lock used for thread synchronisation) and database level locking
EDIT: Yes I am conflating Transaction and DB locking (used together but slightly different responsibilities), any google 101 site tells me this. It's embarrassing but it will teach me for not having a mooch first.
If you are using SQL Server this probably depends on your snapshot isolation level.
Inserting something into the clustered index will lock the whole table, as far as I know. In other words, the first insert will block all other inserts until completed. Reading (SELECT) is a different story:
Check the properties/options of the database and look for "Is read committed snapshot on". If this setting is True, concurrent processes can read while you are holding that insert transaction. If not, all other reads will be blocked until the transaction completes.
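For example (a sketch; MyDb is a stand-in for your database name):
-- Check whether read committed snapshot is on
SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MyDb';
-- Turn it on (rolls back other connections' open transactions)
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;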
Note that turning this option on might affect the db performance in some scenarios, though I've personally not had any real issues with it.
If there are multiple processes calling the same SP, there will be no contention, locking, or blocking between the reads themselves.
Your SELECT takes a shared lock at the isolation level you are in, and the lock is released as soon as the row is read.
But you might see locking and blocking if there are update or delete processes trying to access the table at the same time.
You could also use trace flag 1200 to see all the locks taken; it writes them, in order, to the Messages tab:
DBCC TRACEON(1200,3604,-1);
SELECT ....
DBCC TRACEOFF(1200,3604,-1);

Does LINQ2SQL automatically put ExecuteCommand in a transaction

Does the documentation quotation from this answer: https://stackoverflow.com/a/542691/1011724
When you call SubmitChanges, LINQ to SQL checks to see whether the call is in the scope of a Transaction or if the Transaction property (IDbTransaction) is set to a user-started local transaction. If it finds neither transaction, LINQ to SQL starts a local transaction (IDbTransaction) and uses it to execute the generated SQL commands. When all SQL commands have been successfully completed, LINQ to SQL commits the local transaction and returns.
apply to the .ExecuteCommand() method? In other words, can I trust that the following delete is handled in a transaction and will automatically roll back if it fails, or do I need to manually tell it to use a transaction, and if so how? Should I use TransactionScope?
using (var context = Domain.Instance.GetContext())
{
    context.ExecuteCommand("DELETE FROM MyTable WHERE MyDateField = {0}", myDate);
}
Every SQL statement, whether or not wrapped in an explicit transaction, occurs transactionally. So, explicit transaction or not, individual statements are always atomic -- they either happen entirely or not at all. In the example above, either all rows that match the criterion are deleted or none of them are -- this is irrespective of what client code does. There is literally no way to get SQL Server to delete the rows partially; even yanking out the power cord will simply mean whatever was already done for the delete will be undone when the server restarts and reads the transaction log.
The only fly in the ointment is that which rows match can vary depending on how the statement locks. The statement logically happens in two phases, the first to determine which rows will be deleted and the second to actually delete them (while under an update lock). If you, say, issued this statement, and while it was running issued an INSERT that inserted a row matching the DELETE criterion, whether the row is in the database or not after the DELETE has finished depends on which transaction isolation level was in effect for the statements. So if you want practical guarantees about "all rows" being deleted, what client code does comes into scope. This goes a little beyond the scope of the original question, though.
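If you do need several statements to succeed or fail together, that is when an explicit transaction earns its keep. A minimal T-SQL sketch reusing the question's MyTable and date filter (the archive insert is a hypothetical companion statement):
DECLARE @myDate date = '2024-01-01';  -- hypothetical value
SET XACT_ABORT ON;  -- any runtime error rolls the whole transaction back
BEGIN TRANSACTION;
DELETE FROM MyTable WHERE MyDateField = @myDate;
-- hypothetical second statement that must be atomic with the delete
INSERT INTO MyTableArchiveLog (PurgedDate) VALUES (@myDate);
COMMIT TRANSACTION;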

Selecting all rows from a SQL Server table locked with the ReadCommitted IsolationMode

One of my client sessions creates an entry in the table inside a transaction and continues its processing. The transaction runs at the read committed isolation level. Meanwhile, another client session reports on all data in the table.
The select-all query is now completely blocked by the locked row (inserted by the other client).
How can I retrieve just the committed data in the select-all, instead of being blocked?
Any help would be greatly appreciated.
You haven't really been very specific about your usage scenario, but it is possible to get the data out of the table; there are just some severe caveats.
You can use the READ UNCOMMITTED isolation level, as marc_s said, which has the same effect as using WITH (NOLOCK) on all of the select statements within the transaction. If you want to read just that table without locks but treat all other reads in your transaction normally, you are better off putting the NOLOCK hint on the specific table within the query. So, for example:
SELECT *
FROM firstTable f
INNER JOIN secondTable s WITH (NOLOCK) ON f.[Key] = s.[Key]
That would issue normal read locks for firstTable but read secondTable with no locks. Reading with no locks can be quite dangerous, as you can get effectively corrupt data out. If the insert being performed reorders data and causes a page split, you can get the same row back twice, and all sorts of similar unpleasantness.
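The session-level equivalent, if you really do want every read in the transaction treated this way, is a sketch like the following (reusing the tables above):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT *
FROM firstTable f
INNER JOIN secondTable s ON f.[Key] = s.[Key];  -- both tables now read without shared locks
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- restore the default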
So it is possible, but it's not ideal, and you should be wary of the effect. Some good further reading is here, courtesy of Jason Strate.

DeadLock DbContext Concurrent Transactions

Hi, please see the deadlock graph portion in the image above. I have two transactions that update the same table. One of them is a long transaction that updates the table (same row) 5 times; the other updates it only once and is a small transaction of two DB hits. From the deadlock graph it appears that both transactions hold an X lock on different rows and are attempting to get a U lock. I can't understand why the shorter transaction has acquired an X lock when it hasn't fired its update query yet (it is the update query that causes the deadlock, which means it hadn't been fired yet). Any help would be highly appreciated.
1) I am using isolation level read committed
2) I can't understand how the second/first transaction can get an X lock while the other transaction already holds one on some row. I read that for an UPDATE query a U lock is applied first and then upgraded to an X lock on the particular row being updated. When one transaction holds an X lock, how can another transaction take a U lock, given that during a table scan (to determine the row to be updated) it cannot read a row that is X-locked by the other transaction? 3) Each transaction updates a different row of the same table. Is there any possible solution at the DB level without changing the isolation level?
I can't understand how the second/first transaction can get an X lock
while the other transaction already holds one on some row.
That is the magic behind databases and their performance. Locks can be issued at different levels, and if the second transaction didn't use a table scan it could take its X lock without conflicting with the first transaction. It is likely that the rows to update were located via an index seek, so no table scan happened, and there can be multiple concurrent X locks on different rows of your table.
I read that for an UPDATE query a U lock is applied first and then
upgraded to an X lock on the particular row being updated.
No. The UPDATE statement takes the X lock on the record it modifies directly; a U lock on the data read beforehand must be forced by your read query with a hint (that is what @Marc mentioned in his comment). As you already know, EF doesn't support this because it cannot use query hints.
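For reference, the pattern alluded to here — forcing the U lock at read time so a competing writer blocks early instead of deadlocking later — looks roughly like this in raw SQL (table and column names are hypothetical):
DECLARE @Id int = 1, @NewValue int = 42;  -- hypothetical values
DECLARE @Current int;
BEGIN TRANSACTION;
-- UPDLOCK takes an update lock on the row as it is read, so a second
-- session running the same block waits here rather than deadlocking.
SELECT @Current = SomeColumn
FROM dbo.SomeTable WITH (UPDLOCK, ROWLOCK)
WHERE Id = @Id;
UPDATE dbo.SomeTable SET SomeColumn = @NewValue WHERE Id = @Id;
COMMIT TRANSACTION;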

Should I be using SQL transactions, while reading records?

SQL transactions are used for inserts and updates, but should they be used for reading records?
If you are querying all the records in a single query, and pulling them back in one go, there is no need. Everything is wrapped up in an implicit transaction. That is to say, even if you get back one million records, and even if other processes are changing the records, you'll see what all one million records looked like at the same point in time.
The only times you would really need a transaction (and, often, a specific locking hint) in a read only process are:
- You read the records "piece-meal" and need nothing else to alter the values while you iterate through them. [Such as a connected recordset in ADO that you then cursor through.]
- You read some data, do some calculations, then read some related data, on the assumption nothing changed in the meantime (see the sketch after this list).
In short, you need transactions when you want other processes to be stopped from interfering with your data between SQL statements.
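For the second case above, a sketch at REPEATABLE READ (with hypothetical table and variable names) shows the shape of the guarantee:
DECLARE @ProductId int = 1, @Before money, @After money;  -- hypothetical
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
-- Shared locks taken here are held to the end of the transaction,
-- so no other session can change this row in the meantime.
SELECT @Before = UnitPrice FROM dbo.Prices WHERE ProductId = @ProductId;
-- ... calculations ...
-- Re-reading the same row is guaranteed to return the same value.
SELECT @After = UnitPrice FROM dbo.Prices WHERE ProductId = @ProductId;
COMMIT TRANSACTION;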
Transaction wrapping is not needed for pure reads.
Within your SQL statement, lock hints should take care of returning proper data to you (http://msdn.microsoft.com/en-us/library/aa213026%28SQL.80%29.aspx).
On a server level, you can set Transaction Isolation levels (http://msdn.microsoft.com/en-us/library/ms173763.aspx).
Edit
Explaining pure reads
If your SQL consists only of reads like this, you do not need to wrap it in a transaction:
SELECT Col1, Col2
FROM Table1
INNER JOIN Table2
ON Table1.Id = Table2.Table1Id
If you are reading results that can be affected by other transactions running in parallel, then you must wrap the statements in a transaction. For example:
DECLARE @Balance money
BEGIN TRANSACTION
INSERT INTO AccountTransactions (Type, Amount) VALUES ('Credit', 43.21)
UPDATE AccountSummary SET Balance = Balance + 43.21
SELECT @Balance = Balance FROM AccountSummary
COMMIT TRANSACTION
Really, you are just returning the balance, but the entire monetary transaction has to work in two places.
If you need information that is up to date to the millisecond, you can use a transaction constructed with a TransactionOptions having an IsolationLevel of Serializable.
This would affect performance, as it will lock the table (or parts of it), so you need to figure out whether you really need it.
For most uses, if you are doing a read, you do not need to wrap a transaction around it (assuming you are only doing reads in the one operation).
It really depends on your application, what data it requires and how it uses it.
For example, if you do a read and depending on the results you do a write or update, but it is critical that the data you just read is current, you should probably wrap the whole logic into a single transaction.
No, transactions are not generally needed to read data, and they will slow down your reads as well.
I would suggest you read up on the term ATOMIC. This will help you understand what transactions are for.
It's possible to use transactions for reads, but what would be the purpose?
You can set the appropriate isolation level for an entire SQL Server session by using the SET TRANSACTION ISOLATION LEVEL statement.
This is the syntax from SQL Server Books Online:
SET TRANSACTION ISOLATION LEVEL
{
READ COMMITTED
| READ UNCOMMITTED
| REPEATABLE READ
| SERIALIZABLE
}
Locking in Microsoft SQL Server.
When you have modified something in a transaction, you can use a read statement to check whether the operation took effect, just before you commit.
Transactions are meant to avoid concurrency issues when one logical operation actually maps to several SQL queries. For example, if you are transferring money from one bank account to another, you will first subtract the amount from one account and then add it to the other (or vice versa). But if some error occurs in between, your database would be left in an invalid state (you may have subtracted the amount from one account but not added it to the other). So if you are reading all your data in one query, you don't need a transaction. A minimal sketch of the transfer case follows.
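The account table and ids here are hypothetical, but the shape is the classic one:
DECLARE @From int = 1, @To int = 2, @Amount money = 100.00;  -- hypothetical
SET XACT_ABORT ON;  -- any runtime error rolls the whole transaction back
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - @Amount WHERE AccountId = @From;
UPDATE Accounts SET Balance = Balance + @Amount WHERE AccountId = @To;
-- Either both rows change or, on failure, neither does.
COMMIT TRANSACTION;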
