I am working on a project with 2 applications developed in C# (.NET framework 4) -
WCF service (exposed to customers)
ASP.NET webforms application (Head office use).
Both applications can select and update rows from a common “accounts” table in a SQL Server 2005 database. Each row in the accounts table holds a customer balance.
The business logic for a request in both applications involves selecting a row from the "accounts" table, doing some processing based on the balance, and then updating the balance in the database. The processing between the select and the update cannot participate in a transaction.
I realized that between selecting the row and updating it, the row could be selected and updated by another request from the same or a different application.
I found this issue described in the article below; I am referring to the second scenario of a "lost update".
http://www.codeproject.com/Articles/342248/Locks-and-Duration-of-Transactions-in-MS-SQL-Serve
The second scenario is when one transaction (Transaction A) reads a
record and retrieve the value into a local variable and that same
record will be updated by another transaction (Transaction B). And
later Transaction A will update the record using the value in the
local variable. In this scenario the update done by Transaction B can
be considered as a "Lost Update".
I am looking for a way to prevent the above situation and to prevent the balance from becoming negative if multiple concurrent requests are received for the same row. A row should be selected and updated by only a single request (from either application) at a time to ensure the balance is consistent.
I am thinking along the lines of blocking access to a row as soon as it has been selected by one request. Based on my research below are my observations.
Isolation levels
With 'Repeatable read' isolation level it is possible for 2 transactions to select a common row.
I tested this by opening 2 SSMS windows. In both windows I started a transaction with the Repeatable Read isolation level, followed by a select on a common row. I was able to select the row in each transaction.
Next I tried to update the same row from each transaction. The statements kept running for a few seconds. Then the update from the 1st transaction succeeded, while the update from the 2nd transaction failed with the message below.
Error 1205 : Transaction (Process ID) was deadlocked on lock resources
with another process and has been chosen as the deadlock victim. Rerun
the transaction.
So if I am using a transaction with Repeatable Read, it should not be possible for 2 concurrent transactions to update the same row; SQL Server automatically chooses to roll back one of the transactions. Is this correct?
But I would also like to avoid the deadlock error by allowing a particular row to be selected by a single transaction only.
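For reference, a minimal sketch of the repro described above (the table and column names are illustrative, not from the actual schema):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;

-- Run this in both SSMS windows: both SELECTs succeed, and each session
-- keeps a shared lock on the row until its transaction ends.
SELECT Balance FROM dbo.Accounts WHERE AccountId = 42;

-- Then run this in both windows: each UPDATE waits on the other session's
-- shared lock, and SQL Server resolves the cycle by killing one session
-- with error 1205 (deadlock victim).
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 42;

COMMIT TRANSACTION;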
Rowlock
I found the answer below on Stack Overflow that mentions the use of the ROWLOCK hint to prevent deadlocks (see the comment on the accepted answer).
Minimum transaction isolation level to avoid "Lost Updates"
I started a transaction and used a select statement with ROWLOCK and UPDLOCK. Then in a new SSMS window, I started another transaction and tried to use the same select query (with same locks). This time I was not able to select the row. The statement kept running in the new SSMS window.
So the use of ROWLOCK and UPDLOCK inside a transaction seems to block the row for other SELECT statements that use the same lock hints.
I would appreciate it if someone could answer the below questions.
Are my observations regarding isolation levels and rowlock correct?
For the scenario that I described should I use ROWLOCK and UPDLOCK hints to block access to a row? If not what is the correct approach?
I am planning to place my select and update code in a transaction. The first select query in the transaction will use the ROWLOCK and UPDLOCK hints. This will prevent the record from being selected by another transaction that uses select with the same locks to retrieve the same row.
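For what it's worth, here is a minimal sketch of that pattern, assuming an accounts table of the form dbo.Accounts(AccountId, Balance) (the names are illustrative):

DECLARE @AccountId INT = 42;
DECLARE @Amount    DECIMAL(18, 2) = 100.00;
DECLARE @Balance   DECIMAL(18, 2);

BEGIN TRANSACTION;

-- UPDLOCK makes the read take an update lock on the row, so a concurrent
-- request running the same SELECT blocks here until this transaction ends.
SELECT @Balance = Balance
FROM dbo.Accounts WITH (UPDLOCK, ROWLOCK)
WHERE AccountId = @AccountId;

-- ... business processing based on @Balance ...

IF @Balance >= @Amount
    UPDATE dbo.Accounts
    SET Balance = Balance - @Amount
    WHERE AccountId = @AccountId;

COMMIT TRANSACTION;

With this pattern, the second request that runs the same SELECT WITH (UPDLOCK, ROWLOCK) simply waits at the SELECT until the first transaction commits or rolls back, rather than deadlocking later at the UPDATE.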
I would suggest the SNAPSHOT isolation level. It is very similar to Oracle's lock management.
See http://www.databasejournal.com/features/mssql/snapshot-isolation-level-in-sql-server-what-why-and-how-part-1.html
If your code is not too complicated, you can probably implement this without any changes. Bear in mind that some visibility may be affected (i.e. dirty reads may not give dirty data).
I find this blanket system easier and more precise than using query hints all over the place.
Configure the database using:
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON
Then use this to prefix your transaction statements:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
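For instance, a balance update under SNAPSHOT isolation could look roughly like this (table and column names are borrowed from the question purely for illustration). If two snapshot transactions read the same row and both try to update it, the one that writes second fails with error 3960 (snapshot update conflict) and is rolled back, so the lost update is prevented as long as the application catches the error and retries:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

DECLARE @Balance DECIMAL(18, 2);

BEGIN TRANSACTION;

SELECT @Balance = Balance
FROM dbo.Accounts
WHERE AccountId = 42;          -- reads a consistent row version, no shared lock held

-- ... processing based on @Balance ...

UPDATE dbo.Accounts
SET Balance = @Balance - 100
WHERE AccountId = 42;          -- a conflicting concurrent update raises error 3960 here

COMMIT TRANSACTION;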
Our production setup is an application server whose applications connect to a SQL Server 2016 database. On the application server there are several IIS applications that run under a gMSA account. The gMSA account has db_datawriter and db_datareader privileges on the database.
Our team has db_datareader privileges on the same SQL Server database. We require this for production-support purposes.
We recently had an incident where a team member invoked a query on SQL Server Management Studio on their local machine:
SELECT * FROM [DatabaseA].[dbo].[TableA] order by CreateDt desc;
TableA has about 1.4m records and there are multiple blob-type columns. CreateDt is a DATETIME2 column.
We have RedGate SQL Monitor configured for the SQL Server database server. This raised a long-running query alert; the query ran for 1738 seconds.
At the same time one of our web applications (.NET 4.6) which exclusively inserts new records to TableA was experiencing constant query timeout errors:
Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
These errors occurred for almost the exact same 1738 second period. This leads me to believe these are connected.
My understanding is that a SELECT query only takes shared locks and would not block access to this table for another connection. Is my understanding correct here?
My question is: is db_datareader safe for team members? Is there a lesser privilege that would allow reading data but with absolutely no way to create blocking behaviour?
The presence of SELECT * ("select star") in a query generally prevents the use of a covering index and forces a scan of the whole table.
With many LOB columns (BLOB/CLOB/NCLOB equivalents) and many rows, the ORDER BY clause will take a long time to:
generate the entries
sort them on CreateDt
So a read (shared) lock is held while all the data in the table is read. This lock is compatible with other shared locks but prevents an exclusive lock from being taken to modify data (INSERT, UPDATE, DELETE). This guarantees to readers that the data will not be modified underneath them.
This locking technique is known as pessimistic locking: the locks are taken before the query starts executing and released at the end, so readers block writers and writers block everybody.
The other technique, which SQL Server also supports, is called optimistic locking. It works on a copy (version) of the data, without taking read locks, and verifies at the end of execution that the data involved in the writes has not been modified since the beginning. So there is much less blocking...
To switch to this optimistic locking, you can either allow it (per transaction, via SNAPSHOT isolation) or force it for all READ COMMITTED reads:
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;   -- allows SNAPSHOT isolation, opted into per transaction
ALTER DATABASE CURRENT SET READ_COMMITTED_SNAPSHOT ON;    -- makes READ COMMITTED use row versioning by default
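As a rough illustration of the difference for the query in the question (the INSERT column list is trimmed to CreateDt only; in reality the application supplies all required columns):

-- Session 1 (support), the long-running query from the question:
SELECT * FROM [DatabaseA].[dbo].[TableA] ORDER BY CreateDt DESC;

-- Session 2 (web application), while session 1 is still reading/sorting:
INSERT INTO [DatabaseA].[dbo].[TableA] (CreateDt) VALUES (SYSDATETIME());
-- Under plain READ COMMITTED this insert can wait behind the shared locks and
-- time out; with READ_COMMITTED_SNAPSHOT ON the read takes no shared locks,
-- so the insert proceeds immediately.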
In SQL Server, under the default READ COMMITTED isolation level, writers block readers and readers block writers.
This query doesn't have a WHERE clause and will touch the entire table, probably starting with IS (intent shared) locks and eventually escalating to a table-level shared lock that updates/inserts/deletes can't get past while it is held. That lock is likely held for the duration of the very long sort the ORDER BY causes.
It can be bypassed in several ways, but I don't assume you're actually after how, seeing as whoever ran the query was probably not really thinking straight anyway, and this is not a regular occurrence.
Nevertheless, here are some ways to bypass:
Read Committed Snapshot Isolation
WITH (NOLOCK). But only if you don't really care about the data that is retrieved, as it can return rows twice, return rows that were never committed, and skip rows altogether.
Reducing the columns you return and reading from a non-clustered index instead.
But to answer your question, yes selects can block inserts.
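A hedged sketch that combines the last two options for ad-hoc support queries (the Id column and the suggested index are assumptions; only TableA and CreateDt come from the question):

-- Only the columns needed for support work, no LOB columns, and a row cap.
-- NOLOCK means a dirty read: rows can be missed, duplicated, or uncommitted.
SELECT TOP (100) Id, CreateDt
FROM [DatabaseA].[dbo].[TableA] WITH (NOLOCK)
ORDER BY CreateDt DESC;
-- A nonclustered index on CreateDt (e.g. including Id) would let this read
-- avoid scanning and sorting the whole clustered index.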
Does a transaction lock my table when I'm running multiple queries?
For example: if another user tries to send data at the same time as my transaction is running, what will happen?
Also, how can I avoid this while still being sure that all the data has been inserted into the database successfully?
BEGIN TRAN;
INSERT INTO Customers (Name) VALUES (@Name1);
UPDATE CustomerTrans
SET CustomerName = @Name2;
COMMIT;
You have to implement the transaction smartly. Below are some performance-related points:
Locking, optimistic vs. pessimistic: with pessimistic locking, readers and writers block each other (and locks can escalate up to the whole table), whereas with optimistic (row-versioning) locking only the specific rows being written can conflict.
Isolation level, READ COMMITTED vs. READ UNCOMMITTED: when the table is locked, it depends on your business scenario; if it allows it, you can go for dirty reads using WITH (NOLOCK).
Use a WHERE clause in updates and index the filtered columns properly. For any heavy query, check the query plan.
The transaction timeout should be short, so that if the table is locked an error is thrown quickly and you can retry in the catch block (see the sketch below).
These are a few things you can do.
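A minimal T-SQL retry sketch for the last point (the same idea can live in a C# catch block instead; the table, column, and parameter names are illustrative, and THROW assumes SQL Server 2012 or later):

-- Illustrative names: dbo.CustomerTrans(CustomerId, CustomerName).
DECLARE @CustomerId INT = 1,
        @Name2      NVARCHAR(100) = N'New name',
        @Retry      INT = 3;

WHILE @Retry > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        UPDATE dbo.CustomerTrans
        SET CustomerName = @Name2
        WHERE CustomerId = @CustomerId;

        COMMIT TRANSACTION;
        SET @Retry = 0;                      -- success: stop looping
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() IN (1205, 1222)    -- deadlock victim or lock timeout
            SET @Retry = @Retry - 1;         -- try again
        ELSE
            THROW;                           -- unexpected error: re-raise
    END CATCH;
END;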
You cannot prevent multiple users from loading data into the database, and it is neither feasible nor sensible to lock a table every time a single user requests it. Actually, you do not have to worry about it, because the database itself provides mechanisms to avoid such issues. I would recommend reading up on the ACID properties:
Atomicity
Consistency
Isolation
Durability
What may happen is that your read is blocked: basically, you cannot read the data until the user who is inserting it commits. Conversely, even if you have finished inserting data but have not yet committed, there is a fair chance that other users will not see your changes.
In many database systems, DDL operations (creation, removal, etc.) commit implicitly as soon as they run, whereas DML operations (UPDATE, INSERT, DELETE, etc.) inside a transaction only become permanent when the transaction is committed.
My scenario is common:
I have a stored procedure that needs to update multiple tables.
If one of the updates fails, all the updates should be rolled back.
The straightforward answer is to include all the updates in one transaction and just roll that back. However, in a system like ours, this causes concurrency issues.
When we break the updates into multiple short transactions, we get a throughput of ~30 concurrent executions per second before deadlocking issues start to emerge.
If we put everything into one transaction that spans all of them, we get ~2 concurrent executions per second before deadlocks show up.
In our case, we place a try-catch block around every short transaction and manually DELETE/UPDATE back the changes from the previous ones, so essentially we mimic transaction behaviour in a very expensive way...
It works all right, since it is well written and does not get many "rollbacks"...
One thing this approach cannot handle at all is a command timeout from the web server/client.
I have read extensively in many forums and blogs and scanned through MSDN, and I cannot find a good solution. Many have presented the problem, but I have yet to see a good solution.
The question is this: is there ANY solution to this issue that allows a stable rollback of updates to multiple tables, without requiring exclusive locks on all of the rows for the entire duration of the long transaction?
Assume that it is not an optimization issue. The tables are probably close to maximum optimization and can give very high throughput as long as deadlocks do not hit them. There are no table locks/page locks etc., only row locks on updates, but when you have so many concurrent sessions, some of them need to update the same row...
It can be via SQL, client-side C#, or server-side C# (extending SQL Server?).
Is there such a solution in any book/blog that I have not found?
We are using SQL Server 2008 R2, with a .NET client/web server connecting to it.
Code example:
CREATE PROCEDURE sptest AS
BEGIN
    BEGIN TRANSACTION;
    UPDATE table1 SET ...;
    UPDATE table2 SET ...;
    COMMIT TRANSACTION;
END
In this case, if sptest is run twice, the second instance cannot update table 1 until instance 1 has committed.
Compare this to:
CREATE PROCEDURE sptest2 AS
BEGIN
    UPDATE table1 SET ...;
    UPDATE table2 SET ...;
END
sptest2 has a much higher throughput, but it has a chance of corrupting the data.
This is what we are trying to solve. Is there even a theoretical solution to this?
Thanks,
JS
I would say that you should dig deeper to find out why the deadlocks occur. Possibly you should change the order of the updates to avoid them; maybe some index is "guilty".
You cannot roll back changes if other transactions can change the data, so you need to hold update locks on the affected rows. But you can use the snapshot isolation level to allow consistent reads before the update commits.
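To illustrate the "order of updates" point, here is a hedged sketch using the table names from the question's pseudocode (the columns and parameters are invented): if every code path that touches both tables updates them in the same order, two sessions may still block each other briefly, but they can no longer deadlock on that pair of rows.

-- Every procedure/batch agrees on the same order: table1 first, then table2.
DECLARE @Id INT = 1, @Value1 INT = 10, @Value2 INT = 20;

BEGIN TRANSACTION;

UPDATE table1 SET SomeColumn = @Value1 WHERE Id = @Id;   -- hypothetical columns
UPDATE table2 SET SomeColumn = @Value2 WHERE Id = @Id;

COMMIT TRANSACTION;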
For all inner-joined tables that are mostly static, or where dirty data is highly unlikely to affect the result of the query, you can apply:
INNER JOIN LookupTable lut WITH (NOLOCK) ON lut.ID = SomeOtherTableID
This tells the query that I do not care about in-flight updates to LookupTable.
This can reduce your issue in most cases. For more difficult deadlocks, I have implemented a deadlock-graph alert that is generated and emailed when a deadlock occurs and contains all the detailed information about the deadlock.
I have a situation where I am using a transaction scope in .NET.
Within it are multiple method calls; the first ones perform database updates, and the last one reads the database.
My question is: will the database reads pick up the changes made by the earlier method calls that update the database? (Note there are commits in these methods, but they are not truly committed until the transaction scope completes.)
E.g., using TransactionScope:
{
    Method 1 (insert new comment into database);
    Method 2 (retrieve all comments from database);
    Complete();
}
Will method 2 results include the method 1 insert?
The thing that is confusing me is that I have run loads of tests, and sometimes the update is there, sometimes it's not!?!
I am aware there are isolation levels (at a high level); is there one that would allow reads of uncommitted data ONLY within the TransactionScope?
Any and all help greatly appreciated......
You can do any operations you want on the database (MS SQL), and even before you call
transaction.Commit()
your own changes are visible to you.
Even if you insert a NEW record in one transaction, you can read its value within that same transaction (of course, provided you don't Rollback() it).
Yes, this is the purpose of transactions. Think about the situation where you have 2 tables and one has a foreign key to the other. In your transaction, you insert into the first and then into the second with a foreign key referencing your first insert, and it works. If the data were not available to you, the transaction would be pointless: it would be one operation at a time, each of which is already atomic, and thus negate the need for transactions.
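The same thing can be seen directly in T-SQL: within a single transaction, a SELECT on the same connection already sees the rows it has just inserted, even though other sessions will not see them until COMMIT (dbo.Comments and its columns are made up for the example):

BEGIN TRANSACTION;

INSERT INTO dbo.Comments (CommentText)
VALUES (N'New comment');

-- Same connection, same transaction: this SELECT includes the row inserted
-- above, even though it has not been committed yet.
SELECT CommentId, CommentText
FROM dbo.Comments;

COMMIT TRANSACTION;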
Here is code that modifies a table in one transaction. As far as I know, with IsolationLevel.Serializable the read should not be blocked, but I can't select records from the table while it runs. How can I run the transaction without blocking selects from the table?
TransactionOptions opt = new TransactionOptions();
opt.IsolationLevel = IsolationLevel.Serializable;
using (TransactionScope scope = new TransactionScope(
TransactionScopeOption.Required, opt))
{
// inserts record into table
myObj.MyAction();
// trying to select the table from Management Studio
myObj2.MyAction();
scope.Complete();
}
Have a look at http://msdn.microsoft.com/en-us/library/ms173763.aspx for an explanation of the isolation levels in SQL Server. SERIALIZABLE offers the highest level of isolation and takes key-range locks, which are held until the transaction completes. You'll have to use a lower isolation level to allow concurrent reads during your transaction.
It doesn't matter what isolation level your (insert, update, etc) code is running under - it matters what isolation level the SELECT is running under.
By default, this is READ COMMITTED - so your SELECT query is unable to proceed whilst there is *un*committed data in the table. You can change the isolation level that the select is running under using SET TRANSACTION ISOLATION LEVEL to allow it to READ UNCOMMITTED. Or specify a table hint (NOLOCK).
But whatever you do, it has to be done to the connection/session where the select is running. There's no way for you to tell SQL Server "Please, ignore the settings that other connections have set, just break their expectations".
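For example, the Management Studio session doing the ad-hoc SELECT could run either of these (dbo.MyTable stands in for whatever table myObj.MyAction() writes to):

-- Option 1: change the isolation level for the reading session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT * FROM dbo.MyTable;

-- Option 2: hint a single query instead.
SELECT * FROM dbo.MyTable WITH (NOLOCK);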
If you generally want selects to be able to proceed on a database-wide basis, you might look into turning on READ_COMMITTED_SNAPSHOT. This is a global change to the database, not something that can or should be toggled on or off for a single statement or set of statements, but it then allows READ COMMITTED queries to continue without requiring shared locks.
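If that route fits, it is a one-time, database-wide change (MyDatabase is a placeholder; enabling it needs a moment when no other connections are active in the database):

ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;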
Serializable is the highest isolation level. It holds the most restrictive locks.
What are you trying to protect against with an isolation level of Serializable?
Read Committed Snapshot might be more appropriate, but we would need more information to be sure.