Our production setup is an application server whose applications connect to a SQL Server 2016 database. The application server hosts several IIS applications that run under a GMSA account. The GMSA account has db_datawriter and db_datareader privileges on the database.
Our team has db_datareader privileges on the same SQL Server database, which we require for production support purposes.
We recently had an incident where a team member ran a query from SQL Server Management Studio on their local machine:
SELECT * FROM [DatabaseA].[dbo].[TableA] order by CreateDt desc;
TableA has about 1.4 million records and contains multiple blob-type columns. CreateDt is a DATETIME2 column.
We have RedGate SQL Monitor configured for the SQL Server database server. It raised a long-running query alert for a query that ran for 1738 seconds.
At the same time, one of our web applications (.NET 4.6), which exclusively inserts new records into TableA, was experiencing constant query timeout errors:
Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
These errors occurred over almost exactly the same 1738-second period, which leads me to believe the two are connected.
My understanding is that a SELECT query only takes shared locks and would not block access to the table for another connection. Is my understanding correct here?
My question is: is db_datareader safe to grant to team members? Is there a lesser privilege that would allow reading data with absolutely no way of creating blocking behaviour?
The presence of SELECT * ("SELECT star") in a query generally means no covering index can be used, so the whole table is scanned.
With many LOB columns (BLOBs, CLOBs or NCLOBs) and many rows, the ORDER BY clause will take a long time to:
generate the rows
sort them on CreateDt
So a read lock (shared lock) is held while all the data in the table is read. This lock is compatible with other shared locks, but it prevents an exclusive lock from being taken to modify data (INSERT, UPDATE, DELETE). This guarantees to other users that the data won't be modified while it is being read.
This locking technique is known as pessimistic locking. The locks are taken before the query begins executing and released at the end, so readers block writers and writers block everyone.
The other technique, which SQL Server also supports, is called optimistic locking. It works on a copy of the data (row versions), without read locks, and verifies at the end of execution that the data involved in writes has not been modified since the beginning. So there is much less blocking...
To use optimistic locking you have the choice to allow it per session or to force it as the default:
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE CURRENT SET READ_COMMITTED_SNAPSHOT ON;
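With ALLOW_SNAPSHOT_ISOLATION, each session still has to opt in explicitly; a minimal sketch using the table from the question:

-- Once ALLOW_SNAPSHOT_ISOLATION is ON for the database, a session opts in like this:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;
    -- Reads row versions instead of taking shared locks, so this SELECT
    -- does not block concurrent INSERT/UPDATE/DELETE on TableA.
    SELECT *
    FROM [DatabaseA].[dbo].[TableA]
    ORDER BY CreateDt DESC;
COMMIT TRANSACTION;

With READ_COMMITTED_SNAPSHOT, by contrast, queries running at the default READ COMMITTED level get versioned reads without any code change.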
In SQL Server, under the default READ COMMITTED isolation level (without row versioning), writers block readers and readers block writers.
This query doesn't have a WHERE clause and will touch the entire table, probably starting with IS (intent shared) locks and eventually escalating to a table-level shared lock that updates/inserts/deletes cannot get past while it is held. That lock is likely held for the duration of the very long sort the ORDER BY is causing.
It can be bypassed in several ways, but I don't assume you're actually after how, seeing as whoever ran the query was probably not really thinking straight anyway, and this is not a regular occurrence.
Nevertheless, here are some ways to bypass:
Read Committed Snapshot Isolation
WITH (NOLOCK), but only if you don't really care about the data that is retrieved, as it can return rows twice, return rows that were never committed, and skip rows altogether.
Reducing the columns you return and reading from a non-clustered index instead (a sketch of this is shown below).
But to answer your question: yes, SELECTs can block INSERTs.
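As a sketch of that last bypass option (the index and column names here are assumptions, not from the original post):

-- A narrow non-clustered index lets the sort read far less data than the LOB-laden table.
CREATE NONCLUSTERED INDEX IX_TableA_CreateDt
    ON [dbo].[TableA] (CreateDt DESC)
    INCLUDE (Id);   -- include only the small columns you actually need

-- Support queries then avoid the LOB columns entirely and return a bounded result set.
SELECT TOP (1000) Id, CreateDt
FROM [DatabaseA].[dbo].[TableA]
ORDER BY CreateDt DESC;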
Related
Does a transaction lock my table when I'm running multiple queries?
For example: if another user tries to send data at the same time that I am using a transaction, what will happen?
Also, how can I avoid this while still being sure that all the data has been inserted into the database successfully?
BEGIN TRAN;
INSERT INTO Customers (name) VALUES (@name1);
UPDATE CustomerTrans
SET CustomerName = @name2;
COMMIT;
You have to implement transactions smartly. Below are some performance-related points:
Locking: optimistic vs. pessimistic. With pessimistic locking, locks are taken up front and block other sessions; with optimistic locking (row versioning), only the specific rows being written are locked.
Isolation level: READ COMMITTED vs. READ UNCOMMITTED. When a table is locked, it depends on your business scenario; if dirty reads are acceptable, you can read past the locks using WITH (NOLOCK).
Use a WHERE clause in your UPDATEs and index the filtered columns properly. For any heavy query, check the query plan.
Keep the transaction timeout short, so that if the table is locked the command throws an error quickly, and in the catch block you can retry.
These are a few things you can do; a minimal sketch combining them is below.
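A minimal T-SQL sketch of the transaction-plus-catch-block pattern (table, column and variable names are assumptions, not from the original question):

DECLARE @name1 nvarchar(100) = N'name1',
        @name2 nvarchar(100) = N'name2',
        @customerId int = 1;

SET XACT_ABORT ON;          -- any runtime error aborts and rolls back the transaction
BEGIN TRY
    BEGIN TRANSACTION;

    INSERT INTO Customers (name) VALUES (@name1);

    UPDATE CustomerTrans
    SET CustomerName = @name2
    WHERE CustomerId = @customerId;   -- a narrow WHERE clause keeps the lock footprint small

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;                  -- surface the error so the caller can decide whether to retry
END CATCH;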
You cannot avoid multiple users loading data into the database at the same time, and it is neither feasible nor sensible to lock a table every time a single user requests it. Actually, you do not have to worry about this, because the database itself provides mechanisms to avoid such issues. I would recommend reading up on the ACID properties:
Atomicity
Consistency
Isolation
Durability
What may happen is a kind of "ghost read": you cannot read the data another user is inserting until that user commits, and conversely, if you have finished inserting data yourself but have not committed, there is a fair chance that other sessions will not see your changes.
DDL operations such as CREATE and DROP are typically auto-committed as soon as they complete. DML operations such as INSERT, UPDATE and DELETE, however, are not committed until the enclosing transaction commits.
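A minimal two-session illustration of this visibility rule (the table name is an assumption):

-- Session 1: insert inside an explicit transaction, but do not commit yet.
BEGIN TRANSACTION;
INSERT INTO Customers (name) VALUES (N'Alice');

-- Session 2 (a separate connection), default READ COMMITTED:
-- this SELECT is blocked by Session 1's locks until Session 1 commits or rolls back.
SELECT * FROM Customers WHERE name = N'Alice';

-- Back in Session 1: only now does the new row become visible to other sessions.
COMMIT TRANSACTION;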
I'm looking for a solution to a thorny problem.
My colleagues and I have built an 'engine' (in C#) that performs different elaborations on a SQL Server database.
Initially, these elaborations were contained in many stored procedures called in series in a nightly batch. It was a system with many flaws.
Now we have extracted every single query from each stored procedure and, strange as it may sound, stored the queries themselves in the DB.
(Note: the reasons are various and I'm not listing them all; you just need to know that, for business reasons, we cannot make frequent software releases... but we have a lot of freedom with SQL scripts.)
Mainly, the logic behind our engine is:
there are Phases, called sequentially
each Phase contains several Steps, which are grouped into Sets
a Set is a group of Steps that will be executed sequentially
the Sets, unless otherwise specified, run in parallel with each other
a Step that does not belong to any Set is wrapped in a Set created at runtime
before starting, a Set may have to wait for the completion of one or more Steps
a Step corresponds to an atomic (or almost atomic) SQL query or C# method to run
at startup the engine queries the database, then composes the Phases, Steps and Sets (and their configuration)... which will then be executed
We have created the engine, we have all the configurations... and everything works.
However, we have a requirement: some phases must run in a transaction. If even a single step of that phase fails, we need to roll back the entire phase.
What creates problems is the management of the transaction.
Initially we created a single transaction and connection for the entire phase, but we soon realized that - because of multithreading - this is not thread-safe.
In addition, after several tests, we have had exceptions regarding the transaction. Apparently, when a phase contains a LOT of steps (= many database queries), the same transaction cannot execute any further statements.
So now we've changed it so that each step in a phase that requires a transaction opens its own connection and transaction, and if all goes well, everything commits (otherwise it rolls back).
It works. However, we have noticed a limitation: the use of temporary tables.
In a transactional phase, when I create a local temp table (#TempTable1) in step x, I can't use #TempTable1 in the next step y (SELECT TOP 1 1 FROM #TempTable1).
This is logical: each step uses its own connection and transaction, and #TempTable1 is scoped to the session that created it, so it is dropped when that step's connection is done (see the illustration below).
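A minimal illustration of the scoping rule, assuming each step runs on its own connection:

-- Step x, connection 1:
CREATE TABLE #TempTable1 (Id int);
INSERT INTO #TempTable1 (Id) VALUES (1);
-- #TempTable1 is dropped automatically when this connection is closed.

-- Step y, connection 2:
SELECT TOP 1 1 FROM #TempTable1;   -- fails: Invalid object name '#TempTable1'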
Then we tried a global temp table, ##TempTable2, but in step y the execution of the SELECT is blocked until the timeout expires.
I also tried lowering the transaction isolation level, but it made no difference.
Now we are in the unfortunate situation of having to create real tables instead of using temporary tables.
I'm looking for a compromise between using transactions across a large number of steps and using temporary tables. I believe the crux of the matter is how the transactions are managed. Suggestions?
My scenario is common:
I have a stored procedure that needs to update multiple tables.
If one of the updates fails, all the updates should be rolled back.
The straightforward answer is to include all the updates in one transaction and just roll that back. However, in a system like ours, this causes concurrency issues.
When we break the updates into multiple short transactions, we get a throughput of ~30 concurrent executions per second before deadlocking issues start to emerge.
If we put everything into one transaction spanning all the updates, we get ~2 concurrent executions per second before deadlocks show up.
In our case, we place a try-catch block around every short transaction and manually DELETE/UPDATE back the changes made by the previous ones, so essentially we mimic transaction behaviour in a very expensive way...
It works all right, since it's well written and we don't get many "rollbacks"...
One thing this approach cannot handle at all is a command timeout from the web server / client.
I have read extensively in many forums and blogs and scanned through MSDN and cannot find a good solution. Many have presented the problem, but I have yet to see a good solution.
The question is this: is there ANY solution that allows a reliable rollback of updates to multiple tables without requiring exclusive locks on all of the rows for the entire duration of the long transaction?
Assume that it is not an optimization issue. The tables are probably close to maximally optimized and can deliver very high throughput as long as deadlocks don't hit them. There are no table locks/page locks etc., only row locks on updates, but when you have so many concurrent sessions, some of them need to update the same row...
It can be via SQL, client-side C#, or server-side C# (extending SQL Server?).
Is there such a solution in any book or blog that I have not found?
We are using SQL Server 2008 R2, with a .NET client/web server connecting to it.
Code example:
CREATE PROCEDURE sptest
AS
BEGIN
    BEGIN TRANSACTION;
    UPDATE table1 /* ... */;
    UPDATE table2 /* ... */;
    COMMIT TRANSACTION;
END
In this case, if sptest is run twice, the second instance cannot update table 1 until instance 1 has committed.
Compared to this
CREATE PROCEDURE sptest2
AS
BEGIN
    UPDATE table1 /* ... */;
    UPDATE table2 /* ... */;
END
sptest2 has much higher throughput, but it has a chance of corrupting the data.
This is what we are trying to solve. Is there even a theoretical solution to this?
Thanks,
JS
I would say that you should dig deeper to find out why the deadlocks occur. Possibly you should change the order of the updates to avoid them; maybe some index is the culprit.
You cannot roll back changes if other transactions can modify the same data in the meantime, so you need to hold update locks on it. But you can use the snapshot isolation level to allow consistent reads while the updates are still uncommitted.
For all inner-joined tables that are mostly static, or where there is a high probability that dirty data will not affect the query, you can apply:
INNER JOIN LookupTable lut WITH (NOLOCK) ON lut.ID = SomeOtherTableID
This tells the query that you do not care about in-flight updates being made to LookupTable.
This can reduce the issue in most cases. For more difficult deadlocks, I have implemented a process that generates the deadlock graph and emails it whenever a deadlock occurs; it contains all the detailed information about the deadlock. One way to retrieve such graphs is sketched below.
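The author's exact implementation isn't shown; as an illustration, recent deadlock graphs can be pulled from the built-in system_health Extended Events session (available since SQL Server 2008):

-- Read deadlock graphs captured by the default system_health session.
SELECT xed.event_data.value('(@timestamp)[1]', 'datetime2') AS occurred_at,
       xed.event_data.query('(data[@name="xml_report"]/value)[1]') AS deadlock_graph
FROM (
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets AS st
    JOIN sys.dm_xe_sessions AS s ON s.address = st.event_session_address
    WHERE s.name = N'system_health' AND st.target_name = N'ring_buffer'
) AS ring
CROSS APPLY ring.target_data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS xed(event_data);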
I'm maintaining an ASP/C# program that uses MS SQL Server 2008 R2 for its database requirements.
On normal and perfect days, everything works fine as it is. But we don't live in a perfect world.
An Application (for Leave, Sick Leave, Overtime, Undertime, etc.) Approval process requires up to ten separate connections to the database. The program connects to the database, passes around some relevant parameters, and uses stored procedures to do the job. Ten times.
Now, due to the structure of the entire thing, which I cannot change, a dip in the connection, or heck, even a breakpoint in VS2005 left hanging long enough, leaves the Application Approval process incomplete. The tables are often just joined together, so a data mismatch - a missing piece of data here, a primary key that failed to update there - would mean an entire row becomes useless.
Now, I know that there is nothing I can do to prevent this - this is a connection issue, after all.
But are there ways to minimize connection lag / failure? Or a way to inform the users that something went wrong with the process? A rollback changes feature (either via program, or SQL), so that any incomplete data in the database will be undone?
Thanks.
But are there ways to minimize connection lag / failure? Or a way to
inform the users that something went wrong with the process? A
rollback changes feature (either via program, or SQL), so that any
incomplete data in the database will be undone?
As we discussed in the comments, transactions will address many of your concerns.
A transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes:
To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.
To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.
Source
Transactions in .Net
As you might expect, the database is integral to providing transaction support for database-related operations. However, creating transactions from your business tier is quite easy and allows you to use a single transaction across multiple database calls.
Quoting from my answer here:
I see several reasons to control transactions from the business tier:
Communication across data store boundaries. Transactions don't have to be against a RDBMS; they can be against a variety of entities.
The ability to rollback/commit transactions based on business logic that may not be available to the particular stored procedure you are calling.
The ability to invoke an arbitrary set of queries within a single transaction. This also eliminates the need to worry about transaction count.
Personal preference: c# has a more elegant structure for declaring transactions: a using block. By comparison, I've always found transactions inside stored procedures to be cumbersome when jumping to rollback/commit.
Transactions are most easily declared using the TransactionScope (reference) abstraction which does the hard work for you.
using( var ts = new TransactionScope() )
{
// do some work here that may or may not succeed
// if this line is reached, the transaction will commit. If an exception is
// thrown before this line is reached, the transaction will be rolled back.
ts.Complete();
}
Since you are just starting out with transactions, I'd suggest testing out a transaction from your .Net code.
Call a stored procedure that performs an INSERT.
After the INSERT, purposely have the procedure generate an error of any kind.
You can validate your implementation by seeing that the INSERT was rolled back automatically.
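A hypothetical test procedure for that exercise (the table and procedure names are assumptions):

CREATE PROCEDURE dbo.TestTransactionRollback
AS
BEGIN
    INSERT INTO dbo.AuditLog (Message) VALUES (N'This row should disappear');

    -- Deliberately raise an error after the INSERT; because the surrounding
    -- TransactionScope is never completed, the INSERT is rolled back.
    RAISERROR (N'Forced failure for transaction testing', 16, 1);
END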
Transactions in the Database
Of course, you can also declare transactions inside a stored procedure (or any sort of TSQL statement). See here for more information.
If you use the same SqlConnection, or another connection type that implements IDbConnection, you can do something similar to a TransactionScope but without the need to take on the risk that a TransactionScope introduces.
In VB:
Using scope As IDbTransaction = mySqlCommand.Connection.BeginTransaction()
    ' Enlist the command in the transaction before executing it.
    mySqlCommand.Transaction = scope
    If blnEverythingGoesWell Then
        scope.Commit()
    Else
        scope.Rollback()
    End If
End Using
If you don't call Commit, the transaction is rolled back by default when it is disposed at the end of the Using block.
I have one BIG table (90k rows, roughly 60 MB) which holds info about free room capacity for about 50 hotels. This table has very few updates/inserts per hour.
My application sends async requests to this table (and joined tables) at most 30 times per second.
When I start 30 threads at one time (with the default AppPool class in .NET 3.5 C#, each with a random valid SQL query string), only a few (roughly 4) are processed asynchronously and the other threads wait. Why?
Is it because of SQL Server 2008 table locking, or because of the .NET side? Or something else?
If it is a SQL problem, would it help if I split this big table into one table per hotel?
My goal is to have at least 10 threads served at a time.
This table is tiny. It doesn't even qualify as a "medium-sized" table. It's trivial.
You could full-table-scan it 30 times per second, or copy the whole thing into RAM, and no server would be the slightest bit bothered.
If your data fits in RAM, databases are fast. If that's not what you're seeing, you're doing something REALLY WRONG. Therefore I also think the problems are all on the client side.
It is more than likely on the .NET side. If it were table locking, more threads would be processing, but they would be waiting on their queries to return. If I remember correctly, there's a property on thread pools that controls how many actual threads they create at once; if there are more pending work items than that number, they queue up and wait for running threads to finish. Check that.
Have you tried changing the transaction isolation level?
Even when reading from a table, SQL Server will take locks on it.
Try setting the isolation level to READ UNCOMMITTED and see if that improves the situation,
but be advised that you may well read "dirty" data; make sure you understand the ramifications if this is the solution.
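For reference, the isolation level is set per session before running the query; a minimal sketch (table and column names are assumptions):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- The per-query equivalent is a table hint: FROM dbo.Rooms WITH (NOLOCK)
SELECT HotelId, RoomType, FreeCapacity
FROM dbo.Rooms
WHERE HotelId = 42;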
Rather than ask, measure. Each SQL query actually submitted by your application creates a request on the server, and the sys.dm_exec_requests DMV shows the state of each request. When a request is blocked, the wait_type column shows a non-empty value. You can judge from this whether your requests are blocked or not, and if they are blocked, you'll also know why. For example:
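A sketch of such a check (the session_id filter just skips most system sessions):

-- Show current requests, what they are waiting on and who is blocking them.
SELECT r.session_id,
       r.status,
       r.command,
       r.wait_type,
       r.wait_time,
       r.blocking_session_id,
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id > 50;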