I'm using EF 6.1 and SQL Server in a WPF thick-client application, and by default I open a transaction with each DbContext I instantiate (I commit it and reopen it on every SaveChanges() unless specified otherwise). The isolation level for these transactions is READ COMMITTED (IsolationLevel.ReadCommitted).
By default I open a new context (and thus a new transaction) for each "main view". The application is a kind of fake-MDI app and each MDI view uses its own DbContext. "Main views" (every MDI tab/window) can contain secondary views (think of small modal windows for specific data entry and the like), which share the same context (and transaction) as the one opened in the main view. I'm using a structure like UseCase -> Views -> ViewModels: generally a "UseCase" opens a DbContext and can spawn multiple views that share it. Those secondary views usually call SaveChanges() without committing the transaction, which is why I want the transactions in the first place.
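Simplified, the pattern looks roughly like this (a sketch; UseCaseUnitOfWork and MyDbContext are illustrative names, not my actual classes):

using System;
using System.Data.Entity;

// Hypothetical names: MyDbContext stands in for the application's real context.
public class UseCaseUnitOfWork : IDisposable
{
    private readonly MyDbContext _context = new MyDbContext();
    private DbContextTransaction _transaction;

    public UseCaseUnitOfWork()
    {
        // One long-lived transaction per DbContext / use case (READ COMMITTED by default).
        _transaction = _context.Database.BeginTransaction();
    }

    public MyDbContext Context { get { return _context; } }

    // Secondary views call this: changes are flushed but the transaction stays open.
    public void Save()
    {
        _context.SaveChanges();
    }

    // The main view calls this: commit, then immediately reopen a new transaction.
    public void SaveAndCommit()
    {
        _context.SaveChanges();
        _transaction.Commit();
        _transaction.Dispose();
        _transaction = _context.Database.BeginTransaction();
    }

    public void Dispose()
    {
        _transaction.Dispose();
        _context.Dispose();
    }
}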
I've done some performance tests with a single user on a lab server and there doesn't seem to be any measurable difference between opening the transaction when instantiating the context and not using explicit transactions at all (other than the one EF opens by default on SaveChanges()).
I'm no SQL Server expert, so I'm wondering whether there are any implications (once the app is used by multiple users on a production server) of keeping many long-running transactions open on SQL Server with that isolation level (I understand the implications of other isolation levels that may block reads, but that's not the case here). I'm handling concurrency errors manually when committing the transactions.
Am I doing the right thing here, should I stick to short-lived transactions, or is it just a matter of preference?
I've been trying to find an answer to this but haven't found anything definitive (some people say long-lived transactions are not a good idea, but they don't seem to explain why).
some people say long-lived transactions are not a good idea, but they don't seem to explain why
Just a couple of reasons:
A SQL Server transaction, depending on its isolation level, can acquire row locks (the common case) or even metadata locks (the more exotic case). The longer a transaction lives, the more locks it can accumulate, so the probability of deadlocks increases.
Also, an open, uncommitted transaction ties up server resources. The transaction log grows, and the portion covering active transactions cannot be truncated: the server must remember everything done within the transaction in order to either commit it or roll it back.
by default I'm opening a transaction with each DbContext I instantiate
There should be a reason to do this. The only reason I can imagine is non-EF changes to the database that must be consistent with the EF changes. Otherwise, you're doing extra work that is at best useless and could waste database server resources, because EF already wraps each SaveChanges() call in its own transaction.
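To illustrate the distinction (a minimal sketch; MyDbContext, Orders and OrderCounters are placeholder names): a single SaveChanges() is already atomic on its own, so an explicit transaction only pays off when several SaveChanges() calls, or a mix of EF and raw SQL, must succeed or fail together.

using (var context = new MyDbContext())
{
    // Case 1: no explicit transaction needed.
    // EF wraps this single SaveChanges() call in its own transaction.
    context.Orders.Add(new Order { Number = "A-1" });
    context.SaveChanges();
}

using (var context = new MyDbContext())
using (var tx = context.Database.BeginTransaction()) // READ COMMITTED by default
{
    try
    {
        // Case 2: EF changes plus a raw SQL command that must stay consistent.
        context.Orders.Add(new Order { Number = "A-2" });
        context.SaveChanges();

        context.Database.ExecuteSqlCommand(
            "UPDATE OrderCounters SET LastNumber = LastNumber + 1");

        tx.Commit();
    }
    catch
    {
        tx.Rollback();
        throw;
    }
}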
Related
I am stuck using two db connections with entity framework contexts under a single transaction.
I am trying to use two db contexts under one transaction scope. I get "MSDTC not available". I read that it's not an EF problem; it's MSDTC, which does not allow two connections.
Is there any answer for this problem?
This happens because the framework thinks that you are trying to have a transaction span multiple databases. This is called a distributed transaction.
To use distributed transactions, you need a transaction coordinator. In your case, the coordinator is the Microsoft Distributed Transaction Coordinator, which runs as a Windows Service on your server. You will need to make sure that this service is running.
Starting the service should solve your immediate issue.
Two-phase commit
From a purely theoretical point of view, distributed transactions are an impossibility* - that is, disparate systems cannot coordinate their actions in such a way that they can be absolutely certain that they either all commit or all roll back.
However, using a transaction coordinator, you get pretty darn close (and 'close enough' for any conceivable purpose). When using a distributed transaction, each party in the transaction will try to make the required changes and report back to the coordinator whether all went well or not. If all parties report success, the coordinator will tell all parties to commit. However, if one or more parties report a failure, the coordinator will tell all parties to roll back their changes. This is the "Two-phase commit protocol".
Watch out
It obviously takes time for the coordinator to communicate with the different parties of the transaction. Thus, using distributed transactions can hamper performance. Moreover, you may experience blocking and deadlocking among your transactions, and MSDTC obviously complicates your infrastructure.
Thus, before you turn on the Distributed Transaction Coordinator service and forge ahead with your project, you should first take a long, hard look at your architecture and convince yourself that you really need to use multiple contexts.
If you do need multiple contexts, you should investigate whether you can prevent transactions from being escalated to distributed transactions.
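One common way to do that in EF6 is to have both contexts share one open SqlConnection and one local transaction via Database.UseTransaction. A minimal sketch, assuming OrdersContext and BillingContext are your two context types, that both map to the same database, and that each exposes the protected DbContext(DbConnection, bool contextOwnsConnection) constructor:

// Requires System.Data.SqlClient and EntityFramework (System.Data.Entity).
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    using (var transaction = connection.BeginTransaction())
    {
        // Both contexts reuse the same connection and are told not to dispose it
        // (contextOwnsConnection: false), so no second connection is ever opened
        // and the transaction never escalates to a distributed one.
        using (var orders = new OrdersContext(connection, contextOwnsConnection: false))
        using (var billing = new BillingContext(connection, contextOwnsConnection: false))
        {
            orders.Database.UseTransaction(transaction);
            billing.Database.UseTransaction(transaction);

            // ... make changes through both contexts ...
            orders.SaveChanges();
            billing.SaveChanges();
        }

        transaction.Commit();
    }
}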
Further reading
You may want to read:
MSDN: "Managing Connections and Transactions" (specifically on EF)
Blog: "Avoid unwanted escalation to distributed transactions" (a bit dated, though)
* See for example: Reasoning About Knowledge
You should run the MSDTC (Distributed Transaction Coordinator) system service.
I'm maintaining an ASP/C# program that uses MS SQL Server 2008 R2 for its database requirements.
On normal and perfect days, everything works fine as it is. But we don't live in a perfect world.
An Application (for Leave, Sick Leave, Overtime, Undertime, etc.) Approval process requires up to ten separate connections to the database. The program connects to the database, passes around some relevant parameters, and uses stored procedures to do the job. Ten times.
Now, due to the structure of the entire thing, which I cannot change, a dip in the connection (or, heck, a breakpoint in VS2005 left hanging long enough) leaves the Application Approval process incomplete. The tables are often just joined together, so a data mismatch - missing data here, a primary key that failed to update there - means an entire row becomes useless.
Now, I know that there is nothing I can do to prevent this - this is a connection issue, after all.
But are there ways to minimize connection lag / failure? Or a way to inform the users that something went wrong with the process? A rollback changes feature (either via program, or SQL), so that any incomplete data in the database will be undone?
Thanks.
But are there ways to minimize connection lag / failure? Or a way to
inform the users that something went wrong with the process? A
rollback changes feature (either via program, or SQL), so that any
incomplete data in the database will be undone?
As we discussed in the comments, transactions will address many of your concerns.
A transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes:

To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.

To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.
Source
Transactions in .Net
As you might expect, the database is integral to providing transaction support for database-related operations. However, creating transactions from your business tier is quite easy and allows you to use a single transaction across multiple database calls.
Quoting from my answer here:
I see several reasons to control transactions from the business tier:
Communication across data store boundaries. Transactions don't have to be against a RDBMS; they can be against a variety of entities.
The ability to rollback/commit transactions based on business logic that may not be available to the particular stored procedure you are calling.
The ability to invoke an arbitrary set of queries within a single transaction. This also eliminates the need to worry about transaction count.
Personal preference: C# has a more elegant structure for declaring transactions: a using block. By comparison, I've always found transactions inside stored procedures cumbersome when it comes to jumping to rollback/commit.
Transactions are most easily declared using the TransactionScope (reference) abstraction which does the hard work for you.
// TransactionScope lives in the System.Transactions namespace/assembly.
using( var ts = new TransactionScope() )
{
    // do some work here that may or may not succeed

    // if this line is reached, the transaction will commit. If an exception is
    // thrown before this line is reached, the transaction will be rolled back.
    ts.Complete();
}
Since you are just starting out with transactions, I'd suggest testing out a transaction from your .Net code.
Call a stored procedure that performs an INSERT.
After the INSERT, purposely have the procedure generate an error of any kind.
You can validate your implementation by seeing that the INSERT was rolled back automatically.
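A sketch of such a test, assuming a hypothetical stored procedure dbo.InsertLeaveRequest that performs an INSERT and then deliberately calls RAISERROR (the procedure name, parameter and connection string are placeholders):

// Requires System.Transactions, System.Data and System.Data.SqlClient.
using (var scope = new TransactionScope())
using (var connection = new SqlConnection(connectionString))
{
    connection.Open(); // the connection enlists in the ambient transaction

    using (var command = new SqlCommand("dbo.InsertLeaveRequest", connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@EmployeeId", 42);

        // The procedure INSERTs a row and then raises an error on purpose,
        // so this call throws a SqlException...
        command.ExecuteNonQuery();
    }

    // ...which means scope.Complete() is never reached, and when the scope is
    // disposed the INSERT is rolled back automatically.
    scope.Complete();
}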
Transactions in the Database
Of course, you can also declare transactions inside a stored procedure (or any sort of TSQL statement). See here for more information.
If you use the same SqlConnection, or another connection type that implements IDbConnection, you can do something similar to a TransactionScope but without the need to create the security risk that a TransactionScope represents.
In VB:
Using scope As IDbTransaction = mySqlCommand.Connection.BeginTransaction()
    ' Execute your commands here. Each command run on this connection must
    ' have this transaction assigned to its Transaction property first.
    If blnEverythingGoesWell Then
        scope.Commit()
    Else
        scope.Rollback()
    End If
End Using
If you never call Commit, the transaction is rolled back by default when the Using block disposes it.
I have a data entry ASP.NET application. During one complete data entry, many transactions occur. I would like to keep track of all those transactions so that if the user wants to abandon the data entry, all the transactions I have been keeping a record of can be rolled back.
SQL Server 2008, .NET Framework 4.0, and I am using C#.
This is always a tough lesson to learn for people that are new to web development. But here it is:
Each round trip web request is a separate, stand-alone thread of execution
That means, simply put, each time you submit a page request (click a button, navigate to a new page, even refresh a page) then it can run on a different thread than the previous one. What's more, even if you do get the same thread twice, several other web requests may have been processed by the thread in the time between your two requests.
This makes it effectively impossible to span simple transactions across more than one web request.
Here's another concept that you should keep in mind:
Transactions are intended for batch operations, not interactive operations.
What this means is that transactions are meant to be short-lived, and to encompass several operations executing sequentially (or simultaneously) in which all operations are atomic, and intended to either all complete, or all fail. Transactions are not typically designed to be long-lived (meaning waiting for a user to decide on various actions interactively).
Web apps are not desktop apps. They don't function like them. You have to change your thinking when you do web apps. And the biggest lesson to learn, each request is a stand-alone unit of execution.
Now, above, I said "simple transactions", also known as lightweight or local transactions. There's also what's known as a distributed transaction, and using those requires a Distributed Transaction Coordinator; MSDTC is the one most commonly used. However, distributed transactions perform much more slowly than local ones, and they require the infrastructure to be set up to use a DTC.
It's possible to span a transaction over web requests using a DTC. This is done by "enlisting" in a distributed transaction and then somehow sharing the transaction identifier between requests. But this is a lot of work to set up and deal with, and it has a lot of error-prone situations. It's not something you want to do if you have other options.
In general, you're better off adding the data to a temporary table or tables, and then when the final save is done, transfer that data to the permanent tables. Another option is to maintain some state (such as using ViewState or Session) to keep track of the changes.
One popular way of doing this is to perform operations client-side using JavaScript and then submitting all the changes to the server when you are done. This is difficult to implement if you need to navigate to different pages, however.
From your question, it appears that the transactions are already complete by the time the user exercises the option to roll them back. In that case, I doubt the DBMS's transaction rollback semantics would be available, so I would provide such semantics at the application layer as follows:
Any atomic operation that can be performed on the database should be encapsulated in a Command object. Each command implements an undo method that reverts the action performed by its execute method (see the sketch after this list).
Each transaction contains the list of commands that were run as part of it. The transaction is persisted as-is for further operations in the future.
The user is provided with a way to view the transactions that can potentially be rolled back. When the user selects a transaction to roll back, the list of commands belonging to that transaction is retrieved and the undo method is called on each of those command objects, in reverse order.
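A minimal sketch of that application-level undo approach (the interface, class and table names here are illustrative, not an existing API):

// Requires System.Data, System.Collections.Generic and System.Linq.
public interface IReversibleCommand
{
    void Execute(IDbConnection connection);
    void Undo(IDbConnection connection);
}

// Example: an INSERT whose undo is the corresponding DELETE.
public class InsertLeaveRequestCommand : IReversibleCommand
{
    private readonly int _employeeId;
    private int _newId; // captured during Execute so Undo knows which row to remove

    public InsertLeaveRequestCommand(int employeeId) { _employeeId = employeeId; }

    public void Execute(IDbConnection connection)
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandText =
                "INSERT INTO LeaveRequests (EmployeeId) VALUES (@id); " +
                "SELECT CAST(SCOPE_IDENTITY() AS int);";
            var p = cmd.CreateParameter();
            p.ParameterName = "@id";
            p.Value = _employeeId;
            cmd.Parameters.Add(p);
            _newId = (int)cmd.ExecuteScalar();
        }
    }

    public void Undo(IDbConnection connection)
    {
        using (var cmd = connection.CreateCommand())
        {
            cmd.CommandText = "DELETE FROM LeaveRequests WHERE Id = @id";
            var p = cmd.CreateParameter();
            p.ParameterName = "@id";
            p.Value = _newId;
            cmd.Parameters.Add(p);
            cmd.ExecuteNonQuery();
        }
    }
}

// "Rolling back" a logical transaction = undoing its commands in reverse order.
public static void RollbackLogicalTransaction(
    IList<IReversibleCommand> commands, IDbConnection connection)
{
    foreach (var command in commands.Reverse())
        command.Undo(connection);
}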
HTH.
You can also store them in a temporary table and move those records to your original table at a later stage.
If you are just managing transactions during a single save operation, use TransactionScope. But it doesn't sound like that is the case.
If the user may wish to abandon any number of previous save operations, it suggests that an item may exist in draft form. There might be one working draft or many. Consequently, there must be a way to promote a draft to a final version, either implicitly or explicitly. Think of how an email program saves a draft: it doesn't actually send your message, you may abandon it at any time, and you may return to it later. When you send the message, you have "committed the transaction".
You might also add a user interface to roll back to a specific version.
This will be a fair amount of work, but if you are willing to save and manage multiple copies of the same item, it can be accomplished.
You may save a copy of the same data in the same schema using a status flag to indicate that it is a draft, or you might store the data in an intermediate format in separate table(s). I would prefer the first approach, in that it allows the same structures to be used.
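As a rough illustration of the status-flag approach, assuming Entity Framework and hypothetical Message/MyDbContext types:

public enum MessageStatus
{
    Draft = 0,
    Final = 1
}

public class Message
{
    public int Id { get; set; }
    public string Body { get; set; }
    public MessageStatus Status { get; set; }
}

// Saving work-in-progress: persist the row flagged as a draft.
public void SaveDraft(MyDbContext context, Message message)
{
    message.Status = MessageStatus.Draft;
    context.SaveChanges();
}

// "Committing": promote the draft to its final state.
public void Promote(MyDbContext context, Message message)
{
    message.Status = MessageStatus.Final;
    context.SaveChanges();
}

// "Rolling back": abandoning a draft is simply deleting the draft row.
public void Abandon(MyDbContext context, Message message)
{
    context.Set<Message>().Remove(message);
    context.SaveChanges();
}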
I have several applications running on one machine, along with an MS SQL server on that same machine.
The applications are of various types: WPF, WCF service, MVC app, and so on.
All of them access a single database located on that SQL server.
Access is via simple LINQ to SQL class calls.
In each database interaction I make some queries, some checks, and some writes.
My question is:
Can I be sure that calls inside those transaction scopes are not running at the same time (i.e. that they are thread- and process-safe) simply by using a TransactionScope instance?
Using a transaction scope will make a particular connection transactional, but the use of transaction scopes in itself doesn't stop two different processes on a machine from doing the same thing at once. It does ensure that all actions performed are either committed or rolled back together. The view of the data each process sees depends on the isolation level, which for TransactionScope defaults to Serializable and can easily lead to deadlocks. A more practical isolation level is Read Committed, preferably with snapshot isolation (READ_COMMITTED_SNAPSHOT), as this further reduces deadlocks and wait times.
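For example, a TransactionScope can be asked for Read Committed explicitly instead of its Serializable default (MyDataContext and the connection string are placeholders):

// Requires System.Transactions.
var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted, // instead of the Serializable default
    Timeout = TimeSpan.FromSeconds(30)
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
using (var dataContext = new MyDataContext(connectionString)) // LINQ to SQL context (hypothetical name)
{
    // ... queries, checks and writes ...
    dataContext.SubmitChanges();

    scope.Complete();
}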
If you want to ensure that only one instance of an application is doing something at a time, you can use a mutex, or a database lock that all the different processes attempt to acquire and, if necessary, wait for.
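For instance, a machine-wide named mutex could look roughly like this (the mutex name is arbitrary):

using System.Threading;

// A machine-wide named mutex: only one process at a time gets past WaitOne().
using (var mutex = new Mutex(initiallyOwned: false, name: @"Global\MyApp.TradeImport"))
{
    mutex.WaitOne();
    try
    {
        // ... the work that must not run concurrently in two processes ...
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}

On the database side, sp_getapplock can serve the same purpose by taking an application-level lock inside a transaction.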
A while ago, I wrote an application used by multiple users to handle trades creation.
I haven't done development for some time now, and I can't remember how I managed the concurrency between the users. Thus, I'm seeking some advice in terms of design.
The original application had the following characteristics:
One heavy client per user.
A single database.
Access to the database for each user to insert/update/delete trades.
A grid in the application reflecting the trades table, which is updated each time someone changes a deal.
I am using WPF.
Here's what I'm wondering:
Am I correct in thinking that I shouldn't need to worry about the connection to the database from each application? Considering that each client uses a singleton, I would expect one connection per client with no issue.
How can I prevent concurrent accesses from conflicting? I guess I should lock when modifying the data, but I don't remember how.
How do I set up the grid to automatically update whenever my database is updated (by another user, for example)?
Thank you in advance for your help!
Consider leveraging connection pooling to reduce the number of connections. See: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
Lock as late as possible and release as soon as possible to maximize concurrency. You can use TransactionScope (see: http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx and http://blogs.msdn.com/b/dbrowne/archive/2010/05/21/using-new-transactionscope-considered-harmful.aspx) if you have multiple DB actions that need to go together for consistency, or just handle them in a DB stored procedure. Keep your queries simple. Follow these tips to understand how locking works and how to reduce resource contention and deadlocks: http://www.devx.com/gethelpon/10MinuteSolution/16488
I'm not sure about other databases, but for SQL Server you can use SqlDependency; see http://msdn.microsoft.com/en-us/library/a52dhwx7(v=vs.80).aspx
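A rough sketch of using SqlDependency to refresh the grid when the trades table changes (table name, column names and connection string are placeholders; the query must follow the query-notification rules, e.g. a two-part table name and an explicit column list, and Service Broker must be enabled on the database):

// Requires System.Data and System.Data.SqlClient.
SqlDependency.Start(connectionString);

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    "SELECT TradeId, Instrument, Amount FROM dbo.Trades", connection))
{
    var dependency = new SqlDependency(command);
    dependency.OnChange += (sender, e) =>
    {
        // Fired once when the result set changes: re-run the query, re-subscribe,
        // and refresh the grid (marshal back to the UI thread in WPF).
    };

    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        // ... load the initial grid contents ...
    }
}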
Concurrency is usually managed by the DBMS using locks. Locks are a kind of semaphore that grant exclusive access to a certain resource and cause other accesses to be restricted or queued (only restricted in the case where you use uncommitted reads).
The number of connections itself does not pose a problem as long as you are not approaching the max_connections setting of your DBMS. Otherwise, you might have trouble connecting to it for maintenance purposes or for shutting it down.
DBMSes usually use either table locks (MyISAM) or row locks (InnoDB, and most other DBMSes). The type of lock determines the granularity of the lock. Table locks can be very fast but are usually considered inferior to row-level locks.
Row-level locks occur inside a transaction (implicit or explicit). When you manually start a transaction, you begin your transaction scope. Until you close that scope, all changes you make are attributed to that exact transaction, and those changes obey the ACID properties.
Transaction scope and how to use it is a topic far too long for this platform; if you want, I can post some links with more information on it.
For the automatic updates, most databases support some kind of trigger mechanism: code that is run when specific actions happen on the database (for instance the creation of a new record or a change to a record). You could put your notification code inside such a trigger. However, you should only inform a receiving application of the changes, not actually "do" the changes from the trigger, even if the language makes it possible. Remember that the action which fired the trigger is suspended until your trigger code finishes, so a lean trigger is best, if one is needed at all.