WCF Transaction - TransactionScopeRequired = a hammer when we want tweezers? - c#

So in WCF, to flow transactions from client to server you must have
[OperationBehavior(TransactionScopeRequired = true)]
on your instance methods and
[TransactionFlow(TransactionFlowOption.Allowed)]
on your service interfaces. And everything works. However, I find it concerning that the server allocates a TX even if the client isn't flowing one up. It seems wasteful.
I understand .NET transactions can be lightweight. Am I overreacting? Should I just trust .NET and let it allocate a needless local transaction? I'm worried it's unnecessary bulk, and even more worried that it may get promoted to MSDTC involvement.
EDIT 1:
The operation at hand which makes this clumsy is:
insert on table A
insert on table B
read on table A
insert on table C
Operation 3, the read, MUST be marked up as above with TransactionScopeRequired = true; otherwise, since the TX is not flowed, the read times out. I find this a little weird: brute-forcing a TX to exist for a read-only operation. It implies I'll have to mark most of the WCF calls in the system with TransactionScopeRequired = true.
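For context, a minimal sketch of the markup described above; the service and operation names are mine, not from the question:

using System.ServiceModel;

[ServiceContract]
public interface IOrderService
{
    // TransactionFlow on the contract operation allows (but does not require)
    // the client to flow its transaction to the service.
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Allowed)]
    void InsertOrder(int orderId);
}

public class OrderService : IOrderService
{
    // TransactionScopeRequired = true makes WCF run the method inside a transaction:
    // the flowed client transaction if there is one, otherwise a new local one.
    [OperationBehavior(TransactionScopeRequired = true)]
    public void InsertOrder(int orderId)
    {
        // database work done here enlists in the ambient transaction (Transaction.Current)
    }
}

Note that the binding also has to permit transaction flow (e.g. transactionFlow="true" on a wsHttpBinding) before the client's transaction can actually reach the service.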

A transaction is a tiny .NET in-memory data structure. It is nothing. What's expensive are the resource enlistments. That said, you are going to have at least one such enlistment.
Transactions usually help with database throughput, especially with writes.
You probably want your method to execute under a transaction anyway because you want effects to be atomic and reads to be consistent. It doesn't matter whether the client requests a tran or not.
and even more worried it may get promoted to MSDTC involvement
That's a valid concern: distributed transactions are best avoided because they are slow and they do not work at all with some HA strategies, such as mirroring and Availability Groups (AGs).
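If promotion is the main worry, one hedged way to see whether it has actually happened (purely illustrative, not from the answer above) is to check the ambient transaction's distributed identifier, which stays Guid.Empty until the transaction is promoted to MSDTC:

using System;
using System.Transactions;

static void LogPromotionState()
{
    Transaction tx = Transaction.Current;
    if (tx == null)
    {
        Console.WriteLine("No ambient transaction.");
        return;
    }

    // DistributedIdentifier is Guid.Empty while the transaction is still a
    // lightweight (LTM) transaction; it becomes non-empty once promoted to MSDTC.
    Guid distributedId = tx.TransactionInformation.DistributedIdentifier;
    Console.WriteLine(distributedId == Guid.Empty
        ? "Local (lightweight) transaction."
        : "Promoted to a distributed transaction: " + distributedId);
}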

Related

Concurrency for many API requests

I have an API that is used to add/update records in the DB.
At the start of such a request I try to get data from the DB by some identifiers for the request and do some updates.
If there are a few concurrent requests to my API, some duplicates may be created.
So I am thinking about "wait till the previous request is finished".
For this I found a solution: use new SemaphoreSlim(1, 1) to allow only 1 request at a time.
But now I am wondering if it is a good solution, because if 1 request may take up to 1 min of processing, will other requests stay alive until the SemaphoreSlim allows them to be processed?
For sure that is something related to configuration, but it is always some approximate number in settings, and it may be limited by some additional threshold settings.
The canonical way to do this is to use database transactions.
For example, SQL Server's transaction isolation level "serializable" ensures that even if transactions are concurrent, the effect will be as if they had been executed one after another. This will give you the best of both worlds: Your requests can be processed in parallel, and the database engine ensures that locks and serialization happen if, and only if, it's required to avoid transactional inconsistency.
Conveniently, "serializable" is the default isolation level used by TransactionScope. Thus, if your DB library provider supports it, wrapping your code in a TransactionScope block might be all you need.

Two db contexts under TransactionScope fails

I am stuck using two DB connections with Entity Framework contexts under a single transaction.
I am trying to use two DB contexts under one transaction scope. I get "MSDTC not available". I read that it's not an EF problem; it's the DTC which does not allow two connections.
Is there any answer for this problem?
This happens because the framework thinks that you are trying to have a transaction span multiple databases. This is called a distributed transaction.
To use distributed transactions, you need a transaction coordinator. In your case, the coordinator is the Microsoft Distributed Transaction Coordinator, which runs as a Windows Service on your server. You will need to make sure that this service is running.
Starting the service should solve your immediate issue.
Two-phase commit
From a purely theoretical point of view, distributed transactions are an impossibility* - that is, disparate systems cannot coordinate their actions in such a way that they can be absolutely certain that they either all commit or all roll back.
However, using a transaction coordinator, you get pretty darn close (and 'close enough' for any conceivable purpose). When using a distributed transaction, each party in the transaction will try to make the required changes and report back to the coordinator whether all went well or not. If all parties report success, the coordinator will tell all parties to commit. However, if one or more parties report a failure, the coordinator will tell all parties to roll back their changes. This is the "Two-phase commit protocol".
Watch out
It obviously takes time for the coordinator to communicate with the different parties of the transaction. Thus, using distributed transactions can hamper performance. Moreover, you may experience blocking and deadlocking among your transactions, and MSDTC obviously complicates your infrastructure.
Thus, before you turn on the Distributed Transaction Coordinator service and forge ahead with your project, you should first take a long, hard look at your architecture and convince yourself that you really need to use multiple contexts.
If you do need multiple contexts, you should investigate whether you can prevent transactions from being escalated to distributed transactions.
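For example, if both contexts point at the same database, they can share a single open connection and a single local transaction, so nothing ever escalates to MSDTC. This is a sketch assuming EF6-style contexts; OrdersContext and InvoicesContext are hypothetical names that would need a constructor accepting an existing connection:

using System.Data.SqlClient;

static void UpdateBoth(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            // contextOwnsConnection: false means the contexts will not close the shared connection
            using (var ctxA = new OrdersContext(conn, contextOwnsConnection: false))
            using (var ctxB = new InvoicesContext(conn, contextOwnsConnection: false))
            {
                ctxA.Database.UseTransaction(tx);
                ctxB.Database.UseTransaction(tx);

                // ... make changes through both contexts ...
                ctxA.SaveChanges();
                ctxB.SaveChanges();
            }
            tx.Commit(); // one local SQL transaction, no distributed transaction needed
        }
    }
}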
Further reading
You may want to read:
MSDN: "Managing Connections and Transactions" (specifically on EF)
Blog: "Avoid unwanted escalation to distributed transactions" (a bit dated, though)
* See, for example: Reasoning About Knowledge
You should run the MSDTC (Distributed Transaction Coordinator) system service.

How do I minimize or inform users of database connection lag / failure?

I'm maintaining an ASP/C# program that uses MS SQL Server 2008 R2 for its database requirements.
On normal and perfect days, everything works fine as it is. But we don't live in a perfect world.
An Application (for Leave, Sick Leave, Overtime, Undertime, etc.) Approval process requires up to ten separate connections to the database. The program connects to the database, passes around some relevant parameters, and uses stored procedures to do the job. Ten times.
Now, due to the structure of the entire thing, which I cannot change, a dip in the connection, or heck, if I put a debug point in VS2005 and let it hang there long enough, the Application Approval Process goes incomplete. The tables are often just joined together, so a data mismatch - missing data here, a primary key that failed to update there - would mean an entire row would be useless.
Now, I know that there is nothing I can do to prevent this - this is a connection issue, after all.
But are there ways to minimize connection lag / failure? Or a way to inform the users that something went wrong with the process? A rollback changes feature (either via program, or SQL), so that any incomplete data in the database will be undone?
Thanks.
But are there ways to minimize connection lag / failure? Or a way to inform the users that something went wrong with the process? A rollback changes feature (either via program, or SQL), so that any incomplete data in the database will be undone?
As we discussed in the comments, transactions will address many of your concerns.
A transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions.
Transactions in a database environment have two main purposes:
To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.
To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.
Source
Transactions in .Net
As you might expect, the database is integral to providing transaction support for database-related operations. However, creating transactions from your business tier is quite easy and allows you to use a single transaction across multiple database calls.
Quoting from my answer here:
I see several reasons to control transactions from the business tier:
Communication across data store boundaries. Transactions don't have to be against an RDBMS; they can be against a variety of entities.
The ability to rollback/commit transactions based on business logic that may not be available to the particular stored procedure you are calling.
The ability to invoke an arbitrary set of queries within a single transaction. This also eliminates the need to worry about transaction count.
Personal preference: c# has a more elegant structure for declaring transactions: a using block. By comparison, I've always found transactions inside stored procedures to be cumbersome when jumping to rollback/commit.
Transactions are most easily declared using the TransactionScope (reference) abstraction which does the hard work for you.
using (var ts = new TransactionScope())
{
    // do some work here that may or may not succeed

    // if this line is reached, the transaction will commit. If an exception is
    // thrown before this line is reached, the transaction will be rolled back.
    ts.Complete();
}
Since you are just starting out with transactions, I'd suggest testing out a transaction from your .Net code.
Call a stored procedure that performs an INSERT.
After the INSERT, purposely have the procedure generate an error of any kind.
You can validate your implementation by seeing that the INSERT was rolled back automatically.
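A sketch of that test; the stored procedure name is hypothetical, and the only important part is that the exception escapes the using block before Complete() is called:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void TestRollback(string connectionString)
{
    try
    {
        using (var ts = new TransactionScope())
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // hypothetical procedure: performs an INSERT and then raises an error on purpose
            using (var cmd = new SqlCommand("dbo.InsertThenFail", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.ExecuteNonQuery(); // throws a SqlException because of the deliberate error
            }

            ts.Complete(); // never reached
        }
    }
    catch (SqlException)
    {
        // the scope was disposed without Complete(), so the INSERT was rolled back;
        // query the table afterwards to confirm the row is not there
    }
}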
Transactions in the Database
Of course, you can also declare transactions inside a stored procedure (or any sort of TSQL statement). See here for more information.
If you use the same SqlConnection, or another connection type that implements IDbConnection, you can do something similar to TransactionScope but without the need to create the security risk that is a TransactionScope.
In VB:
Using scope As IDbTransaction = mySqlCommand.Connection.BeginTransaction()
    ' enlist the command in the transaction before executing it
    ' (cast scope to the concrete type, e.g. SqlTransaction, if the command requires it)
    mySqlCommand.Transaction = scope

    If blnEverythingGoesWell Then
        scope.Commit()
    Else
        scope.Rollback()
    End If
End Using
If you don't call Commit, the transaction is rolled back when it is disposed at the end of the Using block.
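For comparison, the same pattern in C# with an explicit SqlTransaction; the connection string and SQL are illustrative only:

using System.Data.SqlClient;

static void DoWork(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tx = conn.BeginTransaction())
        {
            try
            {
                // the command must be associated with both the connection and the transaction
                using (var cmd = new SqlCommand(
                    "UPDATE dbo.SomeTable SET Approved = 1 WHERE Id = @id", conn, tx))
                {
                    cmd.Parameters.AddWithValue("@id", 42);
                    cmd.ExecuteNonQuery();
                }
                tx.Commit();
            }
            catch
            {
                tx.Rollback(); // explicit, though disposing an uncommitted transaction also rolls back
                throw;
            }
        }
    }
}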

What's the best way to manage concurrency in a database access application?

A while ago, I wrote an application used by multiple users to handle trades creation.
I haven't done development for some time now, and I can't remember how I managed the concurrency between the users. Thus, I'm seeking some advice in terms of design.
The original application had the following characteristics:
One heavy client per user.
A single database.
Access to the database for each user to insert/update/delete trades.
A grid in the application reflecting the trades table. That grid being updated each time someone changes a deal.
I am using WPF.
Here's what I'm wondering:
Am I correct in thinking that I shouldn't care about the connection to the database for each application? Considering that there is a singleton in each, I would expect one connection per client with no issue.
How can I go about preventing concurrent accesses from conflicting? I guess I should lock when modifying the data; however, I don't remember how.
How do I set up the grid to automatically update whenever my database is updated (by another user, for example)?
Thank you in advance for your help!
Consider leveraging Connection Pooling to reduce # of connections. See: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
Lock as late as possible and release as soon as possible to maximize concurrency. You can use TransactionScope (see: http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx and http://blogs.msdn.com/b/dbrowne/archive/2010/05/21/using-new-transactionscope-considered-harmful.aspx) if you have multiple DB actions that need to go together to maintain consistency, or just handle them in a DB stored proc. Keep your queries simple. Follow these tips to understand how locking works and how to reduce resource contention and deadlocks: http://www.devx.com/gethelpon/10MinuteSolution/16488
I am not sure about other DBs, but for SQL Server you can use SqlDependency; see http://msdn.microsoft.com/en-us/library/a52dhwx7(v=vs.80).aspx
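A rough sketch of SqlDependency (SQL Server only; the query must follow the query-notification rules, e.g. a two-part table name and an explicit column list, and Service Broker must be enabled on the database; the table and columns here are made up):

using System;
using System.Data.SqlClient;

static void SubscribeToTradeChanges(string connectionString)
{
    SqlDependency.Start(connectionString); // once per application / connection string

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("SELECT Id, Price, Quantity FROM dbo.Trades", conn))
    {
        var dependency = new SqlDependency(cmd);
        dependency.OnChange += (sender, e) =>
        {
            // e.Info describes the change; re-run the query, refresh the grid,
            // and re-subscribe here (a notification fires only once)
            Console.WriteLine("Trades changed: " + e.Info);
        };

        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* populate the grid initially */ }
        }
    }
}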
Concurrency is usually granted by the DBMS using locks. Locks are a type of semaphore that grant exclusive access to a certain resource and allow other accesses to be restricted or queued (only restricted in the case you use uncommitted reads).
The number of connections itself does not pose a problem while you are not reaching heights where you might touch on the max_connections setting of your DBMS. Otherwise, you might get a problem connecting to it for maintenance purposes or for shutting it down.
DBMSes usually use a concept of either table locks (MyISAM) or row locks (InnoDB, most other DBMSes). The type of lock determines the granularity of the lock. Table locks can be very fast but are usually considered inferior to row-level locks.
Row-level locks occur inside a transaction (implicit or explicit). When manually starting a transaction, you begin your transaction scope. Until you manually close the transaction scope, all changes you make will be attributed to this exact transaction. The changes you make will also obey the ACID paradigm.
Transaction scope and how to use it is a topic far too long for this platform, if you want, I can post some links that carry more information on this topic.
For the automatic updates, most databases support some kind of trigger mechanism, which is code that is run on specific actions on the database (for instance the creation of a new record or the change of a record). You could put your code inside this trigger. However, you should only inform a receiving application of the changes, not really "do" the changes from the trigger, even if the language might make it possible. Remember that the action which triggered the code is suspended until you finish your trigger code. This means that a lean trigger is best, if one is needed at all.

Sql Transaction - SQL Server or C#?

Am I right in saying that from a performance perspective, sql transactions are far better within a stored procedure than code?
At the moment I use most of my transactions in stored procs but sometimes I use code for more complex routines - which obviously I keep to a minimum as much as possible.
It's just that there was a complex routine that required so many "variables" that writing the SQL transaction in C# was far easier than doing it in SQL Server. It's a fine line between code readability and performance.
Any ideas?
The performance varies; a SqlTransaction can have less overhead than a TransactionScope, especially if the TransactionScope decides it needs to get entangled with DTC. But I wouldn't expect a vast difference between SqlTransaction and a BEGIN TRAN, except for the extra round trip. However, TransactionScope is still fast, and is the most convenient option for encapsulating multiple operations in a transaction, as the ambient transaction does not need to be manually associated with the command each time.
Perhaps a better (and more significant) factor is the isolation level. TransactionScope defaults to the highest (Serializable). Lower isolation levels allow for less blocking (but at the risk of non-repeatable reads, etc.). IIRC a TSQL transaction defaults to one of the lower levels. But the isolation level can be tweaked for all three options.
I'm for TransactionScope. As per Marc, use a factory method on your TransactionScope to drop the isolation level to READ COMMITTED for the most common usages I can think of (see the sketch below).
Note that you can use both SQL transactions AND TransactionScope - the SQL BEGIN TRAN / COMMIT TRAN will have little effect on the TransactionScope (other than incrementing / decrementing @@TRANCOUNT) - this way, if you do need to call the same SQL sproc elsewhere, e.g. from an ad hoc query, you will still get the benefit of a transaction.
The benefit of TransactionScope IMO is that it will manage the DTC for you if you DO need to do a two-phase commit (e.g. multiple databases, queues or other XA transactions). And with SQL 2005 and later, it works with the Lightweight Transaction Manager, so DTC won't be required if, for example, all accesses are to the one database, one connection at a time.
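As a sketch of the factory-method idea mentioned above (the class and method names are mine):

using System.Transactions;

public static class Tx
{
    // Returns a TransactionScope that uses READ COMMITTED instead of the
    // default Serializable isolation level.
    public static TransactionScope CreateReadCommittedScope()
    {
        var options = new TransactionOptions
        {
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.DefaultTimeout
        };
        return new TransactionScope(TransactionScopeOption.Required, options);
    }
}

// usage:
// using (var scope = Tx.CreateReadCommittedScope())
// {
//     // ... database work ...
//     scope.Complete();
// }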
Transaction management in the code (read: C#) is an option, and the one to follow for managing transactions over multiple data sources or systems. For managing transactions against a single database, management at the server end will always be simpler. But if you think that the code might need to cater to a scenario where multiple data sources get added to the transaction, keep the transactions at the code level.
I believe that the stored procedure will have better performance.
If you mean you write SQL transactions in C# and use ADO.NET or similar to execute them, then they are probably less efficient, because SQL Server will cache the query plan for a stored procedure (which is also now the case for Entity Framework, though still not as quick as a proc, I don't think). So really you should probably be doing it the other way round: complex procedures in SQL to get the caching benefits (if only it were that simple...).
It depends on the application.
But I will say that in most cases it is best to have the logic in the database. Another advantage of having business logic in the database is that it will be the same even if you later have a WinForms version and a web version, or whatever.
But if you're talking about the CLR in SQL Server, the downside is that it becomes much more difficult for a DBA to find errors or troubleshoot performance.
