I want to understand what is the trade-of/downside of using TransactionScopeOption.RequiresNew on EntityFramework (w/ Sql Server 2008), what are the reasons why we should NOT use RequiresNew always.
Regards.
You should use Required not RequiresNew. RequiresNew means every operation will use a new transaction, even if there is an encompassing already existing transaction scope. This will certainly lead to deadlocks. Even with Required there is another serious problem with TransactionScope, namely that it creates by default a Serializable transaction, which is a horribly bad choice and yet another shortcut to deadlock hell and no scalability. See using new TransactionScope() Considered Harmful. You should always create a transaction scope with the explicit TransactionOption setting the isolation level to ReadCommitted, which a much much much more sane isolation level:
using(TransactionScope scope = new TransactionScope(
TransactionScopeOption.Required,
new TransactionOptions {
IsolationLevel = IsolationLevel.ReadCommitted}))
{
/// do work here
...
scope.Complete();
}
I just wanted to add here that in a couple certain cases the method i've written is inside a parent transaction scope that may or may not be closed with scope.Complete() in these cases I didn't want to be dependent on the parent transaction so we needed to set RequiresNew.
In general though I agree it's not necessary and should use read committed.
http://msdn.microsoft.com/en-us/library/ms172152(v=vs.90).aspx
Related
I'm implementing a DAL abstraction layer on top of the C# Mongo DB driver using the Repository + Unit of Work pattern.
My current design is that every unit of work instance opens (and closes) a new Mongo DB session.
The problem is that Mongo DB only allows a 1:1 ratio between a session and a transaction, so multiple units of work under the same .NET transaction will not be possible.
The current implementation is:
public class MongoUnitOfWork
{
private IClientSessionHandle _sessionHandle;
public MongoUnitOfWork(MongoClient mongoClient)
{
_sessionHandle = mongoClient.StartSession();
}
public void Dispose()
{
if (_sessionHandle != null)
{
// Must commit transaction, since the session is closing
if (Transaction.Current != null)
_sessionHandle.CommitTransaction();
_sessionHandle.Dispose();
}
}
}
And because of this, the following code won't work. The first batch of data will be committed ahead of time:
using (var transactionScope = new TransactionScope())
{
using (var unitOfWork = CreateUnitOfWork())
{
//... insert items
unitOfWork.SaveChanges();
} // Mongo DB unit of work implementation will commit the changes when disposed
// Do other things
using (var unitOfWork = CreateUnitOfWork())
{
//... insert some more items
unitOfWork.SaveChanges();
}
transactionScope.Complete();
}
Obviously the immediate answer would be to bring all of the changes into one unit of work, but it's not always possible, and also this leaks the Mongo DB limitation.
I thought about session pooling, so that multiple units of work will use the same session, and commit/rollback when the transient transaction completes/aborts.
What other solutions are possible?
Clarification:
The question here is specifically regarding Unit-Of-Work implementation over MongoDB using the MongoDB 4.0 (or later) built-in transaction support.
I never used MongoDB; do not know anything about it. I am only answering in terms of TransactionScope; so not sure if this will help you.
Please refer the Magic Of TransactionScope. IMO, there are three factors you should look for:
Connection to the database should be opened inside the TransactionScope.
So remember, the connection must be opened inside the TransactionScope block for it to enlist in the ambient transaction automatically. If the connection was opened before that, it will not participate in the transaction.
Not sure but it looks that you can manually enlist the connection opened outside the scope using connection.EnlistTransaction(Transaction.Current).
Looking at your comment and the edit, this is not an issue.
All operations should run on same thread.
The ambient transaction provided by the TransactionScope is a thread-static (TLS) variable. It can be accessed with static Transaction.Current property. Here is the TransactionScope code at referencesource.microsoft.com. ThreadStatic ContextData, contains the CurrentTransaction.
and
Remember that Transaction.Current is a thread static variable. If your code is executing in a multi-threaded environments, you may need to take some precautions. The connections that need to participate in ambient transactions must be opened on the same thread that creates the TransactionScope that is managing that ambient transaction.
So, all the operations should run on same thread.
Play with TransactionScopeOption (pass it to constructor of TransactionScope) values as per your need.
Upon instantiating a TransactionScope by the new statement, the transaction manager determines which transaction to participate in. Once determined, the scope always participates in that transaction. The decision is based on two factors: whether an ambient transaction is present and the value of the TransactionScopeOption parameter in the constructor.
I am not sure what your code expected to do. You may play with this enum values.
As you mentioned in the comment, you are using async/await.
Lastly, if you are using async/await inside the TransactionScope block, you should know that it does not work well with TransactionScope and you might want to look into new TransactionScope constructor in .NET Framework 4.5.1 that accepts a TransactionScopeAsyncFlowOption. TransactionScopeAsyncFlowOption.Enabled option, which is not the default, allows TransactionScope to play well with asynchronous continuations.
For MongoDB, see if this helps you.
The MongoDB driver isn't aware of ambient TransactionScopes. You would need to enlist with them manually, or use JohnKnoop.MongoRepository which does this for you: https://github.com/johnknoop/MongoRepository#transactions
What is difference of these codes:
using (TransactionScope tran = new TransactionScope ())
{
using (Entities ent = new Entities())
{
and
using (Entities ent = new Entities())
{
using (TransactionScope tran = new TransactionScope ())
{
Does line order matter?
Thanks
In this case it wouldn't matter, since the Entities seems like Model / Data class, which cannot / needn't enlist into a transaction, so you can have any order, it would not create a difference. Now if the there's any issue in the Entities operation, Read / DML then transaction would also abort, assuming that operation takes place in the ambient transaction context, though class / object doing it like IDBConnection would anyway automatically enlisted into a transaction (provided not set to false), however even if Connection is created outside transaction context, then would not automatically enlist, needs explicit enlisting for Transaction Context
In Short
For your current code, it doesn't matter till the point you are dealing with an object that needs Transaction enlistment like IDBConnection. Out of two code snippets though my preference would be first one, where ambient transaction is all encompassing all the objects need to be enlist automatically does, we don't leave any object by chance.
Important Difference
What you though may want to be careful about whether you want to access Entities outside the Transaction Context, since that is not possible in the Option 1 but would not be an issue in the Option 2
Yes the order matters. Or rather we can't say it doesn't matter without looking at your code.
DbConnection instances will enist in an ambient transaction if one exists when they are Open()ed.
Your DbContext constructor might open the underlying DbConnection, in which case the two patterns differ.
The first one is the normal pattern, and you should stick to that.
Also if you are using SQL Server don't use the default constructor of TransactionScope. See Using new TransactionScope() Considered Harmful
Can I do any other things (not related to DataBase) in the TransactionScope ?
Will it be a bad practice ?
e.g
Starting a workflow inside the scope.
I don't want to save in DB unless starting the workflow fails.
If it's a bad practice, what would be a good practice?
Thanks in advance.
using (TransactionScope scope = new TransactionScope())
{
using (ComponentCM.Audit auditComponent = new ComponentCM.Audit())
{
using (AccessCM.Biz dataAccess = new AccessCM.Biz())
{
auditComponent.SaveInDB();
dataAccess.SaveinDB()
StartWorkflow();
}
}
scope.Complete();
}
You want transactions to take as little time as possible, as they can cause locking on the database.
In this regard, it is better to do other work outside of the transaction scope.
I agree with Oded.
I'd also add that the scope of the TransactionScope should define the state changes that are defined in your unit of work. In other words, state changes inside the scope should all succeed or fail together and are usually part of the same logical operation.
In order to keep the scope of the as small as possible, one common pattern is optimistic concurrency. When using optimistic concurrency, you typically only open a transaction scope when performing a write operation on the state store. In this case, you are excepting that usually the data you are working on will not have changed (by another operation) between the time that you query and commit.
I would like to be 100% that when I use such pattern
using (var scope = new TransactionScope())
{
db1.Foo.InsertOnSubmit(new Foo()); // just an example
db1.SubmitChanges();
scope.Commit();
}
the transaction scope will only refer to db1 not db2 (db1 and db2 are DataBaseContext objects). So how to limit scope to db1 only?
Thank you in advance.
The strength of TransactionScope is that it automatically catches everything within it, down the call stack (unless another TransactionScope with RequiresNew or Supress). If this is not what you want, you should use another mechanism for the transaction handling.
One way is to open SqlConnection and call BeginTransaction to get a transaction, then use that DB connection when creating your data context. (Maybe you'll have to set the Transaction propery on the data context, I'm not sure).
In the example you have given above, the use of a TransactionScope is totally redundant. There is only one function call that actually modifies the database: SubmitChanges and it always creates its own transaction if there isn't already one existing. The reason is that when you're doing several operations SubmitChanges should either succeed in them all or fail all. If you're only after transactional integrity for a single SubmitChanges call for a single data context, then you can just drop the TransactionScope.
As far as I know, what you are after is exactly what you have written, the compiler will ensure that transaction is completed when you commit.
If you use:
using (var scope = new TransactionScope(TransactionScopeOption.RequiresNew))
{
db1.Foo.InsertOnSubmit(new Foo()); // just an example
db1.SubmitChanges();
scope.Commit();
}
You can be sure a new transcation is created for this scope, so everyting you do inside the scope has its own transaction, and will never use any existing ambient transactions.
I'm having a major performance issue with LINQ2SQL and transactions. My code does the following using IDE generated LINQ2SQL code:
Run a stored proc checking for an existing record
Create the record if it doesn't exist
Run a stored proc that wraps its own code in a transaction
When I run the code with no transaction scope, I get 20 iterations per second. As soon as I wrap the code in a transaction scope, it drops to 3-4 iterations per second. I don't understand why the addition of a transaction at the top level reduces the performance by so much. Please help?
Psuedo stored proc with transaction:
begin transaction
update some_table_1;
insert into some_table_2;
commit transaction;
select some, return, values
Pseudo LINQ code without transaction:
var db = new SomeDbContext();
var exists = db.RecordExists(some arguments);
if (!exists) {
var record = new SomeRecord
{
// Assign property values
};
db.RecordsTable.InsertOnSubmit(record);
db.SubmitChanges();
var result = db.SomeStoredProcWithTransactions();
}
Pseudo LINQ code with transaction:
var db = new SomeDbContext();
var exists = db.RecordExists(some arguments);
if (!exists) {
using (var ts = new TransactionScope())
{
var record = new SomeRecord
{
// Assign property values
};
db.RecordsTable.InsertOnSubmit(record);
db.SubmitChanges();
var result = db.SomeStoredProcWithTransactions();
ts.Complete();
}
}
I know the transaction isn't being escalated to the DTC because I've disabled the DTC. SQL Profiler shows that several of the queries take much longer with the transactionscope enabled, but I'm not sure why. The queries involved are very short lived and I've got indexes that I have verified are being used. I'm unable to determine why the addition of a parent transaction causes so much degredation in performance.
Any ideas?
EDIT:
I've traced the problem to the following query within the final stored procedure:
if exists
(
select * from entries where
ProfileID = #ProfileID and
Created >= #PeriodStart and
Created < #PeriodEnd
) set #Exists = 1;
If I had with(nolock) as shown below, the problem disappears.
if exists
(
select * from entries with(nolock) where
ProfileID = #ProfileID and
Created >= #PeriodStart and
Created < #PeriodEnd
) set #Exists = 1;
However, I'm concerned that doing so may cause problems down the road. Any advice?
One big thing that changes as soon as you get a transaction - the isolation level. Is your database under heavy contention? If so: by default a TransactionScope is at the highest "serializable" isolation level, which involves read locks, key-range locks, etc. If it can't acquire those locks immediately it will slow down while it it blocked. You could investigate by reducing the isolation level of the transaction (via the constructor). For example (but pick your own isolation-level):
using(var tran = new TransactionScope(TransactionScopeOption.Required,
new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot })) {
// code
tran.Complete();
}
However, picking an isolation level is... tricky; serializable is the safest (hence the default). You can also use granular hints (but not via LINQ-to-SQL) such as NOLOCK and UPDLOCK to help control locking of specific tables.
You could also investigate whether the slowdown is due to trying to talk to DTC. Enable DTC and see if it speeds up. The LTM is good, but I've seen composite operations to a single database escalate to DTC before...
Although you are using a single datacontext, your code sample is likely to use more than one connection and that will escalate your transaction to a distributed transaction.
Try initializing your datacontext with an explicit db connection, or call db.Connection.Open() right after creating the datacontext. That removes the overhead of distributed transactions...
Does the Stored Procedure you call participate in the ambient (parent) transaction? - that is the question.
It's likely that the Stored Procedure participates in the ambient transaction, which is causing the degredation. There's an MSDN article here discussing how they interrelate.
From the article:
"When a TransactionScope object joins an existing ambient transaction, disposing of the scope object may not end the transaction, unless the scope aborts the transaction. If the ambient transaction was created by a root scope, only when the root scope is disposed of, does Commit get called on the transaction. If the transaction was created manually, the transaction ends when it is either aborted, or committed by its creator."
There's also a serious looking document on nested transactions which looks like it is directly applicable localted on MSDN here.
Note:
"If TransProc is called when a transaction is active, the nested transaction in TransProc is largely ignored, and its INSERT statements are committed or rolled back based on the final action taken for the outer transaction."
I think that explains the difference in performance - it's essentially the cost of maintaining the parent transaction. Kristofer's suggestion may help to reduce the overhead.