MSDTC getting invoked. But why? - C#

For my data access I use TransactionScopes at the API level to wrap entire operations in a single transaction so that my SQL operations can be somewhat composable. I have a web project that hosts an API and a separate service library that contains the implementation and the calls to SQL. At the beginning of an operation (an API entry point) I open the TransactionScope. Whenever a SqlConnection is needed within the processing of the operation, I ask for the AmbientConnection instead of directly making a new connection. AmbientConnection finds or creates a SqlConnection for the current transaction. Doing this is supposed to allow for good composability while also avoiding the invocation of the MSDTC, because it should keep using the same connection for each suboperation within the transaction. When the transaction is completed (with scope.Complete()), the connection is automatically closed.
The problem is that every once in a while the MSDTC is still getting invoked and I cannot figure out why. I've used this approach before successfully and I believe I never had the MSDTC invoked. Two things seem different to me this time, though: 1) I'm using SQL Server 2008 R1 (10.50.4000) - not my choice - and I'm aware that the MSDTC escalation behavior changed beginning with this version, and perhaps not all the kinks were worked out until later versions. 2) The use of async/await is new, and I believe I'm having to use TransactionScopeAsyncFlowOption.Enabled to accommodate this new feature in case some part of the implementation is async. Perhaps more measures are necessary.
I tried Pooling=false in the connection string in case the MSDTC was being invoked because two independent logical connections were errantly handled under a single pooled connection. But that didn't work.
API Operation
// Exposed API composing multiple low-level operations within a single TransactionScope,
// independent of any database platform specifics.
[HttpPost]
public async Task<IHttpActionResult> GetMeTheTwoThings()
{
    using (var scope = new TransactionScope(TransactionScopeOption.Required, TransactionScopeAsyncFlowOption.Enabled))
    {
        var result = new TwoThings(
            await serviceLayer.GetThingOne(),
            await serviceLayer.GetThingTwo());
        scope.Complete();
        return Ok(result);
    }
}
Service layer implementation
public async Task<ThingOne> GetThingOne()
{
    using (var cmd = connManagement.AmbientConnection.CreateCommand())
    {
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.CommandText = "dbo.GetThingOne";
        return (ThingOne)(await cmd.ExecuteScalarAsync());
    }
}

public async Task<ThingTwo> GetThingTwo()
{
    using (var cmd = connManagement.AmbientConnection.CreateCommand())
    {
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.CommandText = "dbo.GetThingTwo";
        return (ThingTwo)(await cmd.ExecuteScalarAsync());
    }
}
AmbientConnection implementation
internal class SQLConnManagement
{
    readonly string connStr;
    readonly ConcurrentDictionary<Transaction, SqlConnection> txConnections = new ConcurrentDictionary<Transaction, SqlConnection>();

    private SqlConnection CreateConnection(Transaction tx)
    {
        var conn = new SqlConnection(this.connStr);
        // When the transaction completes, close the connection as well
        tx.TransactionCompleted += (s, e) =>
        {
            SqlConnection closing_conn;
            if (txConnections.TryRemove(e.Transaction, out closing_conn))
            {
                closing_conn.Dispose(); // closing_conn == conn
            }
        };
        conn.Open();
        return conn;
    }

    internal SqlConnection AmbientConnection
    {
        get
        {
            var txCurrent = Transaction.Current;
            if (txCurrent == null) throw new InvalidOperationException("An ambient transaction is required.");
            return txConnections.GetOrAdd(txCurrent, CreateConnection);
        }
    }

    public SQLConnManagement(string connStr)
    {
        this.connStr = connStr;
    }
}
Not to overcomplicate the post, but this might be relevant: it seems to me that every time the MSDTC has been invoked, the logged stack trace shows that this next mechanism has been involved. Certain data I cache with the built-in ObjectCache because it doesn't change often, so I just get it at most once per minute or whatever. This is a little fancy, but I don't see why the Lazy generator would be treated any differently from a more typical call, and why this specifically would sometimes cause the MSDTC to be invoked. I've tried LazyThreadSafetyMode.ExecutionAndPublication too just in case, but that doesn't help either (and then the exception just keeps getting delivered as the cached result for subsequent requests until expiration, of course, which isn't desirable).
/// <summary>
/// Cache element that gets the item by key, or if it is missing, creates, caches, and returns the item
/// </summary>
static T CacheGetWithGenerate<T>(ObjectCache cache, string key, Func<T> generator, DateTimeOffset offset) where T : class
{
    var generatorWrapped = new Lazy<T>(generator, System.Threading.LazyThreadSafetyMode.PublicationOnly);
    return ((Lazy<T>)cache.AddOrGetExisting(
        key,
        generatorWrapped,
        offset))?.Value ?? generatorWrapped.Value;
}

public ThingTwo CachedThingTwo
{
    get
    {
        return CacheGetWithGenerate(
            MemoryCache.Default,
            "Services.ThingTwoData",
            () => GetThingTwo(), // ok, GetThingTwo isn't async this time, fudged example
            DateTime.Now.Add(TimeSpan.FromMinutes(1)));
    }
}
Do you know why MSDTC is being invoked?

PublicationOnly means that two connections can be created and one thrown away. I'm surprised you introduced this bug, because you explicitly specified PublicationOnly (as opposed to the default safety mode, which is safe), so you explicitly allowed it.
For some reason I did not see that you had already tried ExecutionAndPublication. Since not using it is a bug, please fix the code in the question.
CreateConnection is also broken in the sense that if Open throws, the connection object is never disposed. Probably harmless, but you never know.
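For what it's worth, a small guard along these lines would avoid the leak (a sketch of the method from the question with only that change added; untested against the rest of the class):

private SqlConnection CreateConnection(Transaction tx)
{
    var conn = new SqlConnection(this.connStr);
    try
    {
        // When the transaction completes, close the connection as well
        tx.TransactionCompleted += (s, e) =>
        {
            SqlConnection closing_conn;
            if (txConnections.TryRemove(e.Transaction, out closing_conn))
            {
                closing_conn.Dispose();
            }
        };
        conn.Open();
        return conn;
    }
    catch
    {
        // If Open (or the event hookup) throws, don't leak the SqlConnection.
        conn.Dispose();
        throw;
    }
}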
Also, audit this code for thread aborts which can happen when ASP.NET times out a request. You are doing very dangerous and brittle things here.
The pattern that I use is to have an IoC container inject a connection that is shared for the entire request. The first client for that connection opens it; the request-end event closes it. Simple, and it does away with all that nasty shared, mutable, multi-threaded state. A rough sketch of that per-request idea follows.
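A minimal sketch of that per-request sharing, shown here with HttpContext.Items standing in for an IoC container (the RequestConnection class and its item key are made up for illustration):

using System.Data.SqlClient;
using System.Web;

public static class RequestConnection
{
    private const string Key = "RequestConnection.Current"; // hypothetical item key

    // First caller creates and opens the connection; later callers in the same request reuse it.
    public static SqlConnection Get(HttpContext context, string connStr)
    {
        var conn = (SqlConnection)context.Items[Key];
        if (conn == null)
        {
            conn = new SqlConnection(connStr);
            conn.Open();
            context.Items[Key] = conn;
        }
        return conn;
    }

    // Wire this to the request-end event (e.g. Application_EndRequest in Global.asax).
    public static void Close(HttpContext context)
    {
        var conn = (SqlConnection)context.Items[Key];
        if (conn != null)
        {
            conn.Dispose();
            context.Items.Remove(Key);
        }
    }
}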
Why are you using a cache for data that you do not want to lose? This is probably the bug. Don't do that.
What is ?.Value ?? generatorWrapped.Value about? The dictionary can never return null, so delete that code. If it could return null, then forcing the lazy value would create a second connection, so that would be a logic bug as well.

Related

Linq to Sql: Change Database for each connection

I'm working on an ASP.NET MVC application which uses Linq to SQL to connect to one of about 2000 databases. We've noticed in our profiling tools that the application spends a lot of time making connections to the databases, and I suspect this is partly due to connection pool fragmentation as described here: http://msdn.microsoft.com/en-us/library/8xx3tyca(v=vs.110).aspx
Many Internet service providers host several Web sites on a single server. They may use a single database to confirm a Forms authentication login and then open a connection to a specific database for that user or group of users. The connection to the authentication database is pooled and used by everyone. However, there is a separate pool of connections to each database, which increases the number of connections to the server.
There is a relatively simple way to avoid this side effect without compromising security when you connect to SQL Server. Instead of connecting to a separate database for each user or group, connect to the same database on the server and then execute the Transact-SQL USE statement to change to the desired database.
I am trying to implement this solution in Linq to Sql so we have fewer open connections, and so there is more likely to be a connection available in the pool when we need one. To do that I need to change the database each time Linq to Sql attempts to run a query. Is there any way to accomplish this without refactoring the entire application? Currently we just create a single data context per request, and that data context may open and close many connections. Each time it opens the connection, I'd need to tell it which database to use.
My current solution is more or less like this one - it wraps a SqlConnection object inside a class that inherits from DbConnection. This allows me to override the Open() method and change the database whenever a connection is opened. It works OK for most scenarios, but in a request that makes many updates, I get this error:
System.InvalidOperationException: Transaction does not match connection
My thought was that I would then wrap a DbTransaction object in a similar way to what I did with SqlConnection, and ensure that its connection property would point back to the wrapped connection object. That fixed the error above, but introduced a new one where a DbCommand was unable to cast my wrapped connection to a SqlConnection object. So then I wrapped DbCommand too, and now I get new and exciting errors about the transaction of the DbCommand object being uninitialized.
In short, I feel like I'm chasing specific errors rather than really understanding what's going on in-depth. Am I on the right track with this wrapping strategy, or is there a better solution I'm missing?
Here are the more interesting parts of my three wrapper classes:
public class ScaledSqlConnection : DbConnection
{
    private string _dbName;
    private SqlConnection _sc;

    public override void Open()
    {
        // Open the connection, then change the database to the one that was passed in
        _sc.Open();
        if (this._dbName != null)
            this.ChangeDatabase(this._dbName);
    }

    protected override DbTransaction BeginDbTransaction(IsolationLevel isolationLevel)
    {
        return new ScaledSqlTransaction(this, _sc.BeginTransaction(isolationLevel));
    }

    protected override DbCommand CreateDbCommand()
    {
        return new ScaledSqlCommand(_sc.CreateCommand(), this);
    }
}

public class ScaledSqlTransaction : DbTransaction
{
    private SqlTransaction _sqlTransaction = null;
    private ScaledSqlConnection _conn = null;

    protected override DbConnection DbConnection
    {
        get { return _conn; }
    }
}

public class ScaledSqlCommand : DbCommand
{
    private SqlCommand _cmd;
    private ScaledSqlConnection _conn;
    private ScaledSqlTransaction _transaction;

    public ScaledSqlCommand(SqlCommand cmd, ScaledSqlConnection conn)
    {
        this._cmd = cmd;
        this._conn = conn;
    }

    protected override DbConnection DbConnection
    {
        get
        {
            return _conn;
        }
        set
        {
            if (value is ScaledSqlConnection)
                _conn = (ScaledSqlConnection)value;
            else
                throw new Exception("Only ScaledSqlConnections can be connections here.");
        }
    }

    protected override DbTransaction DbTransaction
    {
        get
        {
            if (_transaction == null && _cmd.Transaction != null)
                _transaction = new ScaledSqlTransaction(this._conn, _cmd.Transaction);
            return _transaction;
        }
        set
        {
            if (value == null)
            {
                _transaction = null;
                _cmd.Transaction = null;
            }
            else
            {
                if (value is ScaledSqlTransaction)
                    _transaction = (ScaledSqlTransaction)value;
                else
                    throw new Exception("Don't set the transaction of a ScaledDbCommand with " + value.ToString());
            }
        }
    }
}
I don't think that's going to work off a single shared connection.
LINQ to SQL works best with Unit of Work type connections - create your connection, do your atomically grouped work, and close the connection as quickly as possible, then reopen for the next task. If you do that, then passing in a connection string (or using a custom constructor that only passes a table name) is pretty straightforward.
If refactoring your application is a problem, you could use a getter for the cached DataContext 'instance' that creates a new instance each time it is requested, instead of handing out the cached/shared instance, and inject the connection string in that getter.
But I'm pretty sure this will not help with your pooling issue. The SQL Server driver pools connections per distinct connection string value - since the values still vary, you're right back to having lots of connections active across those separate pools, which will likely result in lots of pool misses and therefore slow connections.
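As a rough illustration of that unit-of-work style with LINQ to SQL (MyDataContext and Albums stand in for the designer-generated context and table from your application; they are not from the question):

using System.Linq;

public static class AlbumQueries
{
    // Open a context (and therefore a connection) for one atomically grouped
    // piece of work, then dispose it immediately so the connection returns to the pool.
    public static int CountAlbums(string connectionString)
    {
        using (var db = new MyDataContext(connectionString))
        {
            return db.Albums.Count();
        }
    }
}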
I think I figured out a solution that works for my situation. Rather than wrapping SqlConnection and overriding Open() to change databases, I'm passing the data context a new SqlConnection and subscribing to the connection's StateChange event. When the state changes, I check whether the connection has just been opened. If so, I call SqlConnection.ChangeDatabase() to point it at the correct database. I tested this solution and it seems to work - I see only one connection pool for all the databases rather than one pool for each db that has been accessed.
I realize this isn't the ideal solution in an ideal application, but for how this application is structured I think it should make a decent improvement for relatively little cost.
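A sketch of that approach, assuming you hand the data context a connection built like this (the database name is whatever you resolve per user or request):

using System.Data;
using System.Data.SqlClient;

public static class ScaledConnectionFactory
{
    // Every time the connection is (re)opened, switch to the desired database,
    // so all databases share a single connection pool.
    public static SqlConnection Create(string connectionString, string databaseName)
    {
        var conn = new SqlConnection(connectionString);
        conn.StateChange += (sender, e) =>
        {
            if (e.OriginalState != ConnectionState.Open && e.CurrentState == ConnectionState.Open)
            {
                ((SqlConnection)sender).ChangeDatabase(databaseName);
            }
        };
        return conn;
    }
}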
I think the best way is to implement the Unit of Work pattern together with the Repository pattern when working with Entity Framework. Entity Framework has FirstAsync and FirstOrDefaultAsync; this helped me fix the same bug.
https://msdn.microsoft.com/en-us/data/jj819165.aspx
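As a small illustration of the async query part (the Album entity and the generic DbContext usage here are placeholders, not code from the question):

using System.Data.Entity;
using System.Threading.Tasks;

public class Album
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class AlbumRepository
{
    private readonly DbContext _context;

    public AlbumRepository(DbContext context)
    {
        _context = context;
    }

    // Fetch a single row asynchronously instead of blocking on First/FirstOrDefault.
    public Task<Album> FindByTitleAsync(string title)
    {
        return _context.Set<Album>().FirstOrDefaultAsync(a => a.Title == title);
    }
}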

Wrapping methods using db connection in MVC

I was following the MVC Music Store tutorial and I came to the part where they are using database connections (DbConnection here is a class derived from DbContext). I was taught to create methods like this (wrapping with using):
public class StoreManagerController : Controller
{
    //
    // GET: /StoreManager/
    public ActionResult Index()
    {
        using (var db = new DbConnection())
        {
            var albums = db.Albums.Include(a => a.Genre).Include(a => a.Artist);
            return View(albums.ToList());
        }
    }
    ...
}
but Visual Studio generated me a controller which looked like this:
public class StoreManagerController : Controller
{
    private DbConnection db = new DbConnection();

    //
    // GET: /StoreManager/
    public ActionResult Index()
    {
        var albums = db.Albums.Include(a => a.Genre).Include(a => a.Artist);
        return View(albums.ToList());
    }
    ...
}
I assume Visual Studio isn't wrong, but why was I told to wrap each method in a using block to keep connections as short-lived as possible and to give each user a separate connection?
I assume, Visual Studio isn't wrong, but why was I told to wrap each method with using
using (var db = new DbConnection())
{
    var albums = db.Albums.Include(a => a.Genre).Include(a => a.Artist);
    return View(albums.ToList());
}
The scope of db remains only within the curly braces. This is perhaps another purpose that the using keyword serves in C#: it defines the scope of a variable, in this case your db object.
Now, if you debug the code that Visual Studio generated for you, you will notice that a Dispose method is invoked every time an object of a controller class has finished its work, in other words whenever an action method has been called on the corresponding controller.
The DbContext instance is always disposed because of the following:
As you load more objects and their references into memory, the memory consumption of the context may increase rapidly. This may cause performance issues.
If an exception causes the context to be in an unrecoverable state, the whole application may terminate.
The chances of running into concurrency-related issues increase as the gap between the time when the data is queried and updated grows.
For more info - Reference
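For reference, the scaffolding typically adds a Dispose override like this to the controller, which is what ties the context's lifetime to the request (a sketch using the question's DbConnection class name):

public class StoreManagerController : Controller
{
    private DbConnection db = new DbConnection();

    // ... action methods ...

    // MVC calls Dispose on the controller when the request is finished,
    // so the context is released after every action.
    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            db.Dispose();
        }
        base.Dispose(disposing);
    }
}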
This might depend on the usability of your app; whether or not you need a persistent connection, and the cost of creating one (and a myriad of other factors).
But for starters, you should always dispose of the connection (as in the first pattern, not the one suggested by Visual Studio) and then move to other patterns based on new requirements or performance-related issues.
The biggest issue I see with the Visual Studio-suggested option is that you have no way of controlling the lifetime of the DbConnection object, and you are leaving it up to the garbage collector to eventually dispose of it. This could leave the connection resources in use for a rather indeterminate period of time.

Transaction Deadlock and DBContext

Hi, I am using the Entity Framework 4.1 code-first approach. I have a class MyContainer that inherits from DbContext.
I have a process that has 7 steps, and each step accesses many repository methods (about 60). This process executes automatically or manually depending on business logic and user requirements. Now, from a performance point of view, for the automatic process I create the context, i.e. the object of MyContainer, once, pass it to all methods, and dispose of it at the end of the process; this works fine and improved the performance. But when the process is executed manually, the same methods are executed and the container is created and disposed in the method itself, e.g. as below (it is just rough code).
public bool UpdateProcess(Process process, MyContainer container = null)
{
    bool disposeContainer = false;
    if (container == null)
    {
        container = new MyContainer();
        disposeContainer = true;
    }
    var result = SaveProcess(container, process);
    container.SaveChanges();
    if (disposeContainer)
        container.Dispose();
    return result;
}
For the automatic process the transaction is created at the beginning of the process and ended at the end of the process; for the manual process the transaction is created in the BLL, in the method that is called according to the user's action on the UI. Now suppose my automatic process is running and a user simultaneously does some action on the UI: I get the exception "Transaction (Process ID 65) Was Deadlocked On Lock Resources With Another Process And Has Been Chosen" when UpdateProcess() is called from both the manual and the automatic process together; I get it on container.SaveChanges().
Any help will be highly appreciated.
If I create a transaction scope in this repository method like this:
public bool UpdateProcess(Process process, MyContainer container = null)
{
    bool disposeContainer = false;
    if (container == null)
    {
        container = new MyContainer();
        disposeContainer = true;
    }
    bool result;
    using (var trans = new TransactionScope(TransactionScopeOption.RequiresNew))
    {
        result = SaveProcess(container, process);
        container.SaveChanges();
        trans.Complete();
    }
    if (disposeContainer)
        container.Dispose();
    return result;
}
It works fine.
However, I don't want to create the transaction in the repository, as the transaction has already been created in the BLL.
Any help will be appreciated.
The issue you are having (SQL deadlocking) is common to most non-trivial client/database systems and can be very difficult to resolve.
The primary rule when designing code where deadlocks may occur is to assume that they will occur and to design the application to handle them appropriately. This is normally handled by resubmitting the transaction, as documented by Microsoft here:
Although deadlocks can be minimized, they cannot be completely avoided. That is why the front-end application should be designed to handle deadlocks.
In order to minimise the deadlocks you are seeing, the normal approach I take is as follows:
Run Profiler with the deadlock graph event enabled to capture all deadlocks
Export all deadlocks to text and investigate why they are occurring
Check whether they can be minimised by using table hints and/or reducing the transaction isolation level
See whether they can be minimised by other means, e.g. changing the order of operations
Finally, after all this has been completed, ensure that you are trapping for deadlocks any time you communicate with the database and resubmitting the transaction/query, etc.; a minimal retry sketch follows.
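A minimal retry wrapper for that last point, assuming plain ADO.NET and work that is safe to re-run (1205 is SQL Server's deadlock-victim error number):

using System;
using System.Data.SqlClient;

public static class DeadlockRetry
{
    // Run a piece of database work, resubmitting it if SQL Server chooses it
    // as the deadlock victim (error 1205).
    public static void Execute(Action work, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                work();
                return;
            }
            catch (SqlException ex)
            {
                if (ex.Number != 1205 || attempt >= maxAttempts)
                    throw;
                // Deadlock victim: back off briefly, then resubmit the transaction/query.
                System.Threading.Thread.Sleep(100 * attempt);
            }
        }
    }
}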

Correct use of connections with C# and MySQL

A typical example for C#/MySQL interaction involves code like this (I'm skipping try/catches and error checking for simplicity's sake):
conn = new MySqlConnection(cs);
conn.Open();
string stm = "SELECT * FROM Authors";
MySqlCommand cmd = new MySqlCommand(stm, conn);
rdr = cmd.ExecuteReader();
The scenario is a class library for both a standalone application and a WCF web-service. So, should I open a connection whenever I make a query? Or should I open it once when I open the program?
To expand on HackedByChinese's recommendation, consider the following: you have one main coordinating method that handles creating the connection, opening it, starting the transaction, and then calling the worker methods that do the different kinds of work (queries).
public static void UpdateMyObject(string connection, object myobject)
{
    try
    {
        using (SqlConnection con = new SqlConnection(connection))
        {
            con.Open();
            using (SqlTransaction trans = con.BeginTransaction())
            {
                // Each worker must attach its commands to this transaction,
                // otherwise SqlClient throws when the command executes.
                WorkingMethod1(con, trans, myobject);
                WorkingMethod2(con, trans, myobject);
                WorkingMethod3(con, trans, myobject);
                trans.Commit();
            }
            con.Close();
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show(string.Format("SOMETHING BAD HAPPENED!!!!!!! {0}", ex.Message));
    }
}

private static void WorkingMethod1(SqlConnection con, SqlTransaction trans, object myobject)
{
    // Do something here against the database
}

private static void WorkingMethod2(SqlConnection con, SqlTransaction trans, object myobject)
{
    // Do something here against the database
}

private static void WorkingMethod3(SqlConnection con, SqlTransaction trans, object myobject)
{
    // Do something here against the database
}
You typically want to open one connection per unit of work. The MySQL ADO.NET driver can pool connections for you; however, I would recommend against having each query open its own connection. What can happen is that you start using several methods to service one business transaction, and since each opens a connection, you can end up with one business transaction starving your available connections. This, of course, can lead to timeouts, poor performance, etc.
Instead, consider defining an IUnitOfWork/UnitOfWork API that creates and opens a connection and transaction, and having your data access layer request the current IUnitOfWork, which will provide the current connection and transaction.
The trick, of course, is knowing when a unit of work begins and when it ends. A unit of work is associated with a complete, meaningful action (a "business transaction"); for example, servicing a request to a method on one of your WCF services. When the service instance starts in response to a request, a unit of work should be created. Then your DAL components can ask for the current unit of work and use it. When the request is completed, the unit of work should be committed (which should commit the transaction and close the connection).
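A bare-bones sketch of such an API (the interface and class names here are illustrative, not a specific library):

using System;
using System.Data.SqlClient;

public interface IUnitOfWork : IDisposable
{
    SqlConnection Connection { get; }
    SqlTransaction Transaction { get; }
    void Commit();
}

public sealed class UnitOfWork : IUnitOfWork
{
    public SqlConnection Connection { get; private set; }
    public SqlTransaction Transaction { get; private set; }

    // Opens the connection and starts the transaction for one business transaction.
    public UnitOfWork(string connectionString)
    {
        Connection = new SqlConnection(connectionString);
        Connection.Open();
        Transaction = Connection.BeginTransaction();
    }

    public void Commit()
    {
        Transaction.Commit();
    }

    // Disposing rolls back any uncommitted work and returns the connection to the pool.
    public void Dispose()
    {
        Transaction.Dispose();
        Connection.Dispose();
    }
}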

Access to SQL DB in multithread server app

In my server application I want to use a DB (SQL Server), but I am quite unsure of the best method. Clients' requests come into a thread pool, so their processing is asynchronous. Every request usually needs to read from or write to the DB, so I was thinking about a static method which would create a connection, execute the query, and return the result. I'm only afraid that opening and closing the connection might be too slow, and that some connection limit could be reached. Is this a good approach?
IMHO the best approach is to rely on the ADO.NET connection pooling mechanism and not try to handle database connections manually. Write your data access methods like this:
public void SomeMethod()
{
    using (var connection = new SqlConnection(connectionString))
    using (var command = connection.CreateCommand())
    {
        connection.Open();
        command.CommandText = "SELECT Field1 FROM Table1";
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // do something with the results
            }
        }
    }
}
Then you can call this method from wherever you like, make it static, call it from threads whatever. Remember that calling Dispose on the connection won't actually close it. It will return it to the connection pool so that it can be reused.
Surprised that no one mentioned connection pooling. If you think you are going to have a large number of requests, why not just set up a pool with a min pool size of, say, 25 (arbitrary number here, do not shoot) and a max pool size of, say, 200?
This will decrease the number of connection attempts and make sure that, as long as you are not leaking connection handles (something you should take explicit care to avoid), you will always have a connection waiting for you.
Reference article on connection pooling: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
Another side note: why the need to have the connection string in the code? Set it in the web.config or app.config for the sake of maintainability. I had to "fix" code that did such things, and I always swore copiously at the programmer responsible.
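For illustration, a pooled connection string kept in app.config/web.config and read through ConfigurationManager (the server, database, and pool sizes are placeholders):

using System.Configuration;
using System.Data.SqlClient;

// app.config / web.config entry (placeholder values):
//   <connectionStrings>
//     <add name="MainDb"
//          connectionString="Data Source=MYSERVER;Initial Catalog=MyDb;Integrated Security=True;Min Pool Size=25;Max Pool Size=200" />
//   </connectionStrings>

public static class Db
{
    // Pull the string from configuration instead of hard-coding it.
    public static SqlConnection OpenConnection()
    {
        string cs = ConfigurationManager.ConnectionStrings["MainDb"].ConnectionString;
        var conn = new SqlConnection(cs);
        conn.Open();
        return conn;
    }
}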
I have had exactly the same problem as you: a huge app that I started making multithreaded. The benefit over having one connection open and reused is that you can ask the DB for data multiple times, since a new connection is spawned per request (no need to wait for other threads to finish getting data); and if, for example, you lose the connection to SQL (which can happen when the network goes down for a second or two), you would have to check whether the connection is open before submitting a query anyway.
This is my code for getting database row counts in MS SQL, but other operations should be done exactly the same way. Keep in mind that sqlConnectOneTime(string varSqlConnectionDetails) has a flaw: it returns null when no connection can be established, so it needs some modification for your needs, or the query will fail when SQL cannot establish a connection. You just need to add proper handling there :-) Hope it will be useful for you :-)
public const string sqlDataConnectionDetails = "Data Source=SQLSERVER\\SQLEXPRESS;Initial Cata....";

public static string sqlGetDatabaseRows(string varDefinedConnection) {
    string varRows = "";
    const string preparedCommand = @"
        SELECT SUM(row_count) AS 'Rows'
        FROM sys.dm_db_partition_stats
        WHERE index_id IN (0,1)
        AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0;";

    using (var varConnection = Locale.sqlConnectOneTime(varDefinedConnection))
    using (var sqlQuery = new SqlCommand(preparedCommand, varConnection))
    using (var sqlQueryResult = sqlQuery.ExecuteReader())
        while (sqlQueryResult.Read()) {
            varRows = sqlQueryResult["Rows"].ToString();
        }
    return varRows;
}

public static SqlConnection sqlConnectOneTime(string varSqlConnectionDetails) {
    SqlConnection sqlConnection = new SqlConnection(varSqlConnectionDetails);
    try {
        sqlConnection.Open();
    } catch (Exception e) {
        MessageBox.Show("Error connecting to the SQL server." + Environment.NewLine + Environment.NewLine + "Error: " + Environment.NewLine + e, "Connection error");
    }
    if (sqlConnection.State == ConnectionState.Open) {
        return sqlConnection;
    }
    return null;
}
Summary:
Define one global variable with the connection details of your SQL Server
One global method to make the connection (you need to handle the null in there)
Use using to dispose of the connection, the SQL query, and everything else when the read/write/update method is done
The one thing that you haven't told us, that would be useful for giving you an answer that's appropriate for you is what level of load you're expecting your server application to be under.
For pretty much any answer to the above question, though, the answer would be that you shouldn't worry about it. ADO.NET/SQL Server provides connection pooling, which removes some of the overhead of creating connections from each "var c = new SqlConnection(connectionString)" call.
