Transactions with multiple connections (MySQL, C#)

I'd like to ask a question. I've been trying to find some information regarding transactions with multiple connections, but I haven't been able to find any good source of information.
Now for what I'm trying to do. I have code that looks like this:
using (var Connection1 = m_Db.CreateConnection())
using (var Connection2 = m_Db.CreateConnection())
{
    Connection1.DoRead(..., (IDataReader Reader) =>
    {
        // Do stuff
        Connection2.DoWrite(...);
        Connection2.DoRead(..., (IDataReader InnerReader) =>
        {
            // Do more stuff
            using (var Connection3 = m_Db.CreateConnection())
            {
                Connection3.DoWrite(...);
                Connection3.Commit(); // Is this even right?
            }
        });
    });
    Connection1.DoRead(..., (IDataReader Reader) =>
    {
        // Do yet more stuff
    });
    Connection1.Commit();
    Connection2.Commit();
}
Each CreateConnection creates a new transaction using MySqlConnection::BeginTransaction. The CreateConnection method creates a Connection object which wraps a MySqlConnection. The DoRead function executes some SQL, and disposes the IDataReader when done.
Every Connection will do a Rollback when disposed.
Now for some notes:
I have ONE server with multiple databases.
I am running MySql server with InnoDB databases.
I am doing both reads and writes to these databases.
For performance reasons and not to mess up the database, I am using transactions.
The code is (at least, for now) entirely serial. There are NO concurrent threads. All inserts and queries are done in serial fashion.
I use multiple connections to the database because a read or write is not allowed while another read is in progress (basically the reader object has not yet been disposed).
I basically want every connection to see all changes. So for example, after Connection 3 does some writes, Connection 1 should see those. But the data should be in the transaction and not written to the database (yet).
Now, as for my questions:
Does this work? Will everything be committed only once the last Commit function is called? Should I use another approach?
Is this right? Is my approach completely and utterly wrong and silly?
Any drawbacks? Especially regarding performance.
Thanks.

Welp, it seems no one knows. But that's okay.
For now, I went with using one connection and reading all the results into a List, then closing the reader, thereby avoiding the problem of having to use multiple connections.
Might there be performance problems? Maybe, but it's better than having to deal with uncertainty and deadlocks.
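A minimal sketch of that buffering approach, under some assumptions: the row type (object[]) and the names ReaderBuffer, ReadAll and SampleReader are made up for illustration, and an in-memory DataTable stands in for a real MySQL query so the snippet runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderBuffer
{
    // Copies every remaining row of the reader into memory, then disposes the
    // reader so the connection is free to run the next command.
    public static List<object[]> ReadAll(IDataReader reader)
    {
        var rows = new List<object[]>();
        using (reader)
        {
            while (reader.Read())
            {
                var values = new object[reader.FieldCount];
                reader.GetValues(values);
                rows.Add(values);
            }
        }
        return rows;
    }

    // Stand-in for a real query result: an in-memory table exposed as an IDataReader.
    public static IDataReader SampleReader()
    {
        var table = new DataTable();
        table.Columns.Add("id", typeof(int));
        table.Columns.Add("name", typeof(string));
        table.Rows.Add(1, "foo");
        table.Rows.Add(2, "bar");
        return table.CreateDataReader();
    }
}
```

With a real connection you would pass the result of command.ExecuteReader() to ReadAll and then loop over the returned list, issuing writes on the same connection and transaction inside that loop. The cost is memory: the whole result set is held at once.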

Related

Shared transaction across multiple connections, or ReadUncommitted in PostgreSQL

I want to open several connections within a single transaction scope, so that each connection could see the changes done by the previous ones.
I need this for tests - real code writes to the database, and testing code verifies the data was actually inserted/updated. In the end I rollback transaction scope so that the real database is not affected.
This approach works fine in SQL Server, but doesn't seem to work in PostgreSQL (I use 9.3 with Npgsql provider), below is a small example.
Here's the helper to run an arbitrary query within a transaction scope:
private void RunQuery(string query, Action<IDataReader> process)
{
    using (var connection = new NpgsqlConnection(Config.ConnectionString)) {
        connection.Open();
        connection.EnlistTransaction(Transaction.Current);
        using (var command = connection.CreateCommand()) {
            command.CommandText = query;
            using (var reader = command.ExecuteReader()) {
                while (reader.Read()) {
                    process(reader);
                }
            }
        }
    }
}
..and here's the test code - it inserts into users table and then checks whether the user was actually inserted:
using (var scope = new TransactionScope()) {
    //"tested scenario"
    int id = 0;
    RunQuery("INSERT INTO users (name) VALUES ('foo') RETURNING id;", reader => {
        id = (int)reader.GetValue(0);
    });
    //checking
    int id2 = 0;
    RunQuery("SELECT id, name FROM users WHERE id=" + id, reader => {
        id2 = (int)reader.GetValue(0);
    });
    Assert.That(id2, Is.Not.EqualTo(0));
}
The test above fails on Postgres as id2 is always zero. I tried the TransactionScope constructor with TransactionOptions.ReadUncommitted but it doesn't seem to help. Note that if I run this against SQL Server (change NpgsqlConnection to SqlConnection, use SCOPE_IDENTITY to retrieve the id) then everything works just fine and id2 is not zero.
As you may expect, selects within the same connection work for Postgres, but I don't need that, my goal is to use multiple connections on a shared transaction scope. I also don't need multithreading, those connections happen sequentially.
First a disclaimer: while I know a bit about PostgreSQL, I know very little about .NET.
I suspect you are conflating two related but separate concepts - that of Distributed Transactions and the level of transaction isolation that exists.
According to the .NET Documentation, EnlistTransaction adds the connection into a distributed transaction. A distributed transaction is described as follows
A distributed transaction is a transaction that affects several
resources. For a distributed transaction to commit, all participants
must guarantee that any change to data will be permanent. Changes must
persist despite system crashes or other unforeseen events. If even a
single participant fails to make this guarantee, the entire
transaction fails, and any changes to data within the scope of the
transaction are rolled back.
In a database, such transactions are implemented by a two-phase commit process amongst what are actually separate transactions in the database. All of the participating transactions are progressed to the end of the first phase by executing PREPARE TRANSACTION. Once they are all in this state, then they can be fully committed by executing COMMIT PREPARED. If any of them fails during PREPARE TRANSACTION, then they can all be rolled back by ROLLBACK PREPARED. This guarantees that either they are all committed, or they are all rolled back.
When using middleware such as that provided by .NET, you do not see any of these details: the framework handles the two-phase commit for you.
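To make the two phases concrete, here is a toy in-memory sketch. It is not how System.Transactions or PostgreSQL implement this; IParticipant, InMemoryParticipant and Coordinator are hypothetical names, and each participant stands in for one of the separate database transactions described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical participant interface; not part of System.Transactions.
public interface IParticipant
{
    bool Prepare();   // phase 1: promise that a later commit cannot fail
    void Commit();    // phase 2, success path
    void Rollback();  // phase 2, failure path
}

public sealed class InMemoryParticipant : IParticipant
{
    private readonly bool _canPrepare;
    public string State { get; private set; } = "active";

    public InMemoryParticipant(bool canPrepare) { _canPrepare = canPrepare; }

    public bool Prepare()
    {
        State = _canPrepare ? "prepared" : "failed";
        return _canPrepare;
    }

    public void Commit() { State = "committed"; }
    public void Rollback() { State = "rolled back"; }
}

public static class Coordinator
{
    // Returns true if every participant committed, false if all were rolled back.
    public static bool Run(IReadOnlyList<IParticipant> participants)
    {
        // Phase 1: ask everyone to prepare; stop at the first refusal.
        foreach (var p in participants)
        {
            if (!p.Prepare())
            {
                // One refusal aborts the whole distributed transaction.
                foreach (var q in participants) q.Rollback();
                return false;
            }
        }

        // Phase 2: everyone promised, so commit them all.
        foreach (var p in participants) p.Commit();
        return true;
    }
}
```

Note that nothing in this protocol makes one participant's uncommitted changes visible to another; it only coordinates the final outcome.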
So, you might be wondering what this has to do with the fact that you are not seeing changes made in one part of this distributed transaction reflected in another. The answer is probably nothing. The two transactions are actually completely separate - in fact it is possible for them to be on completely separate databases.
What you are trying to achieve - to be able to see changes made in one transaction from another prior to commit - is related to the level of transaction isolation.
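The bad news for you is that it sounds like the isolation level you would like to have is 'read uncommitted', which PostgreSQL does not actually implement: it accepts the setting but treats it as 'read committed', so one transaction never sees another's uncommitted changes.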
Maybe you need to describe what you are trying to achieve, at a higher level - it is likely there is another (maybe better) way to achieve it.

I needed to do Bulk insert with dapper rainbow

I am using the dapper rainbow database.cs extensions,
private void insertList(IEnumerable<myObject> list)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        var db = myDB.Init(conn, commandTimeout: 100);
        db.myTable.tableName = "ds.myTable";
        Parallel.ForEach(list, a => db.myTable.Insert(a));
        db.Dispose();
    }
}
This won't work; I think I need to open and close the connection inside the Parallel.ForEach. Is that the right way to do it?
I wanted to use this extension because it's very helpful and handy, but I have this problem with inserting a list. I could not find anything online about using this extension with a list.
Generally database connections are NOT thread safe, so doing inserts in parallel on the same connection like that is bound to cause trouble.
So I would say that yes, you would need to open and close the connection inside the Parallel.ForEach(). You might want to benchmark that as well. I'm not entirely convinced that doing inserts in parallel like that with multiple database connections would be any faster than doing them in a regular loop on a single connection.
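One possible variation, sketched under assumptions: instead of opening a connection per item, the Parallel.ForEach overload with thread-local state can give each worker its own connection. FakeConnection is a hypothetical stand-in that only counts calls, so the snippet runs without a database; with Dapper Rainbow you would create and dispose the real connection and database object in the same places.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a real connection such as SqlConnection.
public sealed class FakeConnection : IDisposable
{
    public static int Opened;    // connections created so far
    public static int Inserted;  // rows "inserted" across all connections

    public FakeConnection() { Interlocked.Increment(ref Opened); }
    public void Insert(int item) { Interlocked.Increment(ref Inserted); }
    public void Dispose() { }
}

public static class ParallelInsert
{
    public static void InsertAll(IEnumerable<int> items)
    {
        Parallel.ForEach<int, FakeConnection>(
            items,
            // localInit: runs once per worker task, so connections are never shared.
            () => new FakeConnection(),
            (item, loopState, conn) =>
            {
                conn.Insert(item);
                return conn; // the same connection is passed to this worker's next item
            },
            // localFinally: runs once when the worker task finishes.
            conn => conn.Dispose());
    }
}
```

Whether this beats a plain loop on a single connection is still something to benchmark, as the answer says.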

How to minimize concurrent database connections?

My hosting company blocked my website for using more than 15 concurrent database connections. In my code I close each and every connection that I open, but they still say there are too many concurrent connections, and they suggested that I change the source code of my website. What is the solution to this? My website is dynamic; would making it static, old-style plain HTML make a difference?
Also note that, when I could not think of any other solution, I tried adding con.Close() before every con.Open(), so that any other open connection would be closed.
The first thing to do is to check when you open connections - see if you can minimise that. For example, are you doing "n+1" on different connections?
If you have a single server, the technical solution here is a semaphore - for example, something like:
someSemaphore.WaitOne();
try {
    using(var conn = GetConnection()) {
        ...
    }
} finally {
    someSemaphore.Release();
}
which will (assuming someSemaphore is shared, for example static) ensure that you can only get into that block "n" times at once. In your case, you would create the semaphore with 15 spaces:
static readonly Semaphore someSemaphore = new Semaphore(15,15);
However! Caution is recommended: in some cases you could get a deadlock. Imagine two poorly written threads that each need 9 connections: thread A takes 7 and thread B takes 8. They both need more, and neither will ever get them. Thus, using WaitOne with a timeout is important:
static void TakeConnection() {
    if(!someSemaphore.WaitOne(3000)) {
        throw new TimeoutException("Unable to reserve connection");
    }
}
static void ReleaseConnection() {
    someSemaphore.Release();
}
...
TakeConnection();
try {
    using(var conn = GetConnection()) {
        ...
    }
} finally {
    ReleaseConnection();
}
It would also be possible to wrap that up in IDisposable to make usage more convenient.
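A rough sketch of such a wrapper follows. ConnectionSlot, Take and Available are hypothetical names, and it uses SemaphoreSlim rather than Semaphore purely so that the number of free slots can be read back; the limit of 15 and the 3000 ms timeout come from the snippets above.

```csharp
using System;
using System.Threading;

// Hypothetical helper: holds one semaphore slot until it is disposed.
public sealed class ConnectionSlot : IDisposable
{
    private static readonly SemaphoreSlim Slots = new SemaphoreSlim(15, 15);
    private int _released; // 0 = still holding the slot, 1 = already released

    private ConnectionSlot() { }

    // Number of slots currently free.
    public static int Available => Slots.CurrentCount;

    public static ConnectionSlot Take(int timeoutMs = 3000)
    {
        if (!Slots.Wait(timeoutMs))
            throw new TimeoutException("Unable to reserve connection");
        return new ConnectionSlot();
    }

    public void Dispose()
    {
        // Release exactly once, even if Dispose is called repeatedly.
        if (Interlocked.Exchange(ref _released, 1) == 0)
            Slots.Release();
    }
}
```

Usage then collapses to two stacked using statements: using (ConnectionSlot.Take()) followed by using (var conn = GetConnection()), with the slot released automatically after the connection is disposed.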
Change hosting company.
Seriously.
Unless you run a pathetic little home blog.
You can easily have more than 15 pages/requests being handled at the same time. I am always wary of runaway connections, but I would not consider 15 connections to even be worth mentioning. This is like a car rental company complaining that you drove more than 15 km: it is simply a REALLY low limit.
On a busy website you can have 50, 100, even 200 open connections just because you have that many requests at the same time.
This is not so obvious, but even if you take care to open and close your connections properly, there is one particular thing to look at.
If you make even the smallest change to the text of a connection string, .NET will use a separate connection pool and open a whole new connection instead of reusing one that is already open (even if the connection uses MARS). So, just in case, check whether your code builds connection strings on the fly instead of using a single one from your web config.
I believe SQL connections are pooled. When you close one, you actually just return it to the connection pool.
You can use SqlConnection.ClearPool(connection) or SqlConnection.ClearAllPools to actually close the connection, but it will affect the performance of your site.
Also, you can disable pooling with the connection string parameter Pooling=false.
There is also Max Pool Size (default 100); you may want to set it to a lower number.
This all might work, but I would also suggest that you switch providers.
If you only fetch data from the database, it is not very difficult to create some sort of cache. But if there is full CRUD, the better solution is to change hosting provider.

LINQ-To-SQL NOLOCK (NOT ReadUncommitted)

I've been searching for some time now in here and other places and can't find a good answer to why Linq-TO-SQL with NOLOCK is not possible..
Every time I search for how to apply the with(NOLOCK) hint to a Linq-To-SQL context (applied to one SQL statement), people answer that I should force a transaction (TransactionScope) with IsolationLevel set to ReadUncommitted. What they rarely mention is that this causes the connection to open a transaction (which, as I've read somewhere, must then be closed manually).
Using ReadUncommitted in my application as is, is really not that good. Right now I've got using context statements for the same connection within each other. Like:
using( var ctx1 = new Context()) {
    ... some code here ...
    using( var ctx2 = new Context()) {
        ... some code here ...
        using( var ctx3 = new Context()) {
            ... some code here ...
        }
        ... some code here ...
    }
    ... some code here ...
}
With a total execution time of 1 sec and many users at the same time, changing the isolation level will cause the contexts to wait for each other to release a connection, because all the connections in the connection pool are being used.
So one (of many reasons) for changing to "nolock" is to avoid deadlocks (right now we have 1 customer deadlock per day). The consequence of above is just another kind of deadlock and really doesn't solve my issue.
So what I know I could do is:
Avoid nested usage of same connection
Increase the connection pool size at the server
But my problem is:
This is not possible within near future because of many lines of code re-factoring and it will conflict with the architecture (without even starting to comment whether this is good or bad)
Even though this of course will work, this is what I would call "symptomatic treatment" - as I don't know how much the application will grow and if this is a reliable solution for the future (and then I might end up with an even worse situation with a lot more users being affected)
My thoughts are:
1: Can it really be true that NoLock is not possible (for each statement without starting transactions)?
2: If 1 is true - can it really be true that no one else has had this problem and solved it in a generic LINQ-to-SQL modification?
3: If 2 is true - why is this not an issue for others?
4: Is there another workaround I haven't looked at, maybe?
5: Is using the same connection (nested) many times such bad practice that no one has this issue?
1: LINQ-to-SQL does indeed not allow you to indicate hints like NOLOCK; it is possible to write your own TSQL, though, and use ExecuteQuery<T> etc
2: to solve in an elegant way would be pretty complicated, frankly; and there's a strong chance that you would be using it inappropriately. For example, in the "deadlock" scenario, I would wager that actually it is UPDLOCK that you should be using (during the first read), to ensure that the first read takes a write lock; this prevents a second later query getting a read lock, so you generally get blocking instead of deadlock
3: using the connection isn't necessarily a big problem (although note that new Context() won't generally share a connection; to share a connection you would use new Context(connection)). If seeing this issue, there are three likely solutions (if we exclude "use an ORM with hint support"):
using an explicit transaction (which doesn't have to be TransactionScope - it can be a connection level transaction) to specify the isolation level
write your own TSQL with hints
use a connection-level isolation level (noting the caveat I added as a comment)
IIRC there is also a way to subclass the data-context and override some of the transaction-creation code to control the isolation-level for the transactions that it creates internally.

c# how to implement Data Base which is queried by threads

Hello there. I have a situation where the entities are customerManager, warehouse, customer and suppliers.
My goals are:
The warehouse is a singleton and opens the DB at runtime.
The customerManager manages customers as threads, which query the warehouse and update it (after buying stuff).
When one of the items in the warehouse runs out, we ask the supplier, in a different thread, to supply it for us. While the supplier does his thing (let's assume it takes something like 5 seconds), the customer waits (in a queue) and is woken when the supplier method returns true (let's assume it always returns true).
So my questions are about 3 things:
Design - should the customerManager hold the warehouse and the customers inside it? It seems like the best solution; does someone recommend otherwise? (C# design topic)
How many threads can go to the DB at once? Can the DB handle that by itself so that I won't need to do it myself? Should I hold SqlCommand(s) for them? Should I use a DataSet or a DataReader? In other words, can someone advise me how to do it?
Should I do this for 10 threads:
for (int i = 0 ; i < 10 ; i++)
{
    SqlConnection sqlConnection = new SqlConnection(r_ConnectionString);
    sqlConnection.Open();
    sqlConnection.Close();
}
...so that the connection pool would be open for 10 connections?
(database / ADO.NET topic)
How should the threads wait in a queue (in order to wait for the supplier method to wake them)? How do I wake them? Is there a good solution in C# for that? (C# threads topic)
I think the question is too long, but otherwise it would be too out of context, so I would appreciate it if you would write in the title which question you are answering.
Thank you.
Your worker threads could be fed work via BlockingCollection or ConcurrentQueue.
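As a rough illustration of the BlockingCollection option (WorkQueueDemo and its parameters are made-up names, and the "work" is just recording the item), consumers block while the queue is empty and wake up as soon as a producer adds something:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class WorkQueueDemo
{
    // Feeds itemCount work items to workerCount workers and returns how many were processed.
    public static int Run(int itemCount, int workerCount)
    {
        var processed = new ConcurrentBag<int>();

        using (var queue = new BlockingCollection<int>(boundedCapacity: 100))
        {
            var workers = new Task[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = Task.Run(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding has been called and the queue is drained.
                    foreach (var item in queue.GetConsumingEnumerable())
                        processed.Add(item);
                });
            }

            // Producer side: Add blocks if the bounded queue is full.
            for (int item = 0; item < itemCount; item++)
                queue.Add(item);
            queue.CompleteAdding();

            Task.WaitAll(workers);
        }

        return processed.Count;
    }
}
```

In the warehouse scenario, the waiting customers would play the consumer role and the supplier thread would be the producer that adds restocked items.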
For connection management you are better off doing this:
using (SqlConnection conn = new SqlConnection(...))
{
}
since this ensures Dispose() gets called for you. As noted in other feedback, you can do this without worrying about actual conn count to the DB since ADO.Net manages a pool of physical connections behind the scenes.
Nobody can tell you whether DataSet or DataReader works best, it depends on your usage of the data once it's loaded. DataReader provides a sequential read of each record in turn, while DataSet provides an in-memory cache of underlying DB data and in that sense is a 'higher-level' abstraction.
SqlConnections are implemented in .NET using a connection pool. You do not need to worry about the management of the connections themselves. The only requirement on you is that after you open them you call Close(). .NET will manage the rest for you in an efficient way.
If you want to run multiple queries simultaneously, you can call the SqlCommand asynchronously with its Begin.../End... methods (for example BeginExecuteReader and EndExecuteReader).
By using both of these you can work at a level that does not require you to manage the threads while still getting multi-threaded behaviour.
However, you should read up on ADO.NET, because a lot of what you are talking about is unnecessary when you know how it works.
As for DataSet or DataReader, that depends on your problem. DataSet is a very heavy object, though; DataReaders are lightweight and fast and allow you to populate a collection fairly easily.
I prefer using linq2sql or Entity Framework, though. ADO.NET is kind of fragile because you have to do a lot of casting on data and manual mapping onto objects, which is prone to errors at run time rather than compile time.
