Queries from background worker conflict? - c#

Is there a problem if I execute queries from multiple threads using the same ConnectionString? What happens if two or more threads try to send data at the same time?
string globalConnectionString = @"some_stringHere!";
// create a new BackgroundWorker whenever a new log file (txt file) is created
// ....
private void backgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
    // get some data from the created log file
    string serialNumber = getSerialNumber(logFile);
    string testResult = getTestResult(logFile);
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            SqlCommand someCommand = new SqlCommand("some insert/update command here!", connection);
            connection.Open();
            someCommand.ExecuteNonQuery();
            connection.Close();
        }
    }
}

Concurrent connections are OK, if used correctly
There's no problem with using multiple connections concurrently, assuming it's done for the right reason. Databases can handle thousands of concurrent client connections.
Executing the same slow query in parallel to make it finish faster will probably make it even slower as each connection may block the others. Many databases parallelize query processing already, producing far better results than crude client-side parallelism.
If you want to make a slow query go faster, you'd get better results by investigating why it's slow and fixing the performance issues. For example, if you want to insert 10K rows, it's faster to load them with e.g. SqlBulkCopy or BULK INSERT than to execute 10K INSERTs that will end up blocking each other for access to the same table and even the same data pages.
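As a rough sketch of the bulk-load approach mentioned above, the following uses SqlBulkCopy to send many rows in one operation. The table name dbo.TestResults, its column names and the BulkLoader class are assumptions made for illustration; they do not come from the question.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

static class BulkLoader
{
    // Hypothetical destination table and columns; adjust to the real schema.
    public static async Task InsertTestResultsAsync(
        string connectionString,
        IEnumerable<(string Serial, string Result)> rows)
    {
        // Stage the rows in memory so they can be sent in a single bulk operation.
        var table = new DataTable();
        table.Columns.Add("SerialNumber", typeof(string));
        table.Columns.Add("TestResult", typeof(string));
        foreach (var (serial, result) in rows)
            table.Rows.Add(serial, result);

        using (var connection = new SqlConnection(connectionString))
        {
            await connection.OpenAsync();
            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "dbo.TestResults";
                bulk.ColumnMappings.Add("SerialNumber", "SerialNumber");
                bulk.ColumnMappings.Add("TestResult", "TestResult");
                await bulk.WriteToServerAsync(table);
            }
        }
    }
}
```

For a handful of rows per log file this is unnecessary; it pays off when thousands of rows are loaded at once.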
You can use the same connection to execute asynchronous queries (e.g. with ExecuteNonQueryAsync(), ExecuteReaderAsync() etc.), provided they execute one after the other. You can't execute multiple concurrent queries on the same connection, at least not without jumping through some hoops.
The real problem
The real problem is using a BackgroundWorker in the first place. That class has been obsolete since 2012, when async/await was introduced. With BGW it's extremely hard to combine multiple asynchronous operations. With async/await, progress reporting is available through the Progress<T> class and cooperative cancellation through CancellationTokenSource. Check Async in 4.5: Enabling Progress and Cancellation in Async APIs for a detailed explanation.
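To make the progress and cancellation mechanism concrete, here is a minimal sketch. LogProcessor, ProcessAsync and the Task.Delay standing in for real work are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LogProcessor
{
    // Processes 'count' hypothetical work items, reporting progress after each one
    // and stopping early if cancellation is requested.
    public static async Task<int> ProcessAsync(
        int count, IProgress<int> progress, CancellationToken token)
    {
        int done = 0;
        for (int i = 0; i < count; i++)
        {
            token.ThrowIfCancellationRequested();
            await Task.Delay(10, token);   // stands in for real work
            done++;
            progress?.Report(done);
        }
        return done;
    }
}

static class Demo
{
    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            var progress = new Progress<int>(n => Console.WriteLine($"{n} items processed"));
            int total = await LogProcessor.ProcessAsync(5, progress, cts.Token);
            Console.WriteLine($"Finished {total} items");
        }
    }
}
```

In a UI application, a Progress<T> created on the UI thread runs its callback on that thread, which covers what BackgroundWorker's ProgressChanged event was used for. Calling cts.Cancel() makes the loop throw an OperationCanceledException at the next check.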
You can replace the BGW calls in your code with just await command.ExecuteNonQueryAsync(). You could create an asynchronous method to insert the data into the database:
private async Task InsertTestData(string serialNumber, string testResult)
{
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            var someCommand = new SqlCommand("some insert/update command here!", connection);
            someCommand.Parameters.Add("@serial", SqlDbType.NVarChar, 30).Value = serialNumber;
            ...
            await connection.OpenAsync();
            await someCommand.ExecuteNonQueryAsync();
        }
    }
}
If retrieving the serial number and test data is time-consuming, you can use Task.Run to run each of them in the background:
string serialNumber = await Task.Run(()=>getSerialNumber(logFile));
string testResult = await Task.Run(()=>getTestResult(logFile));
await InsertTestData(serialNumber,testResult);
You could also use a library like Dapper to simplify the database code:
private async Task InsertTestData(string serialNumber, string testResult)
{
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            await connection.ExecuteAsync("INSERT .... VALUES(@serial,@test)",
                new { serial = serialNumber, test = testResult });
        }
    }
}
Dapper will generate a parameterized query and match the parameters in the query with properties in the anonymous object by name.

Reading the connection string from multiple threads isn't an issue here. You would have a problem if you shared the SqlConnection object across multiple threads, but that's not the case in your code.

I believe this is a question about Isolation from ACID properties. Please have a look at them.
According to the SQL standard, a single SQL query operates on a steady (consistent) state of the tables it works on. By that definition, it cannot see any changes made while it is executing. However, as far as I know, not all DBMS products follow this rule perfectly. For example, there are products and/or isolation levels that allow dirty reads.
Here is a very detailed explanation from another user.

Related

MySqlConnection Threading Optimization

How can I optimize the functions that connect to the database so that, if many users access the database at the same time, the server does not crash or run into other problems?
Is it possible to use threading? If the database is slow to respond, can the main thread freeze or block other code?
public static void UpdatePassword(string email, string password)
{
    using (MySqlConnection connection = new MySqlConnection(""))
    {
        connection.Open();
        MySqlCommand command = connection.CreateCommand();
        string saltedPassword = PasswordDerivation.Derive(password);
        command.CommandText = "UPDATE users SET password=@password WHERE email=@email LIMIT 1";
        command.Parameters.AddWithValue("@email", email);
        command.Parameters.AddWithValue("@password", saltedPassword);
        command.ExecuteNonQuery();
        connection.Close();
    }
}
In most situations, a single "program" should use a single connection to the database. Having lots of "connections" incurs overhead, at least for creating the connections.
Async actions are rarely beneficial in database work since SQL is very good at working efficiently with millions of rows in a single query.
MySQL is very good at letting separate clients talk to the database at the same time. However, this needs "transactions" to keep the data from getting messed up.
If your goal in C# is to get some parallelism, please describe it further. We will either convince you that it won't be as beneficial as you think or help you rewrite the SQL to be more efficient and avoid the need for parallelism.

MSDTC getting invoked. But why?

For my data access I use TransactionScopes at the API level to wrap entire operations in a single transaction so that my SQL operations can be somewhat composable. I have a web project that hosts an API and a separate service library that contains the implementation and the calls to SQL. At the beginning of an Operation (an API entry point) I open the TransactionScope. Whenever a SqlConnection is needed while processing the Operation, I ask for the AmbientConnection instead of directly making a new connection. AmbientConnection finds or creates a SqlConnection for the current transaction. This is supposed to allow for good composability but also avoid invoking the MSDTC, because it should keep using the same connection for each suboperation within the transaction. When the transaction is completed (with scope.Complete()), the connection is automatically closed.
The problem is that every once in a while the MSDTC is still getting invoked and I cannot figure out why. I've used this approach successfully before and I believe I never got an MSDTC invoked. Two things seem different to me this time though: 1) I'm using SQL Server 2008 R2 (10.50.4000), not my choice, and I'm aware that the MSDTC behavior changed beginning with this version and perhaps not all the kinks were worked out until later versions. 2) The use of async-await is new, and I believe I have to use TransactionScopeAsyncFlowOption.Enabled to accommodate it in case some part of the implementation is async. Perhaps more measures are necessary.
I tried Pooling=false in the connection string in case it was MSDTC getting invoked because of two independent logical connections handled errantly under a single pooled connection. But that didn't work.
API Operation
// Exposed API composing multiple low-level operations within a single TransactionScope
// independent of any database platform specifics.
[HttpPost]
public async Task<IHttpActionResult> GetMeTheTwoThings()
{
using (var scope = new TransactionScope(TransactionScopeOption.Required, TransactionScopeAsyncFlowOption.Enabled))
{
var result = new TwoThings(
await serviceLayer.GetThingOne(),
await serviceLayer.GetThingTwo());
scope.Complete();
return Ok(result);
}
}
Service layer implementation
public async Task<ThingOne> GetThingOne()
{
using (var cmd = connManagement.AmbientConnection.CreateCommand())
{
cmd.CommandType = System.Data.CommandType.StoredProcedure;
cmd.CommandText = "dbo.GetThingOne";
return (ThingOne)(await cmd.ExecuteScalarAsync());
}
}
public async Task<ThingTwo> GetThingTwo()
{
using (var cmd = connManagement.AmbientConnection.CreateCommand())
{
cmd.CommandType = System.Data.CommandType.StoredProcedure;
cmd.CommandText = "dbo.GetThingTwo";
return (ThingTwo)(await cmd.ExecuteScalarAsync());
}
}
AmbientConnection implementation
internal class SQLConnManagement
{
readonly string connStr;
readonly ConcurrentDictionary<Transaction, SqlConnection> txConnections = new ConcurrentDictionary<Transaction, SqlConnection>();
private SqlConnection CreateConnection(Transaction tx)
{
var conn = new SqlConnection(this.connStr);
// When the transaction completes, close the connection as well
tx.TransactionCompleted += (s, e) =>
{
SqlConnection closing_conn;
if (txConnections.TryRemove(e.Transaction, out closing_conn))
{
closing_conn.Dispose(); // closing_conn == conn
}
};
conn.Open();
return conn;
}
internal SqlConnection AmbientConnection
{
get
{
var txCurrent = Transaction.Current;
if (txCurrent == null) throw new InvalidOperationException("An ambient transaction is required.");
return txConnections.GetOrAdd(txCurrent, CreateConnection);
}
}
public SQLConnManagement(string connStr)
{
this.connStr = connStr;
}
}
Not to overcomplicate the post, but this might be relevant: every time MSDTC has been invoked, the logged stack trace shows that this next mechanism was involved. I cache certain data with the built-in ObjectCache because it doesn't change often, so I fetch it at most once per minute or so. This is a little fancy, but I don't see why the Lazy generator would be treated any differently from a more typical call, or why this specifically would cause the MSDTC to sometimes be invoked. I've tried LazyThreadSafetyMode.ExecutionAndPublication too, just in case, but that doesn't help (and then the exception keeps getting delivered as the cached result for subsequent requests until the entry expires, which is not desirable).
/// <summary>
/// Cache element that gets the item by key, or if it is missing, creates, caches, and returns the item
/// </summary>
static T CacheGetWithGenerate<T>(ObjectCache cache, string key, Func<T> generator, DateTimeOffset offset) where T : class
{
var generatorWrapped = new Lazy<T>(generator, System.Threading.LazyThreadSafetyMode.PublicationOnly);
return ((Lazy<T>)cache.AddOrGetExisting(
key,
generatorWrapped,
offset))?.Value ?? generatorWrapped.Value;
}
public ThingTwo CachedThingTwo
{
get
{
return CacheGetWithGenerate(
MemoryCache.Default,
"Services.ThingTwoData",
() => GetThingTwo(), // ok, GetThingTwo isn't async this time, fudged example
DateTime.Now.Add(TimeSpan.FromMinutes(1)));
}
}
Do you know why MSDTC is being invoked?
PublicationOnly means that two connections can be created and one thrown away. I'm surprised you made this bug because you explicitly stated PublicationOnly (as opposed to the default safety mode which is safe). You explicitly allowed this bug.
For some reason I did not see that you tried ExecutionAndPublication already. Since not using it is a bug please fix the code in the question.
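To illustrate the difference between the two safety modes discussed above, the sketch below counts how often a Lazy<T> factory runs when several threads request the value at once. LazyDemo and CountFactoryCalls are hypothetical names, and the sleep only widens the race window; this is a demonstration of Lazy<T> behavior, not a reproduction of the MSDTC issue.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LazyDemo
{
    // Reads Lazy<T>.Value from several threads at once and returns
    // how many times the factory delegate was actually invoked.
    public static int CountFactoryCalls(LazyThreadSafetyMode mode, int threads)
    {
        int calls = 0;
        var lazy = new Lazy<string>(() =>
        {
            Interlocked.Increment(ref calls);
            Thread.Sleep(50);            // widen the race window
            return "value";
        }, mode);

        var tasks = new Task[threads];
        for (int i = 0; i < threads; i++)
            tasks[i] = Task.Run(() => lazy.Value);
        Task.WaitAll(tasks);
        return calls;
    }
}
```

With LazyThreadSafetyMode.ExecutionAndPublication the method returns 1, because only one thread is allowed to run the factory. With PublicationOnly it may return more than 1: several threads can run the factory concurrently and all but one result is discarded, which is how a second connection could end up being created.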
CreateConnection is also broken in the sense that in case of exception on open the connection object is not getting disposed. Probably harmless but you never know.
Also, audit this code for thread aborts which can happen when ASP.NET times out a request. You are doing very dangerous and brittle things here.
The pattern that I use is to use an IOC container to inject a connection that is shared for the entire request. The first client for that connection opens it. The request end event closes it. Simple, and does away with all that nasty shared, mutable, multi-threaded state.
Why are you using a cache for data that you do not want to lose? This is probably the bug. Don't do that.
What is ?.Value ?? generatorWrapped.Value about? The dictionary can never return null. Delete that code. If it could return null then forcing the lazy value would create a second connection so that's a logic bug as well.

Database timeout

I have a program that accesses a database and executes different methods, each of which makes a database call.
I used one connection for everything, but that caused a timeout while executing a long task:
I basically had to go through more than 6000 records and execute a stored procedure for each. I think the timeout happened because I used only one database connection for everything.
Then I changed the code so that I open and close the connection in every method I call, using the "using" approach.
How should I handle a method that will be called a lot? Should I open and close the connection every time I call that method?
Or is there a different approach?
I do something like this:
foreach (var record in MyCollection) // more than 6000 records
{
    using (connection = new SqlConnection(conString))
    {
        singledata = GetSingleData(record);
    }
}
Here is GetSingleData():
private byte[] GetSingleData(MyObject Data)
{
    byte[] singleData = null;
    using (SqlCommand......)
    {
        try
        {
            .......
            // executing stored proc to get just a single row
            reader = command.ExecuteReader();
            while (reader.Read())
            {
                singleData = (byte[])reader["ColumnName"];
            }
        }
        catch (SqlException ex)
        {
            if (reader != null && !reader.IsClosed)
                reader.Close();
        }
    }
    return singleData;
}
Is this efficient, or should I set up some kind of counter and, every 500 records, check whether the connection is closed and reopen it if so?
Thanks.
Try using a persistent connection. Here's a post that might help if you want to try to tune your system (for MySQL):
http://www.mysqlperformanceblog.com/2011/04/19/mysql-connection-timeouts/
Hope that helps.
There is no such thing as the only good way to do something. It all depends. In cases where agility is a must and you need to create ad-hoc solutions, opening and closing a connection in each method call might not be ideal in theory, but it is acceptable in practice.
I urge you to read about these terms and concepts:
Connection pooling
Bulk operations (bulk update, bulk insert)
They might help you in getting more performance.

Multiple asynchronous method calls to method while in a loop

I have spent a whole day trying various ways using 'AddOnPreRenderCompleteAsync' and 'RegisterAsyncTask' but no success so far.
I succeeded in making the call to the DB asynchronous using 'BeginExecuteReader' and 'EndExecuteReader', but that is missing the point. The asynchronous handling should not be for the call to the DB, which in my case is fast; it should come afterwards, during the 'while' loop, while calling an external web service.
I think the simplified pseudo code will explain best:
(Note: the connection string is using 'MultipleActiveResultSets')
private void MyFunction()
{
"Select ID, UserName from MyTable"
// Open connection to DB
ExecuteReader();
if (DR.HasRows)
{
while (DR.Read())
{
// Call external web-service
// and get current Temperature of each UserName - DR["UserName"].ToString()
// Update my local DB
Update MyTable set Temperature = ValueFromWebService where UserName =
DR["UserName"];
CmdUpdate.ExecuteNonQuery();
}
// Close connection etc
}
}
Accessing the DB is fast. Getting the result back from the external web service is slow, and that at least should be handled asynchronously.
If each call to the web service takes just 1 second, then with only 100 users it will take at least 100 seconds for the DB update to complete, which obviously is not an option.
There should eventually be thousands of users (currently only 2).
Currently everything works, just very synchronously :)
Thoughts to myself:
Maybe my way of approaching this is wrong?
Maybe the entire process should be called asynchronously?
Many thanks
Have you considered spinning this whole thing off into its own thread?
What is your concern, really?
Avoiding the long task blocking your application?
If so, you can use a thread (see BackgroundWorker).
Processing several calls to the web service in parallel to speed up the whole thing?
If so, maybe the web service can be called asynchronously with a callback. You could also use the ThreadPool or Tasks. But you'll have to wait for all your calls or tasks to complete before proceeding to the DB update.
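One way to run the slow calls in parallel and wait for all of them, sketched with Tasks. TemperatureUpdater, FetchAllAsync and the getTemperatureAsync delegate are hypothetical stand-ins for the real web-service client, and the sketch assumes user names are unique. A SemaphoreSlim caps how many calls are in flight at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class TemperatureUpdater
{
    // Calls the service for every user, at most maxParallel at a time,
    // and returns all results once every call has completed.
    public static async Task<Dictionary<string, double>> FetchAllAsync(
        IReadOnlyList<string> userNames,
        Func<string, Task<double>> getTemperatureAsync,
        int maxParallel)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var tasks = userNames.Select(async name =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try
                {
                    return (Name: name, Temp: await getTemperatureAsync(name));
                }
                finally
                {
                    gate.Release();          // free the slot for the next call
                }
            }).ToList();

            var results = await Task.WhenAll(tasks);
            return results.ToDictionary(r => r.Name, r => r.Temp);
        }
    }
}
```

The idea is to read the user names from the database first, close the reader, fetch all temperatures with this method, and only then write the updates back, so that no connection stays open while waiting on the web service.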
You should keep the database connection open for as short of a time as possible. Therefore, don't do stuff while iterating through a DataReader. Most application developers prefer to put their actual database access code on a separate layer, and in a case like this, you would return a DataTable or a typed collection to the calling code. Furthermore, if you are updating the same table you are reading from, this could result in locks.
How many users will be executing this method at once, and how often does it need to be refreshed? Are you sure you need to do this from inside the web app? You may consider using a singleton for this, in which case spinning off a couple of worker threads is totally appropriate even if it's in the web app. Another thing to consider is using a Windows Service, which I think would be more appropriate for periodically updating data from a web service that has nothing to do with the current user's session.
I'd say: create a thread for each web request, and do something like this:
Extra functions:
int privCompleteThreads = 0;
int OpenThreads = 0;
int CompleteThreads
{
    get { return privCompleteThreads; }
    set { privCompleteThreads = value; CheckDoneOperations(); }
}
void CheckDoneOperations()
{
    if (CompleteThreads == OpenThreads)
    {
        // done!
    }
}
in main program:
foreach(time i need to open a request)
{
OpenThreads = OpenThreads + 1;
//Create thread here
}
inside the threaded function:
//do your other stuff here
//do this when done the operation:
CompleteThreads = CompleteThreads + 1;
Now, I'm not sure how reliable this approach would be; it's up to you. But a normal web request shouldn't take a second. Your browser doesn't take a second to load this page, does it? Mine loads it as fast as I can hit F5. It's just opening a stream. You could also try opening the web request once and reusing the same instance over and over, and see if that speeds it up at all.
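On the reliability point: incrementing a shared counter with x = x + 1 from several threads is not atomic, so increments can be lost. A minimal sketch of the same "count completed threads" idea using thread-safe primitives follows; CompletionDemo and RunAll are hypothetical names, and the sleep stands in for the web request.

```csharp
using System;
using System.Threading;

static class CompletionDemo
{
    // Starts 'count' threads and blocks until every one has signalled completion.
    public static int RunAll(int count)
    {
        int finished = 0;
        using (var allDone = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                new Thread(() =>
                {
                    Thread.Sleep(20);                    // stands in for the web request
                    Interlocked.Increment(ref finished); // atomic, unlike x = x + 1
                    allDone.Signal();                    // one fewer thread outstanding
                }).Start();
            }
            allDone.Wait();                              // returns once all threads have signalled
        }
        return finished;
    }
}
```

CountdownEvent replaces the manual comparison of completed and opened threads, and Interlocked.Increment keeps the counter correct under contention.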

Access to SQL DB in multithread server app

In my server application I want to use a DB (SQL Server), but I am quite unsure of the best method. Client requests come in to a thread pool, so they are processed asynchronously. Every request usually needs to read from or write to the DB, so I was thinking about a static method that would create a connection, execute the query and return the result. I'm only afraid that opening and closing connections might be too slow, and that some connection limit could be reached. Is this a good approach?
IMHO the best is to rely on the ADO.NET connection pooling mechanism and don't try to handle database connections manually. Write your data access methods like this:
public void SomeMethod()
{
using (var connection = new SqlConnection(connectionString))
using (var command = connection.CreateCommand())
{
connection.Open();
command.CommandText = "SELECT Field1 FROM Table1";
using (var reader = command.ExecuteReader())
{
while(reader.Read())
{
// do something with the results
}
}
}
}
Then you can call this method from wherever you like: make it static, call it from multiple threads, whatever. Remember that calling Dispose on the connection won't actually close it. It will return it to the connection pool so that it can be reused.
Surprised that no one mentioned connection pooling. If you think you are going to have a large number of requests, why not just setup a pool with a min pool size set to say 25 (arbitrary number here, do not shoot) and max pool size set to say 200.
This will decrease the number of connection attempts and make sure that if you are not leaking connection handles (something that you should take explicit care to not let happen), you will always have a connection waiting for you.
Reference article on connection pooling: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
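As a sketch of how the pool sizes suggested above could be set, the pool limits are part of the connection string and can be built with SqlConnectionStringBuilder. The server and database names below are placeholders, and the numbers are the arbitrary ones from the answer, not tuned recommendations.

```csharp
using System.Data.SqlClient;

static class PoolSettings
{
    // Builds a connection string with explicit pool limits.
    // Data source and catalog are placeholder values.
    public static string Build()
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "SQLSERVER\\SQLEXPRESS",
            InitialCatalog = "MyDatabase",
            IntegratedSecurity = true,
            Pooling = true,       // the default, shown for clarity
            MinPoolSize = 25,
            MaxPoolSize = 200
        };
        return builder.ConnectionString;
    }
}
```

Every SqlConnection created with exactly the same connection string shares one pool, so the string should be built once and reused.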
Another side note, why the need to have the connection string in the code? Set it in the web.config or app.config for the sake of maintainability. I had to "fix" code that did such things and I always swore copiously at the programmer responsible for such things.
I have had exactly the same problem as you. I had a huge app that I started making multithreaded. The benefit over keeping one connection open and reusing it is that you can ask the DB for data multiple times at once, since a new connection is spawned per request (no need to wait for other threads to finish getting data). And if, for example, you lose the connection to SQL Server (which can happen when the network goes down for a second or two), you would have to check whether the connection is open before submitting every query anyway.
This is my code for getting database rows in MS SQL, but other operations should be done exactly the same way. Keep in mind that sqlConnectOneTime(string varSqlConnectionDetails) has the flaw of returning null when there's no connection, so it needs some modification for your needs, or the query will fail if the connection cannot be established. You just need to add proper handling there :-) Hope it will be useful for you :-)
public const string sqlDataConnectionDetails = "Data Source=SQLSERVER\\SQLEXPRESS;Initial Cata....";
public static string sqlGetDatabaseRows(string varDefinedConnection) {
string varRows = "";
const string preparedCommand = @"
SELECT SUM(row_count) AS 'Rows'
FROM sys.dm_db_partition_stats
WHERE index_id IN (0,1)
AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0;";
using (var varConnection = Locale.sqlConnectOneTime(varDefinedConnection))
using (var sqlQuery = new SqlCommand(preparedCommand, varConnection))
using (var sqlQueryResult = sqlQuery.ExecuteReader())
while (sqlQueryResult.Read()) {
varRows = sqlQueryResult["Rows"].ToString();
}
return varRows;
}
public static SqlConnection sqlConnectOneTime(string varSqlConnectionDetails) {
SqlConnection sqlConnection = new SqlConnection(varSqlConnectionDetails);
try {
sqlConnection.Open();
} catch (Exception e) {
MessageBox.Show("Error connecting to SQL Server." + Environment.NewLine + Environment.NewLine + "Error: " + Environment.NewLine + e, "Connection error");
}
if (sqlConnection.State == ConnectionState.Open) {
return sqlConnection;
}
return null;
}
Summary:
Define one global variable with the connection details of your SQL Server.
Use one global method to make the connection (you need to handle the null in there).
Use using to dispose of the connection, the SQL query and everything else when the reading/writing/updating method is done.
The one thing that you haven't told us, that would be useful for giving you an answer that's appropriate for you is what level of load you're expecting your server application to be under.
For pretty much any answer to the above question though, the answer would be that you shouldn't worry about it. ADO.net/Sql Server provides connection pooling which removes some of the overhead of creating connections from each "var c = new SqlConnection(connectionString)" call.
