How can I optimize the functions that connect to the database so that the server doesn't crash or run into other problems when many users access it at the same time?
Is it possible to use threading? And if the database is slow to respond, could the main thread freeze or block other code?
public static void UpdatePassword(string email, string password)
{
    using (MySqlConnection connection = new MySqlConnection(""))
    {
        connection.Open();
        MySqlCommand command = connection.CreateCommand();
        string saltedPassword = PasswordDerivation.Derive(password);
        command.CommandText = "UPDATE users SET password=@password WHERE email=@email LIMIT 1";
        command.Parameters.AddWithValue("@email", email);
        command.Parameters.AddWithValue("@password", saltedPassword);
        command.ExecuteNonQuery();
        connection.Close();
    }
}
In most situations, a single "program" should use a single connection to the database. Having lots of "connections" incurs overhead, at least for creating the connections.
Async actions are rarely beneficial in database work since SQL is very good at working efficiently with millions of rows in a single query.
MySQL is very good at letting separate clients talk to the database at the same time. However, this needs "transactions" to keep the data from getting messed up.
If your goal in C# is to get some parallelism, please describe it further. We will either convince you that it won't be as beneficial as you think or help you rewrite the SQL to be more efficient and avoid the need for parallelism.
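To make the transaction point above concrete, here is a minimal sketch, assuming the MySql.Data client; the connection string and the `accounts` table are placeholders, not part of the original question:

```csharp
using MySql.Data.MySqlClient;

// Sketch only: connection string and `accounts` table are hypothetical.
using (var connection = new MySqlConnection("server=...;database=...;uid=...;pwd=..."))
{
    connection.Open();
    using (MySqlTransaction transaction = connection.BeginTransaction())
    {
        MySqlCommand command = connection.CreateCommand();
        command.Transaction = transaction;
        command.CommandText = "UPDATE accounts SET balance = balance - @amount WHERE id = @id";
        command.Parameters.AddWithValue("@amount", 100);
        command.Parameters.AddWithValue("@id", 1);
        command.ExecuteNonQuery();
        // Commit makes all statements in the transaction visible atomically;
        // if an exception occurs first, Dispose rolls the transaction back.
        transaction.Commit();
    }
}
```

Each concurrent client runs its own transaction on its own connection, and the database serializes conflicting changes rather than letting them "get messed up".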
Related
Is there a problem if I execute queries from multiple threads using the same ConnectionString? What happens if two or more threads try to send data at the same time?
string globalConnectionString = @"some_stringHere!";
//create new backgroundWorker if new logFile is created (txt file).
// ....
private void backgroundWorker_DoWork(object sender, DoWorkEventArgs e)
{
    // get some data from created logFile
    string serialNumber = getSerialNumber(logFile);
    string testResult = getTestResult(logFile);
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            SqlCommand someCommand = new SqlCommand("some insert/update command here!", connection);
            connection.Open();
            someCommand.ExecuteNonQuery();
            connection.Close();
        }
    }
}
Concurrent connections are OK, if used correctly
There's no problem with using multiple connections concurrently, assuming it's done for the right reason. Databases can handle thousands of concurrent client connections.
Executing the same slow query in parallel to make it finish faster will probably make it even slower as each connection may block the others. Many databases parallelize query processing already, producing far better results than crude client-side parallelism.
If you want to make a slow query go faster, you'll get better results by investigating why it's slow and fixing the performance issues. For example, if you want to insert 10K rows, it's faster to use e.g. SqlBulkCopy or BULK INSERT to load the rows than to execute 10K INSERTs that will end up blocking each other for access to the same table and even the same data pages.
You can use the same connection to execute asynchronous queries (e.g. with ExecuteNonQueryAsync(), ExecuteReaderAsync() etc.), provided they execute one after the other. You can't execute multiple concurrent queries on the same connection, at least not without jumping through some hoops.
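One such hoop on SQL Server is MARS (Multiple Active Result Sets), opted into via the connection string. A hedged sketch, with placeholder connection details and illustrative table names:

```csharp
using System.Data.SqlClient;

// Sketch: MultipleActiveResultSets=True lets a second command execute on
// the same connection while a DataReader is still open. Without it,
// ADO.NET throws an InvalidOperationException about an already-open
// DataReader on the connection.
var connectionString =
    "Server=...;Database=...;Integrated Security=true;MultipleActiveResultSets=True";

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var reader = new SqlCommand("SELECT id FROM users", connection).ExecuteReader())
    {
        while (reader.Read())
        {
            // A second command on the same open connection - only legal with MARS.
            var detail = new SqlCommand("SELECT COUNT(*) FROM logins", connection);
            detail.ExecuteScalar();
        }
    }
}
```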
The real problem
The real problem is using a BackgroundWorker in the first place. That class has been obsolete since 2012, when async/await was introduced. With BGW it's extremely hard to combine multiple asynchronous operations. Progress reporting is available through the Progress<T> class, and cooperative cancellation through CancellationTokenSource. Check Async in 4.5: Enabling Progress and Cancellation in Async APIs for a detailed explanation.
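A minimal sketch of those two mechanisms; the work loop and UI names are illustrative, not from the question:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch: Progress<T>.Report marshals back to the context that created
// the Progress instance (e.g. the UI thread); cancellation is cooperative.
async Task ProcessLogsAsync(IProgress<int> progress, CancellationToken token)
{
    for (int i = 0; i < 100; i++)
    {
        token.ThrowIfCancellationRequested();
        await Task.Delay(50, token);   // stand-in for real async work
        progress.Report(i + 1);        // e.g. drives a progress bar
    }
}

// Usage, e.g. in an event handler:
// var cts = new CancellationTokenSource();
// var progress = new Progress<int>(pct => progressBar.Value = pct);
// await ProcessLogsAsync(progress, cts.Token);
// ... later: cts.Cancel();
```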
You can replace the BGW calls in your code with a simple await command.ExecuteNonQueryAsync(). You could create an asynchronous method that inserts the data into the database:
private async Task InsertTestData(string serialNumber, string testResult)
{
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            var someCommand = new SqlCommand("some insert/update command here!", connection);
            someCommand.Parameters.Add("@serial", SqlDbType.NVarChar, 30).Value = serialNumber;
            ...
            connection.Open();
            await someCommand.ExecuteNonQueryAsync();
        }
    }
}
If retrieving the serial number and test data is time consuming, you can use Task.Run to run each of them in the background :
string serialNumber = await Task.Run(()=>getSerialNumber(logFile));
string testResult = await Task.Run(()=>getTestResult(logFile));
await InsertTestData(serialNumber,testResult);
You could also use a library like Dapper to simplify the database code:
private async Task InsertTestData(string serialNumber, string testResult)
{
    // if server is online, send data
    if (serverIsOnline)
    {
        using (SqlConnection connection = new SqlConnection(globalConnectionString))
        {
            await connection.ExecuteAsync("INSERT .... VALUES(@serial,@test)",
                new { serial = serialNumber, test = testResult });
        }
    }
}
Dapper will generate a parameterized query and match the parameters in the query with properties in the anonymous object by name.
Reading the connection string isn't an issue here. You would have a problem if you shared the SqlConnection object across multiple threads, but that's not the case in your code.
I believe this is a question about Isolation from the ACID properties. Please have a look at them.
Based on the SQL standard, a single SQL query operates on a steady (consistent) state of the table(s) it works on. By this definition, it can NOT see any changes while it's being executed. However, as far as I know, not all DBMS software follows this rule perfectly. For example, there are products and/or isolation levels that allow dirty reads.
Here is a very detailed explanation from another user.
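A hedged sketch of choosing an isolation level explicitly in ADO.NET; the table name and connection string are illustrative:

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: ReadUncommitted permits dirty reads; Serializable is the
// strictest level. The `logins` table is just an example.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (SqlTransaction tx = connection.BeginTransaction(IsolationLevel.ReadUncommitted))
    {
        var cmd = new SqlCommand("SELECT COUNT(*) FROM logins", connection, tx);
        int count = (int)cmd.ExecuteScalar();
        tx.Commit();
    }
}
```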
I have a TcpListener class and I'm using async/await for reading and writing.
For this server I have created a single database instance where I have prepared all database queries.
But with more than one TcpClient I keep getting this exception:
An exception of type MySql.Data.MySqlClient.MySqlException occurred
in MySql.Data.dll but was not handled in user code
Additional information: There is already an open DataReader associated
with this Connection which must be closed first.
If I understand it correctly, there can't be more than one database query at a time on the connection, which is a problem with more than one async client.
So I simply added locks around my queries like this, and everything seems fine.
// One MySqlConnection instance for whole program.
lock (thisLock)
{
    var cmd = connection.CreateCommand();
    cmd.CommandText = "SELECT Count(*) FROM logins WHERE username = @user AND password = @pass";
    cmd.Parameters.AddWithValue("@user", username);
    cmd.Parameters.AddWithValue("@pass", password);
    var count = int.Parse(cmd.ExecuteScalar().ToString());
    return count > 0;
}
I have also tried the approach with usings, which creates a new connection for every query, as suggested by members of the SO community, but this method is much slower than the locks:
using (MySqlConnection connection = new MySqlConnection(connectionString))
{
    connection.Open(); // This takes +- 35ms and makes worse performance than locks
    using (MySqlCommand cmd = connection.CreateCommand())
    {
        cmd.CommandText = "SELECT Count(*) FROM logins WHERE username = @user AND password = @pass";
        cmd.Parameters.AddWithValue("@user", username);
        cmd.Parameters.AddWithValue("@pass", password);
        int count = int.Parse(cmd.ExecuteScalar().ToString());
        return count > 0;
    }
}
I used Stopwatch to benchmark these methods: queries over one connection with locks complete in +- 20ms, which is roughly just the network delay, but with usings it's +- 55ms because the .Open() method takes +- 35ms.
Why do so many people use the method with usings if the performance is so much worse? Or am I doing something wrong?
You're right, opening a connection is a time-consuming operation. To mitigate this, ADO.NET has connection pooling. Check this article for details.
If you continue your performance test and check the timings for subsequent connections, you should see that the time for connection.Open() improves and gets close to 0 ms, because the connections are actually taken from the pool.
With your lock implementation, you effectively use a connection pool with just one connection. While this approach can show better performance in a trivial test, it will perform very poorly in highly loaded applications.
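A hedged sketch of observing this effect; the connection string is a placeholder, while Min Pool Size and Max Pool Size are standard ADO.NET pooling keywords:

```csharp
using System;
using System.Diagnostics;
using System.Data.SqlClient;

// Sketch: the first Open() pays the physical connection cost; a second
// Open() with the same connection string is served from the pool.
var connectionString =
    "Server=...;Database=...;Integrated Security=true;" +
    "Min Pool Size=5;Max Pool Size=100";

var sw = Stopwatch.StartNew();
using (var first = new SqlConnection(connectionString))
{
    first.Open();   // slow: establishes and authenticates a new connection
}
Console.WriteLine($"first Open: {sw.ElapsedMilliseconds} ms");

sw.Restart();
using (var second = new SqlConnection(connectionString))
{
    second.Open();  // fast: leased from the pool, near 0 ms
}
Console.WriteLine($"second Open: {sw.ElapsedMilliseconds} ms");
```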
I would like to know the best approach to open a SqlConnection to a SQL Server 2008 R2 Express Edition database. This edition of SQL Server has limitations on RAM and CPU usage, so we must adopt the best way to open a SqlConnection.
Right now I am checking the connection at the start and end of each and every method. Here is an example of that.
private void CheckValidId(string Id)
{
    CheckConnectionStatus();
    try
    {
        sqlConnection.Open();
        sqlCommand = new SqlCommand("select * from ValidId where id=@id", sqlConnection);
        sqlCommand.Parameters.AddWithValue("@id", Id);
        sqlDataReader = sqlCommand.ExecuteReader();
        while (sqlDataReader.Read())
        {
            string Test = sqlDataReader["Id"].ToString();
            MessageBox.Show("Value of Id : " + Test);
        }
    }
    catch (Exception exp)
    {
        MessageBox.Show(exp.Message.ToString(), "Exception in CheckValidId");
    }
    finally
    {
        CheckConnectionStatus();
    }
}
Here is CheckConnectionStatus Method
private void CheckConnectionStatus()
{
    if (sqlConnection.State == ConnectionState.Open)
    {
        sqlConnection.Close();
    }
}
What is best approach to perform this operation.
Thanks
Just use using, as it disposes of the connection once done.
using (SqlConnection conn = new SqlConnection("Connection string"))
{
    //do sql stuff
    conn.Open();
    //etc etc
    conn.Close();
}
You'll want to make use of the disposable pattern to ensure everything is closed and disposed properly:
var query = "select * from ValidId where id=@id";
using (var conn = new System.Data.SqlClient.SqlConnection(usingConnectionString))
using (var command = new System.Data.SqlClient.SqlCommand(query, conn))
{
    command.Parameters.Add("@id", SqlDbType.Int).Value = Id;
    conn.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            string Test = reader["Id"].ToString();
        }
    }
    command.Parameters.Clear();
}
You don't need to check the connection state; it will close when it's being disposed.
One thing to note: it's best practice to explicitly specify your parameter data types. I've assumed SqlDbType.Int in your case, but you can change it to whatever it really is.
Another thing to note: you don't want to do too much inside the reader's while loop. Build your collection (or whatever you need) and get out of there; the shorter your connection is open, the better. That's because you could be holding a read lock on some rows in the database, which might affect other users and their applications.
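A minimal sketch of that advice, reusing the reader from the example above:

```csharp
// Sketch: collect the rows quickly, let the reader and connection go,
// then do the heavy work with no database locks held.
var ids = new List<string>();
using (var reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        ids.Add(reader["Id"].ToString());   // just collect, nothing heavy
    }
}   // reader closed here; locks released when the connection is disposed

foreach (var id in ids)
{
    // expensive processing happens after the connection is free
}
```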
Your pattern for open and close is correct. However, you must note that this doesn't open and close the connection to the SQL Server, so it doesn't really address your concerns over memory usage and CPU; in fact it won't make any difference.
What Open and Close do is lease and return a connection to the ADO connection pool on the client PC. This means that closing an ADO connection is not guaranteed to (and in most cases will not) close and release the connection to the SQL Server. This is because establishing and authenticating a connection is relatively expensive and slow, so the ADO connection pool keeps your connections in a pool, still open, just in case you want to re-establish a connection.
What makes the difference to SQL Server is the number of concurrent queries it needs to execute - and the dataset size of the queries, and the total size of the data in the database.
Concurrent queries squeeze CPU, and the datasets returned squeeze the available RAM. Obviously, the bigger your database, the less can be cached in RAM, and so the less likely you are to get a cache hit when querying.
In practice, my experience with SQL Express editions is that you won't notice any difference between them and the full edition of SQL Server unless you are doing some very specific things:
1) Writing a BI-style tool which allows the user to construct user-defined or user-scoped queries.
2) Writing terrible SQL - "big SQL" may mask your bad query syntax, but Express won't be able to, because it has less available RAM to play with.
If you write efficient, constrained SQL, you probably won't ever hit any of SQL Express's limitations.
In my server application I want to use a DB (SQL Server), but I am quite unsure of the best method. Client requests come in on the thread pool, so their processing is async. Every request usually needs to read or write to the DB, so I was thinking of a static method that would create a connection, execute the query, and return the result. I'm only afraid that opening and closing the connection might be too slow, and that some connection limit might be reached. Is this a good approach?
IMHO the best is to rely on the ADO.NET connection pooling mechanism and don't try to handle database connections manually. Write your data access methods like this:
public void SomeMethod()
{
using (var connection = new SqlConnection(connectionString))
using (var command = connection.CreateCommand())
{
connection.Open();
command.CommandText = "SELECT Field1 FROM Table1";
using (var reader = command.ExecuteReader())
{
while(reader.Read())
{
// do something with the results
}
}
}
}
Then you can call this method from wherever you like, make it static, call it from threads whatever. Remember that calling Dispose on the connection won't actually close it. It will return it to the connection pool so that it can be reused.
Surprised that no one mentioned connection pooling. If you think you are going to have a large number of requests, why not just setup a pool with a min pool size set to say 25 (arbitrary number here, do not shoot) and max pool size set to say 200.
This will decrease the number of connection attempts and make sure that if you are not leaking connection handles (something that you should take explicit care to not let happen), you will always have a connection waiting for you.
Reference article on connection pooling: http://msdn.microsoft.com/en-us/library/8xx3tyca.aspx
Another side note, why the need to have the connection string in the code? Set it in the web.config or app.config for the sake of maintainability. I had to "fix" code that did such things and I always swore copiously at the programmer responsible for such things.
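A hedged sketch of that advice, assuming a connection string named "Main" in App.config; both the name and the string itself are placeholders:

```csharp
// App.config (assumed):
// <connectionStrings>
//   <add name="Main"
//        connectionString="Server=...;Database=...;Integrated Security=true"
//        providerName="System.Data.SqlClient" />
// </connectionStrings>

// Requires a project reference to System.Configuration.
using System.Configuration;

string connectionString =
    ConfigurationManager.ConnectionStrings["Main"].ConnectionString;
```

Changing the server or credentials then means editing the config file, not recompiling the application.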
I have had exactly the same problem as you: a huge app that I started making multithreaded. The benefit over keeping one connection open and reusing it is that you can ask the DB for data multiple times, as a new connection is spawned per request (no need to wait for other threads to finish getting data). Also, if you lose the connection to SQL (and that can happen when the network goes down for a second or two), with a shared connection you would always have to check whether it is open before submitting a query anyway.
This is my code for getting the database row count in MS SQL, but other operations should be done exactly the same way. Keep in mind that sqlConnectOneTime(string varSqlConnectionDetails) has a flaw: it returns null when there's no connection, so it needs some modification for your needs, or the query will fail when a connection can't be established. You just need to add proper handling code there :-) Hope it will be useful for you :-)
public const string sqlDataConnectionDetails = "Data Source=SQLSERVER\\SQLEXPRESS;Initial Cata....";

public static string sqlGetDatabaseRows(string varDefinedConnection) {
    string varRows = "";
    const string preparedCommand = @"
        SELECT SUM(row_count) AS 'Rows'
        FROM sys.dm_db_partition_stats
        WHERE index_id IN (0,1)
        AND OBJECTPROPERTY([object_id], 'IsMsShipped') = 0;";
    using (var varConnection = Locale.sqlConnectOneTime(varDefinedConnection))
    using (var sqlQuery = new SqlCommand(preparedCommand, varConnection))
    using (var sqlQueryResult = sqlQuery.ExecuteReader())
        while (sqlQueryResult.Read()) {
            varRows = sqlQueryResult["Rows"].ToString();
        }
    return varRows;
}

public static SqlConnection sqlConnectOneTime(string varSqlConnectionDetails) {
    SqlConnection sqlConnection = new SqlConnection(varSqlConnectionDetails);
    try {
        sqlConnection.Open();
    } catch (Exception e) {
        MessageBox.Show("Error connecting to the SQL server." + Environment.NewLine + Environment.NewLine + "Error: " + Environment.NewLine + e, "Connection error");
    }
    if (sqlConnection.State == ConnectionState.Open) {
        return sqlConnection;
    }
    return null;
}
Summary:
Defined one global variable with ConnectionDetails of your SQL Server
One global method to make connection (you need to handle the null in there)
Usage of using to dispose connection, sql query and everything when the method of reading/writing/updating is done.
The one thing that you haven't told us, which would be useful for giving an answer appropriate to your situation, is what level of load you're expecting your server application to be under.
For pretty much any answer to the above question, though, you shouldn't worry about it. ADO.NET/SQL Server provides connection pooling, which removes much of the overhead of each "var c = new SqlConnection(connectionString)" call.
I'm getting this error (Distributed transaction completed. Either enlist this session in a new transaction or the NULL transaction.) when trying to run a stored procedure from C# on a SQL Server 2005 database. I'm not actively/purposefully using transactions or anything, which is what makes this error weird. I can run the stored procedure from management studio and it works fine. Other stored procedures also work from C#, it just seems to be this one with issues. The error returns instantly, so it can't be a timeout issue. The code is along the lines of:
SqlCommand cmd = null;
try
{
    // Make sure we are connected to the database
    if (_DBManager.CheckConnection())
    {
        cmd = new SqlCommand();
        lock (_DBManager.SqlConnection)
        {
            cmd.CommandText = "storedproc";
            cmd.CommandType = System.Data.CommandType.StoredProcedure;
            cmd.Connection = _DBManager.SqlConnection;
            cmd.Parameters.AddWithValue("@param", value);
            int affRows = cmd.ExecuteNonQuery();
            ...
        }
    }
    else
    {
        ...
    }
}
catch (Exception ex)
{
    ...
}
It's really got me stumped. Thanks for any help
It sounds like there is a TransactionScope somewhere that is unhappy. The _DBManager.CheckConnection and _DBManager.SqlConnection members suggest you are keeping a SqlConnection hanging around, which I expect contributes to this.
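For context, a hedged sketch of how an ambient TransactionScope interacts with connections; the table and connection string are illustrative:

```csharp
using System.Data.SqlClient;
using System.Transactions;

// Sketch: any connection opened inside an ambient TransactionScope is
// enlisted automatically. A long-lived shared connection can end up
// enlisted in a scope that has since completed, producing the
// "Distributed transaction completed" error when it is reused.
using (var scope = new TransactionScope())
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();   // enlists in the ambient transaction
        var cmd = new SqlCommand("UPDATE users SET active = 1 WHERE id = @id", conn);
        cmd.Parameters.AddWithValue("@id", 42);
        cmd.ExecuteNonQuery();
    }
    scope.Complete();  // without this, the work rolls back on Dispose
}
```

A connection created fresh (or leased from the pool) per operation never carries stale transaction state between calls.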
To be honest, in most common cases you are better off just using the inbuilt connection pooling, and using your connections locally - i.e.
using(var conn = new SqlConnection(...)) { // or a factory method
// use it here only
}
Here you get a clean SqlConnection, which will be mapped to an unmanaged connection via the pool, i.e. it doesn't create an actual connection each time (but will do a logical reset to clean it up).
This also allows much more flexible use from multiple threads. Using a static connection in a web app, for example, would be horrendous for blocking.
From the code, it seems that you are reusing an already opened connection. Maybe a transaction is still pending on that same connection from an earlier operation.