I have a long-running service with several threads calling the following method hundreds of times per second:
void TheMethod()
{
    using (var c = new SqlConnection("..."))
    {
        c.Open();
        var ret1 = PrepareAndExecuteStatement1(c, args1);
        // some code
        var ret2 = PrepareAndExecuteStatement2(c, args2);
        // more code
    }
}
PrepareAndExecuteStatement is something like this:
void PrepareAndExecuteStatement*(SqlConnection c, args)
{
    var cmd = new SqlCommand("query", c);
    cmd.Parameters.Add("@param", type);
    cmd.Prepare();
    cmd.Parameters["@param"].Value = args;
    return cmd.execute().read().etc(); // pseudocode: execute and read the results
}
I want to reuse the prepared statements, preparing each statement once per connection and executing it repeatedly until the connection breaks. I hope this will improve performance.
Can I use the built-in connection pool to achieve this? Ideally, every time a new connection is made, all of the statements would be prepared automatically, and I would have access to the SqlCommand objects of these statements.
I suggest taking a slightly modified approach: close your connection immediately after use. You can certainly re-use your SqlConnection.
The work being done at // some code may take a long time. Are you interacting with other network or disk resources, or spending any amount of time on calculations? Could you ever, in the future, need to do so? Perhaps the intervals between executing statements are, or could become, so long that you'd want to reopen the connection. Regardless, the connection should be opened late and closed early.
using (var c = new SqlConnection("..."))
{
    c.Open();
    PrepareAndExecuteStatement1(c, args);
    c.Close();
    // some code
    c.Open();
    PrepareAndExecuteStatement2(c, args);
    c.Close();
    // more code
}
See "Open Late, Close Early," as covered in MSDN Magazine by John Papa.
Obviously we've now got a bunch of code duplication here. Consider refactoring your Prepare...() method to perform the opening and closing operations.
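For instance, a minimal sketch of that refactoring (the query text, parameter, and types here are placeholders, since the original method bodies are elided):

void PrepareExecuteAndClose(string connectionString, int args)
{
    // The method now owns its connection: open late, close early.
    using (var c = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("query", c))
    {
        cmd.Parameters.Add("@param", SqlDbType.Int).Value = args;
        c.Open();
        cmd.ExecuteNonQuery();
    } // both the command and the connection are disposed here
}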
Alternatively, perhaps you'd consider something like this:
using (var c = new SqlConnection("..."))
{
    var cmd1 = PrepareAndCreateCommand(c, args);
    // some code
    var cmd2 = PrepareAndCreateCommand(c, args);
    c.Open();
    cmd1.ExecuteNonQuery();
    cmd2.ExecuteNonQuery();
    c.Close();
    // more code
}
I am working on some code in which I would like to access an Oracle database inside a Parallel.For loop. The loop runs for several minutes and then results in the error:
"Attempted to read or write protected memory. This is often an
indication that other memory is corrupt."
There is no inner exception. Inside my Parallel.For loop, I am creating and opening the database connection as local objects. My code looks like this:
static void CheckSinglePath(Path p)
{
    string sqlBase = "select * from table where hour = #HOUR#";
    Parallel.For(1, 24, i =>
    {
        DBManager localdbm = new DBManager();
        string sql = sqlBase;
        sql = sql.Replace("#HOUR#", i.ToString());
        OracleDataReader reader = localdbm.GetData(sql);
        if (reader.Read())
        {
            //do some stuff
        }
        reader.Close();
    });
}
class DBManager
{
    OracleConnection conn;
    OracleCommand cmd;

    public DBManager()
    {
        string connStr = "blahblahblah;Connection Timeout=600;";
        conn = new OracleConnection(connStr);
        conn.Open();
        cmd = conn.CreateCommand();
    }

    public OracleDataReader GetData(string sql)
    {
        cmd.CommandText = sql;
        return cmd.ExecuteReader(); // EXCEPTION HERE!
    }
}
What am I doing wrong? How can I create 24 parallel Oracle connections to process the data? I'm guessing there is some sort of race condition or memory leak going on here that I don't fully understand, because the error seems to come from inside the OracleConnection object. Is the database connection not thread-safe? I tried changing the connection string to use a connection pool, and that didn't change anything.
Memory problems like this are almost always caused by incorrect resource usage: you never release the connections you create inside the loop. You need to implement the IDisposable interface on DBManager, and then rewrite your code with the using keyword, like this:
// dispose the connection after the command finishes
using (var localdbm = new DBManager())
{
    var sql = sqlBase;
    sql = sql.Replace("#HOUR#", i.ToString());
    using (var reader = localdbm.GetData(sql))
    {
        if (reader.Read())
        {
            //do some stuff
        }
        // no need to close the reader:
        // it's disposed at the end of the using block
    }
}
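For completeness, a minimal sketch of what that IDisposable implementation might look like, assuming DBManager owns only the connection and command shown above:

class DBManager : IDisposable
{
    OracleConnection conn;
    OracleCommand cmd;

    public DBManager()
    {
        conn = new OracleConnection("blahblahblah;Connection Timeout=600;");
        conn.Open();
        cmd = conn.CreateCommand();
    }

    public OracleDataReader GetData(string sql)
    {
        cmd.CommandText = sql;
        return cmd.ExecuteReader();
    }

    // Dispose releases the command and connection so the pool gets the connection back.
    public void Dispose()
    {
        cmd.Dispose();
        conn.Dispose();
    }
}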
Is this a good way to process data? I don't like the idea of copying the open/close connection code all over the place. In essence, is this good form/style?
Data Processing Method:
public int Process(Func<Product, OleDbConnection, int> func, Product data)
{
    var oleConnect = new OleDbConnection { ConnectionString = @"stuff" };
    oleConnect.Open();
    oleConnect.ChangeDatabase("InventoryManager");
    var ret = func(data, oleConnect);
    oleConnect.Close();
    return ret;
}
Typical Method used by the Func:
(Update, Delete, Select are the others to pass)
public int Insert(Product data, OleDbConnection oleConnect)
{
    var oleCommand = new OleDbCommand("pInsProduct", oleConnect) { CommandType = CommandType.StoredProcedure };
    oleCommand.Parameters.Add(new OleDbParameter("@ProductId", data.ProductID));
    oleCommand.Parameters.Add(new OleDbParameter("@ProductName", data.ProductName));
    return oleCommand.ExecuteNonQuery();
}
The usage code ends up more or less written as:
Process(Insert, data);
Process(Update, data);
EDIT:
I thought up the following alternative method; which is the better implementation? (usings aside)
(OpenConnection is more or less equivalent to the Process method above)
int Insert(Product data)
{
    using (OleDbConnection oleConnect = OpenConnection())
    {
        //do stuff
        oleConnect.Close(); // maybe redundant with the using statement?
    }
}
You should make sure to wrap your connections in using statements to ensure that they get closed and disposed of properly, and you should do the same for commands. Beyond that, it is fine to open and close connections like this: thanks to connection pooling, you typically won't pay a penalty. Still, re-use a connection whenever you reasonably can, as long as you make sure you close and clean up when done.
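As a rough sketch, wrapping the Process method from the question in a using block might look like this (same placeholder connection string as above; each func is still responsible for disposing any commands it creates):

public int Process(Func<Product, OleDbConnection, int> func, Product data)
{
    // The using block guarantees Close/Dispose even if func throws.
    using (var oleConnect = new OleDbConnection { ConnectionString = @"stuff" })
    {
        oleConnect.Open();
        oleConnect.ChangeDatabase("InventoryManager");
        return func(data, oleConnect);
    }
}

The Insert/Update/Delete/Select methods passed as func should wrap their OleDbCommand in a using statement the same way.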
I'm looking to figure out the best way to execute a database query using the least amount of boilerplate code. The method suggested in the SqlCommand documentation:
private static void ReadOrderData(string connectionString)
{
    string queryString = "SELECT OrderID, CustomerID FROM dbo.Orders;";
    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        SqlCommand command = new SqlCommand(queryString, connection);
        connection.Open();
        SqlDataReader reader = command.ExecuteReader();
        try
        {
            while (reader.Read())
            {
                Console.WriteLine(String.Format("{0}, {1}", reader[0], reader[1]));
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
mostly consists of code that would have to be repeated in every method that interacts with the database.
I'm already in the habit of factoring out the establishment of a connection, which would yield code more like the following. (I'm also modifying it so that it returns data, in order to make the example a bit less trivial.)
private SqlConnection CreateConnection()
{
    var connection = new SqlConnection(_connectionString);
    connection.Open();
    return connection;
}
private List<int> ReadOrderData()
{
    using (var connection = CreateConnection())
    using (var command = connection.CreateCommand())
    {
        command.CommandText = "SELECT OrderID FROM dbo.Orders;";
        using (var reader = command.ExecuteReader())
        {
            var results = new List<int>();
            while (reader.Read()) results.Add(reader.GetInt32(0));
            return results;
        }
    }
}
That's an improvement, but there's still enough boilerplate to nag at me. Can this be reduced further? In particular, I'd like to do something about the first two lines of the procedure. I don't feel like the method should be in charge of creating the SqlCommand. It's a tiny piece of repetition as it is in the example, but it seems to grow if transactions are being managed manually or timeouts are being altered or anything like that.
edit: Assume, at least hypothetically, that there are going to be a bunch of different types of data being returned. Consequently, the solution can't be a single one-size-fits-all method; there will have to be a few different ones depending, at minimum, on whether ExecuteNonQuery, ExecuteScalar, ExecuteReader, ExecuteReaderAsync, or any of the others is being called. I'd like to cut down on the repetition among those.
Tried Dapper?
Granted this doesn't get you a DataReader but you might just prefer it this way once you've tried it.
It's about the lightest-weight an ORM can be while still being called an ORM. No more methods to map between DataReader and strong types for me.
Used right here on all the StackExchange sites.
using (var conn = new SqlConnection(cs))
{
    var dogs = conn.Query("select name, age from dogs");
    foreach (dynamic dog in dogs)
    {
        Console.WriteLine("{0} age {1}", dog.name, dog.age);
    }
}
or
using (var conn = new SqlConnection(cs))
{
    var dogs = conn.Query<Dog>("select Name, Age from dogs");
    foreach (Dog dog in dogs)
    {
        Console.WriteLine("{0} age {1}", dog.Name, dog.Age);
    }
}
class Dog
{
    public string Name { get; set; }
    public int Age { get; set; }
}
If you want to roll data access on your own, this pattern of helper methods could be one way to remove duplication:
private List<int> ReadOrderData()
{
    return ExecuteList<int>("SELECT OrderID FROM dbo.Orders;",
        x => x.GetInt32(0)).ToList();
}

private IEnumerable<T> ExecuteList<T>(string query,
    Func<IDataRecord, T> entityCreator)
{
    // CreateConnection() already opens the connection.
    using (var connection = CreateConnection())
    using (var command = connection.CreateCommand())
    {
        command.CommandText = query;
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
                yield return entityCreator(reader);
        }
    }
}
You'll still have to add support for parameters, but this is the pattern I'm trying to illustrate.
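One hedged sketch of adding that parameter support (passing name/value pairs this way is just one arbitrary choice):

private IEnumerable<T> ExecuteList<T>(string query,
    Func<IDataRecord, T> entityCreator,
    params KeyValuePair<string, object>[] parameters)
{
    using (var connection = CreateConnection())
    using (var command = connection.CreateCommand())
    {
        command.CommandText = query;
        foreach (var p in parameters)
        {
            // AddWithValue infers the SQL type from the .NET value.
            command.Parameters.AddWithValue(p.Key, p.Value);
        }
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
                yield return entityCreator(reader);
        }
    }
}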
What I typically do is use a custom class that I wrote a while back, which accepts a SQL string and, optionally, a list of parameters, and returns a DataTable. Since the thing that changes between invocations is typically just the SQL, that is optimal IMHO.
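A minimal sketch of that idea (the names here are illustrative, and a connectionString field is assumed):

public DataTable GetDataTable(string sql, params SqlParameter[] parameters)
{
    var table = new DataTable();
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddRange(parameters);
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            // DataTable.Load pulls the entire result set into memory.
            table.Load(reader);
        }
    }
    return table;
}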
If you truly do need to use a DataReader you can do something like this:
public static void ExecuteWithDataReader(string sql, Action<SqlDataReader> stuffToDo) {
    using (SqlConnection connection = new SqlConnection(connectionString)) {
        using (SqlCommand command = new SqlCommand(sql, connection)) {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader()) {
                try {
                    while (reader.Read()) {
                        stuffToDo(reader);
                    }
                }
                finally {
                    reader.Close();
                }
            }
        }
    }
}
private static void ReadOrderData(string connectionString) {
    string sql = "SELECT OrderID, CustomerID FROM dbo.Orders;";
    ExecuteWithDataReader(sql, r => Console.WriteLine(String.Format("{0}, {1}", r[0], r[1])));
}
The first two lines are the most important thing you need...
but if you still wish to factor them out, you can turn them into a database handler class. Yes, it will be more code, but in refactoring terms, everything moves to its related place...
Try writing a singleton class that receives a command, performs the action, and returns a result of type SqlDataReader...
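A rough sketch of that handler, assuming the caller disposes the returned reader (CommandBehavior.CloseConnection ties the connection's lifetime to the reader's):

public sealed class DatabaseHandler
{
    private static readonly DatabaseHandler instance = new DatabaseHandler();
    public static DatabaseHandler Instance { get { return instance; } }

    private readonly string connectionString = "..."; // placeholder

    private DatabaseHandler() { }

    public SqlDataReader Execute(SqlCommand command)
    {
        var connection = new SqlConnection(connectionString);
        command.Connection = connection;
        connection.Open();
        // Closing the returned reader also closes the underlying connection.
        return command.ExecuteReader(CommandBehavior.CloseConnection);
    }
}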
Doing this in comments was too much.
I would suggest that the boilerplate code around
using (conn = new SqlConnection(...))
using (cmd = new SqlCommand(...))
{
    // blah blah blah
}
isn't something to be removed lightly; instead, I would encourage you to keep it exactly where it is. Resources, especially unmanaged ones, should be opened and released at the closest point to execution possible, IMHO. This is in no small part due to the ease with which other developers will fail to follow the appropriate clean-up conventions.
If you do something like
private SqlConnection CreateConnection()
{
    var connection = new SqlConnection(_connectionString);
    connection.Open();
    return connection;
}
Then you are inviting another programmer to call this method and completely fail to release the resource as soon as the query is executed. I don't know what kind of app you are building, but in a web app such a thing will lead to memory / connection / resource errors of types that are difficult to debug, unless you've been through it before.
Instead, I'd suggest you look into a lightweight ORM such as Dapper.net or similar to see how they approached it. I don't use dapper, but I hear it's pretty good. The reason I don't use it is simply that we don't allow inline sql to be executed against our databases (but that's a very different conversation).
Here's our standard:
public static DataTable StatisticsGet(Guid tenantId) {
    DataTable result = new DataTable();
    result.Locale = CultureInfo.CurrentCulture;

    Database db = DatabaseFactory.CreateDatabase(DatabaseType.Clients.ToString());

    using (DbCommand dbCommand = db.GetStoredProcCommand("reg.StatsGet")) {
        db.AddInParameter(dbCommand, "TenantId", DbType.Guid, tenantId);
        result.Load(db.ExecuteReader(dbCommand));
    } // using dbCommand

    return result;
} // method::StatisticsGet
We make heavy use of Enterprise Library. It's short, simple, to the point, and very well tested. This method just returns a DataTable, but you could easily have it return an object collection... or nothing.
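If you did want an object collection instead of the DataTable, one hypothetical sketch (the Stat class and the column names are assumptions; match them to reg.StatsGet's actual result set):

public class Stat
{
    public Guid TenantId { get; set; }
    public int Count { get; set; }
}

public static List<Stat> StatisticsGetTyped(Guid tenantId)
{
    var stats = new List<Stat>();
    foreach (DataRow row in StatisticsGet(tenantId).Rows)
    {
        stats.Add(new Stat
        {
            TenantId = (Guid)row["TenantId"], // assumed column name
            Count = (int)row["Count"]         // assumed column name
        });
    }
    return stats;
}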
I am working on an existing application. This application reads data from a huge file and then, after doing some calculations, it stores the data in another table.
But the loop doing this (see below) is taking a really long time. Since the file sometimes contains 1,000s of records, the entire process takes days.
Can I replace this foreach loop with something else? I tried using Parallel.ForEach, and it did help. I am new to this, so I would appreciate your help.
foreach (record someRecord in someReport.r)
{
    try
    {
        using (var command = new SqlCommand("[procname]", sqlConn))
        {
            command.CommandTimeout = 0;
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add(…);

            IAsyncResult result = command.BeginExecuteReader();
            while (!result.IsCompleted)
            {
                System.Threading.Thread.Sleep(10);
            }
            command.EndExecuteReader(result);
        }
    }
    catch (Exception e)
    {
        …
    }
}
After reviewing the answers, I removed the async call and edited the code as shown below. But this did not improve performance.
using (command = new SqlCommand("[sp]", sqlConn))
{
    command.CommandTimeout = 0;
    command.CommandType = CommandType.StoredProcedure;

    foreach (record someRecord in someReport.r)
    {
        command.Parameters.Clear();
        command.Parameters.Add(....);
        command.Prepare();

        using (dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                if ()
                {
                }
                else if ()
                {
                }
            }
        }
    }
}
Instead of hitting the SQL connection so many times in a loop, have you considered extracting the whole set of data from SQL Server and processing it via a dataset?
Edit: I decided to further explain what I meant.
You can do the following; in pseudo-code (a sketch follows these steps):
Use a select * to get all the information from the database and store it in a list of the class, or in a dictionary.
Do your foreach (record someRecord in someReport) and do the condition matching as usual.
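A hedged sketch of that approach, assuming the table fits comfortably in memory (the Row class and the Hour properties are hypothetical placeholders for the real schema):

// Hypothetical shape of one database row; match it to the real schema.
class Row
{
    public int Hour { get; set; }
    // ... other columns
}

// Load everything once up front.
var rows = new List<Row>();
using (var command = new SqlCommand("select * from table", sqlConn))
using (var reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        rows.Add(new Row { Hour = reader.GetInt32(reader.GetOrdinal("hour")) });
    }
}

// Then match in memory instead of re-querying per record.
foreach (record someRecord in someReport.r)
{
    var matches = rows.Where(r => r.Hour == someRecord.Hour); // Hour is assumed
    // do some stuff with matches
}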
Step 1: Ditch the attempt at async. It isn't implemented properly and you're blocking anyway, so just execute the procedure and see if that helps.
Step 2: Move the SqlCommand outside of the loop and reuse it for each iteration. That way you don't incur the cost of creating and destroying it for every item in your loop.
Warning: Make sure you reset/clear/remove parameters you don't need from the previous iteration. We did something like this with optional parameters and had 'bleed-thru' from the previous iteration because we didn't clean up parameters we didn't need!
Your biggest problem is that you're looping over this:
IAsyncResult result = command.BeginExecuteReader();
while (!result.IsCompleted)
{
    System.Threading.Thread.Sleep(10);
}
command.EndExecuteReader(result);
The entire idea of the asynchronous model is that the calling thread (the one doing this loop) should spin up ALL of the asynchronous tasks using the Begin method before starting to collect results with the End method. If you use Thread.Sleep() within your main calling thread to wait for an asynchronous operation to complete (as you do here), you're doing it wrong, and what ends up happening is that each command is called and then waited for, one at a time, before the next one starts.
Instead, try something like this:
public void BeginExecutingCommands(Report someReport)
{
    foreach (record someRecord in someReport.r)
    {
        var command = new SqlCommand("[procname]", sqlConn);
        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.Add(…);

        command.BeginExecuteReader(ReaderExecuted,
            new object[] { command, someReport, someRecord });
    }
}

void ReaderExecuted(IAsyncResult result)
{
    var state = (object[])result.AsyncState;
    var command = state[0] as SqlCommand;
    var someReport = state[1] as Report;
    var someRecord = state[2] as Record;

    try
    {
        using (SqlDataReader reader = command.EndExecuteReader(result))
        {
            // work with reader, command, someReport and someRecord to do what you need.
        }
    }
    catch (Exception ex)
    {
        // handle exceptions that occurred during the async operation here
    }
}
In SQL Server, on the other end of a write there is a (single) disk. You can rarely write faster in parallel; in fact, parallelism often slows the load down due to index fragmentation. If you can, sort the data by the primary (clustered) key prior to loading. For a big load, even consider disabling the other indexes, loading the data, and then rebuilding the indexes.
I'm not really sure what you were doing with the async version, but it was certainly not doing what you expected, as it was waiting on itself.
try
{
    using (var command = new SqlCommand("[procname]", sqlConn))
    {
        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;

        foreach (record someRecord in someReport.r)
        {
            command.Parameters.Clear();
            command.Parameters.Add(…);

            using (var rdr = command.ExecuteReader())
            {
                while (rdr.Read())
                {
                    …
                }
            }
        }
    }
}
catch (…)
{
    …
}
As we were talking about in the comments, storing this data in memory and working with it there may be a more efficient approach.
So one easy way to do that is to start with Entity Framework. Entity Framework will automatically generate the classes for you based on your database schema. Then you can import a stored procedure which holds your SELECT statement. The reason I suggest importing a stored proc into EF is that this approach is generally more efficient than doing your queries in LINQ against EF.
Then run the stored proc and store the data in a List like this...
var data = db.MyStoredProc().ToList();
Then you can do anything you want with that data. Or as I mentioned, if you're doing a lot of lookups on primary keys then use ToDictionary() something like this...
var data = db.MyStoredProc().ToDictionary(k => k.MyPrimaryKey);
Either way, you'll be working with your data in memory at this point.
It seems executing your SQL command puts a lock on some required resources, and that's what forced you to use the async methods (my guess).
If the database is not otherwise in use, try getting exclusive access to it. If even then there are internal transactions, due to data-model complexity, consider consulting the database designer.
The goal is simple: roll back data inserted by a unit test. Here is how it goes. In a unit test, a method is called that creates a new connection and inserts some data. After that, the unit test creates a new connection and tries to find what has been inserted, and asserts that. I was hoping to wrap these two things with a TransactionScope, not call Complete, and see the inserted data rolled back. That's not happening. Am I doing something wrong, or am I just missing the point?
using (new TransactionScope())
{
    // call a method that inserts data
    var target = new ....
    target.DoStuffAndEndupWithDataInDb();

    // Now assert what has been added.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = conn.CreateCommand())
    {
        // Just read the data from the DB
        cmd.CommandText = "SELECT...";
        conn.Open();

        int count = 0;
        using (var rdr = cmd.ExecuteReader())
        {
            // Read records here
            ...
            count++;
        }

        // Expecting, say, 3 records here
        Assert.AreEqual(3, count);
    }
}
EDIT: I don't think I had DTC running and configured on my machine, so I started the service and tried to configure DTC, but I am getting this error.
Are you using MSTest? Then you can use MsTestExtensions.
Your unit test needs to derive from MSTestExtensionsTestFixture, and your test needs to have the TestTransaction attribute; it uses AOP to automatically start a transaction and roll it back.
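Based purely on the names mentioned above, usage would presumably look something like this (a sketch, not the library's documented API):

[TestClass]
public class MyDbTests : MSTestExtensionsTestFixture
{
    [TestMethod]
    [TestTransaction] // a transaction starts before the test and is rolled back afterwards
    public void DoStuff_InsertsThreeRecords()
    {
        // call DoStuffAndEndupWithDataInDb() and assert against the database here;
        // the rollback happens automatically after the test
    }
}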
I don't think you're missing the point but just attacking the problem incorrectly.
In NUnit terms, the concepts are [SetUp] and [TearDown] methods. You've already defined the setup method in your description and your tear down method should just undo what the setup method did (assuming what you're unit testing has no residual side effects).
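A minimal sketch of that shape in NUnit (the SQL strings are placeholders; the teardown must mirror exactly what the setup inserted):

[TestFixture]
public class OrderTests
{
    [SetUp]
    public void InsertTestData()
    {
        Execute("INSERT INTO MyTable VALUES (1),(2),(3)"); // placeholder SQL
    }

    [TearDown]
    public void RemoveTestData()
    {
        // Undo exactly what SetUp did.
        Execute("DELETE FROM MyTable WHERE Id IN (1,2,3)"); // placeholder SQL
    }

    private static void Execute(string sql)
    {
        using (var conn = new SqlConnection("..."))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}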
Do you have the Distributed Transaction Coordinator properly configured? This is a big gotcha when trying to use TransactionScope like this... if it isn't configured, sometimes you'll get an error, but other times the transaction will just commit instead of rolling back.
I'd recommend looking at this article, which shows you all the various steps that need to be done in order to roll back your unit tests using MSDTC.
Your code should work as you expect. How are you adding data in DoStuffAndEndupWithDataInDb()? I'm wondering whether the data initialization is not participating in the transaction.
For reference, the following console application correctly outputs 3 rows, and does not commit the rows to the database (checked using SSMS).
public class Program
{
    private static void Main(string[] args)
    {
        using (var trx = new TransactionScope())
        {
            InitializeData();

            using (var connection = new SqlConnection("server=localhost;database=Test;integrated security=true"))
            using (var command = connection.CreateCommand())
            {
                command.CommandText = "select count(*) from MyTable";
                connection.Open();
                Console.WriteLine("{0} rows", command.ExecuteScalar());
            }
        }
        Console.ReadLine();
    }

    private static void InitializeData()
    {
        using (var connection = new SqlConnection("server=localhost;database=Test;integrated security=true"))
        using (var command = connection.CreateCommand())
        {
            command.CommandText = "insert into MyTable values (1),(2),(3)";
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}