TL;DR: I have an ASP.NET Core 5.0 API hosted on AWS. It makes a large call to an MSSQL DB to return ~1-4k rows of data. A single request is fine, taking ~500 ms, but when multiple requests come in at about the same time (4-5), each request slows to ~2,000 ms. What's going on?
There's not much more to state than what I have above. I open a connection to our DB then initialize a SqlCommand.
using (var connection = new SqlConnection(dbConnection))
{
    connection.Open();
    using (SqlCommand command = new SqlCommand(strSQLCommand, connection))
    { ... }
}
I've tried both filling a DataTable with SqlDataAdapter and using a SqlDataReader to fill a custom object, and I get similar slowdowns either way. As stated above, the query returns ~1-4k rows of data of varying types, and Postman reports the returned JSON is about 1.95 MB after decompression. The slowdown only occurs when multiple requests come in around the same time. I don't know if it's having trouble with multiple connections to the DB, or if it's about the size of the data and available memory. Paging isn't an option; the request needs to return that much data.
This all occurs within an HttpGet function:
[HttpGet]
[Route("Foo")]
[Consumes("application/json")]
[EnableCors("DefaultPolicy")]
public IActionResult Foo([FromHeader] FooRequest request)
{
    ///stuff
    DataTable dt = new DataTable();
    using (var connection = new SqlConnection(_dataDBConnection))
    {
        timer.Start();
        connection.Open();
        using (SqlCommand command = new SqlCommand(
            @"SELECT foo.name, bar.first, bar.second, bar.third, bar.fourth
            FROM dbo.foo with(nolock)
            JOIN dbo.bar with(nolock) ON bar.name = foo.name
            WHERE bar.date = @date", connection))
        {
            command.Parameters.AddWithValue("@date", request.Date.ToString("yyyyMMdd"));
            using (SqlDataAdapter adapter = new SqlDataAdapter(command))
            {
                adapter.Fill(dt);
            }
        }
        timer.Stop();
        long elapsed = timer.ElapsedMilliseconds;
    }
    ///Parse the data from the DataTable into a List<object> and return.
    ///I've also used a DataReader to put the data directly into the List<object> but experienced the same slowdown.
    ///response is a class containing an array of objects holding all the data from the SQL request.
    return new JsonResult(response);
}
Any insights would be appreciated!
--EDIT AFTER ADDITIONAL TESTING--
[HttpGet]
[Route("Foo")]
[Consumes("application/json")]
[EnableCors("DefaultPolicy")]
public IActionResult Foo([FromHeader] FooRequest request)
{
    ///stuff
    using (var connection = new SqlConnection(_dataDBConnection))
    {
        connection.Open();
        ///This runs significantly faster
        using (SqlCommand command = new SqlCommand(@"dbo.spGetFoo", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@date", request.Date.ToString("yyyyMMdd"));
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    ///Add data to list to be returned
                }
            }
        }
    }
    ///Parse the data into a List<object> and return.
    ///I've also used a DataReader to put the data directly into the List<object> but experienced the same slowdown.
    ///response is a class containing an array of objects holding all the data from the SQL request.
    return new JsonResult(response);
}
--FINAL EDIT PLEASE READ--
People seem to be getting caught up on the DataAdapter and Fill portion instead of reading the full post. So, I'll include a final example here that exhibits the same issue as above.
[HttpGet]
[Route("Foo")]
[Consumes("application/json")]
[EnableCors("DefaultPolicy")]
public async Task<IActionResult> Foo([FromHeader] FooRequest request)
{
    ///stuff
    using (var connection = new SqlConnection(_dataDBConnection))
    {
        await connection.OpenAsync();
        ///This runs significantly faster
        using (SqlCommand command = new SqlCommand(@"dbo.spGetFoo", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@date", request.Date.ToString("yyyyMMdd"));
            using (SqlDataReader reader = await command.ExecuteReaderAsync())
            {
                while (await reader.ReadAsync())
                {
                    ///Add data to list to be returned
                }
            }
        }
    }
    ///Parse the data into a List<object> and return.
    ///response is a class containing an array of objects holding all the data from the SQL request.
    return new JsonResult(response);
}
First thing to note here is that your action method is not asynchronous. Second thing to note is that using adapters to fill DataSets is something I hadn't seen for years. Use Dapper! Finally, that call to the adapter's Fill() method is synchronous. Move to Dapper and use asynchronous calls to maximize your ASP.NET throughput.
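A minimal sketch of what that could look like for the action above, reusing the question's names (_dataDBConnection, dbo.spGetFoo); FooRow is a hypothetical POCO standing in for the row shape:

[HttpGet]
[Route("Foo")]
[EnableCors("DefaultPolicy")]
public async Task<IActionResult> Foo([FromHeader] FooRequest request)
{
    using (var connection = new SqlConnection(_dataDBConnection))
    {
        // Dapper opens the connection if needed and executes asynchronously.
        // FooRow is a hypothetical POCO matching the proc's result columns.
        var rows = await connection.QueryAsync<FooRow>(
            "dbo.spGetFoo",
            new { date = request.Date.ToString("yyyyMMdd") },
            commandType: CommandType.StoredProcedure);
        return new JsonResult(rows);
    }
}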
I think your idea is correct; it shouldn't be a database problem.

I think Session can be one suspect. If you use ASP.NET Core Session in your application, requests are queued and processed one by one, so the last request can be held in the queue while the previous requests are being processed.
Another suspect can be bits of MVC running in your pipeline that bring in Session without you asking.
In addition, another possible reason is that all the threads in the ASP.NET Core thread pool are busy. In that case, a new thread has to be created to process a new request, which takes additional time.
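If thread pool starvation is the suspect, one quick way to test that theory is to raise the pool's minimums at startup; a minimal sketch, with arbitrary numbers:

using System.Threading;

// Raising the minimum lets the pool inject threads immediately under a
// burst of blocking requests, instead of throttling new-thread creation.
// The numbers are arbitrary; measure before settling on values.
ThreadPool.SetMinThreads(workerThreads: 200, completionPortThreads: 200);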
This is just my idea; other causes are possible. Hope it helps.
The reason this is slow is that the method is not async, which means threads are blocked. Since ASP.NET has a limited thread pool, it will be exhausted after a while, and then additional requests have to queue, which makes the system slow. All of this should be fixed by using the async/await pattern.

Since SqlDataAdapter does not provide any async methods, it could be easier to use a technology that provides them, e.g. EF Core. Otherwise you could start a new task for adapter.Fill, as sketched below; however, this is not a clean way of doing it.
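For completeness, a minimal sketch of that workaround, reusing the names from the question (_dataDBConnection, dbo.spGetFoo): the blocking Fill moves to a thread-pool thread, which frees the request thread but still occupies a worker for the duration of the query.

DataTable dt = await Task.Run(() =>
{
    var table = new DataTable();
    using (var connection = new SqlConnection(_dataDBConnection))
    using (var command = new SqlCommand("dbo.spGetFoo", connection))
    using (var adapter = new SqlDataAdapter(command))
    {
        command.CommandType = CommandType.StoredProcedure;
        // Fill opens and closes the connection itself if it's closed.
        adapter.Fill(table);
    }
    return table;
});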
Related
I am using Microsoft Enterprise Library with a MySQL database.
I am trying to call a database method asynchronously.
My method is like below:
public static async Task<DataTable> ExecuteDataSetWithParameterAsSourceAsync(this Database database, CommandDetail cmdDetail, params object[] parameterSource)
{
    List<int> outParamIndex = new List<int>();
    AsyncCallback cb = new AsyncCallback(EndExecuteReaderCallBack);
    DataTable dt = new DataTable();
    DbAsyncState state = BeginExecuteReader(database, cmdDetail.CommandText, cb, parameterSource);
    IDataReader reader = (IDataReader)state.State;
    dt.Load(reader);
    reader.Close();
    ...
    return await Task.FromResult(dt);
}
I am getting the error below:
{"The database type "MySqlDatabase" does not support asynchronous operations."}
Below is the complete stack image of the error.
My connection string is
<add name="VirtualCloudDB" providerName="EntLibContrib.Data.MySql" connectionString="database=test;uid=xxx;password=xxx;Data Source=test-instance; maximumpoolsize=3"/>
About the error
Oracle's Connector/.NET library didn't even allow asynchronous operations before v8.0. Even now, there are several quirks. It's better to use the independent, open-source MySqlConnector library.
If you absolutely must use Connector/.NET, upgrade to the latest version.
About the code (no history lesson)
Forget EntLib, especially DAAB. Even the docs say:
The Database class leverages the provider factory model from ADO.NET. A database instance holds a reference to a concrete DbProviderFactory object to which it forwards the creation of ADO.NET objects.
What you use isn't the real thing anyway, it's a community-supported clone of the official code that used to be stored in Codeplex. The only thing that is still in development is the Unity DI container.
Real async operations are available in ADO.NET and implemented by most providers. The database-agnostic, factory-based model of EntLib 1 was incorporated into ADO.NET 2 back in 2006. Entlib 2.0 DAAB is essentially a thin layer of convenience methods over ADO.NET 2.
ADO.NET 2 "Raw"
In ADO.NET 2.0 alone, the entire method can be replaced with:
async Task<DataTable> LoadProducts(string category)
{
    var sql = "select * from Products where category=@category";
    using (var connection = new MySqlConnection(_connStrFromConfig))
    using (var cmd = new MySqlCommand(sql, connection))
    {
        cmd.Parameters.AddWithValue("@category", category);
        await connection.OpenAsync();
        using (var reader = await cmd.ExecuteReaderAsync())
        {
            DataTable dt = new DataTable();
            dt.Load(reader);
            return dt;
        }
    }
}
Especially for MySQL, it's better to use the open-source MySqlConnector library than Oracle's official Connector/.NET.
ADO.NET 2 Factory model
ADO.NET 2 added abstract base classes and a factory model (based on DAAB 1, but easier) that allows using database-agnostic code as much as possible.
The previous code can be rewritten to use the provider factory as follows:
string _providerName = "MySqlConnector";

DbConnection CreateConnection()
{
    DbProviderFactory _factory = DbProviderFactories.GetFactory(_providerName);
    var connection = _factory.CreateConnection();
    connection.ConnectionString = _connStrFromConfig;
    return connection;
}
async Task<DataTable> LoadProducts(string category)
{
    var sql = "select * from Products where category=@category";
    using (DbConnection connection = CreateConnection())
    using (DbCommand cmd = connection.CreateCommand())
    {
        cmd.CommandText = sql;
        var param = cmd.CreateParameter();
        param.ParameterName = "@category";
        //The default is String, so we don't have to set it
        //param.DbType = DbType.String;
        param.Value = category;
        cmd.Parameters.Add(param);
        await connection.OpenAsync();
        using (var reader = await cmd.ExecuteReaderAsync())
        {
            DataTable dt = new DataTable();
            dt.Load(reader);
            return dt;
        }
    }
}
All that's needed to target e.g. SQL Server or Oracle is registering and using a different provider name.

The code can be simplified; for example, DbParameterCollection.AddRange can be used to add multiple parameters at once, as sketched below. That's still too much code by modern standards, though.
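As an illustration against the factory-based code above (the second parameter here is hypothetical):

// Create each parameter from the command, then add them all in one call.
var pCategory = cmd.CreateParameter();
pCategory.ParameterName = "@category";
pCategory.Value = category;

var pMinPrice = cmd.CreateParameter();   // hypothetical second parameter
pMinPrice.ParameterName = "@minPrice";
pMinPrice.Value = 10m;

cmd.Parameters.AddRange(new DbParameter[] { pCategory, pMinPrice });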
EntLib 2 DAAB - it's the same classes
EntLib 2 DAAB uses the same abstract classes. In fact, the Database class does little more than add convenience methods on top of them, e.g. methods to create a DbCommand, or to execute a query and return a reader or a DataSet.
If you didn't need parameters, you could write just:
DataTable LoadProducts(Database database)
{
    var sql = "select * from Products";
    var set = database.ExecuteDataSet(CommandType.Text, sql);
    return set.Tables[0];
}
Unfortunately, there's no way to combine a raw query and parameters. Back when EntLib 1 was created, it was thought that complex code should always live in a stored procedure. So while there's an ExecuteDataSet(string storedProcedureName, params object[] parameterValues), there's no equivalent for raw SQL.
And no Task-based async methods either. By 2010 EntLib was in support mode already.
Unfortunately again, there's no way to directly create a DbCommand from Database. Again, the assumption was that people would either execute raw SQL or call a stored procedure. There's a GetSqlStringCommand that accepts no parameters. There's also Database.ProviderFactory, which can be used to do everything manually and end up with the same code as raw ADO.NET.

Another possible option is to cheat: use Database.GetStoredProcCommand with positional parameters and change the CommandType:
async Task<DataTable> LoadProducts(Database database, string category)
{
    var sql = "select * from Products where category=@category";
    using (var cmd = database.GetStoredProcCommand(sql, category))
    {
        cmd.CommandType = CommandType.Text;
        using (var reader = await cmd.ExecuteReaderAsync())
        {
            DataTable dt = new DataTable();
            dt.Load(reader);
            return dt;
        }
    }
}
Dapper
With micro-ORM libraries like Dapper, the code can be reduced to:
async Task<IEnumerable<Product>> LoadProducts(string category)
{
    var sql = "select * from Products where category=@category";
    using (var connection = CreateConnection())
    {
        var products = await connection.QueryAsync<Product>(sql, new { category = category });
        return products;
    }
}
Dapper will open the connection if it's closed, execute the query asynchronously and map the results to the target object, in a single line of code. Parameters will be mapped by name from the parameter object.
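For instance, with a hypothetical Product POCO, a call site might look like this:

// Hypothetical POCO; Dapper maps result columns to properties by name.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Category { get; set; }
}

// "beverages" is just a sample value.
var products = await LoadProducts("beverages");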
When called without a type parameter, QueryAsync returns a list of dynamic objects:
async Task<IEnumerable<dynamic>> LoadProducts(string category)
{
    var sql = "select * from Products where category=@category";
    using (var connection = CreateConnection())
    {
        var products = await connection.QueryAsync(sql, new { category = category });
        return products;
    }
}
I need a little help here.
I'm facing a problem now with Oracle.ManagedDataAccess.Core.
I have a class written to centralize Oracle queries (Clases.Oracle()), which works perfectly, but with one query the memory usage rises to 1 GB. That alone is not a real problem, considering that the result set has about 260,000 rows in the worst case. The real problem is that the memory is never freed, and if I execute that query again it rises to 2 GB, which has been the upper limit so far.
I've tried adding GC.Collect() and GC.WaitForPendingFinalizers() with no results.
My command execution function in Clases.Oracle() is:
private DataTable ExecuteReader(string package, ref OracleParameter[] parametros, string owner)
{
    var dt = new DataTable();
    using (var cn = new OracleConnection(_connection_string))
    {
        using var cmd = cn.CreateCommand();
        try
        {
            cn.Open();
            cmd.CommandText = $"{owner}.{package}";
            cmd.CommandType = CommandType.StoredProcedure;
            foreach (var par in parametros)
            {
                cmd.Parameters.Add(par);
            }
            using var rdr = cmd.ExecuteReader(CommandBehavior.CloseConnection);
            dt.Load(rdr);
        }
        catch (Exception ex)
        {
            throw new Exceptions.OracleException(ex.Message);
        }
        finally
        {
            cn?.Close();
            cmd?.Dispose();
            cn?.Dispose();
        }
    }
    return dt;
}
I'm using using statements, so the objects are being disposed.
And I'm calling the connection with this function:
public List<AuditoriaUsuarios> ObtieneAuditoriaUsuarios(long incluyeCargoRol = 0)
{
    var ora = new Clases.Oracle();
    var param = new OracleParameter[]
    {
        ora.AddInParameter("PIN_INCLUYECARGOROL", OracleDbType.Decimal, incluyeCargoRol),
        ora.AddOutCursor("CUR_OUT"),
        ora.AddOutParameter("PON_CODE", OracleDbType.Decimal),
        ora.AddOutParameter("POV_ERROR", OracleDbType.Varchar2)
    };
    var result = ora.ExecuteReader<AuditoriaUsuarios>($"{_PCK}.p_AUDIT_USUARIOS", ref param);
    if (ora.HayError(param))
    {
        throw new Exceptions.OracleException(ora.CodigoError, ora.MensajeError);
    }
    //GC.Collect();
    //GC.WaitForPendingFinalizers();
    return result;
}
Clases.Oracle() doesn't need to be disposable itself: all the objects it uses are disposable and are being disposed, and its only other state is two strings, for the connection string and the database owner's name.
This is a memory usage dump from VS. You can see an Oracle-related object (OracleInternal.Common.ZoneValue) using a lot of memory long after ExecuteReader finished and the results were returned.
I don't know if I'm doing something wrong.
Edit:
I forgot: this is an ASP.NET Core x64 Web API, using .NET Core 3.1 and C#, with Visual Studio 2019 Enterprise.
Edit2:
I know it's dirty, but adding this to ObtieneAuditoriaUsuarios made things a little better. (In this case I don't care about CPU usage, because this data extraction is supposed to be executed a few times a week and is not part of everyday operation.)
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
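If the retained memory lives on the Large Object Heap, which is likely for the big arrays backing a 260,000-row DataTable, explicitly compacting the LOH once may let the process hand the memory back; a hedged sketch:

using System.Runtime;

// The LOH is swept but not compacted by default; request a one-off
// compaction on the next blocking full collection.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true);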
Edit3:
I sent 8 simultaneous requests and the memory usage rose to 3 GB in some tests. Yet it takes only one request with a filter that returns fewer than 100 rows for the memory usage to drop below 1 GB.
I'm trying to modify an existing database call so that it runs asynchronously. This is my first foray into asynchronous database calls, so I've looked at this post, this post, and the first two sections of this MSDN article. This is the code that I've come up with, which is similar to what's found in the second answer of the second post:
public async Task<IEnumerable<Item>> GetDataAsync(int id)
{
    using (SqlConnection conn = new SqlConnection(oCSB.ConnectionString))
    {
        using (SqlCommand cmd = new SqlCommand("stored_procedure", conn))
        {
            cmd.CommandType = System.Data.CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("param", "value");
            await conn.OpenAsync();
            using (SqlDataReader reader = await cmd.ExecuteReaderAsync())
            {
                return ReadItems(reader, id).ToList();
            }
        }
    }
}

private IEnumerable<Item> ReadItems(SqlDataReader reader, long id)
{
    while (reader.Read())
    {
        var item = new Item(id);
        yield return item;
    }
}
The application is a Web Forms application, and the call is initiated by a jQuery ajax request to a static WebMethod in an aspx page, which then calls the GetDataAsync method. Unfortunately, the application hangs on the cmd.ExecuteReaderAsync call with no exception thrown, and I haven't been able to figure out why. I've run it both on the VS dev server and on my local IIS 8, but I get the same result. I've also tried modifying it so that it makes a very simple select on a table. I've also tried changing the code based on other posts I've come across either on MSDN or SO. Anybody know what could possibly be causing it to hang on the ExecuteReaderAsync call?
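One classic cause, assuming the static WebMethod (not shown) blocks on the returned task with .Result or .Wait(): the await tries to resume on the ASP.NET SynchronizationContext that the blocked caller is holding, which deadlocks. A sketch of the same method with ConfigureAwait(false), which keeps continuations off that context:

public async Task<IEnumerable<Item>> GetDataAsync(int id)
{
    using (SqlConnection conn = new SqlConnection(oCSB.ConnectionString))
    using (SqlCommand cmd = new SqlCommand("stored_procedure", conn))
    {
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("param", "value");
        // ConfigureAwait(false) keeps continuations off the ASP.NET
        // SynchronizationContext, so a caller blocking with .Result can't deadlock.
        await conn.OpenAsync().ConfigureAwait(false);
        using (SqlDataReader reader = await cmd.ExecuteReaderAsync().ConfigureAwait(false))
        {
            return ReadItems(reader, id).ToList();
        }
    }
}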
I am working on an existing application. This application reads data from a huge file and then, after doing some calculations, stores the data in another table.

But the loop doing this (see below) is taking a really long time. Since the file sometimes contains thousands of records, the entire process takes days.

Can I replace this foreach loop with something else? I tried using Parallel.ForEach and it did help. I am new to this, so I will appreciate your help.
foreach (record someRecord in someReport.r)
{
    try
    {
        using (var command = new SqlCommand("[procname]", sqlConn))
        {
            command.CommandTimeout = 0;
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add(…);
            IAsyncResult result = command.BeginExecuteReader();
            while (!result.IsCompleted)
            {
                System.Threading.Thread.Sleep(10);
            }
            command.EndExecuteReader(result);
        }
    }
    catch (Exception e)
    {
        …
    }
}
After reviewing the answers, I removed the async calls and edited the code as below. But this did not improve performance.
using (command = new SqlCommand("[sp]", sqlConn))
{
    command.CommandTimeout = 0;
    command.CommandType = CommandType.StoredProcedure;
    foreach (record someRecord in someReport.r)
    {
        command.Parameters.Clear();
        command.Parameters.Add(....);
        command.Prepare();
        using (dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                if ()
                {
                }
                else if ()
                {
                }
            }
        }
    }
}
Instead of hitting the SQL connection so many times in a loop, have you considered extracting the whole set of data from SQL Server and processing it in memory?

Edit: I decided to further explain what I meant. You can do the following, in pseudocode:

Use a SELECT * to get all the information from the database and store it in a list of the class, or in a dictionary.

Do your foreach (record someRecord in someReport) and do the condition matching as usual.
Step 1: Ditch the attempt at async. It isn't implemented properly and you're blocking anyway. So just execute the procedure and see if that helps.

Step 2: Move the SqlCommand outside of the loop and reuse it for each iteration. That way you don't incur the cost of creating and destroying it for every item in your loop.

Warning: Make sure you reset/clear/remove parameters you don't need from the previous iteration. We did something like this with optional parameters and had 'bleed-through' from the previous iteration because we didn't clean up parameters we didn't need!
Your biggest problem is that you're looping over this:
IAsyncResult result = command.BeginExecuteReader();
while (!result.IsCompleted)
{
    System.Threading.Thread.Sleep(10);
}
command.EndExecuteReader(result);
The entire idea of the asynchronous model is that the calling thread (the one doing this loop) should be spinning up ALL of the asynchronous tasks using the Begin method before starting to work with the results with the End method. If you are using Thread.Sleep() within your main calling thread to wait for an asynchronous operation to complete (as you are here), you're doing it wrong, and what ends up happening is that each command, one at a time, is being called and then waited for before the next one starts.
Instead, try something like this:
public void BeginExecutingCommands(Report someReport)
{
    foreach (record someRecord in someReport.r)
    {
        var command = new SqlCommand("[procname]", sqlConn);
        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.Add(…);
        command.BeginExecuteReader(ReaderExecuted,
            new object[] { command, someReport, someRecord });
    }
}

void ReaderExecuted(IAsyncResult result)
{
    var state = (object[])result.AsyncState;
    var command = state[0] as SqlCommand;
    var someReport = state[1] as Report;
    var someRecord = state[2] as Record;
    try
    {
        using (SqlDataReader reader = command.EndExecuteReader(result))
        {
            // work with reader, command, someReport and someRecord to do what you need.
        }
    }
    catch (Exception ex)
    {
        // handle exceptions that occurred during the async operation here
    }
}
On the other end of a SQL write is a single disk. You can rarely write faster in parallel; in fact, parallelism often slows the write down due to index fragmentation. If you can, sort the data by the primary (clustered) key prior to loading. For a big load, even consider disabling the other indexes, loading the data, then rebuilding the indexes.

I'm not really sure what you were doing in the async code, but it was certainly not doing what you expected, as it was waiting on itself.
try
{
    using (var command = new SqlCommand("[procname]", sqlConn))
    {
        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;
        foreach (record someRecord in someReport.r)
        {
            command.Parameters.Clear();
            command.Parameters.Add(…);
            using (var rdr = command.ExecuteReader())
            {
                while (rdr.Read())
                {
                    …
                }
            }
        }
    }
}
catch (…)
{
    …
}
As we were talking about in the comments, storing this data in memory and working with it there may be a more efficient approach.
So one easy way to do that is to start with Entity Framework. Entity Framework will automatically generate the classes for you based on your database schema. Then you can import a stored procedure which holds your SELECT statement. The reason I suggest importing a stored proc into EF is that this approach is generally more efficient than doing your queries in LINQ against EF.
Then run the stored proc and store the data in a List like this...
var data = db.MyStoredProc().ToList();
Then you can do anything you want with that data. Or, as I mentioned, if you're doing a lot of lookups on primary keys, use ToDictionary(), something like this...
var data = db.MyStoredProc().ToDictionary(k => k.MyPrimaryKey);
Either way, you'll be working with your data in memory at this point, as sketched below.
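As a rough sketch of that in-memory matching (the record's Key property and the calculation are hypothetical stand-ins):

// Load once, then match each record against the dictionary in memory
// instead of issuing a stored-procedure call per record.
var data = db.MyStoredProc().ToDictionary(k => k.MyPrimaryKey);

foreach (var someRecord in someReport.r)
{
    // someRecord.Key is an assumed property standing in for the real key.
    if (data.TryGetValue(someRecord.Key, out var row))
    {
        // apply the calculations using row, entirely in memory
    }
}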
It seems executing your SQL command puts a lock on some required resources, and that's what pushed you toward async methods (my guess).

If the database is not otherwise in use, try exclusive access to it. If even then there are internal transactions due to data-model complexity, consider consulting the database designer.
I have a long-running service with several threads calling the following method hundreds of times per second:
void TheMethod()
{
    using (var c = new SqlConnection("..."))
    {
        c.Open();
        var ret1 = PrepareAndExecuteStatement1(c, args1);
        // some code
        var ret2 = PrepareAndExecuteStatement2(c, args2);
        // more code
    }
}
PrepareAndExecuteStatement is something like this:
void PrepareAndExecuteStatement*(SqlConnection c, args)
{
    var cmd = new SqlCommand("query", c);
    cmd.Parameters.Add("@param", type);
    cmd.Prepare();
    cmd.Parameters["@param"].Value = args;
    return cmd.execute().read().etc();
}
I want to reuse the prepared statements, preparing once per connection and executing them until the connection breaks. I hope this will improve performance.
Can I use the built-in connection pool to achieve this? Ideally every time a new connection is made, all statements should be automatically prepared, and I need to have access to the SqlCommand objects of these statements.
I suggest taking a slightly modified approach: close your connection immediately after use. You can certainly re-use your SqlConnection.
The work being done at //some code may take a long time. Are you interacting with other network resources or disk resources, or spending any amount of time on calculations? Could you ever, in the future, need to do so? Perhaps the intervals between executing statements are, or could become, long enough that you'd want to reopen that connection. Regardless, the connection should be opened late and closed early.
using (var c = new SqlConnection("..."))
{
    c.Open();
    PrepareAndExecuteStatement1(c, args);
    c.Close();
    // some code
    c.Open();
    PrepareAndExecuteStatement2(c, args);
    c.Close();
    // more code
}
Open Late, Close Early, as described in MSDN Magazine by John Papa.
Obviously we've now got a bunch of code duplication here. Consider refactoring your Prepare...() method to perform the opening and closing operations.
Perhaps you'd consider something like this:
using (var c = new SqlConnection("..."))
{
    var cmd1 = PrepareAndCreateCommand(c, args);
    // some code
    var cmd2 = PrepareAndCreateCommand(c, args);
    c.Open();
    cmd1.ExecuteNonQuery();
    cmd2.ExecuteNonQuery();
    c.Close();
    // more code
}
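For reference, a minimal sketch of what PrepareAndCreateCommand might look like; the statement text and parameter are placeholders, and note that SqlCommand.Prepare() itself requires an open connection, so any actual Prepare() call has to wait until after c.Open():

static SqlCommand PrepareAndCreateCommand(SqlConnection c, string args)
{
    // Hypothetical statement; the real query comes from the caller's context.
    var cmd = new SqlCommand("update Items set Touched = 1 where Name = @param", c);
    cmd.Parameters.Add("@param", SqlDbType.NVarChar, 50).Value = args;
    return cmd;
}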