Unusual SQL/Data issues - C#

We have a report that has been giving us serious issues, so I decided to put it into a console application to troubleshoot.
The report is a simple single SELECT from SQL, returning approximately 25 columns,
and our date range can be 3-6 months, returning around 10k rows, so we are not talking about a lot of data.
Here is what's happening: when the report runs from our website it times out, and in the console it takes anywhere from 13-18 minutes to finish. The wait seems to happen at da.Fill(ds);
Now here is the strange thing: it runs in approximately 1-3 seconds within SQL Server Management Studio, and when our Delphi developers create a similar application it also takes only a few seconds to run. This only happens using .NET.
We tried changing from a DataSet to loading the results with a DataReader, using this code:
using (var dr = _command.ExecuteReader())
{
    if (dr.HasRows)
    {
        int i = 0;
        while (dr.Read())
        {
            var startRead = DateTime.Now;
            Console.Write("{2}\t{0}\t{1}\t", dr.GetInt32(0), dr.GetString(1), i);
            var tookRead = DateTime.Now.Subtract(startRead);
            Console.WriteLine("Took: " + tookRead);
            i++;
        }
    }
}
However, it did not help at all; the output just appears in chunks with frequent delays. I'm thinking it's SQL, but I can't explain why it works fine in Delphi and in SQL Server Management Studio.
I've tried .NET 2.0, 3.5 and 4; it happens on all frameworks.
Here is my code:
public static DataSet GetData()
{
    var now = DateTime.Now;
    var _command = new SqlCommand();
    var _connection = new SqlConnection();
    try
    {
        _connection.ConnectionString = connectionString;
        _command.Connection = _connection;
        _command.CommandText = storedProcedure;
        _command.CommandType = CommandType.StoredProcedure;
        _command.CommandTimeout = 60;

        if (string.IsNullOrEmpty(_connection.ConnectionString)) { throw new Exception("Connection String was not supplied"); }

        _command.Parameters.Add(new SqlParameter("DateFrom", dateFrom));
        _command.Parameters.Add(new SqlParameter("DateTo", dateTo));

        SqlDataAdapter da;
        var ds = new DataSet();

        _connection.Open();
        var done = DateTime.Now;

        da = new SqlDataAdapter(_command);
        da.Fill(ds);

        if (ds == null) { throw new Exception("DataSet is null."); }
        if (ds.Tables.Count == 0) { throw new Exception("Table count is 0"); }

        var took = done.Subtract(now);
        return ds;
    }
    catch (Exception ex)
    {
        File.WriteAllText(Path.Combine(Application.StartupPath, String.Format("Exception{0:MMddyyyy_HHmmss}.log", DateTime.Now)), ex.ToString());
    }
    finally
    {
        if (_connection.State != ConnectionState.Closed) { _connection.Close(); }
    }
    return null;
}
Any ideas? Our DBA is blaming the framework; I'm actually blaming something in SQL (maybe statistics, or a corrupted db).

Differences in SQL performance between .NET and other clients (such as SQL Server Management Studio) usually come down to the connections being configured differently; frequent culprits are SET options such as ANSI_NULLS and ANSI_PADDING.
Look at how the connection is configured in SQL Server Management Studio, then replicate the same settings in your .NET application.
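As a minimal sketch of what "replicate the same settings" could look like (this is an assumption, not the original answer: ARITHABORT is a commonly mismatched option, since SSMS turns it on by default while SqlClient does not; connectionString, storedProcedure, dateFrom and dateTo are the names from the question's code):
// Sketch only: force the suspected SET option on the connection before the report runs.
// Adjust the SET statement to whatever difference you actually find between SSMS and the app.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    using (var setOptions = new SqlCommand("SET ARITHABORT ON;", connection))
    {
        setOptions.ExecuteNonQuery(); // assumption: ARITHABORT is the mismatched option
    }

    using (var command = new SqlCommand(storedProcedure, connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.Add(new SqlParameter("DateFrom", dateFrom));
        command.Parameters.Add(new SqlParameter("DateTo", dateTo));

        var ds = new DataSet();
        using (var da = new SqlDataAdapter(command))
        {
            da.Fill(ds); // with matching session options this can reuse the plan SSMS gets
        }
    }
}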

The information you give doesn't contain enough detail to really help...
If SSMS is really that much faster, then the reason is most likely some session/connection setting; SSMS uses subtly different settings than .NET does.
For an explanation and hints on what could be different/wrong, see http://www.sommarskog.se/query-plan-mysteries.html
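Not part of the original answer, but one way to see exactly which settings your .NET session is running with is to query sys.dm_exec_sessions from the application's own connection and compare the values with the same query run in an SSMS window:
// Sketch: dump the current session's SET options so they can be diffed against SSMS.
// Run the same SELECT in SSMS (it also filters on @@SPID) and compare the flags.
const string optionsQuery =
    "SELECT ansi_nulls, ansi_padding, ansi_warnings, arithabort, " +
    "concat_null_yields_null, quoted_identifier " +
    "FROM sys.dm_exec_sessions WHERE session_id = @@SPID;";

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(optionsQuery, connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            for (int i = 0; i < reader.FieldCount; i++)
            {
                Console.WriteLine("{0} = {1}", reader.GetName(i), reader.GetValue(i));
            }
        }
    }
}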

Related

C# - Low memory when inserting 1 million rows of data into an Oracle DB - OutOfMemory Exception

The Problem:
I have a web application where people can upload xml, xmls, csv files.
I then take their content and insert it into my Oracle DB.
Technical details:
I recently ran into an OutOfMemoryException while trying to use the data.
The previous developer created a list of lists over the data in order to manage it; however, this is throwing an OutOfMemoryException.
We are using the LinqToExcel library.
Sample code:
excel = new ExcelQueryFactory(excelFile);
IEnumerable<RowNoHeader> data = from row in excel.WorksheetNoHeader(sheetName)
                                select row;

List<List<string>> d = new List<List<string>>(data.Count());
foreach (RowNoHeader row in data)
{
    List<string> list = new List<string>();
    foreach (Cell cell in row)
    {
        string cellValue = cell.Value.ToString().Trim(' ').Trim(null);
        list.Add(cellValue);
    }
    d.Add(list);
}
I tried changing the code and did this instead:
string connectionstring = string.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 12.0;HDR=YES;';", excelFile);
OleDbConnection connection = new OleDbConnection();
connection.ConnectionString = connectionstring;
OleDbCommand excelCommand = new OleDbCommand();
excelCommand.Connection = connection;
excelCommand.CommandText = String.Format("Select * FROM [{0}$]", sheetName);
connection.Open();

DataTable dtbl = CreateTable(TableColumns);
OleDbDataReader reader = excelCommand.ExecuteReader();
while (reader.Read())
{
    DataRow row = dtbl.NewRow();
    dtbl.Rows.Add(row);
}
using (OracleCommand command = new OracleCommand(selectCommand, _oracleConnection))
{
    using (OracleDataAdapter adapter = new OracleDataAdapter(command))
    {
        using (OracleCommandBuilder builder = new OracleCommandBuilder(adapter))
        {
            OracleTransaction trans = _oracleConnection.BeginTransaction();
            command.Transaction = trans;
            adapter.InsertCommand = builder.GetInsertCommand(true);
            adapter.Update(dtbl);
            trans.Commit();
        }
    }
}
However, I still get the same OutOfMemoryException.
I have read online that I should make my project x64 and use the following:
<runtime>
    <gcAllowVeryLargeObjects enabled="true" />
</runtime>
However, I can't change my web application to run on x64.
My solution was to process the data in batches, like this:
int rowCount = 0;
while (reader.Read())
{
    DataRow row = dtbl.NewRow();
    dtbl.Rows.Add(row);
    rowCount++; // without this increment the batch check below never fires

    if (rowCount % _batches == 0 && rowCount != 0)
    {
        DBInsert(dtbl, selectCommand);
        dtbl = CreateTable(TableColumns);
    }
}

// insert any remaining rows from the last, partial batch
if (dtbl.Rows.Count > 0)
{
    DBInsert(dtbl, selectCommand);
}

private void DBInsert(DataTable dt, string selectCommand)
{
    using (OracleCommand command = new OracleCommand(selectCommand, _oracleConnection))
    {
        using (OracleDataAdapter adapter = new OracleDataAdapter(command))
        {
            using (OracleCommandBuilder builder = new OracleCommandBuilder(adapter))
            {
                OracleTransaction trans = _oracleConnection.BeginTransaction();
                command.Transaction = trans;
                adapter.InsertCommand = builder.GetInsertCommand(true);
                adapter.Update(dt);
                trans.Commit();
            }
        }
    }
}
It works, but it is very slow. I was wondering whether there is a way to either solve the memory problem serially or write to memory in parallel.
I have tried inserting the data in parallel using threads, but that takes a lot of memory and throws an OutOfMemoryException as well.
Just don't load 1M rows into a DataTable. Use whatever bulk import mechanism is available to load a stream of rows. Oracle, like SQL Server, offers several ways to bulk import data.
Collections like List or DataTable use an internal buffer to store data that they reallocate when it fills up, using twice the original size. With 1M rows that leads to a lot of reallocations and a lot of memory fragmentation. The runtime may no longer be able to even find a contiguous block of memory large enough to store 2M entries. That's why it's important to set the capacity parameter when creating a new List.
Apart from that, it doesn't serve any purpose to load everything in memory and then send it to the database. It's actually faster to send the data as soon as each file is read, or as soon as a sufficiently large number is loaded. Instead of trying to load 1M rows at once, read 500 or 1000 of them each time and send them to the database.
Furthermore, Oracle's ADO.NET provider includes the OracleBulkCopy class, which works in a way similar to SqlBulkCopy for SQL Server. Its WriteToServer method can accept a DataTable or a DataReader. You can use the DataTable overload to send batches of items. An even better idea is to use the overload that accepts a reader and let the class collect the batches and send them to the database.
For example:
using (var bcp = new OracleBulkCopy(connectionString))
{
    bcp.BatchSize = 5000;
    bcp.DestinationTableName = "MyTable";
    // For each source/target column pair, add a mapping
    bcp.ColumnMappings.Add("ColumnA", "ColumnA");

    var reader = excelCommand.ExecuteReader();
    bcp.WriteToServer(reader);
}

C# - DataTable Out of Memory exception in application to catch SQL Server "INSERT" events

I have been tasked with creating an application that monitors any "INSERT" events on a specific table. I was going to do this with SqlDependency to create a notification link between the DB and the C# app, but it turns out I am not able to because of security restrictions.
Because of this, I have modeled my application as shown in my flowchart (linked below).
This is all well and good, but as it turns out, the SQL table I am querying is rather large: nearly 3.5 million rows and 55 columns. When loading it into the C# DataTable object, I get an out-of-memory exception.
internal static DataTable ExecuteQuery(string query, Dictionary<string, string> parameters = null)
{
    try
    {
        using (SqlConnection dbconn = new SqlConnection(SQLServer.Settings.ConnectionString))
        using (SqlCommand cmd = new SqlCommand())
        {
            dbconn.Open();           // Open the connection
            cmd.CommandText = query; // Set the query text
            cmd.Connection = dbconn;
            if (parameters != null)
            {
                foreach (var parameter in parameters) // Add filter parameters
                    cmd.Parameters.AddWithValue(parameter.Key, parameter.Value);
            }
            var dt = new DataTable();
            using (SqlDataAdapter adpt = new SqlDataAdapter(cmd)) { adpt.Fill(dt); } // MY ERROR OCCURS HERE!
            dbconn.Close();
            queryError = false;
            return dt;
        }
    }
    catch (Exception ex)
    {
        queryError = true;
        EventLogger.WriteToLog("ExecuteQuery()", "Application", "Error: An error has occured while performing a database query.\r\nException: " + ex.Message);
        return null;
    }
}
When running the code above, I get the following error at the SqlDataAdapter.Fill(dt) line:
Exception of type 'System.OutOfMemoryException' was thrown.
Is there a way I can either restructure my application or prevent this incredibly high memory consumption by the DataTable class? SQL Server seems perfectly capable of doing a SELECT * from the table, but when I fill a DataTable with the same data I use up over 6 GB of RAM! Why is there so much overhead when using a DataTable?
Here is a link to my flowchart.
I was able to resolve this issue by making use of the SqlDataReader class. This class lets you "stream" the SQL result set row by row rather than bringing back the entire result set at once and loading it into memory.
So now in step 5 from the flowchart, I can query for only the very first row. Then in step 6, I can query again at a later date and iterate through the new result set one row at a time until I find the original row I started at. All the while, I am filling a DataTable with the new results. This accomplishes two things:
I don't need to load all the data from the query into local memory at once.
I can immediately get the "inverse" DataSet, i.e. the newly inserted rows that didn't exist the first time I checked.
Which is exactly what I was after. Here is just a portion of the code:
private static SqlDataReader reader;
private static SqlConnection dbconn = new SqlConnection(SQLServer.Settings.ConnectionString);

private void GetNextRows(int numRows)
{
    if (dbconn.State != ConnectionState.Open)
        OpenConnection();

    // Read rows one at a time up to the specified limit
    // (stop early if the reader runs out of rows, so the loop cannot spin forever).
    int rowCnt = 0;
    while (rowCnt < numRows && reader.Read())
    {
        object[] row = new object[reader.FieldCount];
        reader.GetValues(row);
        resultsTable.LoadDataRow(row, LoadOption.PreserveChanges);
        rowCnt++;
        sessionRowPosition++;
    }
}
The whole class would be too large to post here, but one of the caveats was that the interval between checks was long for me, on the order of days, so I needed to close the connection between checks. When you close the connection, the SqlDataReader loses its row position, so I needed to add a counter to keep track of that.
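Not part of the original answer, but a sketch of the same resume-between-checks idea without a manual row counter: if the table has an ever-increasing key (an assumption; the MyTable and Id names below are placeholders), the position can be resumed with a WHERE clause instead of re-reading from the start:
// Sketch only: resume from the last key seen on the previous check instead of counting rows.
// Assumes an ever-increasing key column; "MyTable" and "Id" are placeholder names.
private static long lastSeenId = 0;

private static void LoadNewRows(SqlConnection dbconn, DataTable resultsTable)
{
    const string sql = "SELECT * FROM MyTable WHERE Id > @lastId ORDER BY Id;";
    using (var cmd = new SqlCommand(sql, dbconn))
    {
        cmd.Parameters.AddWithValue("@lastId", lastSeenId);
        using (var rdr = cmd.ExecuteReader())
        {
            while (rdr.Read())
            {
                object[] row = new object[rdr.FieldCount];
                rdr.GetValues(row);
                resultsTable.LoadDataRow(row, LoadOption.PreserveChanges);
                lastSeenId = rdr.GetInt64(rdr.GetOrdinal("Id")); // remember position across checks
            }
        }
    }
}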
Check your SELECT query. You are probably getting far more rows back from the database than you need.

How can I log query time?

I have a few OleDb connections like this:
try
{
    OleDbConnection Connection8;
    using (Connection8 = new OleDbConnection("Provider=MSDAORA.1;Data Source=DATABASE:1521/orcl;Persist Security Info=True;Password=PASSWORD;User ID=USERNAME;"))
    {
        string sqlQuery = "select * from TABLE";
        using (OleDbDataAdapter cmd = new OleDbDataAdapter(sqlQuery, Connection8))
        {
            Connection8.Open();
            DataTable dt = new DataTable();
            cmd.Fill(dt);
            GridView5.DataSource = dt;
            GridView5.DataBind();
            v8 = 1;
            Connection8.Close();
        }
    }
}
catch (Exception)
{
    v8 = 0;
}
Some connections wait a very long time, but I can't tell which one.
How can I log or see the query time for every connection? Any suggestions? Thank you.
You can use Stopwatch:
var stopwatch = new Stopwatch();
DataTable dt = new DataTable();
stopwatch.Start();
Connection8.Open();
cmd.Fill(dt);
stopwatch.Stop();
var timeElapsed = stopwatch.ElapsedMilliseconds;
Note that in this sample the time needed to open the connection is included in the measured time. If you don't want that and need the "pure" query execution time, just swap the order of the lines where the connection is opened and the stopwatch is started.
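If you want a per-connection log rather than ad-hoc measurements, here is a sketch I'm adding (the QueryTimer/FillTimed names and the query-times.log file are made up) that wraps each Fill in a Stopwatch and appends one labeled line per query, so the slow connection is easy to spot:
using System;
using System.Data;
using System.Data.OleDb;
using System.Diagnostics;
using System.IO;

static class QueryTimer
{
    // Times a single adapter.Fill and appends one labeled line to a log file.
    public static DataTable FillTimed(string label, string connectionString, string sqlQuery)
    {
        var stopwatch = Stopwatch.StartNew();
        var dt = new DataTable();
        using (var connection = new OleDbConnection(connectionString))
        using (var adapter = new OleDbDataAdapter(sqlQuery, connection))
        {
            connection.Open();
            adapter.Fill(dt);
        }
        stopwatch.Stop();
        File.AppendAllText("query-times.log",
            string.Format("{0}: {1} ms{2}", label, stopwatch.ElapsedMilliseconds, Environment.NewLine));
        return dt;
    }
}

// Usage, e.g. for the eighth grid from the question:
// GridView5.DataSource = QueryTimer.FillTimed("Connection8", connectionString8, "select * from TABLE");
// GridView5.DataBind();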
I don't know if this will work because you are using an OleDbConnection, but one thing you may be able to do is open the "ODBC Administrator" in Control Panel (be sure to check whether you want the 32-bit or 64-bit version); there is a "Tracing" tab you can turn on that gives you a log file of all ODBC requests that are processed.
But remember, as I said, because you are using an OleDbConnection it may not log anything.
I've used Glimpse in the past; it might be useful for you as well:
http://getglimpse.com/

DataAdapter.Update() performance

I have a relatively simple routine that looks at database entries for media files, calculates the width, height and file size, and writes them back into the database.
The database is SQLite, using the System.Data.SQLite library, processing ~4000 rows. I load all rows into an ADO table, update the rows/columns with the new values, then run adapter.Update(table); on it.
Loading the dataset from the db takes half a second or so; updating all the rows with image width/height and getting the file length from FileInfo took maybe 30 seconds. Fine.
The adapter.Update(table); command took somewhere in the vicinity of 5 to 7 minutes to run.
That seems awfully excessive. The ID is an INTEGER primary key and thus, according to SQLite's docs, inherently indexed; even so, I can't help but think that if I were to run a separate update command for each individual update, it would have completed much faster.
I had considered ADO/adapters to be relatively low level (as opposed to ORMs anyway), and this terrible performance surprised me. Can anyone shed some light on why it would take 5-7 minutes to update a batch of ~4000 records against a locally placed SQLite database?
As a possible aside, is there some way to "peek into" how ADO is processing this? Internal library step-throughs, or...?
Thanks
public static int FillMediaSizes() {
    // returns the count of records updated
    int recordsAffected = 0;

    DataTable table = new DataTable();
    SQLiteDataAdapter adapter = new SQLiteDataAdapter();

    using (SQLiteConnection conn = new SQLiteConnection(Globals.Config.dbAppNameConnectionString))
    using (SQLiteCommand cmdSelect = new SQLiteCommand())
    using (SQLiteCommand cmdUpdate = new SQLiteCommand()) {

        cmdSelect.Connection = conn;
        cmdSelect.CommandText =
            "SELECT ID, MediaPathCurrent, MediaWidth, MediaHeight, MediaFilesizeBytes " +
            "FROM Media " +
            "WHERE MediaType = 1 AND (MediaWidth IS NULL OR MediaHeight IS NULL OR MediaFilesizeBytes IS NULL);";

        cmdUpdate.Connection = conn;
        cmdUpdate.CommandText =
            "UPDATE Media SET MediaWidth = @w, MediaHeight = @h, MediaFilesizeBytes = @b WHERE ID = @id;";

        cmdUpdate.Parameters.Add("@w", DbType.Int32, 4, "MediaWidth");
        cmdUpdate.Parameters.Add("@h", DbType.Int32, 4, "MediaHeight");
        cmdUpdate.Parameters.Add("@b", DbType.Int32, 4, "MediaFilesizeBytes");
        SQLiteParameter param = cmdUpdate.Parameters.Add("@id", DbType.Int32);
        param.SourceColumn = "ID";
        param.SourceVersion = DataRowVersion.Original;

        adapter.SelectCommand = cmdSelect;
        adapter.UpdateCommand = cmdUpdate;

        try {
            conn.Open();
            adapter.Fill(table);
            conn.Close();
        }
        catch (Exception e) {
            Core.ExceptionHandler.HandleException(e, true);
            throw new DatabaseOperationException("", e);
        }

        foreach (DataRow row in table.Rows) {
            try {
                using (System.Drawing.Image img = System.Drawing.Image.FromFile(row["MediaPathCurrent"].ToString())) {
                    System.IO.FileInfo fi;
                    fi = new System.IO.FileInfo(row["MediaPathCurrent"].ToString());

                    if (img != null) {
                        int width = img.Width;
                        int height = img.Height;
                        long length = fi.Length;

                        row["MediaWidth"] = width;
                        row["MediaHeight"] = height;
                        row["MediaFilesizeBytes"] = (int)length;
                    }
                }
            }
            catch (Exception e) {
                Core.ExceptionHandler.HandleException(e);
                DevUtil.Print(e);
                continue;
            }
        }

        try {
            recordsAffected = adapter.Update(table);
        }
        catch (Exception e) {
            Core.ExceptionHandler.HandleException(e);
            throw new DatabaseOperationException("", e);
        }
    }

    return recordsAffected;
}
Use Connection.BeginTransaction() to speed up the DataAdapter update.
conn.Open() 'open connection
Dim myTrans As SQLiteTransaction
myTrans = conn.BeginTransaction()

'Associate the transaction with the select command object of the DataAdapter
objDA.SelectCommand.Transaction = myTrans

objDA.Update(objDT)
Try
    myTrans.Commit()
Catch ex As Exception
    myTrans.Rollback()
End Try
conn.Close()
This vastly speeds up the update.
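Roughly the same thing in C#, matching the FillMediaSizes code from the question (a sketch I'm adding, assuming the conn, adapter, table and recordsAffected variables from above): wrapping the single adapter.Update() call in one explicit transaction turns ~4000 per-row commits into a single commit.
// Sketch: one explicit transaction around the batch update instead of an implicit
// commit per row. Variable names (conn, adapter, table) come from the question's method.
conn.Open();
using (SQLiteTransaction transaction = conn.BeginTransaction()) {
    adapter.UpdateCommand.Transaction = transaction;
    try {
        recordsAffected = adapter.Update(table);
        transaction.Commit();
    }
    catch (Exception) {
        transaction.Rollback();
        throw;
    }
}
conn.Close();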
Loading the dataset from the db takes half a second or so
This is a single SQL statement (so it's fast): execute the SQL SELECT, populate the dataset, done.
updating all the rows with image width/height and getting the file
length from FileInfo took maybe 30 seconds. Fine.
This is updating the in-memory data (so that's fast too): it changes rows in the dataset and doesn't talk to SQL at all.
The adapter.Update(table); command took somewhere in the vicinity of 5
to 7 minutes to run.
This will run a SQL UPDATE for every updated row, which is why it's slow.
yet even so I can't help but think that if I were to run a separate
update command for each individual update, this would have completed
much faster.
This is basically what it's doing anyway!
From MSDN
The update is performed on a by-row basis. For every inserted,
modified, and deleted row, the Update method determines the type of
change that has been performed on it (Insert, Update or Delete).
Depending on the type of change, the Insert, Update, or Delete command
template executes to propagate the modified row to the data source.
When an application calls the Update method, the DataAdapter examines
the RowState property, and executes the required INSERT, UPDATE, or
DELETE statements iteratively for each row, based on the order of the
indexes configured in the DataSet.
is there some way to "peek into" how ADO is processing this?
Yes: Debug .NET Framework Source Code in Visual Studio 2012?

Updating dataset changes with a data adapter does not seem to work properly

I wanted to update my dataset changes to my database, so I used this sort of code:
SqlCommandBuilder mySqlCommandBuilder = new SqlCommandBuilder(sqladap);
sqladap.Update(ds, TableName);
While it works properly there, I have used the same code for another dataset in my project, and the second one does not work. I traced the code and looked at the rows of the dataset: it contains both the previously fetched rows and the new rows, but the SqlDataAdapter does not update any data, and it also does not throw an error.
Here is the full code:
public static SqlDataAdapter AdapterStoredProcedure(string sp_Name, object obj)
{
    ClsEventLogs EventLogs = new ClsEventLogs();
    try
    {
        SqlConnection connection = SQLDBConnection();
        SqlDataAdapter sqladap = new SqlDataAdapter(sp_Name, connection);
        sqladap.SelectCommand.CommandType = CommandType.StoredProcedure;

        if (obj != null)
        {
            Type t = obj.GetType();
            string str = string.Empty;
            System.Reflection.FieldInfo[] fields = t.GetFields();
            foreach (System.Reflection.FieldInfo field in fields)
            {
                sqladap.SelectCommand.Parameters.Add(new SqlParameter(field.Name, SqlDbType.VarChar, 200));
                sqladap.SelectCommand.Parameters[field.Name].Value = field.GetValue(obj).ToString();
            }
        }
        return sqladap;
    }
    catch (Exception er)
    {
        EventLogs.Eventlog("clsDataStore : ExecuteStoredProcedure", er.Message, ClsEventLogs.EventType.etCriticalError, false);
        return null;
    }
}
// Creating the adapter
SqlDataAdapter dAdap = null;
DataSet ds = new DataSet();
dAdap = clsDataStore.AdapterStoredProcedure("sp_SelectTbl_Client", null);
dAdap.Fill(ds, "tbl_client");

// Here is where I'm updating the dataset
SqlCommandBuilder mySqlCommandBuilder = new SqlCommandBuilder(dAdap);
dAdap.Update(ds, "tbl_client");
You'll have to look (in the debugger) at the generated SQL UPDATE/INSERT statements. Most likely they are flawed or even empty.
The CommandBuilder is extremely limited; it only deals with very simple SELECT a, b FROM table statements.
You will probably find that the SP that doesn't work contains a JOIN, a computed column or something like that.
Best course: provide your own UPDATE statements or stored procedures.
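As an illustration of that last point (a sketch I'm adding; the table and column names tbl_client, ClientId and ClientName are placeholders, not from the question), supplying your own commands instead of relying on SqlCommandBuilder looks like this:
// Sketch: hand-written UpdateCommand/InsertCommand so the adapter does not depend on
// what SqlCommandBuilder can infer from the stored procedure's SELECT.
// Table/column names (tbl_client, ClientId, ClientName) are placeholders.
SqlCommand update = new SqlCommand(
    "UPDATE tbl_client SET ClientName = @ClientName WHERE ClientId = @ClientId",
    connection);
update.Parameters.Add("@ClientName", SqlDbType.VarChar, 200, "ClientName");
SqlParameter key = update.Parameters.Add("@ClientId", SqlDbType.Int);
key.SourceColumn = "ClientId";
key.SourceVersion = DataRowVersion.Original;

SqlCommand insert = new SqlCommand(
    "INSERT INTO tbl_client (ClientName) VALUES (@ClientName)",
    connection);
insert.Parameters.Add("@ClientName", SqlDbType.VarChar, 200, "ClientName");

dAdap.UpdateCommand = update;
dAdap.InsertCommand = insert;
dAdap.Update(ds, "tbl_client");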
To ask the dumb question: did you tell your adapter to commit the changes after calling the Update method?
EDIT: OK, now that you've posted your code, I have to ask another dumb question: what are you looking at to determine if the update worked? Are you connecting to a remote database right now, or a test database in the project? If it's the latter, then if you are rebuilding each time (something I'm in the habit of doing) then your working copy of the database (in the \bin directory) gets blown away and replaced with a fresh copy from wherever it's referenced from in the project. That assumes, of course, that you're using an embedded DB (like MSSQLCE).
