SQLite commit performance problem with indexes - C#

I have run into a problem where the time to do a commit keeps getting
longer and longer. We are talking on the order of 250 ms for a table
with ~20k rows and a disk size of around 2-3 MB, and it just keeps getting worse. I have tracked the
performance problem down to something to do with indexes; it's almost
as if SQLite is rebuilding the index on every commit. Each commit consists of
100 INSERTs. I have made as small a program as I could that
reproduces the problem, and I have also tried running it on Linux,
where the problem doesn't seem to occur. The problem exists with both
WAL and truncate journaling mode, but doesn't seem to exist
when I use an in-memory database instead of a file. I have tried both
version 3.6.23.1 and 3.7.6.3.
On Windows, where I'm experiencing the problem, I run SQLite from a C#
program. I have checked the implementation of transaction support in
the System.Data.SQLite wrapper and it does nothing more
than issue a plain COMMIT. Sadly I don't have a C compiler for Windows,
so I can't check it without the wrapper, but it should be the
same.
System.IO.File.Delete("test.db");
var db_connection = new SQLiteConnection(@"Data Source=test.db");
db_connection.Open();
using (var cmd = db_connection.CreateCommand())
{
    cmd.CommandText = "CREATE TABLE test (id integer primary key, dato integer)";
    cmd.ExecuteNonQuery();
    cmd.CommandText = "CREATE INDEX i on test(dato)";
    cmd.ExecuteNonQuery();
}

SQLiteTransaction trans = null;
List<string> paths = new List<string>();
var random = new Random();
for (var j = 0; j < 150; ++j)
{
    for (var i = 0; i < 1000; ++i)
    {
        if (i % 100 == 0)
        {
            trans = db_connection.BeginTransaction();
        }
        using (var cmd = db_connection.CreateCommand())
        {
            cmd.CommandText = String.Format("INSERT INTO test (dato) values ({0})", random.Next(1, 100000000));
            cmd.ExecuteNonQuery();
        }
        if (i % 100 == 99 && trans != null)
        {
            var now = DateTime.Now;
            trans.Commit();
            trans.Dispose();
            System.Console.WriteLine("commit {0}", (DateTime.Now - now).TotalMilliseconds);
        }
    }
}

Did you try reducing hard disk access, for example by adding this command before creating any table:
cmd.CommandText = "PRAGMA locking_mode = EXCLUSIVE";
cmd.ExecuteNonQuery();
provided your app allows exclusive locking of the database.
This can also help:
PRAGMA synchronous = OFF
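For reference, a minimal sketch (untested, reusing the db_connection variable from the question) of issuing both PRAGMAs right after opening the connection; note that synchronous = OFF trades durability for speed, so a crash or power loss can corrupt the file:

// Issue the PRAGMAs once, immediately after db_connection.Open() and before any other statements.
using (var cmd = db_connection.CreateCommand())
{
    // Keep the file lock for the lifetime of the connection (no per-commit lock/unlock).
    cmd.CommandText = "PRAGMA locking_mode = EXCLUSIVE";
    cmd.ExecuteNonQuery();
    // Skip the fsync on every commit; faster, but unsafe if the machine crashes.
    cmd.CommandText = "PRAGMA synchronous = OFF";
    cmd.ExecuteNonQuery();
}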

Related

MySQL deadlocks / Commit not unlocking in C#?

Can anyone tell me why the following code can deadlock? I'm simulating our web server on multiple threads in a console app.
The console app has 5 threads and updates 250 records on each thread.
I am finding that transaction.Commit() is not enough; I still get deadlocks, so it clearly isn't releasing the locks at that point.
Unless I add the transaction.Dispose() and the Sleep(50 ms), I consistently get deadlocks on InnoDB. If I turn the code into a stored procedure, the sleep needs to be longer to avoid deadlocks. I'm not sure it avoids them entirely, actually; I need to run it with more threads.
Closing the connection after the transaction is more reliable, but in the web app we ideally want one connection per request for performance.
Also, calling transaction.Dispose() explicitly is far more reliable in terms of avoiding deadlocks than using (var transaction = ...).
We are currently on .NET, not .NET Core.
I would bet that if I write the same program using SqlClient for SQL Server it will work - I'm going to try that tomorrow.
Can anyone explain this? What am I doing wrong?
static void Main(string[] args)
{
    Console.WriteLine("GenerateBarcodesTestConsoleApp");
    var connectionString = ConfigurationManager.ConnectionStrings["MyConnection"].ConnectionString;
    var threads = Enumerable.Range(1, 5);
    Parallel.ForEach(threads, t =>
    {
        GenerateBarcodes2(t, connectionString, 250);
    });
    Console.WriteLine("Press any key to exit...");
    Console.ReadKey();
}

static void GenerateBarcodes2(int thread, string connectionString, int numberToGenerate)
{
    using (var con = new MySqlConnection(connectionString))
    {
        con.Open();
        var sql1 = "SELECT p.barcode, p.barcode_id " +
                   "FROM p_barcode p " +
                   "WHERE p.company_id = 1 " +
                   "AND SUBSTRING(p.barcode,1,2) = 'OK' " +
                   "AND p.in_use = 0 " +
                   "LIMIT 1 " +
                   "FOR UPDATE;";
        var sql2 = "UPDATE p_barcode SET in_use = 1 WHERE company_id = 1 AND barcode_id = ?barcode_id AND in_use = 0";
        for (int b = 0; b < numberToGenerate; b++)
        {
            using (var transaction = con.BeginTransaction(System.Data.IsolationLevel.RepeatableRead))
            {
                string barcode = string.Empty;
                int barcodeId = 0;
                using (var cmd = new MySqlCommand(sql1, con, transaction))
                {
                    var rdr = cmd.ExecuteReader();
                    if (rdr.Read())
                    {
                        barcode = (string)rdr["barcode"];
                        barcodeId = (int)rdr["barcode_id"];
                    }
                    rdr.Close();
                    Console.WriteLine(barcode);
                }
                if (barcodeId != 0)
                {
                    using (var cmd = new MySqlCommand(sql2, con, transaction))
                    {
                        cmd.Parameters.AddWithValue("barcode_id", barcodeId);
                        cmd.ExecuteNonQuery();
                    }
                }
                transaction.Commit();
                System.Threading.Thread.Sleep(50);
            }
            //transaction.Dispose();
        }
        con.Close();
    }
}
In MariaDB, SKIP LOCKED is the solution to prevent these deadlocks.
There isn't a perfect solution to prevent deadlocks without redesigning the system so that two threads never try to update the same record at the same time; however, adding a small sleep after committing the transaction appears to help massively. 20 ms was about right on my dev machines.
Does this suggest that Commit() returns before the database has actually committed the transaction and released the locks? Either way, the behaviour is the same for InnoDB and MariaDB.
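To illustrate the SKIP LOCKED suggestion above, a rough sketch (untested; it assumes MariaDB 10.6+ or MySQL 8.0+ and the same p_barcode schema as in the question) of how the SELECT could be changed so that each thread claims a different free barcode instead of blocking on rows another transaction has already locked:

// Rows currently locked by another transaction are skipped rather than waited for.
var sql1 = "SELECT p.barcode, p.barcode_id " +
           "FROM p_barcode p " +
           "WHERE p.company_id = 1 " +
           "AND SUBSTRING(p.barcode,1,2) = 'OK' " +
           "AND p.in_use = 0 " +
           "LIMIT 1 " +
           "FOR UPDATE SKIP LOCKED;";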

Inserting many rows into SQL Server 2000 via SqlCommand in .NET - NOT updating the sysindexes (rows) properly

I am calling an insert method about 2000 times in a row (through a foreach loop in C#). Everything inserts into the table just fine, but the sysindexes rowcnt for that table does not update properly.
Let's say that I call the insert method 2100 times in that foreach loop; the rowcnt from the sysindexes table for <table> now says 2085. But if I do a select count(1) from <table> I get the proper number of records (2100).
The number that I get from rowcnt usually varies, but it is always around the correct number of rows.
We are running SQL Server 2000 - yes, I know it is 15 years old and that might have something to do with it. I did see that sysindexes for SQL Server 2000 can be a little shaky, and the recommendation is to use other views rather than that one if you have a newer version, but we don't.
The really silly part is that if I run that insert method just one time and insert just one record, it updates the sysindexes rows & rowcnt to the proper number. Isn't that something?
The code below shows the method that is being called all those times in a row. Any help is greatly appreciated. I build the insert string and then send it to the method that runs the query - yes, I know how to use Command.Parameters and I should not hard-code values into a string, but I replicated another process without doing it properly just to test it, and it gives me the same results/problem that I am asking about.
The insert statement is as follows: INSERT INTO iminvtrx_sql_i (item_no, totalTransactions, compileDate) VALUES('test', 1.0, 20140820)
private void btnTestTableInsert_Click(object sender, EventArgs e)
{
    int rowsAffected = 0;
    String bigInsert = "";
    bigInsert += "INSERT INTO iminvtrx_sql_i(item_no, totalTransactions, compileDate) ";
    for (int i = 0; i < 2100; i++)
    {
        rowsAffected += addToDatabase(i);
    }
    MessageBox.Show("There were " + rowsAffected.ToString() + " added to the table!");
}

private int addToDatabase(int count)
{
    int rowsAffected = 0;
    string insertString = "";
    using (SqlConnection connection = finder.getConnectionFor("data_01"))
    {
        SqlCommand command = connection.CreateCommand();
        insertString = "BEGIN TRANSACTION INSERT INTO iminvtrx_sql_i(item_no, totalTransactions, compileDate) VALUES('test', 1.0, @increment) COMMIT TRANSACTION";
        command.CommandText = insertString;
        command.Parameters.AddWithValue("@increment", count);
        try
        {
            connection.Open();
            rowsAffected = command.ExecuteNonQuery();
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, ex.InnerException.ToString());
        }
    }
    return rowsAffected;
}
The behavior you mention is expected. The row count stored in system tables is not guaranteed to be 100% accurate and can be corrected with DBCC UPDATEUSAGE (which sp_spaceused optionally invokes when the updateusage option is specified). The only way to get a transactionally consistent rowcount in any version of SQL Server is with SELECT COUNT.
The rowcount stored in the DMVs in SQL 2005 and later versions tends to be more accurate.
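If you want to see the corrected numbers, a small sketch (untested; it reuses the finder.getConnectionFor helper and table name from the question) of the two options mentioned above: correcting the stored counts with DBCC UPDATEUSAGE, and getting a transactionally consistent count with SELECT COUNT:

using (SqlConnection connection = finder.getConnectionFor("data_01"))
{
    connection.Open();
    // Option 1: fix up the row counts stored in the system tables for this table.
    using (var fix = new SqlCommand("DBCC UPDATEUSAGE (0, 'iminvtrx_sql_i') WITH COUNT_ROWS", connection))
    {
        fix.ExecuteNonQuery();
    }
    // Option 2: the only transactionally consistent count.
    using (var count = new SqlCommand("SELECT COUNT(*) FROM iminvtrx_sql_i", connection))
    {
        int rows = (int)count.ExecuteScalar();
        Console.WriteLine("Actual rows: {0}", rows);
    }
}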
Can you change the code to use a separate counter? It almost looks like the += assignment is returning and/or tracking the count incorrectly. Just break it apart like this:
private void btnTestTableInsert_Click(object sender, EventArgs e)
{
    int rowsAffected = 0;
    String bigInsert = "";
    bigInsert = "INSERT INTO iminvtrx_sql_i(item_no, totalTransactions, compileDate)";
    for (int i = 0; i < 2100; i++)
    {
        rowsAffected += addToDatabase(i);
    }
    MessageBox.Show(string.Format("There were {0} added to the table!", rowsAffected.ToString()));
}

private int addToDatabase(int count)
{
    int rowsAffected = 0;
    string insertString = "";
    SqlConnection connection = finder.getConnectionFor("data_01");
    SqlCommand command = connection.CreateCommand();
    insertString = "BEGIN TRANSACTION INSERT INTO iminvtrx_sql_i(item_no, totalTransactions, compileDate) VALUES('test', 1.0, @increment) COMMIT TRANSACTION";
    command.CommandText = insertString;
    command.Parameters.AddWithValue("@increment", count);
    try
    {
        connection.Open();
        rowsAffected = command.ExecuteNonQuery();
        rowsAffected++;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message, ex.InnerException.ToString());
    }
    finally
    {
        if (connection != null)
        {
            connection.Close();
        }
    }
    return rowsAffected;
}
You do not need the += inside the addToDatabase(int count) method; I changed it to only use rowsAffected.
I would suggest reading up on the +=, -=, etc. compound assignment operators and when to use them.

Fastest way to update more than 50,000 rows in an .mdb database from C#

I searched the net but nothing really helped me. I want to update a database from a list of articles, but the approach I've found is really slow.
This is my code:
List<Article> costs = GetIdCosts(); // about 70,000 articles here
conn = new OleDbConnection(string.Format(MDB_CONNECTION_STRING, PATH, PSW));
conn.Open();
transaction = conn.BeginTransaction();
using (var cmd = conn.CreateCommand())
{
    cmd.Transaction = transaction;
    cmd.CommandText = "UPDATE TABLE_RO SET TABLE_RO.COST = ? WHERE TABLE_RO.ID = ?;";
    for (int i = 0; i < costs.Count; i++)
    {
        double cost = costs[i].Cost;
        int id = costs[i].Id;
        cmd.Parameters.AddWithValue("data", cost);
        cmd.Parameters.AddWithValue("id", id);
        if (cmd.ExecuteNonQuery() != 1) throw new Exception();
    }
}
transaction.Commit();
But this approach takes a long time, something like 10 minutes or more. Is there another way to speed up this update? Thanks.
Try modifying your code to this:
List<Article> costs = GetIdCosts(); // about 70,000 articles here

// Setup and open the database connection
conn = new OleDbConnection(string.Format(MDB_CONNECTION_STRING, PATH, PSW));
conn.Open();

// Setup a command
OleDbCommand cmd = new OleDbCommand();
cmd.Connection = conn;
cmd.CommandText = "UPDATE TABLE_RO SET TABLE_RO.COST = ? WHERE TABLE_RO.ID = ?;";

// Setup the parameters and prepare the command to be executed
cmd.Parameters.Add("?", OleDbType.Currency, 255);
cmd.Parameters.Add("?", OleDbType.Integer, 8); // Assuming your ID is never longer than 8 digits
cmd.Prepare();

OleDbTransaction transaction = conn.BeginTransaction();
cmd.Transaction = transaction;

// Start the loop
for (int i = 0; i < costs.Count; i++)
{
    cmd.Parameters[0].Value = costs[i].Cost;
    cmd.Parameters[1].Value = costs[i].Id;
    try
    {
        cmd.ExecuteNonQuery();
    }
    catch (Exception ex)
    {
        // handle any exception here
    }
}
transaction.Commit();
conn.Close();
The cmd.Prepare method will speed things up since it creates a compiled version of the command on the data source.
A smaller change: use a StringBuilder and string.Format to construct one big command text.
var sb = new StringBuilder();
for (int i = 0; i < costs.Count; i++)
{
    double cost = costs[i].Cost;
    int id = costs[i].Id;
    sb.AppendLine(string.Format("UPDATE TABLE_RO SET TABLE_RO.COST = '{0}' WHERE TABLE_RO.ID = '{1}';", cost, id));
}
An even faster option: as in the first example, construct the SQL, but this time make the result look like this:
-- create a temporary table
create table #data (id int primary key, cost decimal(10,8));
-- insert the values into it via a union of selects
insert into #data
select 1121 as id, 10.23 as cost
union select 1122 as id, 58.43 as cost
union select ...
-- update TABLE_RO with an update-join, copying the value
-- from the cost column in #data to the cost column in TABLE_RO
update dest
set dest.cost = source.cost
from TABLE_RO dest
inner join #data source on dest.id = source.id
This is the fastest you can get without using bulk inserts.
Performing mass updates with ADO.NET and OleDb is painfully slow. If possible, you could consider performing the update via DAO. Just add a reference to the DAO library (COM object) and use something like the following code (caution: untested):
// Import Reference to "Microsoft DAO 3.6 Object Library" (COM)
string TargetDBPath = "insert Path to .mdb file here";
DAO.DBEngine dbEngine = new DAO.DBEngine();
DAO.Database daodb = dbEngine.OpenDatabase(TargetDBPath, false, false, "MS Access;pwd=" + "insert your db password here (if you have any)");
DAO.Recordset rs = daodb.OpenRecordset("insert target Table name here", DAO.RecordsetTypeEnum.dbOpenDynaset);
if (rs.RecordCount > 0)
{
    rs.MoveFirst();
    while (!rs.EOF)
    {
        // Load the ID of the current row
        int rowid = (int)rs.Fields["ID"].Value;
        // Iterate the list to find the entry with a matching ID
        for (int i = 0; i < costs.Count; i++)
        {
            double cost = costs[i].Cost;
            int id = costs[i].Id;
            if (rowid == id)
            {
                // Save the changed value
                rs.Edit();
                rs.Fields["COST"].Value = cost;
                rs.Update();
            }
        }
        rs.MoveNext();
    }
}
rs.Close();
Note that we are doing a full table scan here, and for every row we also scan the costs list. But unless the total number of records in the table is many orders of magnitude bigger than the number of updated records, it should still significantly outperform the ADO.NET approach...
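A possible refinement on that inner scan: build a dictionary keyed by ID once, so each row needs only an O(1) lookup instead of a pass over the whole costs list. A rough sketch (untested), assuming the same Article list as above:

// Build an id -> cost lookup once, before walking the recordset.
var costById = new Dictionary<int, double>();
foreach (var article in costs)
{
    costById[article.Id] = article.Cost;
}

// Inside the while (!rs.EOF) loop, replace the inner for loop with:
double newCost;
if (costById.TryGetValue(rowid, out newCost))
{
    rs.Edit();
    rs.Fields["COST"].Value = newCost;
    rs.Update();
}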

UPDATE faster in SQLite + BEGIN TRANSACTION

This one relates to SpatiaLite as well (not only SQLite).
I have a file database (xyz.db) which I am using through SQLiteConnection (extended with SpatiaLite).
I have many records that need to be updated in the database.
for (int y = 0; y < castarraylist.Count; y++)
{
    string s = Convert.ToString(castarraylist[y]);
    string[] h = s.Split(':');
    SQLiteCommand sqlqctSQL4 = new SQLiteCommand("UPDATE temp2 SET GEOM = " + h[0] + " WHERE " + dtsqlquery2.Columns[0] + " = " + h[1], con);
    sqlqctSQL4.ExecuteNonQuery();
    x = x + 1;
}
In the logic above, castarraylist is an ArrayList containing the values that need to be written to the database.
When I checked, the code above updates around 400 records per minute.
Is there any way I can improve the performance?
NOTE: the file database is not thread-safe.
2. BEGIN TRANSACTION
Suppose I'd like to run two (or millions of) UPDATE statements in a single transaction in SpatiaLite - is that possible?
I read up online and prepared the statements below (but did not get them to work):
BEGIN TRANSACTION;
UPDATE builtuparea_luxbel SET ADMIN_LEVEL = 6 where PK_UID = 2;
UPDATE builtuparea_luxbel SET ADMIN_LEVEL = 6 where PK_UID = 3;
COMMIT TRANSACTION;
The statements above are not updating records in my database.
Does SQLite not support BEGIN TRANSACTION?
Is there anything I am missing?
And if I have to run individual statements, it takes too much time to update, as said above...
SQLite supports transactions; you can try the code below.
using (var cmd = new SQLiteCommand(conn))
using (var transaction = conn.BeginTransaction())
{
    for (int y = 0; y < castarraylist.Count; y++)
    {
        // Add your query here.
        cmd.CommandText = "INSERT INTO TABLE (Field1,Field2) VALUES ('A', 'B');";
        cmd.ExecuteNonQuery();
    }
    transaction.Commit();
}
The primary goal of a database transaction is to get everything done, or nothing at all if something fails along the way.
Reusing the same SQLiteCommand object by changing its CommandText property and executing it again and again might be faster, but it leads to memory overhead: if you have a large number of queries to perform, it is best to dispose of the command after use and create a new one.
A common pattern for an ADO.NET transaction is:
using (var tra = cn.BeginTransaction())
{
    try
    {
        foreach (var myQuery in myQueries)
        {
            using (var cd = new SQLiteCommand(myQuery, cn, tra))
            {
                cd.ExecuteNonQuery();
            }
        }
        tra.Commit();
    }
    catch (Exception ex)
    {
        tra.Rollback();
        Console.Error.WriteLine("I did nothing, because something wrong happened: {0}", ex);
        throw;
    }
}
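Applied to the UPDATE loop from the question, a rough sketch (untested; it assumes castarraylist still holds "geom:id" pairs and that the column name taken from dtsqlquery2 is valid) could look like this:

using (var tra = con.BeginTransaction())
{
    try
    {
        for (int y = 0; y < castarraylist.Count; y++)
        {
            string[] h = Convert.ToString(castarraylist[y]).Split(':');
            // Parameterized UPDATE instead of string concatenation.
            string sql = "UPDATE temp2 SET GEOM = @geom WHERE " + dtsqlquery2.Columns[0] + " = @id";
            using (var cd = new SQLiteCommand(sql, con, tra))
            {
                cd.Parameters.AddWithValue("@geom", h[0]);
                cd.Parameters.AddWithValue("@id", h[1]);
                cd.ExecuteNonQuery();
            }
        }
        tra.Commit();
    }
    catch
    {
        tra.Rollback();
        throw;
    }
}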

Limit your use of DataTable

I recently saw the code below. I got an OutOfMemoryException when loading 2.5 million records with a DataTable, and near the bottom there is a table.Dispose(). Memory usage: 560 MB! Why use a DataTable here at all?
public string[] GetIDs()
{
    DataTable table = new DataTable();
    using (SqlConnection dwConn = new SqlConnection(this.ConnectionString))
    {
        dwConn.Open();
        SqlCommand cmd = dwConn.CreateCommand();
        cmd.CommandText = "SELECT ID FROM Customer";
        SqlDataReader reader = cmd.ExecuteReader();
        table.Load(reader);
    }
    var result = new string[table.Rows.Count];
    for (int i = 0; i < result.Length; i++)
    {
        result[i] = table.Rows[i].ItemArray[0].ToString();
    }
    table.Dispose();
    table = null;
    return result;
}
I turned this into the following, and the memory used was now 250 MB for the same 2.5 million records. The memory used is now less than 45% of the original.
public IEnumerable<String> GetIDs()
{
    var result = new List<string>();
    using (var dwConn = new SqlConnection(ConnectionString))
    {
        dwConn.Open();
        SqlCommand cmd = dwConn.CreateCommand();
        cmd.CommandText = "SELECT ID FROM Customer";
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                result.Add(reader["ID"].ToString());
            }
        }
    }
    return result;
}
It's good to see that you have a solution to your problem, but I would also recommend having a look at this discussion, which shows that a DataReader is a better choice than a DataTable, although it depends on how it is used. After reading it you will understand why memory consumption is expected to be lower with a DataReader.
Another advantage of using SqlDataReader is documented in the MSDN documentation, in the Remarks section:
Changes made to a result set by another process or thread while data
is being read may be visible to the user of the SqlDataReader.
So that is a possible reason for the difference you are observing.
Hope this is useful for you and others as well.
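If the caller only enumerates the IDs once, memory can be pushed down further by streaming the rows instead of materializing a List. A minimal sketch, assuming the same Customer table and ConnectionString property as above:

public IEnumerable<string> StreamIDs()
{
    using (var dwConn = new SqlConnection(ConnectionString))
    {
        dwConn.Open();
        using (SqlCommand cmd = dwConn.CreateCommand())
        {
            cmd.CommandText = "SELECT ID FROM Customer";
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Each row is handed to the caller as it is read; nothing is buffered.
                    yield return reader["ID"].ToString();
                }
            }
        }
    }
}

Note that with this approach the connection stays open until the caller finishes enumerating, so it trades memory for connection lifetime.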
