Reading bytea data is slow in PostgreSQL - C#

I store data in a bytea column in a PostgreSQL 9.5 database on Windows.
The data transfer speed is lower than I expect: about 1.5 MB per second.
The following code
using (var conn = ConnectionProvider.GetOpened())
using (var comm = new NpgsqlCommand("SELECT mycolumn FROM mytable", conn))
using (var dr = comm.ExecuteReader())
{
var clock = Stopwatch.StartNew();
while (dr.Read())
{
var bytes = (byte[])dr[0];
Debug.WriteLine($"bytes={bytes.Length}, time={clock.Elapsed}");
clock.Restart();
}
}
produces the following output:
bytes=3895534, time=00:00:02.4397086
bytes=4085257, time=00:00:02.7220734
bytes=4333460, time=00:00:02.4462513
bytes=4656500, time=00:00:02.7401579
bytes=5191876, time=00:00:02.7959250
bytes=5159785, time=00:00:02.7693224
bytes=5184718, time=00:00:03.0613514
bytes=720401, time=00:00:00.0227767
bytes=5182772, time=00:00:02.7704914
bytes=538456, time=00:00:00.2996142
bytes=246085, time=00:00:00.0003131
Total: 00:00:22.5199268
The strange thing is that reading the last 246 kB took less than a millisecond, and reading 720 kB in the middle took just 22 ms.
Is a read speed of 5 MB per 3 seconds normal? How can I increase it?
Details.
My application starts the PostgreSQL server on startup and shuts it down on exit.
I start the server with the following code:
public static void StartServer(string dataDirectory, int port)
{
Invoke("pg_ctl", $"start -w -D \"{dataDirectory}\" -m fast -o \"-B 512MB -p {port} -c temp_buffers=32MB -c work_mem=32MB\"");
}
Also, I changed the storage type of my column:
ALTER TABLE mytable ALTER COLUMN mycolumn SET STORAGE EXTERNAL;
I use Npgsql 3.0.4.0 and PostgreSQL 9.5 on Windows 10.

Running here with Npgsql 3.0.8 (should be the same), PostgreSQL 9.5 and Windows 10, I don't get your results at all:
bytes=3895534, time=00:00:00.0022591
bytes=4085257, time=00:00:00.0208912
bytes=4333460, time=00:00:00.0228702
bytes=4656500, time=00:00:00.0237144
bytes=5191876, time=00:00:00.0317834
bytes=5159785, time=00:00:00.0268229
bytes=5184718, time=00:00:00.0159028
bytes=720401, time=00:00:00.0130150
bytes=5182772, time=00:00:00.0153306
bytes=538456, time=00:00:00.0021693
bytes=246085, time=00:00:00.0005174
First, what server are you running against? Is it on localhost or on some remote machine?
The second thing that comes to mind is that you stop and start the server as part of the test. The slow performance you're seeing may be part of a warm-up on PostgreSQL's side. Try removing the restart and see if the results can still be reproduced after running your test several times.
Otherwise this looks like some environmental problem (client machine, server machine or network). I'd try to reproduce the problem on a different machine or in a different setting and go from there.
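A minimal sketch of that last suggestion, reusing the ConnectionProvider and table names from the question (an assumption, not a definitive benchmark): run the same query for several passes against a server that is already running, so a cold first pass can be compared with warmed-up ones.
// Hedged sketch: repeat the read against an already-running server.
// ConnectionProvider, mytable and mycolumn are taken from the question above.
for (var pass = 1; pass <= 3; pass++)
{
    var total = Stopwatch.StartNew();
    using (var conn = ConnectionProvider.GetOpened())
    using (var comm = new NpgsqlCommand("SELECT mycolumn FROM mytable", conn))
    using (var dr = comm.ExecuteReader())
    {
        long bytesRead = 0;
        while (dr.Read())
            bytesRead += ((byte[])dr[0]).Length;
        Debug.WriteLine($"pass={pass}, bytes={bytesRead}, time={total.Elapsed}");
    }
}
If only the first pass is slow, the cost is warm-up (server start plus cold caches) rather than steady-state transfer speed.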

Related

MySql Long Running Query Fails on Docker .NET Core: Attempted to read past the end of the stream / Expected to read 4 header bytes but only received 0

I am attempting to query a MySql database with 95M rows, using a query that has a WHERE clause on a non-indexed column (please don't judge; I have no control over that part, as the server is not ours).
I've tried both MySqlConnector and MySqlClient with the same result. Consistently, after 5 minutes, they both error:
Using MySqlConnector:
Expected to read 4 header bytes but only received 0.
Using MySql.Data.MySqlClient:
Attempted to read past the end of the stream.
This only happens in a Docker container (running Docker Desktop on Windows with the aspnet:3.1-buster-slim image, but I've tried others with the same result).
Running the same code via an IIS Express-hosted web API or a console app works fine.
The connection string specifies Connect Timeout=21600; Default Command Timeout=21600; MinPoolSize=0; and I've tried various Min/Max pool size configs and turning pooling off with no luck.
I have tried changing the connection string SslMode to None with no change.
The code to query the data is pretty straight forward:
protected virtual async IAsyncEnumerable<List<object>> GetDataAsync(string connectionString, string sql, int timeout = 21600, IsolationLevel isolationLevel = IsolationLevel.ReadCommitted)
{
await using var conn = new MySqlConnection { ConnectionString = connectionString };
await conn.OpenAsync();
using var trans = await conn.BeginTransactionAsync(isolationLevel);
await using var cmd = new MySqlCommand { Connection = conn, CommandText = sql, CommandTimeout = timeout, Transaction = trans };
await using (var reader = await cmd.ExecuteReaderAsync())
{
while (await reader.ReadAsync())
{
var values = new object[reader.FieldCount];
reader.GetValues(values);
yield return values.Select(v => v is DBNull ? null : v).ToList();
}
}
await trans.CommitAsync();
}
I have tried with and without the transaction - no change.
If I try a simpler query, I get results back without issue using that same GetDataAsync method. Even stranger, other long-running queries work fine too. If I try a similar non-indexed query on a table with 30M rows, it runs past the 5-minute mark and eventually (over an hour) returns results.
Running show variables yields the following (none of which seem to point to the issue):
connect_timeout 10
delayed_insert_timeout 300
innodb_flush_log_at_timeout 1
innodb_lock_wait_timeout 50
innodb_rollback_on_timeout OFF
interactive_timeout 28800
lock_wait_timeout 31536000
net_read_timeout 30
net_write_timeout 60
rpl_stop_slave_timeout 31536000
slave_net_timeout 3600
wait_timeout 28800
Is there some kind of idle network timeout occurring in the docker container?
Set Keepalive=120 in your connection string; this will send TCP keepalive packets every two minutes (120 seconds) which should stop the connection from being closed. (You may need to adjust the Keepalive value for your particular situation.)
Note that if you're using MySqlConnector on Linux, due to limitations of .NET Core, this option is only implemented on .NET Core 3.0 (or later).
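A hedged sketch of what that looks like with MySqlConnector; the server, database and credentials are placeholders, and the timeout/pooling options are copied from the question. Only Keepalive=120 is the new part.
using MySqlConnector;

// Hypothetical connection string for illustration.
var connectionString =
    "Server=myserver;Database=mydb;Uid=myuser;Pwd=mypassword;" +
    "Connect Timeout=21600;Default Command Timeout=21600;MinPoolSize=0;" +
    "Keepalive=120";

await using var conn = new MySqlConnection(connectionString);
await conn.OpenAsync();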

Mono + MDBTools: Encoding Issue

I'm working on a C# application that has to read an Access database (.mdb), on Linux. I'm using Mono to compile and run the application.
Suppose I have a test database that I create in Access 2013. It has one table: TestTable, with the default ID column and a testField1 column created with the 'Long Text' type. I insert three rows, with these values for the testField1 column: "foo", "bar", "baz". The database is saved as 'Access 2002-2003 Database (*.mdb)'.
The resulting database (named Test.mdb) is transferred to my Linux box. Just as a sanity check, I can run mdb-export on the database:
$ mdb-export Test.mdb TestTable
ID,testField1
1,"foo"
2,"bar"
3,"baz"
So far, so good. Now, suppose we have a C# program that reads the testField1 column of the table:
using System;
using System.Data.Odbc;
class Program {
public static void Main(string[] args){
try {
OdbcConnection conn = new OdbcConnection("ODBC;Driver=MDBTools;DBQ=/path/to/Test.mdb");
conn.Open();
var command = conn.CreateCommand();
command.CommandText = "SELECT testField1 FROM TestTable";
var reader = command.ExecuteReader();
while(reader.Read()){
Console.WriteLine(reader.GetString(0));
}
} catch(Exception e){
Console.WriteLine(e.Message);
Console.WriteLine(e.StackTrace);
}
}
}
I would expect that running this program would print "foo", "bar", and "baz". However, compiling and running the program does not yield this output:
$ mcs mdb_odbc.cs -r:System.data.dll
$ mono mdb_odbc.exe
潦o
$ # this line added to show the empty lines
My guess is that this is an encoding issue, but I have no idea how to resolve it. Is there a way to fix my program or the environment that it runs in so that the contents of the database are printed correctly? I believe that it is an issue with either ODBC or MDBTools, because in a similar program, a string equality check against fields of a database fails.
I'm using Ubuntu 16.10. mono --version outputs Mono JIT compiler version 5.4.0.167 (tarball Wed Sep 27 18:38:59 EDT 2017) (I built it from source with this patch applied to fix another issue with ODBC). MDBTools, installed through Apt, is version 0.7.1-4build1, and the odbc-mdbtools package is the same version.
I know that the combination of tools and software I'm using is unusual, but unfortunately, I have to use C#, I probably have to use Mono, I have to use an Access database, and I have to use ODBC to access the database. If there's no other way around it, I suppose I could convert the database to another format (SQLite comes to mind).
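One way to test the encoding theory (a diagnostic sketch only, not a fix, and it assumes the driver is mis-decoding the text) is to dump the UTF-16 code units of whatever the reader returns and compare them with the expected characters of "foo", "bar" and "baz". This would replace the Console.WriteLine inside the loop above and needs using System.Linq;.
// Diagnostic sketch: print each returned character as a hex UTF-16 code unit.
// If "foo" is being mis-decoded, you would expect something like 6F66 006F
// (the bytes 'f' and 'o' packed into one code unit) instead of 0066 006F 006F.
var value = reader.GetString(0);
Console.WriteLine(string.Join(" ", value.Select(c => ((int)c).ToString("X4"))));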

Increase SQL Bulk Copy speed on Azure

I am working on a project where I have to move an on-premises application over to Azure. We have an upload utility that transfers about 150,000 records to the web app (MVC app). Unfortunately, I was getting timeout issues after I migrated to Azure. I made several changes, including using SqlBulkCopy and stored procedures instead of SqlCommand. The timeout issue has now been resolved, but the data upload takes about 5 minutes to load the 150,000 records into a table on Azure.
I am using a trial version of Azure, and my database DTU is 20. I would love to keep it at 20 because of the cost, as I have a small budget to work with. Note that database size isn't a problem; I am well below the quota.
Any suggestions on how I can decrease the time to insert those 150,000 records?
Code Sample
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection))
{
bulkCopy.BulkCopyTimeout = 0;
bulkCopy.BatchSize = 10000;
bulkCopy.ColumnMappings.Add("Barcode", "Barcode");
bulkCopy.ColumnMappings.Add("SubCategory", "SubCategory");
bulkCopy.ColumnMappings.Add("ItemDescription", "ItemDescription");
bulkCopy.ColumnMappings.Add("CreateDate", "CreateDate");
bulkCopy.ColumnMappings.Add("RevisedDate", "RevisedDate");
bulkCopy.DestinationTableName = "Items";
try
{
bulkCopy.WriteToServer(dtTblData);
destinationConnection.Close();
}
catch (Exception ex)
{
this.Logs.Add(DateTime.Now.ToString() + ": " + ex.Message);
}
}
FYI: During the insert operation the DTU for my database reaches 100%.
Using the SqlBulkCopyOptions.TableLock option will increase performance.
So if you can lock the table, you should definitely use it.
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection, SqlBulkCopyOptions.TableLock))
{
// ...code...
}
Outside of this configuration, there is not a lot you can do, since you already use SqlBulkCopy. The bottleneck is your database performance, which you cannot upgrade because of the budget.
Besides the table locking Jonathan mentioned, the only real way to increase performance is to increase the DTUs for the service.
However, you don't need to leave the database on the higher setting forever. If this bulk load is an infrequent operation, you could temporarily raise the DTUs of the database, do your load, then lower the DTUs back down. You would only be billed at the higher rate for the time you were actually uploading.
You can change the database tier in code using the Azure SDK: the Microsoft.Azure.Management.Sql.DatabasesOperationsExtensions class lets you set the RequestedServiceObjectiveId value to a higher tier objective on the Database object you pass to the update function (the 20 DTUs you are on now is an S1 objective; you could move up to an S2 (50 DTUs) during the bulk load).
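If you would rather not pull in the management SDK, the same scale operation can also be issued as plain T-SQL. A hedged sketch follows; the database name, the tiers and the connection string are placeholders, and note that ALTER DATABASE returns immediately while the scale completes in the background, so you may need to wait before starting the load.
using System.Data.SqlClient;

// Hypothetical helper for illustration: scale the Azure SQL database up before the
// bulk load and back down afterwards. Run it against the server's master database.
static void SetServiceObjective(string masterConnectionString, string objective)
{
    using (var conn = new SqlConnection(masterConnectionString))
    using (var cmd = new SqlCommand(
        "ALTER DATABASE [MyDatabase] MODIFY (SERVICE_OBJECTIVE = '" + objective + "');", conn))
    {
        conn.Open();
        cmd.ExecuteNonQuery();   // returns immediately; the tier change finishes asynchronously
    }
}

// Usage sketch: SetServiceObjective(master, "S2");  /* bulk load */  SetServiceObjective(master, "S1");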

Slow opening SQLite connection in C# app using System.Data.SQLite

Edit 3:
I guess my issue is resolved for the moment... I changed both my service and test app to run as the SYSTEM account instead of the NetworkService account. It remains to be seen if the benefits of changing the user account will persist, or if it will only be temporary.
Original Question:
I've noticed that my small 224 kB SQLite DB is very slow to open in my C# application, taking anywhere from a few milliseconds to 1.5 seconds or more. Below is my code, with all the extra debugging statements I've added this afternoon. I've narrowed it down to the call to cnn.Open(); as shown in the logs here:
2014-03-27 15:05:39,864 DEBUG - Creating SQLiteConnection...
2014-03-27 15:05:39,927 DEBUG - SQLiteConnection Created!
2014-03-27 15:05:39,927 DEBUG - SQLiteConnection Opening...
2014-03-27 15:05:41,627 DEBUG - SQLiteConnection Opened!
2014-03-27 15:05:41,627 DEBUG - SQLiteCommand Creating...
2014-03-27 15:05:41,627 DEBUG - SQLiteCommand Created!
2014-03-27 15:05:41,627 DEBUG - SQLiteCommand executing reader...
2014-03-27 15:05:41,658 DEBUG - SQLiteCommand executed reader!
2014-03-27 15:05:41,658 DEBUG - DataTable Loading...
2014-03-27 15:05:41,767 DEBUG - DataTable Loaded!
As you can see, in this instance it took 1.7 SECONDS to open the connection. I've tried repeating this, and cannot predict whether subsequent connections will open nearly immediately, or be delayed like this.
I've considered using some form of connection pooling, but is it worthwhile to pursue that for a single-instance single-threaded application? Right now, I'm creating an instance of my SQLiteDatabase class, and calling the below function for each of my queries.
public DataTable GetDataTable(string sql)
{
DataTable dt = new DataTable();
try
{
Logging.LogDebug("Creating SQLiteConnection...");
using (SQLiteConnection cnn = new SQLiteConnection(dbConnection))
{
Logging.LogDebug("SQLiteConnection Created!");
Logging.LogDebug("SQLiteConnection Opening...");
cnn.Open();
Logging.LogDebug("SQLiteConnection Opened!");
Logging.LogDebug("SQLiteCommand Creating...");
using (SQLiteCommand mycommand = new SQLiteCommand(cnn))
{
Logging.LogDebug("SQLiteCommand Created!");
mycommand.CommandText = sql;
Logging.LogDebug("SQLiteCommand executing reader...");
using (SQLiteDataReader reader = mycommand.ExecuteReader())
{
Logging.LogDebug("SQLiteCommand executed reader!");
Logging.LogDebug("DataTable Loading...");
dt.Load(reader);
Logging.LogDebug("DataTable Loaded!");
reader.Close();
}
}
cnn.Close();
}
}
catch (Exception e)
{
throw new Exception(e.Message);
}
return dt;
}
Edit:
Sure, dbConnection is the connection string, set by the following function. inputFile is just the string path of the filename to open.
public SqLiteDatabase(String inputFile)
{
dbConnection = String.Format("Data Source={0}", inputFile);
}
And at this point, I think sql is irrelevant, as it's not making it to that point when the cnn.Open() stalls.
Edit 2:
Ok, I've done some more testing. Running the test locally, it completes a 1000-iteration loop in ~5 seconds, for about 5ms per call to cnn.Open(). Running the test from the same Windows installer that I did on my local PC, it completes in ~25 minutes, averaging 1468ms per call to cnn.Open().
I made a small test program that only calls the TestOpenConn() function from the service program (same exact code that is running in the Windows service), running against a copy of the file located in a test directory. Running this on the server or my local PC results in acceptable performance (1.95ms per call on the server, 4ms per call on my local PC):
namespace EGC_Timing_Test
{
class Program
{
static void Main(string[] args)
{
Logging.Init("log4net.xml", "test.log");
var db = new SqLiteDatabase("config.sqlite");
db.TestOpenConn();
}
}
}
Here's the test function:
public void TestOpenConn()
{
// TODO: Remove this after testing loop of opening / closing SQLite DB repeatedly:
const int iterations = 1000;
Logging.LogDebug(String.Format("Running TestOpenConn for {0} opens...", iterations));
var startTime = DateTime.Now;
for (var i = 0; i < iterations; i++)
{
using (SQLiteConnection cnn = new SQLiteConnection(dbConnection))
{
Logging.LogDebug(String.Format("SQLiteConnection Opening, iteration {0} of {1}...", i, iterations));
var startTimeInner = DateTime.Now;
cnn.Open();
var endTimeInner = DateTime.Now;
var diffTimeInner = endTimeInner - startTimeInner;
Logging.LogDebug(String.Format("SQLiteConnection Opened in {0}ms!", diffTimeInner.TotalMilliseconds));
cnn.Close();
}
}
var endTime = DateTime.Now;
var diffTime = endTime - startTime;
Logging.LogDebug(String.Format("Done running TestOpenConn for {0} opens!", iterations));
Logging.LogInfo(String.Format("{0} iterations total:\t{1}", iterations, diffTime));
Logging.LogInfo(String.Format("{0} iterations average:\t{1}ms", iterations, diffTime.TotalMilliseconds/iterations));
}
I guess my issue is resolved for the moment... I changed both my service and test app to run as the SYSTEM account instead of the NetworkService account. It remains to be seen if the benefits of changing the user account will persist, or if it will only be temporary.
I'm assuming you're using the open source System.Data.SQLite library.
If that's the case, it's easy to see through the Visual Studio Performance Profiler that the Open method of the SQLiteConnection class has some serious performance issues.
Also, have a look through the source code for this class here: https://system.data.sqlite.org/index.html/artifact/97648754af51ffd6
There's an awful lot of disk access being made to read XML configuration and Windows environment variable(s).
My suggestion is to try and call Open() as seldom as possible, and try and keep a reference to this open SQLiteConnection object around in memory.
A performance ticket has been raised on the SQLite forum.
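A hedged sketch of that suggestion, assuming the wrapper can live for the lifetime of the process (the class mirrors the one in the question, but the single-open design is illustrative, not the library's required usage):
using System;
using System.Data;
using System.Data.SQLite;

// Open the connection once and reuse it, so the expensive Open() is paid a single
// time at startup instead of on every query.
public sealed class SqLiteDatabase : IDisposable
{
    private readonly SQLiteConnection cnn;

    public SqLiteDatabase(string inputFile)
    {
        cnn = new SQLiteConnection(String.Format("Data Source={0}", inputFile));
        cnn.Open();   // the slow step, now done once
    }

    public DataTable GetDataTable(string sql)
    {
        var dt = new DataTable();
        using (var cmd = new SQLiteCommand(sql, cnn))
        using (var reader = cmd.ExecuteReader())
        {
            dt.Load(reader);
        }
        return dt;
    }

    public void Dispose()
    {
        cnn.Dispose();
    }
}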
Having had the same problem, I was looking into this and it seems to be related to permissions on the file or its parent folders, who created it, and/or how it was created. In my case, the SQLite database file was being created by a script run as a regular user, and then an IIS-hosted service would access the file under a different domain service account.
Every time the service opened a connection, it took over 1.5 seconds, but otherwise operated correctly (it could eventually access the file). A stand-alone program running as the regular user could open a connection to the same file in the same place in a few milliseconds.
Analysis of a procmon trace revealed that in the case of the service, we were getting several ACCESS DENIED logs on the file over the course of about 1.5 seconds, that were not present in the trace when running as the regular user.
Not sure what's going on there. The service worked fine and was able to eventually query the data in the file, albeit slowly.
When we made the service account the owner of the parent folder of the file and gave it write permission, the ACCESS DENIED logs disappeared and the service operated at full speed.
You can add "Modify" permissions of appropriate user to folder with your database.
Right Click on folder > Properties > Security > Edit > Add (I added IIS_Users) > Select "Modify" checkbox > OK

SQL Server CE not picking up updates from another process?

I've got two processes with connections to the same SQL CE .sdf database file. One inserts items into a table and the other reads all the records from the table. After the insert I can confirm the rows are there with the Server Explorer but my query from the second process does not show them:
this.traceMessages.Clear();
SqlCeCommand command = new SqlCeCommand("SELECT AppName, Message, TraceId FROM Messages", this.connection);
using (var reader = command.ExecuteReader())
{
while (reader.Read())
{
this.traceMessages.Add(
new TraceMessage
{
AppName = reader.GetString(reader.GetOrdinal("AppName")),
Message = reader.GetString(reader.GetOrdinal("Message")),
TraceId = reader.GetString(reader.GetOrdinal("TraceId"))
});
}
}
It can generally load up correctly the first time but doesn't pick up updates, even after restarting the process. The connection string just has a simple Data Source that I've confirmed is pointing to the same file on both processes.
Anyone know why this is happening? Is there some setting I can enable to get updates from separate processes to work?
This is because, unlike "traditional" databases, the data that you write is not flushed to disk immediately; the flush is deferred and happens some time later.
You have two choices in the writing program:
1) Add the Flush Interval parameter to your connection string and set it to 1. This will have a lag of up to a second before the data is flushed to the sdf.
2) When you call Commit, use the parameterized overload that allows you to specify CommitMode.Immediate. This will flush data to disk immediately.
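A hedged sketch of both options on the writing side; the file path is a placeholder, and the table and column names are borrowed from the reader code in the question.
using System.Data.SqlServerCe;

// Option 1 (sketch): ask SQL CE to flush committed data within about a second.
var connWithFlushInterval =
    new SqlCeConnection(@"Data Source=C:\data\MyData.sdf;Flush Interval=1");

// Option 2 (sketch): flush to disk immediately when committing.
using (var conn = new SqlCeConnection(@"Data Source=C:\data\MyData.sdf"))
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    using (var cmd = new SqlCeCommand(
        "INSERT INTO Messages (AppName, Message, TraceId) VALUES (@a, @m, @t)", conn, tx))
    {
        cmd.Parameters.AddWithValue("@a", "MyApp");
        cmd.Parameters.AddWithValue("@m", "hello");
        cmd.Parameters.AddWithValue("@t", "42");
        cmd.ExecuteNonQuery();
        tx.Commit(CommitMode.Immediate);   // the parameterized overload from option 2
    }
}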
