Increase SQL Bulk Copy speed on Azure - c#

I am working on a project where I have to move an on-premises application over to Azure. We have an upload utility that transfers about 150,000 records to the web app (an MVC app). Unfortunately, I was getting timeout issues after I migrated to Azure. I made several changes, including using SqlBulkCopy and stored procedures instead of SqlCommand. The timeout issue has now been resolved, but the data upload takes about 5 minutes to insert the 150,000 records into a table on Azure.
I am using a trial version of Azure, and my database is at 20 DTUs. I would love to keep it at 20 because of the cost; I have a small budget that I am working with. Note that database size isn't a problem, I am well below the quota.
Any suggestions on how I can decrease the time it takes to insert those 150,000 records?
Code Sample
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection))
{
    bulkCopy.BulkCopyTimeout = 0;
    bulkCopy.BatchSize = 10000;
    bulkCopy.ColumnMappings.Add("Barcode", "Barcode");
    bulkCopy.ColumnMappings.Add("SubCategory", "SubCategory");
    bulkCopy.ColumnMappings.Add("ItemDescription", "ItemDescription");
    bulkCopy.ColumnMappings.Add("CreateDate", "CreateDate");
    bulkCopy.ColumnMappings.Add("RevisedDate", "RevisedDate");
    bulkCopy.DestinationTableName = "Items";

    try
    {
        bulkCopy.WriteToServer(dtTblData);
        destinationConnection.Close();
    }
    catch (Exception ex)
    {
        this.Logs.Add(DateTime.Now.ToString() + ": " + ex.Message);
    }
}
FYI: During the insert operation, the DTU utilization of my database reaches 100%.

Using the SqlBulkCopyOptions.TableLock option will increase performance.
So if you can lock the table, you should definitely use it.
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection, SqlBulkCopyOptions.TableLock))
{
    // ...code...
}
Beyond this option, there is not much more you can do, since you are already using SqlBulkCopy. The bottleneck is the database performance tier, which you cannot upgrade because of your budget.

Besides the table locking Jonathan mentioned, the only real way to increase performance is to increase the DTUs of the service.
However, you don't need to leave the database on the higher setting forever. If this bulk load is an infrequent operation, you can temporarily raise the DTUs of the database, do your load, then lower the DTUs back down. You are only billed at the higher rate for the time you were actually uploading.
You can change the database tier via code using the Azure SDK and the functions in the Microsoft.Azure.Management.Sql.DatabasesOperationsExtensions class, setting the RequestedServiceObjectiveId value on the Database object you pass to the update function to a higher tier objective (the 20 DTUs you are on now is the S1 objective; you could move up to S2, 50 DTUs, during the bulk load).
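If you would rather not take a dependency on the management SDK, the same scale-up/scale-down can be done from plain T-SQL with ALTER DATABASE ... MODIFY (SERVICE_OBJECTIVE = ...). The following is only a minimal sketch of that alternative: the database name MyAppDb, the connection string variable and the helper method are illustrative, and the statement returns quickly while the actual scale operation completes asynchronously in the background.

// Sketch: temporarily scale the Azure SQL database up before the bulk load and
// back down afterwards, using T-SQL instead of the management SDK.
// Assumptions: "MyAppDb" is the database name and masterConnectionString points
// at the server's master database with a login allowed to ALTER DATABASE.
private static void SetServiceObjective(string masterConnectionString, string objective)
{
    using (var conn = new SqlConnection(masterConnectionString))
    using (var cmd = new SqlCommand(
        $"ALTER DATABASE [MyAppDb] MODIFY (SERVICE_OBJECTIVE = '{objective}');", conn))
    {
        conn.Open();
        cmd.ExecuteNonQuery(); // returns quickly; the scale change finishes asynchronously
    }
}

// Usage (illustrative):
// SetServiceObjective(masterCs, "S2"); // scale up to 50 DTUs
// ...wait for the scale to complete, run the bulk copy...
// SetServiceObjective(masterCs, "S1"); // scale back down to 20 DTUs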


SQL Server CE two way sync with remote Access database

I'm working on a pretty special, legacy project where I need to build an app for PDA devices under Windows Mobile 6.5. The devices have a local database (SQL Server CE) which we are supposed to sync with a remote database (Microsoft Access) whenever they are docked and have network access.
So the local database using SQL Server CE works fine, but I can’t figure out a way to sync it to the Access database properly.
I read that ODBC and OLEDB are unsupported under Windows Mobile 6.5, most resources I find are obsolete or have empty links, and the only way I found was to export the relevant tables of the local database to XML, in the hope of building a VBA component for Access to import them properly (and figure out the backwards sync).
Update on the project and new questions
First of all, thanks to everyone who provided a useful answer, and to @josef who saved me a lot of time with the auto path on this thread.
So a remote SQL Server is a no go for security reasons (client is paranoid about security and won't provide me a server). So I'm tied to SQL Server CE on the PDA and Access on the computer.
As for the sync:
The exportation is fine: I'm using multiple dataAdapters and a WriteXML method to generate XML files transmitted by FTP when the device is plugged back in. Those files are then automatically imported into the Access database. (see code at the end).
My problem is on the importation: I can acquire data through XML readers from an Access-generated file. This data is then inserted into a dataset (in fact, I can even print the data on the PDA screen), but I can't figure out a way to do an "UPSERT" on the PDA's database. So I need a creative way to update/insert the data into the tables if they already contain rows with the same id.
I tried two methods, both with SQL errors (from what I understood, SQL Server CE doesn't handle stored procedures or this kind of T-SQL batch). Example with a simple query that is supposed to update the "available" flag of some storage spots:
try
{
    SqlCeDataAdapter dataAdapter = new SqlCeDataAdapter();
    DataSet xmlDataSet = new DataSet();
    xmlDataSet.ReadXml(localPath + @"\import.xml");
    dataGrid1.DataSource = xmlDataSet.Tables[1];
    _conn.Open();
    int i = 0;
    for (i = 0; i <= xmlDataSet.Tables[1].Rows.Count - 1; i++)
    {
        spot = xmlDataSet.Tables[1].Rows[i].ItemArray[0].ToString();
        is_available = Convert.ToBoolean(xmlDataSet.Tables[1].Rows[i].ItemArray[1]);
        SqlCeCommand importSpotCmd = new SqlCeCommand(@"
            IF EXISTS (SELECT spot FROM spots WHERE spot=@spot)
            BEGIN
                UPDATE spots SET available=@available
            END
            ELSE
            BEGIN
                INSERT INTO spots(spot, available)
                VALUES(@spot, @available)
            END", _conn);
        importSpotCmd.Parameters.Add("@spot", spot);
        importSpotCmd.Parameters.Add("@available", is_available);
        dataAdapter.InsertCommand = importSpotCmd;
        dataAdapter.InsertCommand.ExecuteNonQuery();
    }
    _conn.Close();
}
catch (SqlCeException sql_ex)
{
    MessageBox.Show("SQL database error: " + sql_ex.Message);
}
I also tried this query, with the same problem: SQL Server CE apparently doesn't handle ON DUPLICATE KEY (I think it's MySQL-specific).
INSERT INTO spots (spot, available)
VALUES(@spot, @available)
ON DUPLICATE KEY UPDATE spots SET available=@available
The code of the export method, now fixed so it works fine, but still relevant for anybody who wants to know:
private void exportBtn_Click(object sender, EventArgs e)
{
    const string sqlQuery = "SELECT * FROM storage";
    const string sqlQuery2 = "SELECT * FROM spots";
    // get the current execution directory
    string autoPath = System.IO.Path.GetDirectoryName(
        System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase);

    using (SqlCeConnection _conn = new SqlCeConnection(_connString))
    {
        try
        {
            SqlCeDataAdapter dataAdapter1 = new SqlCeDataAdapter(sqlQuery, _conn);
            SqlCeDataAdapter dataAdapter2 = new SqlCeDataAdapter(sqlQuery2, _conn);
            _conn.Open();
            DataSet ds = new DataSet("SQLExport");
            dataAdapter1.Fill(ds, "stock");
            dataAdapter2.Fill(ds, "spots");
            ds.WriteXml(autoPath + @"\export.xml");
        }
        catch (SqlCeException sql_ex)
        {
            MessageBox.Show("SQL database error: " + sql_ex.Message);
        }
    }
}
As Access is more or less a stand-alone DB solution, I strongly recommend going with a full-flavored SQL Server plus IIS to set up a Merge Replication synchronization between the SQL CE data and the SQL Server data.
This is described with full sample code and setup in the book "Programming the .NET Compact Framework" by Paul Yao and David Durant (chapter 8, Synchronizing Mobile Data).
For a working sync, all changes to the defined tables and data on the server and on the CE device must be tracked (done via GUIDs, unique numbers) with their timestamps, and conflict handling has to be defined.
If the data is never changed by other means on the server, you may simply track device-side changes only and then push them to the Access database. This could be done by another app that does bulk updates as described here.
If you do not want to go the expensive SQL Server route, there are cheaper solutions with free SQLite (available for CE and the Compact Framework too) and a commercial sync tool for SQLite to MS Access like DBSync.
If you are experienced, you may create your own SQLite to MS Access sync tool.
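Regarding the UPSERT part of the question: since SQL Server CE supports neither IF EXISTS batches nor MERGE, a common workaround is to run the UPDATE first and fall back to an INSERT when no row was affected. This is only a sketch of that pattern, reusing the spots/spot/available names and the _conn connection from the question:

// Update-then-insert "upsert" for SQL Server CE, which has no IF EXISTS / MERGE.
// Table and column names (spots, spot, available) are taken from the question.
using (var update = new SqlCeCommand(
    "UPDATE spots SET available = @available WHERE spot = @spot", _conn))
{
    update.Parameters.AddWithValue("@spot", spot);
    update.Parameters.AddWithValue("@available", is_available);

    if (update.ExecuteNonQuery() == 0) // no existing row: insert instead
    {
        using (var insert = new SqlCeCommand(
            "INSERT INTO spots (spot, available) VALUES (@spot, @available)", _conn))
        {
            insert.Parameters.AddWithValue("@spot", spot);
            insert.Parameters.AddWithValue("@available", is_available);
            insert.ExecuteNonQuery();
        }
    }
}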

Get the execution time of an ADO.NET SQL Command

I have been searching to find out if there is any easy way to get the execution time of an ADO.NET command object.
I know I can manually start and stop a Stopwatch, but I wanted to know if there is an easier way to do it within ADO.NET.
There is a way, but it uses the SqlConnection object, not the command object. Example:
using (var c = new SqlConnection(connectionString)) {
    // important
    c.StatisticsEnabled = true;
    c.Open();

    using (var cmd = new SqlCommand("select * from Error", c)) {
        cmd.ExecuteReader().Dispose();
    }

    var stats = c.RetrieveStatistics();
    var firstCommandExecutionTimeInMs = (long)stats["ExecutionTime"];

    // reset for next command
    c.ResetStatistics();

    using (var cmd = new SqlCommand("select * from Code", c)) {
        cmd.ExecuteReader().Dispose();
    }

    stats = c.RetrieveStatistics();
    var secondCommandExecutionTimeInMs = (long)stats["ExecutionTime"];
}
Here you can find what other values are contained in the dictionary returned by RetrieveStatistics.
Note that those values represent client-side statistics (the internals of ADO.NET measure them), but since you asked for an analog of Stopwatch, I think that's fine.
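If you want to see everything the provider collects, you can simply dump the returned dictionary inside the same using block as above; a small sketch (the counter names, such as ExecutionTime, NetworkServerTime and BytesReceived, are the ones the provider exposes):

// Dump all client-side statistics collected on the connection c.
// RetrieveStatistics() returns an IDictionary of counter name -> value.
var allStats = c.RetrieveStatistics();
foreach (System.Collections.DictionaryEntry entry in allStats)
{
    Debug.WriteLine($"{entry.Key} = {entry.Value}");
}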
The approach from the answer of @Evk is very interesting and smart: it works client side, and one of the main keys of such statistics is in fact NetworkServerTime, which
Returns the cumulative amount of time (in milliseconds) that the
provider spent waiting for replies from the server once the
application has started using the provider and has enabled statistics.
so it includes the network time from the DB server to the ADO.NET client.
An alternative, more DB-server-oriented approach would be to run SET STATISTICS TIME ON and then retrieve the timings from the InfoMessage event.
A draft of the delegate code (where I'm simply writing to the debug console, but you may want to replace that with a StringBuilder Append):
internal static void TrackInfo(object sender, SqlInfoMessageEventArgs e)
{
    Debug.WriteLine(e.Message);

    foreach (var element in e.Errors) {
        Debug.WriteLine(element.ToString());
    }
}
and usage
conn.InfoMessage += TrackInfo;

using (var cmd = new SqlCommand(@"SET STATISTICS TIME ON", conn)) {
    cmd.ExecuteNonQuery();
}

using (var cmd = new SqlCommand(yourQuery, conn))
using (var RD = cmd.ExecuteReader()) {
    while (RD.Read()) {
        // read the columns
    }
}
I suggest you move to SQL Server 2016 and use the Query Store feature. It tracks execution time and performance changes over time for each query you submit, requires no changes to your application, tracks all queries (including those executed inside stored procedures) from any application, not only your own, and is available in all editions, including Express and the Azure SQL Database service.
If you track on the client side, you must measure the time yourself, using a wall clock. I would add and expose performance counters and then use the performance-counter infrastructure to capture and store the measurements.
As a side note, simply tracking the execution time of a batch sent to SQL Server yields very coarse performance info and is seldom actionable. Read How to analyse SQL Server performance.
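For completeness, the client-side wall-clock measurement mentioned above is just the manual Stopwatch approach from the question; a minimal sketch around the earlier example (yourQuery and conn come from the snippet above):

// Manual client-side timing of a command with a wall clock (Stopwatch).
var sw = Stopwatch.StartNew();
using (var cmd = new SqlCommand(yourQuery, conn))
using (var reader = cmd.ExecuteReader())
{
    while (reader.Read())
    {
        // consume the rows so the timing includes reading the results
    }
}
sw.Stop();
Debug.WriteLine($"Elapsed: {sw.ElapsedMilliseconds} ms");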

Reading bytea data is slow in PostgreSQL

I store data in a bytea column in a PostgreSQL 9.5 database on Windows.
The data transmission speed is lower than I expect: about 1.5 MB per second.
The following code
using (var conn = ConnectionProvider.GetOpened())
using (var comm = new NpgsqlCommand("SELECT mycolumn FROM mytable", conn))
using (var dr = comm.ExecuteReader())
{
    var clock = Stopwatch.StartNew();
    while (dr.Read())
    {
        var bytes = (byte[])dr[0];
        Debug.WriteLine($"bytes={bytes.Length}, time={clock.Elapsed}");
        clock.Restart();
    }
}
Produces the following output
bytes=3895534, time=00:00:02.4397086
bytes=4085257, time=00:00:02.7220734
bytes=4333460, time=00:00:02.4462513
bytes=4656500, time=00:00:02.7401579
bytes=5191876, time=00:00:02.7959250
bytes=5159785, time=00:00:02.7693224
bytes=5184718, time=00:00:03.0613514
bytes=720401, time=00:00:00.0227767
bytes=5182772, time=00:00:02.7704914
bytes=538456, time=00:00:00.2996142
bytes=246085, time=00:00:00.0003131
Total: 00:00:22.5199268
The strange thing is that reading the last 246 KB took less than a millisecond, and reading the 720 KB in the middle took just 22 ms.
Is a reading speed of 5 MB per 3 seconds normal? How can I increase the reading speed?
Details.
My application starts the PostgreSQL server on startup and shuts it down on exit.
I start server with the following code
public static void StartServer(string dataDirectory, int port)
{
    Invoke("pg_ctl", $"start -w -D \"{dataDirectory}\" -m fast -o \"-B 512MB -p {port} -c temp_buffers=32MB -c work_mem=32MB\"");
}
Also, I changed the storage type of my column:
ALTER TABLE mytable ALTER COLUMN mycolumn SET STORAGE EXTERNAL;
I use Npgsql 3.0.4.0 and PostgreSQL 9.5 on Windows 10.
Running here with Npgsql 3.0.8 (should behave the same), PostgreSQL 9.5 and Windows 10, I don't get your results at all:
bytes=3895534, time=00:00:00.0022591
bytes=4085257, time=00:00:00.0208912
bytes=4333460, time=00:00:00.0228702
bytes=4656500, time=00:00:00.0237144
bytes=5191876, time=00:00:00.0317834
bytes=5159785, time=00:00:00.0268229
bytes=5184718, time=00:00:00.0159028
bytes=720401, time=00:00:00.0130150
bytes=5182772, time=00:00:00.0153306
bytes=538456, time=00:00:00.0021693
bytes=246085, time=00:00:00.0005174
First, what server are you running against: is it on localhost or on some remote machine?
The second thing that comes to mind is that you are stopping and starting the server as part of the test. The slow performance you're seeing may be part of a warm-up on PostgreSQL's side. Try removing that step and see if the results can be reproduced after running your test several times.
Otherwise, this looks like some environmental problem (client machine, server machine or network). I'd try to reproduce the problem on a different machine or in a different setting and go from there.
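One quick way to check the warm-up theory is to repeat the timed read a few times in the same process and compare the passes; a rough sketch, reusing the query and the ConnectionProvider helper from the question:

// Repeat the timed read to separate PostgreSQL warm-up / cold-cache effects
// from steady-state throughput.
for (int pass = 1; pass <= 3; pass++)
{
    var clock = Stopwatch.StartNew();
    long totalBytes = 0;
    using (var conn = ConnectionProvider.GetOpened())
    using (var comm = new NpgsqlCommand("SELECT mycolumn FROM mytable", conn))
    using (var dr = comm.ExecuteReader())
    {
        while (dr.Read())
        {
            totalBytes += ((byte[])dr[0]).Length;
        }
    }
    Debug.WriteLine($"pass {pass}: {totalBytes} bytes in {clock.Elapsed}");
}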

When inserting data using SQLBulkCopy into an Azure SQL Database table I am getting an error message "The wait operation timed out"?

When I insert more than 80000 records into an Azure SQL Database table using the below code:
IEnumerable<SqlBulkCopyColumnMapping> columnMapping;
db.Database.ExecuteSqlCommand("truncate table dbo.Site");
columnMapping = openXmlParse.GetSiteServiceColumnMappings();
bulkCopy.BatchSize = 2000;
bulkCopy.DestinationTableName = "dbo.Site";
bulkCopy.WriteTableToServer(dt, SqlBulkCopyOptions.Default, columnMapping);
db.sp_TrimTableColumns("Site");
Against a local DB it works fine, but an exception is thrown when the code is run against Azure SQL Database.
Explicitly set the command timeout to a larger value based on how long the operation takes. The .NET default is 30 seconds, which may not be sufficient for large inserts. The command run time also varies with the service objective you have chosen.
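For plain SqlBulkCopy the relevant setting is BulkCopyTimeout (in seconds; 0 means no timeout). The WriteTableToServer call in the question looks like a wrapper, so the property may need to be set on whatever SqlBulkCopy instance it uses internally; a minimal sketch with made-up names (connectionString, dt) for illustration:

// Sketch: raise the bulk copy timeout well above the 30 s default before the insert.
using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    bulkCopy.BulkCopyTimeout = 600;     // 10 minutes instead of the 30 s default
    bulkCopy.BatchSize = 2000;
    bulkCopy.DestinationTableName = "dbo.Site";
    bulkCopy.WriteToServer(dt);         // dt is the DataTable from the question
}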

Sqlite DB querying takes a long time WP8

I am building a Windows Phone 8 app with sqlite.net, using this link as a reference:-
http://developer.nokia.com/community/wiki/How_to_use_SQLite_in_Windows_Phone
There is a database in the project which is seeded into Isolated Storage. The database contains only one table, which has almost 26k entries.
I am trying to connect to that database in my MainPage.xaml.cs as follows:-
protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    using (SQLiteConnection db = new SQLiteConnection(App._dbPath))
    {
        db.GetTableInfo("IWMCemeteries");
        try
        {
            List<IWMCemeteries> cemeteriesList = db.Table<IWMCemeteries>().ToList<IWMCemeteries>();
            MessageBox.Show("Number of elements in table is " + cemeteriesList.Count);
        }
        catch (Exception ex)
        {
            Debug.WriteLine(ex.Message);
        }
    }
}
The problem is that it takes too long (over 25 seconds) for the message dialog to show up.
I tried an alternate method running a raw query as follows:-
List<IWMCemeteries> cemeteries = db.Query<IWMCemeteries>("select * from IWMCemeteries");
MessageBox.Show("Number of elements in list is " + cemeteries.Count);
But this seems to take even longer (almost 30 seconds)!
Can someone please tell me what I am doing wrong here?
Thanks,
Rajeev
Nothing is wrong here, in my view. As some people noticed, with 26k rows you are starting to work with an interesting bulk of data. So, on mobile devices working with a "lite" database, you must adapt your request to what you really need:
You want the number of rows: use SELECT COUNT(*).
You want to display all rows in a list: use paging or asynchronous loading (on scroll down) to fetch only 20 elements at a time.
In any app, but especially on mobile devices, you have to consider the volume of data being moved.
That way, any request will be near-instant and your application will perform well; a rough sketch of both suggestions follows below.
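A rough sqlite-net sketch of both suggestions (the IWMCemeteries type and App._dbPath come from the question; the page size is arbitrary):

// Count rows without materializing them, then fetch one page at a time.
using (var db = new SQLiteConnection(App._dbPath))
{
    int total = db.ExecuteScalar<int>("SELECT COUNT(*) FROM IWMCemeteries");

    const int pageSize = 20;
    int offset = 0; // advance by pageSize as the user scrolls
    List<IWMCemeteries> page = db.Query<IWMCemeteries>(
        "SELECT * FROM IWMCemeteries LIMIT ? OFFSET ?", pageSize, offset);

    MessageBox.Show("Total rows: " + total + ", loaded this page: " + page.Count);
}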
There's nothing wrong with your query. Just limit the data you fetch from the database; it's a mobile device with limited power, not a full-blown PC.
