I have a SQL connection that I have to hit the database with anywhere from 500 to 10,000 times a second. After about 250 requests per second things start to slow down, and then the app gets so far behind that it crashes.
I was thinking about putting the database into a dictionary. I need the fastest performance I can get. Currently the ADO.NET call takes about 1 to 2 milliseconds, but something happens that causes a bottleneck.
Is there anything wrong with the syntax below for 10k queries per second? Is a dictionary going to work? We are talking about 12 million records, and I need to be able to search them within 1 to 5 milliseconds. I also have another collection in the database that has 50 million records, so I'm not sure how to store that one. Any suggestions would be great.
The SQL Server box has 128 GB of memory and 80 processors, and the app runs on the same server as SQL Server 2012.
using (SqlConnection sqlconn = new SqlConnection(sqlConnection.SqlConnectionString()))
{
using (SqlCommand sqlcmd = new SqlCommand("", sqlconn))
{
sqlcmd.CommandType = System.Data.CommandType.StoredProcedure;
sqlcmd.Parameters.Clear();
sqlcmd.CommandTimeout = 1;
sqlconn.Open();
using (SqlDataReader sqlDR = sqlcmd.ExecuteReader(CommandBehavior.CloseConnection))
public static string SqlConnectionString()
{
return string.Format("Data Source={0},{1};Initial Catalog={2};User ID={3};Password={4};Application Name={5};Asynchronous Processing=true;MultipleActiveResultSets=true;Max Pool Size=524;Pooling=true;",
DataIP, port, Database, username, password, IntanceID);
}
The code below the data reader is:
r.CustomerInfo = new CustomerVariable();
r.GatewayRoute = new List<RoutingGateway>();
while (sqlDR.Read() == true)
{
if (sqlDR["RateTableID"] != null)
r.CustomerInfo.RateTable = sqlDR["RateTableID"].ToString();
if (sqlDR["EndUserCost"] != null)
r.CustomerInfo.IngressCost = sqlDR["EndUserCost"].ToString();
if (sqlDR["Jurisdiction"] != null)
r.CustomerInfo.Jurisdiction = sqlDR["Jurisdiction"].ToString();
if (sqlDR["MinTime"] != null)
r.CustomerInfo.MinTime = sqlDR["MinTime"].ToString();
if (sqlDR["interval"] != null)
r.CustomerInfo.interval = sqlDR["interval"].ToString();
if (sqlDR["code"] != null)
r.CustomerInfo.code = sqlDR["code"].ToString();
if (sqlDR["BillBy"] != null)
r.CustomerInfo.BillBy = sqlDR["BillBy"].ToString();
if (sqlDR["RoundBill"] != null)
r.CustomerInfo.RoundBill = sqlDR["RoundBill"].ToString();
}
sqlDR.NextResult();
Don't close and re-open the connection; you can keep it open between requests. Even if you have connection pooling turned on, there is a certain overhead, including a brief critical section to prevent concurrency issues when seizing a connection from the pool. May as well avoid that.
Ensure your stored procedure has SET NOCOUNT ON to reduce chattiness.
Ensure you are using the minimum transaction isolation level you can get away with, e.g. dirty reads, a.k.a. NOLOCK. You can set this at the client end at the connection level or within the stored procedure itself, whichever you're more comfortable with.
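For the client-side option, a minimal sketch (assuming you keep one open connection as suggested above; the isolation level then applies to subsequent commands on that connection):

using (SqlCommand setIso = new SqlCommand(
    "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;", sqlconn))
{
    // One-time setup on the long-lived connection; stays in effect until the
    // connection is reset (e.g. when it is returned to the pool).
    setIso.ExecuteNonQuery();
}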
Profile these transactions to ensure the bottleneck is on the client. Could be on the DB server or on the network.
If this is a multithreaded application (e.g. on the web), check your connection pool settings and ensure it's large enough. There's a PerfMon counter for this.
Access your fields by ordinal using strongly typed getters, e.g. GetString(0) or GetInt32(3).
Tweak the bejesus out of your stored procedure and indexes. Could write a book on this.
Reindex your tables during down periods, and fill up the index pages if this is a fairly static table.
If the purpose of the stored procedure is to retrieve a single row, try adding TOP 1 to the query so that it will stop looking after the first row is found. Also, consider using output parameters instead of a resultset, which incurs a little less overhead.
A dictionary could potentially work but it depends on the nature of the data, how you are searching it, and how wide the rows are. If you update your question with more information I'll edit my answer.
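For reference, a minimal sketch of what the dictionary idea could look like (the key here is assumed to be the "code" column from your reader loop; adjust it to whatever you actually search on):

using System.Collections.Concurrent;

public static class CustomerCache
{
    // Hypothetical in-memory cache keyed by the "code" column from the reader loop above.
    private static readonly ConcurrentDictionary<string, CustomerVariable> cache =
        new ConcurrentDictionary<string, CustomerVariable>();

    // Call at startup (or on a refresh schedule) to fill the cache from the database.
    public static void Add(string code, CustomerVariable info)
    {
        cache[code] = info;
    }

    // O(1) lookup, no database round trip, safe for concurrent readers.
    public static bool TryGet(string code, out CustomerVariable info)
    {
        return cache.TryGetValue(code, out info);
    }
}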
If you're going to be accessing the DataReader in a loop, then you should find the column indexes outside the loop and use them inside the loop. You might also do better to use the strongly-typed accessors.
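For example, applied to the loop in the question (a sketch; the column types are assumptions, so adjust the getters to match your schema):

// Resolve the ordinals once, outside the loop.
int ordRateTable = sqlDR.GetOrdinal("RateTableID");
int ordEndUserCost = sqlDR.GetOrdinal("EndUserCost");

while (sqlDR.Read())
{
    // Use IsDBNull + typed getters instead of comparing sqlDR[...] to null
    // (the indexer never returns null for a database NULL; it returns DBNull.Value).
    if (!sqlDR.IsDBNull(ordRateTable))
        r.CustomerInfo.RateTable = sqlDR.GetInt32(ordRateTable).ToString();       // assuming an int column
    if (!sqlDR.IsDBNull(ordEndUserCost))
        r.CustomerInfo.IngressCost = sqlDR.GetDecimal(ordEndUserCost).ToString(); // assuming a decimal column
    // ... the remaining columns follow the same pattern
}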
Well, if you have already measured that the ADO command takes only a couple of milliseconds, the other possible cause of the delay is the string.Format used to build the connection string.
I would try to remove the string.Format that is called for every
using(SqlConnection cn = new SqlConnection(sqlConnection.SqlConnectionString()))
Instead, supposing SqlConnectionString is in a separate class, you could write:
private static string conString = string.Empty;
public static string SqlConnectionString()
{
if(conString == "")
conString = string.Format("............");
return conString;
}
Of course, a benchmark could rule this out, but I am pretty sure that string operations like that are costly.
Seeing your comments below, another very important thing to add is the correct declaration of your parameters. Instead of using AddWithValue (convenient, but with tricky side effects), declare your parameters with the correct type and size:
using (SqlCommand sqlcmd = new SqlCommand("", sqlconn))
{
sqlcmd.CommandType = System.Data.CommandType.StoredProcedure;
sqlcmd.CommandText = mySql.GetLCR();
SqlParameter p1 = new SqlParameter("@GatewayID", SqlDbType.NVarChar, 20) { Value = GatewayID };
SqlParameter p2 = new SqlParameter("@DialNumber", SqlDbType.NVarChar, 20) { Value = dialnumber };
sqlcmd.Parameters.AddRange(new SqlParameter[] { p1, p2 });
sqlcmd.CommandTimeout = 1;
sqlconn.Open();
.....
}
AddWithValue is not recommended when you need to squeeze every millisecond of performance. This very useful article explains why passing a string with AddWithValue defeats the work done by the SQL Server optimizer. (In short, the optimizer calculates and stores a query plan for your command and, if it receives another identical command, it reuses the calculated plan. But if you pass a string with AddWithValue, the size of the parameter is derived every time from the actual string length, so the optimizer cannot reuse the query plan and has to recalculate and store a new one.)
"I need the fastest performance I can get."
If you haven't done so already, review your business requirements, and how your application interacts with your data warehouse. If you have done this already, then please disregard this posting.
It has been my experience that:
The fact that you are even executing a SQL query against a database means that you have an expense - queries cost time/cpu/memory.
Queries are even more expensive if they include write operations.
The easiest way to save money, is not to spend it! So look for ways to:
avoid querying the database in the first place
ensure that queries execute as quickly as possible
STRATEGIES
Make sure you are using the database indexes properly.
Avoid SQL queries that result in a full table scan.
Use connection pooling.
If you are inserting data into the database, then use bulk uploads.
Use caching where appropriate (a sketch follows the list of caching technologies below). Options include:
caching results in memory (i.e. RAM)
caching results to disk
pre-rendering results ahead of time and reading them instead of executing a new query
instead of mining raw data with each query, consider generating summary data that could be queried instead.
Partition your data. This can occur on several levels:
most enterprise databases support partitioning strategies
by reviewing your business model, you can partition your data across several databases (e.g. read operations against one DB, write operations against another).
Review your application's design and then measure response times to confirm that the bottleneck is in fact where you believe it is.
CACHING TECHNOLOGIES
ASP.NET caching
memcached
redis
etc.
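As a rough illustration of the in-memory option (a sketch only; the key, TTL and loader delegate are placeholders, not anything from the question):

using System;
using System.Runtime.Caching;

static class ResultCache
{
    // Simple read-through cache: check RAM first, fall back to the database only on a miss.
    private static readonly MemoryCache cache = MemoryCache.Default;

    public static T GetOrLoad<T>(string key, Func<T> loadFromDatabase, TimeSpan ttl)
    {
        object cached = cache.Get(key);
        if (cached != null)
            return (T)cached;

        T value = loadFromDatabase();                         // only hit the database on a miss
        cache.Set(key, value, DateTimeOffset.UtcNow.Add(ttl));
        return value;
    }
}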
DISCLAIMER: I am not a database administrator (DBA).
I don't think the issue is the string.Format. Run this very simple test; the result is:
108 ms for the format
1416 ms for the open
5176 ms for the execute
6891 ms for the whole thing
namespace ConsoleApplication1
{
class Program
{
private static string DataIP;
private static string Database;
private static string IntanceID;
static void Main(string[] args)
{
DataIP = @"FREDOU-PC\SQLEXPRESS"; Database = "testing"; IntanceID = "123";
int count = 0;
System.Diagnostics.Stopwatch swWholeThing = System.Diagnostics.Stopwatch.StartNew();
System.Diagnostics.Stopwatch swFormat = new System.Diagnostics.Stopwatch();
System.Diagnostics.Stopwatch swOpen = new System.Diagnostics.Stopwatch();
System.Diagnostics.Stopwatch swExecute = new System.Diagnostics.Stopwatch();
for (int i = 0; i < 100000; ++i)
{
using (System.Data.SqlClient.SqlConnection sqlconn = new System.Data.SqlClient.SqlConnection(SqlConnectionString(ref swFormat)))
{
using (System.Data.SqlClient.SqlCommand sqlcmd = new System.Data.SqlClient.SqlCommand("dbo.counttable1", sqlconn))
{
sqlcmd.CommandType = System.Data.CommandType.StoredProcedure;
sqlcmd.Parameters.Clear();
swOpen.Start();
sqlconn.Open();
swOpen.Stop();
swExecute.Start();
using (System.Data.SqlClient.SqlDataReader sqlDR = sqlcmd.ExecuteReader(System.Data.CommandBehavior.CloseConnection))
{
if (sqlDR.Read())
count += sqlDR.GetInt32(0);
}
swExecute.Stop();
}
}
}
swWholeThing.Stop();
System.Console.WriteLine("swFormat: " + swFormat.ElapsedMilliseconds);
System.Console.WriteLine("swOpen: " + swOpen.ElapsedMilliseconds);
System.Console.WriteLine("swExecute: " + swExecute.ElapsedMilliseconds);
System.Console.WriteLine("swWholeThing: " + swWholeThing.ElapsedMilliseconds + " " + count);
System.Console.ReadKey();
}
public static string SqlConnectionString(ref System.Diagnostics.Stopwatch swFormat)
{
swFormat.Start();
var str = string.Format("Data Source={0};Initial Catalog={1};Integrated Security=True;Application Name={2};Asynchronous Processing=true;MultipleActiveResultSets=true;Max Pool Size=524;Pooling=true;",
DataIP, Database, IntanceID);
swFormat.Stop();
return str;
}
}
}
dbo.counttable1 stored procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
create PROCEDURE dbo.counttable1
AS
BEGIN
SET NOCOUNT ON;
SELECT count(*) as cnt from dbo.Table_1
END
GO
dbo.table_1
USE [testing]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Table_1](
[id] [int] NOT NULL,
CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
content:
insert into dbo.Table_1 (id) values (1)
insert into dbo.Table_1 (id) values (2)
insert into dbo.Table_1 (id) values (3)
insert into dbo.Table_1 (id) values (4)
insert into dbo.Table_1 (id) values (5)
insert into dbo.Table_1 (id) values (6)
insert into dbo.Table_1 (id) values (7)
insert into dbo.Table_1 (id) values (8)
insert into dbo.Table_1 (id) values (9)
insert into dbo.Table_1 (id) values (10)
If you are handling millions of records and hitting the database anywhere from 500 to 10,000 times a second, I recommend creating a handler file (API) for data retrieval; you can use load-testing tools to test the API's performance.
Performance can be increased by using memcache. The following are the steps to implement it:
Create a Windows service that retrieves data from the database and stores it in memcache in JSON format (as key/value pairs).
For the website, create a handler file as an API that retrieves data from memcache and displays the result.
I have implemented this in one of my projects; it retrieves thousands of records in milliseconds.
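For illustration only, a rough sketch of the store/retrieve steps (this assumes the Enyim.Caching memcached client configured in app.config and Json.NET; CustomerRow, LoadRowFromDatabase and the key format are placeholders):

using Enyim.Caching;               // memcached client (assumed; configured in app.config)
using Enyim.Caching.Memcached;
using Newtonsoft.Json;             // Json.NET for the JSON serialization

// In the Windows service: serialize each record to JSON and store it under a key.
var client = new MemcachedClient();
CustomerRow row = LoadRowFromDatabase(id);                        // hypothetical loader
client.Store(StoreMode.Set, "customer:" + id, JsonConvert.SerializeObject(row));

// In the handler (API): read from memcache instead of hitting the database.
string json = client.Get<string>("customer:" + id);
CustomerRow result = json == null ? null : JsonConvert.DeserializeObject<CustomerRow>(json);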
Related
I am receiving (streamed) data from an external source (over Lightstreamer) into my C# application.
My C# application receives data from the listener. The data from the listener are stored in a queue (ConcurrentQueue).
The queue is emptied every 0.5 seconds with TryDequeue into a DataTable. The DataTable is then copied into a SQL database using SqlBulkCopy.
The SQL database then processes the newly arrived data from the staging table into the final table.
I currently receive around 300,000 rows per day (this can increase strongly within the next weeks), and my goal is to stay under 1 second from the time I receive the data until it is available in the final SQL table.
Currently the maximum I have to process is around 50 rows per second.
Unfortunately, as I receive more and more data, my logic is getting slower (still far under 1 second, but I want to keep improving). The main bottleneck (so far) is the processing of the staging data (on the SQL database) into the final table.
In order to improve the performance, I would like to switch the staging table into a memory-optimized table. The final table is already a memory-optimized table so they will work together fine for sure.
My questions:
Is there a way to use SqlBulkCopy (out of C#) with memory-optimized tables? (as far as I know there is no way yet)
Any suggestions for the fastest way to write the received data from my C# application into the memory-optimized staging table?
EDIT (with solution):
After the comments/answers and performance evaluations, I decided to give up on the bulk insert and use SqlCommand to hand over an IEnumerable<SqlDataRecord> with my data as a table-valued parameter to a natively compiled stored procedure, which stores the data directly in my memory-optimized final table (as well as a copy into the "staging" table, which now serves as an archive). Performance increased significantly (even though I have not yet parallelized the inserts; that will come at a later stage).
Here is part of the code:
Memory-optimized user-defined table type (to hand over the data from C# to the stored procedure):
CREATE TYPE [Staging].[CityIndexIntradayLivePrices] AS TABLE(
[CityIndexInstrumentID] [int] NOT NULL,
[CityIndexTimeStamp] [bigint] NOT NULL,
[BidPrice] [numeric](18, 8) NOT NULL,
[AskPrice] [numeric](18, 8) NOT NULL,
INDEX [IndexCityIndexIntradayLivePrices] NONCLUSTERED
(
[CityIndexInstrumentID] ASC,
[CityIndexTimeStamp] ASC,
[BidPrice] ASC,
[AskPrice] ASC
)
)
WITH ( MEMORY_OPTIMIZED = ON )
Natively compiled stored procedure to insert the data into the final table and into staging (which serves as an archive in this case):
create procedure [Staging].[spProcessCityIndexIntradayLivePricesStaging]
(
@ProcessingID int,
@CityIndexIntradayLivePrices Staging.CityIndexIntradayLivePrices readonly
)
with native_compilation, schemabinding, execute as owner
as
begin atomic
with (transaction isolation level=snapshot, language=N'us_english')
-- store prices
insert into TimeSeries.CityIndexIntradayLivePrices
(
ObjectID,
PerDateTime,
BidPrice,
AskPrice,
ProcessingID
)
select Objects.ObjectID,
CityIndexTimeStamp,
CityIndexIntradayLivePricesStaging.BidPrice,
CityIndexIntradayLivePricesStaging.AskPrice,
@ProcessingID
from @CityIndexIntradayLivePrices CityIndexIntradayLivePricesStaging,
Objects.Objects
where Objects.CityIndexInstrumentID = CityIndexIntradayLivePricesStaging.CityIndexInstrumentID
-- store data in staging table
insert into Staging.CityIndexIntradayLivePricesStaging
(
ImportProcessingID,
CityIndexInstrumentID,
CityIndexTimeStamp,
BidPrice,
AskPrice
)
select @ProcessingID,
CityIndexInstrumentID,
CityIndexTimeStamp,
BidPrice,
AskPrice
from @CityIndexIntradayLivePrices
end
IEnumerable<SqlDataRecord> filled with the data from the queue:
private static IEnumerable<SqlDataRecord> CreateSqlDataRecords()
{
// set columns (the sequence is important; it must match the column order of the table-valued parameter)
SqlMetaData MetaDataCol1;
SqlMetaData MetaDataCol2;
SqlMetaData MetaDataCol3;
SqlMetaData MetaDataCol4;
MetaDataCol1 = new SqlMetaData("CityIndexInstrumentID", SqlDbType.Int);
MetaDataCol2 = new SqlMetaData("CityIndexTimeStamp", SqlDbType.BigInt);
MetaDataCol3 = new SqlMetaData("BidPrice", SqlDbType.Decimal, 18, 8); // precision 18, 8 scale
MetaDataCol4 = new SqlMetaData("AskPrice", SqlDbType.Decimal, 18, 8); // precision 18, 8 scale
// define sql data record with the columns
SqlDataRecord DataRecord = new SqlDataRecord(new SqlMetaData[] { MetaDataCol1, MetaDataCol2, MetaDataCol3, MetaDataCol4 });
// remove each price row from queue and add it to the sql data record
LightstreamerAPI.PriceDTO PriceDTO = new LightstreamerAPI.PriceDTO();
while (IntradayQuotesQueue.TryDequeue(out PriceDTO))
{
DataRecord.SetInt32(0, PriceDTO.MarketID); // city index market id
DataRecord.SetInt64(1, Convert.ToInt64((PriceDTO.TickDate.Replace(@"\/Date(", "")).Replace(@")\/", ""))); // @ makes these verbatim strings, so \ is not treated as an escape sequence
DataRecord.SetDecimal(2, PriceDTO.Bid); // bid price
DataRecord.SetDecimal(3, PriceDTO.Offer); // ask price
yield return DataRecord;
}
}
Handling the data every 0.5 seconds:
public static void ChildThreadIntradayQuotesHandler(Int32 CityIndexInterfaceProcessingID)
{
try
{
// open new sql connection
using (SqlConnection TimeSeriesDatabaseSQLConnection = new SqlConnection("Data Source=XXX;Initial Catalog=XXX;Integrated Security=SSPI;MultipleActiveResultSets=false"))
{
// open connection
TimeSeriesDatabaseSQLConnection.Open();
// endless loop to keep thread alive
while(true)
{
// ensure queue has rows to process (otherwise no need to continue)
if(IntradayQuotesQueue.Count > 0)
{
// define stored procedure for sql command
SqlCommand InsertCommand = new SqlCommand("Staging.spProcessCityIndexIntradayLivePricesStaging", TimeSeriesDatabaseSQLConnection);
// set command type to stored procedure
InsertCommand.CommandType = CommandType.StoredProcedure;
// define sql parameters (table-value parameter gets data from CreateSqlDataRecords())
SqlParameter ParameterCityIndexIntradayLivePrices = InsertCommand.Parameters.AddWithValue("@CityIndexIntradayLivePrices", CreateSqlDataRecords()); // table-valued parameter
SqlParameter ParameterProcessingID = InsertCommand.Parameters.AddWithValue("@ProcessingID", CityIndexInterfaceProcessingID); // processing id parameter
// set sql db type to structured for the table-valued parameter (structured = special data type for specifying structured data contained in table-valued parameters)
ParameterCityIndexIntradayLivePrices.SqlDbType = SqlDbType.Structured;
// execute stored procedure
InsertCommand.ExecuteNonQuery();
}
// wait 0.5 seconds
Thread.Sleep(500);
}
}
}
catch (Exception e)
{
// handle error (standard error messages and update processing)
ThreadErrorHandling(CityIndexInterfaceProcessingID, "ChildThreadIntradayQuotesHandler (handler stopped now)", e);
};
}
Use SQL Server 2016 (it's not RTM yet, but it's already much better than 2014 when it comes to memory-optimized tables). Then use either a memory-optimized table variable or just blast a whole lot of native stored procedure calls in a transaction, each doing one insert, depending on what's faster in your scenario (this varies). A few things to watch out for:
Doing multiple inserts in one transaction is vital to save on network roundtrips. While in-memory operations are very fast, SQL Server still needs to confirm every operation.
Depending on how you're producing data, you may find that parallelizing the inserts can speed things up (don't overdo it; you'll quickly hit the saturation point). Don't try to be very clever yourself here; leverage async/await and/or Parallel.ForEach.
If you're passing a table-valued parameter, the easiest way of doing it is to pass a DataTable as the parameter value, but this is not the most efficient way of doing it -- that would be passing an IEnumerable<SqlDataRecord>. You can use an iterator method to generate the values, so only a constant amount of memory is allocated.
You'll have to experiment a bit to find the optimal way of passing through data; this depends a lot on the size of your data and how you're getting it.
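For the parallelization point above, a rough sketch (the procedure name, table type and batch shape are placeholders, not from the question; each worker gets its own connection):

// Insert pre-built batches in parallel, but cap the degree of parallelism
// so you don't blow past the saturation point mentioned above.
Parallel.ForEach(
    batches,                                             // e.g. IEnumerable<List<PriceRow>>
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    batch =>
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("Staging.spInsertPriceBatch", conn))   // hypothetical proc
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add(new SqlParameter("@Rows", SqlDbType.Structured)
            {
                TypeName = "Staging.PriceRowType",        // hypothetical table type
                Value = ToSqlDataRecords(batch)           // an iterator like CreateSqlDataRecords() above
            });
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    });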
Batch the data from the staging table to the final table in row counts less than 5k, I typically use 4k, and do not insert them in a transaction. Instead, implement programmatic transactions if needed. Staying under 5k inserted rows keeps the number of row locks from escalating into a table lock, which has to wait until everyone else gets out of the table.
Are you sure it's your logic slowing down and not the actual transactions to the database? Entity Framework for example is "sensitive", for lack of a better term, when trying to insert a ton of rows and gets pretty slow.
There's a third party library, BulkInsert, on Codeplex which I've used and it's pretty nice to do bulk inserting of data: https://efbulkinsert.codeplex.com/
You could also write your own extension method on DbContext, if you use EF, that does this based on record count: anything under 5000 rows uses Save(), anything over that invokes your own bulk insert logic (a sketch follows below).
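A rough sketch of what such an extension method could look like (EF 6 style; BulkInsert below stands in for whichever bulk mechanism you wire up, e.g. the EFBulkInsert extension, and is assumed rather than shown here):

using System.Collections.Generic;
using System.Data.Entity;
// plus the using for your bulk insert library, e.g. EntityFramework.BulkInsert.Extensions (assumed)

public static class DbContextExtensions
{
    // Hypothetical helper: SaveChanges for small batches, bulk insert for large ones.
    public static void SaveSmart<T>(this DbContext context, List<T> rows) where T : class
    {
        const int bulkThreshold = 5000;

        if (rows.Count < bulkThreshold)
        {
            context.Set<T>().AddRange(rows);
            context.SaveChanges();
        }
        else
        {
            context.BulkInsert(rows);   // extension from the EFBulkInsert library (assumed)
        }
    }
}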
I am doing a loop insert as seen below (Method A); it seems that calling the database on every single loop iteration isn't a good idea. An alternative I found is to loop over a comma-delimited string in my SProc instead to do the inserts, so there is only one call to the DB. Will there be any significant improvement in terms of performance?
Method A:
foreach (DataRow row in dt.Rows)
{
userBll = new UserBLL();
UserId = (Guid)row["UserId"];
// Call userBll method to insert into SQL Server with UserId as one of the parameter.
}
Method B:
string UserIds = "Tom, Jerry, 007"; // Assuming we already concatenate the strings. So no loops this time here.
userBll = new UserBLL();
// Call userBll method to insert into SQL Server with 'UserIds' as parameter.
Method B SProc / Perform a loop insert in the SProc.
if right(rtrim(@UserIds), 1) <> ','
    SELECT @string = @UserIds + ','
else
    SELECT @string = @UserIds

SELECT @pos = patindex('%,%', @string)

while @pos <> 0
begin
    SELECT @piece = left(@string, @pos - 1)
    -- Perform the insert here
    SELECT @string = stuff(@string, 1, @pos, '')
    SELECT @pos = patindex('%,%', @string)
end
Fewer queries usually mean faster processing. That said, a co-worker of mine had some success with .NET Framework's wrapper of the TSQL BULK INSERT, which is provided by the Framework as SqlBulkCopy.
This MSDN blog entry shows how to use it.
The main "API" sample is this (taken from the linked article as-is; it writes the contents of a DataTable to SQL):
private void WriteToDatabase()
{
// get your connection string
string connString = "";
// connect to SQL
using (SqlConnection connection =
new SqlConnection(connString))
{
// make sure to enable triggers
// more on triggers in next post
SqlBulkCopy bulkCopy =
new SqlBulkCopy
(
connection,
SqlBulkCopyOptions.TableLock |
SqlBulkCopyOptions.FireTriggers |
SqlBulkCopyOptions.UseInternalTransaction,
null
);
// set the destination table name
bulkCopy.DestinationTableName = this.tableName;
connection.Open();
// write the data in the "dataTable"
bulkCopy.WriteToServer(dataTable);
connection.Close();
}
// reset
this.dataTable.Clear();
this.recordCount = 0;
}
The linked article explains what needs to be done to leverage this mechanism.
In my experience, there are three things you don't want to have to do for each record:
Open/close a sql connection per row. This concern is handled by ADO.NET connection pooling. You shouldn't have to worry about it unless you have disabled the pooling.
Database roundtrip per row. This tends to be less about the network bandwidth or network latency and more about the client side thread sleeping. You want a substantial amount of work on the client side each time it wakes up or you are wasting your time slice.
Open/close the sql transaction log per row. Opening and closing the log is not free, but you don't want to hold it open too long either. Do many inserts in a single transaction, but not too many.
On any of these, you'll probably see a lot of improvement going from 1 row per request to 10 rows per request. You can achieve this by building up 10 insert statements on the client side before transmitting the batch.
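For example, a sketch of batching ten parameterized inserts into one round trip (the table and column names are placeholders):

// One SqlCommand containing N INSERT statements, one round trip for the whole batch.
var sql = new StringBuilder();
using (var cmd = new SqlCommand())
{
    int i = 0;
    foreach (Guid userId in batchOfTen)                            // e.g. 10 rows per round trip
    {
        sql.AppendFormat("INSERT INTO dbo.Users (UserId) VALUES (@p{0});", i);
        cmd.Parameters.Add("@p" + i, SqlDbType.UniqueIdentifier).Value = userId;
        i++;
    }
    cmd.CommandText = sql.ToString();
    cmd.Connection = connection;                                   // an already-open SqlConnection
    cmd.ExecuteNonQuery();                                         // one round trip for the whole batch
}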
Your approach of sending a list into a proc has been written about in extreme depth by Sommarskog.
If you are looking for better insert performance with multiple input values of a given type, I would recommend you look at table valued parameters.
And a sample can be found here, showing some example code that uses them.
You can use bulk insert functionality for this.
See this blog for details: http://blogs.msdn.com/b/nikhilsi/archive/2008/06/11/bulk-insert-into-sql-from-c-app.aspx
We are currently developing an application that generates upwards of 5-10,000 rows of data in a particular table for each user session. Currently we are using SQL text commands to insert each row of data one at a time, so a save operation can take up to a minute. We are playing around with the use of SqlBulkInserts and have seen the time go down to less than 500 ms. Does anyone have any objection to the use of SqlBulkInserts in a production application where many users will be using the system?
I have never run into an issue with SqlBulkCopy blocking another user because the TableLock option was set. The TableLock option increases the efficiency of the insert, from what many people have reported and from what plain use of it has shown me.
My typical method:
public void Bulk(String connectionString, DataTable data, String destinationTable)
{
using (SqlConnection connection = new SqlConnection(connectionString))
{
using (SqlBulkCopy bulkCopy =
new SqlBulkCopy
(
connection,
SqlBulkCopyOptions.TableLock |
SqlBulkCopyOptions.FireTriggers |
SqlBulkCopyOptions.UseInternalTransaction,
null
))
{
bulkCopy.BatchSize = data.Rows.Count;
bulkCopy.DestinationTableName = String.Format("[{0}]", destinationTable);
connection.Open();
bulkCopy.WriteToServer(data);
}
}
}
Before implementing SqlBulkInsert, try creating your INSERT query dynamically to look like this:
insert into MyTable (Column1, Column2)
select 123, 'abc'
union all
select 124, 'def'
union all
select 125, 'yyy'
union all
select 126, 'zzz'
This will be only one database call, which should run much more quickly. For the SQL string concatenation, make sure you use the StringBuilder class.
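A sketch of building that statement from C# (the table, columns and row type are placeholders):

// Build the UNION ALL insert with StringBuilder rather than repeated string concatenation.
var sql = new StringBuilder("insert into MyTable (Column1, Column2) ");
bool first = true;
foreach (var row in rows)                        // rows: your in-memory data
{
    if (!first) sql.Append(" union all ");
    first = false;
    // For real code, prefer parameters over inlined values; the escaping here is only a minimal guard.
    sql.AppendFormat("select {0}, '{1}'", row.Id, row.Name.Replace("'", "''"));
}
using (var cmd = new SqlCommand(sql.ToString(), connection))       // an open SqlConnection
{
    cmd.ExecuteNonQuery();                       // one database call for the whole batch
}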
I think it's the right way to go, if your application really needs to produce that many records per session.
I am inserting 8500 rows into a SQLite database. It takes more than 30 seconds on a Core 2 Duo.
It uses 70% of the CPU during this time, so the problem is CPU usage.
I am using a transaction.
I create the database, tables and inserts on the fly in a temp file, so I don't need to worry about corruption, etc.
I just tried to use this, but it doesn't help:
PRAGMA journal_mode = OFF;
PRAGMA synchronous = OFF;
What more can I do?
If I run the same script in the Firefox SQLite Manager plugin, it runs instantly.
I ran the profiler. Almost all the time is spent in:
27 sec System.Data.SQLite.SQLite3.Prepare(SQLiteConnection, String, SQLiteStatement, UInt32, String&)
This method calls these three methods:
12 sec System.Data.SQLite.SQLiteConvert.UTF8ToString(IntPtr, Int32)
9 sec System.Data.SQLite.SQLiteConvert.ToUTF8(String)
4 sec System.Data.SQLite.UnsafeNativeMethods.sqlite3_prepare_interop(IntPtr, IntPtr, Int32, IntPtr&, IntPtr&, Int32&)
You asked me to show the insert. Here it is:
INSERT INTO [Criterio] ([cd1],[cd2],[cd3],[dc4],[dc5],[dt6],[dc7],[dt8],[dt9],[dt10],[dt11])VALUES('FFFFFFFF-FFFF-FFFF-FFFF-B897A4DE6949',10,20,'',NULL,NULL,'',NULL,julianday('2011-11-25 17:00:00'),NULL,NULL);
Table:
CREATE TABLE Criterio (
cd1 integer NOT NULL,
cd2 text NOT NULL,
dc3 text NOT NULL,
cd4 integer NOT NULL,
dt5 DATE NOT NULL DEFAULT CURRENT_TIMESTAMP,
dt6 DATE NULL,
dt7 DATE NULL,
dc8 TEXT NULL,
dt9 datetime NULL,
dc10 TEXT NULL,
dt11 datetime NULL,
PRIMARY KEY (cd2 ASC, cd1 ASC)
);
C# Code:
scriptSql = System.IO.File.ReadAllText(@"C:\Users\Me\Desktop\ScriptToTest.txt");
using (DbCommand comando = Banco.GetSqlStringCommand(scriptSql))
{
try
{
using (TransactionScope transacao = new TransactionScope())
{
Banco.ExecuteNonQuery(comando);
transacao.Complete();
}
}
catch (Exception ex)
{
Logging.ErroBanco(comando, ex);
throw;
}
}
I don't know why pst deleted his answer, so I'll re-post the same information from it, as this appears to be the correct answer.
According to the SQLite FAQ - INSERT is really slow - I can only do few dozen INSERTs per second
Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second
...
By default, each INSERT statement is its own transaction. But if you surround multiple INSERT statements with BEGIN...COMMIT then all the inserts are grouped into a single transaction.
So basically you need to group your INSERTs into fewer transactions.
Update: So the problem is probably mostly due to the sheer size of the SQL script - SQLite needs to parse the entire script before it can execute it, but the parser is designed to parse small scripts, not massive ones! This is why you are seeing so much time spent in the SQLite3.Prepare method.
Instead, you should use a parameterized query and insert records in a loop in your C# code. For example, if your data were in CSV format in your text file, you could use something like this:
using (TransactionScope txn = new TransactionScope())
{
    using (DbCommand cmd = Banco.GetSqlStringCommand(sql))
    {
        string line = null;
        while ((line = reader.ReadLine()) != null)
        {
            // Set the parameters for the command at this point based on the current line
            Banco.ExecuteNonQuery(cmd);
        }
    }
    // Complete the transaction once, after all rows have been inserted
    txn.Complete();
}
Have you tried a parameterized insert? In my experience, transactions will improve speed quite a bit, but parameterized queries have a much bigger impact.
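For reference, a minimal sketch of a parameterized insert with System.Data.SQLite (the column list is shortened and the row source is a placeholder):

using System.Data;
using System.Data.SQLite;

using (SQLiteConnection conn = new SQLiteConnection(connectionString))
{
    conn.Open();
    using (SQLiteTransaction txn = conn.BeginTransaction())
    using (SQLiteCommand cmd = new SQLiteCommand(
        "INSERT INTO Criterio (cd1, cd2) VALUES (@cd1, @cd2)", conn, txn))
    {
        SQLiteParameter p1 = cmd.Parameters.Add("@cd1", DbType.Int32);
        SQLiteParameter p2 = cmd.Parameters.Add("@cd2", DbType.String);

        foreach (var row in rowsToInsert)        // your parsed data (placeholder)
        {
            p1.Value = row.Cd1;
            p2.Value = row.Cd2;
            cmd.ExecuteNonQuery();               // the statement is prepared once, executed many times
        }
        txn.Commit();
    }
}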
I am working on moving a database from MS Access to SQL Server. To move the data into the new tables, I have decided to write a sync routine, since the schema has changed quite significantly; it lets me run tests on programs that use it and resync whenever I need new test data. Then, eventually, I will do one last sync and go live on the new SQL Server version.
Unfortunately I have hit a snag. My method for copying from Access to SQL Server is below:
public static void BulkCopyAccessToSQLServer
(string sql, CommandType commandType, DBConnection sqlServerConnection,
string destinationTable, DBConnection accessConnection, int timeout)
{
using (DataTable dt = new DataTable())
using (OleDbConnection conn = new OleDbConnection(GetConnection(accessConnection)))
using (OleDbCommand cmd = new OleDbCommand(sql, conn))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
{
cmd.CommandType = commandType;
cmd.Connection.Open();
adapter.SelectCommand.CommandTimeout = timeout;
adapter.Fill(dt);
using (SqlConnection conn2 = new SqlConnection(GetConnection(sqlServerConnection)))
using (SqlBulkCopy copy = new SqlBulkCopy(conn2))
{
conn2.Open();
copy.DestinationTableName = destinationTable;
copy.BatchSize = 1000;
copy.BulkCopyTimeout = timeout;
copy.WriteToServer(dt);
copy.NotifyAfter = 1000;
}
}
}
Basically this queries Access for the data using the input SQL string, which has all the correct field names, so I don't need to set ColumnMappings.
This was working until I reached a table with a calculated field. SqlBulkCopy doesn't seem to know to skip the field and tries to update the column, which fails with the error "The column 'columnName' cannot be modified because it is either a computed column or is the result of a union operator."
Is there an easy way to make it skip the calculated field?
I am hoping not to have to specify a full column mapping.
There are two ways to dodge this:
use the ColumnMappings to formally define the column relationship (you note you don't want this)
push the data into a staging table - a basic table, not part of your core transactional tables, whose entire purpose is to look exactly like this data import; then use a TSQL command to transfer the data from the staging table to the real table (a sketch follows at the end of this answer)
I always favor the second option, for various reasons:
I never have to mess with mappings - this is actually important to me ;p
the insert to the real table will be fully logged (SqlBulkCopy is not necessarily logged)
I have the fastest possible insert - no constraint checking, no indexing, etc
I don't tie up a transactional table during the import, and there is no risk of non-repeatable queries running against a partially imported table
I have a safe abort option if the import fails half way through, without having to use transactions (nothing has touched the transactional system at this point)
it allows some level of data-processing when pushing it into the real tables, without the need to either buffer everything in a DataTable at the app tier, or implement a custom IDataReader
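A sketch of the second option, reusing the names from the question (the staging table name and the column list in the transfer statement are illustrative):

// 1) Bulk copy the Access data into a plain staging table that mirrors the DataTable exactly
//    (no computed columns), then 2) move it into the real table with T-SQL.
using (SqlConnection conn = new SqlConnection(GetConnection(sqlServerConnection)))
{
    conn.Open();

    using (SqlBulkCopy copy = new SqlBulkCopy(conn))
    {
        copy.DestinationTableName = "dbo.MyTable_Staging";   // hypothetical staging table
        copy.BulkCopyTimeout = timeout;
        copy.WriteToServer(dt);
    }

    using (SqlCommand cmd = new SqlCommand(
        @"INSERT INTO dbo.MyTable (Col1, Col2)
          SELECT Col1, Col2 FROM dbo.MyTable_Staging;", conn))   // list only the writable columns
    {
        cmd.ExecuteNonQuery();
    }
}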