EF 6.0 SaveAll() very slow - c#

I'm saving my POCO objects with EF using this method:
public void SaveAll(IList<CoreEntity> entitaCoreList)
{
    bool result = false;
    using (var context = new NSTEntities())
    {
        //context.Configuration.AutoDetectChangesEnabled = false;
        foreach (var tracciaVeicolo in entitaCoreList)
        {
            TRACCIAVEICOLO_T500 tracciamentoVeicoliEF = new TRACCIAVEICOLO_T500();
            tracciamentoVeicoliEF.C_IDTRACCIAMENTOVEICOLO = tracciaVeicolo.Id;
            CultureInfo ci = CultureInfo.CreateSpecificCulture("en-US");
            tracciamentoVeicoliEF.Z_COORD = System.Data.Spatial.DbGeography.PointFromText(
                "POINT (" + tracciaVeicolo.Longitudine.ToString(ci) + " " + tracciaVeicolo.Latitudine.ToString(ci) + ")", 4326);
            tracciamentoVeicoliEF.D_DATARILEVAZIONE = tracciaVeicolo.DataRilevazione;
            tracciamentoVeicoliEF.C_CODICEWEBFLEET = tracciaVeicolo.CodiceVeicoloWebfleet;
            tracciamentoVeicoliEF.S_POSITIONSTRING = tracciaVeicolo.posString;
            tracciamentoVeicoliEF.P_TIPOMESSAGGIO = (int)tracciaVeicolo.TipoMessaggio;
            tracciamentoVeicoliEF.V_VELOCITA = tracciaVeicolo.Velocita;
            tracciamentoVeicoliEF.V_DIREZIONE = tracciaVeicolo.Direzione;
            tracciamentoVeicoliEF.S_GPSSTATUS = tracciaVeicolo.GpsStatus;
            context.TRACCIAVEICOLO_T500.Add(tracciamentoVeicoliEF);
        }
        context.SaveChanges();
    }
}
But it's very slow: it takes nearly 25 seconds for 1,000 records.
I tried using a raw query like this:
public void SaveRaw(List<TracciaVeicolo> v)
{
    StringBuilder query = new StringBuilder();
    query.Append(@"INSERT INTO [dbo].[TRACCIAMENTOVEICOLI_T500]([Z_COORD],[C_CODICEWEBFLEET],[D_DATARILEVAZIONE],[S_POSITIONSTRING],[P_TIPOMESSAGGIO],[V_VELOCITA],[V_DIREZIONE],[S_GPSSTATUS]) VALUES ");
    bool first = true;
    foreach (var o in v)
    {
        if (!first)
        {
            query.Append(",");
        }
        query.AppendFormat("(geography::Point(47.65100, -122.34900, 4326),'{0}','{1}','{2}',{3},{4},{5},'{6}')"
            , o.CodiceVeicoloWebfleet
            , o.DataRilevazione.ToString("yyyy-dd-MM HH:mm:ss")
            , o.posString
            , (int)o.TipoMessaggio
            , o.Velocita
            , o.Direzione
            , o.GpsStatus);
        first = false;
    }
    using (var context = new NSTEntities())
    {
        context.Database.ExecuteSqlCommand(query.ToString());
    }
}
And it takes 5 seconds. Am I using EF wrong? I've also tried setting context.Configuration.AutoDetectChangesEnabled = false; (as you can see in the first code snippet), but it doesn't change anything.
The query EF runs looks like this:
declare @p3 sys.geography
set @p3=convert(sys.geography,0xE6100000010CE9297288B82F44404DF4F92823263240)
exec sp_executesql N'insert [dbo].[TRACCIAMENTOVEICOLI_T500]([Z_COORD], [C_CODICEWEBFLEET], [D_DATARILEVAZIONE], [S_POSITIONSTRING], [P_TIPOMESSAGGIO], [V_VELOCITA], [V_DIREZIONE], [S_GPSSTATUS])
values (@0, @1, @2, @3, @4, @5, @6, @7)
select [C_IDTRACCIAMENTOVEICOLO]
from [dbo].[TRACCIAMENTOVEICOLI_T500]
where @@ROWCOUNT > 0 and [C_IDTRACCIAMENTOVEICOLO] = scope_identity()',N'@0 [geography],@1 nvarchar(20),@2 datetime2(7),@3 nvarchar(256),@4 int,@5 float,@6 int,@7 char(1)',@0=@p3,@1=N'1-83645-666EE1173',@2='2016-02-29 15:34:57',@3=N'Vicino a Lecce, 1a Lecce Centro-1B ',@4=0,@5=8,3333333333333339,@6=50,@7='A'

You can try combining several operations into one transaction. This saves a lot of the time that otherwise goes into network latency when you perform single operations.
using (var context = new NSTEntities())
{
    using (var dbContextTransaction = context.Database.BeginTransaction())
    {
        try
        {
            [... foreach ... tableSet.Add(...) ...]
            context.SaveChanges();
            dbContextTransaction.Commit();
        }
        catch (Exception exception)
        {
            dbContextTransaction.Rollback();
            // Log the exception (never ignore it)
        }
    }
}
You can also log the SQL operations to see what is happening, for example with:
context.Database.Log = s => Debug.WriteLine(s);

As you noticed, Entity Framework's SaveChanges is very slow here, since a database round trip is made for every change.
There are some best practices you can apply, such as using AddRange instead of Add, but in the end you will still have a performance issue, because 1,000 database round trips are performed.
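For reference, a minimal sketch of the AddRange approach, assuming the DbSet is exposed as context.TRACCIAVEICOLO_T500 and that MapToTracciamentoVeicoli is a hypothetical helper doing the same property mapping as the loop in the question:
using (var context = new NSTEntities())
{
    // Turning off change tracking avoids the per-entity DetectChanges cost while adding.
    context.Configuration.AutoDetectChangesEnabled = false;

    // MapToTracciamentoVeicoli is assumed; it maps a CoreEntity to a TRACCIAVEICOLO_T500 row.
    var rows = entitaCoreList.Select(MapToTracciamentoVeicoli).ToList();
    context.TRACCIAVEICOLO_T500.AddRange(rows);

    context.SaveChanges(); // EF6 still issues one INSERT round trip per row here
}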
Disclaimer: I'm the owner of the project Entity Framework Extensions
One way to dramatically improve performance is to use a third-party library that supports bulk operations.
public void SaveAll(IList<CoreEntity> entitaCoreList)
{
    bool result = false;
    using (var context = new NSTEntities())
    {
        // ... code ...
        context.BulkSaveChanges();
    }
}

Related

Can I run code in parallel that is using Entity Framework to insert data?

I've seen a lot of posts about this, but I'm still not quite sure what the answer is. I inherited some code that handles items one by one. It does a bunch of work (it's somewhat CPU intensive), then needs to save data to the database and update the status of the item. It uses EF.
Here is the existing method:
private void CheckForThingsToDo(IContainer container)
{
    var finished = false;
    while (!finished)
    {
        using (var scope = container.BeginLifetimeScope("Console Request"))
        {
            var queries = scope.Resolve<IQueries>();
            var commands = scope.Resolve<ICommands>();
            var context = scope.Resolve<IContext>();
            var nextItem = queries.GetNextItem();
            finished = (nextItem == null);
            if (finished) continue;
            using (var transaction = context.BeginTransaction())
            {
                if (nextItem == null) return;
                try
                {
                    commands.ProcessItem(nextItem.Id); // this is somewhat CPU intensive
                    transaction.Commit();
                }
                catch (Exception ex)
                {
                    _logger.Error(ex.Message, ex);
                }
            }
        }
    }
}
I would like to be able to run these in parallel because it's maxing out one core and the rest of the server is sitting there.
private void CheckForThingsToDoParallel(IContainer container)
{
    using (var scope = container.BeginLifetimeScope("Console Request"))
    {
        var context0 = new EntityFrameworkRecordkeepingContext();
        var accountQueries = new AccountTransactionQueries(context0, mapper);
        var items = accountQueries.GetItems();
        Parallel.ForEach(items,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            () => new EntityFrameworkRecordkeepingContext(),
            (item, parallelLoopState, context) =>
            {
                try
                {
                    var queries = new Queries(context);
                    var repo = new Repository(context);
                    var commands = new Commands(repo, queries);
                    using (var transaction = context.BeginTransaction())
                    {
                        commands.ProcessItem(item);
                        transaction.Commit();
                    }
                }
                catch (Exception ex)
                {
                    _logger.Error(ex.Message, ex);
                }
                return context;
            },
            (context) =>
            {
                context.Dispose();
            });
    }
}
Should something like that work? It works some of the time. But I end up getting this error:
DbUpdateConcurrencyException: Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded
Each item is a separate thing, so any inserts done in relation to one item should not impact another set of inserts from a different item.
I feel like I might be missing something obvious. Some of the parallel code is me trying to figure stuff out. Any thoughts? I need a push in the right direction.

Multi Threaded C# Application - MongoDB misses records during Insert with no error

I'm working with the MongoDB C# driver version 2.0.1.27 and MongoDB version 3.0.
Our aim is to insert a huge number of documents into a MongoDB collection using the "Insert" method.
Our architecture calls this Add method multiple times on each thread.
Below is the Add method:
public bool Add(CallContext context, FileQueueEntity entity)
{
    bool bResult = false;
    // This logic is to prevent duplicate files.
    // Consider a new algorithm if supporting other file types.
    bResult = Delete(context, entity);
    if (context.ErrorList.Count == 0)
    {
        var server = GetMongoServer();
        try
        {
            var database = GetMongoDatabase(server);
            var collection = database.GetCollection<FileQueueEntity>("QueueCollection");
            entity.BaseMeta = null;
            entity.IsNew = false;
            collection.Insert(entity);
            context.AddToUpdatedList(entity);
        }
        catch (Exception ex)
        {
            bResult = false;
            context.AddError(ErrorSeverity.System, "DataAccess.AddFileQueue", GetThreadExceptionMessage(ex));
        }
        finally
        {
        }
    }
    return bResult;
}
Below is the GetMongoDatabase method:
private MongoDatabase GetMongoDatabase(MongoServer mongoServer)
{
    return mongoServer.GetDatabase(mConnectionBuilder.InitialCatalog);
}
Below is GetMongoServer:
private MongoServer GetMongoServer()
{
    System.Threading.Monitor.Enter(_lock);
    try
    {
        if (_mongoServer != null)
        {
            return _mongoServer;
        }
        DatabaseProviderFactory factory = new DatabaseProviderFactory();
        var aDatabase = factory.Create("ConnectionStringName");
        mConnectionBuilder = new SqlConnectionStringBuilder(aDatabase.ConnectionString);
        var credential = MongoCredential.CreateCredential(mConnectionBuilder.InitialCatalog, mConnectionBuilder.UserID, mConnectionBuilder.Password);
        MongoServerSettings databaseSettings = new MongoServerSettings();
        var connectionStrings = mConnectionBuilder.DataSource.Split(',');
        if (connectionStrings != null && connectionStrings.Count() > 1)
        {
            string ipAddress = connectionStrings[0];
            int portNumber = Convert.ToInt32(connectionStrings[1], CultureInfo.InvariantCulture);
            databaseSettings.Credentials = new[] { credential };
            databaseSettings.Server = new MongoServerAddress(ipAddress, portNumber);
        }
        _mongoServer = new MongoServer(databaseSettings);
        return _mongoServer;
    }
    finally
    {
        System.Threading.Monitor.Exit(_lock);
    }
}
and it is called like this:
foreach (var n in entities)
{
    Add(n);
}
The foreach loop is run separately for each instance.
The problem is that not all of the files are reaching the database; every time, random files are missing from the collection.
The entities we are sending are very light (hardly 400-500 bytes each).
The number of files will be 2,000-5,000 max, and they are cleared on a daily basis, so the maximum storage will not be exceeded.
For example:
Thread 1: 50 files - a random 48 files are inserted
Thread 2: 80 files - a random 75 files are inserted
Thread 3: 70 files - a random 60 files are inserted
Thread 4: 60 files - a random 59 files are inserted
Are we missing any MongoDB configuration? It is not throwing any exception and fails silently to insert the records, which is a bit strange.
The response we are getting during insert is:
Response: { "ok" : 1, "n" : NumberLong(0) }
We observe that it is random files from each thread that fail every time.
Can anyone help me with this? Are we missing any MongoDB configuration?
A couple of points for consideration:
IMongoCollection can be retrieved once and stored as a static/singleton; that is the recommended pattern.
The deletion is redundant. Use collection.ReplaceOne(e => e.Id == entity.Id, entity); instead.
Better yet, use a bulk replace with batches of about 50 to 100 entities in each iteration or thread (see the sketch below).
Try to update to the latest server and driver. Many good changes have occurred since v2.0.
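A minimal sketch of that batched bulk replace, assuming the newer IMongoCollection API (rather than the legacy MongoCollection used in the question) and that FileQueueEntity exposes an Id property used as its unique key:
// Hedged sketch: upsert a batch of entities in one round trip per batch with BulkWriteAsync.
// Requires the MongoDB.Driver, System.Linq and System.Threading.Tasks namespaces.
public static async Task BulkUpsertAsync(IMongoCollection<FileQueueEntity> collection,
                                         IReadOnlyList<FileQueueEntity> entities)
{
    const int batchSize = 100;
    for (int i = 0; i < entities.Count; i += batchSize)
    {
        var batch = entities.Skip(i).Take(batchSize);

        // One ReplaceOneModel per entity; IsUpsert makes the Delete-then-Insert step unnecessary.
        var models = batch.Select(e =>
            new ReplaceOneModel<FileQueueEntity>(
                Builders<FileQueueEntity>.Filter.Eq(x => x.Id, e.Id), e)
            { IsUpsert = true });

        // A single round trip per batch instead of one per document.
        await collection.BulkWriteAsync(models);
    }
}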

Using Entity Framework Transaction Correctly for Isolation

I'm using Entity Framework 6.0 and SQL Server 2016 for my ASP.NET website. I recently found a concurrency problem in one of my functions. The function processes unpaid orders, and sometimes it is executed multiple times for the same key at the same time (because multiple users access it together).
Here is what it looks like.
public void PaidOrder(string paymentCode)
{
    using (MyEntities db = new MyEntities())
    {
        using (DbContextTransaction trans = db.Database.BeginTransaction())
        {
            try
            {
                Order_Payment_Code payment = db.Order_Payment_Code.Where(item => item.PaymentCode == paymentCode).FirstOrDefault();
                if (payment.Status == PaymentStatus.NotPaid)
                {
                    //This scope can be executed multiple times
                    payment.Status = PaymentStatus.Paid;
                    db.Entry(payment).State = EntityState.Modified;
                    db.SaveChanges();
                    //Continue processing the order
                    trans.Commit();
                }
            }
            catch (Exception ex)
            {
                trans.Rollback();
            }
        }
    }
}
What I don't understand is why the scope inside my if statement can be executed multiple times even though it is inside a transaction. Isn't a transaction supposed to isolate the data? Or is my understanding of transactions wrong? If so, what is the correct way to make the scope inside my if statement execute only once?
A simple and reliable way to serialize an EF SQL Server transaction is to use an Application Lock.
Add this method to your DbContext:
public void GetAppLock(string lockName)
{
    var sql = "exec sp_getapplock @lockName, 'exclusive';";
    var pLockName = new SqlParameter("@lockName", SqlDbType.NVarChar, 255);
    pLockName.Value = lockName;
    this.Database.ExecuteSqlCommand(sql, pLockName);
}
And call it just after you start your transaction.
public void PaidOrder(string paymentCode)
{
    using (MyEntities db = new MyEntities())
    {
        using (DbContextTransaction trans = db.Database.BeginTransaction())
        {
            db.GetAppLock("PaidOrder");
            Order_Payment_Code payment = db.Order_Payment_Code.Where(item => item.PaymentCode == paymentCode).FirstOrDefault();
            if (payment.Status == PaymentStatus.NotPaid)
            {
                //This scope can be executed multiple times
                payment.Status = PaymentStatus.Paid;
                db.Entry(payment).State = EntityState.Modified;
                db.SaveChanges();
                //Continue processing the order
            }
            trans.Commit();
        }
    }
}
Then only one instance of that transaction can run at a time, even if you have multiple front-end servers. So this is like a Mutex that works across all the clients that access the same database.

CRM Dynamics 2013 How to Update Multiple Records From External Source Using ExecuteMultipleRequest

I have a scenario in CRM where I need to update multiple accounts' values (text fields and option sets) with values from an external SQL database table. How can I do this using ExecuteMultipleRequest? The requirement is to sync all the account data in CRM with our ERP data, which comes from a SQL table. This process needs to be automated, so I opted for a Windows service that runs daily and updates the accounts that are flagged for update in the external SQL table. I'm struggling to find the best approach for this. I tested the idea in a console application on DEV, and my solution code is below. My question is: how can I do this better using ExecuteMultipleRequest?
public static void UpdateAllCRMAccountsWithEmbraceAccountStatus(IOrganizationService service, CRM_Embrace_IntegrationEntities3 db)
{
    List<C_Tarsus_Account_Active_Seven__> crmAccountList = new List<C_Tarsus_Account_Active_Seven__>();
    //Here I get the list from the staging table
    var crmAccounts = db.C_Tarsus_Account_Active_Seven__.Select(x => x).ToList();
    foreach (var dbAccount in crmAccounts)
    {
        CRMDataObjectFour modelObject = new CRMDataObjectFour()
        {
            ID = dbAccount.ID,
            Account_No = dbAccount.Account_No,
            Account_Name = dbAccount.Account_Name,
            Account_Status = Int32.Parse(dbAccount.Account_Status.ToString()),
            Country = dbAccount.Country,
            Terms = dbAccount.Terms
        };
    }
    var officialDatabaseList = crmAccounts;
    //Here I query CRM
    foreach (var crmAcc in officialDatabaseList)
    {
        QueryExpression qe = new QueryExpression();
        qe.EntityName = "account";
        qe.ColumnSet = new ColumnSet("accountnumber", "new_embraceaccountstatus");
        qe.Criteria.AddCondition("statecode", ConditionOperator.Equal, 0);
        qe.Criteria.AddCondition("accountnumber", ConditionOperator.NotIn, "List of accounts for example");
        EntityCollection response = service.RetrieveMultiple(qe);
        //Here I update the option set value
        foreach (var acc in response.Entities)
        {
            if (acc.Attributes["accountnumber"].ToString() == crmAcc.Account_No)
            {
                if (acc.Contains("new_embraceaccountstatus"))
                {
                    continue;
                }
                else
                {
                    acc.Attributes["new_embraceaccountstatus"] = new OptionSetValue(Int32.Parse(crmAcc.Account_Status.ToString()));
                }
                service.Update(acc);
            }
        }
    }
}
I know this might not be the right approach; please advise me on how to use ExecuteMultipleRequest, or perhaps on a different solution altogether.
Here are some helper methods I've used before to handle this:
public static ExecuteMultipleRequest MultipleRequest { get; set; }
private const int BatchSize = 250;
public static long LastBatchTime { get; set; }

private static void Batch(IOrganizationService service, OrganizationRequest request)
{
    if (MultipleRequest.Requests.Count == BatchSize)
    {
        ExecuteBatch(service);
    }
    MultipleRequest.Requests.Add(request);
}

private static void ExecuteBatch(IOrganizationService service)
{
    if (!MultipleRequest.Requests.Any())
    {
        return;
    }
    Log("Executing Batch size {0}. Last Batch was executed in {1}", MultipleRequest.Requests.Count, LastBatchTime);
    var watch = new System.Diagnostics.Stopwatch();
    watch.Start();
    var response = (ExecuteMultipleResponse)service.Execute(MultipleRequest);
    watch.Stop();
    LastBatchTime = watch.ElapsedMilliseconds;
    Log("Completed Executing Batch in " + watch.ElapsedMilliseconds);
    WriteLogsToConsole();
    var errors = new List<string>();
    // Collect any faults returned in the responses.
    foreach (var responseItem in response.Responses)
    {
        if (responseItem.Fault != null)
        {
            errors.Add(string.Format(
                "Error: Execute Multiple Response Fault. Error Code: {0} Message {1} Trace Text: {2} Error Keys: {3} Error Values: {4} ",
                responseItem.Fault.ErrorCode,
                responseItem.Fault.Message,
                responseItem.Fault.TraceText,
                responseItem.Fault.ErrorDetails.Keys,
                responseItem.Fault.ErrorDetails.Values));
        }
    }
    MultipleRequest.Requests.Clear();
    if (errors.Any())
    {
        throw new Exception(string.Join(Environment.NewLine, errors));
    }
}
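These helpers assume MultipleRequest has already been created; that initialization is not shown above. A minimal sketch, using the standard ExecuteMultipleSettings from the SDK, might be:
// Hedged sketch: create the batch container once before calling Batch()/ExecuteBatch().
MultipleRequest = new ExecuteMultipleRequest
{
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = true,   // keep processing remaining requests if one fails
        ReturnResponses = true    // needed so the fault-collecting loop above has data to inspect
    },
    Requests = new OrganizationRequestCollection()
};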
You can then call this from your normal logic like so:
public static void UpdateAllCRMAccountsWithEmbraceAccountStatus(IOrganizationService service, CRM_Embrace_IntegrationEntities3 db)
{
    List<C_Tarsus_Account_Active_Seven__> crmAccountList = new List<C_Tarsus_Account_Active_Seven__>();
    //Here I get the list from the staging table
    var crmAccounts = db.C_Tarsus_Account_Active_Seven__.Select(x => x).ToList();
    foreach (var dbAccount in crmAccounts)
    {
        CRMDataObjectFour modelObject = new CRMDataObjectFour()
        {
            ID = dbAccount.ID,
            Account_No = dbAccount.Account_No,
            Account_Name = dbAccount.Account_Name,
            Account_Status = Int32.Parse(dbAccount.Account_Status.ToString()),
            Country = dbAccount.Country,
            Terms = dbAccount.Terms
        };
    }
    var officialDatabaseList = crmAccounts;
    //Here I query CRM
    foreach (var crmAcc in officialDatabaseList)
    {
        QueryExpression qe = new QueryExpression();
        qe.EntityName = "account";
        qe.ColumnSet = new ColumnSet("accountnumber", "new_embraceaccountstatus");
        qe.Criteria.AddCondition("statecode", ConditionOperator.Equal, 0);
        qe.Criteria.AddCondition("accountnumber", ConditionOperator.NotIn, "List of accounts for example");
        EntityCollection response = service.RetrieveMultiple(qe);
        //Here I update the option set value
        foreach (var acc in response.Entities)
        {
            if (acc.Attributes["accountnumber"].ToString() == crmAcc.Account_No)
            {
                if (acc.Contains("new_embraceaccountstatus"))
                {
                    continue;
                }
                else
                {
                    acc.Attributes["new_embraceaccountstatus"] = new OptionSetValue(Int32.Parse(crmAcc.Account_Status.ToString()));
                }
                Batch(service, new UpdateRequest { Target = acc });
            }
        }
    }
    // Call ExecuteBatch to ensure that any remaining batched requests get executed.
    ExecuteBatch(service);
}
Because it is CRM 2013 and you need to sync records, you'll need to know whether some of the records already exist in CRM, because depending on that you'll need to send a bunch of Creates or Updates. I would do it in two ExecuteMultiple batches:
1) One batch executes a query to find which accounts need to be created or updated in CRM, depending on some matching field.
2) Another batch uses the responses from 1) to generate all the Create/Update operations in one go.
The issue is that they won't run in the same transaction; that's something which was improved in 2016, as @Daryl said. There is also a new request type in 2016 which might improve things even further, because you could merge the two batches into one: Upsert, thereby avoiding unnecessary round trips to the server.
Maybe this was inspired by MongoDB's upsert concept, which existed a long time before? Who knows :)
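For illustration, a hedged sketch of what that Upsert could look like on CRM 2016 or later; this assumes an alternate key defined on accountnumber, which is not part of the original code:
// Hypothetical sketch, not part of the original solution; requires Microsoft.Xrm.Sdk,
// Microsoft.Xrm.Sdk.Messages, and an alternate key configured on accountnumber.
var account = new Entity("account", "accountnumber", crmAcc.Account_No);
account["new_embraceaccountstatus"] = new OptionSetValue(Int32.Parse(crmAcc.Account_Status.ToString()));

var result = (UpsertResponse)service.Execute(new UpsertRequest { Target = account });
// result.RecordCreated tells you whether the account was created or merely updated.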
If you just need to know how to perform an ExecuteMultipleRequest, there are samples on MSDN: Sample: Execute multiple requests.

SqlBulkCopy Multiple Tables Insert under single Transaction OR Bulk Insert Operation between Entity Framework and Classic Ado.net

I have two tables which need to be inserted into when my application runs.
Let's say that I have tables as followed
tbl_FirstTable and tbl_SecondTable
My problem is data volume.
I need to insert over 10,000 rows to tbl_FirstTable and over 500,000 rows to tbl_SecondTable.
So firstly, I used Entity Framework as follows.
public bool Save_tbl_FirstTable_Vs_tbl_SecondTable(List<tbl_FirstTable> List_tbl_FirstTable, List<tbl_SecondTable> List_tbl_SecondTable)
{
    bool IsSuccessSave = false;
    try
    {
        using (DummyDBClass_ObjectContext _DummyDBClass_ObjectContext = new DummyDBClass_ObjectContext())
        {
            foreach (tbl_FirstTable _tbl_FirstTable in List_tbl_FirstTable)
            {
                _DummyDBClass_ObjectContext.tbl_FirstTable.InsertOnSubmit(_tbl_FirstTable);
            }
            foreach (tbl_SecondTable _tbl_SecondTable in List_tbl_SecondTable)
            {
                _DummyDBClass_ObjectContext.tbl_SecondTable.InsertOnSubmit(_tbl_SecondTable);
            }
            _DummyDBClass_ObjectContext.SubmitChanges();
            IsSuccessSave = true;
        }
    }
    catch (Exception ex)
    {
        Log4NetWrapper.WriteError(string.Format("{0} : {1} : Exception={2}",
            this.GetType().FullName,
            (new StackTrace(new StackFrame(0))).GetFrame(0).GetMethod().Name.ToString(),
            ex.Message.ToString()));
        if (ex.InnerException != null)
        {
            Log4NetWrapper.WriteError(string.Format("{0} : {1} : InnerException Exception={2}",
                this.GetType().FullName,
                (new StackTrace(new StackFrame(0))).GetFrame(0).GetMethod().Name.ToString(),
                ex.InnerException.Message.ToString()));
        }
    }
    return IsSuccessSave;
}
That is where I faced a Timeout exception.
I thought that exception would be solved if I used the code below.
DummyDBClass_ObjectContext.CommandTimeout = 1800; // 30 minutes
So I used it. That solved the timeout, but then I faced another error: an OutOfMemory exception.
So I searched for solutions and, fortunately, found the articles below.
Problem with Bulk insert using Entity Framework
Using Transactions with SqlBulkCopy
Performing a Bulk Copy Operation in a Transaction
Following those articles, I changed my code from Entity Framework to classic ADO.NET code.
public bool Save_tbl_FirstTable_Vs_tbl_SecondTable(DataTable DT_tbl_FirstTable, DataTable DT_tbl_SecondTable)
{
    bool IsSuccessSave = false;
    SqlTransaction transaction = null;
    try
    {
        using (DummyDBClass_ObjectContext _DummyDBClass_ObjectContext = new DummyDBClass_ObjectContext())
        {
            var connectionString = ((EntityConnection)_DummyDBClass_ObjectContext.Connection).StoreConnection.ConnectionString;
            using (SqlConnection connection = new SqlConnection(connectionString))
            {
                connection.Open();
                using (transaction = connection.BeginTransaction())
                {
                    using (SqlBulkCopy bulkCopy_tbl_FirstTable = new SqlBulkCopy(connection, SqlBulkCopyOptions.KeepIdentity, transaction))
                    {
                        bulkCopy_tbl_FirstTable.BatchSize = 5000;
                        bulkCopy_tbl_FirstTable.DestinationTableName = "dbo.tbl_FirstTable";
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("ID", "ID");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("UploadFileID", "UploadFileID");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("Active", "Active");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("CreatedUserID", "CreatedUserID");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("CreatedDate", "CreatedDate");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("UpdatedUserID", "UpdatedUserID");
                        bulkCopy_tbl_FirstTable.ColumnMappings.Add("UpdatedDate", "UpdatedDate");
                        bulkCopy_tbl_FirstTable.WriteToServer(DT_tbl_FirstTable);
                    }
                    using (SqlBulkCopy bulkCopy_tbl_SecondTable = new SqlBulkCopy(connection, SqlBulkCopyOptions.KeepIdentity, transaction))
                    {
                        bulkCopy_tbl_SecondTable.BatchSize = 5000;
                        bulkCopy_tbl_SecondTable.DestinationTableName = "dbo.tbl_SecondTable";
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("ID", "ID");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("UploadFileDetailID", "UploadFileDetailID");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("CompaignFieldMasterID", "CompaignFieldMasterID");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("Value", "Value");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("Active", "Active");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("CreatedUserID", "CreatedUserID");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("CreatedDate", "CreatedDate");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("UpdatedUserID", "UpdatedUserID");
                        bulkCopy_tbl_SecondTable.ColumnMappings.Add("UpdatedDate", "UpdatedDate");
                        bulkCopy_tbl_SecondTable.WriteToServer(DT_tbl_SecondTable);
                    }
                    transaction.Commit();
                    IsSuccessSave = true;
                }
                connection.Close();
            }
        }
    }
    catch (Exception ex)
    {
        if (transaction != null)
            transaction.Rollback();
        Log4NetWrapper.WriteError(string.Format("{0} : {1} : Exception={2}",
            this.GetType().FullName,
            (new StackTrace(new StackFrame(0))).GetFrame(0).GetMethod().Name.ToString(),
            ex.Message.ToString()));
        if (ex.InnerException != null)
        {
            Log4NetWrapper.WriteError(string.Format("{0} : {1} : InnerException Exception={2}",
                this.GetType().FullName,
                (new StackTrace(new StackFrame(0))).GetFrame(0).GetMethod().Name.ToString(),
                ex.InnerException.Message.ToString()));
        }
    }
    return IsSuccessSave;
}
Finally, it performs the insert process in less than 15 seconds for over 500,000 rows.
There are two reasons why I post this scenario:
I would like to share what I found out.
As I am not perfect, I still need to get more suggestions from you.
So, every better solution will be appreciated.
1) Use EF6.x, which has much better performance than EF5.x.
Here are more suggestions (from Bulk insert with EF):
2) Keep the active context graph small by using a new context for each unit of work.
3) Turn off AutoDetectChangesEnabled - context.Configuration.AutoDetectChangesEnabled = false;
4) Batch: in your loop, call SaveChanges periodically.
I use a paid Entity Framework extension from ZZZ Projects, which is developer-friendly because of its fluent API (extension methods, functional approach). This is not an advertisement; I have used it in business projects for several years and it is great. If you want to use something for free and you have an Oracle database, the Oracle Managed Data Access package (Oracle.ManagedDataAccess.Core) has an implementation of bulk operations.
Bulk operations are not really what ORMs are meant for. For bulk insert operations, I send XML to a stored procedure, shred it there, and bulk insert/update or merge from there.
So even when I use an ORM, I create a domain library that is not EF (or NHibernate) dependent, so I have a "safety valve" to bypass the ORM in certain situations.
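As a rough illustration of that XML-to-stored-procedure approach, a hedged sketch might look like the following; the procedure name usp_BulkMergeFirstTable, the RowDto type, and the XML shape are all hypothetical, and the shredding and MERGE would live inside the procedure:
// Hedged sketch, not the poster's actual implementation. Requires System.Collections.Generic,
// System.Data, System.Data.SqlClient, System.Data.SqlTypes, System.Linq and System.Xml.Linq.
public static void BulkMergeViaXml(string connectionString, IEnumerable<RowDto> rows)
{
    // Build one XML document describing the whole batch.
    var xml = new XElement("rows",
        rows.Select(r => new XElement("row",
            new XAttribute("Id", r.Id),
            new XAttribute("Value", r.Value))));

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.usp_BulkMergeFirstTable", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@rows", SqlDbType.Xml).Value = new SqlXml(xml.CreateReader());
        conn.Open();
        cmd.ExecuteNonQuery(); // the procedure shreds @rows and performs the insert/update/merge
    }
}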
You should look at using System.Data.SqlClient.SqlBulkCopy for this. Here's the documentation: http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx, and of course there are plenty of tutorials online.
If we want EF to bulk insert records, I would suggest the points below to improve performance:
Call SaveChanges() after, for example, 100 records, then dispose the context and create a new one.
Disable change detection.
Example:
using (TransactionScope scope = new TransactionScope())
{
    MyDbContext context = null;
    try
    {
        context = new MyDbContext();
        context.Configuration.AutoDetectChangesEnabled = false;
        int count = 0;
        foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
        {
            ++count;
            context = AddToContext(context, entityToInsert, count, 100, true);
        }
        context.SaveChanges();
    }
    finally
    {
        if (context != null)
            context.Dispose();
    }
    scope.Complete();
}

private MyDbContext AddToContext(MyDbContext context,
    Entity entity, int count, int commitCount, bool recreateContext)
{
    context.Set<Entity>().Add(entity);
    if (count % commitCount == 0)
    {
        context.SaveChanges();
        if (recreateContext)
        {
            context.Dispose();
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;
        }
    }
    return context;
}
For performance it is important to call SaveChanges() after "many" records ("many" being around 100 or 1,000). It also improves performance to dispose the context after SaveChanges and create a new one.
This clears the context of all entities; SaveChanges doesn't do that, as the entities remain attached to the context in the Unchanged state. It is the growing number of attached entities in the context that slows down insertion step by step.
So, it is helpful to clear the context after some time.
Also set AutoDetectChangesEnabled = false; on the DbContext.
It has a big additional performance effect: Why is inserting entities in EF 4.1 so slow compared to ObjectContext?.
The combination below increases speed well enough in EF:
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;
