C# TransactionScope with yield

My requirement is to process multiple cost files, each of which has millions of records. After processing and validating the records, I have to add them to the database.
For better performance I am using "yield" in a foreach loop to return one record at a time, process that record, and immediately add it to the database along with the file number. If I come across any data validation error while reading the file, I throw an InvalidRecordException.
In that case I need to delete all the records related to that file from the table. In short, even if one record is invalid I want to mark the whole file as invalid and not add a single record from that file to the database.
Can anyone help me with how to make use of TransactionScope here?
public class CostFiles
{
    public IEnumerable<string> FinancialRecords
    {
        get
        {
            // logic to get the list of DataRecords
            foreach (var dataRecord in DataRecords)
            {
                // some processing... which can throw an InvalidRecordException
                yield return dataRecord;
            }
            yield break;
        }
    }
}
public void ProcessFileRecords(CostFiles costFile, int ImportFileNumber)
{
    Database db = new Database();
    using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
    {
        try
        {
            foreach (var record in costFile.FinancialRecords)
            {
                db.Add(record, ImportFileNumber);
            }
        }
        catch (InvalidRecordException ex)
        {
            // here I want to delete all the records from the table
            // where the import file number matches the input parameter ImportFileNumber
        }
    }
}

The purpose of a transaction scope is to create an "all or nothing" scenario: either the whole transaction commits, or nothing commits at all. It looks like you already have the right idea (at least in terms of the TransactionScope). The scope won't actually commit the records to the database until you call TransactionScope.Complete(). If Complete() is not called, the records are discarded when you leave the transaction scope. You could easily do something like this:
using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
{
    bool errorsEncountered = false;
    try
    {
        foreach (var record in costFile.FinancialRecords)
        {
            db.Add(record, ImportFileNumber);
        }
    }
    catch (InvalidRecordException ex)
    {
        // no delete needed: because Complete() is never called below,
        // every insert for this ImportFileNumber is rolled back with the scope
        errorsEncountered = true;
    }
    if (!errorsEncountered)
    {
        scope.Complete();
    }
}
Or you can just let Add throw the exception and handle it outside of the transaction scope instead, since the exception will cause Complete() never to be called, and therefore no records will be added. This method has the additional advantage of stopping the processing of further records when we already know the work will be discarded.
try
{
    using (var scope = new TransactionScope(TransactionScopeOption.Required))
    {
        foreach (var record in costFile.FinancialRecords)
        {
            db.Add(record, ImportFileNumber);
        }
        // if an exception is thrown during db.Add(), then Complete() is never called
        scope.Complete();
    }
}
catch (Exception ex)
{
    // handle your exception here
}
EDIT: If you don't want your transaction escalated to a distributed transaction (which may have additional security/network requirements), make sure you reuse the same SqlConnection object for every database call within your transaction scope, and open it inside the scope so that it enlists in the ambient transaction.
using (var scope = new TransactionScope(...))
using (var conn = new SqlConnection("myConnectionString"))
{
    // open the connection inside the scope so it enlists in the ambient transaction
    conn.Open();
    foreach (var foo in foos)
    {
        db.Add(foo, conn);
    }
    scope.Complete();
}
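Here, db.Add stands for whatever data-access helper you already have; the only requirement is that it executes its command on the connection it is handed. A hypothetical sketch of such a helper (the Foos table and Foo type are made up for illustration, assuming System.Data.SqlClient):
// Hypothetical helper, not from the original answer: it runs the insert on the
// shared, already-open connection, so the command takes part in the ambient
// transaction created by the surrounding TransactionScope.
public void Add(Foo foo, SqlConnection conn)
{
    using (var cmd = new SqlCommand("INSERT INTO Foos (Name) VALUES (@name)", conn))
    {
        cmd.Parameters.AddWithValue("@name", foo.Name);
        cmd.ExecuteNonQuery();
    }
}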

Related

Create EF db instance for every insert?

I have a worker thread whose job is to insert objects stored in a queue into the database.
We are currently using Entity Framework to do the inserts. Now my question is: do I need to make a new db instance for every insert, or can I safely re-use the same db instance over and over?
private static void MainWorker()
{
    while (true)
    {
        try
        {
            if (IncomingDataQueue.Any())
            {
                if (IncomingDataQueue.TryDequeue(out var items))
                {
                    // Insert into db
                    using (var db = GetNewDbInstance())
                    {
                        if (db != null)
                        {
                            db.DataRaw.AddRange(items);
                            db.SaveChanges();
                            // Skip everything and continue to the next loop
                            continue;
                        }
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Debug.WriteException("Failed to insert DB Data", ex);
            // Delay here in case we are hitting the db too hard.
            Thread.Sleep(100);
        }
        // We did not have any items in the queue, so wait before checking again
        Thread.Sleep(20);
    }
}
Here is my function which gets a new DB Instance:
private static DbEntities GetNewDbInstance()
{
    try
    {
        var db = new DbEntities();
        db.Configuration.ProxyCreationEnabled = false;
        db.Configuration.AutoDetectChangesEnabled = false;
        return db;
    }
    catch (Exception ex)
    {
        Debug.WriteLine("Error in getting db instance" + ex.Message);
    }
    return null;
}
Now, I have not had any issues to date; however, I worry that this solution will not scale well if we are doing, for example, thousands of inserts per minute.
I also worry that with one static db instance we could get memory leaks, or that the object would keep growing and not manage its db connections properly.
What is the correct way to use EF with long-term db connections?
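A common pattern (offered here as a sketch, not an answer from this thread) is to keep each context short-lived: one context per dequeued batch rather than one per insert or one long-lived static instance. A minimal sketch, assuming the DbEntities context from above and a hypothetical DataRawRow item type:
// Minimal sketch under the assumptions above: one short-lived context per batch.
private static void InsertBatch(List<DataRawRow> items)
{
    using (var db = new DbEntities())
    {
        // a fresh context keeps the change tracker small, and Dispose
        // returns the underlying connection to the ADO.NET pool
        db.Configuration.AutoDetectChangesEnabled = false;
        db.DataRaw.AddRange(items);
        db.SaveChanges();
    }
}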

LINQ to SQL not creating Transactions

I've been searching everywhere to try to get past this issue, but I just can't figure it out.
I'm trying to make many changes to the DB within one single transaction using LINQ to SQL.
I've created a .dbml that represents the SQL table, and then I use basically this code:
foreach (var _doc in _r.Docs)
{
    try
    {
        foreach (var _E in _Es)
        {
            Entity _newEnt = CreateNewEnt(_EListID, _doc, _fileName, _E);
            _db.Etable.InsertOnSubmit(_newEnt);
            _ECount++;
            if (_ECount % 1000 == 0)
            {
                _db.SubmitChanges();
            }
        }
    }
    catch (Exception ex)
    {
        throw;
    }
}
But when I run SQL Profiler, the commands are all executed individually; it won't even start a SQL transaction.
I've tried using TransactionScope (a using statement and Complete()) and DbTransaction (BeginTransaction() and Commit()); neither of them did anything at all. It just keeps executing all the commands individually, inserting everything as if it were looping through all the inserts.
TransactionScope:
using (var _trans = new TransactionScope())
{
    foreach (var _doc in _r.Docs)
    {
        try
        {
            foreach (var _E in _Es)
            {
                Entity _newEnt = CreateNewEnt(_EListID, _doc, _fileName, _E);
                _db.Etable.InsertOnSubmit(_newEnt);
                _ECount++;
                if (_ECount % 1000 == 0)
                {
                    _db.SubmitChanges();
                }
            }
        }
        catch (Exception ex)
        {
            throw;
        }
    }
    _trans.Complete();
}
DbTransaction:
_db.Transaction = _db.Connection.BeginTransaction();
foreach (var _doc in _r.Docs)
{
    try
    {
        foreach (var _E in _Es)
        {
            Entity _newEnt = CreateNewEnt(_EListID, _doc, _fileName, _E);
            _db.Etable.InsertOnSubmit(_newEnt);
            _ECount++;
            if (_ECount % 1000 == 0)
            {
                _db.SubmitChanges();
            }
        }
    }
    catch (Exception ex)
    {
        throw;
    }
}
_db.Transaction.Commit();
I also tried committing a transaction every time I submit the changes, but still nothing; it keeps executing everything individually.
Right now I'm at a loss and wasting time :\
GSerg was right and pointed me in the right direction. Transactions do not mean multiple commands go out in one batch; they just allow you to "undo" everything that was done inside the given transaction if need be. Bulk statements do what I want to do.
You can download a NuGet package directly from Visual Studio called "Z.LinqToSql.Plus" that helps with this. It extends LINQ to SQL's DataContext and lets you do multiple insertions, updates, or deletes in bulk, that is, in one single statement, like this:
foreach (var _doc in _r.Docs)
{
    try
    {
        foreach (var _E in _Es)
        {
            Entity _newEnt = CreateNewEnt(_EListID, _doc, _fileName, _E);
            _dictionary.Add(_ECount, _newEnt); // or using a list as well
            _ECount++;
            if (_ECount % 20000 == 0)
            {
                _db.BulkInsert(_dictionary.Values); // inserts in bulk; there are also BulkUpdate and BulkDelete
                _dictionary = new Dictionary<long, Entity>(); // restart the dictionary to prepare for the next bulk
            }
        }
    }
    catch (Exception ex)
    {
        throw;
    }
}
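(One caveat to the snippet above, added here for completeness rather than taken from the original post: any entries left over when the loop ends, i.e. fewer than the 20k batch size, are never flushed, so a final bulk call is needed after it.)
// flush the remainder that never reached the 20k threshold
if (_dictionary.Count > 0)
{
    _db.BulkInsert(_dictionary.Values);
}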
As the code shows, I can insert 20k entries in seconds. It's a very useful tool!
Thank you to everyone who tried helping! :)

Deadlock when previous query threw an exception

Using Entity Framework, I have a function that basically goes something like this:
using (var ctx = new Dal.MyEntities())
{
    Dal.Temp temp = null;
    try
    {
        //...
        // create a temp entity
        temp = new Dal.Temp();
        // populate its children
        // note that temp is set to cascade deletes down to its children
        temp.Children = (from foo in foos
                         select new Dal.Children()
                         {
                             // set some properties...
                             Field1 = foo.field1,
                             Field2 = foo.field2
                         }).ToList();
        //...
        // add temp row to temp table
        ctx.Temp.Add(temp);
        ctx.SaveChanges();
        // some query that joins on the temp table...
        var results = from d in ctx.SomeOtherTable
                      join t in temp.Children
                          on new { d.Field1, d.Field2 } equals new { t.Field1, t.Field2 }
                      select d;
        if (results.Count() == 0)
        {
            throw new Exception("no results");
        }
        // Normal processing and return result
        return results;
    }
    finally
    {
        if (temp != null && temp.ID != 0)
        {
            ctx.Temp.Remove(temp);
            ctx.SaveChanges();
        }
    }
}
The idea is that, as part of processing a request, I need to build a temporary table with some data that is then used to join to the main query and filter the results. Once the query has been processed, the temp rows should be deleted. I put the deletion in the finally clause so that if there is a problem with the query (an exception is thrown), the temporary rows always get cleaned up.
This seems to work fine, except that intermittently the SaveChanges in the finally block throws a deadlock exception with an error message along the lines of:
Transaction (Process ID 89) was deadlocked on lock resources with another process and
has been chosen as the deadlock victim. Rerun the transaction.
I can't reliably reproduce it, but it seems to happen most often if the previous query threw the "no results" exception. Note that, due to an error that was discovered on the front end, two identical requests were being submitted under certain circumstances; nevertheless, the code should be able to handle that.
Does anybody have any clues as to what might be happening here? Is throwing an exception inside the using block a problem? Should I handle that differently?
Update: the exception might be a red herring. I removed it altogether (returning an empty result instead) and I still have the problem. I've tried a bunch of variations on:
using (new TransactionScope(TransactionScopeOption.Required, new TransactionOptions { IsolationLevel = IsolationLevel.ReadUncommitted }))
using (var ctx = new Dal.MyEntities())
{
}
But despite what I've read, it doesn't seem to make any difference. I still get intermittent deadlocks on the second SaveChanges to remove the temp table.
How about adding a catch block that cleans up the temp row as soon as the exception occurs, something like:
using (var ctx = new Dal.MyEntities())
{
    Dal.TempTable temp = null;
    try
    {
        //...
        temp = new Dal.TempTable();
        //...
        ctx.TempTables.Add(temp);
        ctx.SaveChanges();
        // some query that joins on the temp table...
        if (noResultsReturned) // pseudocode placeholder
        {
            throw new Exception("no results");
        }
        // Normal processing and return result
    }
    catch
    {
        if (temp != null)
        {
            ctx.TempTables.Remove(temp);
            ctx.SaveChanges();
            temp = null; // already cleaned up, so the finally block skips it
        }
        throw;
    }
    finally
    {
        if (temp != null && temp.ID != 0)
        {
            ctx.TempTables.Remove(temp);
            ctx.SaveChanges();
        }
    }
}

Dispose not working, many dead connections

I've been getting strange behavior since updating to EF6. I'm not sure whether this is related or not, but it used to work fine.
I'm doing a set of work, saving it to the DB, then doing another set and saving that.
After a while, I checked SQL Server with sp_who2 and found many dead connections from my computer.
The job is huge, so it climbs to 700 connections, and I have to kill them all manually in a loop.
The program looks like:
while (jobDone == false)
{
    var returnData = doOneSetJob();
    myEntity dbconn = new myEntity();
    foreach (var one in returnData)
    {
        dbconn.targetTable.Add(one);
        try
        {
            dbconn.SaveChanges();
            // even if I put a Dispose() here, there are still lots of dead connections
        }
        catch
        {
            Console.WriteLine("DB Insertion Fail.");
            dbconn.Dispose();
            dbconn = new myEntity();
        }
    }
    dbconn.Dispose();
}
You should consider refactoring your code so that your connection is cleaned up after your job is complete. For example:
using (var context = new DbContext())
{
    while (!jobDone)
    {
        // Execute job and get data
        var returnData = doOneSetJob();
        // Process job results
        foreach (var one in returnData)
        {
            try
            {
                context.TargetTable.Add(one);
                context.SaveChanges();
            }
            catch (Exception ex)
            {
                // Log the error
            }
        }
    }
}
The using statement will guarantee that your context is cleaned up properly, even if an error occurs while you are looping through the results.
In this case you should use a using statement. Taken from MSDN:
The using statement ensures that Dispose is called even if an exception occurs while you are calling methods on the object. You can achieve the same result by putting the object inside a try block and then calling Dispose in a finally block; in fact, this is how the using statement is translated by the compiler.
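To make that concrete, here is a rough sketch of the expansion the compiler performs for a using statement over a context (an illustration, not the exact generated code):
// roughly equivalent expansion of: using (var dbconn = new DbContext()) { ... }
var dbconn = new DbContext();
try
{
    // ... work with dbconn ...
}
finally
{
    if (dbconn != null)
    {
        ((IDisposable)dbconn).Dispose();
    }
}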
So, your code would look better like this:
using (var dbconn = new DbContext())
{
    while (!jobDone)
    {
        foreach (var one in returnData)
        {
            try
            {
                TargetTable row = new TargetTable();
                dbconn.TargetTable.Add(row);
                dbconn.SaveChanges();
            }
            catch (Exception ex)
            {
                Console.WriteLine("DB Insertion Fail.");
            }
        }
    }
}
This way, even if your code fails at some point, the Context, resources and connections will be properly disposed.

In my typed dataset, will the Update method run as a transaction?

I have a typed dataset for a table called People. When you call the update method of a table adapter and pass in the table, is it run as a transaction?
I'm concerned that at some point the constraints set in the .xsd will pass but the database will reject an item for one reason or another. I want to make sure that in that case the entire update is rejected; I'm not sure whether it just accepts what it can until the error occurs.
If it runs as a transaction, I have this:
Auth_TestDataSetTableAdapters.PeopleTableAdapter tableAdapter = new Auth_TestDataSetTableAdapters.PeopleTableAdapter();
Auth_TestDataSet.PeopleDataTable table = tableAdapter.GetDataByID(1);
table.AddPeopleRow("Test Item", 5.015);
tableAdapter.Update(table);
But if I have to manually trap this in a transaction, I wind up with this:
Auth_TestDataSetTableAdapters.PeopleTableAdapter tableAdapter = new Auth_TestDataSetTableAdapters.PeopleTableAdapter();
Auth_TestDataSet.PeopleDataTable table = tableAdapter.GetDataByID(1);
tableAdapter.Connection.Open();
tableAdapter.Transaction = tableAdapter.Connection.BeginTransaction();
table.AddPeopleRow("Test Item", (decimal)5.015);
try
{
    tableAdapter.Update(table);
    tableAdapter.Transaction.Commit();
}
catch
{
    tableAdapter.Transaction.Rollback();
}
finally
{
    tableAdapter.Connection.Close();
}
Either way works but I am interested in the inner workings. Any other issues with the way I've decided to handle this type of row addition?
-- EDIT --
I determined that it does not run as a transaction: it commits however many records succeed until the error occurs. Thanks to the helpful post below, a bit of that transactional code has been condensed to make controlling the transaction easier on the eyes:
Auth_TestDataSetTableAdapters.PeopleTableAdapter tableAdapter = new Auth_TestDataSetTableAdapters.PeopleTableAdapter();
Auth_TestDataSet.PeopleDataTable table = tableAdapter.GetDataByID(1);
try
{
    using (TransactionScope ts = new TransactionScope())
    {
        table.AddPeopleRow("Test Item", (decimal)5.015);
        table.AddPeopleRow("Test Item", (decimal)50.015);
        tableAdapter.Update(table);
        ts.Complete();
    }
}
catch (SqlException ex)
{ /* ... */ }
Your approach should work.
You can simplify it a little though:
using (TransactionScope ts = new TransactionScope())
{
    // your old code here
    ts.Complete();
}
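One detail worth knowing with this pattern (my addition, not part of the original answer): a parameterless TransactionScope defaults to the Serializable isolation level, which is stricter than most databases' defaults. It can be relaxed explicitly:
// same pattern, with the isolation level spelled out instead of the
// Serializable default that new TransactionScope() gives you
var options = new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted };
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, options))
{
    // your old code here
    ts.Complete();
}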
