Multi-threading with LINQ to SQL - C#

I am building an application that needs to use DataContexts inside threads. It keeps throwing InvalidOperationExceptions such as:
There is already an open DataReader associated with this Command which must be closed first
ExecuteReader requires an open and available Connection. The connection's current state is connecting
These exceptions are intermittent.
Here is a snippet of my code:
var repo = new Repository();
var entities = repo.GetAllEntities();
foreach (var entity in entities)
{
ThreadPool.QueueUserWorkItem(
delegate
{
try
{
ProcessEntity(entity);
}
catch (Exception)
{
throw;
}
});
}
I think it may have something to do with passing an entity from the main thread to a worker thread, as the error seems to be thrown as soon as I try to access a property of entity.
Does anyone have any idea why this is happening and how I can resolve it?
Update
Here is what I decided to go with:
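var events = new Dictionary<int, AutoResetEvent>();
var repo = new Repository();
var entities = repo.GetAllEntities();
foreach (var entity in entities)
{
// Read the ID on the main thread so the worker never touches the shared entity.
var entityId = entity.ID;
// Keep a local reference: the dictionary must not be read while the loop is still adding to it.
var done = new AutoResetEvent(false);
events.Add(entityId, done);
ThreadPool.QueueUserWorkItem(
delegate
{
// Renamed from "repo": a local cannot reuse the name of the enclosing variable.
var threadRepo = new Repository();
try
{
ProcessHierarchy(threadRepo.GetEntity(entityId), ReportRange.Weekly);
}
finally
{
done.Set();
}
});
}
// Note: WaitHandle.WaitAll accepts at most 64 handles.
WaitHandle.WaitAll(events.Values.ToArray());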
Improvements/Suggestions welcome, but this seems to have done the trick.

The exception is thrown because some properties of entity execute a new query while a previous reader has not been closed yet. You cannot execute more than one query on the same data context at the same time.
As a workaround, you can "visit" the properties you access in ProcessEntity() so that the SQL runs before the thread starts.
For example:
var repo = new Repository();
var entities = repo.GetAllEntities();
foreach (var entity in entities)
{
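var localEntity = entity; // Make sure the callback uses the correct instance
// Assumes Customers, Orders and Invoices are lazily loaded collections:
// enumerating them here forces the SQL to run on the main thread.
var entityCustomers = localEntity.Customers.ToList();
var entityOrders = localEntity.Orders.ToList();
var invoices = entityOrders.SelectMany(o => o.Invoices).ToList();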
ThreadPool.QueueUserWorkItem(
delegate
{
try
{
ProcessEntity(localEntity);
}
catch (Exception)
{
throw;
}
});
}
This workaround executes the SQL only on the main thread, while the processing is done on other threads. You lose some efficiency here, since all the queries run on a single thread. This solution works well if ProcessEntity() contains a lot of logic and the queries are not very heavy.
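The localEntity copy in the snippet above matters on compilers older than C# 5, where a foreach variable was shared across iterations, so a queued delegate could end up seeing a later entity than the one it was queued for. Since C# 5 each foreach iteration gets a fresh variable, but a for loop still shares one. Here is a minimal sketch of the effect using a for loop; ClosureCaptureDemo and its method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureCaptureDemo
{
    // Every delegate captures the same loop variable, so all of them
    // read its final value once the loop has finished.
    public static List<int> CaptureLoopVariable()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            actions.Add(() => i);
        }
        return actions.Select(a => a()).ToList(); // 3, 3, 3
    }

    // Copying the value into a per-iteration local gives each delegate
    // its own variable to capture.
    public static List<int> CaptureLocalCopy()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int local = i;
            actions.Add(() => local);
        }
        return actions.Select(a => a()).ToList(); // 0, 1, 2
    }
}
```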

Try creating the Repository inside the new thread instead of passing it in.
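A minimal sketch of that suggestion, under stated assumptions: the Repository class below is a hypothetical stand-in (not the asker's class) that, like a DataContext, refuses concurrent use, and PerThreadRepositoryDemo is an invented name. Each work item creates its own repository, and a CountdownEvent replaces the per-entity wait handles, which also sidesteps the 64-handle limit of WaitHandle.WaitAll.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the question's Repository: it throws if two
// threads use the same instance at once, mimicking a DataContext.
public sealed class Repository
{
    private int busy;

    public string GetEntity(int id)
    {
        if (Interlocked.Exchange(ref busy, 1) == 1)
            throw new InvalidOperationException("Repository used concurrently.");
        try
        {
            Thread.Sleep(10); // simulate a query
            return "entity-" + id;
        }
        finally
        {
            Interlocked.Exchange(ref busy, 0);
        }
    }
}

public static class PerThreadRepositoryDemo
{
    public static string[] ProcessAll(IList<int> ids)
    {
        var results = new ConcurrentBag<string>();
        using (var done = new CountdownEvent(ids.Count))
        {
            foreach (int id in ids)
            {
                int localId = id; // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        // Created on the worker thread and never shared.
                        var repo = new Repository();
                        results.Add(repo.GetEntity(localId));
                    }
                    finally
                    {
                        done.Signal(); // always signal, even on failure
                    }
                });
            }
            done.Wait(); // blocks until every work item has signalled
        }
        return results.ToArray();
    }
}
```

Calling PerThreadRepositoryDemo.ProcessAll(new List<int> { 1, 2, 3 }) returns three results in no particular order.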

Be aware that a SqlConnection instance is NOT thread-safe, whether or not you still have an open reader. In general, accessing a SqlConnection instance from multiple threads will cause unpredictable, intermittent problems.
See: http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.aspx

The solution for me was the LightSpeed persistence framework, which is free for up to 8 entities. Create one unit of work per thread.
http://www.mindscapehq.com/products/LightSpeed/default.aspx

Related

Entity Framework, Multi Threading, and Transactions

I have an application that reads data from one database, and transforms that data into a new form and writes it into a new database. Some of the tables in the new database are made from multiple tables in the old database so there is a large amount of reading and writing going on. Here is the basic concept of the system:
public void TransferData()
{
OldEntities oldContext = new OldEntities();
NewEntities newContext = new NewEntities();
using(var transaction = newContext.Database.BeginTransaction())
{
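try{
TransferTable(oldContext, newContext);
transaction.Commit(); // without a commit, the work is rolled back when the transaction is disposed
} catch (Exception) {
transaction.Rollback();
throw; // do not swallow the failure silently
}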
}
}
public void TransferTable(OldEntities oldContext, NewEntities newContext)
{
List<Entity1> mainTable = oldContext.Where();
Parallel.ForEach(mainTable, (row) =>
{
using(NewEntities anotherNewContext = new NewEntities())
{
anotherNewContext.Database.UseTransaction(newContext.Database.CurrentTransaction.UnderlyingTransaction);
// Do Work
}
});
}
This causes the following exception:
The transaction passed in is not associated with the current connection. Only transactions associated with the current connection may be used.
How can I get around this? The transaction will always come from a different EF context, but I need all the contexts to share the same transaction. I couldn't find a way to create the new context as a "child" of the original, and I am trying to avoid creating a transaction entirely separate from the EF context. Any suggestions?
There is an excellent overview of transactions here, which explains how to use transactions in a variety of contexts, some of which are similar to yours. Rather than trying to fix your code as is, a modified approach may help.
I assume you are using EF6.

Bulk insert with EF

I need to insert some objects (about 4 million) into the database using C# and EF (using .NET 3.5). My method that adds the objects is called in a for loop:
private DBModelContainer AddToContext(DBModelContainer db, tblMyTable item, int count)
{
db.AddTottblMyTable (item);
if ((count % 10000 == 0) || (count == this.toGenerate))
{
try
{
db.SaveChanges();
}
catch (Exception e)
{
Console.WriteLine(e.StackTrace);
}
}
return db;
}
How can I detach the added objects (of type tblMyTable) from the context object? I don't need them later, and once more than 300000 objects have been added, the time between database saves (db.SaveChanges()) increases considerably.
Regards
Entity Framework may not be the best tool for this type of operation. You may be better off with plain ADO.NET, some stored procedures... But if you have to use it, here are a number of suggestions:
Keep the active context graph small by using a new context for each unit of work
Turn off AutoDetectChangesEnabled - context.Configuration.AutoDetectChangesEnabled = false;
Batching: in your loop, call SaveChanges periodically
EDIT
using(var db = new DBModelContainer())
{
db.tblMyTable.MergeOption = MergeOption.NoTracking;
// Narrow the scope of your db context
db.AddTottblMyTable (item);
db.SaveChanges();
}
Keeping a long-running db context is not advisable, so consider refactoring your Add method so that it does not keep reusing the same context.
See Rick Strahl's post on bulk inserts for more details
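The batching and "new context per unit of work" suggestions above can be sketched without a database. The helper below only splits the input into fixed-size batches; BatchingDemo, InBatches and CountSaves are invented names, and the place where a real context would be created and saved appears only as a comment, because the exact context API depends on the EF version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchingDemo
{
    // Splits a sequence into consecutive lists of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterate(source, batchSize);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0)
        {
            yield return batch; // final, partially filled batch
        }
    }

    // Counts how many save operations a given input size would need.
    public static int CountSaves(int itemCount, int batchSize)
    {
        int saves = 0;
        foreach (List<int> batch in InBatches(Enumerable.Range(0, itemCount), batchSize))
        {
            // Assumption: real code would open a fresh context here, add every
            // item of the batch, call SaveChanges() once, and dispose the
            // context so that tracked objects do not accumulate.
            saves++;
        }
        return saves;
    }
}
```

With 25 items and a batch size of 10, CountSaves returns 3: two full batches and one batch of 5.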
AFAIK EF does not directly support bulk insert, so doing it manually will be tedious.
Consider trying EntityFramework.BulkInsert:
using (var ctx = GetContext())
{
using (var transactionScope = new TransactionScope())
{
// some stuff in dbcontext
ctx.BulkInsert(entities);
ctx.SaveChanges();
transactionScope.Complete();
}
}
You may try the Unit of Work pattern: don't save the context (SaveChanges) on every record insert, but save once at the end.

DbContextTransaction and Multi-thread: The connection is already in a transaction and cannot participate in another transaction

I got this error when calling the same method from multiple threads: The connection is already in a transaction and cannot participate in another transaction. EntityClient does not support parallel transactions.
I found that my issue is somewhat similar to this one: SqlException from Entity Framework - New transaction is not allowed because there are other threads running in the session
My scenario:
I have a class that is instantiated by multiple threads, each thread creating a new instance:
public MarketLogic()
{
var dbContext = new FinancialContext();
AccountBalanceRepository = new AccountBalanceRepository(dbContext);
CompositeTradeRepository = new CompositeTradeRepository(
new OrderRepository(dbContext)
, new PositionRepository(dbContext)
, new TradeRepository(dbContext));
CompositeRepository = new CompositeRepository(
new LookupValueRepository(dbContext)
, new SecurityRepository(dbContext)
, new TransactionRepository(dbContext)
, new FinancialMarketRepository(dbContext)
, new FinancialMarketSessionRepository(dbContext)
);
}
In the MarketLogic class, SavePosition() saves information to the database using the Entity Framework DbContext (via its SaveChanges() method).
private void SavePosition()
{
using (DbContextTransaction transaction = CompositeTradeRepository.OrderRepository.DbContext.Database.BeginTransaction())
{
try
{
// business logic code, **this takes some time to complete**.
position = EntityExistsSpecification.Not().IsSatisfiedBy(position)
? CompositeTradeRepository.PositionRepository.Add(position)
: CompositeTradeRepository.PositionRepository.Update(position);
transaction.Commit();
}
catch (Exception exception)
{
// some code
transaction.Rollback();
}
}
}
public Position Add(Position position)
{
// some code
// context is a instance of FinancialContext, this class is generated by Entity Framework 6
context.SaveChanges();
}
In my scenario, the issue happens when two or more threads call new MarketLogic().SavePosition().
I can see that while the first transaction has not completed yet, the second thread comes in and starts a new transaction.
But I don't understand why the error still happens when the two threads use different DbContext objects.
So what is wrong? Or did I miss something?
My fault: I had left the repositories static, so all threads shared the same repositories, which means they shared the same DbContext. That caused the issue when EF had not finished committing changes and another call to SaveChanges() was made, so EF threw the exception.
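The effect of the static field can be reproduced without Entity Framework. In this sketch FakeContext is a hypothetical stand-in for FinancialContext, and SharedLogic, IsolatedLogic and StaticSharingDemo are invented names. The point is that a static field holds one value for the whole process, so "a new instance per thread" does not give each thread its own context.

```csharp
using System;

// Hypothetical stand-in for FinancialContext.
public sealed class FakeContext { }

public class SharedLogic
{
    // static: a single slot shared by every instance; each constructor
    // call overwrites it, so all instances see the most recent context.
    private static FakeContext context;

    public SharedLogic()
    {
        context = new FakeContext();
    }

    public FakeContext Context { get { return context; } }
}

public class IsolatedLogic
{
    // Instance field: every object owns its own context.
    private readonly FakeContext context = new FakeContext();

    public FakeContext Context { get { return context; } }
}

public static class StaticSharingDemo
{
    public static bool SharedInstancesUseSameContext()
    {
        var a = new SharedLogic();
        var b = new SharedLogic();
        return ReferenceEquals(a.Context, b.Context); // true
    }

    public static bool IsolatedInstancesUseSameContext()
    {
        var a = new IsolatedLogic();
        var b = new IsolatedLogic();
        return ReferenceEquals(a.Context, b.Context); // false
    }
}
```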

How to put a retrieve action in NHibernate in a transaction?

It appears that it is a best practice to put all database calls into a transaction. So I wanted to put a select in a transaction, but I can't find out how to do this.
I have tried this code, but I get an error:
using (var session = GetSession().SessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
// var session = GetSession();
var result = session.Query<I>().Where(condition);
transaction.Commit();
return result;
}
Error:
Session is closed!
Object name: 'ISession'.
It's not a matter of the transaction itself (although I only use transactions for save/update calls, not selects, but that might be a matter of preference, or I simply don't know something important).
The thing is, you're not 'materializing' the collection before closing the session.
This should work:
var result = session.Query<I>().Where(condition).ToList();
return result;
The Where does not do anything by itself; it only defers execution of the filter until you do something with the result, e.g. iterate over it. If you're outside the session's scope by then (and it seems you are), you'll get the exception, since you can't call the database when the session is closed.
Even then, you probably won't be able to access lazily loaded child items without eagerly fetching them first: you can't call the database through a proxy when you're not inside an open session. :)
Disclaimer
By the way, same thing would happen in EF with LINQ:
IEnumerable<I> myObjects;
using(var context = new MyDbContext())
{
myObjects = context.Set<I>().Where(x => x.Name == "Test"); // query is only defined here
}
foreach(var obj in myObjects) //BOOM! The query runs here, after the context is disposed.
{
var name = obj.Name;
}
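The deferred execution described above can be observed with plain LINQ to Objects, no session or context required. This is only an illustration of the general behavior; DeferredExecutionDemo and its methods are invented names. A Where query reflects changes made to its source after the query was defined, while ToList() takes a snapshot at the point of the call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // The filter runs when the query is enumerated, so it sees the
    // element added after the query was defined.
    public static List<int> QueryThenMutate()
    {
        var numbers = new List<int> { 1, 2, 3 };
        IEnumerable<int> query = numbers.Where(n => n > 1); // nothing evaluated yet
        numbers.Add(4);
        return query.ToList(); // 2, 3, 4
    }

    // ToList() evaluates the filter immediately, so later changes to
    // the source do not affect the result.
    public static List<int> MaterializeThenMutate()
    {
        var numbers = new List<int> { 1, 2, 3 };
        List<int> snapshot = numbers.Where(n => n > 1).ToList(); // runs now
        numbers.Add(4);
        return snapshot; // 2, 3
    }
}
```

This is why materializing inside the using block (or inside the open session) fixes the exception: the database call happens while the connection is still available.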

Method that adds elements to a DbSet from multiple threads

static Object LockEx=new Object();
public void SaveMyData(IEnumerable<MyData> list)
{
lock (LockEx)
{
using (PersistencyContext db = new PersistencyContext())
{
foreach (var el in list)
{
try
{
db.MyData.Add(el);
db.SaveChanges();
}
catch (DbUpdateException)
{
db.Entry(el).State = EntityState.Modified;
db.SaveChanges();
}
}
}
}
}
This method is called from multiple threads. Right now I use a static lock to prevent two threads from saving data at the same time, though this seems wrong because I only want to save data. The catch is used to issue an update query in case the insert (Add) fails because the entry already exists.
What happens if I remove the lock? How will SaveChanges work? What should my code look like? Thanks
I would remove the lock, because the database already handles concurrency by design. I would also check whether the record exists before trying to add it, and then add or update depending on the result, just to avoid exceptions, because they are performance killers.
Building on Davide's answer, you could also call SaveChanges once after you have added all the new entities. That should be faster.
