How do I handle a LINQ-to-SQL DataContext across multiple threads?
Should I be creating a global static DataContext that all the threads use and commit changes at the very end or should I create a Context per thread and use that instance for everything inside that thread?
DataContext is not thread safe; using it directly from multiple threads would cause #fail; having a global static data-context would cause #fail and would cause uncontrolled memory growth (the data-context includes an identity manager and change tracker for every object fetched; this only grows over time, as more objects are touched).
A data context should ideally be used for a unit of work: spin one up, do something (that is bound in scope, i.e. not the entire app lifetime), and dispose it. So IMO the real answer here is "tie it to that unit of work". Only you can know what that is in your application; it could be a single method, it could be a page request on a web page, it could be a timer "tick" in a service. Who knows...
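A minimal sketch of "one context per unit of work", assuming the unit of work is a single method run on each thread. FakeDataContext is a hypothetical stand-in for a real DataContext so the example runs without a database; with LINQ-to-SQL you would construct your own context type in the same place.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a LINQ-to-SQL DataContext.
sealed class FakeDataContext : IDisposable
{
    private static int _live; // number of contexts created but not yet disposed
    public static int Live { get { return Volatile.Read(ref _live); } }

    public FakeDataContext() { Interlocked.Increment(ref _live); }
    public void SubmitChanges() { /* a real context would write tracked changes here */ }
    public void Dispose() { Interlocked.Decrement(ref _live); }
}

static class Program
{
    // One unit of work: create the context, use it, commit, dispose.
    public static void DoUnitOfWork()
    {
        using (var db = new FakeDataContext())
        {
            // ... query and modify entities through db ...
            db.SubmitChanges();
        }
    }

    public static void Main()
    {
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(DoUnitOfWork);
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();

        // Every context was disposed at the end of its own unit of work.
        Console.WriteLine("contexts still alive: " + FakeDataContext.Live);
    }
}
```

No context is ever shared between threads here, so no locking is needed, and the change tracker never grows beyond one unit of work.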
Related
I am a newbie in .NET and have many confusions regarding the same. If for every request in a dotnet MVC web application a thread is created and inside that thread if we access static variables, then, will the static variables inside all the threads have common memory or will every single thread contain separate static memory variables?
I don't have any code currently.
The following is a bit of a simplification, but it should provide enough context to answer your question.
The memory model for .NET is such that memory is generally shared across threads and can be accessed without any synchronization. Thus, if you have some class A, both its instance and static members can be accessed across threads. That is, of course, problematic, because multiple threads accessing either could result in concurrency issues or corrupt state.
That said, it is possible to prevent this corruption in a few ways:
use lock or some other form of mutual exclusion on both static and instance members
use statics with ThreadLocal or [ThreadStatic]
and, if state must be shared across threads, copy it so threads are accessing memory that (hopefully) the original thread won't touch
There are pros and cons to these. It is generally better to have no cross-thread dependencies. In the context of ASP.NET this is especially true. Ideally you want your APIs and/or web page renders to be completely independent of other calls.
If you find yourself needing static members (without ThreadLocal<T> or [ThreadStatic]) stick to making only read accesses (no writes) against these fields. But even then, the lifecycle of the static field matters; for example, if the static is initialized after requests start accessing the fields, you'll have trouble.
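A rough sketch of the first two options from the list above, assuming a hypothetical Counters class: one static value shared by all threads and guarded by a lock, and one static ThreadLocal<int> that gives every thread its own independent value.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Counters
{
    // Option 1: shared static state guarded by a lock.
    private static readonly object Gate = new object();
    private static int _shared;

    public static void IncrementShared()
    {
        lock (Gate) { _shared++; }
    }

    public static int Shared
    {
        get { lock (Gate) { return _shared; } }
    }

    // Option 2: one independent value per thread, so no locking is needed.
    private static readonly ThreadLocal<int> PerThread = new ThreadLocal<int>(() => 0);

    public static int IncrementPerThread()
    {
        PerThread.Value++;
        return PerThread.Value;
    }
}

static class Program
{
    public static void Main()
    {
        // Many threads update the same static; the lock prevents lost updates.
        Parallel.For(0, 1000, _ => Counters.IncrementShared());
        Console.WriteLine(Counters.Shared); // 1000

        // Each thread sees only its own per-thread value.
        int fromWorker = 0;
        var worker = new Thread(() =>
        {
            Counters.IncrementPerThread();
            fromWorker = Counters.IncrementPerThread();
        });
        worker.Start();
        worker.Join();
        int fromMain = Counters.IncrementPerThread();
        Console.WriteLine(fromWorker + " " + fromMain); // 2 1
    }
}
```

Keep in mind that ASP.NET reuses pool threads across requests, so a per-thread value is not a per-request value.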
A friend asked me which would be better, ThreadStatic or ThreadLocal. Checking the docs, I told him ThreadLocal looks more convenient and has been available since .NET 4.0, but I don't understand why you would use either of them over creating an object instance for a thread. Their purpose is to store "thread-local data", so you can call methods less clumsily and avoid locking in some instances. When I wanted such thread-local data I always created something like:
class ThreadHandler
{
    SomeClass A;

    public ThreadHandler(SomeClass A)
    {
        this.A = A;
    }

    public void Worker()
    {
    }
}
If I just want a fire-and-forget thread, it would be new Thread(new ThreadHandler(new SomeClass()).Worker).Start(). If I want to track threads, the thread can be added to a collection; if I want to track data, the ThreadHandler can be added to a collection; if I want to handle both, I can give ThreadHandler a Thread property and put the ThreadHandler in a collection; and if I want the thread pool, it's QueueUserWorkItem instead of new Thread(). It's short and simple if the scope is simple, but easily extensible if the scope gets wider.
When I'm trying to google why use ThreadLocal over an object instance all my searches end up with explanation how ThreadLocal is much greater than ThreadStatic, which in my eyes look like people explaining that they had this clumsy screwdriver, but now toolbox has heavy monkey-wrench which is much more convenient for hammering nails. Whilst toolbox had a hammer to begin with.
I understand I'm missing something, because if ThreadStatic/ThreadLocal had no advantage they just wouldn't exist. Can somebody please point out at least one significant advantage of ThreadLocal over creating an object instance for a thread?
UPD: Looks like a duplicate of this; I think the "java" keyword was throwing me off when I was googling. So there's at least one advantage: ThreadLocal is more natural to use with the Task Parallel Library.
I don't get advantage of ThreadLocal over creating an instance of object for a thread.
You're right, when you have control over the threads being created, and how they're used, it's very handy to just wrap the whole thread in a helper class, and have it get 'thread local' data from there.
The problem is that, especially in institutionally large projects, you don't always have this kind of control. You may start up a thread, and call some code, and that one thread may wind its way through calls in millions of lines of code scattered between 10 projects owned by 3 internal teams and one external contractor team. Good luck plumbing some of those parameters everywhere.
Thread-local storage lets those guys interact without requiring that they have explicit references to the object that represents that thread's context.
A related problem I had was associating data to some thread and every child thread created by that thread (since my large projects create their own threads, and so thread-local doesn't work anymore), see this question I had: Is there any programmable data that is automatically inherited by children Thread objects?
At the end of the day, it's often lazy programming, but sometimes you find situations where you just need it.
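To make the "no explicit references" point concrete, here is a small sketch. All names (RequestContext, DeepInTheCallStack, RunAs) are made up for illustration. The thread's entry point sets a value once, and code arbitrarily deep in the call stack reads it without any parameter being passed down.

```csharp
using System;
using System.Threading;

// Hypothetical ambient context backed by thread-local storage.
static class RequestContext
{
    private static readonly ThreadLocal<string> CurrentUser = new ThreadLocal<string>();

    public static string User
    {
        get { return CurrentUser.Value; }
        set { CurrentUser.Value = value; }
    }
}

static class DeepInTheCallStack
{
    // Takes no context parameter, yet sees the value set by the thread's entry point.
    public static string Describe()
    {
        return "running as " + RequestContext.User;
    }
}

static class Program
{
    public static string RunAs(string user)
    {
        string result = null;
        var t = new Thread(() =>
        {
            RequestContext.User = user; // set once, at the top of the thread
            result = DeepInTheCallStack.Describe();
        });
        t.Start();
        t.Join();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(RunAs("alice")); // running as alice
        Console.WriteLine(RunAs("bob"));   // running as bob
    }
}
```

The trade-off is the one described above: the dependency is invisible in the method signatures, which is convenient but also easy to abuse.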
ThreadLocal<T> works like a Dictionary<Thread, T>. The problem with a dictionary is that instances belonging to killed or dead threads stay around forever - they don't get garbage collected, because they are referenced by the dictionary. Using ThreadLocal will ensure that, when a thread dies, the instances referenced by that thread are eligible for GC.
Plus, it's a much nicer interface than having to manually deal with a Dictionary<Thread, T>. It Just Works.
ThreadLocal has two benefits over the ThreadStatic attribute approach: you can avoid defining a class field, and it has a built-in lazy-loading feature. Your manual collection approach requires a locking mechanism; if you look at ThreadLocal's source code, you will see it is optimized for this specific case.
ThreadLocal can also help when objects of type T would otherwise be created and garbage-collected frequently, because each thread can keep reusing its own instance. And it's thread safe.
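The lazy-loading difference can be shown with a short sketch (Buffers is a hypothetical class). A field initializer on a [ThreadStatic] field runs only once, on whichever thread triggers the type's static initialization, so other threads see the default value. A ThreadLocal<T> value factory runs once per thread, on first access.

```csharp
using System;
using System.Threading;

static class Buffers
{
    // The initializer runs once, as part of the type's static initialization,
    // so only the thread that triggers it gets a non-null buffer.
    [ThreadStatic]
    private static int[] _threadStaticBuffer = new int[16];

    // The factory runs lazily, once per thread, on first access to Value.
    private static readonly ThreadLocal<int[]> ThreadLocalBuffer =
        new ThreadLocal<int[]>(() => new int[16]);

    public static bool ThreadStaticIsInitialized
    {
        get { return _threadStaticBuffer != null; }
    }

    public static bool ThreadLocalIsInitialized
    {
        get { return ThreadLocalBuffer.Value != null; }
    }
}

static class Program
{
    public static void Main()
    {
        // Touch the type on the main thread first so its static initializer runs here.
        bool mainThreadStatic = Buffers.ThreadStaticIsInitialized;

        bool workerThreadStatic = true, workerThreadLocal = false;
        var worker = new Thread(() =>
        {
            workerThreadStatic = Buffers.ThreadStaticIsInitialized; // false on this thread
            workerThreadLocal = Buffers.ThreadLocalIsInitialized;   // true: factory ran here
        });
        worker.Start();
        worker.Join();

        Console.WriteLine(mainThreadStatic + " " + workerThreadStatic + " " + workerThreadLocal);
    }
}
```

With [ThreadStatic] you therefore have to null-check and initialize on every thread yourself, which is exactly the boilerplate the ThreadLocal<T> factory removes.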
I'm using Spring.NET and Fluent NHibernate in my Windows application, and I'm working with multiple threads.
I read in some blogs and questions that there can only be one session per thread, and I'm using the HibernateDaoSupport and CurrentSession to do it:
public class DaoBase<T> : HibernateDaoSupport, IDaoBase<T>
{
    protected ISession CurrentSession
    {
        get { return SessionFactoryUtils.GetSession(HibernateTemplate.SessionFactory, true); }
    }
}
However, I am testing this feature and must show that the sessions of each thread are different sessions.
How can I do it?
Note:
After some research I found that objects obtained through one NHibernate session cannot be changed in another session; for example, I cannot load an object in "Session 1" and update the same object in "Session 2".
But in my tests I'm loading an object on the first thread and updating it on the second thread, and this works. What is wrong?
You've got it backwards: a thread can have as many NHibernate sessions as it likes. The important thing is that the session is not designed to be threadsafe, so only one thread at a time can operate on a particular session.
Until a session has been disposed, operating on an object loaded from that session also counts as "working with the session", since it may trigger lazy loads etc. So objects loaded from a still-active session should normally only be accessed from a single thread at a time.
As with any violation of thread-safety rules, there is no guarantee that it will break. But there is no promise that it will work either.
Your Test
You can have each thread access CurrentSession, and put the instance in some shared collection, where the test runner thread can then access the collection of sessions and verify that all elements in the collection are distinct instances.
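The test described above might look roughly like this. FakeSession and SessionProvider are hypothetical stand-ins (for ISession and for the DAO's CurrentSession property) so the sketch runs without NHibernate; in a real test each thread would read CurrentSession instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for ISession.
sealed class FakeSession { }

// Hypothetical stand-in for the DAO's CurrentSession: one session per thread.
static class SessionProvider
{
    private static readonly ThreadLocal<FakeSession> PerThread =
        new ThreadLocal<FakeSession>(() => new FakeSession());

    public static FakeSession Current { get { return PerThread.Value; } }
}

static class SessionTest
{
    public static bool SessionsAreDistinct(int threadCount)
    {
        // Thread-safe collection shared by all worker threads.
        var seen = new ConcurrentBag<FakeSession>();

        var threads = Enumerable.Range(0, threadCount)
            .Select(_ => new Thread(() => seen.Add(SessionProvider.Current)))
            .ToList();
        threads.ForEach(t => t.Start());
        threads.ForEach(t => t.Join());

        // FakeSession does not override Equals, so Distinct compares by reference.
        return seen.Count == threadCount && seen.Distinct().Count() == threadCount;
    }
}

static class Program
{
    public static void Main()
    {
        Console.WriteLine(SessionTest.SessionsAreDistinct(8));
    }
}
```

Only the session references are shared with the test runner thread here; nothing is called on a session from a foreign thread, which keeps the test itself within the thread-safety rules above.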
I'm using a singleton pattern for the datacontext in my web application so that I don't have to instantiate it every time, however I'm not sure how web applications work, does IIS open a thread for every user connected? If so, what would happen if my singleton is not thread safe? Also, is it OK to use a singleton pattern for the datacontext? Thanks.
I'm using a singleton pattern for the datacontext in my web application
"Singleton" can mean many different things in this context. Is it single-instance per request? Per session? Per thread? Per AppDomain (static instance)? The implications of all of these are drastically different.
A "singleton" per request (stored in the HttpContext) is fine. A singleton per session is discouraged, but can be made to work. A singleton per thread may appear to work but is likely to result in unexpected and difficult-to-debug behaviour. A singleton per Application or AppDomain is a disaster waiting to happen.
so that I don't have to instantiate it every time
Creating a DataContext is very, very cheap. The metadata is globally cached, and connections aren't created until you actually execute a query. There is no reason to try to optimize away the construction of a DataContext instance.
however I'm not sure how web applications work, does IIS open a thread for every user connected?
IIS uses a different thread for every request, but a single request may use multiple threads, and the threads are taken from the Thread Pool, which means that ultimately the same user will have requests on many different threads, and conversely, different users will share the same thread over multiple requests and an extended period of time. That is why I mention above that you cannot rely on a Thread-Local Singleton.
If so, what would happen if my singleton is not thread safe?
Very bad things. Anything that you cache globally in an ASP.NET application either needs to be made thread safe or needs to be locked while it is in use.
Also, is it OK to use a singleton pattern for the datacontext? Thanks.
A DataContext is not thread-safe, and in this case, even if you lock the DataContext while it is in use (which is already a poor idea), you can still run into cross-thread/cross-request race conditions. Don't do this.
DataContext instances should be confined to the scope of a single method when possible, using the using statement. The next best thing is to store them in the HttpContext. If you must, you can store one in the Session, but there are many things you need to be aware of (see this question I answered recently on the ObjectContext - almost all of the same principles apply to a DataContext).
But above all, do not create "global" singleton instances of a DataContext in an ASP.NET application. You will deeply regret it later.
Many people keep the DataContext around for the duration of the request by keeping it in HttpContext.Current.Items. That way it is also private to the request.
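A sketch of the per-request approach, under some loudly stated assumptions: the items parameter stands in for HttpContext.Current.Items (which is an IDictionary), FakeDataContext stands in for your DataContext type, and the key name and helper names are invented for this example. A plain Hashtable plays the role of each request's Items collection so the code runs outside ASP.NET.

```csharp
using System;
using System.Collections;

// Hypothetical stand-in for a DataContext.
sealed class FakeDataContext : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

static class RequestScopedContext
{
    private const string Key = "RequestScopedContext.DataContext"; // hypothetical key

    // 'items' stands in for HttpContext.Current.Items.
    public static FakeDataContext GetOrCreate(IDictionary items)
    {
        var db = (FakeDataContext)items[Key];
        if (db == null)
        {
            db = new FakeDataContext();
            items[Key] = db;
        }
        return db;
    }

    // Call once when the request ends (for example from an EndRequest handler).
    public static void DisposeIfCreated(IDictionary items)
    {
        var db = (FakeDataContext)items[Key];
        if (db != null)
        {
            db.Dispose();
            items.Remove(Key);
        }
    }
}

static class Program
{
    public static void Main()
    {
        var requestA = new Hashtable(); // plays the role of one request's Items
        var requestB = new Hashtable(); // plays the role of another request's Items

        var a1 = RequestScopedContext.GetOrCreate(requestA);
        var a2 = RequestScopedContext.GetOrCreate(requestA);
        var b1 = RequestScopedContext.GetOrCreate(requestB);

        Console.WriteLine(ReferenceEquals(a1, a2)); // True: reused within a request
        Console.WriteLine(ReferenceEquals(a1, b1)); // False: never shared across requests

        RequestScopedContext.DisposeIfCreated(requestA);
        RequestScopedContext.DisposeIfCreated(requestB);
    }
}
```

The important half is the disposal at the end of the request; without it the context lives until the garbage collector gets around to it.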
Have a look at this blogpost by Steve Sanderson, and the UnitOfWork pattern.
Static variables are visible to all users within the app domain, not per session. Once created, the variable will sit in memory for the lifetime of the app domain, even if there are no active references to the object.
So if you have some sort of stateful information in a web app that shouldn't be visible to other users, it should absolutely not be static. Store that sort of information in the user's session instead, or convert your static var to something like this:
public static Data SomeData
{
    get
    {
        // Session is an instance member, so go through HttpContext.Current.
        var session = HttpContext.Current.Session;
        if (session["SomeData"] == null)
            session["SomeData"] = new Data();
        return (Data)session["SomeData"];
    }
}
It looks like a static variable, but it's session specific, so the data gets garbage collected when the session dies and it's totally invisible to other users. Thread safety is not guaranteed.
Additionally, if you have stateful information in a static variable, you need some sort of synchronization to modify it, otherwise you'll have a nightmare of race conditions to untangle.
@ryudice, the web server handles each request on its own thread. I think the best approach is to have a datacontext bound to each request, meaning that you should create a new datacontext every time you serve a request. A good way of achieving this is by using a DI tool, such as StructureMap. These kinds of tools allow you to set up the lifecycle of the instances you configure, so for example in your case you would configure your XDataContext class to be HttpContext scoped.
Regards.
Here are Microsoft's examples on how to do multi-tier with LINQ-to-SQL.
http://code.msdn.microsoft.com/multitierlinqtosql
I have a web app that currently uses the current HttpContext to store a LINQ Data Context. The context is persisted for the current request, on a per user basis, per Rick Strahl's blog:
string ocKey = "ocm_" + HttpContext.Current.GetHashCode().ToString("x")
    + Thread.CurrentContext.ContextID.ToString();
if (!HttpContext.Current.Items.Contains(ocKey))
{
    // Get new Data Context and store it in the HTTP Context
}
However, I have some scripts that execute from the global.asax file, that don't have an HttpContext. The HttpContext.Current is NULL, because the server is the one making the "request".
Is there an equivalent object that I can use to store the Data Context? So I don't have to worry about re-creating it, and attaching/detaching objects? I only want to persist the context for the lifetime of my processes.
UPDATED:
I am currently trying to use a static variable in my DAL helper class. On the first call to one of the methods in the class, the DataContext is instantiated and stored in the static variable. At the end of my process, I call another method that calls Dispose on the DataContext and sets the static variable to NULL.
Can you not just use a static variable specifically for those scripts? That will have the same life-time as the AppDomain. You should probably think carefully about any concurrency concerns, but it sounds like the simplest way to keep a value around.
(I've just checked, and although one instance of HttpApplication can be used to service multiple requests, each one only serves one request at a time - which suggests that multiple instances are created for concurrent request processing. I haven't validated this, but it does sound like it wouldn't be safe to keep it in an instance variable.)
EDIT: Josh's answer suggests that you want this to be per-thread. That sounds slightly odd to me, as unless you've got a lot of these events occurring, you're quite likely to only ever see them execute on different threads, making the whole sharing business pointless. If you really do want that sort of thing, I'd suggest just using an instance variable in the HttpApplication-derived class - for exactly the reason described in the paragraph above :)
Why not use the current HttpContext? The scripts in your global.asax file are all the result of a request coming into the server, so there should be a context associated with that request which you can grab.
I don't understand the need for generating the key based on the hashcode or the thread. There is going to be a separate instance of HttpContext for each request that comes in, and that instance is going to be specific to the thread that is processing the request. Because of that, the key is pretty much worthless when it's based on the instance of HttpContext and the thread.
Also, how do you dispose of the DataContext when you are done? It implements IDisposable for a reason, so I would recommend against a shared instance like this.
UPDATE
In the comments, you indicate that a timer is running and executing the scripts. Instead of the timer, I would recommend setting up a Scheduled Task which will call a webservice or predetermined page on the site which will perform the task. Then you will always have an HttpContext to work with.
HttpContext.Current is a static property and should be available from anywhere as long as the code is executing within the context of a request.
In your case you're not executing within the context of a request. You could look at using Application.Cache, but I would caution against holding a DataContext open. I am not very familiar with LINQ to Entities, so I could be wrong, but generally caching database-related items such as connections is bad.
I would also recommend that you consider moving the logic out of your global.asax and into a Windows service. This would give you more control over these tasks; for example, you can shut them down separately from the web site.
Edit
As JS points out, you could use a static variable. You could also define a static field marked with the [ThreadStatic] attribute. This will give each thread its own copy of the variable and can eliminate contention, since you want each thread to have its own copy anyway.
Is there a reason why these need to be handled the same way as the other DataContexts? It seems to me that if the context is only needed inside the event handling routine, you shouldn't need to keep it around. Especially if it is in Application_Start (as per your comment), I wouldn't bother caching it anywhere -- just use it locally and pass it to the other methods as needed.
Set the DataContext as the state parameter when creating the timer. Based on the info you posted on the comments, it seems to me that your DataContext is more related to the timers than anything else.
Also avoid using the same DataContext for different timers, because you would end up with mixed modifications from the different timers. Also make sure the same timer's logic never runs twice at once (for example because the period is too short for the work to finish), since that would cause the same kind of uncontrolled mixing.
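A sketch of both points, assuming System.Threading.Timer and a hypothetical FakeDataContext in place of a real DataContext. The context is passed as the timer's state object, and a simple flag makes an overlapping tick return immediately instead of running the logic twice at once.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a DataContext.
sealed class FakeDataContext : IDisposable
{
    public int Ticks; // counts how many ticks did work against this context
    public void Dispose() { }
}

static class TimerJob
{
    private static int _running; // 0 = idle, 1 = a tick is in progress

    // 'state' is the object that was passed to the Timer constructor.
    public static void OnTick(object state)
    {
        // Skip this tick if the previous one has not finished yet.
        if (Interlocked.CompareExchange(ref _running, 1, 0) != 0) return;
        try
        {
            var db = (FakeDataContext)state;
            db.Ticks++; // a real job would query and submit changes here
        }
        finally
        {
            Volatile.Write(ref _running, 0);
        }
    }
}

static class Program
{
    public static void Main()
    {
        using (var db = new FakeDataContext())
        {
            // The context is handed to this one timer as its state object.
            using (var timer = new Timer(TimerJob.OnTick, db, 0, 50))
            {
                Thread.Sleep(300); // let a few ticks fire
            }
            Console.WriteLine("ticks handled: " + db.Ticks);
        }
    }
}
```

Note that Timer.Dispose() does not wait for a callback that is already running, so a real service should make sure the last tick has finished before disposing the context. Creating a fresh context inside each tick avoids that problem entirely and fits the "unit of work" advice better.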