Deterministic dispose of ThreadStatic objects

Deterministic dispose of ThreadStatic objects - c#

The ThreadStatic attribute declares a static variable as unique-per-thread.
Do you know an easy pattern to correctly dispose such variables?
What we used before ThreadStatic is a ThreadContextManager. Every thread was allocated a ThreadContext which retained all thread-specific information. We spawned some threads and let them work. Then, when they all finished, we disposed of the ThreadContentManager, which in turn disposed all the contexts if they were IDisposable.
I don't see an immediate way to translate this pattern to ThreadStatic objects. The objects will be disposed of eventualy, because the threads die, and so nothing reference them. However, we prefer deterministic dispose whenever possible.
Update
I do not really control the threads directly - I'm using Microsoft CCR, which has a ThreadPool that does tasks. When all the tasks are done, I'm disposing the Dispatcher (which holds the threadpool). The thing is - I do not get a chance to do anything "at the end of a thread's main function" - so I can't dispose things manually at the end of a thread's run. Can I access the thread's static objects from outside the thread somehow?

You can still use the equivalent of your ThreadContextManager class to handle the dispose. The spawned threads dispose of this 'manager' object which in turn takes out all the other thread static objects it knows about.
I prefer to have relatively few thread static objects and use a context object instead. This keeps the thread specific state in only a few places, and makes patterns like this easier.
Update: to handle the threadpool case you could create a base 'task' object that is the one that you pass to the thread pool. It can perform any generic initialization your code needs, invoke the 'real' task and then performs any cleanup needed.

Related

Prevent multiple identical processes from spawning [duplicate]

I have the following code:
using (Mutex mut = new Mutex(false, MUTEX_NAME))
{
if (mut.WaitOne(new TimeSpan(0, 0, 30)))
{
// Some code that deals with a specific TCP port
// Don't want this to run at the same time in another process
}
}
I've set a breakpoint within the if block, and ran the same code within another instance of Visual Studio. As expected, the .WaitOne call blocks. However, to my surprise, as soon as I continue in the first instance and the using block terminates, I get an exception in the second process about an abandoned Mutex.
The fix is to call ReleaseMutex:
using (Mutex mut = new Mutex(false, MUTEX_NAME))
{
if (mut.WaitOne(new TimeSpan(0, 0, 30)))
{
// Some code that deals with a specific TCP port
// Don't want this to run twice in multiple processes
}
mut.ReleaseMutex();
}
Now, things work as expected.
My Question: Usually the point of an IDisposable is it cleans up whatever state you put things in. I could see perhaps having multiple waits and releases within a using block, but when the handle to the Mutex is disposed, shouldn't it get released automatically? In other words, why do I need to call ReleaseMutex if I'm in a using block?
I'm also now concerned that if the code within the if block crashes, I'll have abandoned mutexes lying around.
Is there any benefit to putting Mutex in a using block? Or, should I just new up a Mutex instance, wrap it in a try/catch, and call ReleaseMutex() within the finally block (Basically implementing exactly what I thought Dispose() would do)

The documentation explains (in the "Remarks" section) that there is a conceptual difference between instantiating a Mutex object (which does not, in fact, do anything special as far as synchronization goes) and acquiring a Mutex (using WaitOne). Note that:
WaitOne returns a boolean, meaning that acquiring a Mutex can fail (timeout) and both cases must be handled
When WaitOne returns true, then the calling thread has acquired the Mutex and must call ReleaseMutex, or else the Mutex will become abandoned
When it returns false, then the calling thread must not call ReleaseMutex
So, there's more to Mutexes than instantiation. As for whether you should use using anyway, let's take a look at what Dispose does (as inherited from WaitHandle):
protected virtual void Dispose(bool explicitDisposing)
{
if (this.safeWaitHandle != null)
{
this.safeWaitHandle.Close();
}
}
As we can see, the Mutex is not released, but there is some cleanup involved, so sticking with using would be a good approach.
As to how you should proceed, you can of course use a try/finally block to make sure that, if the Mutex is acquired, that it gets properly released. This is likely the most straightforward approach.
If you really don't care about the case where the Mutex fails to be acquired (which you haven't indicated, since you pass a TimeSpan to WaitOne), you could wrap Mutex in your own class that implements IDisposable, acquire the Mutex in the constructor (using WaitOne() with no arguments), and release it inside Dispose. Although, I probably wouldn't recommend this, as this would cause your threads to wait indefinitely if something goes wrong, and regardless there are good reasons for explicitly handling both cases when attempting an acquire, as mentioned by #HansPassant.

This design decision was made a long, long time ago. Over 21 years ago, well before .NET was ever envisioned or the semantics of IDisposable were ever considered. The .NET Mutex class is a wrapper class for the underlying operating system support for mutexes. The constructor pinvokes CreateMutex, the WaitOne() method pinvokes WaitForSingleObject().
Note the WAIT_ABANDONED return value of WaitForSingleObject(), that's the one that generates the exception.
The Windows designers put the rock-hard rule in place that a thread that owns the mutex must call ReleaseMutex() before it exits. And if it doesn't that this is a very strong indication that the thread terminated in an unexpected way, typically through an exception. Which implies that synchronization is lost, a very serious threading bug. Compare to Thread.Abort(), a very dangerous way to terminate a thread in .NET for the same reason.
The .NET designers did not in any way alter this behavior. Not in the least because there isn't any way to test the state of the mutex other than by performing a wait. You must call ReleaseMutex(). And do note that your second snippet is not correct either; you cannot call it on a mutex that you didn't acquire. It must be moved inside of the if() statement body.

Ok, posting an answer to my own question. From what I can tell, this is the ideal way to implement a Mutex that:
Always gets Disposed
Gets Released iff WaitOne was successful.
Will not get abandoned if any code throws an exception.
Hopefully this helps someone out!
using (Mutex mut = new Mutex(false, MUTEX_NAME))
{
if (mut.WaitOne(new TimeSpan(0, 0, 30)))
{
try
{
// Some code that deals with a specific TCP port
// Don't want this to run twice in multiple processes
}
catch(Exception)
{
// Handle exceptions and clean up state
}
finally
{
mut.ReleaseMutex();
}
}
}
Update: Some may argue that if the code within the try block puts your resource in an unstable state, you should not release the Mutex and instead let it get abandoned. In other words, just call mut.ReleaseMutex(); when the code finishes successfully, and not put it within the finally block. The code acquiring the Mutex could then catch this exception and do the right thing.
In my situation, I'm not really changing any state. I'm temporarily using a TCP port and can't have another instance of the program run at the same time. For this reason, I think my solution above is fine but yours may be different.

One of the primary uses of a mutex is to ensure that the only code which will ever see a shared object in a state which doesn't satisfy its invariants is the code which (hopefully temporarily) put the object into that state. A normal pattern for code which needs to modify an object is:
Acquire mutex
Make changes to object which cause its state to become invalid
Make changes to object which cause its state to become valid again
Release mutex
If something goes wrong in after #2 has begun and before #3 has finished, the object may be left in a state which does not satisfy its invariants. Since the proper pattern is to release a mutex before disposing it, the fact that code disposes a mutex without releasing it implies that something went wrong somewhere. As such, it may not be safe for code to enter the mutex (since it hasn't been released), but there's no reason to wait for the mutex to be released (since--having been disposed--it never will be). Thus, the proper course of action is to throw an exception.
A pattern which is somewhat nicer than the one implemented by the .NET mutex object is to have the "acquire" method return an IDisposable object which encapsulates not the mutex, but rather a particular acquisition thereof. Disposing that object will then release the mutex. Code can then look something like:
using(acq = myMutex.Acquire())
{
... stuff that examines but doesn't modify the guarded resource
acq.EnterDanger();
... actions which might invalidate the guarded resource
... actions which make it valid again
acq.LeaveDanger();
... possibly more stuff that examines but doesn't modify the resource
}
If the inner code fails between EnterDanger and LeaveDanger, then the acquisition object should invalidate the mutex by calling Dispose on it, since the guarded resource may be in a corrupted state. If the inner code fails elsewhere, the mutex should be released since the guarded resource is in a valid state, and the code within the using block won't need to access it anymore. I don't have any particular recommendations of libraries implementing that pattern, but it isn't particularly difficult to implement as a wrapper around other kinds of mutex.

We need to understand more then .net to know what is going on the start of the MSDN page gives the first hint that someone “odd” is going on:
A synchronization primitive that can also be used for interprocess
synchronization.
A Mutex is a Win32 “Named Object”, each process locks it by name, the .net object is just a wrapper round the Win32 calls. The Muxtex itself lives within the Windows Kernal address space, not your application address space.
In most cases you are better off using a Monitor, if you are only trying to synchronizes access to objects within a single process.

If you need to garantee that the mutex is released switch to a try catch finally block and put the mutex release in the finally block. It is assumed that you own and have a handle for the mutex. That logic needs to be included before release is invoked.

Reading the documentation for ReleaseMutex, it seems the design decision was that a Mutex should be released consciously. if ReleaseMutex isn't called, it signifies an abnormal exit of the protected section. putting the release in a finally or dispose, circumvents this mechanism. you are still free to ignore the AbandonedMutexException, of course.

Be aware: The Mutex.Dispose() executed by the Garbage collector fails because the garbage collection process does not own the handle according Windows.

Dispose depends on WaitHandle to be released. So, even though using calls Dispose, it won't go into affect until the the conditions of stable state are met. When you call ReleaseMutex you're telling the system that you're releasing the resource, and thus, it is free to dispose of it.

For the last question.
Is there any benefit to putting Mutex in a using block? Or, should I just new up a Mutex instance, wrap it in a try/catch, and call ReleaseMutex() within the finally block (Basically implementing exactly what I thought Dispose() would do)
If you don't dispose of the mutex object, creating too many mutex objects may encounter the following issue.
---> (Inner Exception #4) System.IO.IOException: Not enough storage is available to process this command. : 'ABCDEFGHIJK'
at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
at NormalizationService.Controllers.PhysicalChunkingController.Store(Chunk chunk, Stream bytes) in /usr/local/...
The program uses the named mutex and runs 200,000 times in the parallel for loop. Adding using statement resolves the issue.

C# Threading - Using a class in a thread-safe way vs. implementing it as thread-safe

Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
Is this means that I use the class in a thread-safe way, and the fact that the documentation state that it's not thread-safe
is no longer relevant to my situation?
If the answer is No: Can I do everything related to a specific object in the same thread - i.e, creating it and calling its members always in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app).

No, it is not thread safe. As a general rule, you should never write multi threaded code without some kind of synchronization. In your first example, even if you somehow manage to ensure that modifying/reading is never done at the same time, still there is a problem of caching values and instructions reordering.
Just for example, CPU caches values into a register, you update it on one thread, read it from another. If the second one has it cached, it doesn't go to RAM to fetch it and doesn't see the updated value.
Take a look at this great post for more info and problems with writing lock free multi threaded code link. It has a great explanation how CPU, compiler and CLI byte code compiler can reorder instructions.

Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe).
"Thread-safe" has a number of different meanings. Most objects fall into one of three categories:
Thread-affine. These objects can only be accessed from a single thread, never from another thread. Most UI components fall into this category.
Thread-safe. These objects can be accessed from any thread at any time. Most synchronization objects (including concurrent collections) fall into this category.
One-at-a-time. These objects can be accessed from one thread at a time. This is the "default" category, with most .NET types falling into this category.
Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
As another answerer noted, you have to take into consideration instruction reordering and cached reads. In other words, it's not sufficient to just do these at different times; you'll need to implement proper barriers to ensure it is guaranteed to work correctly.
The easiest way to do this is to protect all access of the object with a lock statement. If all reads, writes, and method calls are all within the same lock, then this would work (assuming the object does have a one-at-a-time kind of threading model and not thread-affine).

Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
All Classes are by default non thread safe, except few Collections like Concurrent Collections designed specifically for the thread safety. So for any other class that you may choose and if you access it via multiple threads or in a Non atomic manner, whether read / write then it's imperative to introduce thread safety while changing the state of an object. This only applies to the objects whose state can be modified in a multi-threaded environment but Methods as such are just functional implementation, they are themselves not a state, which can be modified, they just introduce thread safety for maintaining the object state.
Is this means that I use the class in a thread-safe way, and the fact that the documentation state that it's not thread-safe is no longer relevant to my situation? If the answer is No: Can I do everything related to a class in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app).
For a Ui application, consider introducing Async-Await for IO based operations, like file read, database read and use TPL for compute bound operations. Benefit of Async-Await is that:
It doesn't block the Ui thread at all, and keeps Ui completely responsive, in fact post await Ui controls can be directly updated with no Cross thread concern, since only one thread is involved
The TPL concurrency too makes compute operations blocking, they summon the threads from the thread Pool and can't be used for the Ui update due to Cross thread concern
And last: there are classes in which one method starts an operation, and another one ends it. For example, using the SpeechRecognitionEngine class you can start a speech recognition session with RecognizeAsync (this method was before the TPL library so it does not return a Task), and then cancel the recognition session with RecognizeAsyncCancel. What if I call RecognizeAsync from one thread and RecognizeAsyncCancel from another one? (It works, but is it "safe"? Will it fail on some conditions which I'm not aware of?)
As you have mentioned the Async method, this might be an older implementation, based on APM, which needs AsyncCallBack to coordinate, something on the lines of BeginXX, EndXX, if that's the case, then nothing much would be required to co-ordinate, as they use AsyncCallBack to execute a callback delegate. In fact as mentioned earlier, there's no extra thread involved here, whether its old version or new Async-Await. Regarding task cancellation, CancellationTokenSource can be used for the Async-Await, a separate cancellation task is not required. Between multiple threads coordination can be done via Auto / Manual ResetEvent.
If the calls mentioned above are synchronous, then use the Task wrapper to return the Task can call them via Async method as follows:
await Task.Run(() => RecognizeAsync())
Though its a sort of Anti-Pattern, but can be useful in making whole call chain Async
Edits (to answer OP questions)
Thanks for your detailed answer, but I didn't understand some of it. At the first point you are saying that "it's imperative to introduce thread safety", but how?
Thread safety is introduced using synchronization constructs like lock, mutex, semaphore, monitor, Interlocked, all of them serve the purpose of saving an object from getting corrupt / race condition. I don't see any steps.
Does the steps I have taken, as described in my post, are enough?
I don't see any thread safety steps in your post, please highlight which steps you are talking about
At the second point I'm asking how to use an object in the same thread all the time (whenever I use it). Async-Await has nothing to do with this, AFAIK.
Async-Await is the only mechanism in concurrency, which since doesn't involved any extra thread beside calling thread, can ensure everything always runs on same thread, since it use the IO completion ports (hardware based concurrency), otherwise if you use Task Parallel library, then there's no way for you to ensure that same / given thread is always use, as that's a very high level abstraction
Check one of my recent detailed answer on threading here, it may help in providing some more detailed aspects

It is not thread-safe, as the technical risk exists, but your policy is designed to cope with the problem and work around the risk. So, if things stand as you described, then you are not having a thread-safe environment, however, you are safe. For now.

What is advantage of ThreadStatic, ThreadLocal, GetData over creating object instance for a thread?

A friend asked me which would be better ThreadStatic or ThreadLocal. Checking the doc I told him ThreadLocal looks more convenient, is available since .NET 4.0, but I don't understand why use any of them over creating object instance for a thread. Their purpose is to store "thread-local-data", so you can call methods less clumsily and avoid locking in some instances. When I wanted such thread-local-data I always was creating something like:
class ThreadHandler
{
SomeClass A;
public ThreadHandler(SomeClass A)
{
this.A = A;
}
public void Worker()
{
}
}
If I want just fire and forget thread it would be new Thread(new ThreadHandler(new SomeClass()).TheWorkerMethod).Start(), if I want to track threads it can be added to collection, if I want to track data ThreadHandler can be added to collection, if I want to handle both I can make Thread property for ThreadHandler and put ThreadHandler to collection, I want threadpool it's QueueUserWorkItem instead of new Thread(). It's short and simple if scope is simple, but easily extensible if scope gets wider.
When I'm trying to google why use ThreadLocal over an object instance all my searches end up with explanation how ThreadLocal is much greater than ThreadStatic, which in my eyes look like people explaining that they had this clumsy screwdriver, but now toolbox has heavy monkey-wrench which is much more convenient for hammering nails. Whilst toolbox had a hammer to begin with.
I understand I'm missing something, because if ThreadStatic/ThreadLocal had no advantage they just wouldn't exist. Can somebody please point out at least one significant advantage of ThreadLocal over creating an object instance for a thread?
UPD: Looks like a double of this, I think when I was googling "java" keyword was throwing me off. So there's at least one advantage - ThreadLocal is more natural to use with Task Parallel Library.

I don't get advantage of ThreadLocal over creating an instance of object for a thread.
You're right, when you have control over the threads being created, and how they're used, it's very handy to just wrap the whole thread in a helper class, and have it get 'thread local' data from there.
The problem is that, especially in institutionally large projects, you don't always have this kind of control. You may start up a thread, and call some code, and that one thread may wind its way through calls in millions of lines of code scattered between 10 projects owned by 3 internal teams and one external contractor team. Good luck plumbing some of those parameters everywhere.
Thread-local storage lets those guys interact without requiring that they have explicit references to the object that represents that thread's context.
A related problem I had was associating data to some thread and every child thread created by that thread (since my large projects create their own threads, and so thread-local doesn't work anymore), see this question I had: Is there any programmable data that is automatically inherited by children Thread objects?
At the end of the day, it's often lazy programming, but sometimes you find situations where you just need it.

ThreadLocal<T> works like a Dictionary<Thread, T>. The problem with a dictionary is that instances belonging to killed or dead threads stay around forever - they don't get garbage collected, because they are referenced by the dictionary. Using ThreadLocal will ensure that, when a thread dies, the instances referenced by that thread are eligible for GC.
Plus, it's a much nicer interface than having to manually deal with a Dictionary<Thread, T>. It Just Works.

ThreadLocal has 2 benefits over ThreadStatic attribute approach, you can avoid to define class-field and it has built in lazy loading feature. your manual collection approach requires locking mechanism, if you look ThreadLocal's source code, you see its optimized to this specific case.

ThreadLocal can get benfits when T type object new and gc frequenctly. And it's thread safe.

Thread safe disposable

I need to ensure thread safety when disposing an IDisposable resource. The resource is cached multiple times in memory. When entries are evicted from the cache we have a callback that calls Dispose().
Therefore if the resource is cached three times, we'll call Dispose() three times.
Is it the responsibility of the IDisposable resource to ensure thread safety or the caller?

It is the responsibility of the class implementing the IDisposable interface to ensure that the method can be called multiple times without throwing exception.
To help ensure that resources are always cleaned up appropriately, a Dispose method should be callable multiple times without throwing an exception.
http://msdn.microsoft.com/en-us/library/fs2xkftw(v=vs.110).aspx
After an object is disposed, calls that fail because the object is disposed can and likely should throw an ObjectDisposedException (to aid in future debugging). If it is important that external objects know whether or not an object is disposed before making calls (because the object is shared), then it is customary to add a public boolean property (IsDisposed/Disposed) that indicates the state of the object.
EDIT: To more clearly cast this answer to the phrasing of the question, the class implementing IDisposable should implementing thread-safety if it is expected the class will be used in a cross-threaded environment. The link I posted shows an example of this at the bottom of the page.

Either
The system which is evicting the values and calling Dispose must use synchronization to ensure the calls don't overlap
The object itself must implement Dispose in such a way that it can be safely called from multiple threads
Both of these are completely valid solutions to the problem. Which one is better will depend a lot on your system.
All other things being equal I would opt for #2. I prefer to have my objects be self sufficient and require as little help as possible to successfully execute in the environment they are designed for. Making it thread safe reduces the knowledge the rest of the system needs to use it correctly

The answer depends on whether the objects you want to dispose adhere to the guidelines or not.
If they do some things get simpler, from IDisposable.Dispose:
If an object's Dispose method is called more than once, the object must ignore all calls after the first one. The object must not throw an exception if its Dispose method is called multiple times. Instance methods other than Dispose can throw an ObjectDisposedException when resources are already disposed.
So if calling Dispose multiple times has no effect other than the first time it's called you only have to ensure that you don't call them at the same time. And that should absolutely be the responsibility of the caller.
Stephen Cleary wrote an article on CodeProject about how to implement IDisposable and the possible problems which I find extremely helpful.

Is it the responsibility of the IDisposable resource to ensure thread safety or the caller?
It is up to the caller to guarantee thread safety through lock primitive or Monitor class IDisposable has nothing to do with thread safety. It's just a signal that the object's instance is ready to be garbage collected.

C# Managed Thread Cleanup

After my application creates a thread using a ParameterizedThreadStart delegate, that thread performs some initialization and runs to completion. Later on, I can observe that this thread is no longer active because its IsAlive property is false and ThreadState property is ThreadState.Stopped.
Once a thread reaches this state they remain in my application, still existing as thread objects until my application shuts down. Are there any steps I can take to dispose of them once they're no longer active? I would like to delete the object and deallocate any resources so that any given moment the only thread objects I have are active threads. Thread doesn't implement IDisposable, though, so I'm not sure how I should do this.

You're holding onto the reference to the thread in your code.
If you have written code that will check the state of the thread, then that code inherently will keep the thread object alive until the GC collects it.
Once you are finished with a thread, or ideally if you don't need to access it, make sure you null all references to it. Thread doesn't implement IDisposable because as you've made clear this wouldn't make sense for a thread.
Threads are native in .Net so you don't have to worry about leaks. If you're certain they will stop then just delete them from your list once you are sure it has finished.

It sounds like you need to let go of your reference to the Thread object, so the garbage collector can discard it. Just set the reference you have to null, and let the GC do its job when it's ready.
Depending on your situation, you may wish to use a WeakReference (or my friend Cyrus' WeakReference<T>).

Is the unmanaged thread still there, did the thread actually return from its ParameterizedThreadStart method? Also try making IsBackground = false

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.