I have a thread safe class which uses a particular resource that needs to be accessed exclusively. In my assessment it does not make sense to have the callers of various methods block on a Monitor.Enter or await a SemaphoreSlim in order to access this resource.
For instance I have some "expensive" asynchronous initialization. Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception). Instead one should create, init and then distribute the instance to multiple threads.
UPDATE 1:
MyClass uses two NamedPipes, one in either direction. The InitBeforeDistribute method is not really initialization, but rather properly setting up a connection in both directions. It does not make sense to make the pipe available to N threads before you have set up the connection. Once it is set up, multiple threads can post work, but only one can actually read/write to the stream. My apologies for obfuscating this with the poor naming in the examples.
UPDATE 2:
If InitBeforeDistribute implemented a SemaphoreSlim(1, 1) with proper await logic (instead of the interlocked operation throwing an exception), is the AddSquare/DoSquare method OK practice? It does not throw a redundant exception (such as in InitBeforeDistribute), while still being lock-free.
The following would be a good bad example:
class MyClass
{
    private int m_isIniting = 0; // exclusive access "lock"
    private volatile bool vm_isInited = false; // vol. because other methods will read it

    public async Task InitBeforeDistribute()
    {
        if (Interlocked.Exchange(ref this.m_isIniting, -1) != 0)
            throw new InvalidOperationException(
                "Cannot init concurrently! Did you distribute before init was finished?");

        try
        {
            if (this.vm_isInited)
                return;

            await Task.Delay(5000) // init asynchronously
                .ConfigureAwait(false);

            this.vm_isInited = true;
        }
        finally
        {
            Interlocked.Exchange(ref this.m_isIniting, 0);
        }
    }
}
Some points:

1. If there is a case where blocking/awaiting access to a lock makes perfect sense, then this example does not (make sense, that is).
2. Since I need to await in the method, I must use something like a SemaphoreSlim if I were to use a "proper" lock. Forgoing the semaphore for the example above allows me to not worry about disposing the class once I'm done with it. (I always disliked the idea of disposing an item used by multiple threads. This is a minor positive, for sure.)
3. If the method is called often there might be some performance benefits, which of course should be measured.
The above example does not make sense with reference to point (3), so here is another example:
class MyClass
{
    private volatile bool vm_isInited = false; // see above example
    private int m_isWorking = 0; // exclusive access "lock"
    private readonly ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>> m_squareWork =
        new ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>>();

    public Task<int> AddSquare(int number)
    {
        if (!this.vm_isInited) // see above example
            throw new InvalidOperationException(
                "You forgot to init! Did you already distribute?");

        var work = new Tuple<int, TaskCompletionSource<int>>(number, new TaskCompletionSource<int>());
        this.m_squareWork.Enqueue(work);

        Task ignored = DoSquare();

        return work.Item2.Task;
    }

    private async Task DoSquare()
    {
        if (Interlocked.Exchange(ref this.m_isWorking, -1) != 0)
            return; // let someone else do the work for you

        do
        {
            try
            {
                Tuple<int, TaskCompletionSource<int>> work;
                while (this.m_squareWork.TryDequeue(out work))
                {
                    await Task.Delay(5000) // Limiting resource that can only be
                        .ConfigureAwait(false); // used by one thread at a time.

                    work.Item2.TrySetResult(work.Item1 * work.Item1);
                }
            }
            finally
            {
                Interlocked.Exchange(ref this.m_isWorking, 0);
            }
        } while (this.m_squareWork.Count != 0 &&
                 Interlocked.Exchange(ref this.m_isWorking, -1) == 0);
    }
}
Are there some of the specific negative aspects of this "lock-free" example that I should pay attention to?
Most questions relating to "lock-free" code on SO generally advise against it, stating that it is for the "experts". Rarely (I could be wrong on this one) do I see suggestions for books/blogs/etc. that one can delve into, should one be so inclined. If there are any such resources I should look into, please share. Any suggestions will be highly appreciated!
Update: a great related article:
Creating High-Performance Locks and Lock-free Code (for .NET)
The main point about lock-free algorithms is not that they are for experts.
The main point is: do you really need a lock-free algorithm here? I can't understand your logic here:
Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception).
Why can't your users simply wait for the result of initialization, and use your resource after that? If you can, simply use the Lazy<T> class or even Asynchronous Lazy Initialization.
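As a hedged sketch of that suggestion (the class and member names below are made up for illustration, not from the question): wrapping the initialization in a Lazy<Task> means every caller awaits the same single initialization, with no hand-rolled interlocked logic:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class LazyInitExample
{
    private readonly Lazy<Task> m_init;

    public LazyInitExample()
    {
        // ExecutionAndPublication guarantees the factory runs exactly once,
        // even if many threads request the value concurrently.
        m_init = new Lazy<Task>(InitAsync, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    private async Task InitAsync()
    {
        await Task.Delay(100).ConfigureAwait(false); // stand-in for the expensive setup
    }

    // Every caller gets the same Task; late callers await an already-completed Task.
    public Task EnsureInitializedAsync() => m_init.Value;
}
```

Callers that arrive "too early" simply await the in-flight initialization instead of getting an exception, which is usually the friendlier contract.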
You really should read about consensus numbers and CAS operations, and why they matter when implementing your own synchronization primitives.
In your code you are using the Interlocked.Exchange method, which isn't really a CAS, as it always exchanges the value, and it has a consensus number equal to 2. This means that a primitive built from it will work correctly for only 2 threads (not in your situation, but still 2).
I tried to determine whether your code works correctly for 3 threads, or whether there are circumstances that leave your application in a damaged state, but after 30 minutes I stopped. And any of your team members will stop, like me, after spending some time trying to understand your code. This is a waste of time, not only yours, but your team's. Don't reinvent the wheel until you really have to.
My favorite book in the related area is Writing High-Performance .NET Code by Ben Watson, and my favorite blog is Stephen Cleary's. If you can be more specific about what kind of book you are interested in, I can add some more references.
Having no locks in a program doesn't make your application lock-free. In a .NET application you really should not use exceptions for internal program flow. Consider the case where the initializing thread isn't scheduled for a while by the OS (for whatever reason).
In this case all other threads in your app will die one by one while trying to access your shared resource. I can't say that this is lock-free code. Yes, there are no locks in it, but it doesn't guarantee the correctness of the program, and thus it isn't lock-free by definition.
The Art of Multiprocessor Programming by Maurice Herlihy and Nir Shavit is a great resource for lock-free and wait-free programming. Lock-free is a progress guarantee rather than a mode of programming, so to argue that an algorithm is lock-free, one has to validate or show proof of the progress guarantee. Lock-free, in simple terms, implies that the blocking or halting of one thread does not block the progress of other threads, or that if a thread is blocked infinitely often, then there is another thread that makes progress infinitely often.
Related
I understand Thread.Abort() is evil from the multitude of articles I've read on the topic, so I'm currently in the process of ripping out all of my aborts in order to replace them with a cleaner approach. After comparing user strategies from people here on Stack Overflow and reading "How to: Create and Terminate Threads (C# Programming Guide)" on MSDN, both of which describe much the same approach -- a volatile bool checking strategy -- which is nice, but I still have a few questions....
Immediately, what stands out to me here is: what if you do not have a simple worker process which is just running a loop of crunching code? For instance, my process is a background file-uploader process. I do in fact loop through each file, so that's something, and sure I could add my while (!_shouldStop) at the top, which covers me every loop iteration, but I have many more business processes which occur before it hits its next loop iteration. I want this cancel procedure to be snappy; don't tell me I need to sprinkle these while loops every 4-5 lines down throughout my entire worker function?!
I really hope there is a better way, could somebody please advise me on if this is in fact, the correct [and only?] approach to do this, or strategies they have used in the past to achieve what I am after.
Thanks gang.
Further reading: All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me. What if it is a linear, but time-consuming, background operation?
Unfortunately there may not be a better option. It really depends on your specific scenario. The idea is to stop the thread gracefully at safe points. That is the crux of the reason why Thread.Abort is not good; because it is not guaranteed to occur at safe points. By sprinkling the code with a stopping mechanism you are effectively manually defining the safe points. This is called cooperative cancellation. There are basically five broad mechanisms for doing this. You can choose the one that best fits your situation.
Poll a stopping flag
You have already mentioned this method. This is a pretty common one. Make periodic checks of the flag at safe points in your algorithm and bail out when it gets signalled. The standard approach is to mark the variable volatile. If that is not possible or inconvenient then you can use a lock. Remember, you cannot mark a local variable as volatile, so if a lambda expression captures it through a closure, for example, then you would have to resort to a different method for creating the memory barrier that is required. There is not a whole lot else that needs to be said for this method.
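A minimal sketch of this pattern (the class and member names are illustrative):

```csharp
using System;
using System.Threading;

class PollingWorker
{
    // volatile: written by the controlling thread, read by the worker thread
    private volatile bool _shouldStop;

    public void DoWork()
    {
        while (!_shouldStop)
        {
            // ... one unit of work per iteration; each check is a safe point ...
        }
        // fall out of the loop at a safe point, clean up, and return normally
    }

    public void RequestStop() => _shouldStop = true;
}
```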
Use the new cancellation mechanisms in the TPL
This is similar to polling a stopping flag except that it uses the new cancellation data structures in the TPL. It is still based on cooperative cancellation patterns. You need to get a CancellationToken and then periodically check IsCancellationRequested. To request cancellation you would call Cancel on the CancellationTokenSource that originally provided the token. There is a lot you can do with the new cancellation mechanisms. You can read more about them here.
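A short sketch of the TPL version (using nothing beyond the standard CancellationTokenSource API):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var cts = new CancellationTokenSource();

var worker = Task.Run(() =>
{
    while (true)
    {
        // each check is a cooperative safe point
        if (cts.Token.IsCancellationRequested)
            return;
        // ... one unit of work ...
    }
});

cts.Cancel();   // request cancellation from the controlling thread
worker.Wait();  // completes shortly after the loop observes the request
```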
Use wait handles
This method can be useful if your worker thread requires waiting on a specific interval or for a signal during its normal operation. You can Set a ManualResetEvent, for example, to let the thread know it is time to stop. You can test the event using the WaitOne function, which returns a bool indicating whether the event was signalled. WaitOne takes a parameter that specifies how much time to wait for the call to return if the event was not signalled in that amount of time. You can use this technique in place of Thread.Sleep and get the stopping indication at the same time. It is also useful if there are other WaitHandle instances that the thread may have to wait on. You can call WaitHandle.WaitAny to wait on any event (including the stop event) all in one call. Using an event can be better than calling Thread.Interrupt since you have more control over the flow of the program (Thread.Interrupt throws an exception, so you would have to strategically place the try-catch blocks to perform any necessary cleanup).
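A sketch of the wait-handle approach, using WaitOne's timeout as a cancellable sleep:

```csharp
using System;
using System.Threading;

var stopEvent = new ManualResetEvent(false);

var worker = new Thread(() =>
{
    // WaitOne(1000) doubles as a cancellable Thread.Sleep: it returns true
    // as soon as the stop event is signalled, or false after the timeout.
    while (!stopEvent.WaitOne(1000))
    {
        // ... periodic work ...
    }
});
worker.Start();

stopEvent.Set(); // signal the thread to stop; it wakes immediately instead of finishing the sleep
worker.Join();
```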
Specialized scenarios
There are several one-off scenarios that have very specialized stopping mechanisms. It is definitely outside the scope of this answer to enumerate them all (never mind that it would be nearly impossible). A good example of what I mean here is the Socket class. If the thread is blocked on a call to Send or Receive then calling Close will interrupt the socket on whatever blocking call it was in, effectively unblocking it. I am sure there are several other areas in the BCL where similar techniques can be used to unblock a thread.
Interrupt the thread via Thread.Interrupt
The advantage here is that it is simple and you do not have to focus on sprinkling your code with anything, really. The disadvantage is that you have little control over where the safe points are in your algorithm. The reason is because Thread.Interrupt works by injecting an exception inside one of the canned BCL blocking calls. These include Thread.Sleep, WaitHandle.WaitOne, Thread.Join, etc. So you have to be wise about where you place them. However, most of the time the algorithm dictates where they go and that is usually fine anyway, especially if your algorithm spends most of its time in one of these blocking calls. If your algorithm does not use one of the blocking calls in the BCL then this method will not work for you. The theory here is that the ThreadInterruptedException is only generated from a .NET waiting call, so it is likely at a safe point. At the very least you know that the thread cannot be in unmanaged code or bail out of a critical section leaving a dangling lock in an acquired state. Despite this being less invasive than Thread.Abort I still discourage its use because it is not obvious which calls respond to it and many developers will be unfamiliar with its nuances.
Well, unfortunately in multithreading you often have to compromise "snappiness" for cleanliness... you can exit a thread immediately if you Interrupt it, but it won't be very clean. So no, you don't have to sprinkle the _shouldStop checks every 4-5 lines, but if you do interrupt your thread then you should handle the exception and exit out of the loop in a clean manner.
Update
Even if it's not a looping thread (i.e. perhaps it's a thread that performs some long-running asynchronous operation or some type of blocking input operation), you can Interrupt it, but you should still catch the ThreadInterruptedException and exit the thread cleanly. I think that the examples you've been reading are very appropriate.
Update 2.0
Yes I have an example... I'll just show you an example based on the link you referenced:
public class InterruptExample
{
    private Thread t;
    private volatile bool alive;

    public InterruptExample()
    {
        alive = false;
        t = new Thread(() =>
        {
            try
            {
                while (alive)
                {
                    /* Do work. */
                }
            }
            catch (ThreadInterruptedException exception)
            {
                /* Clean up. */
            }
        });
        t.IsBackground = true;
    }

    public void Start()
    {
        alive = true;
        t.Start();
    }

    public void Kill(int timeout = 0)
    {
        // somebody tells you to stop the thread
        t.Interrupt();

        // Optionally you can block the caller
        // by making them wait until the thread exits.
        // If they leave the default timeout,
        // then they will not wait at all.
        t.Join(timeout);
    }
}
If cancellation is a requirement of the thing you're building, then it should be treated with as much respect as the rest of your code--it may be something you have to design for.
Let's assume that your thread is doing one of two things at all times:
Something CPU bound
Waiting for the kernel
If you're CPU bound in the thread in question, you probably have a good spot to insert the bail-out check. If you're calling into someone else's code to do some long-running CPU-bound task, then you might need to fix the external code, move it out of process (aborting threads is evil, but aborting processes is well-defined and safe), etc.
If you're waiting for the kernel, then there's probably a handle (or fd, or mach port, ...) involved in the wait. Usually if you destroy the relevant handle, the kernel will return with some failure code immediately. If you're in .net/java/etc. you'll likely end up with an exception. In C, whatever code you already have in place to handle system call failures will propagate the error up to a meaningful part of your app. Either way, you break out of the low-level place fairly cleanly and in a very timely manner without needing new code sprinkled everywhere.
A tactic I often use with this kind of code is to keep track of a list of handles that need to be closed and then have my abort function set a "cancelled" flag and then close them. When the function fails it can check the flag and report failure due to cancellation rather than due to whatever the specific exception/errno was.
You seem to be implying that an acceptable granularity for cancellation is at the level of a service call. This is probably not good thinking--you are much better off cancelling the background work synchronously and joining the old background thread from the foreground thread. It's way cleaner because:
It avoids a class of race conditions when old bgwork threads come back to life after unexpected delays.
It avoids potential hidden thread/memory leaks caused by hanging background processes, by making it impossible for the effects of a hanging background thread to hide.
There are two reasons to be scared of this approach:
You don't think you can abort your own code in a timely fashion. If cancellation is a requirement of your app, the decision you really need to make is a resource/business decision: do a hack, or fix your problem cleanly.
You don't trust some code you're calling because it's out of your control. If you really don't trust it, consider moving it out-of-process. You get much better isolation from many kinds of risks, including this one, that way.
The best answer largely depends on what you're doing in the thread.
Like you said, most answers revolve around polling a shared boolean every couple of lines. Even though you may not like it, this is often the simplest scheme. If you want to make your life easier, you can write a method like ThrowIfCancelled(), which throws some kind of exception if you're done. The purists will say this is (gasp) using exceptions for control flow, but then again cancelling is exceptional, imo.
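A hedged sketch of such a ThrowIfCancelled() helper (all names here are made up for illustration):

```csharp
using System;
using System.Threading;

class CancellableWork
{
    private volatile bool _cancelled;

    public void Cancel() => _cancelled = true;

    // one call per step instead of an if/return block after every line
    private void ThrowIfCancelled()
    {
        if (_cancelled)
            throw new OperationCanceledException();
    }

    public void Run()
    {
        StepOne();
        ThrowIfCancelled();
        StepTwo();
        ThrowIfCancelled();
    }

    private void StepOne() { /* ... */ }
    private void StepTwo() { /* ... */ }
}
```

The exception unwinds straight out of however deep the work is, so you only catch it in one place at the top of the thread.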
If you're doing IO operations (like network stuff), you may want to consider doing everything using async operations.
If you're doing a sequence of steps, you could use the IEnumerable trick to make a state machine. Example:
abstract class StateMachine : IDisposable
{
    public abstract IEnumerable<object> Main();

    public virtual void Dispose()
    {
        // ... override with free-ing code ...
    }

    bool wasCancelled;
    public void Cancel()
    {
        // ... set wasCancelled using locking scheme of choice ...
    }

    public Thread Run()
    {
        var thread = new Thread(() =>
        {
            try
            {
                if (wasCancelled) return;
                foreach (var x in Main())
                {
                    if (wasCancelled) return;
                }
            }
            finally { Dispose(); }
        });
        thread.Start();
        return thread;
    }
}

class MyStateMachine : StateMachine
{
    public override IEnumerable<object> Main()
    {
        DoSomething();
        yield return null;
        DoSomethingElse();
        yield return null;
    }
}

// then call new MyStateMachine().Run() to run.
Overengineering? It depends how many state machines you use. If you just have 1, yes. If you have 100, then maybe not. Too tricky? Well, it depends. Another bonus of this approach is that it lets you (with minor modifications) move your operation into a Timer.tick callback and avoid threading altogether if it makes sense.
and do everything that blucz says too.
Perhaps a piece of the problem is that you have such a long method / while loop. Whether or not you are having threading issues, you should break it down into smaller processing steps. Let's suppose those steps are Alpha(), Bravo(), Charlie() and Delta().
You could then do something like this:
public void MyBigBackgroundTask()
{
    Action[] tasks = new Action[] { Alpha, Bravo, Charlie, Delta };
    int workStepSize = 0;
    while (!_shouldStop)
    {
        tasks[workStepSize++]();
        workStepSize %= tasks.Length;
    }
}
So yes it loops endlessly, but checks if it is time to stop between each business step.
You don't have to sprinkle while loops everywhere. The outer while loop just checks if it's been told to stop and if so doesn't make another iteration...
If you have a straight "go do something and close out" thread (no loops in it) then you just check the _shouldStop boolean either before or after each major spot inside the thread. That way you know whether it should continue on or bail out.
for example:
public void DoWork() {
    RunSomeBigMethod();
    if (_shouldStop) { return; }
    RunSomeOtherBigMethod();
    if (_shouldStop) { return; }
    //....
}
Instead of adding a while loop where a loop doesn't otherwise belong, add something like if (_shouldStop) CleanupAndExit(); wherever it makes sense to do so. There's no need to check after every single operation or sprinkle the code all over with them. Instead, think of each check as a chance to exit the thread at that point and add them strategically with this in mind.
All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me
There are not a lot of ways to make code take a long time. Looping is a pretty essential programming construct. Making code take a long time without looping takes a huge amount of statements. Hundreds of thousands.
Or calling some other code that is doing the looping for you. Yes, hard to make that code stop on demand. That just doesn't work.
I'm creating an app where I have a 50x50 map. On this map I can add dots, which are new instances of the class "dot". Every dot has its own thread, and every thread connected with a specific dot runs the "explore" method of the class. In this method there is another method, "check_place(x,y)", which is responsible for checking whether some place on the map was already discovered. If not, the static variable of the class, "num_discovered", should be incremented. This single method "check_place(x,y)" is accessed in real time by every thread started in the app.
Constructor:
public dot(Form1 F)
{
    // ...
    thread = new System.Threading.Thread(new System.Threading.ThreadStart(explore)); // thread executing the explore method of the robot class
    thread.Start();
}
check_place(x,y) method:
static void check_place(int x, int y)
{
    lock (ob)
    {
        if (discovered[x, y] == false)
        {
            discovered[x, y] = true;
            num_discovered += 1;
        }
    }
}
In the explore method I'm invoking method "check_place(x,y)" like this:
dot.check_place(x, y);
Is it enough to achieve a situation where at any one time only one dot can check whether a place was already discovered?
Is it enough to achieve a situation where at any one time only one dot can check whether a place was already discovered?
Yes. But what's the point?
If threads are spending all of their time waiting on other threads, what have you gained from being multi-threaded?
There are three (sometimes overlapping) reasons to spawn more threads:
To make use of more than one core at the same time: overall throughput increases.
To have work done while another thread is waiting on something else (typically I/O from file, DB or network): overall throughput increases.
To respond to user interaction while work is being done: overall throughput decreases, but it feels faster to the user because they are being responded to separately.
Here the last doesn't apply.
If your "checking" involved I/O then the second might apply, and this strategy might make sense.
The first could well apply, but because all the threads are spending most of their time waiting on other threads, you don't gain an improvement in throughput.
Indeed, because there is overhead involved in setting up threads and switching between them, this code will be slower than just having one thread do everything: If only one thread can work at a time, then only have one thread!
So your use of a lock here is correct in that it prevents corruption and errors, but pointless in that it makes everything too slow.
What to do about this:
If your real case involves I/O or other reasons why the threads in fact spend most of their time out of each others' way, then what you have is fine.
Otherwise you've got two options.
Easy: Just use one thread.
Hard: Have finer locking.
One way to have finer locking would be to do double-checking:
static void check_place(int x, int y)
{
    if (!discovered[x, y])
        lock (ob)
            if (!discovered[x, y])
            {
                discovered[x, y] = true;
                num_discovered += 1;
            }
}
Now at the very least some threads will skip past some cases where discovered[x, y] is true without holding up the other threads.
This is useful when a thread is going to get a result at the end of the locked period. It's still not good enough here though, because it's just going to move on quickly to a case where it fights for the lock again.
If our lookup of discovered were itself thread-safe and that thread-safety was finely grained, then we could make some progress:
static void check_place(int x, int y)
{
    if (discovered.SetIfFalse(x, y))
        Interlocked.Increment(ref num_discovered);
}
So far though we've just moved the problem around; how do we make SetIfFalse thread-safe without using a single lock and causing the same problem?
There are a few approaches. We could use striped locks, or low-locking concurrent collections.
It seems that you have a fixed-size structure of 50×50, in which case this isn't too hard:
private class DotMap
{
    // ints because we can't use Interlocked with bools
    private int[][] _map = new int[50][];

    public DotMap()
    {
        for (var i = 0; i != 50; ++i)
            _map[i] = new int[50];
    }

    public bool SetIfFalse(int x, int y)
    {
        return Interlocked.CompareExchange(ref _map[x][y], 1, 0) == 0;
    }
}
Now our advantages are:
All of our locking is much lower-level (but note that Interlocked operations will still slow down in the face of contention, albeit not as much as lock).
Much of our locking is out of the way of other locking. Specifically, that in SetIfFalse can allow for separate areas to be checked without being in each others way at all.
This is neither a panacea though (such approaches still suffer in the face of contention, and also bring their own costs) nor easy to generalise to other cases (changing SetIfFalse to something that does anything more than check and change that single value is not easy). It's still quite likely that even on a machine with a lot of cores this would be slower than the single-threaded approach.
Another possibility is to not make SetIfFalse thread-safe at all, but to ensure that the threads are each partitioned from the others so that they never hit the same values, and that the structure is safe in the case of such multi-threaded access (fixed arrays of elements above machine word-size are thread-safe when threads only ever hit different indices; most mutable structures where one can Add and/or Remove are not).
In all, you've got the right idea about how to use lock to keep threads from causing errors, and that is the approach to use 98% of the time when something lends itself well to multithreading because it involves threads waiting on something else. Your example though hits that lock too much to benefit from multiple cores, and creating code that does is not trivial.
Your performance on this could potentially be pretty bad - I recommend using Task.Run here to increase efficiency when you need to run your explore method on multiple threads in parallel.
As far as locking and thread safety, if the lock in check_place is the only place you're setting bools in the discovered variable and setting the num_discovered variable, the existing code will work. If you start setting them from somewhere else in the code, you will need to use locks there as well.
Also, when reading from these variables, you should read these values into local variables inside other locks using the same lock object to maintain thread safety here as well.
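A sketch of what reading under the same lock object looks like (the field names mirror the question; the reader method GetDiscoveredCount is hypothetical):

```csharp
using System;

static class Map
{
    private static readonly object ob = new object();
    private static bool[,] discovered = new bool[50, 50];
    private static int num_discovered;

    public static void check_place(int x, int y)
    {
        lock (ob)
        {
            if (discovered[x, y] == false)
            {
                discovered[x, y] = true;
                num_discovered += 1;
            }
        }
    }

    // copy the shared counter into a local (here, the return value)
    // under the same lock object used by the writers
    public static int GetDiscoveredCount()
    {
        lock (ob)
        {
            return num_discovered;
        }
    }
}
```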
I have other suggestions but those are the two most basic things you need here.
How come if I have a statement like this:
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (this) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (this) { sharedValue--; }
}
So for a thread to get into a lock it must first check if another thread is operating on it. If it isn't, it can enter and has to write something to memory, this surely cannot be atomic as it needs to read and write.
So how come it's impossible for one thread to be reading the lock, while the other is writing its ownership to it?
To simplify: why can't two threads both get into a lock at the same time?
It looks like you are basically asking how the lock works. How can the lock maintain internal state in an atomic manner without already having the lock built? It seems like a chicken and egg problem at first does it not?
The magic all happens because of a compare-and-swap (CAS) operation. The CAS operation is a hardware level instruction that does 2 important things.
It generates a memory barrier so that instruction reordering is constrained.
It compares the contents of a memory address with another value and if they are equal then the original value is replaced with a new value. It does all of this in an atomic manner.
At the most fundamental level this is how the trick is accomplished. It is not that all other threads are blocked from reading while another is writing. That is totally the wrong way to think about it. What actually happens is that all threads are acting as writers simultaneously. The strategy is more optimistic than it is pessimistic. Every thread is trying to acquire the lock by performing this special kind of write called a CAS. You actually have access to a CAS operation in .NET via the Interlocked.CompareExchange (ICX) method. Every synchronization primitive can be built from this single operation.
If I were going to write a Monitor-like class (that is what the lock keyword uses behind the scenes) from scratch entirely in C# I could do it using the Interlocked.CompareExchange method. Here is an overly simplified implementation. Please keep in mind that this is most certainly not how the .NET Framework does it.1 The reason I present the code below is to show you how it could be done in pure C# code without the need for CLR magic behind the scenes and because it might get you thinking about how Microsoft could have implemented it.
public class SimpleMonitor
{
    private int m_LockState = 0;

    public void Enter()
    {
        int iterations = 0;
        while (!TryEnter())
        {
            if (iterations < 10) Thread.SpinWait(4 << iterations);
            else if (iterations % 20 == 0) Thread.Sleep(1);
            else if (iterations % 5 == 0) Thread.Sleep(0);
            else Thread.Yield();
            iterations++;
        }
    }

    public void Exit()
    {
        if (!TryExit())
        {
            throw new SynchronizationLockException();
        }
    }

    public bool TryEnter()
    {
        return Interlocked.CompareExchange(ref m_LockState, 1, 0) == 0;
    }

    public bool TryExit()
    {
        return Interlocked.CompareExchange(ref m_LockState, 0, 1) == 1;
    }
}
This implementation demonstrates a couple of important things.
It shows how the ICX operation is used to atomically read and write the lock state.
It shows how the waiting might occur.
Notice how I used Thread.SpinWait, Thread.Sleep(0), Thread.Sleep(1) and Thread.Yield while the lock is waiting to be acquired. The waiting strategy is overly simplified, but it does approximate a real life algorithm implemented in the BCL already. I intentionally kept the code simple in the Enter method above to make it easier to spot the crucial bits. This is not how I would have normally implemented this, but I am hoping it does drive home the salient points.
Also note that my SimpleMonitor above has a lot of problems. Here are but only a few.
It does not handle nested locking.
It does not provide Wait or Pulse methods like the real Monitor class. They are really hard to do right.
1The CLR will actually use a special block of memory that exists on each reference type. This block of memory is referred to as the "sync block". The Monitor will manipulate bits in this block of memory to acquire and release the lock. This action may require a kernel event object. You can read more about it on Joe Duffy's blog.
lock in C# uses a Monitor behind the scenes, which is what actually performs the locking.
You can read more about Monitor in here: http://msdn.microsoft.com/en-us/library/system.threading.monitor.aspx. The Enter method of the Monitor ensures that only one thread can enter the critical section at the time:
Acquires a lock for an object. This action also marks the beginning of a critical section. No other thread can enter the critical section unless it is executing the instructions in the critical section using a different locked object.
BTW, you should avoid locking on this (lock(this)). You should use a private variable on a class (static or non-static) to protect the critical section. You can read more in the same link provided above but the reason is:
When selecting an object on which to synchronize, you should lock only on private or internal objects. Locking on external objects might result in deadlocks, because unrelated code could choose the same objects to lock on for different purposes.
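A minimal sketch of that advice, with an invented class name, looks like this:

```csharp
class Counter
{
    // A private lock object: outside code cannot lock on it, so unrelated
    // code cannot cause deadlocks by taking the same lock for its own purposes.
    private readonly object _sync = new object();
    private int _count;

    public void Increment()
    {
        lock (_sync) // compiles down to Monitor.Enter/Monitor.Exit on _sync
        {
            _count++;
        }
    }
}
```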
I'm working on a system that requires extensive C API interop. Part of the interop requires initialization and shutdown of the system in question before and after any operations. Failure to do either will result in instability in the system. I've accomplished this by simply implementing reference counting in a core disposable environment class like this:
public FooEnvironment()
{
    lock (EnvironmentLock)
    {
        if (_initCount == 0)
        {
            Init(); // global startup
        }
        _initCount++;
    }
}
private void Dispose(bool disposing)
{
    if (_disposed)
        return;
    if (disposing)
    {
        lock (EnvironmentLock)
        {
            _initCount--;
            if (_initCount == 0)
            {
                Term(); // global termination
            }
        }
    }
    _disposed = true; // without this, the _disposed guard above never fires
}
This works fine and accomplishes the goal. However, since any interop operation must be nested in a FooEnvironment using block, we are locking all the time, and profiling suggests that this locking accounts for close to 50% of the work done at run-time. It seems to me that this is a fundamental enough concept that something in .NET or the CLR must address it. Is there a better way to do reference counting?
This is a trickier task than you might expect at first blush. I don't believe that Interlocked.Increment will be sufficient for your task. Rather, I expect you will need to perform some wizardry with CAS (Compare-And-Swap).
Note also that it's very easy to get this mostly-right, but mostly-right is still completely wrong when your program crashes with heisenbugs.
I strongly suggest some genuine research before going down this path. A couple of good jumping-off points come up if you search for "lock-free reference counting." This Dr. Dobb's article is useful, and this SO question might be relevant.
Above all, remember that lock free programming is hard. If this is not your specialty, consider stepping back and adjusting your expectations around the granularity of your reference counts. It may be much, much less expensive to rethink your fundamental refcount policy than to create a reliable lock-free mechanism if you're not an expert. Especially when you don't yet know that a lock-free technique will actually be any faster.
As harold's comment notes the answer is Interlocked:
public FooEnvironment()
{
    if (Interlocked.Increment(ref _initCount) == 1)
    {
        Init(); // global startup
    }
}

private void Dispose(bool disposing)
{
    if (_disposed)
        return;
    if (disposing)
    {
        if (0 == Interlocked.Decrement(ref _initCount))
        {
            Term(); // global termination
        }
    }
    _disposed = true;
}
Both Increment and Decrement return the new count (just for this kind of usage), hence different checks.
But note: this will not work if anything else needs concurrency protection. Interlocked operations are themselves safe, but nothing else is (including different threads relative ordering of Interlocked calls). In the above code Init() can still be running after another thread has completed the constructor.
You could use a static variable in the class: a static field exists once for the type rather than once per instance, so every object shares the same count.
I believe this will give you a safe way using Interlocked.Increment/Decrement.
Note: this is oversimplified; the code below can deadlock if Init() throws an exception. There is also a race condition in Dispose when the count goes to zero, the init is reset, and the constructor is called again. I don't know your program flow, so if you may need to initialize again after several dispose calls, you might be better off with a cheap lock such as a SpinLock instead of Interlocked.Increment.
static ManualResetEvent _inited = new ManualResetEvent(false);

public FooEnvironment()
{
    if (Interlocked.Increment(ref _initCount) == 1)
    {
        Init(); // global startup
        _inited.Set();
    }
    _inited.WaitOne();
}

private void Dispose(bool disposing)
{
    if (_disposed)
        return;
    if (disposing)
    {
        if (Interlocked.Decrement(ref _initCount) == 0)
        {
            _inited.Reset();
            Term(); // global termination
        }
    }
    _disposed = true;
}
Edit:
In thinking about this further, you may want to consider some application redesign: instead of having this class manage Init and Term, just make a single call to Init at application startup and a call to Term when the app comes down. Then you remove the need for locking altogether. And if the lock is showing up as 50% of your execution time, it seems you are always going to want to call Init anyway, so just call it and away you go.
You can make it nearly lock-free by using the following code. It would definitely lower contention and if this is your main problem it would be the solution you need.
Also, I would suggest calling Dispose from the destructor/finalizer (just in case). I have changed your Dispose method: unmanaged resources should be freed regardless of the disposing argument. Check this for details on how to properly dispose of an object.
Hope this helps you.
public class FooEnvironment
{
    private static int _initCount;
    private static volatile bool _initialized; // volatile: read outside the lock
    private static object _environmentLock = new object();
    private bool _disposed;

    public FooEnvironment()
    {
        Interlocked.Increment(ref _initCount);
        if (_initCount > 0 && !_initialized)
        {
            lock (_environmentLock)
            {
                if (_initCount > 0 && !_initialized)
                {
                    Init(); // global startup
                    _initialized = true;
                }
            }
        }
    }

    private void Dispose(bool disposing)
    {
        if (_disposed)
            return;
        if (disposing)
        {
            // Dispose managed resources here
        }
        Interlocked.Decrement(ref _initCount);
        if (_initCount <= 0 && _initialized)
        {
            lock (_environmentLock)
            {
                if (_initCount <= 0 && _initialized)
                {
                    Term(); // global termination
                    _initialized = false;
                }
            }
        }
        _disposed = true;
    }

    ~FooEnvironment()
    {
        Dispose(false);
    }
}
Using Threading.Interlocked.Increment will be a little faster than acquiring a lock, doing an increment, and releasing the lock, but not enormously so. The expensive part of either operation on a multi-core system is enforcing the synchronization of memory caches between cores. The primary advantage of Interlocked.Increment is not speed, but rather the fact that it will complete in a bounded amount of time. By contrast, if one seeks to acquire a lock, perform an increment, and release the lock, even if the lock is used for no purpose other than guarding the counter, there is a risk that one might have to wait forever if some other thread acquires the lock and then gets waylaid.
You don't mention which version of .net you're using, but there are some Concurrent classes that might be of use. Depending upon your patterns of allocating and freeing things, a class that might seem a little tricky but could work well is the ConcurrentBag class. It's somewhat like a queue or stack, except that there's no guarantee that things will come out in any particular order. Include in your resource wrapper a flag indicating whether it's still good, and include with the resource itself a reference to a wrapper. When a resource user is created, throw a wrapper object in the bag. When the resource user is no longer needed, set the "invalid" flag. The resource should remain alive as long as either there's at least one wrapper object in the bag whose "valid" flag is set, or the resource itself holds a reference to a valid wrapper. If, when an item is deleted, the resource doesn't seem to hold a valid wrapper, acquire a lock and, if the resource still doesn't hold a valid wrapper, pull wrappers out of the bag until a valid one is found, and then store that one with the resource (or, if none was found, destroy the resource). If, when an item is deleted, the resource holds a valid wrapper but the bag seems like it might hold an excessive number of invalid items, acquire the lock, copy the bag's contents to an array, and throw the valid items back into the bag. Keep a count of how many items are thrown back, so one can judge when to do the next purge.
This approach may seem more complicated than using locks or Threading.Interlocked.Increment, and there are a lot of corner cases to worry about, but it may offer better performance because ConcurrentBag is designed to reduce resource contention. If processor 1 performs Interlocked.Increment on some location, and then processor 2 does so, processor 2 will have to instruct processor 1 to flush that location from its cache, wait until processor 1 has done so, inform all the other processors that it needs control of that location, load that location into its cache, and finally get around to incrementing it. After all that has happened, if processor 1 needs to increment the location again, the same general sequence of steps will be required. All of this is very slow. The ConcurrentBag class, by contrast, is designed so that multiple processors can add things to a list without cache collisions. Sometime between when things are added and when they're removed, they'll have to be copied to a coherent data structure, but such operations can be performed in batches in such a way as to yield good cache performance.
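A bare-bones sketch of the wrapper idea described above (all names here are invented, and the lock-protected purge pass is only stubbed out):

```csharp
using System.Collections.Concurrent;

class ResourceWrapper
{
    public volatile bool IsValid = true;   // cleared when this user is done
}

class SharedResource
{
    private readonly ConcurrentBag<ResourceWrapper> _users =
        new ConcurrentBag<ResourceWrapper>();
    private readonly object _purgeLock = new object(); // taken only on the slow path

    public ResourceWrapper AddUser()
    {
        var wrapper = new ResourceWrapper();
        _users.Add(wrapper);               // designed for low contention
        return wrapper;
    }

    public void ReleaseUser(ResourceWrapper wrapper)
    {
        wrapper.IsValid = false;           // no lock taken on the fast path
        // Only when we may be the last valid user: lock(_purgeLock), scan the
        // bag for a remaining valid wrapper, and destroy the resource if none.
    }
}
```

The point of the sketch is that the common operations (add a user, mark a user done) never take the lock; only the rare "might be the last one" path does.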
I haven't tried an approach like the above using ConcurrentBag, so I don't know what sort of performance it would actually yield, but depending upon the usage patterns it may be possible to give better performance than would be obtained via reference counting.
The Interlocked class approach works a little faster than the lock statement, but on a multi-core machine the speed advantage may not be very much, because Interlocked instructions must bypass the memory cache layers.
How important is it to call the Term() function when the code is not in use and/or when the program exits?
Frequently, you can just put the call to Init() once in a static constructor for the class that wraps the other APIs, and not really worry about calling Term(). E.g:
static FooEnvironment() {
Init(); // global startup
}
The CLR will ensure that the static constructor will get called once, before any other member functions in the enclosing class.
It’s also possible to hook notification of some (but not all) application shutdown scenarios, making it possible to call Term() on clean shutdowns. See this article. http://www.codeproject.com/Articles/16164/Managed-Application-Shutdown
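One such hook is the AppDomain.ProcessExit event. A sketch combining it with the static-constructor approach might look like the following (the class name is invented; note this event fires only on orderly shutdowns, not on hard kills or every crash):

```csharp
using System;

static class FooEnvironmentBootstrap
{
    static FooEnvironmentBootstrap()
    {
        Init(); // global startup, runs at most once before first use

        // Best-effort cleanup on clean shutdowns only.
        AppDomain.CurrentDomain.ProcessExit += (sender, e) => Term();
    }

    private static void Init() { /* global startup */ }
    private static void Term() { /* global termination */ }
}
```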
I understand Thread.Abort() is evil from the multitude of articles I've read on the topic, so I'm currently in the process of ripping out all of my aborts in order to replace them with a cleaner way; and after comparing user strategies from people here on stackoverflow and reading "How to: Create and Terminate Threads (C# Programming Guide)" from MSDN, both state much the same approach: use a volatile bool checking strategy, which is nice, but I still have a few questions....
Immediately what stands out to me here is: what if you do not have a simple worker process which is just running a loop of crunching code? For instance, my process is a background file uploader. I do in fact loop through each file, so that's something, and sure I could add my while (!_shouldStop) at the top, which covers me every loop iteration, but I have many more business processes which occur before it hits its next loop iteration. I want this cancel procedure to be snappy; don't tell me I need to sprinkle these checks every 4-5 lines down throughout my entire worker function?!
I really hope there is a better way, could somebody please advise me on if this is in fact, the correct [and only?] approach to do this, or strategies they have used in the past to achieve what I am after.
Thanks gang.
Further reading: All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me. What if it is a linear, but timely background operation?
Unfortunately there may not be a better option. It really depends on your specific scenario. The idea is to stop the thread gracefully at safe points. That is the crux of the reason why Thread.Abort is not good; because it is not guaranteed to occur at safe points. By sprinkling the code with a stopping mechanism you are effectively manually defining the safe points. This is called cooperative cancellation. There are basically 4 broad mechanisms for doing this. You can choose the one that best fits your situation.
Poll a stopping flag
You have already mentioned this method. This is a pretty common one. Make periodic checks of the flag at safe points in your algorithm and bail out when it gets signalled. The standard approach is to mark the variable volatile. If that is not possible or inconvenient then you can use a lock. Remember, you cannot mark a local variable as volatile, so if a lambda expression captures it through a closure, for example, then you would have to resort to a different method for creating the required memory barrier. There is not a whole lot else that needs to be said about this method.
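A minimal sketch of the polling pattern (the class name and the upload methods are stand-ins for your real business logic):

```csharp
using System.Collections.Generic;

class Uploader
{
    private volatile bool _shouldStop; // volatile: the worker always sees the latest write

    public void RequestStop() { _shouldStop = true; }

    public void DoWork()
    {
        foreach (var file in GetFilesToUpload())
        {
            if (_shouldStop)
                return;        // safe point: nothing is left half-done here
            Upload(file);
        }
    }

    // Stand-ins for the real operations:
    private IEnumerable<string> GetFilesToUpload() { yield break; }
    private void Upload(string file) { }
}
```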
Use the new cancellation mechanisms in the TPL
This is similar to polling a stopping flag except that it uses the new cancellation data structures in the TPL. It is still based on cooperative cancellation patterns. You need to get a CancellationToken and then periodically check IsCancellationRequested. To request cancellation you would call Cancel on the CancellationTokenSource that originally provided the token. There is a lot you can do with the new cancellation mechanisms. You can read more about them here.
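A sketch of that pattern (the class name and DoOneUnitOfWork are invented placeholders):

```csharp
using System.Threading;

class CancellableWorker
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private Thread _thread;

    public void Start()
    {
        _thread = new Thread(() =>
        {
            while (true)
            {
                // Safe point: bail out cooperatively once cancellation is requested.
                if (_cts.Token.IsCancellationRequested)
                    return;
                DoOneUnitOfWork();
            }
        });
        _thread.Start();
    }

    public void Stop()
    {
        _cts.Cancel();   // signals the token; the worker exits at its next check
        _thread.Join();
    }

    private void DoOneUnitOfWork() { /* ... */ }
}
```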
Use wait handles
This method can be useful if your worker thread waits for a specific interval or for a signal during its normal operation. You can Set a ManualResetEvent, for example, to let the thread know it is time to stop. You can test the event using the WaitOne function, which returns a bool indicating whether the event was signalled. WaitOne takes a parameter that specifies how long to wait for the call to return if the event is not signalled in that amount of time. You can use this technique in place of Thread.Sleep and get the stopping indication at the same time. It is also useful if there are other WaitHandle instances the thread may have to wait on. You can call WaitHandle.WaitAny to wait on any event (including the stop event) all in one call. Using an event can be better than calling Thread.Interrupt since you have more control over the flow of the program (Thread.Interrupt throws an exception, so you would have to strategically place try-catch blocks to perform any necessary cleanup).
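For example, replacing a Thread.Sleep-based polling loop with a WaitOne-based one might look like this (the poll interval and PollForWork are invented for the sketch):

```csharp
using System;
using System.Threading;

class Poller
{
    private static readonly ManualResetEvent _stopEvent = new ManualResetEvent(false);

    public static void PollingLoop()
    {
        // Wait up to 5 seconds between polls; WaitOne returns true as soon as
        // the stop event is signalled, so stopping is snappy.
        while (!_stopEvent.WaitOne(TimeSpan.FromSeconds(5)))
        {
            PollForWork();
        }
        // Signalled: clean up and exit the thread.
    }

    public static void Stop() { _stopEvent.Set(); }

    private static void PollForWork() { /* stand-in for the periodic operation */ }
}
```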
Specialized scenarios
There are several one-off scenarios that have very specialized stopping mechanisms. It is definitely outside the scope of this answer to enumerate them all (never mind that it would be nearly impossible). A good example of what I mean here is the Socket class. If the thread is blocked on a call to Send or Receive then calling Close will interrupt the socket on whatever blocking call it was in, effectively unblocking it. I am sure there are several other areas in the BCL where similar techniques can be used to unblock a thread.
Interrupt the thread via Thread.Interrupt
The advantage here is that it is simple and you do not have to sprinkle your code with anything, really. The disadvantage is that you have little control over where the safe points are in your algorithm. The reason is that Thread.Interrupt works by injecting an exception inside one of the canned BCL blocking calls. These include Thread.Sleep, WaitHandle.WaitOne, Thread.Join, etc. So you have to be wise about where you place them. However, most of the time the algorithm dictates where they go and that is usually fine anyway, especially if your algorithm spends most of its time in one of these blocking calls. If your algorithm does not use one of the blocking calls in the BCL then this method will not work for you. The theory here is that the ThreadInterruptedException is only generated from a .NET waiting call, so it is likely at a safe point. At the very least you know that the thread cannot be in unmanaged code or bail out of a critical section leaving a dangling lock in an acquired state. Despite this being less invasive than Thread.Abort, I still discourage its use because it is not obvious which calls respond to it and many developers will be unfamiliar with its nuances.
Well, unfortunately in multithreading you often have to compromise "snappiness" for cleanliness... you can exit a thread immediately if you Interrupt it, but it won't be very clean. So no, you don't have to sprinkle the _shouldStop checks every 4-5 lines, but if you do interrupt your thread then you should handle the exception and exit out of the loop in a clean manner.
Update
Even if it's not a looping thread (i.e. perhaps it's a thread that performs some long-running asynchronous operation or some type of block for input operation), you can Interrupt it, but you should still catch the ThreadInterruptedException and exit the thread cleanly. I think that the examples you've been reading are very appropriate.
Update 2.0
Yes I have an example... I'll just show you an example based on the link you referenced:
public class InterruptExample
{
    private Thread t;
    private volatile bool alive;

    public InterruptExample()
    {
        alive = false;
        t = new Thread(() =>
        {
            try
            {
                while (alive)
                {
                    /* Do work. */
                }
            }
            catch (ThreadInterruptedException)
            {
                /* Clean up. */
            }
        });
        t.IsBackground = true;
    }

    public void Start()
    {
        alive = true;
        t.Start();
    }

    public void Kill(int timeout = 0)
    {
        // somebody tells you to stop the thread
        alive = false;   // stops the loop if it is doing CPU-bound work
        t.Interrupt();   // wakes the thread if it is in a blocking call
        // Optionally you can block the caller by making them wait
        // until the thread exits. If they leave the default timeout,
        // then they will not wait at all.
        t.Join(timeout);
    }
}
If cancellation is a requirement of the thing you're building, then it should be treated with as much respect as the rest of your code--it may be something you have to design for.
Let's assume that your thread is doing one of two things at all times.
Something CPU bound
Waiting for the kernel
If you're CPU bound in the thread in question, you probably have a good spot to insert the bail-out check. If you're calling into someone else's code to do some long-running CPU-bound task, then you might need to fix the external code, move it out of process (aborting threads is evil, but aborting processes is well-defined and safe), etc.
If you're waiting for the kernel, then there's probably a handle (or fd, or mach port, ...) involved in the wait. Usually if you destroy the relevant handle, the kernel will return with some failure code immediately. If you're in .net/java/etc. you'll likely end up with an exception. In C, whatever code you already have in place to handle system call failures will propagate the error up to a meaningful part of your app. Either way, you break out of the low-level place fairly cleanly and in a very timely manner without needing new code sprinkled everywhere.
A tactic I often use with this kind of code is to keep track of a list of handles that need to be closed and then have my abort function set a "cancelled" flag and then close them. When the function fails it can check the flag and report failure due to cancellation rather than due to whatever the specific exception/errno was.
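A sketch of that handle-tracking tactic (the class name is invented, and the blocking call is elided; Socket is just one example of a disposable handle):

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

class CancellableConnection
{
    private readonly List<IDisposable> _handles = new List<IDisposable>();
    private volatile bool _cancelled;

    public void Cancel()
    {
        _cancelled = true;
        lock (_handles)
        {
            // Closing a handle makes the kernel fail any call blocked on it.
            foreach (var handle in _handles)
                handle.Dispose();
        }
    }

    public void Receive(Socket socket)
    {
        lock (_handles) _handles.Add(socket);
        try
        {
            // ... blocking Receive call here ...
        }
        catch (SocketException)
        {
            if (_cancelled)
                throw new OperationCanceledException(); // report "cancelled", not the raw error
            throw;
        }
    }
}
```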
You seem to be implying that an acceptable granularity for cancellation is at the level of a service call. This is probably not good thinking: you are much better off cancelling the background work synchronously and joining the old background thread from the foreground thread. It's way cleaner because:
It avoids a class of race conditions when old bgwork threads come back to life after unexpected delays.
It avoids potential hidden thread/memory leaks caused by hanging background processes, by making it impossible for the effects of a hanging background thread to stay hidden.
There are two reasons to be scared of this approach:
You don't think you can abort your own code in a timely fashion. If cancellation is a requirement of your app, the decision you really need to make is a resource/business decision: do a hack, or fix your problem cleanly.
You don't trust some code you're calling because it's out of your control. If you really don't trust it, consider moving it out-of-process. You get much better isolation from many kinds of risks, including this one, that way.
The best answer largely depends on what you're doing in the thread.
Like you said, most answers revolve around polling a shared boolean every couple of lines. Even though you may not like it, this is often the simplest scheme. If you want to make your life easier, you can write a method like ThrowIfCancelled(), which throws some kind of exception if you're done. The purists will say this is (gasp) using exceptions for control flow, but then again cancelling is exceptional, in my opinion.
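That helper can be tiny; a sketch with invented step names:

```csharp
using System;

class LinearWorker
{
    private volatile bool _cancelled;

    public void Cancel() { _cancelled = true; }

    private void ThrowIfCancelled()
    {
        if (_cancelled)
            throw new OperationCanceledException();
    }

    public void DoLinearWork()
    {
        StepOne();
        ThrowIfCancelled();   // one line per bail-out point instead of an if/return pair
        StepTwo();
        ThrowIfCancelled();
        StepThree();
    }

    // Stand-ins for the real steps:
    private void StepOne() { }
    private void StepTwo() { }
    private void StepThree() { }
}
```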
If you're doing IO operations (like network stuff), you may want to consider doing everything using async operations.
If you're doing a sequence of steps, you could use the IEnumerable trick to make a state machine. Example:
abstract class StateMachine : IDisposable
{
    public abstract IEnumerable<object> Main();

    public virtual void Dispose()
    {
        // ... override with free-ing code ...
    }

    volatile bool wasCancelled;

    public void Cancel()
    {
        wasCancelled = true; // or use the locking scheme of your choice
    }

    public Thread Run()
    {
        var thread = new Thread(() =>
        {
            try
            {
                if (wasCancelled) return;
                foreach (var x in Main())
                {
                    if (wasCancelled) return;
                }
            }
            finally { Dispose(); }
        });
        thread.Start();
        return thread;
    }
}

class MyStateMachine : StateMachine
{
    public override IEnumerable<object> Main()
    {
        DoSomething();
        yield return null;
        DoSomethingElse();
        yield return null;
    }
}
// then call new MyStateMachine().Run() to run.
Overengineering? It depends on how many state machines you use. If you just have 1, yes. If you have 100, then maybe not. Too tricky? Well, it depends. Another bonus of this approach is that it lets you (with minor modifications) move your operation into a Timer.Tick callback and avoid threading altogether if it makes sense.
and do everything that blucz says too.
Perhaps part of the problem is that you have such a long method / while loop. Whether or not you have threading issues, you should break it down into smaller processing steps. Let's suppose those steps are Alpha(), Bravo(), Charlie() and Delta().
You could then do something like this:
public void MyBigBackgroundTask()
{
    Action[] tasks = new Action[] { Alpha, Bravo, Charlie, Delta };
    int taskIndex = 0;
    while (!_shouldStop)
    {
        tasks[taskIndex++]();
        taskIndex %= tasks.Length;
    }
}
So yes it loops endlessly, but checks if it is time to stop between each business step.
You don't have to sprinkle while loops everywhere. The outer while loop just checks if it's been told to stop and if so doesn't make another iteration...
If you have a straight "go do something and close out" thread (no loops in it) then you just check the _shouldStop boolean either before or after each major spot inside the thread. That way you know whether it should continue on or bail out.
for example:
public void DoWork() {
RunSomeBigMethod();
if (_shouldStop){ return; }
RunSomeOtherBigMethod();
if (_shouldStop){ return; }
//....
}
Instead of adding a while loop where a loop doesn't otherwise belong, add something like if (_shouldStop) CleanupAndExit(); wherever it makes sense to do so. There's no need to check after every single operation or sprinkle the code all over with them. Instead, think of each check as a chance to exit the thread at that point and add them strategically with this in mind.
All these SO responses assume the worker thread will loop. That doesn't sit comfortably with me
There are not a lot of ways to make code take a long time. Looping is a pretty essential programming construct. Making code take a long time without looping takes a huge number of statements. Hundreds of thousands.
Or calling some other code that is doing the looping for you. Yes, hard to make that code stop on demand. That just doesn't work.