Lock-free Reference Counting - C#

I'm working on a system that requires extensive C API interop. Part of the interop requires initialization and shutdown of the system in question before and after any operations. Failure to do either will result in instability in the system. I've accomplished this by simply implementing reference counting in a core disposable environment class like this:
public FooEnvironment()
{
    lock (EnvironmentLock)
    {
        if (_initCount == 0)
        {
            Init(); // global startup
        }
        _initCount++;
    }
}

private void Dispose(bool disposing)
{
    if (_disposed)
        return;

    if (disposing)
    {
        lock (EnvironmentLock)
        {
            _initCount--;
            if (_initCount == 0)
            {
                Term(); // global termination
            }
        }
    }
}
This works fine and accomplishes the goal. However, since any interop operation must be nested in a FooEnvironment using block, we are locking all the time, and profiling suggests that this locking accounts for close to 50% of the work done at run-time. It seems to me that this is a fundamental enough concept that something in .NET or the CLR must address it. Is there a better way to do reference counting?

This is a trickier task than you might expect at first blush. I don't believe that Interlocked.Increment will be sufficient for your task. Rather, I expect you'll need to perform some wizardry with CAS (Compare-And-Swap).
Note also that it's very easy to get this mostly-right, but mostly-right is still completely wrong when your program crashes with heisenbugs.
I strongly suggest some genuine research before going down this path. A couple of good jumping-off points pop to the top if you search for "Lock free reference counting." This Dr. Dobbs article is useful, and this SO question might be relevant.
Above all, remember that lock-free programming is hard. If this is not your specialty, consider stepping back and adjusting your expectations around the granularity of your reference counts. It may be much, much less expensive to rethink your fundamental refcount policy than to create a reliable lock-free mechanism if you're not an expert, especially when you don't yet know that a lock-free technique will actually be any faster.

As harold's comment notes, the answer is Interlocked:
public FooEnvironment() {
    if (Interlocked.Increment(ref _initCount) == 1) {
        Init(); // global startup
    }
}

private void Dispose(bool disposing) {
    if (_disposed)
        return;
    if (disposing) {
        if (0 == Interlocked.Decrement(ref _initCount)) {
            Term(); // global termination
        }
    }
}
Both Increment and Decrement return the new count (precisely to support this kind of usage), hence the different checks.
But note: this will not work if anything else needs concurrency protection. Interlocked operations are themselves safe, but nothing else is (including the relative ordering of Interlocked calls across threads). In the above code, Init() can still be running after another thread has completed the constructor.

Consider keeping the count in a static variable in the class: a static field is a single, process-wide value and is not specific to any particular object.

I believe this will give you a safe way using Interlocked.Increment/Decrement.
Note: this is oversimplified; the code below can deadlock if Init() throws an exception. There is also a race condition in Dispose when the count drops to zero, the init event is reset, and the constructor is called again. I don't know your program flow, so you may be better off using a cheaper lock such as a SpinLock instead of the Interlocked.Increment approach if there is a chance of initializing again after the last Dispose.
static ManualResetEvent _inited = new ManualResetEvent(false);

public FooEnvironment()
{
    if (Interlocked.Increment(ref _initCount) == 1)
    {
        Init(); // global startup
        _inited.Set();
    }
    _inited.WaitOne();
}

private void Dispose(bool disposing)
{
    if (_disposed)
        return;

    if (disposing)
    {
        if (Interlocked.Decrement(ref _initCount) == 0)
        {
            _inited.Reset();
            Term(); // global termination
        }
    }
}
Edit:
In thinking about this further, you may want to consider some application redesign. Instead of having this class manage Init and Term, make a single call to Init at application startup and a call to Term when the application shuts down; that removes the need for locking altogether. And if the lock is showing up as 50% of your execution time, it seems like you are always going to want to call Init anyway, so just call it once and away you go.
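For instance, if the application has a single well-defined entry point, the whole thing can collapse to something like this (a sketch; RunApplication is a hypothetical stand-in for whatever the program actually does):
static void Main(string[] args)
{
    Init();                   // global startup, exactly once
    try
    {
        RunApplication(args); // hypothetical: the rest of the program
    }
    finally
    {
        Term();               // global termination, exactly once, even on exceptions
    }
}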

You can make it nearly lock-free by using the following code. It will definitely lower contention, and if contention is your main problem it may be the solution you need.
I would also suggest calling Dispose from the destructor/finalizer (just in case). I have changed your Dispose method: unmanaged resources should be freed regardless of the disposing argument. Check this for details on how to properly dispose of an object.
Hope this helps you.
public class FooEnvironment
{
    private static int _initCount;
    private static bool _initialized;
    private static object _environmentLock = new object();

    private bool _disposed;

    public FooEnvironment()
    {
        Interlocked.Increment(ref _initCount);

        if (_initCount > 0 && !_initialized)
        {
            lock (_environmentLock)
            {
                if (_initCount > 0 && !_initialized)
                {
                    Init(); // global startup
                    _initialized = true;
                }
            }
        }
    }

    private void Dispose(bool disposing)
    {
        if (_disposed)
            return;

        if (disposing)
        {
            // Dispose managed resources here
        }

        Interlocked.Decrement(ref _initCount);

        if (_initCount <= 0 && _initialized)
        {
            lock (_environmentLock)
            {
                if (_initCount <= 0 && _initialized)
                {
                    Term(); // global termination
                    _initialized = false;
                }
            }
        }

        _disposed = true;
    }

    ~FooEnvironment()
    {
        Dispose(false);
    }
}

Using Threading.Interlocked.Increment will be a little faster than acquiring a lock, doing an increment, and releasing the lock, but not enormously so. The expensive part of either operation on a multi-core system is enforcing the synchronization of memory caches between cores. The primary advantage of Interlocked.Increment is not speed, but rather the fact that it will complete in a bounded amount of time. By contrast, if one seeks to acquire a lock, perform an increment, and release the lock, even if the lock is used for no purpose other than guarding the counter, there is a risk that one might have to wait forever if some other thread acquires the lock and then gets waylaid.
You don't mention which version of .NET you're using, but there are some Concurrent classes that might be of use. Depending upon your patterns of allocating and freeing things, a class that might seem a little tricky but could work well is the ConcurrentBag class. It's somewhat like a queue or stack, except that there's no guarantee that things will come out in any particular order.
The idea: include in your resource wrapper a flag indicating whether it's still good, and give the resource itself a reference to a wrapper. When a resource user is created, throw a wrapper object into the bag. When the resource user is no longer needed, set the "invalid" flag. The resource should remain alive as long as either there's at least one wrapper object in the bag whose "valid" flag is set, or the resource itself holds a reference to a valid wrapper.
If, when an item is deleted, the resource doesn't seem to hold a valid wrapper, acquire a lock and, if the resource still doesn't hold a valid wrapper, pull wrappers out of the bag until a valid one is found, and then store that one with the resource (or, if none was found, destroy the resource). If, when an item is deleted, the resource holds a valid wrapper but the bag seems like it might hold an excessive number of invalid items, acquire the lock, copy the bag's contents to an array, and throw the valid items back into the bag. Keep a count of how many items are thrown back, so one can judge when to do the next purge.
This approach may seem more complicated than using locks or Threading.Interlocked.Increment, and there are a lot of corner cases to worry about, but it may offer better performance because ConcurrentBag is designed to reduce resource contention. If processor 1 performs Interlocked.Increment on some location, and then processor 2 does so, processor 2 will have to instruct processor 1 to flush that location from its cache, wait until processor 1 has done so, inform all the other processors that it needs control of that location, load that location into its cache, and finally get around to incrementing it. After all that has happened, if processor 1 needs to increment the location again, the same general sequence of steps will be required. All of this is very slow. The ConcurrentBag class, by contrast, is designed so that multiple processors can add things to a list without cache collisions. Sometime between when things are added and when they're removed, they'll have to be copied to a coherent data structure, but such operations can be performed in batches in such a way as to yield good cache performance.
I haven't tried an approach like the above using ConcurrentBag, so I don't know what sort of performance it would actually yield, but depending upon the usage patterns it may be possible to give better performance than would be obtained via reference counting.

The Interlocked class approach works a little faster than the lock statement, but on a multi-core machine the speed advantage may not be very much, because Interlocked instructions must still bypass the memory cache layers.
How important is it to call the Term() function when the code is not in use and/or when the program exits?
Frequently, you can just put the call to Init() once in a static constructor for the class that wraps the other APIs, and not really worry about calling Term(). E.g.:
static FooEnvironment() {
    Init(); // global startup
}
The CLR will ensure that the static constructor will get called once, before any other member functions in the enclosing class.
It’s also possible to hook notification of some (but not all) application shutdown scenarios, making it possible to call Term() on clean shutdowns. See this article: http://www.codeproject.com/Articles/16164/Managed-Application-Shutdown
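For instance, a minimal sketch that combines the static constructor with a ProcessExit hook (whether ProcessExit actually fires depends on how the application shuts down, as the linked article discusses):
static FooEnvironment() {
    Init(); // global startup, runs once before any member of the class is used

    // Best-effort cleanup on clean shutdowns; not guaranteed for hard process kills.
    AppDomain.CurrentDomain.ProcessExit += (sender, e) => Term();
}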

Related

Should thread-safe class have a memory barrier at the end of its constructor?

When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have completed being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads?
Simplified question:
Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier between the initialization and the access of the thread-safe class? Or should the thread-safe class itself protect against this?
ConcurrentQueue<int> queue = null;

Parallel.Invoke(
    () => queue = new ConcurrentQueue<int>(),
    () => queue?.Enqueue(5));
Note that it is acceptable for the program to enqueue nothing, as would happen if the second delegate executes before the first. (The null-conditional operator ?. protects against a NullReferenceException here.) However, it should not be acceptable for the program to throw an IndexOutOfRangeException, NullReferenceException, enqueue 5 multiple times, get stuck in an infinite loop, or do any of the other weird things caused by race hazards on internal structures.
Elaborated question:
Concretely, imagine that I were implementing a simple thread-safe wrapper for a queue. (I'm aware that .NET already provides ConcurrentQueue<T>; this is just an example.) I could write:
public class ThreadSafeQueue<T>
{
    private readonly Queue<T> _queue;

    public ThreadSafeQueue()
    {
        _queue = new Queue<T>();

        // Thread.MemoryBarrier(); // Is this line required?
    }

    public void Enqueue(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
        }
    }

    public bool TryDequeue(out T item)
    {
        lock (_queue)
        {
            if (_queue.Count == 0)
            {
                item = default(T);
                return false;
            }

            item = _queue.Dequeue();
            return true;
        }
    }
}
This implementation is thread-safe, once initialized. However, if the initialization itself is raced by another consumer thread, then race hazards could arise, whereby the latter thread would access the instance before the internal Queue<T> has been initialized. As a contrived example:
ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        queue = new ThreadSafeQueue<int>();
    else if (i % 2 == 0)
        queue?.Enqueue(i);
    else
    {
        int item = -1;
        if (queue?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});
It is acceptable for the code above to miss some numbers; however, without the memory barrier, it could also be getting a NullReferenceException (or some other weird result) due to the internal Queue<T> not having been initialized by the time that Enqueue or TryDequeue are called.
Is it the responsibility of the thread-safe class to include a memory barrier at the end of its constructor, or is it the consumer who should include a memory barrier between the class's instantiation and its visibility to other threads? What is the convention in the .NET Framework for classes marked as thread-safe?
Edit: This is an advanced threading topic, so I understand the confusion in some of the comments. An instance can appear as half-baked if accessed from other threads without proper synchronization. This topic is discussed extensively within the context of double-checked locking, which is broken under the ECMA CLI specification without the use of memory barriers (such as through volatile). Per Jon Skeet:
The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#).
Without any memory barriers, it's broken in the ECMA CLI specification too. It's possible that under the .NET 2.0 memory model (which is stronger than the ECMA spec) it's safe, but I'd rather not rely on those stronger semantics, especially if there's any doubt as to the safety.
Lazy<T> is a very good choice for Thread-Safe Initialization. I think it should be left to the consumer to provide that:
var queue = new Lazy<ThreadSafeQueue<int>>(() => new ThreadSafeQueue<int>());

Parallel.For(0, 10000, i =>
{
    if (i % 2 == 0)
        queue.Value.Enqueue(i);
    else
    {
        int item = -1;
        if (queue.Value.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});
Should thread-safe class have a memory barrier at the end of its constructor?
I do not see a reason for this. The queue is a local variable that is assigned from one thread and accessed from another. Such concurrent access should be synchronized, and it is the responsibility of the accessing code to do so. It has nothing to do with the constructor or the type of the variable; such access should always be explicitly synchronized, or you are entering a dangerous area even for primitive types (even if the assignment is atomic, you may get caught in some cache trap). If the access to the variable is properly synchronized, it does not need any support in the constructor.
I'll attempt to answer this interesting and well-presented question, based on the comments by Servy and Douglas, and on information coming from other related questions. What follows is just my assumptions, and not solid information from a reputable source.
Thread-safe classes have properties and methods that can be safely invoked by multiple threads concurrently, but their constructors are not thread-safe. This means that it is entirely possible for a thread to "see" an instance of a thread-safe class having an invalid state, provided that the instance is constructed concurrently by another thread.
Adding the line Thread.MemoryBarrier(); at the end of the constructor is not enough to make the constructor thread-safe, because this statement only affects the thread that runs the constructor¹. The other threads that may access concurrently the under-construction instance are not affected. Memory-visibility is cooperative, and one thread cannot change what another thread "sees" by altering the other thread's execution flow (or invalidating the local cache of the CPU-core that the other thread is running on) in a non-cooperative manner.
The correct and robust way to ensure that all threads are seeing the instance having a valid state, is to include proper memory barriers in all threads. This can be achieved by either declaring the instance as volatile, in case it is a field of a class, or otherwise using the methods of the static Volatile class:
ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        Volatile.Write(ref queue, new ThreadSafeQueue<int>());
    else if (i % 2 == 0)
        Volatile.Read(ref queue)?.Enqueue(i);
    else
    {
        int item = -1;
        if (Volatile.Read(ref queue)?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});
In this particular example it would be simpler and more efficient to instantiate the queue variable before invoking the Parallel.For method. Doing so would render unnecessary the explicit Volatile invocations. The Parallel.For method is using Tasks internally, and TPL includes the appropriate memory barriers at the beginning/end of each task. Memory barriers are generated implicitly and automatically by the .NET infrastructure, by any built-in mechanism that starts a thread or causes a delegate to execute on another thread. (citation)
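For completeness, a sketch of that simpler variant; the i == 0 branch disappears because the queue already exists before the loop starts:
// The queue is created before Parallel.For starts, so the TPL's own
// synchronization publishes it to every worker task.
ThreadSafeQueue<int> queue = new ThreadSafeQueue<int>();

Parallel.For(0, 10000, i =>
{
    if (i % 2 == 0)
        queue.Enqueue(i);
    else
    {
        int item;
        if (queue.TryDequeue(out item))
            Console.WriteLine(item);
    }
});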
I'll repeat that I'm not 100% confident about the correctness of the information presented above.
¹ Quoting from the documentation of the Thread.MemoryBarrier method: Synchronizes memory access as follows: The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier() execute after memory accesses that follow the call to MemoryBarrier().
No, you don't need a memory barrier in the constructor. Your assumption, even though it demonstrates some creative thought, is wrong. No thread can get a half-baked instance of queue. The new reference is "visible" to the other threads only when the initialization is done. Suppose thread_1 is the first thread to initialize queue: it goes through the ctor code, but the queue reference is still null! Only when thread_1 exits the constructor code does it assign the reference.
See comments below and OP elaborated question.

Lock-free, awaitable, exclusive access methods

I have a thread safe class which uses a particular resource that needs to be accessed exclusively. In my assessment it does not make sense to have the callers of various methods block on a Monitor.Enter or await a SemaphoreSlim in order to access this resource.
For instance I have some "expensive" asynchronous initialization. Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception). Instead one should create, init and then distribute the instance to multiple threads.
UPDATE 1:
MyClass uses two NamedPipes in either direction. The InitBeforeDistribute method is not really initialization, but rather properly setting up a connection in both directions. It does not make sense to make the pipe available to N threads before you have set up the connection. Once it is setup multiple threads can post work, but only one can actually read/write to the stream. My apologies for obfuscating this with poor naming of the examples.
UPDATE 2:
If InitBeforeDistribute implemented a SemaphoreSlim(1, 1) with proper await logic (instead of the interlocked operation throwing an exception), is the Add/Do Square method OK practice? It does not throw a redundant exception (such as in InitBeforeDistribute), while being lock-free?
The following would be a good bad example:
class MyClass
{
    private int m_isIniting = 0;               // exclusive access "lock"
    private volatile bool vm_isInited = false; // volatile because other methods will read it

    public async Task InitBeforeDistribute()
    {
        if (Interlocked.Exchange(ref this.m_isIniting, -1) != 0)
            throw new InvalidOperationException(
                "Cannot init concurrently! Did you distribute before init was finished?");

        try
        {
            if (this.vm_isInited)
                return;

            await Task.Delay(5000)      // init asynchronously
                .ConfigureAwait(false);

            this.vm_isInited = true;
        }
        finally
        {
            Interlocked.Exchange(ref this.m_isIniting, 0);
        }
    }
}
Some points:
1. If there is a case where blocking/awaiting access to a lock makes perfect sense, then this example does not (make sense, that is).
2. Since I need to await in the method, I must use something like a SemaphoreSlim if I were to use a "proper" lock. Forgoing the semaphore for the example above allows me to not worry about disposing the class once I'm done with it. (I always disliked the idea of disposing an item used by multiple threads. This is a minor positive, for sure.)
3. If the method is called often there might be some performance benefits, which of course should be measured.
The above example does not make sense in ref. to (3.) so here is another example:
class MyClass
{
    private volatile bool vm_isInited = false; // see above example
    private int m_isWorking = 0;               // exclusive access "lock"

    private readonly ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>> m_squareWork =
        new ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>>();

    public Task<int> AddSquare(int number)
    {
        if (!this.vm_isInited) // see above example
            throw new InvalidOperationException(
                "You forgot to init! Did you already distribute?");

        var work = new Tuple<int, TaskCompletionSource<int>>(number, new TaskCompletionSource<int>());

        this.m_squareWork.Enqueue(work);

        Task doSquareTask = DoSquare(); // deliberately not awaited (fire-and-forget)

        return work.Item2.Task;
    }

    private async Task DoSquare()
    {
        if (Interlocked.Exchange(ref this.m_isWorking, -1) != 0)
            return; // let someone else do the work for you

        do
        {
            try
            {
                Tuple<int, TaskCompletionSource<int>> work;

                while (this.m_squareWork.TryDequeue(out work))
                {
                    await Task.Delay(5000)       // Limiting resource that can only be
                        .ConfigureAwait(false);  // used by one thread at a time.

                    work.Item2.TrySetResult(work.Item1 * work.Item1);
                }
            }
            finally
            {
                Interlocked.Exchange(ref this.m_isWorking, 0);
            }
        } while (this.m_squareWork.Count != 0 &&
                 Interlocked.Exchange(ref this.m_isWorking, -1) == 0);
    }
}
Are there specific negative aspects of this "lock-free" example that I should pay attention to?
Most questions relating to "lock-free" code on SO generally advise against it, stating that it is for the "experts". Rarely (I could be wrong on this one) do I see suggestions for books/blogs/etc. that one can delve into, should one be so inclined. If there are any such resources I should look into, please share. Any suggestions will be highly appreciated!
Update: a great related article: Creating High-Performance Locks and Lock-free Code (for .NET)
The main point about lock-free algorithms is not that they are for experts.
The main point is: do you really need a lock-free algorithm here? I can't follow your logic here:
Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception).
Why can't your users simply wait for the result of initialization, and use your resource after that? If you can, simply use the Lazy<T> class or even Asynchronous Lazy Initialization.
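For example, a minimal sketch of the asynchronous-lazy idea using Lazy<Task> (the Task.Delay stands in for the real async setup, as in the question's code):
class MyClass
{
    // Every caller awaits the same Task; the init lambda runs at most once.
    private readonly Lazy<Task> _init = new Lazy<Task>(async () =>
    {
        await Task.Delay(5000).ConfigureAwait(false); // stand-in for the real async init
    });

    public Task InitBeforeDistribute()
    {
        return _init.Value; // first caller triggers init; later callers just get the same Task
    }
}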
You really should read about consensus numbers and CAS operations, and why they matter, when implementing your own synchronization primitive.
In your code you are using the Interlocked.Exchange method, which isn't really a CAS, as it always exchanges the value, and it has a consensus number equal to 2. This means that a primitive built on such a construction is guaranteed to work correctly only for 2 threads (not necessarily in your situation, but still 2).
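For comparison, a true CAS guard over the same m_isIniting field from the question could look like this (a sketch):
// Only the thread that atomically flips m_isIniting from 0 to -1 proceeds;
// every other thread sees a non-zero old value and bails out.
if (Interlocked.CompareExchange(ref this.m_isIniting, -1, 0) != 0)
    throw new InvalidOperationException(
        "Cannot init concurrently! Did you distribute before init was finished?");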
I tried to determine whether your code works correctly for 3 threads, or whether there are circumstances that can lead your application into a damaged state, but after 30 minutes I stopped. Any member of your team will stop like me after spending some time trying to understand your code. This is a waste of time, not only yours, but your team's. Don't reinvent the wheel unless you really have to.
My favorite book in the related area is Writing High-Performance .NET Code by Ben Watson, and my favorite blog is Stephen Cleary's. If you can be more specific about what kind of book you are interested in, I can add some more references.
Having no locks in the program doesn't make your application lock-free. In a .NET application you really should not use exceptions for your internal program flow. Consider what happens if the initializing thread isn't scheduled for a while by the OS (for various reasons, no matter what they are exactly).
In that case all the other threads in your app will die one by one while trying to access your shared resource. I can't say that this is lock-free code. Yes, there are no locks in it, but it doesn't guarantee the correctness of the program, and thus it isn't lock-free by definition.
The Art of Multiprocessor Programming by Maurice Herlihy and Nir Shavit is a great resource for lock-free and wait-free programming. Lock-freedom is a progress guarantee rather than a mode of programming, so to argue that an algorithm is lock-free, one has to validate or show proofs of the progress guarantee. Lock-free, in simple terms, implies that blocking or halting of one thread does not block the progress of other threads, or that if a thread is blocked infinitely often, then there is another thread that makes progress infinitely often.

Can the same object lock be checked at the same time by concurrent threads?

How come if I have a statement like this:
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (this) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (this) { sharedValue--; }
}
So for a thread to get into a lock it must first check whether another thread is operating on it. If no other thread is, it can enter, and then it has to write something to memory; this surely cannot be atomic, as it needs to both read and write.
So how come it's impossible for one thread to be reading the lock while the other is writing its ownership to it?
To simplify: why can't two threads both get into a lock at the same time?
It looks like you are basically asking how the lock works. How can the lock maintain internal state in an atomic manner without already having the lock built? It seems like a chicken and egg problem at first does it not?
The magic all happens because of a compare-and-swap (CAS) operation. The CAS operation is a hardware level instruction that does 2 important things.
It generates a memory barrier so that instruction reordering is constrained.
It compares the contents of a memory address with another value and if they are equal then the original value is replaced with a new value. It does all of this in an atomic manner.
At the most fundamental level this is how the trick is accomplished. It is not that all other threads are blocked from reading while another is writing. That is totally the wrong way to think about it. What actually happens is that all threads are acting as writers simultaneously. The strategy is more optimistic than it is pessimistic. Every thread is trying to acquire the lock by performing this special kind of write called a CAS. You actually have access to a CAS operation in .NET via the Interlocked.CompareExchange (ICX) method. Every synchronization primitive can be built from this single operation.
If I were going to write a Monitor-like class (that is what the lock keyword uses behind the scenes) from scratch entirely in C# I could do it using the Interlocked.CompareExchange method. Here is an overly simplified implementation. Please keep in mind that this is most certainly not how the .NET Framework does it.1 The reason I present the code below is to show you how it could be done in pure C# code without the need for CLR magic behind the scenes and because it might get you thinking about how Microsoft could have implemented it.
public class SimpleMonitor
{
    private int m_LockState = 0;

    public void Enter()
    {
        int iterations = 0;
        while (!TryEnter())
        {
            if (iterations < 10) Thread.SpinWait(4 << iterations);
            else if (iterations % 20 == 0) Thread.Sleep(1);
            else if (iterations % 5 == 0) Thread.Sleep(0);
            else Thread.Yield();
            iterations++;
        }
    }

    public void Exit()
    {
        if (!TryExit())
        {
            throw new SynchronizationLockException();
        }
    }

    public bool TryEnter()
    {
        return Interlocked.CompareExchange(ref m_LockState, 1, 0) == 0;
    }

    public bool TryExit()
    {
        return Interlocked.CompareExchange(ref m_LockState, 0, 1) == 1;
    }
}
This implementation demonstrates a couple of important things.
It shows how the ICX operation is used to atomically read and write the lock state.
It shows how the waiting might occur.
Notice how I used Thread.SpinWait, Thread.Sleep(0), Thread.Sleep(1) and Thread.Yield while the lock is waiting to be acquired. The waiting strategy is overly simplified, but it does approximate a real life algorithm implemented in the BCL already. I intentionally kept the code simple in the Enter method above to make it easier to spot the crucial bits. This is not how I would have normally implemented this, but I am hoping it does drive home the salient points.
Also note that my SimpleMonitor above has a lot of problems. Here are just a few.
It does not handle nested locking.
It does not provide Wait or Pulse methods like the real Monitor class. They are really hard to do right.
1The CLR will actually use a special block of memory that exists on each reference type. This block of memory is referred to as the "sync block". The Monitor will manipulate bits in this block of memory to acquire and release the lock. This action may require a kernel event object. You can read more about it on Joe Duffy's blog.
lock in C# is implemented with the Monitor class, which actually does the locking.
You can read more about Monitor in here: http://msdn.microsoft.com/en-us/library/system.threading.monitor.aspx. The Enter method of the Monitor ensures that only one thread can enter the critical section at the time:
Acquires a lock for an object. This action also marks the beginning of a critical section. No other thread can enter the critical section unless it is executing the instructions in the critical section using a different locked object.
BTW, you should avoid locking on this (lock(this)). You should use a private variable on a class (static or non-static) to protect the critical section. You can read more in the same link provided above but the reason is:
When selecting an object on which to synchronize, you should lock only on private or internal objects. Locking on external objects might result in deadlocks, because unrelated code could choose the same objects to lock on for different purposes.
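For example, a minimal sketch of the question's code using a private lock object instead of this:
private readonly object _syncLock = new object();
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (_syncLock) { sharedValue++; } // external code cannot lock on _syncLock
}

public void SomeMethodTwo()
{
    lock (_syncLock) { sharedValue--; }
}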

Creating multiple syncLock variables for an instance

I have two internal properties that use lazy-loading of backing fields, and are used in a multi-threaded application, so I have implemented a double-checking lock scheme as per this MSDN article
Now, firstly assuming that this is an appropriate pattern, all the examples show creating a single lock object for an instance. If my two properties are independent of each other, would it not be more efficient to create a lock instance for each property?
It occurs to me that maybe there is only one in order to avoid deadlocks or race conditions. An obvious situation doesn't come to mind, but I'm sure someone can show me one... (I'm not very experienced with multi-threaded code, obviously.)
private List<SomeObject1> _someProperty1;
private List<SomeObject2> _someProperty2;

private readonly object _syncLockSomeProperty1 = new object();
private readonly object _syncLockSomeProperty2 = new object();

internal List<SomeObject1> SomeProperty1
{
    get
    {
        if (_someProperty1 == null)
        {
            lock (_syncLockSomeProperty1)
            {
                if (_someProperty1 == null)
                {
                    _someProperty1 = new List<SomeObject1>();
                }
            }
        }
        return _someProperty1;
    }
    set
    {
        _someProperty1 = value;
    }
}

internal List<SomeObject2> SomeProperty2
{
    get
    {
        if (_someProperty2 == null)
        {
            lock (_syncLockSomeProperty2)
            {
                if (_someProperty2 == null)
                {
                    _someProperty2 = new List<SomeObject2>();
                }
            }
        }
        return _someProperty2;
    }
    set
    {
        _someProperty2 = value;
    }
}
If your properties are truly independent, then there's no harm in using independent locks for each of them.
In case the two properties (or their initializers more specifically) are independent of each other, as in the sample code you provided, it makes sense to have two different lock objects. However, when the initialization occurs rarely, the effect will be negligible.
Note that you should protect the setter's code as well. The lock statement imposes a so-called memory barrier, which is indispensable, especially on multi-CPU and/or multi-core systems, to prevent race conditions.
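For example, a sketch of a protected setter for the first property from the question:
set
{
    lock (_syncLockSomeProperty1)
    {
        _someProperty1 = value; // the setter now uses the same lock as the lazy-initializing getter
    }
}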
Yes, if they are independent of each other, this would indeed be more efficient, as access to one won't block access to the other. You're also on the money about the risk of a deadlock if that independence turned out to be false.
Presuming that _someProperty1 = new List<SomeObject1>(); isn't the real code for assigning to _someProperty1 (hardly worth the lazy-load, is it?), the question is: can the code that fills SomeProperty1 ever call the code that fills SomeProperty2, or vice-versa, through any code path, no matter how bizarre?
If only one can call the other, there can't be a deadlock, but if they can both call each other (or 1 calls 2, 2 calls 3 and 3 calls 1, and so on), then a deadlock can definitely happen.
As a rule, I'd start with broad locks (one lock for all locked tasks) and then make the locks narrower as an optimisation as needed. In cases where you have, say, 20 methods which need locking, then judging the safety can be harder (also, you begin to fill memory just with lock objects).
Note that there are also two issues with your code:
First, you don't lock in your setter. Possibly this is fine (you just want your lock to prevent multiple heavy calls to the loading method, and don't actually care if there are over-writes between the set and the get), possibly this is a disaster.
Second, depending on the CPU running it, double-checked locking as you have written it can have issues with read/write reordering, so you should either have a volatile field, or call a memory barrier. See http://blogs.msdn.com/b/brada/archive/2004/05/12/130935.aspx
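The smallest fix for that second issue is to make the backing fields volatile (a sketch based on the question's code):
// volatile gives the reference reads/writes the acquire/release semantics that make
// the double-checked pattern safe with respect to reordering.
private volatile List<SomeObject1> _someProperty1;
private volatile List<SomeObject2> _someProperty2;
Alternatively, LazyInitializer.EnsureInitialized can replace the hand-rolled double-checked pattern entirely, fences included.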
Edit:
It's also worth considering whether it's really needed at all.
Consider what the operation itself consists of:
1. A bunch of stuff is done.
2. An object is created based on that bunch of stuff.
3. That object is assigned to the variable.
1 and 2 will only happen on one thread, and 3 is atomic. Therefore, the advantage of locking is:
1. If steps 1 and/or 2 above have their own threading issues, and aren't protected from them by their own locks, then locking is 100% necessary.
2. If it would be disastrous for something to have acted upon a value obtained in steps 1 and 2, and then later do so again with steps 1 and 2 repeated, locking is 100% necessary.
3. Locking will prevent the waste of 1 and 2 being done multiple times.
So, if we can rule out cases 1 and 2 as an issue (it takes a bit of analysis, but it's often possible), then we only have the waste in case 3 to worry about. Now, maybe this is a big worry. However, if it would rarely come up, and also not be that much of a waste when it did, then the gains of not locking would outweigh the gains of locking.
If in doubt, locking is probably the safer approach, but it's possible that just living with the occasional wasted operation is better.

locking only when modifying vs entire method

When should locks be used? Only when modifying data or when accessing it as well?
public class Test {
    static Dictionary<string, object> someList = new Dictionary<string, object>();
    static object syncLock = new object();

    public static object GetValue(string name) {
        if (someList.ContainsKey(name)) {
            return someList[name];
        } else {
            lock (syncLock) {
                object someValue = GetValueFromSomeWhere(name);
                someList.Add(name, someValue);
                return someValue; // added so every code path returns a value
            }
        }
    }
}
Should there be a lock around the entire block, or is it OK to just add it around the actual modification? My understanding is that there could still be a race condition where one call might not have found it and started to add it, while another call right after might have run into the same situation - but I'm not sure. Locking is still so confusing. I haven't run into any issues with code similar to the above, but I could just be lucky so far. Any help would be appreciated, as well as any good resources for how/when to lock objects.
You have to lock when reading too, or you can get unreliable data, or even an exception if a concurrent modification physically changes the target data structure.
In the case above, you need to make sure that multiple threads don't try to add the value at the same time, so you need at least a read lock while checking whether it is already present. Otherwise multiple threads could decide to add, find the value is not present (since this check is not locked), and then all try to add in turn (after getting the lock).
You could use a ReaderWriterLockSlim if you have many reads and only a few writes. In the code above you would acquire the read lock to do the check and upgrade to a write lock once you decide you need to add it. In most cases, only a read lock (which allows your reader threads to still run in parallel) would be needed.
There is a summary of the available .Net 4 locking primitives here. Definitely you should understand this before you get too deep into multithreaded code. Picking the correct locking mechanism can make a huge performance difference.
You are correct that you have been lucky so far - that's a frequent feature of concurrency bugs. They are often hard to reproduce without targeted load testing, meaning correct design (and exhaustive testing, of course) is vital to avoid embarrassing and confusing production bugs.
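A minimal sketch of that upgradeable-read approach, applied to the dictionary from the question (someList and GetValueFromSomeWhere are as in the question):
static readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();

public static object GetValue(string name)
{
    cacheLock.EnterUpgradeableReadLock();   // plain readers elsewhere can still run in parallel
    try
    {
        object value;
        if (someList.TryGetValue(name, out value))
            return value;

        cacheLock.EnterWriteLock();         // upgrade only when we actually need to add
        try
        {
            value = GetValueFromSomeWhere(name);
            someList.Add(name, value);
            return value;
        }
        finally
        {
            cacheLock.ExitWriteLock();
        }
    }
    finally
    {
        cacheLock.ExitUpgradeableReadLock();
    }
}
Only one thread can hold the upgradeable lock at a time, so no second check is needed inside the write lock.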
Lock the whole block before you check for the existence of name. Otherwise, in theory, another thread could add it between the check, and your code that adds it.
Actually locking just when you perform the Add really doesn't do anything at all. All that would do is prevent another thread from adding something simultaneously. But since that other thread would have already decided it was going to do the add, it would just try to do it anyway as soon as the lock was released.
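For the question's code, locking the whole block looks like this (a sketch):
public static object GetValue(string name)
{
    lock (syncLock)      // the check and the add happen under the same lock
    {
        object someValue;
        if (!someList.TryGetValue(name, out someValue))
        {
            someValue = GetValueFromSomeWhere(name);
            someList.Add(name, someValue);
        }
        return someValue;
    }
}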
If a resource is only ever read by multiple threads and never modified, you do not need any locks.
If a resource can be accessed by multiple threads and can be modified, then all accesses/modifications need to be synchronized. In your example, if GetValueFromSomeWhere takes a long time to return, it is possible for a second call to be made with the same value in name, but the value has not been stored in the Dictionary.
Use ReaderWriterLock, or the slim version (ReaderWriterLockSlim) if your framework version has it.
You acquire the reader lock for reads (which allows concurrent reads) and upgrade to the writer lock when something needs to be written (which allows only one writer at a time and blocks all reads, as well as the other writer threads, until it is done).
Make sure to release your locks with a try/finally pattern to avoid deadlocking:
void Write(object[] args)
{
    this.ReaderWriterLock.AcquireWriterLock(Timeout.Infinite);
    try
    {
        this.myData.Write(args);
    }
    catch (Exception ex)
    {
        // handle or log the exception here rather than silently swallowing it
    }
    finally
    {
        this.ReaderWriterLock.ReleaseWriterLock();
    }
}
