I was reading Albahari's excellent eBook on threading and came across the following scenario. He mentions that "a thread can repeatedly lock the same object in a nested (reentrant) fashion":
lock (locker)
lock (locker)
lock (locker)
{
// Do something...
}
as well as
static readonly object _locker = new object();
static void Main()
{
lock (_locker)
{
AnotherMethod();
// We still have the lock - because locks are reentrant.
}
}
static void AnotherMethod()
{
lock (_locker) { Console.WriteLine ("Another method"); }
}
From the explanation, any other thread will block on the first (outermost) lock, and the object is unlocked only after the outermost lock statement has exited.
He states "nested locking is useful when one method calls another within a lock"
Why is this useful? When would you NEED to do this and what problem does it solve?
Let's say you have two public methods, A() and B(), which both need the same lock.
Furthermore, let's say that A() calls B().
Since the client can also call B() directly, you need to lock in both methods.
Therefore, when A() is called, B() will take the lock a second time.
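A minimal sketch of the A()/B() situation described above. The Account class and its member names are hypothetical and only there for illustration: Deposit plays the role of B(), DepositWithBonus plays the role of A().

```csharp
using System;

public class Account
{
    private readonly object _locker = new object();
    private decimal _balance;

    // "B()": clients can call this directly, so it has to take the lock itself.
    public void Deposit(decimal amount)
    {
        lock (_locker)
        {
            _balance += amount;
        }
    }

    // "A()": needs the lock for its own work and also calls Deposit().
    public void DepositWithBonus(decimal amount)
    {
        lock (_locker)
        {
            decimal bonus = amount / 10;
            Deposit(amount); // re-enters _locker on the same thread, no deadlock
            Deposit(bonus);  // both deposits happen under one outer lock
        }
    }

    public decimal Balance
    {
        get { lock (_locker) { return _balance; } }
    }
}
```

Because the outer lock is held across both inner calls, no other thread can observe the balance between the deposit and the bonus.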
It's not so much that it's useful to do so, as it's useful to be allowed to. Consider how you may often have public methods that call other public methods. If the method being called takes a lock, and the calling method needs to hold the same lock over the wider scope of what it does, then being able to use recursive locks means you can do so.
There are some cases where you might feel like using two lock objects, but you're going to be using them together and hence if you make a mistake, there's a big risk of deadlock. If you can deal with the wider scope being given to the lock, then using the same object for both cases - and recursing in those cases where you'd be using both objects - will remove those particular deadlocks.
However...
This usefulness is debatable.
On the first case, I'll quote from Joe Duffy:
Recursion typically indicates an over-simplification in your synchronization design that often leads to less reliable code. Some designs use lock recursion as a way to avoid splitting functions into those that take locks and those that assume locks are already taken. This can admittedly lead to a reduction in code size and therefore a shorter time-to-write, but results in a more brittle design in the end.
It is always a better idea to factor code into public entry-points that take non-recursive locks, and internal worker functions that assert a lock is held. Recursive lock calls are redundant work that contributes to raw performance overhead. But worse, depending on recursion can make it more difficult to understand the synchronization behavior of your program, in particular at what boundaries invariants are supposed to hold. Usually we’d like to say that the first line after a lock acquisition represents an invariant “safe point” for an object, but as soon as recursion is introduced this statement can no longer be made confidently. This in turn makes it more difficult to ensure correct and reliable behavior when dynamically composed.
(Joe has more to say on the topic elsewhere in his blog, and in his book on concurrent programming).
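The factoring Joe Duffy recommends could look roughly like the sketch below. The Counter class is hypothetical, and Monitor.IsEntered assumes .NET Framework 4.5 or later; on older runtimes the assertion would have to be dropped or replaced.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public class Counter
{
    private readonly object _locker = new object();
    private int _value;

    // Public entry points take the lock exactly once, never recursively.
    public void Increment()
    {
        lock (_locker) { IncrementCore(); }
    }

    public void IncrementTwice()
    {
        lock (_locker)
        {
            IncrementCore();
            IncrementCore();
        }
    }

    // Internal worker: does not lock, only asserts that the caller already holds the lock.
    private void IncrementCore()
    {
        Debug.Assert(Monitor.IsEntered(_locker), "caller must hold _locker");
        _value++;
    }

    public int Value
    {
        get { lock (_locker) { return _value; } }
    }
}
```

The trade-off is a little more code in exchange for a clear rule: the first line after any lock acquisition is a point where the object's invariants hold.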
The second case is balanced by the cases where recursive lock entry just makes different types of deadlock happen, or push up the rate of contention so high that there might as well be deadlocks (This guy says he'd prefer it just to hit a deadlock the first time you recursed, I disagree - I'd much prefer it just to throw a big exception that brought my app down with a nice stack-trace).
One of the worst things is that it simplifies at the wrong time: when you're writing code, it can be simpler to use lock recursion than to split things out more and think more deeply about just what should be locking when. However, when you're debugging code, the fact that leaving a lock does not mean leaving that lock complicates things. What a bad way around: it's when we think we know what we're doing that complicated code is a temptation (one to be enjoyed in your off-time so you don't indulge while on the clock), and it's when we realise we messed up that we most want things to be nice and simple.
You really don't want to mix them with condition variables.
Hey, POSIX-threads only has them because of a dare!
At least the lock keyword means we avoid the possibility of not having matching Monitor.Exit()s for every Monitor.Enter()s which makes some of the risks less likely. Up until the time you need to do something outside of that model.
With more recent locking classes, .NET does its bit to help people avoid using lock recursion, without blocking those who use older coding patterns. ReaderWriterLockSlim has a constructor overload that lets you use it recursively, but the default is LockRecursionPolicy.NoRecursion.
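A small sketch of what that default means in practice. RecursionPolicyDemo is a hypothetical helper; it tries to enter the read lock twice on the same thread and reports whether the second entry was rejected.

```csharp
using System;
using System.Threading;

public static class RecursionPolicyDemo
{
    // Returns true if a second EnterReadLock on the same thread throws.
    public static bool RecursiveEntryThrows(LockRecursionPolicy policy)
    {
        using (var rw = new ReaderWriterLockSlim(policy))
        {
            rw.EnterReadLock();
            try
            {
                rw.EnterReadLock(); // recursive entry on the same thread
                rw.ExitReadLock();  // only reached when recursion is allowed
                return false;
            }
            catch (LockRecursionException)
            {
                return true;
            }
            finally
            {
                rw.ExitReadLock();  // release the first (outer) entry
            }
        }
    }
}
```

With LockRecursionPolicy.NoRecursion the recursive entry fails fast with a LockRecursionException instead of silently nesting; with LockRecursionPolicy.SupportsRecursion it succeeds.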
Often in dealing with issues of concurrency we have to make a decision between a more fraught technique that could potentially give us better concurrency but which requires much more care to be sure of correctness vs a simpler technique that could potentially give worse concurrency but where it is easier to be sure of the correctness. Using locks recursively gives us a technique where we will hold locks longer and have less good concurrency, and also be less sure of correctness and have harder debugging.
Suppose you have a resource that you want exclusive control over, but many methods act upon it. A method might not be able to assume that the resource is already locked, so it will take the lock itself. If it's locked in both the outer method AND the inner method, you get a situation similar to the example in the book. I cannot see a time when I would want to lock twice in the same code block.
This is more of a conceptual question. I was wondering whether using a lock inside a Parallel.ForEach<> loop would take away the benefits of parallelizing a foreach loop.
Here is some sample code where I have seen it done.
Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
lock (fileLockObject)
{
if (fileLocks.ContainsKey(reference.Key) == false)
{
fileLocks.Add(reference.Key, new object());
}
}
RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);
lock (fileLocks[reference.Key])
{
reference.Value.Document.Save(reference.Key);
}
});
Where fileLockObject and fileLocks are as follows.
private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();
Does this technique completely make the loop not parallel?
I would like to see your thoughts on this.
It means all of the work inside of the lock can't be done in parallel. This greatly harms the performance here, yes. Since the entire body is not all locked (and locked on the same object) there is still some parallelization here though. Whether the parallelization that you do get adds enough benefit to surpass the overhead that comes with managing the threads and synchronizing around the locks is something you really just need to test yourself with your specific data.
That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with, as every thread is locking on the same object) is locking access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be used from multiple threads and will minimize the amount of synchronization that needs to be done.
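A sketch of how the per-file lock lookup might be written with ConcurrentDictionary. The FileLocks class and its For method are hypothetical names; GetOrAdd is the real ConcurrentDictionary method and requires .NET 4 or later.

```csharp
using System.Collections.Concurrent;

public static class FileLocks
{
    private static readonly ConcurrentDictionary<string, object> _locks =
        new ConcurrentDictionary<string, object>();

    // Returns the lock object for a path, creating it on first use.
    // Under contention the factory may run more than once, but only one
    // object is stored and every caller gets that same stored instance.
    public static object For(string path)
    {
        return _locks.GetOrAdd(path, _ => new object());
    }
}
```

Inside the loop body, the save would then become lock (FileLocks.For(reference.Key)) { ... }, with no shared outer lock around the lookup.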
if I used a lock ... if that would take away the benefits of Paralleling a foreachloop.
Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent) then it will still pay off. The locking part could be less than 1% of the total, or 99%. Look up Amdahl's law, which applies here.
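To put rough numbers on that, Amdahl's law gives the best-case speedup as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. A tiny sketch (the Amdahl class is a hypothetical helper):

```csharp
using System;

public static class Amdahl
{
    // parallelFraction: share of the work (0..1) that runs outside any lock.
    // workers: number of threads or cores working in parallel.
    public static double Speedup(double parallelFraction, int workers)
    {
        return 1.0 / ((1.0 - parallelFraction) + parallelFraction / workers);
    }
}
```

If the locked portion is 1% of the work, Amdahl.Speedup(0.99, 8) is a little under 7.5; if it is 99%, Amdahl.Speedup(0.01, 8) is barely above 1.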
But worse, your code is not thread-safe. From your 2 operations on fileLocks, only the first is actually inside a lock.
lock (fileLockObject)
{
if (fileLocks.ContainsKey(reference.Key) == false)
{
...
}
}
and
lock (fileLocks[reference.Key]) // this access to fileLocks[] is not protected
change the 2nd part to:
lock (fileLockObject)
{
reference.Value.Document.Save(reference.Key);
}
and the use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment though.
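A sketch of the Interlocked.Increment idea for a counter shared across parallel iterations. SharedCounter is a hypothetical helper and does not reproduce the RecursiveBinUpdate signature from the question.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class SharedCounter
{
    // Counts iterations from many threads without a lock.
    public static int CountInParallel(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i =>
        {
            // A plain count++ is a read-modify-write and can lose updates;
            // Interlocked.Increment performs it as one atomic operation.
            Interlocked.Increment(ref count);
        });
        return count;
    }
}
```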
The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to handle the lock generation in advance.
When it comes to locks, there's no difference in the way PLINQ/TPL threads have to wait to gain access. So, in your case, it only makes the loop not parallel in those areas that you're locking and any work outside those locks is still going to execute in parallel (i.e. all the work in RecursiveBinUpdate).
Bottom line, I see nothing substantially wrong with what you're doing here.
I came across a ConcurrentDictionary implementation for .NET 3.5 (I'm sorry, I couldn't find the link right now) that uses this approach for locking:
var current = Thread.CurrentThread.ManagedThreadId;
while (Interlocked.CompareExchange(ref owner, current, 0) != current) { }
// PROCESS SOMETHING HERE
if (current != Interlocked.Exchange(ref owner, 0))
throw new UnauthorizedAccessException("Thread had access to cache even though it shouldn't have.");
Instead of the traditional lock:
lock(lockObject)
{
// PROCESS SOMETHING HERE
}
The question is: Is there any real reason for doing this? Is it faster or have some hidden benefit?
PS: I know there's a ConcurrentDictionary in some latest version of .NET but I can't use for a legacy project.
Edit:
In my specific case, what I'm doing is just manipulating an internal Dictionary class in such a way that it's thread safe.
Example:
public bool RemoveItem(TKey key)
{
// open lock
var current = Thread.CurrentThread.ManagedThreadId;
while (Interlocked.CompareExchange(ref owner, current, 0) != current) { }
// real processing starts here (entries is a regular `Dictionary` class)
var found = entries.Remove(key);
// verify lock
if (current != Interlocked.Exchange(ref owner, 0))
throw new UnauthorizedAccessException("Thread had access to cache even though it shouldn't have.");
return found;
}
As @doctorlove suggested, this is the code: https://github.com/miensol/SimpleConfigSections/blob/master/SimpleConfigSections/Cache.cs
There is no definitive answer to your question. I would answer: it depends.
What the code you've provided is doing is:
1. Wait for an object to be in a known state (threadId == 0 == no current work).
2. Do the work.
3. Set the object back to the known state.
4. Another thread can now do work too, because it can go from step 1 to step 2.
As you've noted, you have a loop in the code that actually does the "wait" step. You don't block the thread until you can access your critical section; you just burn CPU instead. Try replacing your processing (in your case, a call to Remove) with Thread.Sleep(2000), and you'll see the other "waiting" thread consuming all of one of your CPUs for 2s in the loop.
This means that which one is better depends on several factors. For example: how many concurrent accesses are there? How long does the operation take to complete? How many CPUs do you have?
I would use lock instead of Interlocked because it's way easier to read and maintain. The exception would be the case where you've got a piece of code called millions of times, and a particular use case where you're sure Interlocked is faster.
So you'll have to measure both approaches yourself. If you don't have time for this, then you probably don't need to worry about performance, and you should use lock.
Your CompareExchange sample code doesn't release the lock if an exception is thrown by "PROCESS SOMETHING HERE".
For this reason as well as the simpler, more readable code, I would prefer the lock statement.
You could rectify the problem with a try/finally, but this makes the code even uglier.
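A sketch of what the try/finally version might look like. SimpleSpinGate is a hypothetical class, not the linked implementation. It also assumes a simpler acquire test than the question's loop (it spins until the exchange from 0 succeeds, so it is not reentrant).

```csharp
using System;
using System.Threading;

public class SimpleSpinGate
{
    private int _owner; // 0 = free, otherwise the managed id of the owning thread

    public void Run(Action criticalSection)
    {
        int current = Thread.CurrentThread.ManagedThreadId;

        // Spin until we are the thread that swaps 0 -> current.
        // CompareExchange returns the value that was there before the call.
        while (Interlocked.CompareExchange(ref _owner, current, 0) != 0) { }

        try
        {
            criticalSection();
        }
        finally
        {
            // Runs even if criticalSection throws, so waiters are not left spinning.
            Interlocked.Exchange(ref _owner, 0);
        }
    }

    public bool IsFree
    {
        // CompareExchange(ref x, 0, 0) is an atomic read that never changes the value.
        get { return Interlocked.CompareExchange(ref _owner, 0, 0) == 0; }
    }
}
```

This is roughly the code that the lock statement generates for you around Monitor.Enter and Monitor.Exit, which is part of why lock reads better.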
The linked ConcurrentDictionary implementation has a bug: it will fail to release the lock if the caller passes a null key, potentially leaving other threads spinning indefinitely.
As for efficiency, your CompareExchange version is essentially a spinlock, which can be efficient if threads are only likely to be blocked for short periods. But inserting into a managed dictionary can take a relatively long time, since it may be necessary to resize the dictionary. Therefore, IMHO, this isn't a good candidate for a spinlock, which can be wasteful, especially on a single-processor system.
A little bit late... I have read your sample but in short:
Fastest to slowest multi-threading synchronization:
Interlocked.* => This is an atomic CPU instruction. It can't be beaten if it is sufficient for your needs.
SpinLock => Uses Interlocked behind the scenes and is really fast. It uses CPU while waiting. Do not use it for code that waits a long time (it is usually used to prevent thread switching for locks that guard quick actions). If you often have to wait more than one thread cycle, I would suggest going with lock.
lock => The slowest, but easier to use and read than SpinLock. The instruction itself is very fast, but if it can't acquire the lock it will relinquish the CPU. Behind the scenes, it will do a WaitForSingleObject on a kernel object (CriticalSection), and Windows will then give CPU time to the thread only once the lock has been freed by the thread that acquired it.
Have fun with MT!
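For the middle option in that list, a sketch of the usual System.Threading.SpinLock pattern (.NET 4 or later). SpinLockedCounter is a hypothetical class; the lockTaken flag and try/finally follow the pattern that the SpinLock documentation recommends.

```csharp
using System.Threading;

public class SpinLockedCounter
{
    // SpinLock is a mutable struct: keep it in a non-readonly field
    // and never copy it, or each copy would lock independently.
    private SpinLock _spin = new SpinLock();
    private int _value;

    public void Increment()
    {
        bool lockTaken = false;
        try
        {
            _spin.Enter(ref lockTaken); // busy-waits instead of blocking the thread
            _value++;                   // keep the protected work very short
        }
        finally
        {
            if (lockTaken) _spin.Exit();
        }
    }

    public int Read()
    {
        bool lockTaken = false;
        try
        {
            _spin.Enter(ref lockTaken);
            return _value;
        }
        finally
        {
            if (lockTaken) _spin.Exit();
        }
    }
}
```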
The docs for the Interlocked class tell us it
"Provides atomic operations for variables that are shared by multiple threads. "
The theory is an atomic operation can be faster than locks. Albahari gives some further details on interlocked operations stating they are faster.
Note that Interlocked provides a "smaller" interface than Lock - see previous question here
Yes.
The Interlocked class offers atomic operations, which means they do not block other code the way a lock does, because they don't really need to.
When you lock a block of code you want to make sure no 2 threads are in it at the same time, that means that when a thread is inside all other threads wait to get in, which uses resources (cpu time and idle threads).
The atomic operations on the other hand do not need to block other atomic operations because they are atomic. It's conceptually a one CPU operation, the next ones just go in after the previous, and you're not wasting threads on just waiting. (By the way, that's why it's limited to very basic operations like Increment, Exchange etc.)
I think a lock (which is a Monitor underneath) uses interlocked to know if the lock is already taken, but it can't know that the actions inside it can be atomic.
In most cases, though, the difference is not critical. But you need to verify that for your specific case.
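When a single Increment or Exchange is not enough, the basic operations can sometimes be combined in a retry loop. The sketch below keeps a running maximum without a lock; AtomicMax is a hypothetical helper, and Volatile.Read assumes .NET 4.5 or later.

```csharp
using System.Threading;

public static class AtomicMax
{
    // Raises target to candidate if candidate is larger, without taking a lock.
    public static void Update(ref int target, int candidate)
    {
        while (true)
        {
            int seen = Volatile.Read(ref target);
            if (candidate <= seen) return; // nothing to do

            // Write only if target still holds the value we read;
            // otherwise another thread got in first and we retry.
            if (Interlocked.CompareExchange(ref target, candidate, seen) == seen) return;
        }
    }
}
```

Once an update needs to touch more than one variable consistently, this stops being practical, which is where lock comes back in.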
Interlocked is faster, as already explained in other comments, and you can also define how the wait is implemented (e.g. SpinWait.SpinOnce(), SpinWait.SpinUntil(), Thread.Sleep()) once the lock attempt fails the first time. Also, if the code within the lock is expected to run with no possibility of failure (no custom code, delegates, resource resolution or allocation, events, or other unexpected code executed during the lock), and you are not going to catch exceptions to let your software continue, the try/finally can also be skipped, so there is some extra speed there. lock (something) makes sure that something is unlocked if an exception propagates out of the block, just as using (C#) makes sure the disposable object is disposed when execution leaves the block for whatever reason.
One important difference between lock and Interlocked.CompareExchange is how they can be used in async environments.
async operations cannot be awaited inside a lock, as this can easily result in deadlocks if the thread that continues execution after the await is not the same one that originally acquired the lock.
This is not a problem with interlocked however, because nothing is "acquired" by a thread.
Another solution for asynchronous code that may provide better readability than interlocked may be semaphore as described in this blog post:
https://blog.cdemi.io/async-waiting-inside-c-sharp-locks/
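A sketch of the semaphore approach for async code, assuming .NET 4.5 or later for SemaphoreSlim.WaitAsync. AsyncGate is a hypothetical wrapper, not code from the linked post.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class AsyncGate
{
    // Initial count 1, maximum 1: at most one caller inside at a time.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public async Task<T> RunAsync<T>(Func<Task<T>> work)
    {
        await _gate.WaitAsync(); // waits asynchronously, no thread is blocked
        try
        {
            return await work(); // awaiting here is fine: no thread owns the semaphore
        }
        finally
        {
            _gate.Release();     // may run on a different thread than WaitAsync did
        }
    }
}
```

Note that, unlike lock, a semaphore is not reentrant: calling RunAsync again from inside work would wait forever.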
I am struggling to choose between these two approaches; many of the answers here favor one or the other.
I need some guidance to choose the best one for my situation.
lock (lockObject)
lock (lockObject2) {
// two critical portions of code
}
versus:
lock (lockObject)
{
//first critical portion for lockObject
lock (lockObject2) {
//critical portion for lockObject2 and lockObject
}
}
The second example is marked as a bad practice by Coverity and I want to switch to the first if it is OK to do that.
Which of the two is best (and by best I mean code quality and fewer problems in the long run)? And why?
Edit 1: The first lock is only used for this case in particular.
"Best" is subjective and depends on context. One problem with either is that you can risk deadlocks if some code uses the same locks in a different order. If you have nested locks, then personally at a minimum I'd be using a lock-with-timeout (and raising an exception) - an exception is better than a deadlock. The advantage of taking out both locks immediately is that you know you can get them, before you start doing the work. The advantage of not doing that is that you reduce the time the lock on lockObject2 is held.
Personally, I would be looking for ways to make it:
lock (lockObject) {
//critical portion for lockObject
}
lock (lockObject2) {
//critical portion for lockObject2
}
This has the advantages of both, without the disadvantages of either - if you can restructure the code to do it.
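The lock-with-timeout suggestion above could be sketched as follows. TimedLock is a hypothetical helper; Monitor.TryEnter(object, TimeSpan, ref bool) is the real overload and requires .NET 4 or later.

```csharp
using System;
using System.Threading;

public static class TimedLock
{
    // Runs criticalSection under lockObject, or throws if the lock
    // cannot be acquired within the timeout (instead of deadlocking).
    public static void Run(object lockObject, TimeSpan timeout, Action criticalSection)
    {
        bool taken = false;
        try
        {
            Monitor.TryEnter(lockObject, timeout, ref taken);
            if (!taken)
                throw new TimeoutException("Could not acquire the lock in time.");

            criticalSection();
        }
        finally
        {
            if (taken) Monitor.Exit(lockObject);
        }
    }
}
```

A nested use would call TimedLock.Run for lockObject and, inside the delegate, again for lockObject2, so a lock-ordering mistake surfaces as an exception with a stack trace.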
Depends on the case you are facing:
If the first lock can be released without touching the second lock, that is a possible speedup for your application, because the second lock won't be held without being used.
If both locks are always used, I would prefer the first case for readability.
Edit: I think it's not possible to tell this without more information.
To keep the locks as short as possible you need to lock as late as possible. Anyway there may be cases you want to get both locks at the same time.
Edit: I said the wrong thing.
The first will lock the first lock, then lock the second lock for that portion of the code, and then unlock it automatically.
The second example will lock the first, then lock the second, and once both portions of code have ended, both are unlocked.
Personally I would prefer the second option since it will automatically unlock both locks. But as someone else commented, beware of deadlocks. This can happen if another thread locks in the reverse order.
I have two internal properties that use lazy-loading of backing fields, and are used in a multi-threaded application, so I have implemented a double-checking lock scheme as per this MSDN article
Now, firstly assuming that this is an appropriate pattern, all the examples show creating a single lock object for an instance. If my two properties are independent of each other, would it not be more efficient to create a lock instance for each property?
It occurs to me that maybe there is only one in order to avoid deadlocks or race conditions. An obvious situation doesn't come to mind, but I'm sure someone can show me one... (I'm not very experienced with multi-threaded code, obviously)
private List<SomeObject1> _someProperty1;
private List<SomeObject2> _someProperty2;
private readonly object _syncLockSomeProperty1 = new object();
private readonly object _syncLockSomeProperty2 = new object();
internal List<SomeObject1> SomeProperty1
{
get
{
if (_someProperty1== null)
{
lock (_syncLockSomeProperty1)
{
if (_someProperty1 == null)
{
_someProperty1 = new List<SomeObject1>();
}
}
}
return _someProperty1;
}
set
{
_someProperty1 = value;
}
}
internal List<SomeObject2> SomeProperty2
{
get
{
if (_someProperty2 == null)
{
lock (_syncLockSomeProperty2)
{
if (_someProperty2 == null)
{
_someProperty2 = new List<SomeObject2>();
}
}
}
return _someProperty2;
}
set
{
_someProperty2 = value;
}
}
If your properties are truly independent, then there's no harm in using independent locks for each of them.
In case the two properties (or their initializers more specifically) are independent of each other, as in the sample code you provided, it makes sense to have two different lock objects. However, when the initialization occurs rarely, the effect will be negligible.
Note that you should protect the setter's code as well. The lock statement imposes a so called memory barrier, which is indispensable especially on multi-CPU and/or multi-core systems to prevent race conditions.
Yes, if they are independent of each other, this would indeed be more efficient, as access to one won't block access to the other. You're also on the money about the risk of a deadlock if that independence turned out to be false.
Presuming that _someProperty1 = new List<SomeObject1>(); isn't the real code for assigning to _someProperty1 (hardly worth the lazy-load, is it?), the question is: can the code that fills SomeProperty1 ever call the code that fills SomeProperty2, or vice versa, through any code path, no matter how bizarre?
Even if one can call the other, there can't be a deadlock, but if they both can call each other (or 1 call 2, 2 call 3 and 3 call 1, and so on), then a deadlock can definitely happen.
As a rule, I'd start with broad locks (one lock for all locked tasks) and then make the locks narrower as an optimisation as needed. In cases where you have, say, 20 methods which need locking, then judging the safety can be harder (also, you begin to fill memory just with lock objects).
Note that there are two issues with your code also:
First, you don't lock in your setter. Possibly this is fine (you just want your lock to prevent multiple heavy calls to the loading method, and don't actually care if there are over-writes between the set, and the get), possibly this is a disaster.
Second, depending on the CPU running it, double-check as you write it can have issues with read/write reordering, so you should either have a volatile field, or call a memory barrier. See http://blogs.msdn.com/b/brada/archive/2004/05/12/130935.aspx
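A sketch of the double-checked pattern with the volatile fix applied. Holder and its members are hypothetical names, and the empty list stands in for whatever expensive load the real property performs.

```csharp
using System.Collections.Generic;

public class Holder
{
    private readonly object _sync = new object();

    // volatile stops the read/write reordering that could otherwise
    // let another thread see the reference before the object is fully built.
    private volatile List<string> _items;

    public List<string> Items
    {
        get
        {
            List<string> items = _items;  // first check, no lock
            if (items == null)
            {
                lock (_sync)
                {
                    items = _items;       // second check, under the lock
                    if (items == null)
                    {
                        items = new List<string>(); // stand-in for the expensive load
                        _items = items;
                    }
                }
            }
            return items;
        }
    }
}
```

On .NET 4 and later, Lazy<T> packages the same idea and may be a simpler choice; that is a swap-in suggestion, not something the answer above proposes.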
Edit:
It's also worth considering whether it's really needed at all.
Consider that the operation itself should be thread-safe:
1. A bunch of stuff is done.
2. An object is created based on that bunch of stuff.
3. That object is assigned to the local variable.
Steps 1 and 2 will only happen on one thread, and 3 is atomic. Therefore, the advantages of locking are:
1. If performing step 1 and/or 2 above has its own threading issues, and they aren't protected from them by their own locks, then locking is 100% necessary.
2. If it would be disastrous for something to have acted upon a value obtained in steps 1 and 2, and then later to do so again with steps 1 and 2 having been repeated, locking is 100% necessary.
3. Locking will prevent the waste of steps 1 and 2 being done multiple times.
So, if we can rule out cases 1 and 2 as an issue (it takes a bit of analysis, but it's often possible), then we've only the prevention of waste in case 3 to worry about. Now, maybe this is a big worry. However, if it would rarely come up, and also not be that much of a waste when it did, then the gains of not locking would outweigh the gains of locking.
If in doubt, locking is probably the safer approach, but it's possible that just living with the occasional wasted operation is better.
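If the occasional wasted operation is acceptable, one possible lock-free shape is sketched below. LockFreeHolder is a hypothetical class, the empty list again stands in for the real load, and Volatile.Read assumes .NET 4.5 or later.

```csharp
using System.Collections.Generic;
using System.Threading;

public class LockFreeHolder
{
    private List<string> _items;

    public List<string> Items
    {
        get
        {
            List<string> current = Volatile.Read(ref _items);
            if (current != null) return current;

            // Several threads may get here and each build a candidate;
            // that duplicated work is the "waste" being tolerated.
            var created = new List<string>();

            // Publish only if the field is still null. CompareExchange returns
            // the previous value: null if we won, the winner's object otherwise.
            List<string> prior = Interlocked.CompareExchange(ref _items, created, null);
            return prior ?? created;
        }
    }
}
```

Every caller ends up with the same published instance, so this only fits cases where discarding a losing candidate has no side effects.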
In a multi-threaded program running on a multi-CPU machine, do I need to access shared state (_data in the example code below) using volatile reads/writes to ensure correctness?
In other words, can heap objects be cached on the cpu?
Using the example below and assuming multiple threads will access the GetValue and Add methods, I need ThreadA to be able to add data (using the Add method) and ThreadB to be able to see/get that added data immediately (using the GetValue method). So do I need to add volatile reads/writes to _data to ensure this? Basically, I don't want the added data to be cached on ThreadA's CPU.
I am not locking (enforcing exclusive thread access) as the code needs to be ultra-fast, and I am not removing any data from _data, so I don't need to lock _data.
Thanks.
**** Update ****************************
Obviously you guys think going lock-free using this example is bad idea. But what side effects or exceptions could I face here?
Could the Dictionary type throw an exception if 1 thread is iterating the values for read and another thread is iterating the values for update? Or would I just experience “dirty reads” (which would be fine in my case)?
**** End Update ****************************
public sealed class Data
{
private volatile readonly Dictionary<string, double> _data = new Dictionary<string, double>();
public double GetValue(string key)
{
double value;
if (!_data.TryGetValue(key, out value))
{
throw new ArgumentException(string.Format("Key {0} does not exist.", key));
}
return value;
}
public void Add(string key, double value)
{
_data.Add(key, value);
}
public void Clear()
{
_data.Clear();
}
}
Thanks for the replies. Regarding the locks, the methods are pretty much constantly called by multiple threads, so my problem is with contested locks, not the actual lock operation.
So my question is about CPU caching: can heap objects (the _data instance field) be cached on a CPU? Do I need to access the _data field using volatile reads/writes?
Also, I am stuck with .NET 2.0.
Thanks for your help.
The MSDN docs for Dictionary<TKey, TValue> say that it's safe for multiple readers but they don't give the "one writer, multiple readers" guarantee that some other classes do. In short, I wouldn't do this.
You say you're avoiding locking because you need the code to be "ultra-fast" - have you tried locking to see what the overhead is? Uncontested locks are very cheap, and when the lock is contested that's when you're benefiting from the added safety. I'd certainly profile this extensively before deciding to worry about the concurrency issues of a lock-free solution. ReaderWriterLockSlim may be useful if you've actually got multiple readers, but it sounds like you've got a single reader and a single writer, at least at the moment - simple locking will be easier in this case.
I think you may be misunderstanding the use of the volatile keyword (either that or I am, and someone please feel free to correct me). The volatile keyword guarantees that get and set operations on the value of the variable itself from multiple threads will always deal with the same copy. For instance, if I have a bool that indicates a state then setting it in one thread will make the new value immediately available to the other.
However, you never change the value of your variable (in this case, a reference). All that you do is manipulate the area of memory that the reference points to. Declaring it as volatile readonly (which, if my understanding is sound, defeats the purpose of volatile by never allowing it to be set) won't have any effect on the actual data that's being manipulated (the back-end store for the Dictionary<>).
All that being said, you really need to use a lock in this case. Your danger extends beyond the prospect of "dirty reads" (meaning that what you read would have been, at some point, valid) into truly unknown territory. As Jon said, you really need proof that locking produces unacceptable performance before you try to go down the road of lockless coding. Otherwise that's the epitome of premature optimization.
The problem is that your add method:
public void Add(string key, double value)
{
_data.Add(key, value);
}
Could cause _data to decide to completely re-organise the data it's holding. At that point a GetValue request could fail in any possible way.
You need a lock or a different data structure / data structure implementation.
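The plain-lock option could look like the sketch below, which only uses features available in .NET 2.0. LockedData is a hypothetical rewrite of the question's Data class, not code from any of the answers.

```csharp
using System;
using System.Collections.Generic;

public sealed class LockedData
{
    private readonly object _sync = new object();
    private readonly Dictionary<string, double> _data = new Dictionary<string, double>();

    public double GetValue(string key)
    {
        // Readers take the same lock as writers, so a lookup can never
        // run while Add is re-organising the dictionary's internals.
        lock (_sync)
        {
            double value;
            if (!_data.TryGetValue(key, out value))
            {
                throw new ArgumentException(string.Format("Key {0} does not exist.", key));
            }
            return value;
        }
    }

    public void Add(string key, double value)
    {
        lock (_sync)
        {
            _data.Add(key, value);
        }
    }

    public void Clear()
    {
        lock (_sync)
        {
            _data.Clear();
        }
    }
}
```

Leaving the lock also gives the visibility the question asks about: a value added by one thread is seen by any thread that subsequently takes the lock.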
I don't think volatile can be a replacement of locking if you start calling methods on it. You are guaranteeing that the thread A and thread B sees the same copy of the dictionary, but you can still access the dictionary simultaneously. You can use multi-moded locks to increase concurrency. See ReaderWriterLockSlim for example.
"Represents a lock that is used to manage access to a resource, allowing multiple threads for reading or exclusive access for writing."
The volatile keyword is not about locking; it is used to indicate that the value of the specified field might be changed or read by a different thread, or by something else that can run concurrently with your code. This is crucial for the compiler to know, because many optimizations involve caching the variable's value and rearranging instructions. The volatile keyword tells the compiler to be "cautious" when optimizing instructions that reference a volatile variable.
For multi-threaded use of a dictionary, there are many ways to do it. The simplest way is using the lock keyword, which has adequate performance. If you need higher performance, you might need to implement your own dictionary for your specific task.
Volatile is not locking; it has nothing to do with synchronization. It's generally safe to do lock-free reads on read-only data. Note that although you don't remove anything from _data, you do call _data.Add(). That is NOT read-only. So yes, this code will blow up in your face in a variety of exciting and difficult-to-predict ways.
Use locks, it's simple, it's safer. If you're a lock-free guru (you're not!), AND profiling shows a bottleneck related to contention for the lock, AND you cannot solve the contention issues via partitioning or switching to spin-locks THEN AND ONLY THEN can you investigate a solution to get lock-free reads, which WILL involve writing your own Dictionary from scratch and MAY be faster than the locking solution.
Are you starting to see how far off base you are in your thinking here? Just use a damn lock!