In my application I manage a collection of locks that I use to serialize access to certain objects (each object is assigned a lock). This collection of locks (the lock manager) also needs to be maintained in a thread-safe fashion: new locks are added and old locks are removed as the objects that require serialization are added and removed.
The algorithm works something like this:
LockManager.Lock();
var myLock = LockManager.FindLock(myObject);
LockManager.Unlock(); // atomic
myLock.Lock(); // atomic
Swapping the two marked lines is not a good solution: if locking myLock were to block, that would also delay unlocking the LockManager, making requests for any other locks block as well.
What I would need is that the two marked lines are executed atomically. Is there a way to achieve this?
So you want to:
guarantee that the individual lock (via myLock) was entered
then unlock the LockManager
make the above two operations atomic
and not allow this new atomic operation to block if the individual lock cannot be entered immediately
Similar to how you cannot circumvent the laws of physics to create a perpetual motion machine, you also cannot circumvent the laws of computation by executing a sequence of operations atomically in such a manner that it does not block even if one of its constituents can, in fact, be expected to block. In other words, there is no way to make the composite operation complete until its individual parts also complete.
However, what we can do is attempt this atomic operation in an all-or-none manner that never blocks, as long as we are okay with the "none" outcome. You see this pattern in the TryXXX methods that exist on many concurrent data structures. All you would need to do is define a TryLock on your myLock type. Then the LockManager could look like the following.
public class LockManager
{
    public bool TryEnterIndividualLock(object value)
    {
        Lock();
        try
        {
            var myLock = FindLock(value);
            if (myLock != null)
            {
                // TryLock returns immediately; false means the
                // individual lock is currently held by someone else.
                return myLock.TryLock();
            }
            return false;
        }
        finally
        {
            // The manager lock is always released, so a busy
            // individual lock never blocks other lookups.
            Unlock();
        }
    }
}
Then the calling code would look like this:
while (!LockManager.TryEnterIndividualLock(myObject))
{
    // Do something else until the lock can be entered.
}
This would give you the atomicity you were looking for, but at the cost of the operation not succeeding. If you are relying on this operation succeeding immediately then you are going to have to rethink your overall design.
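The question does not show what the myLock type looks like, but a TryLock can be a thin wrapper over Monitor.TryEnter, which returns false immediately instead of blocking. A minimal sketch (the IndividualLock name and its members are hypothetical):

using System.Threading;

public class IndividualLock
{
    private readonly object m_Sync = new object();

    public void Lock()
    {
        Monitor.Enter(m_Sync);
    }

    // Returns false immediately if another thread holds the lock.
    public bool TryLock()
    {
        return Monitor.TryEnter(m_Sync);
    }

    public void Unlock()
    {
        Monitor.Exit(m_Sync);
    }
}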
This is more a conceptual question. I was wondering whether using a lock inside a Parallel.ForEach<> loop would take away the benefits of parallelizing a foreach loop.
Here is some sample code where I have seen it done.
Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
    lock (fileLockObject)
    {
        if (fileLocks.ContainsKey(reference.Key) == false)
        {
            fileLocks.Add(reference.Key, new object());
        }
    }
    RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);
    lock (fileLocks[reference.Key])
    {
        reference.Value.Document.Save(reference.Key);
    }
});
Where fileLockObject and fileLocks are as follows.
private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();
Does this technique completely make the loop not parallel?
I would like to see your thoughts on this.
It means none of the work inside the lock can be done in parallel, which certainly hurts performance here, yes. But since the entire body is not locked (and not all on the same object), there is still some parallelization. Whether the parallelization you do get adds enough benefit to surpass the overhead of managing the threads and synchronizing around the locks is something you really just need to test with your specific data.
That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with, as every thread locks on the same object) is locking access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be used from multiple threads and minimizes the amount of synchronization that needs to be done.
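For example, the first locked block could collapse to a single call, and the returned lock object also fixes the unprotected fileLocks[reference.Key] indexing later on (a sketch based on the fields shown in the question; GetOrAdd creates the per-key lock object only when the key is absent, in a thread-safe way):

using System.Collections.Concurrent;

private static ConcurrentDictionary<string, object> fileLocks =
    new ConcurrentDictionary<string, object>();

// Inside the Parallel.ForEach body:
object fileLock = fileLocks.GetOrAdd(reference.Key, _ => new object());
lock (fileLock)
{
    reference.Value.Document.Save(reference.Key);
}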
if I used a lock ... would that take away the benefits of parallelizing a foreach loop?
Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent), then parallelizing will still pay off. The locked part could take less than 1% of the time, or 99%. Look up Amdahl's law; it applies here.
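As a rough guide: if a fraction p of the loop body can run in parallel on n threads, the best overall speedup is 1 / ((1 - p) + p / n), so the serial (locked) fraction puts a hard cap on the gain no matter how many threads you add.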
But worse, your code is not thread-safe. Of your two operations on fileLocks, only the first is actually inside a lock.
lock (fileLockObject)
{
    if (fileLocks.ContainsKey(reference.Key) == false)
    {
        ...
    }
}
and
lock (fileLocks[reference.Key]) // this access to fileLocks[] is not protected
change the 2nd part to:

lock (fileLockObject)
{
    reference.Value.Document.Save(reference.Key);
}
And the use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment, though.
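For example, if recursionCount is only ever incremented, each mutation can be made atomic like this (a sketch; whether this suffices depends on what RecursiveBinUpdate actually does with the counter):

Interlocked.Increment(ref recursionCount);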
The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to handle the lock generation in advance.
When it comes to locks, there's no difference in the way PLINQ/TPL threads have to wait to gain access. So, in your case, it only makes the loop not parallel in those areas that you're locking and any work outside those locks is still going to execute in parallel (i.e. all the work in RecursiveBinUpdate).
Bottom line, I see nothing substantially wrong with what you're doing here.
How come if I have a statement like this:
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (this) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (this) { sharedValue--; }
}
For a thread to get into a lock, it must first check whether another thread is holding it. If no one is, it can enter, but it then has to write something to memory to claim ownership. Surely this cannot be atomic, since it involves both a read and a write.
So how come it's impossible for one thread to be reading the lock while another is writing its ownership to it?
To simplify: why can two threads not both get into a lock at the same time?
It looks like you are basically asking how the lock works. How can the lock maintain internal state in an atomic manner without already having a lock built? It seems like a chicken-and-egg problem at first, does it not?
The magic all happens because of a compare-and-swap (CAS) operation. The CAS operation is a hardware level instruction that does 2 important things.
It generates a memory barrier so that instruction reordering is constrained.
It compares the contents of a memory address with another value and if they are equal then the original value is replaced with a new value. It does all of this in an atomic manner.
At the most fundamental level this is how the trick is accomplished. It is not that all other threads are blocked from reading while another is writing. That is totally the wrong way to think about it. What actually happens is that all threads are acting as writers simultaneously. The strategy is more optimistic than it is pessimistic. Every thread is trying to acquire the lock by performing this special kind of write called a CAS. You actually have access to a CAS operation in .NET via the Interlocked.CompareExchange (ICX) method. Every synchronization primitive can be built from this single operation.
If I were going to write a Monitor-like class (that is what the lock keyword uses behind the scenes) from scratch entirely in C# I could do it using the Interlocked.CompareExchange method. Here is an overly simplified implementation. Please keep in mind that this is most certainly not how the .NET Framework does it.1 The reason I present the code below is to show you how it could be done in pure C# code without the need for CLR magic behind the scenes and because it might get you thinking about how Microsoft could have implemented it.
public class SimpleMonitor
{
    // 0 = unlocked, 1 = locked
    private int m_LockState = 0;

    public void Enter()
    {
        int iterations = 0;
        while (!TryEnter())
        {
            // Spin first, then progressively back off to yielding
            // and sleeping so we do not burn the CPU indefinitely.
            if (iterations < 10) Thread.SpinWait(4 << iterations);
            else if (iterations % 20 == 0) Thread.Sleep(1);
            else if (iterations % 5 == 0) Thread.Sleep(0);
            else Thread.Yield();
            iterations++;
        }
    }

    public void Exit()
    {
        if (!TryExit())
        {
            throw new SynchronizationLockException();
        }
    }

    public bool TryEnter()
    {
        // Atomically flip the state from 0 to 1; we acquired the
        // lock only if the original value was 0.
        return Interlocked.CompareExchange(ref m_LockState, 1, 0) == 0;
    }

    public bool TryExit()
    {
        // Atomically flip the state from 1 back to 0.
        return Interlocked.CompareExchange(ref m_LockState, 0, 1) == 1;
    }
}
This implementation demonstrates a couple of important things.
It shows how the ICX operation is used to atomically read and write the lock state.
It shows how the waiting might occur.
Notice how I used Thread.SpinWait, Thread.Sleep(0), Thread.Sleep(1) and Thread.Yield while the lock is waiting to be acquired. The waiting strategy is overly simplified, but it does approximate a real life algorithm implemented in the BCL already. I intentionally kept the code simple in the Enter method above to make it easier to spot the crucial bits. This is not how I would have normally implemented this, but I am hoping it does drive home the salient points.
Also note that my SimpleMonitor above has a lot of problems. Here are but a few.
It does not handle nested locking.
It does not provide Wait or Pulse methods like the real Monitor class. They are really hard to do right.
1The CLR will actually use a special block of memory that exists on each reference type. This block of memory is referred to as the "sync block". The Monitor will manipulate bits in this block of memory to acquire and release the lock. This action may require a kernel event object. You can read more about it on Joe Duffy's blog.
The lock keyword in C# uses the Monitor class behind the scenes to do the actual locking.
You can read more about Monitor here: http://msdn.microsoft.com/en-us/library/system.threading.monitor.aspx. The Enter method of the Monitor class ensures that only one thread can enter the critical section at a time:
Acquires a lock for an object. This action also marks the beginning of a critical section. No other thread can enter the critical section unless it is executing the instructions in the critical section using a different locked object.
BTW, you should avoid locking on this (lock(this)). You should use a private field of the class (static or non-static) to protect the critical section. You can read more in the same link provided above, but the reason is:
When selecting an object on which to synchronize, you should lock only on private or internal objects. Locking on external objects might result in deadlocks, because unrelated code could choose the same objects to lock on for different purposes.
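Applied to the code in the question, that means something like this (the syncRoot name is just a convention):

private int sharedValue = 0;
private readonly object syncRoot = new object();

public void SomeMethodOne()
{
    lock (syncRoot) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (syncRoot) { sharedValue--; }
}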
In my app I have a List of objects. I'm going to have a process (thread) running every few minutes that will update the values in this list. I'll have other processes (other threads) that will just read this data, and they may attempt to do so at the same time.
When the list is being updated, I don't want any other process to be able to read the data. However, I don't want the read-only processes to block each other when no updating is occurring. Finally, if a process is reading the data, the process that updates the data must wait until the process reading the data is finished.
What sort of locking should I implement to achieve this?
ReaderWriterLockSlim is what you are looking for; it handles exactly the scenario you describe.
You have two pairs of methods at your disposal:
EnterWriteLock and ExitWriteLock
EnterReadLock and ExitReadLock
The first pair waits until all other locks, both read and write, are released, so it gives you exclusive access just as lock() would.
The second pair's locks are compatible with each other: you can hold multiple read locks at any given time.
Because there's no syntactic sugar like with the lock() statement, make sure you never forget to exit the lock, whether due to an exception or anything else. So use it in a form like this:
rwLock.EnterWriteLock(); // or EnterReadLock
try
{
    // Your code here, which can possibly throw an exception.
}
finally
{
    rwLock.ExitWriteLock(); // or ExitReadLock
}

(Note that the Enter call belongs before the try: if it were inside the try and failed, the finally block would attempt to release a lock that was never acquired. Also, lock is a C# keyword, so the variable is named rwLock here.)
You don't make it clear whether the updates to the list will involve modification of existing objects, or adding/removing new ones - the answers in each case are different.
To handle modification of existing items in the list, each object should handle its own locking.
To allow modification of the list while others are iterating it, don't give callers direct access to the list; force them to work with a read-only snapshot, like this:
public class Example
{
    private List<X> MasterList = new List<X>();
    private object padLock = new object();

    public IEnumerable<X> GetReadOnlySnapshot()
    {
        lock (padLock)
        {
            // Copy the list so the snapshot is unaffected by later
            // writes, then wrap it so callers cannot modify it.
            return new ReadOnlyCollection<X>(new List<X>(MasterList));
        }
    }
}
Using a ReadOnlyCollection<X> to wrap a copy of the master list ensures that readers iterate over fixed content, without blocking the modifications made by writers.
You could use ReaderWriterLockSlim. It would satisfy your requirements precisely. However, it is likely to be slower than just using a plain old lock. The reason is that RWLS is roughly 2x slower than lock, and accessing a List is so fast that the gain cannot overcome the additional overhead of the RWLS. Test both ways, but it is likely ReaderWriterLockSlim will be slower in your case. Reader-writer locks do better in scenarios where readers significantly outnumber writers and the guarded operations are long and drawn out.
However, let me present another option for you. One common pattern for dealing with this type of problem is to use two separate lists. One serves as the official copy, which accepts updates, and the other serves as the read-only copy. After you update the official copy you clone it and swap out the reference to the read-only copy. This is elegant in that readers require no blocking whatsoever. The reason readers do not need any blocking synchronization is that we treat the read-only copy as if it were immutable. Here is how it can be done.
public class Example
{
    private readonly List<object> m_Official;
    private volatile List<object> m_Readonly;

    public Example()
    {
        m_Official = new List<object>();
        m_Readonly = m_Official;
    }

    public void Update()
    {
        lock (m_Official)
        {
            // Modify the official copy here.
            m_Official.Add(...);
            m_Official.Remove(...);

            // Now clone the official copy.
            var clone = new List<object>(m_Official);

            // And finally swap out the read-only copy reference.
            m_Readonly = clone;
        }
    }

    public object Read(int index)
    {
        // It is safe to access the read-only copy here because it is immutable.
        // m_Readonly must be marked as volatile for this to work correctly.
        return m_Readonly[index];
    }
}
The code above does not satisfy your requirements precisely, because readers never block... ever. That means reads can still take place while a writer is updating the official list. But in a lot of scenarios this winds up being acceptable.
When should locks be used? Only when modifying data or when accessing it as well?
public class Test {
    static Dictionary<string, object> someList = new Dictionary<string, object>();
    static object syncLock = new object();

    public static object GetValue(string name) {
        if (someList.ContainsKey(name)) {
            return someList[name];
        } else {
            lock (syncLock) {
                object someValue = GetValueFromSomeWhere(name);
                someList.Add(name, someValue);
                return someValue;
            }
        }
    }
}
Should there be a lock around the entire block, or is it OK to just lock the actual modification? My understanding is that there could still be a race condition where one call doesn't find the key and starts to add it while another call right after runs into the same situation - but I'm not sure. Locking is still so confusing. I haven't run into any issues with code like the above, but I could just be lucky so far. Any help would be appreciated, as would any good resources on how/when to lock objects.
You have to lock when reading too, or you can get unreliable data, or even an exception if a concurrent modification physically changes the target data structure.
In the case above, you need to make sure that multiple threads don't try to add the value at the same time, so you need at least a read lock while checking whether it is already present. Otherwise multiple threads could decide to add: each finds the value not present (since the check is not locked) and then tries to add in turn (after getting the lock).
You could use a ReaderWriterLockSlim if you have many reads and only a few writes. In the code above you would acquire the read lock to do the check and upgrade to a write lock once you decide you need to add it. In most cases, only a read lock (which allows your reader threads to still run in parallel) would be needed.
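A sketch of how that check-then-add could look (cacheLock is a hypothetical field; note that ReaderWriterLockSlim does not let you upgrade a plain read lock directly, so here the read lock is released and the state re-checked under the write lock):

private static readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();

public static object GetValue(string name) {
    cacheLock.EnterReadLock();
    try {
        object someValue;
        if (someList.TryGetValue(name, out someValue)) {
            return someValue; // common case: concurrent readers allowed
        }
    }
    finally {
        cacheLock.ExitReadLock();
    }

    cacheLock.EnterWriteLock();
    try {
        // Re-check: another thread may have added the value between
        // releasing the read lock and acquiring the write lock.
        object someValue;
        if (!someList.TryGetValue(name, out someValue)) {
            someValue = GetValueFromSomeWhere(name);
            someList.Add(name, someValue);
        }
        return someValue;
    }
    finally {
        cacheLock.ExitWriteLock();
    }
}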
There is a summary of the available .Net 4 locking primitives here. Definitely you should understand this before you get too deep into multithreaded code. Picking the correct locking mechanism can make a huge performance difference.
You are correct that you have been lucky so far - that's a frequent feature of concurrency bugs. They are often hard to reproduce without targeted load testing, meaning correct design (and exhaustive testing, of course) is vital to avoid embarrassing and confusing production bugs.
Lock the whole block before you check for the existence of name. Otherwise, in theory, another thread could add it between the check and your code that adds it.
Actually locking just when you perform the Add really doesn't do anything at all. All that would do is prevent another thread from adding something simultaneously. But since that other thread would have already decided it was going to do the add, it would just try to do it anyway as soon as the lock was released.
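Concretely, here is the question's method with the lock widened to cover the check as well (a minimal sketch, using TryGetValue to avoid the double lookup):

public static object GetValue(string name) {
    lock (syncLock) {
        object someValue;
        if (!someList.TryGetValue(name, out someValue)) {
            someValue = GetValueFromSomeWhere(name);
            someList.Add(name, someValue);
        }
        return someValue;
    }
}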
If a resource accessed by multiple threads is only ever read, you do not need any locks.
If a resource can be accessed by multiple threads and can be modified, then all accesses/modifications need to be synchronized. In your example, if GetValueFromSomeWhere takes a long time to return, it is possible for a second call to be made with the same value in name, but the value has not been stored in the Dictionary.
ReaderWriterLock, or the slim version (ReaderWriterLockSlim) if you are on .NET 3.5 or later.
You acquire the reader lock for reads (allowing concurrent reads) and upgrade to the writer lock when something needs to be written (allowing only one write at a time and blocking all reads until it is done, as well as any concurrent writer threads).
Make sure to release your locks with the try/finally pattern to avoid deadlocking:
void Write(object[] args)
{
    this.ReaderWriterLock.AcquireWriterLock(Timeout.Infinite);
    try
    {
        this.myData.Write(args);
    }
    finally
    {
        // Always released, even if Write throws.
        this.ReaderWriterLock.ReleaseWriterLock();
    }
}
I'm still confused... When we write something like this:
Object o = new Object();
var resource = new Dictionary<int , SomeclassReference>();
...and have two blocks of code that lock o while accessing resource...
// Code one
lock (o)
{
    // read from resource
}

// Code two
lock (o)
{
    // write to resource
}
Now, if I have two threads, one executing code that reads from resource and another writing to it, I would want to lock resource such that when it is being read, the writer has to wait (and vice versa - while it is being written to, readers have to wait). Will the lock construct help me? ...or should I use something else?
(I'm using Dictionary for the purposes of this example, but could be anything)
There are two cases I'm specifically concerned about:
two threads trying to execute same line of code
two threads trying to work on the same resource
Will lock help in both conditions?
Most of the other answers address your code example, so I'll try to answer the question in your title.
A lock is really just a token. Whoever has the token may take the stage so to speak. Thus the object you're locking on doesn't have an explicit connection to the resource you're trying to synchronize around. As long as all readers/writers agree on the same token it can be anything.
When trying to lock on an object (i.e. by calling Monitor.Enter on an object) the runtime checks if the lock is already held by a thread. If this is the case the thread trying to lock is suspended, otherwise it acquires the lock and proceeds to execute.
When a thread holding a lock exits the lock scope (i.e. calls Monitor.Exit), the lock is released and any waiting threads may now acquire the lock.
Finally a couple of things to keep in mind regarding locks:
Lock as long as you need to, but no longer.
If you use Monitor.Enter/Exit instead of the lock keyword, be sure to place the call to Exit in a finally block so the lock is released even in the case of an exception.
Exposing the object to lock on makes it harder to get an overview of who is locking and when. Ideally synchronized operations should be encapsulated.
Yes, using a lock is the right way to go. You can lock on any object, but as mentioned in other answers, locking on your resource itself is probably the easiest and safest.
However, you may want to use a read/write lock pair instead of a single lock, to decrease concurrency overhead.
The rationale is that if you have only one thread writing but several threads reading, you do not want a read operation to block another read operation; you only want a read to block a write, or vice versa.
Now, I am more of a Java guy, so you will have to change the syntax and dig up some docs to apply this in C#, but rw-locks are part of the standard concurrency package in Java, so you could write something like:
public class ThreadSafeResource<T> implements Resource<T> {
    private final Lock rlock;
    private final Lock wlock;
    private final Resource<T> res;

    public ThreadSafeResource(Resource<T> res) {
        this.res = res;
        ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        this.rlock = rwl.readLock();
        this.wlock = rwl.writeLock();
    }

    public T read() {
        rlock.lock();
        try { return res.read(); }
        finally { rlock.unlock(); }
    }

    public T write(T t) {
        wlock.lock();
        try { return res.write(t); }
        finally { wlock.unlock(); }
    }
}
If someone can come up with a C# code sample...
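Here is one attempt at a C# equivalent, using ReaderWriterLockSlim (the IResource<T> interface is a hypothetical stand-in for the Java Resource<T>):

using System.Threading;

public interface IResource<T>
{
    T Read();
    T Write(T t);
}

public class ThreadSafeResource<T> : IResource<T>
{
    private readonly ReaderWriterLockSlim rwl = new ReaderWriterLockSlim();
    private readonly IResource<T> res;

    public ThreadSafeResource(IResource<T> res)
    {
        this.res = res;
    }

    public T Read()
    {
        rwl.EnterReadLock();
        try { return res.Read(); }
        finally { rwl.ExitReadLock(); }
    }

    public T Write(T t)
    {
        rwl.EnterWriteLock();
        try { return res.Write(t); }
        finally { rwl.ExitWriteLock(); }
    }
}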
Both blocks of code are locked here. If thread one locks the first block, and thread two tries to get into the second block, it will have to wait.
The lock (o) { ... } statement is compiled to this:
Monitor.Enter(o);
try { ... }
finally { Monitor.Exit(o); }
The call to Monitor.Enter() will block the thread if another thread has already called it. It will only be unblocked after that other thread has called Monitor.Exit() on the object.
Will lock help in both conditions?
Yes.
Does lock(){} lock a resource, or does it lock a piece of code?
lock (o)
{
    // read from resource
}
is syntactic sugar for
Monitor.Enter(o);
try
{
    // read from resource
}
finally
{
    Monitor.Exit(o);
}
The Monitor class holds the collection of objects that you are using to synchronize access to blocks of code.
For each synchronizing object, Monitor keeps:
A reference to the thread that currently holds the lock on the synchronizing object; i.e. it is this thread's turn to execute.
A "ready" queue - the list of threads that are blocking until they are given the lock for this synchronizing object.
A "wait" queue - the list of threads that block until they are moved to the "ready" queue by Monitor.Pulse() or Monitor.PulseAll().
So, when a thread calls lock(o), it is placed in o's ready queue, until it is given the lock on o, at which time it continues executing its code.
And that should work, assuming you only have one process involved. You will want to use a Mutex if you want it to work across more than one process.
Oh, and the "o" object, should be a singleton or scoped across everywhere that lock is needed, as what is REALLY being locked is that object and if you create a new one, then that new one will not be locked yet.
The way you have it implemented is an acceptable way to do what you need. One way to improve it would be to use lock() on the dictionary itself, rather than on a second object used to synchronize the dictionary. That way, rather than passing around an extra object, the resource itself keeps track of whether there's a lock on its own monitor.
Using a separate object can be useful in some cases, such as synchronizing access to outside resources, but in cases like this it's overhead.
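In the question's terms, that just means dropping o and locking on the dictionary directly:

lock (resource)
{
    // read from (or write to) resource
}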