Advantage of upgradable readlock? - c#

I was wondering what the advantages are of using an upgradable read lock as opposed to performing these steps:
1. Take read lock
2. Check condition to see if we need to take write lock
3. Release read lock
4. Take write lock
5. Perform update
6. Release write lock
One apparent disadvantage of performing the above steps, as opposed to taking an upgradable read lock, is that there is a window of time between steps 3 and 4 where another thread can take the write lock.
Apart from this, what other advantages do you find in taking an upgradable read lock over the steps I have mentioned above?

Let's consider the different ways in which one can use a reader-writer lock that doesn't have a separate "upgradable reader".
With your pattern, there is a race between steps 3 and 4, as you point out, where another thread can take the writer lock. More to the point, there is a window between steps 3 and 4 in which a thread can take the writer lock and change the state we observed in step 2.
Therefore, we've got four choices, depending on how likely this is to happen:
1. We stay with your approach because this is actually impossible (e.g. a given state transition is one-way in our application, so once observed it is permanent). In this case though we could quite possibly have remodelled so as to not need a lock at all. (One-way transitions lend themselves to lock-free techniques).
2. We just take the writer lock in the first place, because the state we observe in step 2 is very likely to change and it's a waste of time checking it with a reader lock.
3. We change your steps to:
   1. Take read lock
   2. Check condition to see if we need to take write lock
   3. Release read lock
   4. Take write lock
   5. Re-check the condition in case it changed.
   6. Perform update
   7. Release write lock
4. We change to:
   1. Take a read lock on a recursion-supporting lock.
   2. Check to see if we need to take write lock.
   3. Take write lock (no release of read).
   4. Perform update.
   5. Release write lock.
   6. Release read lock.
It's not hard to see why 4 was more tempting to some, though it is only slightly harder to see how it makes deadlocks easy to create. Sadly, that slightly harder is just enough for a lot of people to see the advantages without seeing the disadvantages.
For anyone who doesn't spot it, if two threads have a read lock and one of them upgrades to a write lock it must wait for the other to release the read-lock. However if that second thread upgrades to a write lock without releasing the read-lock then it will wait forever on the first thread, which will wait forever on it.
As said above, just which approach is best depends on how likely it is for the state to change in the meantime (or how promptly we want to react to it, I suppose). Even the last approach with the non-releasing upgrade can have its place in viable code, as long as there can only ever be one thread that ever tries to upgrade its lock without releasing.
Aside from the special case where the last option works, the differences between the other options are all about performance, and which is most performant mostly depends on the cost of re-checking the state and the likelihood of the write being aborted due to a change in the meantime.
However, note that all of them involve taking a writer lock, so all of them have the effect of blocking all reading threads, even when the write is indeed aborted.
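The "release, re-acquire, re-check" variant described above might be sketched as follows. This is only an illustration under assumptions: the LazyValue class, its fields, and the constant 42 are hypothetical stand-ins for whatever state and update the real code has.

```csharp
using System.Threading;

// Hypothetical holder for a value that is computed at most once.
public sealed class LazyValue
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private bool _initialized;
    private int _value;

    // Counts how many times the update actually ran (for illustration).
    public int ComputeCount { get; private set; }

    public int GetOrCompute()
    {
        // Take the read lock, check the condition, release the read lock.
        _lock.EnterReadLock();
        try
        {
            if (_initialized) return _value;
        }
        finally
        {
            _lock.ExitReadLock();
        }

        // Another thread may take the write lock here and change the state.

        _lock.EnterWriteLock();
        try
        {
            // Re-check: what we observed under the read lock may be stale.
            if (!_initialized)
            {
                _value = 42; // stand-in for the real update
                _initialized = true;
                ComputeCount++;
            }
            return _value;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}
```

Note that every caller that fails the first check takes the write lock, which blocks all readers even when the re-check finds nothing left to do.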
Upgradable read-locks give us a middle ground because, while they block write-locks and other upgradable read-locks, they don't block read locks. They are perhaps better thought of not as read-locks that can upgrade, but as write-locks that have yet to commit to writing.* In the cases where it was decided not to upgrade, the effect on the reading threads is nil.
This means that if it's even slightly possible that the thread will decide not to change the state, the reading threads are not affected, and the performance improvement can certainly justify its use.
*For that matter, "reader-writer" is a bit of a misnomer: we could e.g. protect an array of ints or objects with a ReaderWriterLockSlim, use the read-locks for both reading and writing individual items atomically and use the write-locks for operations that need to read the entire array without parts of it changing as it reads. In such a case it's a reading operation that needs the exclusive lock, while writing operations are fine with the shared lock.
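As a minimal sketch of the upgradeable pattern with ReaderWriterLockSlim: the UniqueList class and its members below are hypothetical names chosen for illustration, not anything from the question.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical list that only ever stores each value once.
public sealed class UniqueList
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly List<int> _items = new List<int>();

    public bool AddIfMissing(int item)
    {
        // Blocks writers and other upgradeable readers, but not plain readers.
        _lock.EnterUpgradeableReadLock();
        try
        {
            // Decided not to write: plain readers were never blocked.
            if (_items.Contains(item)) return false;

            // Commit to writing. No other writer could have run since the
            // check above, so no re-check is needed.
            _lock.EnterWriteLock();
            try
            {
                _items.Add(item);
                return true;
            }
            finally
            {
                _lock.ExitWriteLock();
            }
        }
        finally
        {
            _lock.ExitUpgradeableReadLock();
        }
    }

    public int Count
    {
        get
        {
            _lock.EnterReadLock();
            try { return _items.Count; }
            finally { _lock.ExitReadLock(); }
        }
    }
}
```

Only one thread can be in upgradeable mode at a time, so callers of AddIfMissing serialize with each other; the gain is for threads that only call Count.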

It also prevents deadlocks that may happen because different threads operate at the same time and they wait for each other to release the locks.

Related

How can I share resources with threads in C#?

I have a thread reading from a specific PLC's memory and it works perfectly. Now what I want is to start another thread to test the behavior of the system (simulate the first thread) in case of a connectivity issue, and when everything is OK, continue the first thread. But I think I'll have problems with that because these two threads will need to use the same port.
My first idea was to abort the first thread, start the second one and, when everything's OK again, abort this thread and 'restart' the first one.
I've read some other forums and people say that aborting or suspending a thread is the worst solution, and I've read about synchronization of threads but I don't really know if this is useful in this case because I've never used it.
My question is, what is the correct way to solve this kind of situation?
You have a shared resource that you need to coordinate thread access to. There are a number of mechanisms in .NET available for that coordination.
There is a wonderful resource that provides both an introduction to thread concepts in .NET and an approachable discussion of advanced concepts:
http://www.albahari.com/threading/
In your case, have a look at the section on locking
Exclusive locking is used to ensure that only one thread can enter particular sections of code at a time. The two main exclusive locking constructs are lock and Mutex. Of the two, the lock construct is faster and more convenient. Mutex, though, has a niche in that its lock can span applications in different processes on the computer.
http://www.albahari.com/threading/part2.aspx#_Locking
You can structure your two threads so that they must acquire a specific lock to work with the port. Have your first thread release that lock before you start the second thread, then have the first thread wait to acquire that lock again (which the second thread will hold until done).

Forcing the modification of a particular object and any extensions to be atomic

I have a bit of a reader/writer problem.
I have a static collection of particular objects which can be accessed by many readers and writers at the same time.
The writers will obtain an item from the collection handler, and then modify it in some manner.
The readers may obtain that particular item from the collection handler and then display it in some manner.
Now I realise the standard solution to this problem is to create a lock (or a ReaderWriter one) and force each writer and reader to enter the lock. The problem is that I'm not the only one adding readers/writers, and so there is the chance that someone may forget to use the lock in their implementation and break everything.
So is there a way, from an object's side, to force all changes to be made in an atomic manner? Bear in mind that this object WILL be extended in multiple forms, and they must also be restrained in this manner.
If the consumers of your objects cannot be trusted to obey safety rules then you have a big problem. I can see two solutions:
1) Do nothing. If your consumers write broken code then the pain they find themselves in will be a good incentive towards not doing that again.
1a) You can make their pain more excruciating by, say, having your object somehow detect when it is being read without a read lock and throwing an exception. This is what we did when designing the threading rules for the JScript and VBScript engines; if the user ever calls a method on a thread that does not support that method then the engines return the "catastrophic failure" error code. Users of the engine do not make that mistake twice; they very quickly learn which is the right thread and which is the wrong thread, and stop calling on the wrong thread.
This is a much better solution than simply allowing the call to succeed most of the time and then die with some crazy race condition problem one time in a million.
2) Move the responsibility for obtaining the lock into the object. Let the consumers use the object normally; the object takes out its own reader-writer lock whenever it is read or written.
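One possible shape for option 2 is sketched below, assuming a base class that owns the lock and exposes guarded helpers to its subclasses. GuardedItem, NamedItem, Read, and Write are hypothetical names, not anything from the question.

```csharp
using System;
using System.Threading;

// Hypothetical base class: it owns the lock, so consumers never see it.
public abstract class GuardedItem
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    // Runs a read under the shared lock.
    protected T Read<T>(Func<T> read)
    {
        _lock.EnterReadLock();
        try { return read(); }
        finally { _lock.ExitReadLock(); }
    }

    // Runs a mutation under the exclusive lock.
    protected void Write(Action write)
    {
        _lock.EnterWriteLock();
        try { write(); }
        finally { _lock.ExitWriteLock(); }
    }
}

// Hypothetical extension: all state access goes through Read/Write.
public sealed class NamedItem : GuardedItem
{
    private string _name = "";
    private int _revision;

    public string Name => Read(() => _name);
    public int Revision => Read(() => _revision);

    // Both fields change together, inside one write lock.
    public void Rename(string name) => Write(() =>
    {
        _name = name;
        _revision++;
    });
}
```

This does not strictly force anything: a subclass author can still touch their own fields without going through Read or Write. It does make the safe path the default one for consumers, who never handle a lock at all.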

ConcurrentDictionary Object - Reading and writing via different threads

I want to use a ConcurrentDictionary in my app, but first I need to make sure I understand correctly how it works. In my app, I'll have one or more threads that write to, or delete from, the dictionary. And, I'll have one or more threads that read from the dictionary. Potentially, all at the same time.
Am I correct that the implementation of ConcurrentDictionary takes care of all the required locking for this to happen, and I don't need to provide my own locking? In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?
Thanks very much.
The current implementation uses a mixture of striped locks (the technique I suggested in an answer to someone yesterday at https://stackoverflow.com/a/11950835/400547) and thinking very very hard about the situations in which an operation cannot possibly cause problems for, or have problems caused by, a concurrent operation (there's quite a lot of these, but you have to be very sure if you make use of them).
As such if you have several operations happening on the concurrent dictionary at once, each of the following is possible:
1. No threads even lock, but everything happens correctly.
2. Some threads lock, but they lock on separate things, and there is no lock contention.
3. One or two threads have lock contention with each other, and are slowed down, but the effect upon performance is less than if there were a single lock.
4. One or two threads need to lock the entire thing for a while (generally for internal resizing) which blocks all the threads that could possibly be blocked in case 3 above, though some can keep going (those that read).
None of this involves dirty reads, which is a matter only vaguely related to locking (my own form of concurrent dictionary uses no locks at all, and it doesn't have dirty reads either).
This thread-safety doesn't apply to batches done by your code (if you read a value and then write a value, the value read may have changed before you finished the write), but note that some common cases which would require a couple of calls on Dictionary are catered for by single methods on ConcurrentDictionary (GetOrAdd and AddOrUpdate do things that would be two calls with a Dictionary so they can be done atomically - though note that the Func involved in some overloads may be called more than once).
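To illustrate the single-call methods mentioned above, here is a small sketch that counts words in parallel with AddOrUpdate. WordCounter is a hypothetical helper written for this example.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: counts occurrences without any explicit lock.
public static class WordCounter
{
    public static ConcurrentDictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new ConcurrentDictionary<string, int>();

        Parallel.ForEach(words, word =>
        {
            // Insert 1 if the key is absent, otherwise replace the value
            // with old + 1. The update delegate may be invoked more than
            // once under contention, so it must be free of side effects.
            counts.AddOrUpdate(word, 1, (key, old) => old + 1);
        });

        return counts;
    }
}
```

Doing the same with a separate read followed by a write on the indexer would lose increments under contention, which is exactly the batch problem described above.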
Due to this, there's no added danger with ConcurrentDictionary, so you should pick as follows:
If you're going to have to lock over some batches of operations that don't match what ConcurrentDictionary offers, such as:
lock(lockObj)
{
    var test = dict[key1];
    var test2 = dict[key2];
    if(test < test2 && test2 < dict[key3] && SomeOtherBooleanProducer())
        dict[key4] = SomeFactoryCall(key4);
}
Then you would have to lock on ConcurrentDictionary, and while there may be a way to combine that with what it offers in the way of support for concurrency, there probably won't, so just use Dictionary with a lock.
Otherwise it comes down to how many concurrent hits there will probably be. If you're mostly only going to have one thread hitting the dictionary, but you need to guard against the possibility of concurrent access, then you should definitely go for Dictionary with a lock. If you're going to have periods where half a dozen or more threads are hitting the dictionary, then you should definitely go for ConcurrentDictionary (if they're likely to be hitting the same small number of keys then take a look at my version because that's the one situation where I have better performance).
Just where the middle point between "few" and "many" threads lies, is hard to say. I'd say that if there are more than two threads on a regular basis then go with ConcurrentDictionary. If nothing else, demands from concurrency tend to increase throughout the lifetime of a project more often than they decrease.
Edit: To answer about the particular case you give, of one writer and one reader, there won't be any blocking at all, as that is safe for roughly the same reason why multiple readers and one writer is safe on Hashtable, though ConcurrentDictionary goes beyond that in several ways.
In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?
I don't believe it will block - it will just be safe. There won't be any corruption - you'll just have a race in terms of whether the read sees the write.
From a FAQ about the lock-free-ness of the concurrent collections:
ConcurrentDictionary<TKey,TValue> uses fine-grained locking when adding to or updating data in the dictionary, but it is entirely lock-free for read operations. In this way, it’s optimized for scenarios where reading from the dictionary is the most frequent operation.

Multi-threading concept and lock in c#

I read about lock, but I didn't understand it at all.
My question is: why do we use an otherwise unused object and lock on it, and how does this make something thread-safe or help with multi-threading? Isn't there another way to write thread-safe code?
public class test {
    private object Lock { get; set; }
    ...
    lock (this.Lock) { ... }
    ...
}
Sorry if my question is very stupid, but I don't understand it, although I've used it many times.
Accessing a piece of data from one thread while another thread is modifying it is called a "data race condition" (or just "data race") and can lead to corruption of data. (*)
Locks are simply a mechanism for avoiding data races. If two (or more) concurrent threads lock the same lock object, then they are no longer concurrent and can no longer cause data races, for the duration of the lock. Essentially, we are serializing the access to shared data.
The trick is to keep your locks as "wide" as you must to avoid data races, yet as "narrow" as you can to gain performance through concurrent execution. This is a fine balance that can easily go out of whack in either direction, which is why multi-threaded programming is hard.
Some guidelines:
As long as all threads are just reading the data and none will ever modify it, a lock is unnecessary.
Conversely, if at least one thread might at some point modify the data, then all concurrent code paths accessing that same data must be properly serialized through locks, even those that only read the data.
Using a lock in one code path but not the other will leave the data wide open to race conditions.
Also, using one lock object in one code path, but a different lock object in another (concurrent) code path does not serialize these code paths and leaves you wide open to data races.
On the other hand, if two concurrent code paths access different data, they can use different lock objects. But, whenever there is more than one lock object, watch out for deadlocks. A deadlock is often also a "code race condition" (and a heisenbug, see below).
The lock object does not need to be (and usually isn't) the same thing as the data you are trying to protect. Unfortunately, there is no language facility that lets you "declare" which data is protected by which lock object, so you'll have to very carefully document your "locking convention" both for other people that might maintain your code, and for yourself (since even after a short time you will forget some of the nooks and crannies of your locking convention).
It's usually a good idea to protect the lock object from the outside world as much as you can. After all, you are using it for the very sensitive task of locking and you don't want it locked by external actors in unforeseen ways. That's why using this or a public field as a lock object is usually a bad idea.
The lock keyword is simply a more convenient syntax for Monitor.Enter and Monitor.Exit.
The lock object can be any object in .NET, but value objects will be boxed in the call to Monitor.Enter, which means threads will not share the same lock object, leaving the data unprotected. Therefore, only use reference types as lock objects.
For inter-process communication you can use a global mutex, which can be created by passing a non-empty name to Mutex Constructor. Global mutexes provide essentially the same functionality as regular "local" locking, except they can be shared between separate processes.
There are synchronization mechanisms other than locks, such as semaphores, condition variables, message queues or atomic operations. Be careful when mixing different synchronization mechanisms.
Locks also behave as memory barriers, which is increasingly important on modern multi-core, multi-cache CPUs. This is part of the reason why you need locks on reading the data and not just writing.
(*) It is called "race" because concurrent threads are "racing" towards performing an operation on the shared data and whoever wins that race determines the outcome of the operation. So the outcome depends on timing of the execution, which is essentially random on modern preemptive multitasking OSes. Worse yet, timing is easily modified by a simple act of observing the program execution through tools such as debugger, which makes them "heisenbugs" (i.e. the phenomenon being observed is changed by the mere act of observation).
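A few of the guidelines above can be put together in one small sketch: a private reference-type lock object, the same lock on both the write and the read paths, and the Monitor calls that the lock statement roughly stands for. HitCounter and its members are hypothetical names.

```csharp
using System.Threading;

// Hypothetical counter guarded by a private lock object.
public sealed class HitCounter
{
    // Private and a reference type: no outside code can lock on it,
    // and there is no boxing issue.
    private readonly object _sync = new object();
    private int _hits;

    // The convenient form.
    public void RecordWithLock()
    {
        lock (_sync)
        {
            _hits++;
        }
    }

    // Roughly what the compiler emits for the lock statement above
    // (C# 4 and later); it serializes against the same _sync object.
    public void RecordWithMonitor()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_sync, ref lockTaken);
            _hits++;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(_sync);
        }
    }

    // Readers take the same lock as writers.
    public int Hits
    {
        get { lock (_sync) { return _hits; } }
    }
}
```

Because both methods use the same lock object, calls to them from different threads are serialized against each other as well as against the reader.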
A lock object is like a door into a single room that only one guest at a time can enter.
The room can be your data; the guest can be your function.
define data (room)
add door (lock object)
invite guests (functions)
use the lock instruction to close/open the door so that only one guest at a time can enter the room.
Why do we need this? If you simultaneously write data to a file (just an example; there can be thousands of others), you will need to synchronize your functions' access to the file (close/open the door for guests), so that each function appends to the end of the file (assuming that is the requirement in this example).
This is naturally not the only way to synchronize threads; there are more out there:
Monitors
Wait handles
...
Check out the link for complete information and a description of each of them
Thread Synchronization
Yes, there is indeed another way:
using System.Runtime.CompilerServices;
class Test
{
    private object Lock { get; set; }
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Foo()
    {
        // Now this instance is locked
    }
}
While it looks more "natural", it's not used often, because of the fact that the object is locking on itself this way, so other code could not risk locking on this object -- it could cause a deadlock.
Because of this, you usually create a (lazy-initialized) private field referring to an object, and use that object as a lock instead. This will guarantee that no one else can lock against the same object as you.
A little more detail on what's happening beneath the hood:
When you "lock on an object", you're not locking on the object itself. Rather, you're using the object as a guaranteed-to-be-unique-address-in-memory throughout your program. When you "lock", the runtime takes the object's address, uses it to look up the actual lock inside another table (which is hidden from you), and uses that object as the "lock" (also known as a "critical section").
So really, for you, an object is just a proxy/symbol -- it isn't doing anything by itself; it's just acting as a unique indicator that will never clash with another valid object in the same program.
When different threads access the same variable/resource at the same time, they may overwrite each other's changes and you can get unexpected results. A lock makes sure only one thread can access the variable at a time; the remaining threads queue for access to the variable/resource until the lock is released.
Suppose we have a balance variable for an account.
Two different threads read its value, which is 100.
Suppose the first thread adds 50 to it (100 + 50) and saves it, so the balance is now 150.
The second thread has already read 100 in the meantime. Suppose it subtracts 50 (100 - 50) and saves 50. The point to note is that the first thread has already made the balance 150, so the second thread should have computed 150 - 50. This could cause serious problems.
So a lock makes sure that when one thread wants to change the state of some resource, it locks it and releases the lock only after committing the change.
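The balance scenario above could be guarded along these lines. The Account class is a hypothetical illustration; the point is that each read-modify-write happens entirely inside the lock.

```csharp
// Hypothetical account whose balance is only touched under a lock.
public sealed class Account
{
    private readonly object _sync = new object();
    private decimal _balance;

    public Account(decimal openingBalance)
    {
        _balance = openingBalance;
    }

    public void Deposit(decimal amount)
    {
        // Read, add, and store as one step; no other thread can
        // interleave between the read and the write.
        lock (_sync)
        {
            _balance += amount;
        }
    }

    public void Withdraw(decimal amount)
    {
        lock (_sync)
        {
            _balance -= amount;
        }
    }

    public decimal Balance
    {
        get { lock (_sync) { return _balance; } }
    }
}
```

With an opening balance of 100, one thread depositing 50 and another withdrawing 50 always leaves 100, whichever runs first.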
The lock statement introduces the concept of mutual exclusion. Only one thread can acquire a lock on a given object at any one time. This prevents threads from accessing shared data structures concurrently, thus corrupting them.
If other threads already hold a lock, the lock statement will block until it is able to acquire an exclusive lock on its argument before allowing its block to execute.
Note that the only thing lock does is control entry to the block of code. Access to members of the class is completely unrelated to the lock. It is up to the class itself to ensure that accesses that must be synchronized are coordinated by the use of lock or other synchronization primitives. Also note that access to some or all members may not have to be synchronized. For instance, if you want to maintain a counter, you could use the Interlocked class without locking.
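For the counter case mentioned above, a sketch using the Interlocked class might look like this. RequestCounter is a hypothetical name.

```csharp
using System.Threading;

// Hypothetical counter that needs no lock statement at all.
public sealed class RequestCounter
{
    private int _count;

    // Atomically adds one and returns the new value.
    public int Increment()
    {
        return Interlocked.Increment(ref _count);
    }

    // Reads the current value without caching a stale copy.
    public int Value
    {
        get { return Volatile.Read(ref _count); }
    }
}
```

This only works because the whole operation is a single atomic step; as soon as two fields have to change together, a lock (or a more careful design) is needed again.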
An alternative to locking is lock-free data structures, which behave correctly in the presence of multiple threads. Operations on lock-free data structures must be designed very carefully, usually with the assistance of lock-free primitives such as compare-and-swap (CAS).
The general theme of such techniques is to try to perform operations on data structures atomically and detect when operations fail due to concurrent actions by other threads, followed by retries. This works well on a lightly loaded system where failures are unlikely, but can produce runaway behaviour as the failure rate climbs and retries become a dominant load. This problem can be ameliorated by backing off the retry rate, effectively throttling the load.
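The try, detect failure, retry loop described above can be sketched with Interlocked.CompareExchange. HighWaterMark is a hypothetical class that tracks the largest value reported so far, and the SpinWait call is one simple way to back off between retries.

```csharp
using System.Threading;

// Hypothetical lock-free "maximum seen so far" tracker.
public sealed class HighWaterMark
{
    private int _max;

    public int Value
    {
        get { return Volatile.Read(ref _max); }
    }

    public void Report(int candidate)
    {
        var spinner = new SpinWait();
        while (true)
        {
            int seen = Volatile.Read(ref _max);

            // Nothing to do if the stored value is already at least as large.
            if (candidate <= seen) return;

            // Store candidate only if _max still equals what we read.
            // CompareExchange returns the value it found in _max.
            if (Interlocked.CompareExchange(ref _max, candidate, seen) == seen)
                return;

            // Another thread changed _max first: back off a little, retry.
            spinner.SpinOnce();
        }
    }
}
```

Under light contention the loop almost always succeeds on the first pass; under heavy contention the retries are where the cost shows up.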
A more sophisticated alternative is software transactional memory. Unlike CAS, STM generalizes the concept of fail-and-retry to arbitrarily complex memory operations. In simple terms, you start a transaction, perform all your operations, and finally commit. The system detects if the operations cannot succeed due to conflicting operations performed by other threads that beat the current thread to the punch. In such cases, STM can either fail outright, requiring the application to take corrective action, or, in more sophisticated implementations, it can automatically go back to the start of the transaction and try again.
Your confusion is pretty typical for those just getting familiar with the lock keyword in C#. You are right, the object used in the lock statement is really nothing more than a token that defines a critical section. That object, in no way, has any protection from multithreaded access itself.
The way this works is that the CLR reserves a 4 byte (32-bit systems) section in the object header (type handle) called the sync block. The sync block is nothing more than an index into an array that stores the actual critical section information. When you use the lock keyword the CLR will modify this sync block value accordingly.
There are advantages and disadvantages to this scheme. The advantage is that it made for a fairly elegant solution to defining critical sections. One obvious disadvantage is that each object instance contains the sync block and most instances never use it so it would seem to be a waste of space in most cases. Another disadvantage is that boxed value types can be used which is almost always wrong and certainly leads to confusion.
I remember way back when .NET was first released that there was a lot of chatter over whether the lock keyword was good or bad for the language. The general consensus (at least as I remember it) was that it was bad because the using keyword could have been easily used instead. In fact, a solution that used the using keyword actually would have made more sense because it could have been done without the need for the sync block. The C# design team even went on record to say that had they been given a second chance the lock keyword never would have made it into the language. [1]
[1] The only reference I could find for this is on Jon Skeet's website here.

Is ReaderWriterLockSlim.EnterUpgradeableReadLock() essentially the same as Monitor.Enter()?

So I have a situation where I may have many, many reads and only the occasional write to a resource shared between multiple threads.
A long time ago I read about ReaderWriterLock, and have read about ReaderWriterGate which attempts to mitigate the issue where many writes coming in trump reads and hurt performance. However, now I've become aware of ReaderWriterLockSlim...
From the docs, I believe that there can only be one thread in "upgradeable mode" at any one time. In a situation where the only access I'm using is EnterUpgradeableReadLock() (which is appropriate for my scenario) then is there much difference to just sticking with lock(){}?
Here's the excerpt:
A thread that tries to enter upgradeable mode blocks if there is already a thread in upgradeable mode, if there are threads waiting to enter write mode, or if there is a single thread in write mode.
Or, does the recursion policy make any difference to this?
Agreed. If all of your threads need to acquire an upgradable read lock and you cannot afford to release a read lock and acquire a write lock then ReaderWriterLockSlim is no improvement over a simple exclusive lock. Recursion does not change that. RWLS and the need to avoid the ever present danger of deadlock heavily favor a pattern where a single thread does the writing.
I don't have all your answers, but I'll give it a shot:
The lock statement in c# is syntactic sugar for calling Monitor.Enter and Monitor.Exit. The effect is that only one thread can access the code within the lock at a time.
lock (lockObj) // lockObj: a private reference-type object shared by the threads
{
    // only one thread can access this code at a time
}
The issue with this is that multiple reads are harmless, but lock() blocks anyway. ReaderWriterLockSlim allows multiple reads, only one write. It's an attempt to improve efficiency.
The recursion policy is something you must specify - by default it is off. Don't know too much more beyond that, but hope that helps a little.
