Which is preferred in a multi-threaded application:
Dictionary with lock object
or
ConcurrentDictionary
Which is more efficient, and why should I use one or the other?
edit 1: Storing Guid as key and bool as value.
edit 2: more than 2 worker threads and one UI thread.
I would say you've got the following options.
Some new Framework 4.0 classes:
ConcurrentDictionary. Fast and reliable.
ConcurrentBag. It's an unordered collection of objects, so it works faster, but it only suits you if you don't need ordering.
ConcurrentStack. It's an implementation of the classic LIFO (Last-In First-Out) data structure that provides thread-safe access without the need for external synchronization.
ConcurrentQueue. It's a thread-safe FIFO (first in-first out) collection.
All the new 4.0 classes work faster, but they have some quirks mentioned by levanovd. You can find a performance comparison of these classes here.
Some classic solutions from earlier versions:
Dictionary + Monitor. A simple wrapper around lock.
Dictionary + ReaderWriterLock. Better than the previous one, because it has separate read and write locks, so several threads can read at once but only one can write.
Dictionary + ReaderWriterLockSlim. It's just an optimisation of the previous one.
Hashtable. In my experience, it's the slowest method. Check the Hashtable.Synchronized() method; it's a ready-to-go solution from Microsoft.
If I had a restriction of using Framework v3.5, I would use Dictionary + ReaderWriterLock or ReaderWriterLockSlim.
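A minimal sketch of the Dictionary + ReaderWriterLockSlim option, using the Guid/bool types from the question. The names flags, rw, SetFlag and TryGetFlag are made up for illustration, and the top-level statements assume C# 9 or later (on 3.5 the same calls would live inside a class, with plain threads instead of Parallel.For).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names: flags, rw, SetFlag, TryGetFlag are illustrative only.
var flags = new Dictionary<Guid, bool>();
var rw = new ReaderWriterLockSlim();

void SetFlag(Guid key, bool value)
{
    rw.EnterWriteLock();            // exclusive: blocks readers and other writers
    try { flags[key] = value; }
    finally { rw.ExitWriteLock(); }
}

bool TryGetFlag(Guid key, out bool value)
{
    rw.EnterReadLock();             // shared: many readers may hold this at once
    try { return flags.TryGetValue(key, out value); }
    finally { rw.ExitReadLock(); }
}

var keys = new Guid[1000];
for (int i = 0; i < keys.Length; i++) keys[i] = Guid.NewGuid();

// Several workers write while also reading keys written by other workers.
Parallel.For(0, keys.Length, i =>
{
    SetFlag(keys[i], i % 2 == 0);
    TryGetFlag(keys[(i * 7) % keys.Length], out _);
});

Console.WriteLine(flags.Count);
```

Every access, reads included, goes through the lock; the read lock only helps when reads clearly outnumber writes.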
Read carefully about ConcurrentDictionary. It has some non-obvious behaviors.
Here are some of them:
If two threads call AddOrUpdate, there are no guarantees about which of the factory delegates will be called, and no guarantee that an item produced by a factory delegate will actually be stored in the dictionary.
The enumerator obtained by a GetEnumerator call is not a snapshot; the dictionary may be modified during enumeration (that doesn't cause any exceptions).
The Keys and Values properties are snapshots of the corresponding collections and may not correspond to the actual dictionary state.
etc.
So please read about ConcurrentDictionary again and decide if this behavior is what you need.
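To make the AddOrUpdate point concrete, here is a small sketch that counts how often the delegates run. The key "hits" and the counters are made up for illustration, and the exact number of delegate calls will vary from run to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var counts = new ConcurrentDictionary<string, int>();
int delegateCalls = 0;   // how many times either delegate was invoked

Parallel.For(0, 100, iteration =>
{
    counts.AddOrUpdate(
        "hits",
        key => { Interlocked.Increment(ref delegateCalls); return 1; },
        (key, old) => { Interlocked.Increment(ref delegateCalls); return old + 1; });
});

// The stored value reflects exactly one applied update per AddOrUpdate call,
// but a delegate may have run more often: a result is discarded and recomputed
// when another thread changed the entry in the meantime.
Console.WriteLine($"value={counts["hits"]}, delegate calls={delegateCalls}");
```

So the delegates should be cheap and free of side effects, since you cannot rely on each one running exactly once.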
Hope this helps!
When you implement a dictionary with a lock object, your main concern seems to be thread safety. ConcurrentDictionary already handles that concern, so I think there is no point in re-inventing the wheel.
I think both will provide thread safety, but using a Dictionary with a lock object limits the number of threads that can access the Dictionary concurrently to 1. With ConcurrentDictionary, you can specify a concurrency level (i.e. the estimated number of threads that will update the dictionary concurrently). If performance matters, I believe ConcurrentDictionary should be your choice.
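For reference, a sketch of passing the concurrency level through the ConcurrentDictionary(int concurrencyLevel, int capacity) constructor. The values chosen here (processor count, 1024) are arbitrary assumptions, not recommendations.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// concurrencyLevel is a sizing hint for the internal locks (the estimated number
// of threads updating at once); it does not cap how many threads may call in.
var flags = new ConcurrentDictionary<Guid, bool>(
    concurrencyLevel: Environment.ProcessorCount,
    capacity: 1024);

// Many workers add entries without any external lock.
Parallel.For(0, 10_000, i => flags.TryAdd(Guid.NewGuid(), i % 2 == 0));

Console.WriteLine(flags.Count);
```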
Related
I have a dictionary that only supports add and modify operations and can be operated on concurrently, but always for different keys. Keys are int and values are a reference type. Also, "modify" means changing some properties of a value.
My questions are:
Do I need to use ConcurrentDictionary in this scenario? If needed, how does it help?
If concurrent modification can happen on the same key, will ConcurrentDictionary help to ensure thread safety? My understanding is no, is that correct?
Thanks!
Do I need to use ConcurrentDictionary in this scenario? If needed, how does it help?
Yes, the standard dictionary will not behave correctly if more than one thread adds or removes entries at the same time (although it is safe for multiple threads to read from it at the same time if no others are modifying it).
If concurrent modification can happen on the same key, will ConcurrentDictionary help to ensure thread safety? My understanding is no, is that correct?
If you are asking "Will the concurrent dictionary prevent multiple threads from accessing the values inside the dictionary at the same time?", then no, it will not.
If you want to prevent multiple threads from accessing the same value at the same time you will need to use some sort of concurrency control, such as locking.
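A sketch of what that looks like: the dictionary lookup is safe on its own, but the object it returns needs its own protection. A List<string> stands in for the asker's reference type here, which is an assumption for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<int, List<string>>();
dict[1] = new List<string>();

Parallel.For(0, 1000, i =>
{
    var value = dict[1];     // thread-safe: the dictionary guards its own structure
    lock (value)             // not guarded by the dictionary: the object stored in it
    {
        value.Add("item " + i);
    }
});

Console.WriteLine(dict[1].Count);
```

Locking on the value itself is just one option; a dedicated lock object per value works the same way.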
Best way to find this out is check MSDN documentation.
For ConcurrentDictionary the page is http://msdn.microsoft.com/en-us/library/dd287191.aspx
Under thread safety section, it is stated "All public and protected members of ConcurrentDictionary<TKey, TValue> are thread-safe and may be used concurrently from multiple threads."
I want to use a ConcurrentDictionary in my app, but first I need to make sure I understand correctly how it works. In my app, I'll have one or more threads that write to, or delete from, the dictionary. And, I'll have one or more threads that read from the dictionary. Potentially, all at the same time.
Am I correct that the implementation of ConcurrentDictionary takes care of all the required locking for this to happen, and I don't need to provide my own locking? In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?
Thanks very much.
The current implementation uses a mixture of striped locks (the technique I suggested in an answer to someone yesterday at https://stackoverflow.com/a/11950835/400547) and thinking very very hard about the situations in which an operation cannot possibly cause problems for, or have problems caused by, a concurrent operation (there's quite a lot of these, but you have to be very sure if you make use of them).
As such if you have several operations happening on the concurrent dictionary at once, each of the following is possible:
No threads even lock, but everything happens correctly.
Some threads lock, but they lock on separate things, and there is no lock contention.
One or two threads have lock contention with each other, and are slowed down, but the effect upon performance is less than if there were a single lock.
One or two threads need to lock the entire thing for a while (generally for internal resizing) which blocks all the threads that could possibly be blocked in case 3 above, though some can keep going (those that read).
None of this involves dirty reads, which is a matter only vaguely related to locking (my own form of concurrent dictionary uses no locks at all, and it doesn't have dirty reads either).
This thread-safety doesn't apply to batches done by your code (if you read a value and then write a value, the value read may have changed before you finished the write), but note that some common cases which would require a couple of calls on Dictionary are catered for by single methods on ConcurrentDictionary (GetOrAdd and AddOrUpdate do things that would be two calls with a Dictionary so they can be done atomically - though note that the Func involved in some overloads may be called more than once).
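As a sketch of the "value read may have changed" problem: when no single method covers a read-then-write, TryUpdate lets you write only if the value is still the one you read, and retry otherwise. The key "widget" and the name stock are made up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var stock = new ConcurrentDictionary<string, int>();
stock["widget"] = 0;

Parallel.For(0, 1000, iteration =>
{
    while (true)
    {
        int seen = stock["widget"];                     // read
        if (stock.TryUpdate("widget", seen + 1, seen))  // write only if still unchanged
            break;
        // Another thread changed the value between our read and write; retry.
    }
});

Console.WriteLine(stock["widget"]);
```

A plain stock["widget"] = stock["widget"] + 1 would be two separately safe calls and could silently lose increments.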
Due to this, there's no added danger with ConcurrentDictionary, so you should pick as follows:
If you're going to have to lock over some batches of operations that don't match what ConcurrentDictionary offers like e.g.:
lock (lockObj)
{
    var test = dict[key1];
    var test2 = dict[key2];
    if (test < test2 && test2 < dict[key3] && SomeOtherBooleanProducer())
        dict[key4] = SomeFactoryCall(key4);
}
Then you would have to lock on ConcurrentDictionary, and while there may be a way to combine that with what it offers in the way of support for concurrency, there probably won't be, so just use Dictionary with a lock.
Otherwise it comes down to how many concurrent hits there will probably be. If you're mostly only going to have one thread hitting the dictionary, but you need to guard against the possibility of concurrent access, then you should definitely go for Dictionary with a lock. If you're going to have periods where half a dozen or more threads are hitting the dictionary, then you should definitely go for ConcurrentDictionary (if they're likely to be hitting the same small number of keys then take a look at my version because that's the one situation where I have better performance).
Just where the middle point between "few" and "many" threads lies, is hard to say. I'd say that if there are more than two threads on a regular basis then go with ConcurrentDictionary. If nothing else, demands from concurrency tend to increase throughout the lifetime of a project more often than they decrease.
Edit: To answer about the particular case you give, of one writer and one reader, there won't be any blocking at all, as that is safe for roughly the same reason why multiple readers and one writer is safe on Hashtable, though ConcurrentDictionary goes beyond that in several ways.
In other words, if one thread is writing to, or deleting from, the dictionary, a reading thread (or another write thread) will be blocked until the update or delete is finished?
I don't believe it will block - it will just be safe. There won't be any corruption - you'll just have a race in terms of whether the read sees the write.
From a FAQ about the lock-free-ness of the concurrent collections:
ConcurrentDictionary<TKey,TValue> uses fine-grained locking when adding to or updating data in the dictionary, but it is entirely lock-free for read operations. In this way, it’s optimized for scenarios where reading from the dictionary is the most frequent operation.
Many threads have access to summary. Each thread will have an unique key for accessing the dictionary;
Dictionary<string, List<Result>> summary;
Do I need locking for following operations?
summary[key] = new List<Result>()
summary[key].Add(new Result());
It seems that I don't need locking because each thread will access the dictionary with a different key, but won't the first operation be problematic, since it adds a new record to the dictionary concurrently with other threads?
Yes, you need to use locking.
Dictionary is not thread safe for add operations.
If you are on .NET 4 you may consider switching to ConcurrentDictionary. Otherwise you should create your own thread safe collection (such as this).
Consider using a ReaderWriterLockSlim for synchronizing access to your collection (in case you won't use ConcurrentDictionary).
All write accesses to your dictionary must be locked. There is no guarantee that accessing different keys is thread safe, and, in fact, it isn't.
From MSDN:
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
By default Dictionary is not thread safe. It does not matter what you are going to add. And most importantly: you cannot control concurrent additions from different threads. So you need locks for sure. Or switch to a thread-safe collection (e.g. ConcurrentDictionary for .NET 4+).
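A sketch of the asker's scenario on .NET 4+, assuming each worker really does use its own key. List<int> stands in for List<Result>, and the "worker-N" keys are made up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

var summary = new ConcurrentDictionary<string, List<int>>();

Parallel.For(0, 8, worker =>
{
    string key = "worker-" + worker;   // each worker has a unique key

    // Adding the entry changes the shared dictionary, so it must be thread-safe.
    var results = summary.GetOrAdd(key, k => new List<int>());

    // The list is only ever touched by this worker, so it needs no lock.
    for (int n = 0; n < 100; n++) results.Add(n);
});

Console.WriteLine(summary.Count);
```

If two threads could ever share a key, the inner List would need its own synchronization as well.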
While reading Joe Albahari's excellent book "Threading in C#" I came across the following ambiguous sentence:
A thread-safe type does not necessarily make the program using it thread-safe, and often the work involved in the latter makes the former redundant.
(You can find the sentence on this page; just search for "indeterminacy" to quickly jump to the appropriate section.)
I am looking to use a ConcurrentDictionary to implement certain thread-safe data structures. Is the paragraph telling me that ConcurrentDictionary does not guarantee thread-safe writes to my data structure? Can someone please provide a counter-example that shows a thread-safe type actually failing to provide thread safety?
Thanks in advance for your help.
At the simplest, a thread safe list or dictionary is a good example; having each individual operation thread safe isn't always enough - for example, "check if the list is empty; if it is, add an item" - even if all thread-safe, you can't do:
if(list.Count == 0) list.Add(foo);
as it could change between the two. You need to synchronize the test and the change.
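Roughly what "synchronize the test and the change" means in code. The names list and gate are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var list = new List<string>();
var gate = new object();

Parallel.For(0, 100, i =>
{
    lock (gate)   // the check and the add now happen as one unit
    {
        if (list.Count == 0) list.Add("added by iteration " + i);
    }
});

Console.WriteLine(list.Count);
```

Without the lock, two threads could both see Count == 0 and both add, even if every individual list call were thread-safe.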
My understanding of the warning is that just because you are using thread safe variables does not mean that your program is thread safe.
As an example, consider a class that has two variables that can be modified from two threads. Just because these variables are individually thread safe doesn't guarantee atomicity of modifications to the class. If there are two threads modifying these variables, it is possible that one variable will end up with the value set by one thread, while the other gets set by another thread. This can easily break the internal consistency of the class.
Was doing some searching a while back to fix a problem I had with some threading and came across this page:
http://www.albahari.com/threading/part2.aspx#_Thread_Safety
Particularly the section on "Locking around thread-safe objects"
From the page:
Sometimes you also need to lock around accessing thread-safe objects. To illustrate, imagine that the Framework’s List class was, indeed, thread-safe, and we want to add an item to a list:
if (!_list.Contains (newItem)) _list.Add (newItem);
Whether or not the list was thread-safe, this statement is certainly not!
I think what he means is that just using ConcurrentDictionary instead of Dictionary everywhere isn't going to make the program thread-safe. So, if you have a non-thread-safe program, a search and replace isn't going to help; likewise, adding SynchronizedAttribute everywhere isn't going to work like a magic fairy dust. This is particularly true regarding collections, where iteration is always a problem[1].
On the other hand, if you restructure the non-thread-safe program into a more thread-safe design, then you often don't need thread-safe data structures. One popular approach is to redefine the program in terms of "actors" that send "messages" to each other - aside from a single producer/consumer-style message queue, each actor can stand alone and does not need to use thread-safe data structures internally.
[1] The first release of BCL collections included some "thread-safe" collections that just plain were not thread-safe during iterations. The Concurrent collections are thread-safe during iteration, but iterate concurrently with other threads' modifications. Other collection libraries allow "snapshots" which can then be iterated, ignoring modifications from other threads.
It's a bit of a vague statement, but consider for example, a class has two members, each of which is thread-safe, but that must both be updated in an atomic manner.
In dealing with that situation, you're likely to make that entire operation atomic, and thus thread-safe, rendering the thread-safe access to the individual members irrelevant.
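A small sketch of that situation, with two plain variables standing in for the two members (width and height are made-up names, and the invariant is simply that they are always equal):

```csharp
using System;
using System.Threading.Tasks;

int width = 0, height = 0;   // invariant: width == height
int tornReads = 0;           // reads that observed the invariant broken
var gate = new object();

Parallel.Invoke(
    () =>
    {
        for (int i = 1; i <= 100_000; i++)
            lock (gate) { width = i; height = i; }       // both members change together
    },
    () =>
    {
        for (int i = 0; i < 100_000; i++)
            lock (gate) { if (width != height) tornReads++; }
    });

Console.WriteLine(tornReads);
```

Once the lock covers the whole update, making width and height individually thread-safe would add nothing, which is the "redundant" part of the quoted sentence.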
It doesn't mean that your ConcurrentDictionary is going to behave in an unsafe way.
My concise explanation is this. There are many forms of thread safety and code that satisfies one form does not automatically satisfy all the others.
Roy,
I guess you're "over-reading" a too-concise sentence... I interpret that sentence as meaning two things:
"Just using threadsafe data-structures doesn't mean your program handles multithreading properly... any more than the presence of threadsafe data-structures inherently makes your program multithreaded"; and he then goes on to say
"Unless you're prepared to put in 'the hard yards' involved (it often requires a very precise understanding of quite complex scenarios) to make your WHOLE program handle threading properly, using a threadsafe data-structure is basically a waste of clock-ticks."
Ergo: Multi-threading is pretty hard, using appropriate out-of-the-box data structures is an important part of any solution, but it's certainly NOT the whole solution... and unless you're prepared to think it through (i.e. do your synchronization homework) you're just kidding yourself that a data structure will somehow magically "fix" your program.
I know that sounds "a bit harsh" but my perception is that a lot of noobs are really disappointed when they discover that programming (still, in this enlightened age of animated icons and GUI painters) requires Deep Thought. Who'd've thunk it?!?!
Cheers. Keith.
Is the paragraph telling me that ConcurrentDictionary does not guarantee thread-safe writes to my data structure?
No, that is not what Joe Albahari means. ConcurrentDictionary will always maintain a consistent state through simultaneous writes from multiple threads. Another thread will never see the data structure in an inconsistent state.
Can someone please provide a counter-example that shows a thread-safe type actually failing to provide thread safety?
However, a series of reads and writes from a thread-safe type may still fail in a multithreaded environment.
void ExecutedByMultipleThreads(ConcurrentQueue<object> queue)
{
    object value;
    if (!queue.IsEmpty)
    {
        queue.TryDequeue(out value);
        Console.WriteLine(value.GetHashCode());
    }
}
So clearly ConcurrentQueue is a thread-safe type, but this program can still fail with a NullReferenceException if another thread dequeues the last item between the IsEmpty check and the TryDequeue call. The data structure itself still honors its thread-safety guarantee by remaining in a consistent state, but the program is not thread-safe, because the assumptions it makes about thread safety in general are not correct. In this case it's the program that is incorrect, not the data structure.
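For contrast, a sketch of a version that doesn't make that assumption: it drops the separate IsEmpty check and relies on the return value of TryDequeue, which tests and removes in one step. The counter processed is only there to show the outcome.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var queue = new ConcurrentQueue<object>();
for (int i = 0; i < 100; i++) queue.Enqueue(i);

int processed = 0;

// 200 attempts compete for 100 items; half of them will find the queue empty.
Parallel.For(0, 200, attempt =>
{
    if (queue.TryDequeue(out var value))   // true only if this thread got an item
    {
        value.GetHashCode();               // safe: value was really dequeued
        Interlocked.Increment(ref processed);
    }
});

Console.WriteLine(processed);
```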
lock (dictionaryX)
{
    dictionaryX.TryGetValue(key, out value);
}
Is locking necessary while doing lookups in a Dictionary?
The program is multithreaded, and the dictionary is locked while key/value pairs are being added to it.
As mentioned here:
Using TryGetValue() without locking is not safe. The dictionary is temporarily in a state that makes it unsuitable for reading while another thread is writing the dictionary. A dictionary will reorganize itself from time to time as the number of entries it contains grows. When you read at the exact time this re-organization takes place, you'll run the risk of finding the wrong value for the key when the buckets got updated but not yet the value entries.
UPDATE:
take a look at "Thread Safety" part of this page too.
As with many subtle questions in programming, the answer is: Not necessarily.
If you only add values as an initialization, then the subsequent reading does not need to be synchronized. But, on the other hand, if you're going to be reading and writing at all times, then absolutely you need to protect that resource.
However, a full-blown lock may not be the best way, depending on the amount of traffic your Dictionary gets. Try a ReaderWriterLockSlim if you are using .NET 3.5 or greater.
If you have multiple threads accessing the Dictionary, then you do need to lock on updates and on lookup. The reason you need to lock on lookup is that there could be an update taking place at the same time you're doing the lookup, and the Dictionary could be in an inconsistent state during the update. For example, imagine that you have one thread doing this:
if (myDictionary.TryGetValue(key, out value))
{
}
and a separate thread is doing this:
myDictionary.Remove(key);
What could happen is that the thread doing the TryGetValue determines that the item is in the dictionary, but before it can retrieve the item, the other thread removes it. The result would be that the thread doing the lookup would either throw an exception or TryGetValue would return true but the value would be null or possibly an object that does not match the key.
That's only one thing that can happen. Something similarly disastrous can happen if you're doing a lookup on one thread and another thread does an add of the value that you're trying to look up.
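A sketch of both threads taking the same lock, so a lookup can never overlap a removal. The lock object sync and the sample data are made up; how many lookups succeed depends on timing.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var myDictionary = new Dictionary<int, string>();
var sync = new object();
for (int i = 0; i < 10_000; i++) myDictionary[i] = "value " + i;

int hits = 0;

Parallel.Invoke(
    () =>
    {
        for (int i = 0; i < 10_000; i++)
            lock (sync) { myDictionary.Remove(i); }      // writer
    },
    () =>
    {
        for (int i = 0; i < 10_000; i++)
            lock (sync)                                   // reader uses the SAME lock
            {
                if (myDictionary.TryGetValue(i, out var value)) hits++;
            }
    });

Console.WriteLine($"remaining={myDictionary.Count}, hits={hits}");
```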
Locking is only needed when you are synchronizing access to a resource between threads. As long as there are not multiple threads involved, locking is not needed here.
In the context of updating and reading the value from multiple threads, yes a lock is absolutely necessary. In fact if you're using 4.0 you should consider switching to one of the collections specifically designed for concurrent access.
Use the new ConcurrentDictionary<TKey, TValue> object and you can forget about having to do any locks.
Yes you need to lock the dictionary for access in a multithreaded environment. The writing to a dictionary is not atomic, so it could add the key, but not the value. In that case when you access it, you could get an exception.
If you are on .NET 4, you can replace it with ConcurrentDictionary to do this safely. There are other similar collections, preferred when you need multithreaded access, in the System.Collections.Concurrent namespace.
Don't use roll-your-own locking if this is an option for you.
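A sketch of the lock-free-looking version with ConcurrentDictionary, using only its Try* methods. Keys and values are made up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<int, string>();

Parallel.For(0, 1000, i =>
{
    dict.TryAdd(i, "value " + i);

    // Lookup of a key another worker may or may not have added yet;
    // either outcome is fine, and no lock is needed.
    dict.TryGetValue((i * 31) % 1000, out _);

    if (i % 2 == 1) dict.TryRemove(i, out _);   // odd keys are removed again
});

Console.WriteLine(dict.Count);
```

Each call is safe on its own; a sequence of calls that must be seen as one step still needs extra care.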
Yes, you should lock if this dictionary is a shared resource between more than one thread. This ensures that you get the correct value and that another thread doesn't change the value midway through your lookup call.
Yes, you have to lock if you have multithreaded updates on that dictionary. Check this great post for details: “Thread safe” Dictionary(TKey,TValue)
But now that ConcurrentDictionary<> has been introduced, you can use it either via .NET 4 or by using Rx in 3.5 (it contains System.Threading.dll with an implementation of the new thread-safe collections).