Thread safety with Dictionary<int,int> in .Net - c#

I have this function:
static Dictionary<int, int> KeyValueDictionary = new Dictionary<int, int>();
static void IncreaseValue(int keyId, int adjustment)
{
    if (!KeyValueDictionary.ContainsKey(keyId))
    {
        KeyValueDictionary.Add(keyId, 0);
    }
    KeyValueDictionary[keyId] += adjustment;
}
Which I would have thought would not be thread safe. However, so far in testing it I have not seen any exceptions when calling it from multiple threads at the same time.
My questions: Is it thread safe or have I just been lucky so far? If it is thread safe then why?

However, so far in testing it I have not seen any exceptions when calling it from multiple threads at the same time.
Is it thread safe or have I just been lucky so far? If it is thread safe then why?
You're getting lucky. These types of threading bugs are so easy to make because testing can give you a false sense of security that you did things correctly.
It turns out that Dictionary<TKey, TValue> is not thread-safe when you have multiple writers. The documentation explicitly states:
A Dictionary<TKey, TValue> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
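One way to "implement your own synchronization" is to guard every access to the dictionary with a single lock. This is only a minimal sketch: the names LockedCounter, SyncRoot and GetValue are introduced here for illustration, and it assumes no other code touches the dictionary without taking the same lock.

```csharp
using System.Collections.Generic;

static class LockedCounter
{
    // Hypothetical names: SyncRoot and LockedCounter are not from the original code.
    private static readonly object SyncRoot = new object();
    private static readonly Dictionary<int, int> KeyValueDictionary = new Dictionary<int, int>();

    public static void IncreaseValue(int keyId, int adjustment)
    {
        // The lookup and the write happen under one lock,
        // so no other thread can interleave between them.
        lock (SyncRoot)
        {
            int current;
            KeyValueDictionary.TryGetValue(keyId, out current); // current stays 0 if the key is missing
            KeyValueDictionary[keyId] = current + adjustment;
        }
    }

    public static int GetValue(int keyId)
    {
        // Reads take the same lock, because a writer may be running concurrently.
        lock (SyncRoot)
        {
            int current;
            return KeyValueDictionary.TryGetValue(keyId, out current) ? current : 0;
        }
    }
}
```

The trade-off is that every caller serializes on one lock, which is simple to reason about but can become a bottleneck under heavy contention.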
Alternatively, use ConcurrentDictionary. However, you still must write correct code (see note below).
In addition to the lack of thread-safety with Dictionary<TKey, TValue> which you've been lucky to avoid, your code is dangerously flawed. Here's how you can get a bug with your code:
static void IncreaseValue(int keyId, int adjustment) {
    if (!KeyValueDictionary.ContainsKey(keyId)) {
        // A
        KeyValueDictionary.Add(keyId, 0);
    }
    KeyValueDictionary[keyId] += adjustment;
}
Dictionary is empty.
Thread 1 enters the method with keyId = 17. As the Dictionary is empty, the conditional in the if returns true and thread 1 reaches the line of code marked A.
Thread 1 is paused and thread 2 enters the method with keyId = 17. As the Dictionary is empty, the conditional in the if returns true and thread 2 reaches the line of code marked A.
Thread 2 is paused and thread 1 resumes. Now thread 1 adds (17, 0) to the dictionary.
Thread 1 is paused and now thread 2 resumes. Now thread 2 tries to add (17, 0) to the dictionary. An exception is thrown because of a key violation.
There are other scenarios in which things can go wrong, even without an exception. For example, thread 1 could be paused right after loading the value of KeyValueDictionary[keyId] (say keyId = 17 and it obtains the value 42). Thread 2 could then come in and modify the value (say it also uses keyId = 17 and adds the adjustment 27). When thread 1 resumes, it adds its adjustment to the stale value it loaded and writes the result back, overwriting the modification that thread 2 made to the value associated with keyId = 17. Thread 2's update is silently lost.
Note that even using a ConcurrentDictionary<TKey, TValue> could lead to the above bugs! Your code is NOT safe for reasons not related to the thread-safety or lack thereof for Dictionary<TKey, TValue>.
To get your code to be thread-safe with a concurrent dictionary, you'll have to say:
KeyValueDictionary.AddOrUpdate(keyId, adjustment, (key, value) => value + adjustment);
Here we are using ConcurrentDictionary.AddOrUpdate.
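For context, here is how that one-liner might sit in a complete class, together with a small driver that hammers a single key from many threads. It is a sketch only: ConcurrentCounter and Demo are names made up for this example.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class ConcurrentCounter
{
    private static readonly ConcurrentDictionary<int, int> KeyValueDictionary =
        new ConcurrentDictionary<int, int>();

    public static void IncreaseValue(int keyId, int adjustment)
    {
        // Stores adjustment if the key is missing; otherwise applies the update
        // delegate to the current value. Under contention the delegate may be
        // invoked more than once, so it should be free of side effects.
        KeyValueDictionary.AddOrUpdate(keyId, adjustment, (key, value) => value + adjustment);
    }

    // Illustrative driver (hypothetical): 1000 increments of key 17 from the thread pool.
    public static int Demo()
    {
        Parallel.For(0, 1000, i => IncreaseValue(17, 1));
        return KeyValueDictionary[17];
    }
}
```

On the first call in a fresh process, Demo() returns 1000, because each increment is applied atomically and none is lost.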

It's not thread safe, but Dictionary does not check for concurrent access, so it probably won't notice silent corruption.
It will appear to be thread safe for a long time because it only has a chance of throwing an exception when it needs to rehash. Otherwise, it just corrupts data.

The .NET library has a thread safe dictionary, the ConcurrentDictionary<TKey, TValue> http://msdn.microsoft.com/en-us/library/dd287191.aspx
Update: I didn't exactly answer the question, so here is a more direct answer to the question as posed.
As per MSDN: http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
A Dictionary can support multiple readers concurrently,
as long as the collection is not modified. Even so, enumerating
through a collection is intrinsically not a thread-safe procedure. In
the rare case where an enumeration contends with write accesses, the
collection must be locked during the entire enumeration. To allow the
collection to be accessed by multiple threads for reading and writing,
you must implement your own synchronization.
For a thread-safe alternative, see ConcurrentDictionary.
Public static (Shared in Visual Basic) members of this type are thread
safe.

You've just been lucky so far. It's not thread-safe.
From the Dictionary<K,V> documentation...
A Dictionary<TKey, TValue> can support multiple readers
concurrently, as long as the collection is not modified. Even so,
enumerating through a collection is intrinsically not a thread-safe
procedure. In the rare case where an enumeration contends with write
accesses, the collection must be locked during the entire enumeration.
To allow the collection to be accessed by multiple threads for reading
and writing, you must implement your own synchronization.

Related

When to lock a thread-safe collection in .net ? ( & when not to lock ? )

Ok, I have read Thread safe collections in .NET and Why lock Thread safe collections?.
The former question, being Java-centered, doesn't answer my question, and the answer to the latter question says that I don't need to lock the collections because they are supposed to be thread-safe (which is what I thought).
Now coming to my question,
A lot of developers I see (on GitHub and in my organisation) have started using the new thread-safe collections. However, they often don't remove the lock around read and write operations.
I don't understand this. Isn't a thread-safe collection ... well, thread-safe completely ?
What could be the implications involved in not locking a thread-safe collection ?
EDIT: PS: here's my case,
I have a lot of classes, and some of them have an attribute on them. Very often I need to check whether a given type has that attribute (using reflection, of course). This could be expensive, so I decided to create a cache using a ConcurrentDictionary<string,bool>, with the string being the type name and the bool specifying whether the type has the attribute. At first the cache is empty; the plan is to keep adding to it as and when required. I came across the GetOrAdd() method of ConcurrentDictionary, and my question is about that: should I call this method without locking?
The remarks on MSDN says:
If you call GetOrAdd simultaneously on different threads,
addValueFactory may be called multiple times, but its key/value pair
might not be added to the dictionary for every call.
You should not lock a thread-safe collection. It exposes methods to update the collection that already take care of locking; use them as intended.
The thread safe collection may not match your needs for instance if you want to prevent modification while an enumerator is opened on the collection (the provided thread safe collections allow modifications). If that's the case you'd better use a regular collection and lock it everywhere. The internal locks of the thread safe collections aren't publicly available.
It's hard to answer about implication in not locking a thread-safe collection. You don't need to lock a thread-safe collection but you may have to lock your code that does multiple things. Hard to tell without seeing the code.
Yes the method is thread safe but it might call the AddValueFactory multiple times if you hit an Add for the same key at the same time. In the end only one of the values will be added, the others will be discarded. It might not be an issue... you'll have to check how often you may reach this situation but I think it's not common and you can live with the performance penalty in an edge case that may never occur.
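A minimal sketch of such a cache, called without any external lock. Everything named here (CachedMarkerAttribute, Marked, Unmarked, AttributeCache, HasMarker) is hypothetical, and the sketch keys the cache by Type rather than by type name as in the question, purely to keep the factory delegate simple.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical attribute and types, introduced only for this sketch.
[AttributeUsage(AttributeTargets.Class)]
sealed class CachedMarkerAttribute : Attribute { }

[CachedMarker]
class Marked { }

class Unmarked { }

static class AttributeCache
{
    private static readonly ConcurrentDictionary<Type, bool> Cache =
        new ConcurrentDictionary<Type, bool>();

    public static bool HasMarker(Type type)
    {
        // No external lock. The factory may run more than once if two threads miss
        // at the same time, but it is a pure reflection lookup, so every run yields
        // the same answer and only one result ends up stored.
        return Cache.GetOrAdd(type, t => t.IsDefined(typeof(CachedMarkerAttribute), false));
    }
}
```

Because the factory has no side effects, a duplicated call only costs one redundant reflection lookup. If the factory were expensive or had side effects, that duplication would matter and a different design would be worth considering.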
You could also build your dictionary in a static constructor, or at some other point before you need it. This way, the dictionary is filled once and you never write to it again. The dictionary is then read-only, and you need neither a lock nor a thread-safe collection.
A method of a class typically changes the object from state A to state B. However, another thread may also change the state of the object during the execution of that method, potentially leaving the object in an unstable state.
For instance, a list may want to check if its underlying data buffer is large enough before adding a new item:
void Add(object item)
{
    int requiredSpace = Count + 1;
    if (buffer.Length < requiredSpace)
    {
        // increase underlying buffer
    }
    buffer[Count] = item;
}
Now if a list has buffer space for only one more item, and two threads attempt to add an item at the same time, they may both decide that no additional buffer space is required, potentially causing an IndexOutOfRangeException on one of these threads.
Thread-safe classes ensure that this does not happen.
This does not mean that using a thread-safe class makes your code thread-safe:
int count = myConcurrentCollection.Count;
myConcurrentCollection.Add(item);
count++;
if (myConcurrentCollection.Count != count)
{
    // some other thread has added or removed an item
}
So although the collection is thread safe, you still need to consider thread-safety for your own code. The enumerator example Guillaume mentioned is a perfect example of where threading issues might occur.
In regards to your comment, the documentation for ConcurrentDictionary mentions:
All these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary class. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd. For modifications and write operations to the dictionary, ConcurrentDictionary uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
So yes these overloads (that take a delegate) are exceptions.

List thread safe?

Can the following be considered thread safe, given that the operations in the code appear to be atomic?
My main concern is that if the list needs to be resized, it becomes non-thread-safe during the resizing.
List<int> list = new List<int>(10);
public List<int> GetList()
{
    var temp = list;
    list = new List<int>(10);
    return temp;
}
void TimerElapsed(int number)
{
    list.Add(number);
}
No. List<T> is explicitly documented not to be thread-safe:
It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it’s being read. To ensure thread safety, lock the collection during a read or write operation. To enable a collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization. For collections with built-in synchronization, see the classes in the System.Collections.Concurrent namespace. For an inherently thread–safe alternative, see the ImmutableList class.
Neither your code nor the List<T> are thread-safe.
The list isn't thread-safe according to its documentation. Your code is not thread safe because it lacks synchronization.
Consider two threads calling GetList concurrently. Let's say the first thread gets pre-empted right after setting up the temp. Now the second thread sets the temp of its own, replaces the list, and lets the GetList function run to completion. When the first thread gets to continue, it would return the same list that the second thread has just returned.
But that's not all! If a third thread has called TimerElapsed after the second thread has completed but before the first thread has completed, it would place a value in a list that is about to be overwritten without a trace. So not only would multiple threads return the same data, but also some of your data will disappear.
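One possible fix is to take the same lock in both methods, so that swapping the list and adding to it can never interleave. This is a sketch under assumptions: the Collector class and the sync field are hypothetical names, and it assumes these two methods are the only code that touches the list field.

```csharp
using System.Collections.Generic;

class Collector
{
    private readonly object sync = new object();   // hypothetical lock object
    private List<int> list = new List<int>(10);

    // Hands the accumulated items to the caller and starts a fresh list.
    public List<int> GetList()
    {
        lock (sync)
        {
            var temp = list;
            list = new List<int>(10);
            return temp;
        }
    }

    // Called from the timer thread.
    public void TimerElapsed(int number)
    {
        lock (sync)
        {
            list.Add(number);
        }
    }
}
```

Once GetList returns, no other thread holds a reference to the returned list, so the caller can read it without further locking. Each item ends up in exactly one returned list.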
No. It is not ThreadSafe.
Try using members of the System.Collections.Concurrent namespace
As already mentioned, a List<T> is not thread safe. You can look at alternatives in the Concurrent namespace, possibly using the ConcurrentBag, or there is an article here by Dean Chalk Fast Parallel ConcurrentList<T> Implementation.
It is not thread safe, since a context switch can occur after the first line of the GetList method and transfer control to the TimerElapsed method. This will produce inconsistent results in different scenarios. Also, as other users have already mentioned, the List class is not thread safe and you should use an equivalent from System.Collections.Concurrent.
It is thread safe for reading only, not for writing.

Is accessing the Dictionary<TKey, TValue> Keys property thread safe?

For example can I go:
string[] keys = new string[items.Count];
items.Keys.CopyTo(keys, 0);
Where items is a:
Dictionary<string, MyObject>
instance that could be modified by another thread at the same time? (It's not clear to me whether accessing the Keys property makes the inner workings of Dictionary iterate the collection, in which case I know it would not be thread safe.)
Update: I realise my example above is not thread safe because the size of the Dictionary could change between the two lines. However, what about:
Dictionary<string, MyObject>.KeyCollection keys = items.Keys;
string[] copyOfKeys = new string[keys.Count];
keys.CopyTo(copyOfKeys, 0);
No, Dictionary is not thread-safe. If you need thread safety and you're using .NET 4 or higher, you should use a ConcurrentDictionary.
From MSDN:
A Dictionary can support multiple readers concurrently,
as long as the collection is not modified. Even so, enumerating
through a collection is intrinsically not a thread-safe procedure. In
the rare case where an enumeration contends with write accesses, the
collection must be locked during the entire enumeration. To allow the
collection to be accessed by multiple threads for reading and writing,
you must implement your own synchronization.
If the dictionary is being mutated on another thread it is not safe to do this. The first thing that comes to mind is that the dictionary might resize its internal buffer which surely does not play well with concurrent iteration.
If you aren't changing the keys and are only writing values this should be safe in practice.
Anyway, I wouldn't put such code into production because the risk is too high. What if some .NET patch changes the internals of Dictionary and suddenly you have a bug? There are other concerns as well.
I recommend you use ConcurrentDictionary or some other safe strategy. Why gamble with the correctness of your code?
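As one example of a "safe strategy" with a plain Dictionary, the keys can be copied while holding a lock that every writer also takes. This is only a sketch: ItemStore, Set, SnapshotKeys and the sync field are names invented here, and MyObject is a placeholder for the question's type.

```csharp
using System.Collections.Generic;

class MyObject { }   // placeholder for the question's type

class ItemStore
{
    private readonly object sync = new object();   // hypothetical lock object
    private readonly Dictionary<string, MyObject> items = new Dictionary<string, MyObject>();

    public void Set(string key, MyObject value)
    {
        lock (sync) { items[key] = value; }
    }

    // Copies the keys while holding the lock, so the count and the copy
    // are both taken from the same consistent state.
    public string[] SnapshotKeys()
    {
        lock (sync)
        {
            var copy = new string[items.Count];
            items.Keys.CopyTo(copy, 0);
            return copy;
        }
    }
}
```

The returned array is a private copy, so the caller can iterate it at leisure while other threads keep modifying the dictionary through Set.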

is Queue.Count thread safe?

I need one thread to modify a Queue (both adding and removing elements) and another thread only to call Queue.Count. Would that be safe, or do I need to use locks or ConcurrentQueue?
The Count property is not thread-safe, as per the docs.
But reading an int is atomic, so the worst that could happen is that you read a wrong (outdated) value, which may or may not be a problem.
But since you'll have to do something to prevent your reading thread from caching the value, you might as well use lock().
Queue does not provide thread safety guarantees, so yes you do need one of the two alternatives you mention.
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
A Queue(Of T) can support multiple readers concurrently, as long as the
collection is not modified. Even so, enumerating through a collection
is intrinsically not a thread-safe procedure. To guarantee thread
safety during enumeration, you can lock the collection during the
entire enumeration. To allow the collection to be accessed by multiple
threads for reading and writing, you must implement your own
synchronization.
It's not guaranteed to be threadsafe.
The current implementation of Count is threadsafe. It's not likely to change, but there's no promise.
Most of the time, this isn't very useful though. If you were doing something like outputting a current estimate of the size to UI, then that's perfectly safe. If you make any decision on the basis of it, that is not safe:
// 1. No lock at all: not thread-safe, because Dequeue isn't thread-safe.
if (queue.Count != 0)
    return queue.Dequeue();
// 2. Check outside the lock: won't corrupt the queue, but may throw
//    because Count could have dropped to zero before the lock was taken.
if (queue.Count != 0)
{
    lock (queue)
        return queue.Dequeue();
}
// 3. Check and dequeue under the same lock: thread-safe.
lock (queue)
    if (queue.Count != 0)
        return queue.Dequeue();
// 4. ConcurrentQueue: perfectly thread-safe and lock-free,
//    but more expensive than single-threaded use of Queue<int>.
ConcurrentQueue<int> cQueue = new ConcurrentQueue<int>();
/*...*/
int val;
if (cQueue.TryDequeue(out val))
    return val;
From the Queue msdn documentation under the Thread Safety heading:
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
To guarantee the thread safety of the Queue, all operations must be
done through the wrapper returned by the Synchronized method.
Enumerating through a collection is intrinsically not a thread-safe
procedure. Even when a collection is synchronized, other threads can
still modify the collection, which causes the enumerator to throw an
exception. To guarantee thread safety during enumeration, you can
either lock the collection during the entire enumeration or catch the
exceptions resulting from changes made by other threads.
MSDN has pretty good documentation. I advise you to look there next time.

Is it safe to iterate over an unchanging dictionary from multiple threads?

I've got code like this that's executed from many threads simultaneously (over shared a and b objects of type Dictionary<int, double>):
foreach (var key in a.Keys.Union(b.Keys)) {
dist += Math.Pow(b[key] - a[key], 2);
}
The dictionaries don't change during the lifetime of the threads. Is this safe? So far, it seems OK, but I wanted to be sure.
From the dictionary documentation:
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
As long as you're never writing, it should be safe.
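To make the read-only pattern concrete, here is a sketch of the distance computation as a method that only reads the shared dictionaries. The names Distance and SquaredDistance are hypothetical, and unlike the question's code it uses TryGetValue, under the assumption that a key missing from one dictionary should count as 0 (the indexer would throw KeyNotFoundException instead).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Distance
{
    // Read-only use of a and b: this is safe only while no thread writes to either dictionary.
    public static double SquaredDistance(Dictionary<int, double> a, Dictionary<int, double> b)
    {
        double dist = 0; // local to this call, so each thread has its own accumulator
        foreach (var key in a.Keys.Union(b.Keys))
        {
            // Assumption: a key missing from one dictionary counts as 0.
            double x, y;
            a.TryGetValue(key, out x);
            b.TryGetValue(key, out y);
            dist += Math.Pow(y - x, 2);
        }
        return dist;
    }
}
```

Note that the accumulator is a local variable. If dist were a field shared between threads, the += would be a race of its own, regardless of how safe the dictionary reads are.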
Only if You Guarantee No Writes Occur
A Dictionary(Of TKey, TValue) can support multiple readers
concurrently, as long as the collection is not modified. Even so,
enumerating through a collection is intrinsically not a thread-safe
procedure. In the rare case where an enumeration contends with write
accesses, the collection must be locked during the entire enumeration.
To allow the collection to be accessed by multiple threads for reading
and writing, you must implement your own synchronization.
For a thread-safe alternative, see ConcurrentDictionary(Of TKey,
TValue).
Public static (Shared in Visual Basic) members of this type are thread
safe.
Sources
http://msdn.microsoft.com/en-us/library/xfhwa508.aspx
http://msdn.microsoft.com/en-us/library/dd287191.aspx
