I remember seeing a class with two interesting properties. I couldn't remember the class name, but I think it was a collection; I'm not sure. The first property was named Read*word* and the second was called Write*word*. Sadly I don't remember the exact names. Let me try to explain why these two properties exist: the idea was that when a write happens, the already-running "reads" are allowed to finish, but new ones have to wait. See below how one would use this class.
var list = new ThreadSafeList();

// This is how we should read the collection.
using (var read = list.ReadLock)
{
    // If a write is happening we will wait until it finishes.
    await read.WaitAsync();
    // Do something with the list. If execution reaches this point,
    // no writing can happen, only reads.
}

// This is how we should change the collection.
using (var write = list.WriteLock)
{
    await write.WaitAsync();
    // Modify the collection; nobody reads it, so we are safe.
}
Do I remember right? Is there a collection that has similar properties? If not, can somebody tell me how I should make a similar class? Is this an AutoResetEvent, a ManualResetEvent, or maybe a Semaphore? I don't know.
The idea was when a write happens we allow the already running "Reads" to finish but new ones will need to wait. See below how one would use this class.
Sounds like a reader/writer lock to me. Reader/writer locks can be used with collections, although the lock itself does not contain items.
There is a synchronous reader/writer lock in the BCL. There is no asynchronous version, though.
Usually, when people ask for a reader/writer lock, I ask them to carefully reflect whether they really need it. Just the fact that you have some code acting as a reader and other code acting as a writer does not imply you should have a reader/writer lock. RWLs are possibly one of the most misused synchronization primitives; a simple lock (or SemaphoreSlim for the asynchronous case) usually suffices just fine.
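As a sketch of that simpler approach for asynchronous code, a `SemaphoreSlim` with a single permit can guard a plain `List<T>`. The `AsyncGuardedList` type and its members are made up for illustration; this is not a BCL type:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class AsyncGuardedList<T>
{
    // One permit: behaves like an async-compatible mutual-exclusion lock.
    private readonly SemaphoreSlim _mutex = new SemaphoreSlim(1, 1);
    private readonly List<T> _items = new List<T>();

    public async Task AddAsync(T item)
    {
        await _mutex.WaitAsync();
        try
        {
            _items.Add(item); // only one caller at a time gets past WaitAsync
        }
        finally
        {
            _mutex.Release();
        }
    }

    public async Task<int> CountAsync()
    {
        await _mutex.WaitAsync();
        try { return _items.Count; }
        finally { _mutex.Release(); }
    }
}
```

Unlike `lock`, a `SemaphoreSlim` permit can be held across an `await`, which is exactly what asynchronous guarding needs.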
Is there a collection that has similar properties?
There are certainly reader/writer collections available. These work by containing a number of items and permit different parts of code to produce or consume items. Hence the common name "producer/consumer" collections.
The one I would recommend these days is System.Threading.Channels. TPL Dataflow is another option.
Note that you don't have to take any locks with Channels/Dataflow. Your producers just produce data items, and your consumers just consume them. All the locking is internal to these collections.
If not can somebody tell me how I should make a similar class
If, after careful reflection, you really do want an asynchronous reader/writer lock, you can make one like this.
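A deliberately simplified sketch of such an asynchronous reader/writer lock, built from two `SemaphoreSlim` instances. The type and member names are made up; the fixed `MaxReaders` cap, the lack of cancellation support, and the fairness behavior are all simplifications. For real code, prefer a vetted implementation such as the `AsyncReaderWriterLock` in the Nito.AsyncEx library:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SimpleAsyncReaderWriterLock
{
    private const int MaxReaders = 100;
    // Readers each take one permit; a writer drains all of them.
    private readonly SemaphoreSlim _readers = new SemaphoreSlim(MaxReaders, MaxReaders);
    // Turnstile: serializes writers and makes new readers queue behind a writer.
    private readonly SemaphoreSlim _writerTurn = new SemaphoreSlim(1, 1);

    public async Task<IDisposable> ReaderLockAsync()
    {
        await _writerTurn.WaitAsync();  // queue behind any writer draining permits
        _writerTurn.Release();
        await _readers.WaitAsync();     // take one shared permit
        return new Releaser(() => _readers.Release());
    }

    public async Task<IDisposable> WriterLockAsync()
    {
        await _writerTurn.WaitAsync();  // block new readers and other writers
        for (int i = 0; i < MaxReaders; i++)
            await _readers.WaitAsync(); // wait for in-flight readers to finish
        return new Releaser(() =>
        {
            _readers.Release(MaxReaders);
            _writerTurn.Release();
        });
    }

    private sealed class Releaser : IDisposable
    {
        private Action _release;
        public Releaser(Action release) { _release = release; }
        public void Dispose() { _release?.Invoke(); _release = null; }
    }
}
```

Usage mirrors the question's shape: `using (await rwLock.ReaderLockAsync()) { /* read */ }` and `using (await rwLock.WriterLockAsync()) { /* write */ }`.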
Ok, I have read Thread safe collections in .NET and Why lock Thread safe collections?.
The former question, being Java-centered, doesn't answer my question, and the answer to the latter question says that I don't need to lock the collection because it is supposed to be thread-safe (which is what I thought).
Now coming to my question,
A lot of developers I see (on GitHub and in my organisation) have started using the new thread-safe collections. However, they often don't remove the lock around read and write operations.
I don't understand this. Isn't a thread-safe collection... well, completely thread-safe?
What could be the implications of not locking a thread-safe collection?
EDIT: PS: here's my case.
I have a lot of classes, and some of them have a certain attribute on them. Very often I need to check whether a given type has that attribute (using reflection, of course). This could be expensive performance-wise, so I decided to create a cache using a ConcurrentDictionary<string,bool>, string being the type name and bool specifying whether it has the attribute. At first the cache is empty; the plan was to keep adding to it as and when required. I came across the GetOrAdd() method of ConcurrentDictionary, and my question is about exactly that: should I call this method without locking?
The remarks on MSDN say:
If you call GetOrAdd simultaneously on different threads, addValueFactory may be called multiple times, but its key/value pair might not be added to the dictionary for every call.
You should not lock a thread-safe collection: it exposes methods to update the collection that already take the necessary locks internally. Use them as intended.
The thread safe collection may not match your needs for instance if you want to prevent modification while an enumerator is opened on the collection (the provided thread safe collections allow modifications). If that's the case you'd better use a regular collection and lock it everywhere. The internal locks of the thread safe collections aren't publicly available.
It's hard to answer about implication in not locking a thread-safe collection. You don't need to lock a thread-safe collection but you may have to lock your code that does multiple things. Hard to tell without seeing the code.
Yes, the method is thread-safe, but it might call addValueFactory multiple times if two threads hit an add for the same key at the same time. In the end only one of the values will be added; the others are discarded. It might not be an issue... you'll have to check how often you could reach this situation, but I think it's uncommon, and you can live with the performance penalty in an edge case that may never occur.
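A minimal sketch of the attribute cache the question describes, with no external locking. `ObsoleteAttribute` stands in for whatever attribute is actually being tested, and the class and method names are made up:

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

public static class AttributeCache
{
    private static readonly ConcurrentDictionary<string, bool> _cache =
        new ConcurrentDictionary<string, bool>();

    // No external lock needed: GetOrAdd is safe to call concurrently.
    // The value factory may run more than once under contention, but it is
    // a cheap, side-effect-free reflection check, so that is harmless here.
    public static bool HasAttribute(Type type)
    {
        return _cache.GetOrAdd(
            type.FullName,
            _ => type.GetCustomAttribute<ObsoleteAttribute>() != null);
    }
}
```

Because the factory is idempotent, it does not matter which thread's result wins; every caller observes the same boolean.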
You could also build your dictionary in a static constructor, or before you first need it. That way the dictionary is filled once and you never write to it again. The dictionary is then effectively read-only, and you need neither a lock nor a thread-safe collection.
A method of a class typically changes the object from state A to state B. However, another thread may also change the state of the object during the execution of that method, potentially leaving the object in an unstable state.
For instance, a list may want to check if its underlying data buffer is large enough before adding a new item:
void Add(object item)
{
    int requiredSpace = Count + 1;
    if (buffer.Length < requiredSpace)
    {
        // increase underlying buffer
    }
    buffer[Count] = item;
}
Now if a list has buffer space for only one more item, and two threads attempt to add an item at the same time, they may both decide that no additional buffer space is required, potentially causing an IndexOutOfRangeException on one of these threads.
Thread-safe classes ensure that this does not happen.
This does not mean that using a thread-safe class makes your code thread-safe:
int count = myConcurrentCollection.Count;
myConcurrentCollection.Add(item);
count++;

if (myConcurrentCollection.Count != count)
{
    // some other thread has added or removed an item
}
So although the collection is thread safe, you still need to consider thread-safety for your own code. The enumerator example Guillaume mentioned is a perfect example of where threading issues might occur.
In regards to your comment, the documentation for ConcurrentDictionary mentions:
All these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary class. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd. For modifications and write operations to the dictionary, ConcurrentDictionary uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
So yes these overloads (that take a delegate) are exceptions.
I'm working on making my SortedDictionary thread-safe, and the thing I'm not sure about is: is it safe to add to the SortedDictionary in one thread, like this:
dictionary.Add(key, value);
and simply get an item from this dictionary in another thread, like this:
variable = dictionary[key];
There is no explicit enumeration in either of those places, so it looks safe, but it would be great to be sure about it.
No, it is not safe to read and write SortedDictionary<K,V> concurrently: adding an element to a sorted dictionary may involve re-balancing of the tree, which may cause the concurrent read operation to take a wrong turn while navigating to the element of interest.
In order to fix this problem you would need to either wrap an instance of SortedDictionary<K,V> in a class that performs explicit locking, or roll your own collection compatible with the interfaces implemented by SortedDictionary<K,V>.
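A minimal sketch of the first option, wrapping a SortedDictionary<K,V> with explicit locking. Only two members are shown; a real wrapper would need to guard every operation it exposes, and the names here are illustrative:

```csharp
using System.Collections.Generic;

public class SynchronizedSortedDictionary<TKey, TValue>
{
    private readonly SortedDictionary<TKey, TValue> _inner =
        new SortedDictionary<TKey, TValue>();
    private readonly object _sync = new object();

    public void Add(TKey key, TValue value)
    {
        // Writers serialize with readers, so a lookup can never observe
        // the tree mid-rebalance.
        lock (_sync) { _inner.Add(key, value); }
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        lock (_sync) { return _inner.TryGetValue(key, out value); }
    }
}
```

Note that enumeration would also need to happen under the lock (or over a copy), since the internal lock is not exposed to callers.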
No. Anything that modifies the tree is not thread safe at all. The trick is to fill up the SortedDictionary in one thread, then treat it as immutable and let multiple threads read from it. (You can do this with a SortedDictionary, as stated here. I mention this because there may be a collection/dictionary/map out there somewhere that is changed when it is read, so you should always check.)
If you need to modify it once it's released into the wild, you have a problem. You need to lock it to write to it, and all the readers need to respect that lock, which means they need to lock it too, which means the readers can no longer read it simultaneously. The best way around this is usually to create a whole new SortedDictionary, then, once the new one is immutable, replace the reference to the original with a reference to the new one. (You need a volatile reference to do this right.) The readers will switch dictionaries cleanly without a problem. And the old dictionary won't go away until the last reader has finished reading and released its reference.
(There are n-readers and 1-writer locks, but you want to avoid any locking at all.)
(And keep in mind that the reference to the dictionary can change suddenly while you're enumerating. Copy it to a local variable rather than referring to the (volatile) field directly.)
Java has a ConcurrentSkipListMap, which allows any number of simultaneous reads and writes, but I don't think there's anything like it in .NET yet. And if there is, it's going to be slower for reads than an immutable SortedDictionary anyway.
No, because it is not documented to be safe. That is the real reason. Reasoning with implementation details is not as good because they are details that you cannot rely on.
No, it is not safe to do so. If you want to use it from multiple threads, you should do this:
private readonly object lockObject = new object();

lock (lockObject)
{
    // your dictionary operation here
}
When looking into IsEmpty, I noticed this on MSDN:
However, as this collection is intended to be accessed concurrently, it may be the case that another thread will modify the collection after IsEmpty returns, thus invalidating the result.
Sure this is true, but does this also imply that ConcurrentQueue doesn't use a read barrier when checking if the queue is empty?
I want to have a piece of code that checks in another thread if the concurrent queue is empty. Something like this:
while (!queue.IsEmpty)
{
}
However.. if ConcurrentQueue doesn't use a read barrier, I'd say we need to add our own memory barrier to ensure we read the right data, like this:
Thread.MemoryBarrier();
while (!queue.IsEmpty)
{
Thread.MemoryBarrier();
}
(BTW: this is just a minimal example to illustrate the case; there's more code in reality.)
Is my observation correct? Or does ConcurrentQueue handle this, and does the first implementation work (i.e., what I would expect from 'Concurrent')?
And what about 'Count'? I can't find the answer on MSDN... Same story?
No, it's nothing to do with memory barriers. It's simply got to be the case that something else could nip in and add or remove something to or from the queue just after your test.
You shouldn't really use IsEmpty. Use TryDequeue instead. Or use a BlockingCollection.
You really don't want to start writing code that "locks" the queue while you mess around with IsEmpty or Count.
(I almost never use ConcurrentQueue nowadays, since BlockingCollection is so much better, although that does of course depend on what you're trying to do.)
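To illustrate, here is a drain loop built on TryDequeue, which tests for an item and removes it in a single atomic step, so there is no gap between the emptiness check and the dequeue for another thread to exploit:

```csharp
using System;
using System.Collections.Concurrent;

var queue = new ConcurrentQueue<int>();
queue.Enqueue(1);
queue.Enqueue(2);

// TryDequeue returns false once the queue is empty, so the loop both
// checks and consumes without a separate IsEmpty test.
while (queue.TryDequeue(out int item))
{
    Console.WriteLine(item);
}
```

With BlockingCollection, the equivalent consumer would simply `foreach` over `GetConsumingEnumerable()` and block until items arrive.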
If I have an array that can/will be accessed by multiple threads at any given point in time, what exactly causes it to be non-thread safe, and what would be the steps taken to ensure that the array would be thread safe in most situations?
I have looked extensively around on the internet and have found little to no information on this subject; everything seems to be specific scenarios (e.g. "is this array, accessed like this by these two threads, thread-safe?", and on, and on). I would really like it if someone could either answer the questions I laid out at the top, or point me toward a good document explaining them.
EDIT:
After looking around on MSDN, I found the ArrayList class. When you use the synchronize method, it returns a thread-safe wrapper for a given list. When setting data in the list (i.e. list1[someNumber] = anotherNumber;) does the wrapper automatically take care of locking the list, or do you still need to lock it?
When two threads are accessing the exact same resource (e.g., not local copies, but actually the same copy of the same resource), a number of things can happen. In the most obvious scenario, if Thread #1 is accessing a resource and Thread #2 changes it mid-read, some unpredictable behavior can happen. Even with something as simple as an integer, you could have logic errors arise, so try to imagine the horrors that can result from improperly using something more complicated, like a database access class that's declared as static.
The classical way of handling this problem is to put a lock on the sensitive resources so only one thread can use it at a time. So in the above example, Thread #1 would request a lock to a resource and be granted it, then go in to read what it needs to read. Thread #2 would come along mid-read and request a lock to the resource, but be denied and told to wait because Thread #1 is using it. When Thread #1 finishes, it releases the lock and it's OK for Thread #2 to proceed.
There are other situations, but this illustrates one of the most basic problems and solutions. In C#, you may:
1) Use specific .NET objects that are managed as lockable by the framework (like Scorpion-Prince's link to SynchronizedCollection)
2) Use [MethodImpl(MethodImplOptions.Synchronized)] to dictate that a specific method that does something dangerous should only be used by one thread at a time
3) Use the lock statement to isolate specific lines of code that are doing something potentially dangerous
What approach is best is really up to your situation.
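As a sketch of the third option, here is a lock statement isolating a read-modify-write on a shared array; the `Counter` type and its members are made up for illustration:

```csharp
public class Counter
{
    private readonly int[] _slots = new int[10];
    private readonly object _sync = new object();

    public void Increment(int index)
    {
        // Only one thread at a time runs the guarded read-modify-write,
        // so concurrent Increment calls cannot lose updates.
        lock (_sync)
        {
            _slots[index] = _slots[index] + 1;
        }
    }

    public int Get(int index)
    {
        lock (_sync) { return _slots[index]; }
    }
}
```

Without the lock, two threads could both read the old value, both add one, and both write back the same result, losing one increment.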
If I have an array that can/will be accessed by multiple threads at any given point in time, what exactly causes it to be non-thread safe, and what would be the steps taken to ensure that the array would be thread safe in most situations?
In general terms, the array is non-thread-safe because two or more threads could be modifying its contents at the same time if you do not synchronize access to it.
Speaking generally, for example, let's suppose you have thread 1 doing this work:
for (int i = 0; i < array.Length; i++)
{
    array[i] = "Hello";
}
And thread 2 doing this work (on the same shared array)
for (int i = 0; i < array.Length; i++)
{
    array[i] = "Goodbye";
}
There isn't anything synchronizing the threads, so your results will depend on which thread wins each race. The elements could end up "Hello" or "Goodbye" in some arbitrary mixture, but each element will always be either 'Hello' or 'Goodbye'.
The actual write of the string 'Hello' or 'Goodbye' is guaranteed by the CLR to be atomic. That is to say, the writing of the value 'Hello' cannot be interrupted by a thread trying to write 'Goodbye'. One must occur before or after the other, never in between.
So you need to create some kind of synchronization mechanism to prevent the threads from stepping on each other. You can accomplish this by using a lock statement in C#.
C# 3.0 and above provide a generic collection class called SynchronizedCollection which "provides a thread-safe collection that contains objects of a type specified by the generic parameter as elements."
An array is thread-safe only in the sense documented for System.Array: public static members are thread-safe, but instance members are not guaranteed to be. System.Array implements the ICollection interface, which defines the SyncRoot property to support external synchronization.
However, enumerating the array's items is not safe on its own; the developer should use a lock statement to make sure there is no change to the array during the enumeration.
EX:
string[] arrThreadSafe = new string[] { "We", "are", "safe" };

lock (arrThreadSafe.SyncRoot)
{
    foreach (string item in arrThreadSafe)
    {
        Console.WriteLine(item);
    }
}
In my app I have a List of objects. I'm going to have a process (thread) running every few minutes that will update the values in this list. I'll have other processes (other threads) that will just read this data, and they may attempt to do so at the same time.
When the list is being updated, I don't want any other process to be able to read the data. However, I don't want the read-only processes to block each other when no updating is occurring. Finally, if a process is reading the data, the process that updates the data must wait until the process reading the data is finished.
What sort of locking should I implement to achieve this?
This is what you are looking for.
ReaderWriterLockSlim is a class that will handle scenario that you have asked for.
You have two pairs of methods at your disposal:
EnterWriteLock and ExitWriteLock
EnterReadLock and ExitReadLock
The first pair waits until all other locks, both read and write, are released, so it gives you exclusive access the way lock() would.
The second pair is compatible with itself: you can have multiple read locks held at any given time.
Because there's no syntactic sugar like with the lock statement, make sure you never forget to exit the lock, whether because of an exception or anything else. So use it in a form like this:
rwLock.EnterWriteLock(); // or EnterReadLock
try
{
    // Your code here, which can possibly throw an exception.
}
finally
{
    rwLock.ExitWriteLock(); // or ExitReadLock
}
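Putting both pairs together, here is a sketch of a list guarded by ReaderWriterLockSlim; the `GuardedList` type is made up for illustration:

```csharp
using System.Collections.Generic;
using System.Threading;

public class GuardedList<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();

    public void Add(T item)
    {
        _rw.EnterWriteLock();            // exclusive: waits for all readers
        try { _items.Add(item); }
        finally { _rw.ExitWriteLock(); }
    }

    public T Get(int index)
    {
        _rw.EnterReadLock();             // shared: many readers at once
        try { return _items[index]; }
        finally { _rw.ExitReadLock(); }
    }
}
```

This matches the question's requirements: readers never block each other, but a writer waits for readers to finish and blocks new ones while it runs.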
You don't make it clear whether the updates to the list will involve modification of existing objects, or adding/removing new ones - the answers in each case are different.
To handle modification of existing items in the list, each object should handle its own locking.
To allow modification of the list while others are iterating it, don't allow people direct access to the list - force them to work with a read/only copy of the list, like this:
public class Example
{
    public IEnumerable<X> GetReadOnlySnapshot()
    {
        lock (padLock)
        {
            // Copy the master list so the snapshot's content is fixed,
            // then wrap the copy so callers cannot modify it.
            return new ReadOnlyCollection<X>(new List<X>(MasterList));
        }
    }

    private object padLock = new object();
}
Using a ReadOnlyCollection<X> around a copy of the master list ensures that readers can iterate through a list of fixed content, without blocking modifications made by writers.
You could use ReaderWriterLockSlim. It would satisfy your requirements precisely. However, it is likely to be slower than just using a plain old lock. The reason is that RWLS is roughly 2x slower than lock, and accessing a List is so fast that the saving would not be enough to overcome the additional overhead of the RWLS. Test both ways, but it is likely ReaderWriterLockSlim will be slower in your case. Reader/writer locks do better in scenarios where the readers significantly outnumber the writers and the guarded operations are long and drawn out.
However, let me present another option for you. One common pattern for dealing with this type of problem is to use two separate lists. One serves as the official copy, which accepts updates, and the other serves as the read-only copy. After you update the official copy you clone it and swap out the reference to the read-only copy. This is elegant in that the readers require no blocking whatsoever. The reason readers need no synchronization is that we treat the read-only copy as if it were immutable. Here is how it can be done.
public class Example
{
    private readonly List<object> m_Official;
    private volatile List<object> m_Readonly;

    public Example()
    {
        m_Official = new List<object>();
        m_Readonly = m_Official;
    }

    public void Update()
    {
        lock (m_Official)
        {
            // Modify the official copy here.
            m_Official.Add(...);
            m_Official.Remove(...);

            // Now clone the official copy.
            var clone = new List<object>(m_Official);

            // And finally swap out the read-only copy reference.
            m_Readonly = clone;
        }
    }

    public object Read(int index)
    {
        // It is safe to access the read-only copy here because it is immutable.
        // m_Readonly must be marked as volatile for this to work correctly.
        return m_Readonly[index];
    }
}
The code above does not satisfy your requirements precisely, because readers never block... ever. That means reads can still take place while writers are updating the official list. But in a lot of scenarios that winds up being acceptable.