Can the following be considered thread safe due to the atomic operation appearance of the code.
My main concern is if the lists needs to be re-sized it becomes non-thread safe during the re-sizing.
List<int> list = new List<int>(10);
public List<int> GetList()
{
var temp = list;
list = new List<int>(10);
return temp;
}
TimerElapsed(int number)
{
list.Add(number);
}
No. List<T> is explicitly documented not to be thread-safe:
It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it’s being read. To ensure thread safety, lock the collection during a read or write operation. To enable a collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization. For collections with built-in synchronization, see the classes in the System.Collections.Concurrent namespace. For an inherently thread–safe alternative, see the ImmutableList class.
Neither your code nor the List<T> are thread-safe.
The list isn't thread-safe according to its documentation. Your code is not thread safe because it lacks synchronization.
Consider two threads calling GetList concurrently. Let's say the first thread gets pre-empted right after setting up the temp. Now the second thread sets the temp of its own, replaces the list, and lets the GetList function run to completion. When the first thread gets to continue, it would return the same list that the second thread has just returned.
But that's not all! If a third thread has called TimerElapsed after the second thread has completed but before the first thread has completed, it would place a value in a list that is about to be overwritten without a trace. So not only would multiple threads return the same data, but also some of your data will disappear.
No. It is not ThreadSafe.
Try using members of the System.Collections.Concurrent namespace
As already mentioned, a List<T> is not thread safe. You can look at alternatives in the Concurrent namespace, possibly using the ConcurrentBag, or there is an article here by Dean Chalk Fast Parallel ConcurrentList<T> Implementation.
It is not thread safe since there can be a context switch between the first line of the GetList method which transfers to TimerElapsed method. This will create inconsistent result on different scenarions. Also as other users already mentioned the List class is not thread safe and you should use the System.Collections.Concurrent equivalent.
It is thread safe for reading only, not for writing.
Related
I have a generic List that's getting added and removed by two different threads.
Thread 1 :
batchToBeProcessed = this.list.Count;
// Do some processing
this.list.RemoveRange(0, batchToBeProcessed.Count);
Thread 2 :
lock()
{
this.list.Add(item);
}
Is RemoveRange, thread safe in the following scenario?
Say the list has 10 items thats being processed in thread1, and while removerange is getting executed, thread2 adds 1 item, what would happen?
Edit : Do not want to use a concurrentbag since I need the ordering.
The answer to your question is No, it is not safe to have multiple threads modifying the list concurrently. You must synchronize access to the list using a lock or similar synchronization primitive. List is safe for multiple concurrent readers. Or you can have a single writer. But you can't modify the list in one thread while some other thread is accessing it at all.
As to what will happen if you try to Add while another thread is executing a RemoveRange, it's likely that the code will throw an exception. But it's possible no exception will be thrown and your list will become corrupted. Or it could work just fine. Which will happen depends on timing. But there's no guarantee that it will work.
To my knowledge, the .NET Framework does not have a concurrent data structure that gives you all the functionality of List.
Is RemoveRange, thread safe in the following scenario?
No, all the methods in List<T> class are not thread safe.
Say the list has 10 items thats being processed in thread1, and while removerange is getting executed, thread2 adds 1 item, what would happen?
It's undefined, new item may be added to the end of the list, old items may not be removed, ...
I know static variable or collection is shared, across the threads, at most single memory address is created for variable, and it's state will be persistent, across the threads.
static int count =0
thread 1 --> count++
thread 2 --> diplay count -->1
thread 3 --> count--
thread 1 --> display count -->0
my question, locking mechanism is required for static collections? below is the static collection and locking mechanism.
public static List<ConnectionManager> ServerConnections = new List<ConnectionManager>();
lock (Global.ServerConnections)
{
//do something
}
Sure. If you need just thread safety you can use C# concurrent collections, but if you want some synchronization (like several actions upon a collection inside one thread to be executed without any impact of other threads) you need locking.
Actually you also need to take care of your variables, count++ and count-- are not thread safe. Use Interlocked or any other mechanism to ensure thread safety.
The answer is: Yes. You need a lock because Static != Thread safe. This applies to your count variable too.
Just because a variable is static that doesn't make it thread safe. Multiple threads can still access it at the exact same time which causes concurrency issues.
There is no thread safety to anything by default, it has to be designed to be thread safe.
Also take a look at the ConcurrentBag<T>.
It belongs to how you use the collection and how you instatiate it. If you instantiate it from different threads you should gurantee that only one thread instatiate it. Because with badluck more threads try instatiation at the same time. The Lazy class implemented in .Net is for this purpose and makes it easy to instantiate thread safe and lazy. Futher you need to lock your collection in any case of operation you wanna do. Insert, Remove, Iterate etc. are all not thread safe. Read about concurrentDictionary etc. for more information about thread safe collections.
Ok, I have read Thread safe collections in .NET and Why lock Thread safe collections?.
The former question being java centered, doesn't answer my question and the answer to later question tells that I don't need to lock the collection because they are supposed to thread-safe. (which is what I thought)
Now coming to my question,
I lot of developers I see, (on github and in my organisation) have started using the new thread-safe collection. However, often they don'tremove the lock around read & write operations.
I don't understand this. Isn't a thread-safe collection ... well, thread-safe completely ?
What could be the implications involved in not locking a thread-safe collection ?
EDIT: PS: here's my case,
I have a lot of classes, and some of them have an attribute on them. Very often I need to check if a given type has that attribute or not (using reflection of course). This could be expensive on performance. So decided to create a cache using a ConcurrentDictionary<string,bool>. string being the typeName and bool specifying if it has the attribute. At First, the cache is empty, the plan was to keep on adding to it as and when required. I came across GetOrAdd() method of ConcurrentDictionary. And my question is about the same, if I should call this method without locking ?
The remarks on MSDN says:
If you call GetOrAdd simultaneously on different threads,
addValueFactory may be called multiple times, but its key/value pair
might not be added to the dictionary for every call.
You should not lock a thread safe collection, it exposes methods to update the collection that are already locked, use them as intended.
The thread safe collection may not match your needs for instance if you want to prevent modification while an enumerator is opened on the collection (the provided thread safe collections allow modifications). If that's the case you'd better use a regular collection and lock it everywhere. The internal locks of the thread safe collections aren't publicly available.
It's hard to answer about implication in not locking a thread-safe collection. You don't need to lock a thread-safe collection but you may have to lock your code that does multiple things. Hard to tell without seeing the code.
Yes the method is thread safe but it might call the AddValueFactory multiple times if you hit an Add for the same key at the same time. In the end only one of the values will be added, the others will be discarded. It might not be an issue... you'll have to check how often you may reach this situation but I think it's not common and you can live with the performance penalty in an edge case that may never occur.
You could also build your dictionnary in a static ctor or before you need it. This way, the dictionnary is filled once and you don't ever write to it. The dictionary is then read only and you don't need any lock neither a thread safe collection.
A method of a class typically changes the object from state A to state B. However, another thread may also change the state of the object during the execution of that method, potentially leaving the object in an instable state.
For instance, a list may want to check if its underlying data buffer is large enough before adding a new item:
void Add(object item)
{
int requiredSpace = Count + 1;
if (buffer.Length < requiredSpace)
{
// increase underlying buffer
}
buffer[Count] = item;
}
Now if a list has buffer space for only one more item, and two threads attempt to add an item at the same time, they may both decide that no additional buffer space is required, potentially causing an IndexOutOfRangeException on one of these threads.
Thread-safe classes ensure that this does not happen.
This does not mean that using a thread-safe class makes your code thread-safe:
int count = myConcurrentCollection.Count;
myCurrentCollection.Add(item);
count++;
if (myConcurrentCollection.Count != count)
{
// some other thread has added or removed an item
}
So although the collection is thread safe, you still need to consider thread-safety for your own code. The enumerator example Guillaume mentioned is a perfect example of where threading issues might occur.
In regards to your comment, the documentation for ConcurrentDictionary mentions:
All these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary class. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd. For modifications and write operations to the dictionary, ConcurrentDictionary uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
So yes these overloads (that take a delegate) are exceptions.
I need one thread to modify Queue (both adding and removing elements) and another thread only to call Queue.Count. Would it be safe or I need to use locks or ConcurrentQueue?
The Queue property is not thread-safe, as per the docs.
But it is an atomic int, the worst that could happen is that you read the wrong (outdated) value. Which may or may not be a problem.
But since you'll have to do something to prevent your reading thread from caching the value you might as well lock().
Queue does not provide thread safety guarantees, so yes you do need one of the two alternatives you mention.
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
A Queue(Of T) can support multiple readers concurrently, as long as the
collection is not modified. Even so, enumerating through a collection
is intrinsically not a thread-safe procedure. To guarantee thread
safety during enumeration, you can lock the collection during the
entire enumeration. To allow the collection to be accessed by multiple
threads for reading and writing, you must implement your own
synchronization.
It's not guaranteed to be threadsafe.
The current implementation of Count is threadsafe. It's not likely to change, but there's no promise.
Most of the time, this isn't very useful though. If you were doing something like outputting a current estimate of the size to UI, then that's perfectly safe. If you make any decision on the basis of it, that is not safe:
if(queue.Count != 0)
return queue.Dequeue; //not thread-safe as Dequeue isn't threadsafe.
if(queue.Count != 0)
{
lock(queue)
return queue.Dequeue; //not thread-safe, won't corrput
//queue but may error as Count could now be zero.
}
lock(queue)
if(queue.Count != 0)
return queue.Dequeue; //thread-safe
ConcurrentQueue<int> cQueue = new ConcurrentQueue<int>();
/*...*/
int val;
if(cQueue.TryDequeue(out val))
return val; //perfectly thread-safe and lock-free,
//but more expensive than single-threaded use of Queue<int>
From the Queue msdn documentation under the Thread Safety heading:
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
To guarantee the thread safety of the Queue, all operations must be
done through the wrapper returned by the Synchronized method.
Enumerating through a collection is intrinsically not a thread-safe
procedure. Even when a collection is synchronized, other threads can
still modify the collection, which causes the enumerator to throw an
exception. To guarantee thread safety during enumeration, you can
either lock the collection during the entire enumeration or catch the
exceptions resulting from changes made by other threads.
msdn has pretty good documentation. I advise you to look there the next time.
Essentially, I am working with this:
var data = input.AsParallel();
List<String> output = new List<String>();
Parallel.ForEach<String>(data, line => {
String outputLine = "";
// ** Do something with "line" and store result in "outputLine" **
// Additionally, there are some this.Invoke statements for updating UI
output.Add(outputLine);
});
Input is a List<String> object. The ForEach() statement does some processing on each value, updates the UI, and adds the result to the output List. Is there anything inherently wrong with this?
Notes:
Output order is unimportant
Update:
Based on feedback I've gotten, I've added a manual lock to the output.Add statement, as well as to the UI updating code.
Yes; List<T> is not thread safe, so adding to it ad-hoc from arbitrary threads (quite possibly at the same time) is doomed. You should use a thread-safe list instead, or add locking manually. Or maybe there is a Parallel.ToList.
Also, if it matters: insertion order will not be guaranteed.
This version is safe, though:
var output = new string[data.Count];
Parallel.ForEach<String>(data, (line,state,index) =>
{
String outputLine = index.ToString();
// ** Do something with "line" and store result in "outputLine" **
// Additionally, there are some this.Invoke statements for updating UI
output[index] = outputLine;
});
here we are using index to update a different array index per parallel call.
Is there anything inherently wrong with this?
Yes, everything. None of this is safe. Lists are not safe for updating on multiple threads concurrently, and you can't update the UI from any thread other than the UI thread.
The documentation says the following about the thread safety of List<T>:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
A List(Of T) can support multiple readers concurrently, as long as the collection is not modified. Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
Thus, output.Add(outputLine) is not thread-safe and you need to ensure thread safety yourself, for example, by wrapping the add operation in a lock statement.
When you want the results of a parallel operation, the PLINQ is more convenient than the Parallel class. You started well by converting your input to a ParallelQuery<T>:
ParallelQuery<string> data = input.AsParallel();
...but then you fed the data to the Parallel.ForEach, which treats it as a standard IEnumerable<T>. So the AsParallel() was wasted. It didn't provide any parallelization, only overhead. Here is the correct way to use PLINQ:
List<string> output = input
.AsParallel()
.Select(line =>
{
string outputLine = "";
// ** Do something with "line" and store result in "outputLine" **
return outputLine;
})
.ToList();
A few differences that you should have in mind:
The Parallel runs the code on the ThreadPool by default, but it's configurable. The PLINQ uses exclusively the ThreadPool.
The Parallel by default has unlimited parallelism (it uses all the available threads of the ThreadPool). The PLINQ uses by default at most Environment.ProcessorCount threads.
Regarding the order of the results, PLINQ doesn't preserve the order by default. In case you want to preserve the order, you can attach the AsOrdered operator.