List RemoveRange Thread safety - c#

I have a generic List that's getting added and removed by two different threads.
Thread 1 :
batchToBeProcessed = this.list.Count;
// Do some processing
this.list.RemoveRange(0, batchToBeProcessed);
Thread 2 :
lock (this.list)
{
    this.list.Add(item);
}
Is RemoveRange thread safe in the following scenario?
Say the list has 10 items that are being processed in thread 1, and while RemoveRange is executing, thread 2 adds 1 item. What would happen?
Edit: I do not want to use a ConcurrentBag since I need the ordering.

The answer to your question is no, it is not safe to have multiple threads modifying the list concurrently. You must synchronize access to the list using a lock or a similar synchronization primitive. List<T> is safe for multiple concurrent readers, or for a single writer, but you can't modify the list in one thread while some other thread is accessing it at all.
As to what will happen if you try to Add while another thread is executing a RemoveRange, it's likely that the code will throw an exception. But it's possible no exception will be thrown and your list will become corrupted. Or it could work just fine. Which will happen depends on timing. But there's no guarantee that it will work.
To my knowledge, the .NET Framework does not have a concurrent data structure that gives you all the functionality of List.
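To illustrate the lock-based approach, here is a minimal sketch (class and member names are illustrative, not from the question) in which both threads share one lock object, and the batch size is read under the same lock that removes the batch:

```csharp
using System.Collections.Generic;

public class BatchProcessor<T>
{
    private readonly List<T> list = new List<T>();
    private readonly object sync = new object();

    // Thread 2: producer.
    public void Add(T item)
    {
        lock (sync)
        {
            list.Add(item);
        }
    }

    // Thread 1: consumer. Snapshot a batch and remove it atomically;
    // items added afterwards simply wait for the next batch.
    public List<T> TakeBatch()
    {
        lock (sync)
        {
            int batchSize = list.Count;
            List<T> batch = list.GetRange(0, batchSize);
            list.RemoveRange(0, batchSize);
            return batch;
        }
    }
}
```

If the processing itself is long, do it outside the lock on the returned batch, so the producer is not blocked while you work.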

Is RemoveRange thread safe in the following scenario?
No. None of the methods of the List<T> class are thread safe.
Say the list has 10 items that are being processed in thread 1, and while RemoveRange is executing, thread 2 adds 1 item. What would happen?
It's undefined: the new item may be added to the end of the list, the old items may not be removed, and so on.


Locking mechanism is required for static list or variables?

I know a static variable or collection is shared across threads: a single memory location is created for the variable, and its state persists across threads.
static int count = 0;
thread 1 --> count++
thread 2 --> display count --> 1
thread 3 --> count--
thread 1 --> display count --> 0
My question: is a locking mechanism required for static collections? Below is the static collection and locking mechanism.
public static List<ConnectionManager> ServerConnections = new List<ConnectionManager>();
lock (Global.ServerConnections)
{
    // do something
}
Sure. If you just need thread safety you can use the C# concurrent collections, but if you want synchronization (for example, several actions upon a collection within one thread executing without any impact from other threads) you need locking.
Actually you also need to take care of your variables, count++ and count-- are not thread safe. Use Interlocked or any other mechanism to ensure thread safety.
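A sketch of the Interlocked suggestion (the class, thread count, and iteration count are illustrative):

```csharp
using System.Threading;

static class CounterDemo
{
    private static int count;

    // Runs 4 threads, each incrementing 100000 times; returns the final count.
    public static int Run()
    {
        count = 0;
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 100000; j++)
                    Interlocked.Increment(ref count); // atomic, unlike count++
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return count; // always 400000; with a plain count++ it would often be less
    }
}
```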
The answer is: Yes. You need a lock because Static != Thread safe. This applies to your count variable too.
Just because a variable is static that doesn't make it thread safe. Multiple threads can still access it at the exact same time which causes concurrency issues.
There is no thread safety to anything by default, it has to be designed to be thread safe.
Also take a look at the ConcurrentBag<T>.
It depends on how you use the collection and how you instantiate it. If you instantiate it from different threads, you should guarantee that only one thread performs the instantiation, because with bad luck several threads may attempt the instantiation at the same time. The Lazy<T> class in .NET exists for this purpose and makes thread-safe, lazy instantiation easy. Furthermore, you need to lock your collection for any operation you want to perform: Insert, Remove, Iterate, etc. are all not thread safe. Read about ConcurrentDictionary and the other thread-safe collections for more information.
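The Lazy<T> suggestion might look like this (a sketch; ConnectionManager is replaced by string to keep the example self-contained):

```csharp
using System;
using System.Collections.Generic;

public static class Global
{
    // The default LazyThreadSafetyMode (ExecutionAndPublication) guarantees
    // the factory runs exactly once, even if many threads race to it.
    private static readonly Lazy<List<string>> connections =
        new Lazy<List<string>>(() => new List<string>());

    public static List<string> ServerConnections
    {
        get { return connections.Value; }
    }
}
```

Note that Lazy<T> only makes the instantiation thread safe; every Insert, Remove, or iteration on ServerConnections still needs the lock shown in the question.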

Thread Safe Generic List block Enumeration

First, I would like to state that I have looked through probably a hundred Google and Stack Overflow results related to my question. I cannot find anything that answers my specific question.
I converted a DataTable into a List. I have multiple threads that enumerate the List with foreach. However, once every 5 minutes a master thread needs to refresh that List with the latest data. When this occurs, I need to block the other threads from reading the List until the master thread has fully updated it.
All the articles and questions I have found only block access around a single Add. I know I can write a lock for the update, but that lock also needs to be synchronized with all the other threads that enumerate. I do not want to update the List while other threads are in the middle of their own enumeration.
How can I write a lock that will be used by the foreach statements and also by my update function?
Thanks
EDIT: I want to block "Consumers / Observers" when the producer thread is producing. I do not want Consumers / Observers blocking each other.
Put all code that populates and enumerates the list in a lock block:
lock (theList)
{
    Update(theList);
}

lock (theList)
{
    foreach (var thingy in theList)
    {
        DoStuff(thingy);
    }
}
At first I thought you could try a BlockingCollection, which implements the producer/consumer pattern.
What you actually need is a way to set up a rendezvous between your reading threads and your master thread.
I think this can be done with monitors (especially by using Wait and Pulse methods). Try it and tell us what you found.
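One possible shape for that rendezvous with Monitor.Wait and Monitor.PulseAll (a sketch with names of my own, assuming exactly one writer thread as in the question): readers proceed concurrently and never block each other, but they all wait while the master thread refreshes.

```csharp
using System.Threading;

public class RefreshGate
{
    private readonly object sync = new object();
    private int activeReaders;
    private bool writerActive;

    public void EnterRead()
    {
        lock (sync)
        {
            while (writerActive)
                Monitor.Wait(sync);      // park until the refresh finishes
            activeReaders++;
        }
    }

    public void ExitRead()
    {
        lock (sync)
        {
            activeReaders--;
            if (activeReaders == 0)
                Monitor.PulseAll(sync);  // the writer may be waiting on this
        }
    }

    public void EnterWrite()
    {
        lock (sync)
        {
            writerActive = true;         // stop new readers from entering
            while (activeReaders > 0)
                Monitor.Wait(sync);      // drain in-flight enumerations
        }
    }

    public void ExitWrite()
    {
        lock (sync)
        {
            writerActive = false;
            Monitor.PulseAll(sync);      // release the parked readers
        }
    }
}
```

Readers wrap each foreach in EnterRead/ExitRead; the master thread wraps its refresh in EnterWrite/ExitWrite. ReaderWriterLockSlim gives the same reader/writer behaviour out of the box.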

List thread safe?

Can the following be considered thread safe, given the apparently atomic operations in the code?
My main concern is that if the list needs to be resized, it becomes non-thread-safe during the resizing.
List<int> list = new List<int>(10);

public List<int> GetList()
{
    var temp = list;
    list = new List<int>(10);
    return temp;
}

void TimerElapsed(int number)
{
    list.Add(number);
}
No. List<T> is explicitly documented not to be thread-safe:
It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it’s being read. To ensure thread safety, lock the collection during a read or write operation. To enable a collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization. For collections with built-in synchronization, see the classes in the System.Collections.Concurrent namespace. For an inherently thread–safe alternative, see the ImmutableList class.
Neither your code nor the List<T> are thread-safe.
The list isn't thread-safe according to its documentation. Your code is not thread safe because it lacks synchronization.
Consider two threads calling GetList concurrently. Let's say the first thread gets pre-empted right after setting up the temp. Now the second thread sets the temp of its own, replaces the list, and lets the GetList function run to completion. When the first thread gets to continue, it would return the same list that the second thread has just returned.
But that's not all! If a third thread has called TimerElapsed after the second thread has completed but before the first thread has completed, it would place a value in a list that is about to be overwritten without a trace. So not only would multiple threads return the same data, but also some of your data will disappear.
No, it is not thread safe.
Try using the types in the System.Collections.Concurrent namespace.
As already mentioned, a List<T> is not thread safe. You can look at alternatives in the Concurrent namespace, possibly using the ConcurrentBag, or there is an article here by Dean Chalk Fast Parallel ConcurrentList<T> Implementation.
It is not thread safe, since there can be a context switch after the first line of the GetList method that transfers control to the TimerElapsed method. This will produce inconsistent results in different scenarios. Also, as other users have already mentioned, the List class is not thread safe and you should use the System.Collections.Concurrent equivalent.
It is thread safe for reading only, not for writing.
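For comparison, a minimal sketch (names illustrative) of the same swap-and-add pair made safe with a single lock:

```csharp
using System.Collections.Generic;

class Collector
{
    private readonly object gate = new object();
    private List<int> list = new List<int>(10);

    // Atomically hand back the current list and start a fresh one,
    // so two callers can never receive the same list instance.
    public List<int> GetList()
    {
        lock (gate)
        {
            var temp = list;
            list = new List<int>(10);
            return temp;
        }
    }

    public void TimerElapsed(int number)
    {
        lock (gate)
        {
            list.Add(number); // can no longer land in a list being swapped out
        }
    }
}
```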

.net 4.0 Concurrent Collection's performance

If we have a ConcurrentBag<object> safeBag filled with 100 objects,
then one thread works as:
foreach (object o in safeBag)
{
    Thread.Sleep(1000);
}
the other thread starts right after the 1st thread starts:
{
    safeBag.Add(something); // or safeBag.TryTake(out something)
}
Will the 2nd thread wait for 100 seconds to access the collection?
Another question: if the 1st thread runs with Parallel.ForEach(), how will the threads work?
EDIT: The MSDN documentation says: "A List can support multiple readers concurrently, as long as the collection is not modified. Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration." Does enumerating through the ConcurrentBag cause the 2nd thread to wait when writing to the ConcurrentBag?
With most Concurrent* collections most operations are atomic but don't hold any long term locks. The first thread doesn't block the second thread after GetEnumerator() returns.
ConcurrentBag<T>.GetEnumerator Method
The enumeration represents a moment-in-time snapshot of the contents of the bag. It does not reflect any updates to the collection after GetEnumerator was called. The enumerator is safe to use concurrently with reads from and writes to the bag.
The second thread, assuming you're spawning two threads right in a row (the first using a ThreadStart pointing to the block containing the iteration, and the second pointing to that other code block), will not wait for 1000 ms. The foreach block simply waits 1 second before moving to the next object in the set; the second block is unaffected by that.
If it were a parallel foreach, you'd have several threads waiting for a second (concurrently) before moving to the next element. The second block would still not be waiting on the ConcurrentBag to become free.
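The snapshot behaviour is easy to demonstrate (a small illustrative example, not from the question):

```csharp
using System.Collections.Concurrent;

static class SnapshotDemo
{
    // Enumerates a 3-item bag while adding during the loop; returns
    // { itemsSeen, finalCount } to show the enumerator is a snapshot.
    public static int[] Run()
    {
        var bag = new ConcurrentBag<int>();
        for (int i = 0; i < 3; i++) bag.Add(i);

        int seen = 0;
        foreach (var item in bag)  // snapshot taken when GetEnumerator is called
        {
            bag.Add(99);           // safe: does not affect this enumeration
            seen++;
        }
        return new[] { seen, bag.Count }; // { 3, 6 }
    }
}
```

The adds are never blocked by the enumeration, and the enumeration never sees them.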

List with non-null elements ends up containing null. A synchronization issue?

First of all, sorry about the title -- I couldn't figure out one that was short and clear enough.
Here's the issue: I have a list List<MyClass> list to which I always add newly-created instances of MyClass, like this: list.Add(new MyClass()). I don't add elements any other way.
However, then I iterate over the list with foreach and find that there are some null entries. That is, the following code:
foreach (MyClass entry in list)
    if (entry == null)
        throw new Exception("null entry!");
will sometimes throw an exception.
I should point out that the list.Add(new MyClass()) calls are performed from different threads running concurrently. The only thing I can think of to account for the null entries is the concurrent access; List<> isn't thread-safe, after all. Though I still find it strange that it ends up containing null entries, instead of just not offering any guarantees on ordering.
Can you think of any other reason?
Also, I don't care in which order the items are added, and I don't want the calling threads to block waiting to add their items. If synchronization is truly the issue, can you recommend a simple way to call the Add method asynchronously, i.e., create a delegate that takes care of that while my thread keeps running its code? I know I can create a delegate for Add and call BeginInvoke on it. Does that seem appropriate?
Thanks.
EDIT: A simple solution based on Kevin's suggestion:
public class AsynchronousList<T> : List<T> {
    public delegate void AddDelegate(T item);

    private AddDelegate addDelegate;

    public AsynchronousList() {
        addDelegate = new AddDelegate(this.AddBlocking);
    }

    public void AddAsynchronous(T item) {
        addDelegate.BeginInvoke(item, null, null);
    }

    private void AddBlocking(T item) {
        lock (this) {
            Add(item);
        }
    }
}
I only need to control Add operations and I just need this for debugging (it won't be in the final product), so I just wanted a quick fix.
Thanks everyone for your answers.
List<T> can only support multiple readers concurrently. If you are going to use multiple threads to add to the list, you'll need to lock the object first. There is really no way around this, because without a lock you can still have someone reading from the list while another thread updates it (or multiple objects trying to update it concurrently also).
http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx
Your best bet probably is to encapsulate the list in another object, and have that object handle the locking and unlocking actions on the internal list. That way you could make your new object's "Add" method asynchronous and let the calling objects go on their merry way. Any time you read from it, though, you'll most likely still have to wait on other objects finishing their updates.
The only thing I can think of to account for the null entries is the concurrent accesses. List<> isn't thread-safe, after all.
That's basically it. We are specifically told it's not thread-safe, so we shouldn't be surprised that concurrent access results in contract-breaking behaviour.
As to why this specific problem occurs, we can but speculate, since List<>'s private implementation is, well, private (I know we have Reflector and Shared Source - but in principle it is private). Suppose the implementation involves an array and a 'last populated index'. Suppose also that 'Add an item' looks like this:
Ensure the array is big enough for another item
last populated index <- last populated index + 1
array[last populated index] = incoming item
Now suppose there are two threads calling Add. If the interleaved sequence of operations ends up like this:
Thread A : last populated index <- last populated index + 1
Thread B : last populated index <- last populated index + 1
Thread A : array[last populated index] = incoming item
Thread B : array[last populated index] = incoming item
then not only will there be a null in the array, but also the item that thread A was trying to add won't be in the array at all!
Now, I don't know for sure how List<> does its stuff internally. I have half a memory that it is implemented like ArrayList, which internally uses this scheme; but in fact it doesn't matter. I suspect that any list mechanism that expects to be run non-concurrently can be made to break with concurrent access and a sufficiently 'unlucky' interleaving of operations. If we want thread-safety from an API that doesn't provide it, we have to do some work ourselves - or at least, we shouldn't be surprised if the API sometimes breaks its contract when we don't.
For your requirement of
I don't want the calling threads to block waiting to add their item
my first thought is a Multiple-Producer-Single-Consumer queue, wherein the threads wanting to add items are the producers, which dispatch items to the queue async, and there is a single consumer which takes items off the queue and adds them to the list with appropriate locking. My second thought is that this feels as if it would be heavier than this situation warrants, so I'll let it mull for a bit.
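That Multiple-Producer-Single-Consumer shape could be sketched with BlockingCollection<T> (my choice of type; the paragraph above doesn't name one):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class MpscDemo
{
    // 4 producers push 1000 items each into a queue; one consumer moves
    // them into a plain List<int>, which only the consumer ever touches.
    public static List<int> Run()
    {
        var queue = new BlockingCollection<int>();
        var list = new List<int>();

        // Single consumer: the only thread that mutates the list.
        var consumer = Task.Run(() =>
        {
            foreach (var item in queue.GetConsumingEnumerable())
                list.Add(item);   // single thread, so no lock needed here
        });

        // Multiple producers: they never touch the list directly, so they
        // cannot corrupt it, and Add on the queue never blocks them for long.
        var producers = new Task[4];
        for (int p = 0; p < producers.Length; p++)
            producers[p] = Task.Run(() =>
            {
                for (int i = 0; i < 1000; i++)
                    queue.Add(i);
            });

        Task.WaitAll(producers);
        queue.CompleteAdding();   // lets GetConsumingEnumerable terminate
        consumer.Wait();
        return list;              // 4000 items; null entries are impossible
    }
}
```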
If you're using .NET Framework 4, you might check out the new Concurrent Collections. When it comes to threading, it's better not to try to be clever, as it's extremely easy to get it wrong. Synchronization can impact performance, but the effects of getting threading wrong can also result in strange, infrequent errors that are a royal pain to track down.
If you're still using Framework 2 or 3.5 for this project, I recommend simply wrapping your calls to the list in a lock statement. If you're concerned about performance of Add (are you performing some long-running operation using the list somewhere else?) then you can always make a copy of the list within a lock and use that copy for your long-running operation outside the lock. Simply blocking on the Adds themselves shouldn't be a performance issue, unless you have a very large number of threads. If that's the case, you can try the Multiple-Producer-Single-Consumer queue that AakashM recommended.
