According to a discussion I found somewhere on the internet, replacing some types of collections while enumerating them is supposed to be possible/thread safe.
My tests below seem to confirm that.
// This test was confirmed insufficient by the comments below
var a = new List<int> { 1, 2, 3 };
Parallel.For(1, 10000, i => {
    foreach (var x in a)
        Console.WriteLine(i + x);
});
Parallel.For(1, 10000, i => a = new List<int> { 1, 2, 3, 4 });
I would however very much like to read some official documentation or a concrete reference confirming this before I start relying on it in my code.
Can someone verify this/post a link?
As has already been mentioned, you are not in fact mutating a while you're iterating it. You iterate it repeatedly, and then, after all of that iterating has finished, you mutate a repeatedly, because Parallel.For blocks until it has finished executing all of its iterations.
But even if you were mutating a in parallel with the iterations here, it would in fact be perfectly safe. The foreach reads the value of a once, at the very start, to get a reference to a list, and from that point forward it never looks at a again. It works off a local copy of the reference it got from a, so it won't know or care what changes are made to the variable a after that point. So if one thread is pointing a at a new list while another thread is iterating, you don't know whether the list being iterated is the one a held before or after the change, but you do know that the list being iterated must be one list or the other in its entirety, and not some error or mix of the two.
Now, if you were mutating the list that a references, rather than reassigning the variable a to point to a different list, that would be entirely different. List is not designed to be accessed from multiple threads at the same time, so all sorts of bad things would happen. If you used a collection specifically designed to be accessed from multiple threads, and you used it in the way it was designed to be used, then it could function properly.
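To make the distinction concrete, here is a minimal sketch; the Task.Run plumbing and the ConcurrentQueue at the end are illustrative, not from the question:

var a = new List<int> { 1, 2, 3 };

// Safe: one thread reassigns the variable while another enumerates
// whichever list instance it captured; no list instance is ever mutated.
var reader = Task.Run(() =>
{
    foreach (var x in a)   // reads the reference once, then never looks at a again
        Console.WriteLine(x);
});
var writer = Task.Run(() => a = new List<int> { 1, 2, 3, 4 });
Task.WaitAll(reader, writer);

// Unsafe: mutating the single list instance another thread is enumerating.
// Task.Run(() => a.Add(4));   // may throw InvalidOperationException, or worse

// Safe again: a collection designed for concurrent access.
var queue = new ConcurrentQueue<int>(a);
var enqueuer = Task.Run(() => queue.Enqueue(4));
foreach (var x in queue)       // ConcurrentQueue enumeration is thread safe
    Console.WriteLine(x);
enqueuer.Wait();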
Just to add to Servy's answer and what has been said in the comments, what you have isn't really an illustration of modifying the variable in parallel while iterating over it. Your Parallel.For loops run sequentially - i.e. first you iterate over the list 10000 times (possibly in parallel), then you replace it with a new list 10000 times (again, possibly in parallel).
// This doesn't modify or replace the collection at all, it just iterates over it a bunch of times
Parallel.For(1, 10000, i => {
    foreach (var x in a)
        Console.WriteLine(i + x);
});

// This happens AFTER the previous Parallel.For loop completes.
// Thus, you're not actually iterating over the list at this point, just replacing it a bunch of times
Parallel.For(1, 10000, i => a = new List<int> { 1, 2, 3, 4 });
Note that I said possibly in parallel - simply putting something in a Parallel.For loop doesn't guarantee that the framework will actually use multiple threads to accomplish the task, and you can't predict "in advance" how many threads it'll use if it does. Point being that this code doesn't even necessarily prove that these tasks are running on multiple threads (or how many they're running on if they are).
One other flaw in this test: you're replacing the list with an identical collection every time, so you can't really tell which thread did the final update after the loop is done. Let's say that it uses 3 different threads to execute this: A, B, and C. How do you know which one made the last update to the collection? Recall that a Parallel.For loop is not guaranteed to execute its iterations in order, so the last update could have come from any of the three. From the documentation (emphasis mine):
The syntax of a parallel loop is very similar to the for and foreach loops you already know, but the parallel loop runs faster on a computer that has available cores. Another difference is that, unlike a sequential loop, the order of execution isn't defined for a parallel loop. Steps often take place at the same time, in parallel. Sometimes, two steps take place in the opposite order than they would if the loop were sequential. The only guarantee is that all of the loop's iterations will have run by the time the loop finishes.
Basically, then, with a Parallel.For loop you have no idea "in advance" the degree of parallelism, whether it uses parallelism at all, or even which order the steps will execute in (so using this construct necessarily entails giving up considerable control of how the code is actually executed).
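If you want to observe this for yourself, one quick (and deliberately unscientific) check is to record which managed thread runs each iteration:

var threadsSeen = new ConcurrentDictionary<int, byte>();
Parallel.For(0, 1000, i =>
{
    // record the managed thread id that ran this iteration
    threadsSeen.TryAdd(Thread.CurrentThread.ManagedThreadId, 0);
});
// the count typically varies between runs and between machines
Console.WriteLine("Distinct threads used: " + threadsSeen.Count);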
Related
So I just can't grasp the concept here.
I have a Method that uses the Parallel class with the Foreach method.
But the thing I don't understand is, does it create new threads so it can run the function faster?
Let's take this as an example.
I do a normal foreach loop.
private static void DoSimpleWork()
{
    foreach (var item in collection)
    {
        //DoWork();
    }
}
What that will do is take the first item in the list, run DoWork() on it, and wait until it finishes before moving on to the next item. Simple, plain, and it works.
Now... there are three cases I am curious about.
If I do this.
Parallel.ForEach(stringList, simpleString =>
{
    DoMagic(simpleString);
});
Will that split up the Foreach into let's say 4 chunks?
So what I think is happening is that it takes the first 4 lines in the list, assigns each string to each "thread" (assuming parallel creates 4 virtual threads) does the work and then starts with the next 4 in that list?
If that is wrong please correct me I really want to understand how this works.
And then we have this.
Which essentially is the same but with a new parameter
Parallel.ForEach(stringList, new ParallelOptions() { MaxDegreeOfParallelism = 32 }, simpleString =>
{
    DoMagic(simpleString);
});
What I am curious about is this
new ParallelOptions() { MaxDegreeOfParallelism = 32 }
Does that mean it will take the first 32 strings from that list (if there even are that many in the list) and then do the same thing as I was talking about above?
And for the last one.
Task.Factory.StartNew(() =>
{
    Parallel.ForEach(stringList, simpleString =>
    {
        DoMagic(simpleString);
    });
});
Would that create a new task, assigning each "chunk" to its own task?
Do not mix async code with parallel code. Task is for async operations: querying a DB, reading a file, awaiting some comparatively cheap operation so that your UI won't be blocked and unresponsive.
Parallel is different. It's designed for 1) multi-core systems and 2) computation-intensive operations. I won't go into the details of how it works; that kind of info can be found in the MS documentation. Long story short, Parallel.For will most probably make its own decisions about what exactly to run, when, and how. Your parameters are treated as limits rather than targets; MaxDegreeOfParallelism, for example, is only an upper bound, and fewer threads may actually be used. The whole idea is to provide the best possible parallelization and thus complete your operation as fast as possible.
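A rough sketch of the split; the method names here are invented for illustration, and the usual System.IO and System.Security.Cryptography usings are assumed:

// Async (Task/await): I/O-bound work; the thread is freed while waiting.
static async Task<string> ReadConfigAsync(string path)
{
    using (var reader = new StreamReader(path))
        return await reader.ReadToEndAsync();
}

// Parallel: CPU-bound work; the goal is to keep every core busy.
static void HashAll(byte[][] blocks)
{
    Parallel.ForEach(blocks, block =>
    {
        using (var sha = SHA256.Create())
            sha.ComputeHash(block);   // compute-heavy, scales with cores
    });
}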
Parallel.ForEach performs the equivalent of a C# foreach loop, but with iterations potentially executing in parallel instead of sequentially. There is no guaranteed ordering; whether an iteration runs concurrently depends on whether the scheduler can find an available thread to execute it.
MaxDegreeOfParallelism
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used by the application.
You do not need to modify this parameter in general but may choose to change it in advanced scenarios:
- When you know that a particular algorithm you're using won't scale beyond a certain number of cores. You can set the property to avoid wasting cycles on additional cores.
- When you're running multiple algorithms concurrently and want to manually define how much of the system each algorithm can utilize (see the sketch after this list).
- When the thread pool's heuristics are unable to determine the right number of threads to use and could end up injecting too many threads. For example, in long-running loop body iterations, the thread pool might not be able to tell the difference between reasonable progress and a livelock or deadlock, and might not be able to reclaim threads that were added to improve performance. You can set the property to ensure that you don't use more than a reasonable number of threads.
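For the second scenario, a sketch of how two loops might share a machine; the item sources and the Process* methods are hypothetical:

var mostCores = new ParallelOptions { MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1) };
var oneCore = new ParallelOptions { MaxDegreeOfParallelism = 1 };

// run both algorithms at once, with an explicit split of the cores
var urgent = Task.Run(() =>
    Parallel.ForEach(urgentItems, mostCores, item => ProcessUrgent(item)));
var background = Task.Run(() =>
    Parallel.ForEach(backgroundItems, oneCore, item => ProcessBackground(item)));
Task.WaitAll(urgent, background);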
Task.Factory.StartNew is usually used when you require fine-grained control over a long-running, compute-bound task; and as #Сергей Боголюбов mentioned, do not mix the two up.
It creates a new task, and that task runs the Parallel.ForEach loop; the loop itself still draws its worker threads from the thread pool.
You may find this ebook useful: http://www.albahari.com/threading/#_Introduction
does the work and then starts with the next 4 in that list?
This depends on your machine's hardware and how busy the machine's cores are with other processes/apps your CPU is working on
Does that mean it will take the first 32 strings from that list (if there even are that many in the list) and then do the same thing as I was talking about above?
No, there is no guarantee that it will take the first 32; it could be fewer. It will vary each time you execute the same code.
Task.Factory.StartNew creates a single new task; it will not create a new task for each chunk as you expect.
Putting a Parallel.ForEach inside a new Task will not help you further reduce the time taken for the parallel tasks themselves.
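To make that concrete, a sketch using the names from the question; the outer task only moves the blocking call off the caller's thread (useful, say, to keep a UI responsive):

// Parallel.ForEach blocks until every item has been processed;
// wrapping it in a task keeps that blocking off the calling thread.
var work = Task.Factory.StartNew(() =>
{
    Parallel.ForEach(stringList, simpleString =>
    {
        DoMagic(simpleString);
    });
});
// ...the caller is free to continue, and can Wait() (or await) later:
work.Wait();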
I have a method which reads a text file containing an int value per line. To make reading faster I used Parallel.ForEach, but the behaviour I am seeing is unexpected. I have 800 lines in the file, but every time I run this method it returns a different count for the HashSet. From what I have read, Parallel.ForEach spawns multiple threads and returns the result when all threads have completed their work, but my code's behaviour contradicts that. Or am I missing something important here?
Here is my method:
private HashSet<int> GetKeyItemsProcessed()
{
    HashSet<int> keyItems = new HashSet<int>();
    if (!File.Exists(TrackingFilePath))
        return keyItems;

    // normal foreach works fine
    //foreach (var keyItem in File.ReadAllLines(TrackingFilePath))
    //{
    //    keyItems.Add(int.Parse(keyItem));
    //}

    // this does not return the right number of HashSet rows
    Parallel.ForEach(File.ReadAllLines(TrackingFilePath).AsParallel(), keyItem =>
    {
        keyItems.Add(int.Parse(keyItem));
    });

    return keyItems;
}
HashSet.Add is NOT thread safe.
From MSDN:
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
The unpredictability of multithreaded timing could be, and seems to be, causing issues.
You could wrap the access in a synchronization construct, which is sometimes faster than a concurrent collection, but may not speed anything up in some cases. As others have mentioned, another option is to use a thread-safe collection like ConcurrentDictionary or ConcurrentQueue, though those may have additional memory overhead.
Be sure to benchmark any results you get with regards to timing. Raw single-threaded access can sometimes be faster than dealing with the overhead of threading; it may not be worth it at all to thread this code.
The final word, though, is that HashSet alone, without synchronization, is simply unacceptable for multi-threaded operations.
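For illustration, a minimal lock-based version of the method from the question; whether the parallelism actually pays for itself on an 800-line file is exactly the kind of thing to benchmark:

private HashSet<int> GetKeyItemsProcessed()
{
    HashSet<int> keyItems = new HashSet<int>();
    if (!File.Exists(TrackingFilePath))
        return keyItems;

    object gate = new object();
    // note: the redundant .AsParallel() is dropped; Parallel.ForEach
    // does its own partitioning of the input
    Parallel.ForEach(File.ReadAllLines(TrackingFilePath), keyItem =>
    {
        int value = int.Parse(keyItem);   // parse outside the lock
        lock (gate)
        {
            keyItems.Add(value);          // only the shared Add is serialized
        }
    });
    return keyItems;
}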
I try to optimize code with parallel execution, but sometimes only one thread gets all the heavy load. The following example shows how 40 tasks should be performed in at most 4 threads, where the first ten are more time-consuming than the others.
Parallel.ForEach seems to split the array into 4 parts and lets one thread handle each part, so the entire execution takes about 10 seconds. It should be able to complete within at most about 3.3 seconds!
Is there a way to use all threads all the way, since in my real problem it isn't known in advance which tasks are time-consuming?
var array = System.Linq.Enumerable.Range(0, 40).ToArray();
System.Threading.Tasks.Parallel.ForEach(array,
    new System.Threading.Tasks.ParallelOptions() { MaxDegreeOfParallelism = 4 },
    i =>
    {
        Console.WriteLine("Running index {0,3} : {1}", i, DateTime.Now.ToString("HH:mm:ss.fff"));
        System.Threading.Thread.Sleep(i < 10 ? 1000 : 10);
    });
It would be possible with Parallel.ForEach, but you'd need to use a custom partitioner (or find a 3rd party partitioner) that would be able to partition the elements more sensibly based on your particular items. (Or just use much smaller batches.)
This is also assuming that you don't strictly know in advance which items are going to be fast and which are slow; if you did, you could re-order the items yourself before calling ForEach so that the expensive items are more spread out. That may or may not be sufficient, depending on the circumstances.
In general I prefer to solve these problems by simply having one producer and multiple consumers, each of which handle one item at a time, rather than batches. The BlockingCollection class makes these situations rather straightforward. Just add all of the items to the collection, create N tasks/threads/etc., each of which grab an item and process it until there are no more items. It doesn't give you the dynamic adding/removing of threads that Parallel.ForEach gives you, but that doesn't seem to be an issue in your case.
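A rough sketch of that producer/consumer shape, reusing the workload from the question (the usual System.Collections.Concurrent, System.Linq, and System.Threading usings are assumed; the consumer count of 4 mirrors the MaxDegreeOfParallelism above):

// all items go into the collection up front; CompleteAdding tells
// the consumers that no more work is coming
var queue = new BlockingCollection<int>();
foreach (var i in Enumerable.Range(0, 40))
    queue.Add(i);
queue.CompleteAdding();

// four consumers, each pulling one item at a time, so a slow item
// never holds a pre-assigned batch hostage
var consumers = Enumerable.Range(0, 4)
    .Select(_ => Task.Run(() =>
    {
        foreach (var i in queue.GetConsumingEnumerable())
            Thread.Sleep(i < 10 ? 1000 : 10);   // stand-in for the real work
    }))
    .ToArray();
Task.WaitAll(consumers);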
Using a custom partitioner is the right solution to modify the behavior of Parallel.ForEach(). If you're on .Net 4.5, there is an overload of Partitioner.Create() that you can use. With it, your code would look like this:
var partitioner = Partitioner.Create(
array, EnumerablePartitionerOptions.NoBuffering);
Parallel.ForEach(
partitioner, new ParallelOptions { MaxDegreeOfParallelism = 4, }, i => …);
This is not the default, because turning off buffering increases the overhead of Parallel.ForEach(). But if your iterations are really that long (seconds), that additional overhead shouldn't be noticeable.
This is due to a feature called the partitioner. By default your loop is divided among your available threads equally. It sounds like you want to change this behavior. The reasoning behind the current behavior is that it takes a certain amount of overhead time to set up a thread, so you want to do as much work as is reasonable on it. Therefore the collection is partitioned into blocks and sent to each thread. The system has no way to know that parts of the collection take longer than others (unless you explicitly tell it) and assumes that an equal division leads to a roughly equal completion time. In your case you may want to split out the tasks that take longer and run them in a different way. Or you may wish to provide a custom partitioner which traverses the collection in a non-sequential manner.
You might want to use the Microsoft TPL Dataflow library, which helps in designing highly concurrent systems.
Your code is roughly equivalent to the following one using this library:
var options = new ExecutionDataflowBlockOptions {
    MaxDegreeOfParallelism = 4,
    SingleProducerConstrained = true
};

var actionBlock = new ActionBlock<int>(i => {
    Console.WriteLine("Running index {0,3} : {1}", i, DateTime.Now.ToString("HH:mm:ss.fff"));
    System.Threading.Thread.Sleep(i < 10 ? 1000 : 10);
}, options);

Task.WhenAll(Enumerable.Range(0, 40).Select(actionBlock.SendAsync)).Wait();
actionBlock.Complete();
actionBlock.Completion.Wait();
TPL Dataflow will use 4 consumers in this scenario, processing a new value as soon as one of the consumers is available, thus maximizing throughput.
Once you're used to the library, you might want to add more asynchrony to your system by using the various blocks provided by the library, and removing all those awful Wait calls.
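For instance, inside an async method, the tail of the snippet above could feed the block and await it instead of blocking; a sketch, not a drop-in replacement:

// inside an async method: feed the block and await completion
// instead of calling Wait()
foreach (var i in Enumerable.Range(0, 40))
    await actionBlock.SendAsync(i);
actionBlock.Complete();
await actionBlock.Completion;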
I've already read previous questions here about ConcurrentBag but did not find an actual sample of implementation in multi-threading.
"ConcurrentBag is a thread-safe bag implementation, optimized for scenarios where the same thread will be both producing and consuming data stored in the bag."
This is the current usage in my code (simplified, not the actual code):
private void MyMethod()
{
    List<Product> products = GetAllProducts(); // Get list of products
    ConcurrentBag<Product> myBag = new ConcurrentBag<Product>();

    // products were simply added here in the ConcurrentBag to simplify the code
    // actual code processes each product before adding it to the bag
    Parallel.ForEach(
        products,
        new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
        product => myBag.Add(product));

    ProcessBag(myBag); // method to process each item in the ConcurrentBag
}
My questions:
Is this the right usage of ConcurrentBag? Is it ok to use ConcurrentBag in this kind of scenario?
For me I think a simple List<Product> and a manual lock will do better. The reason for this is that the scenario above already breaks the "same thread will be both producing and consuming data stored in the bag" rule.
I also found out that the ThreadLocal storage created for each thread in the parallel loop will still exist after the operation (even if the thread is reused; is this right?), which may cause an undesired memory leak.
Am I right on this one, guys? Or is a simple clear or empty method to remove the items from the ConcurrentBag enough?
This looks like an ok use of ConcurrentBag. The thread local variables are members of the bag, and will become eligible for garbage collection at the same time the bag is (clearing the contents won't release them). You are right that a simple List with a lock would suffice for your case. If the work you are doing in the loop is at all significant, the type of thread synchronization won't matter much to the overall performance. In that case, you might be more comfortable using what you are familiar with.
Another option would be to use ParallelEnumerable.Select, which matches what you are trying to do more closely. Again, any performance difference you are going to see is likely going to be negligible and there's nothing wrong with sticking with what you know.
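A sketch of that alternative; Process here is a placeholder for whatever per-product work the real code does:

// PLINQ collects the results itself, so no shared collection
// (and no explicit locking) is needed
var results = products
    .AsParallel()
    .WithDegreeOfParallelism(Environment.ProcessorCount)
    .Select(product => Process(product))   // Process is hypothetical
    .ToList();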
As always, if the performance of this is critical there's no substitute for trying it and measuring.
It seems to me that bmm6o's answer is not correct. The ConcurrentBag instance internally contains mini-bags for each thread that adds items to it, so item insertion does not involve any thread locks, and thus all Environment.ProcessorCount threads may get into full swing without being stuck waiting and without any thread context switches. Thread synchronization may be required when iterating over the collected items, but again, in the original example the iteration is done by a single thread after all insertions are done. Moreover, if the ConcurrentBag uses Interlocked techniques as the first layer of thread synchronization, then it is possible not to involve Monitor operations at all.
On the other hand, using a plain List<T> instance and wrapping each of its Add() calls in a lock will hurt performance a lot. First, due to the constant Monitor.Enter() and Monitor.Exit() calls, each of which may require dropping into kernel mode and working with Windows synchronization primitives. Second, one thread may occasionally be blocked by another thread that has not finished its addition yet.
As for me, the code above is a really good example of the right usage of the ConcurrentBag class.
Is this the right usage of ConcurrentBag? Is it ok to use ConcurrentBag in this kind of scenario?
No, for multiple reasons:
This is not the intended usage scenario for this collection. The ConcurrentBag<T> is intended for mixed producer-consumer scenarios, meaning that each thread is expected to add and take items from the bag. Your scenario is nothing like this. You have many threads that add items, and zero threads that take items. The main application for the ConcurrentBag<T> is for making object-pools (pools of reusable objects that are expensive to create or destroy). And given the availability of the ObjectPool<T> class in the Microsoft.Extensions.ObjectPool package, even this niche application for this collection is contested.
It doesn't preserve the insertion order. Even if preserving the insertion order is not important, getting shuffled output makes debugging more difficult.
It creates garbage that has to be collected by the GC. It creates one WorkStealingQueue (an internal class) per thread, each containing an expandable array, so the more threads you have the more objects you allocate. Also, each time it is enumerated it copies all the items into an array, and its GetEnumerator() method returns an IEnumerator<T> that is allocated on each foreach.
There are better options available, offering both better performance and better ordering behavior.
In your scenario you can store the results of the parallel execution in a simple array. Just create an array with length equal to products.Count, switch from Parallel.ForEach to Parallel.For, and assign the result directly to the corresponding slot of the results array without doing any synchronization at all:
List<Product> products = GetAllProducts(); // Get list of products
Product[] results = new Product[products.Count];

Parallel.For(0, products.Count,
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    i => results[i] = products[i]);

ProcessResults(results);
This way you'll get the results with perfect ordering, stored in a container that has the most compact size and the fastest enumeration of all .NET collections, doing only a single object allocation.
In case you are concerned about the thread-safety of the above operation, there is nothing to worry about. Each thread writes on different slots in the results array. After the completion of the parallel execution the current thread has full visibility of all the values that are stored in the array, because the TPL includes the appropriate barriers when tasks are queued, and at the beginning/end of task execution (citation).
(I have posted more thoughts about the ConcurrentBag<T> in this answer.)
If List<T> is used with a lock around the Add() method, it will make threads wait and will reduce the performance gain of using Parallel.ForEach().
First of all, sorry about the title -- I couldn't figure out one that was short and clear enough.
Here's the issue: I have a list List<MyClass> list to which I always add newly-created instances of MyClass, like this: list.Add(new MyClass()). I don't add elements any other way.
However, then I iterate over the list with foreach and find that there are some null entries. That is, the following code:
foreach (MyClass entry in list)
    if (entry == null)
        throw new Exception("null entry!");
will sometimes throw an exception.
I should point out that the list.Add(new MyClass()) are performed from different threads running concurrently. The only thing I can think of to account for the null entries is the concurrent accesses. List<> isn't thread-safe, after all. Though I still find it strange that it ends up containing null entries, instead of just not offering any guarantees on ordering.
Can you think of any other reason?
Also, I don't care in which order the items are added, and I don't want the calling threads to block waiting to add their items. If synchronization is truly the issue, can you recommend a simple way to call the Add method asynchronously, i.e., create a delegate that takes care of that while my thread keeps running its code? I know I can create a delegate for Add and call BeginInvoke on it. Does that seem appropriate?
Thanks.
EDIT: A simple solution based on Kevin's suggestion:
public class AsynchronousList<T> : List<T>
{
    private AddDelegate addDelegate;

    public delegate void AddDelegate(T item);

    public AsynchronousList()
    {
        addDelegate = new AddDelegate(this.AddBlocking);
    }

    public void AddAsynchronous(T item)
    {
        addDelegate.BeginInvoke(item, null, null);
    }

    private void AddBlocking(T item)
    {
        lock (this)
        {
            Add(item);
        }
    }
}
I only need to control Add operations and I just need this for debugging (it won't be in the final product), so I just wanted a quick fix.
Thanks everyone for your answers.
List<T> can only support multiple readers concurrently. If you are going to use multiple threads to add to the list, you'll need to lock the object first. There is really no way around this, because without a lock you can still have someone reading from the list while another thread updates it (or multiple threads trying to update it concurrently).
http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx
Your best bet probably is to encapsulate the list in another object, and have that object handle the locking and unlocking actions on the internal list. That way you could make your new object's "Add" method asynchronous and let the calling objects go on their merry way. Any time you read from it, though, you'll most likely still have to wait on any other objects finishing their updates.
The only thing I can think of to account for the null entries is the concurrent accesses. List<> isn't thread-safe, after all.
That's basically it. We are specifically told it's not thread-safe, so we shouldn't be surprised that concurrent access results in contract-breaking behaviour.
As to why this specific problem occurs, we can but speculate, since List<>'s private implementation is, well, private (I know we have Reflector and Shared Source - but in principle it is private). Suppose the implementation involves an array and a 'last populated index'. Suppose also that 'Add an item' looks like this:
Ensure the array is big enough for another item
last populated index <- last populated index + 1
array[last populated index] = incoming item
Now suppose there are two threads calling Add. If the interleaved sequence of operations ends up like this:
Thread A : last populated index <- last populated index + 1
Thread B : last populated index <- last populated index + 1
Thread A : array[last populated index] = incoming item
Thread B : array[last populated index] = incoming item
then not only will there be a null in the array, but also the item that thread A was trying to add won't be in the array at all!
Now, I don't know for sure how List<> does its stuff internally. I have half a memory that it is backed by an array, like ArrayList, which internally uses this scheme; but in fact it doesn't matter. I suspect that any list mechanism that expects to be run non-concurrently can be made to break with concurrent access and a sufficiently 'unlucky' interleaving of operations. If we want thread-safety from an API that doesn't provide it, we have to do some work ourselves; or at least, we shouldn't be surprised if the API sometimes breaks its contract when we don't.
For your requirement of
I don't want the calling threads to block waiting to add their item
my first thought is a Multiple-Producer-Single-Consumer queue, wherein the threads wanting to add items are the producers, which dispatch items to the queue async, and there is a single consumer which takes items off the queue and adds them to the list with appropriate locking. My second thought is that this feels as if it would be heavier than this situation warrants, so I'll let it mull for a bit.
If you're using .NET Framework 4, you might check out the new Concurrent Collections. When it comes to threading, it's better not to try to be clever, as it's extremely easy to get it wrong. Synchronization can impact performance, but the effects of getting threading wrong can also result in strange, infrequent errors that are a royal pain to track down.
If you're still using Framework 2 or 3.5 for this project, I recommend simply wrapping your calls to the list in a lock statement. If you're concerned about the performance of Add (are you performing some long-running operation using the list somewhere else?) then you can always make a copy of the list within a lock and use that copy for your long-running operation outside the lock. Simply blocking on the Adds themselves shouldn't be a performance issue unless you have a very large number of threads. If that's the case, you can try the Multiple-Producer-Single-Consumer queue that AakashM recommended.
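The copy-then-work pattern might look roughly like this (MyClass is from the question; the wrapper shape is just an illustration):

private readonly object listLock = new object();
private readonly List<MyClass> list = new List<MyClass>();

public void AddItem(MyClass item)
{
    lock (listLock)
    {
        list.Add(item);   // writers only block each other briefly
    }
}

public void DoLongRunningWork()
{
    List<MyClass> snapshot;
    lock (listLock)
    {
        snapshot = new List<MyClass>(list);   // quick copy under the lock
    }
    foreach (MyClass entry in snapshot)
    {
        // the long-running work happens on the snapshot, outside the lock,
        // so it never sees a half-updated list and never blocks the writers
    }
}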