I want to know how many threads will be used when I run a Parallel.For/ForEach loop.
I found that it can be changed with the MaxDegreeOfParallelism option.
The MaxDegreeOfParallelism help on MSDN says (link):
By default, For and ForEach will utilize however many threads the
underlying scheduler provides, so changing MaxDegreeOfParallelism from
the default only limits how many concurrent tasks will be used.
But I don't know how many threads the underlying scheduler provides.
How can I find that out?
I could test it with a loop of 9999999 iterations, but that test would only show me a number, not the rule that determines it.
Edit/added later:
I googled for "scheduler max concurrency" and found (on MSDN - link) that the TaskScheduler class has a MaximumConcurrencyLevel property, and:
Returns an integer that represents the maximum concurrency level. The
default scheduler returns Int32.MaxValue.
Is that TaskScheduler class used as the "underlying scheduler" for these parallel loops?
According to MSDN:
The default scheduler for Task Parallel Library and PLINQ uses the .NET Framework ThreadPool to queue and execute work. In the .NET Framework 4, the ThreadPool uses the information that is provided by the System.Threading.Tasks.Task type to efficiently support the fine-grained parallelism (short-lived units of work) that parallel tasks and queries often represent.
Looking at the documentation of ThreadPool, it says:
There is one thread pool per process. Beginning with the .NET Framework 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads. The number of threads in the thread pool can be changed by using the SetMaxThreads method.
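The three numbers discussed above can be printed side by side. This is a minimal sketch, not a definitive diagnostic: the class name SchedulerLimits and the helper MaxWorkerThreads are made up for illustration, and the values printed depend on the machine and runtime.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SchedulerLimits
{
    // Hypothetical helper: returns the thread pool's current upper bound
    // on worker threads, as reported by ThreadPool.GetMaxThreads.
    public static int MaxWorkerThreads()
    {
        int workerThreads, completionPortThreads;
        ThreadPool.GetMaxThreads(out workerThreads, out completionPortThreads);
        return workerThreads;
    }

    public static void Main()
    {
        // Number of logical processors visible to this process.
        Console.WriteLine("Logical processors: " + Environment.ProcessorCount);

        // The quoted documentation says the default scheduler reports Int32.MaxValue here.
        Console.WriteLine("Scheduler MaximumConcurrencyLevel: "
            + TaskScheduler.Default.MaximumConcurrencyLevel);

        // The practical ceiling comes from the thread pool instead.
        Console.WriteLine("Thread pool max worker threads: " + MaxWorkerThreads());
    }
}
```

The scheduler itself imposes no limit, so the number of threads a loop actually gets is decided by the thread pool at run time rather than by a fixed rule.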
Related
I'm doing heavy mathematical computations using Math.Net Numerics in parallel inside a Parallel.For block.
When I run the code on my local system with 4 cores (2*2), it uses all 4 cores.
But when I run the same code on our dev server with 8 cores (4*2), it uses only 4 cores.
I've tried setting MaxDegreeOfParallelism, but it didn't help.
Any idea why not all cores are being utilised?
Below is sample code.
Parallel.For(0, 10000, i =>
{
    // heavy math computations using matrices
});
From MSDN
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.
The way I read the documentation: if the underlying scheduler only offers a single thread, then setting MaxDegreeOfParallelism > 1 will still result in a single thread.
Parallelization is done at run time, based on the current conditions and a lot of other circumstances. You cannot force .NET to use all the cores (in managed code at least).
From MSDN:
Conversely, by default, the Parallel.ForEach and Parallel.For methods can use a variable number of tasks. That's why, for example, the ParallelOptions class has a MaxDegreeOfParallelism property instead of a "MinDegreeOfParallelism" property. The idea is that the system can use fewer threads than requested to process a loop.
The .NET thread pool adapts dynamically to changing workloads by allowing the number of worker threads for parallel tasks to change over time. At run time, the system observes whether increasing the number of threads improves or degrades overall throughput and adjusts the number of worker threads accordingly.
Be careful if you use parallel loops with individual steps that take several seconds or more. This can occur with I/O-bound workloads as well as lengthy calculations. If the loops take a long time, you may experience an unbounded growth of worker threads due to a heuristic for preventing thread starvation that's used by the .NET ThreadPool class's thread injection logic.
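To see that MaxDegreeOfParallelism is an upper bound rather than a request, one option is to measure how many iterations actually overlap. The following is a rough sketch: DegreeOfParallelismDemo and ObservePeakConcurrency are hypothetical names, and Thread.SpinWait merely stands in for the real matrix computation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DegreeOfParallelismDemo
{
    // Runs a CPU-bound loop and returns the highest number of iterations
    // that were observed executing at the same moment.
    public static int ObservePeakConcurrency(int maxDegree, int iterations)
    {
        int current = 0;
        int peak = 0;
        object gate = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.For(0, iterations, options, i =>
        {
            lock (gate)
            {
                current++;
                if (current > peak) peak = current;
            }

            Thread.SpinWait(200000); // stand-in for the real computation

            lock (gate)
            {
                current--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        int peak = ObservePeakConcurrency(cores, 1000);
        Console.WriteLine("Requested at most " + cores + ", observed peak of " + peak);
    }
}
```

The observed peak never exceeds the requested value, but it can be lower, which matches the documentation quoted above. If the peak stays below the core count on the server, the limit is probably coming from somewhere other than this option.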
I've helped a client with an application which stopped doing its work after a while (it eventually started doing work again).
The problem was that when a Task failed, it called Thread.Sleep for 5 seconds (inside the task). As there could be up to 800 tasks queued every two seconds, you can imagine the problem if many of those jobs failed and invoked Thread.Sleep. None of those jobs were marked with TaskCreationOptions.LongRunning.
I rewrote the tasks so that Thread.Sleep (or Task.Delay for that matter) wasn't necessary.
However, I'm interested in what the TaskScheduler (the default) did in that scenario. How and when does it increase the number of threads?
According to MSDN
Behind the scenes, tasks are queued to the ThreadPool, which has been
enhanced with algorithms (like hill-climbing) that determine and
adjust to the number of threads that maximizes throughput.
The ThreadPool will have a maximum number of threads depending on the environment. As you create more Tasks the pool can run them concurrently until it reaches its maximum number of threads, at which point any further tasks will be queued.
If you want to find out the maximum number of ThreadPool threads you can use System.Threading.ThreadPool.GetMaxThreads (you need to pass in two out int parameters, one that will be populated with the number of maximum worker threads and another that will be populated with the maximum number of asynchronous I/O threads).
If you want to get a better idea of what is happening in your application at runtime, you can use Visual Studio's Threads window by going to Debug -> Windows -> Threads (the entry is only there while you are debugging, so you'll need to set a breakpoint in your application first).
This post is possibly of interest. It would seem the default task scheduler simply queues the task up in the ThreadPool's queue unless you use TaskCreationOptions.LongRunning. This means that it's up to the ThreadPool to decide when to create new threads.
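The difference that TaskCreationOptions.LongRunning makes can be observed by asking each task whether it is running on a thread pool thread. This is a small sketch with hypothetical names (LongRunningDemo, RunsOnPoolThread). It relies on the default scheduler giving long-running tasks a dedicated thread, which is how current implementations behave but is documented only as a hint.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts a task on the default scheduler with the given options and
    // reports whether its body ran on a thread pool thread.
    public static bool RunsOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result;
    }

    public static void Main()
    {
        Console.WriteLine("None:        pool thread = "
            + RunsOnPoolThread(TaskCreationOptions.None));
        Console.WriteLine("LongRunning: pool thread = "
            + RunsOnPoolThread(TaskCreationOptions.LongRunning));
    }
}
```

Tasks that block for seconds, such as the ones calling Thread.Sleep in the question, are the typical candidates for LongRunning, because they then stop occupying pool threads that short tasks are waiting for.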
I have an application in C# with a list of work to do. I'm looking to do as much of that work as possible in parallel. However I need to be able to control the maximum amount of parallel tasks.
From what I understand, this is possible with a ThreadPool or with Tasks. Is there a difference in which one I use? My main concern is being able to control how many threads are active at one time.
Please take a look at ParallelOptions.MaxDegreeOfParallelism for Tasks.
I would advise you to use Tasks, because they provide a higher level abstraction than the ThreadPool.
A very good read on the topic can be found here. Really, a must-have book and it's free on top of that :)
In TPL you can use WithDegreeOfParallelism on a ParallelEnumerable, or ParallelOptions.MaxDegreeOfParallelism.
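For the PLINQ variant, a minimal sketch might look like the following. The class PlinqLimitDemo and the method SumOfSquares are invented for illustration, and the workload is a trivial one.

```csharp
using System;
using System.Linq;

public static class PlinqLimitDemo
{
    // Sums the squares of 1..count while capping the query at
    // maxDegree concurrently executing tasks.
    public static long SumOfSquares(int count, int maxDegree)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .WithDegreeOfParallelism(maxDegree)
            .Select(n => (long)n * n)
            .Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(100, 2)); // prints 338350
    }
}
```

As with MaxDegreeOfParallelism, the value is a cap on concurrency for this one query and leaves the rest of the process untouched.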
There is also the CountdownEvent which may be a better option if you are just using custom threads or tasks.
In the ThreadPool, when you use SetMaxThreads it's global for the whole process, so you could potentially be limiting unrelated code unnecessarily.
You cannot set the number of worker threads or the number of I/O completion threads to a number smaller than the number of processors in the computer.
If the common language runtime is hosted, for example by Internet Information Services (IIS) or SQL Server, the host can limit or prevent changes to the thread pool size.
Use caution when changing the maximum number of threads in the thread pool. While your code might benefit, the changes might have an adverse effect on code libraries you use.
Setting the thread pool size too large can cause performance problems. If too many threads are executing at the same time, the task switching overhead becomes a significant factor.
I agree with the other answer that you should use TPL over the ThreadPool, as it's a better abstraction of multi-threading, but it's possible to accomplish what you want with both.
In this article on MSDN, they explain why they recommend Tasks instead of the ThreadPool for parallelism.
Tasks have a feature I find very appealing: you can build chains of tasks, where each task runs depending on the outcome of the one before it.
A feature I often use is the following: Task A runs in the background to do some long-running work. I chain Task B after it, to execute only if Task A has finished successfully, and I configure it to run in the foreground, so I can easily update my controls with the result of the long-running Task A.
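A chain like the one described might be sketched as follows. The names ContinuationDemo and BuildChain are hypothetical. In a UI application the continuation scheduler would typically be TaskScheduler.FromCurrentSynchronizationContext(), and this console sketch passes TaskScheduler.Default instead, since a console app has no UI thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Starts "Task A" in the background and chains "Task B" so that it runs
    // only if A ran to completion, on the scheduler supplied by the caller.
    public static Task<string> BuildChain(Func<int> longRunningWork,
                                          TaskScheduler continuationScheduler)
    {
        Task<int> taskA = Task.Factory.StartNew(
            longRunningWork,
            CancellationToken.None,
            TaskCreationOptions.None,
            TaskScheduler.Default);

        return taskA.ContinueWith(
            a => "Result: " + a.Result,
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion,
            continuationScheduler);
    }

    public static void Main()
    {
        Task<string> taskB = BuildChain(() => 21 * 2, TaskScheduler.Default);
        Console.WriteLine(taskB.Result); // prints "Result: 42"
    }
}
```

If Task A throws, the continuation is cancelled rather than run, so the code that updates the controls never sees a failed result.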
You can also create a semaphore to control how many threads can execute at a single time. You can create a new semaphore and in the constructor specify how many simultaneous threads are able to use that semaphore at a single time. Since I don't know how you are going to be using the threads, this would be a good starting point.
MSDN Article on the Semaphore class
-Wesley
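The semaphore approach can be sketched as below. Note that this uses SemaphoreSlim, the lighter in-process relative of the Semaphore class mentioned in the answer, and that SemaphoreDemo and RunThrottled are made-up names. Thread.Sleep stands in for real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDemo
{
    // Starts workItems tasks but lets at most maxConcurrent of them do their
    // work at once. Returns the peak concurrency that was observed.
    public static int RunThrottled(int workItems, int maxConcurrent)
    {
        int current = 0;
        int peak = 0;
        object gate = new object();

        using (var semaphore = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = new Task[workItems];
            for (int i = 0; i < workItems; i++)
            {
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    semaphore.Wait(); // blocks until a slot is free
                    try
                    {
                        lock (gate)
                        {
                            current++;
                            if (current > peak) peak = current;
                        }
                        Thread.Sleep(20); // stand-in for real work
                        lock (gate)
                        {
                            current--;
                        }
                    }
                    finally
                    {
                        semaphore.Release();
                    }
                });
            }
            Task.WaitAll(tasks);
        }

        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency: " + RunThrottled(12, 3));
    }
}
```

One trade-off: tasks waiting on the semaphore still occupy thread pool threads while they block, so for large numbers of work items a limit such as MaxDegreeOfParallelism is usually the gentler option.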
I've got some trouble with .NET's ThreadPool (.NET 4).
I've read that by default .NET has a limit of 25 threads per processor, but according to forum posts on SO and on other places, I can increase the limit with the below code.
void SetThreads(int threads)
{
    ThreadPool.SetMaxThreads(threads, threads);
    ThreadPool.SetMinThreads(threads, threads);
}
However, when I set the above to some arbitrarily high number, for example, 2000, and queue ~1000 items, I still only have ~33 threads running (.NET CLR takes ~5 threads), and ThreadPool.GetAvailableThreads() returns 1971 threads remaining.
Why doesn't the code above work?
Firstly, your "knowledge" of the defaults is incorrect. The limit of 25 threads per processor was back from .NET 1.1. It was increased in .NET 2, and now:
Beginning with the .NET Framework version 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads.
However, there's something else at play: the thread pool doesn't immediately create new threads in all situations. In order to cope with bursts of small tasks, it limits how quickly it creates new threads. IIRC, it will create one thread every 0.5 seconds if there are outstanding tasks, up to the maximum number of threads. I can't immediately see that figure documented though, so it may well change. I strongly suspect that's what you're seeing though. Try queuing a lot of items and then monitor the number of threads over time.
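The suggested experiment, queuing many items and watching the thread count over time, could look roughly like this. ThreadInjectionMonitor and BusyWorkerThreads are hypothetical names, the item count and sleep durations are arbitrary, and the numbers printed will vary with the runtime version and the machine.

```csharp
using System;
using System.Threading;

public static class ThreadInjectionMonitor
{
    // Worker threads currently in use: the pool maximum minus
    // the number the pool reports as still available.
    public static int BusyWorkerThreads()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }

    public static void Main()
    {
        // Queue many blocking work items, far more than there are cores.
        for (int i = 0; i < 200; i++)
        {
            ThreadPool.QueueUserWorkItem(_ => Thread.Sleep(3000));
        }

        // Sample the number of busy threads twice a second for five seconds.
        for (int sample = 0; sample < 10; sample++)
        {
            Console.WriteLine("t=" + (sample * 500) + "ms busy worker threads: "
                + BusyWorkerThreads());
            Thread.Sleep(500);
        }
    }
}
```

If the pool throttles thread creation as described, the busy count should climb gradually across the samples instead of jumping straight to the number of queued items.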
From MSDN:
When demand is low, the actual number of thread pool threads can fall
below the minimum values.
Read this too: Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4
Firstly check this link, especially this remark:
If the common language runtime is hosted, for example by Internet
Information Services (IIS) or SQL Server, the host can limit or
prevent changes to the thread pool size.
Then you should check the return value of the ThreadPool.SetMaxThreads(threads, threads) call. Maybe it returns false?
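A variant of the question's SetThreads method that surfaces those return values might look like this. It is only a sketch, and PoolSizeDemo and TrySetPoolSize are invented names.

```csharp
using System;
using System.Threading;

public static class PoolSizeDemo
{
    // Applies the requested size and reports whether the runtime accepted it.
    // Both calls return false instead of throwing when a value is rejected.
    public static bool TrySetPoolSize(int threads)
    {
        // Raise the maximum first so the new minimum cannot exceed it.
        if (!ThreadPool.SetMaxThreads(threads, threads))
        {
            return false;
        }
        return ThreadPool.SetMinThreads(threads, threads);
    }

    public static void Main()
    {
        bool accepted = TrySetPoolSize(2000);
        Console.WriteLine("Pool size change accepted: " + accepted);

        int maxWorkers, maxIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        Console.WriteLine("Max worker threads now: " + maxWorkers);
    }
}
```

A false result points to a rejected value or a host restriction, while a true result with few running threads points back to the gradual thread creation described in the other answer.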