I've got some trouble with .NET's ThreadPool (.NET 4).
I've read that by default .NET has a limit of 25 threads per processor, but according to forum posts on SO and elsewhere, I can increase the limit with the code below.
void SetThreads(int threads)
{
ThreadPool.SetMaxThreads(threads, threads);
ThreadPool.SetMinThreads(threads, threads);
}
However, when I set the above to some arbitrarily high number, for example, 2000, and queue ~1000 items, I still only have ~33 threads running (.NET CLR takes ~5 threads), and ThreadPool.GetAvailableThreads() returns 1971 threads remaining.
Why doesn't the code above work?
Firstly, your "knowledge" of the defaults is incorrect. The limit of 25 threads per processor dates back to .NET 1.1. It was increased in .NET 2, and now:
Beginning with the .NET Framework version 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads.
However, there's something else at play: the thread pool doesn't immediately create new threads in all situations. In order to cope with bursts of small tasks, it limits how quickly it creates new threads. IIRC, it will create one thread every 0.5 seconds if there are outstanding tasks, up to the maximum number of threads. I can't immediately see that figure documented though, so it may well change. I strongly suspect that's what you're seeing though. Try queuing a lot of items and then monitor the number of threads over time.
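For example, a rough sketch like the following (my own illustration, not from the docs) makes that ramp-up visible by queuing a pile of blocking work items and sampling the pool once per second; the exact numbers will vary by machine and framework version:
// Queue far more blocking work items than there are cores.
for (int i = 0; i < 1000; i++)
{
    ThreadPool.QueueUserWorkItem(_ => Thread.Sleep(10000));
}

int maxWorker, maxIo;
ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

// Sample once per second; the busy worker count should creep up gradually
// instead of jumping straight to 1000.
for (int s = 0; s < 30; s++)
{
    int availWorker, availIo;
    ThreadPool.GetAvailableThreads(out availWorker, out availIo);
    Console.WriteLine("t={0}s, busy worker threads ~ {1}", s, maxWorker - availWorker);
    Thread.Sleep(1000);
}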
From MSDN:
When demand is low, the actual number of thread pool threads can fall
below the minimum values.
Read this too: Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4
First, check this link, especially this remark:
If the common language runtime is hosted, for example by Internet
Information Services (IIS) or SQL Server, the host can limit or
prevent changes to the thread pool size.
Then you should check the return value of the ThreadPool.SetMaxThreads(threads, threads) call. Maybe it returns false?
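For example (a minimal sketch; both calls return false when the change is rejected, e.g. because the values are out of range or the host forbids the change):
bool maxOk = ThreadPool.SetMaxThreads(threads, threads);
bool minOk = ThreadPool.SetMinThreads(threads, threads);

// Log the results instead of silently assuming the new limits took effect.
if (!maxOk || !minOk)
{
    Console.WriteLine("SetMaxThreads succeeded: {0}, SetMinThreads succeeded: {1}", maxOk, minOk);
}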
Related
I am trying to figure out exactly what impact ThreadPool.SetMinThreads has.
According to the official documentation, it says:
Sets the minimum number of threads the thread pool creates on demand, as new requests are made, before switching to an algorithm for managing thread creation and destruction.
My understanding is that, as a developer, I'm supposed to have some control over how new threads are spun up on demand, so that they are created and waiting in an idle state for situations where, for example, I'm expecting a load of requests at a specific time.
And this is exactly what I initially thought the SetMinThreads method was designed for.
But when I actually started playing with it, I got really weird results.
I have an ASP.NET (.NET 5) application, and in a controller action I have code like this:
ThreadPool.SetMinThreads(32000, 1000);
And of course I intuitively expect the runtime to create 32K worker threads and 1000 IO threads for me.
But when I do that, and then call Process.GetCurrentProcess().Threads in another method to get all of the process's threads and print statistics on them, I get something like this:
Standby - 17
Running - 4
I thought that maybe the app needs some time to spin up new threads, so I tried different delays: 1 min, 5 min and 10 min.
But the result always stays the same: I get 15-20 Standby and 2-4 Running.
So then comes the logical question: what exactly does the SetMinThreads method do at all? The description provided by MSDN does not seem very helpful.
And another logical question: if I wanted to force dotnet to spin up 32K new threads in an idle state, does dotnet provide any mechanism for that at all?
The ThreadPool.SetMinThreads sets the minimum number of threads that the ThreadPool creates instantly on demand. That's the key phrase, and it is indeed quite unintuitive. The ThreadPool currently¹ (.NET 5) works in two modes:
1. When a new request for work arrives and all the threads in the pool are busy, create a new thread instantly in order to satisfy the request.
2. When a new request for work arrives and all the threads in the pool are busy, queue the request and wait for 1 second before creating a new thread, hoping that in the meantime one of the worker threads will complete its current work and become available to serve the queued request.
The ThreadPool.SetMinThreads sets the threshold between these two modes. It does not give you control over the number of threads that are alive right now, which is not very satisfying, but it is what it is. If you want to force the ThreadPool to create 1,000 threads instantly, you must also send an equal number of requests for work, in addition to calling ThreadPool.SetMinThreads(1000, 1000). Something like this should do the trick:
ThreadPool.SetMinThreads(1000, 1000);

// Queue 1000 blocking work items; with the minimum raised to 1000,
// the pool creates threads eagerly instead of throttling to one per second.
Task[] tasks = Enumerable.Range(0, 1000)
    .Select(_ => Task.Run(() => Thread.Sleep(100)))
    .ToArray();
Task.WaitAll(tasks);
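As a quick sanity check on .NET 5, you could also read ThreadPool.ThreadCount (available since .NET Core 3.0) while the Sleep(100) work items are still in flight; it should report far more threads than the 15-20 you saw in Process.GetCurrentProcess().Threads:
Console.WriteLine(ThreadPool.ThreadCount); // far higher than before, approaching 1000 while the tasks are still blocking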
Honestly I don't think that anyone does that. Creating a new Thread is quite fast in human time (it takes around 0.25 milliseconds per thread on my PC), so for a system that receives requests from humans, the overhead of creating a thread shouldn't have any measurable impact. On the other hand, 0.25 msec is an eon in computer time when you want a thread to do a tiny amount of work (in the range of nanoseconds), like adding something to a List<T>. That's why the ThreadPool was invented in the first place: to amortize the overhead of thread creation for tiny but numerous workloads.
Be aware that creating a new Thread also has a memory cost, which is generally more significant than the time cost: each thread requires at least 1 MB of RAM for its stack. So creating 32,000 threads would tie down 32 GB of memory just for stack space, which is not very efficient. That's why in recent years asynchronous programming has become so prominent in server-side web development: it allows you to do more work with fewer threads.
¹ There is nothing preventing the Microsoft engineers from changing/tweaking the implementation of the ThreadPool in the future. AFAIK this has already happened at least once in the past.
I want to know how many threads will be used when I run a Parallel.For/ForEach loop.
I found that this can be changed with the MaxDegreeOfParallelism option.
The MaxDegreeOfParallelism help on MSDN says (link):
By default, For and ForEach will utilize however many threads the
underlying scheduler provides, so changing MaxDegreeOfParallelism from
the default only limits how many concurrent tasks will be used.
But I don't know how many threads underlying scheduler provides.
How can I find out that?
I could test it with a loop of 9999999 iterations; however, that test would only show me a number, not the rule that determines that number.
Edit/added later:
I googled for "scheduler max concurrency", and I found (at MSDN - link) that the TaskScheduler class has a MaximumConcurrencyLevel property, and:
Returns an integer that represents the maximum concurrency level. The
default scheduler returns Int32.MaxValue.
Is that TaskScheduler class used as the "underlying scheduler" for these parallel loops?
According to MSDN:
The default scheduler for Task Parallel Library and PLINQ uses the .NET Framework ThreadPool to queue and execute work. In the .NET Framework 4, the ThreadPool uses the information that is provided by the System.Threading.Tasks.Task type to efficiently support the fine-grained parallelism (short-lived units of work) that parallel tasks and queries often represent.
Looking at the documentation of ThreadPool, it says:
There is one thread pool per process. Beginning with the .NET Framework 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads. The number of threads in the thread pool can be changed by using the SetMaxThreads method.
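If you want an empirical answer rather than a rule, a rough probe like this (my own sketch; the count varies per run and per machine) shows how many distinct threads a loop actually used, and how setting MaxDegreeOfParallelism caps it:
var threadIds = new ConcurrentDictionary<int, bool>();

var options = new ParallelOptions
{
    // Remove this line to let the default scheduler decide on its own.
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.For(0, 9999999, options, i =>
{
    // Record which managed thread executed this iteration.
    threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
});

Console.WriteLine("Distinct threads used: " + threadIds.Count);
Console.WriteLine("Default scheduler MaximumConcurrencyLevel: " + TaskScheduler.Default.MaximumConcurrencyLevel);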
I need to optimize a WCF service... it's quite a complex thing. My problem this time has to do with tasks (Task Parallel Library, .NET 4.0). What happens is that I launch several tasks when the service is invoked (using Task.Factory.StartNew) and then wait for them to finish:
Task.WaitAll(task1, task2, task3, task4, task5, task6);
Ok... what I see, and don't like, is that on the first call (sometimes the first 2-3 calls, if made quickly one after another), the final task starts much later than the others (I am looking at a case where it started 0.5 seconds after the others). I tried calling
ThreadPool.SetMinThreads(12*Environment.ProcessorCount, 20);
at the beginning of my service, but it doesn't seem to help.
The tasks are all database-related: I'm reading from multiple databases and it has to take as little time as possible.
Any idea why the last task is taking so long? Is there something I can do about it?
Alternatively, should I use the thread pool directly? As it happens, in one case I'm looking at, one task had already ended before the last one started - I would have saved 0.2 seconds if I had reused that thread instead of waiting for a new one to be created. However, I cannot be sure that that task will always end so quickly, so I can't put both requests in the same task.
[Edit] The OS is Windows Server 2003, so there should be no connection limit. Also, it is hosted in IIS - I don't know whether I should create regular threads or use the thread pool - which is preferred?
[Edit] I've also tried using Task.Factory.StartNew(action, TaskCreationOptions.LongRunning); - it doesn't help; the last task still starts much later (around half a second later) than the rest.
[Edit] MSDN says:
The thread pool has a built-in delay (half a second in the .NET Framework version 2.0) before starting new idle threads. If your application periodically starts many tasks in a short time, a small increase in the number of idle threads can produce a significant increase in throughput. Setting the number of idle threads too high consumes system resources needlessly.
However, as I said, I'm already calling SetMinThreads and it doesn't help.
I have had problems myself with delays in thread startup when using the (.NET 4.0) Task object. So for time-critical stuff I now use dedicated threads (... again, as that is what I was doing before .NET 4.0).
The purpose of a thread pool is to avoid the operating-system cost of starting and stopping threads; the threads are simply reused. This is a common model found, for example, in internet servers. The advantage is that they can respond more quickly.
I've written many applications where I implement my own thread pool by having dedicated threads pick up tasks from a task queue. Note however that this most often requires locking, which can cause delays/bottlenecks. It depends on your design: if the tasks are small there will be a lot of locking, and it might be faster to trade some CPU for less locking: http://www.boyet.com/Articles/LockfreeStack.html
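For illustration, a bare-bones version of that pattern might look like this (a sketch only, using BlockingCollection as the thread-safe queue; a real pool would also need shutdown and exception handling):
// using System; using System.Collections.Concurrent; using System.Threading;

// A minimal fixed-size pool: dedicated threads pull work items off a shared queue.
class SimpleWorkerPool : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();

    public SimpleWorkerPool(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Thread(() =>
            {
                // Blocks until an item arrives or CompleteAdding is called.
                foreach (Action work in _queue.GetConsumingEnumerable())
                    work();
            }) { IsBackground = true };
            worker.Start();
        }
    }

    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
    }
}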
SmartThreadPool is a replacement/extension of the .NET thread pool. As you can see at this link, it has a nice GUI for doing some testing: http://www.codeproject.com/KB/threads/smartthreadpool.aspx
In the end it depends on what you need, but for high performance I recommend implementing your own thread pool. If you experience a lot of thread idling, it could be beneficial to increase the number of threads (beyond the recommended cpucount*2). This is actually how Hyper-Threading works inside the CPU: it uses the "idle" time of one operation to carry out other operations.
Note that .NET has a built-in limit of 25 threads per process (i.e. for all WCF calls you receive simultaneously). This limit is independent of, and overrides, the ThreadPool setting. It can be increased, but it requires some magic: http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=201
Following from my prior question (yep, should have been a Q against original message - apologies):
Why do you feel that creating 12 threads for each processor core in your machine will in some way speed up your server's ability to create worker threads? All you're doing is slowing your server down!
As per the MSDN docs: "You can use the SetMinThreads method to increase the minimum number of threads. However, unnecessarily increasing these values can cause performance problems. If too many tasks start at the same time, all of them might appear to be slow. In most cases, the thread pool will perform better with its own algorithm for allocating threads. Reducing the minimum to less than the number of processors can also hurt performance."
Issues like this are usually caused by bumping into limits or contention on a shared resource.
In your case, I am guessing that your last task(s) is/are blocking while they wait for a connection to the DB server to become available or for the DB to respond. Remember: if your invocation kicks off 5-6 other tasks, then your machine has to create and open numerous DB connections and will hit the DB with, potentially, a lot of work. If your WCF server and/or your DB server are cold, then your first few invocations are going to be slower until the machines' caches, etc., are populated.
Have you tried adding a little tracing/logging, using a Stopwatch to time how long it takes for your tasks to connect to the DB server and then execute their operations?
You may find that reducing the number of concurrent tasks you kick off actually speeds things up. Try spawning 3 tasks at a time, waiting for them to complete, and then spawning the next 3.
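Something like this, for example (a sketch only; QueryDatabase1 through QueryDatabase6 are hypothetical stand-ins for whatever your real tasks do):
// Hypothetical placeholders for the work done by each of the six tasks.
Action[] workItems = { QueryDatabase1, QueryDatabase2, QueryDatabase3,
                       QueryDatabase4, QueryDatabase5, QueryDatabase6 };

const int batchSize = 3;
for (int i = 0; i < workItems.Length; i += batchSize)
{
    // Start at most three tasks, wait for the whole batch, then start the next batch.
    Task[] batch = workItems
        .Skip(i)
        .Take(batchSize)
        .Select(work => Task.Factory.StartNew(work))
        .ToArray();
    Task.WaitAll(batch);
}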
When you call Task.Factory.StartNew, it uses a TaskScheduler to map those tasks into actual work items.
In your case, it sounds like one of your Tasks is delaying occasionally while the OS spins up a new Thread for the work item. You could, potentially, build a custom TaskScheduler which already contained six threads in a wait state, and explicitly used them for these six tasks. This would allow you to have complete control over how those initial tasks were created and started.
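A minimal sketch of such a scheduler (my own illustration, assuming a fixed number of dedicated threads started up front; a real implementation would also need shutdown and error handling):
// using System; using System.Collections.Concurrent; using System.Collections.Generic;
// using System.Threading; using System.Threading.Tasks;

// Runs queued tasks on a fixed set of pre-started, dedicated threads.
public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly Thread[] _threads;

    public DedicatedThreadTaskScheduler(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            // Each dedicated thread sits waiting on the queue until a task arrives.
            _threads[i] = new Thread(() =>
            {
                foreach (Task task in _tasks.GetConsumingEnumerable())
                    TryExecuteTask(task);
            }) { IsBackground = true };
            _threads[i].Start();
        }
    }

    protected override void QueueTask(Task task)
    {
        _tasks.Add(task);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        // Keep it simple: never inline, always run on the dedicated threads.
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return _tasks.ToArray();
    }

    public override int MaximumConcurrencyLevel
    {
        get { return _threads.Length; }
    }

    public void Dispose()
    {
        _tasks.CompleteAdding();
    }
}
You would then start the tasks with Task.Factory.StartNew(action, CancellationToken.None, TaskCreationOptions.None, scheduler) so that they run on those pre-started threads.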
That being said, I suspect there is something else at play here... You mentioned that using TaskCreationOptions.LongRunning demonstrates the same behavior. This suggests that there is some other factor at play causing this half second delay. The reason I suspect this is due to the nature of TaskCreationOptions.LongRunning - when using the default TaskScheduler (LongRunning is a hint used by the TaskScheduler class), starting a task with TaskCreationOptions.LongRunning actually creates an entirely new (non-ThreadPool) thread for that Task. If creating 6 tasks, all with TaskCreationOptions.LongRunning, demonstrates the same behavior, you've pretty much guaranteed that the problem is NOT the default TaskScheduler, since this is going to always spin up 6 threads manually.
I'd recommend running your code through a performance profiler, and potentially the Concurrency Visualizer in VS 2010. This should help you determine exactly what is causing the half second delay.
What is the OS? If you are not running a server version of Windows, there is a connection limit. Your many threads are probably being serialized because of the connection limit.
Also, I have not used the task parallel library yet, but my limited experience is that new threads are cheap to make in the context of networking.
These articles might explain the problem you're having:
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-are-wcf-responses-slow-and-setminthreads-does-not-work.aspx
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-does-wcf-become-slow-after-being-idle-for-15-seconds.aspx
Seeing as you're using .NET 4, the first article probably doesn't apply, but as the second article points out, the ThreadPool terminates idle threads after 15 seconds, which might explain the problem you're having, and it offers a simple (though a little hacky) solution to get around it.
Whether or not you use the ThreadPool directly shouldn't make any difference, as I suspect the Task library is using it underneath anyway.
One third-party library we have been using for a while might help you here - Smart Thread Pool. You still get the same benefits of using the task libraries, in that you can have the return values from the threads and get any exception information from them too.
Also, you can instantiate multiple thread pools, which helps when several parts of your application each need their own pool (so that a low-priority process doesn't start eating into the quota of some high-priority process). And you can set the priority of the threads in the pool, which you can't do with the standard ThreadPool, where all the threads are background threads.
You can find plenty of info on the CodePlex page; I've also got a post which highlights some of the key differences:
http://theburningmonk.com/2010/03/threading-introducing-smartthreadpool/
Just as a side note: for tasks like the ones you've mentioned, which might take some time to return, you probably shouldn't be using the thread pool anyway. It's recommended to avoid using the thread pool for blocking work like that, because it hogs threads that the framework classes rely on for all sorts of things, like handling timer events, etc. (not to mention handling incoming WCF requests!). I feel like I'm spamming here, but here's some of the info I've gathered around the use of the thread pool, with some useful links at the bottom:
http://theburningmonk.com/2010/03/threading-using-the-threadpool-vs-creating-your-own-threads/
Well, I hope this helps!
Is there a magic number or formula for setting the values of SetMaxThreads and SetMinThreads for the ThreadPool? I have thousands of long-running methods that need to execute but just can't find the perfect match for setting these values. Any advice would be greatly appreciated.
The default minimum number of threads is the number of cores your machine has. That's a good number; it doesn't generally make sense to run more threads than you have cores.
The default maximum number of threads is 250 times the number of cores you have, on .NET 2.0 SP1 and up. There is an enormous amount of breathing room here. On a four-core machine, it would take 499 seconds to reach that maximum if none of the threads complete in a reasonable amount of time.
The threadpool scheduler tries to limit the number of active threads to the minimum, by default the number of cores you have. Twice a second it allows one more thread to start if the active threads do not complete. Threads that run for a very long time or do a lot of blocking that is not caused by I/O are not good candidates for the threadpool. You should use a regular Thread instead.
Getting to the maximum isn't healthy. On a four core machine, just the stacks of those threads will consume a gigabyte of virtual memory space. Getting OOM is very likely. Consider lowering the max number of threads if that's your problem. Or consider starting just a few regular Threads that receive packets of work from a thread-safe queue.
Typically, the magic number is to leave it alone. The ThreadPool does a good job of handling this.
That being said, if you're running a lot of long-running services, and those services will have large periods where they're waiting, you may want to increase the maximum thread count to handle more of them. (If the processes aren't blocking, you'll probably just slow things down if you increase the thread count...)
Profile your application to find the correct number.
If you want better control, you might want to consider NOT using the built-in ThreadPool. There is a nice replacement at http://www.codeproject.com/KB/threads/smartthreadpool.aspx.
Are there any benefits to limiting the number of concurrent threads doing a given task to equal the number of processors on the host system? Or is it better to simply trust libraries such as .NET's ThreadPool to do the right thing... even if there are 25 different concurrent threads happening at any one given moment?
Most threads are not CPU bound; they end up waiting on IO or other events. If you look at your system now, I imagine you have hundreds (if not thousands) of threads executing with no problems. By that measure, you're probably best just leaving the .NET thread pool to do the right thing!
However, if the threads were all CPU bound (e.g. something like ray tracing) then it would be a good idea to limit the number of threads to the number of cores, otherwise chances are that context switching will begin to hurt performance.
The threadpool already does a reasonably good job at this. It tries to limit the number of running threads to the number of CPU cores in your machine. When one thread ends, it immediately schedules another eligible thread for execution.
Every 0.5 seconds, it evaluates what is going on with the running threads. When the threads have been running too long, it assumes they are stalled and allows another thread to start executing. You'll now have more threads running than you have CPU cores. This can go up to the maximum number of allowed thread, as set by ThreadPool.SetMaxThreads().
Starting around .NET 2.0 SP1, the default maximum number of threads was increased considerably, to 250 times the number of cores. You should never, ever get there. If you do, you would have wasted about 2 minutes of time during which a possibly non-optimal number of threads were running. Those threads would all have had to be blocking for that long, not exactly a typical execution pattern for a thread. On the other hand, if those threads are all waiting on the same kind of resource, they are likely to just take turns; adding more threads cannot improve throughput.
Long story short, the thread pool will work well if you run threads that execute quickly (seconds at most) and don't block for a long time. You probably ought to consider creating your own Thread objects when your code doesn't match that pattern.
Well, if your bottleneck is ONLY the processors, then it might make sense, but that would ignore all memory and other I/O bottlenecks, and chances are at least your cache/memory is incurring page faults and other events that would slow the threads down.
I'd trust the library myself. Threads wait for all kinds of things, and you don't want your application to slow down because it can't spawn a new thread, even though most of the rest are just sleeping, waiting for some event or resource.
Measure your application under a variety of thread-to-processor ratios. Come to conclusions based on hard data about your application. Accept no arguments from first principles about what performance you should get; only what you actually get matters.