In this web tutorial on threading in C#, Joseph Albahari writes: "Don't go sleeping in pooled threads!" Why should you not do this? How badly can it affect performance? (It's not that I want to do it; I'm just curious.)
There are only a limited number of threads in the thread pool; thread pools are designed to efficiently execute a large number of short tasks. They rely on each task finishing quickly, so that the thread can return to the pool and be used for the next task.
So sleeping in a thread pool thread starves out the pool, which may eventually run out of available threads, and be unable to process the tasks you assign to it.
The thread pool is meant to quickly do a relatively short task on a different thread without having to spend the cost of creating a new thread. The thread pool has a maximum number of threads, and once that is reached, tasks are queued until a thread becomes available.
A thread sleeping on the thread pool would therefore hold up the queue, or contribute to thread pool exhaustion.
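To make the difference concrete, here is a minimal sketch (assuming .NET 4.5 or later for Task.Run, Task.Delay and async/await; the class and method names are made up for illustration). The first method parks a pool thread in Thread.Sleep for the whole wait, while the second hands the thread back to the pool and lets a timer resume the work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SleepVersusDelay
{
    // Occupies one pool thread for the whole wait; that thread can do nothing else.
    public static Task BlockingWait(int milliseconds)
    {
        return Task.Run(() => Thread.Sleep(milliseconds));
    }

    // Returns the thread to the pool during the wait; a timer completes the task later.
    public static async Task NonBlockingWait(int milliseconds)
    {
        await Task.Delay(milliseconds);
    }

    public static void Main()
    {
        // 50 concurrent waits: each one ties up a pool thread or sits in the queue.
        Task[] blocking = new Task[50];
        for (int i = 0; i < blocking.Length; i++) blocking[i] = BlockingWait(200);
        Task.WaitAll(blocking);

        // 50 concurrent waits: no pool thread is held while the delay runs.
        Task[] nonBlocking = new Task[50];
        for (int i = 0; i < nonBlocking.Length; i++) nonBlocking[i] = NonBlockingWait(200);
        Task.WaitAll(nonBlocking);

        Console.WriteLine("Both batches finished.");
    }
}
```

How much slower the blocking batch is depends on the pool's current size and how fast it adds threads, so no timings are claimed here.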
A thread is a heavyweight object.
Creating a new thread requires a lot of resources: 1 MB reserved for the managed stack, a managed thread object, a kernel stack, a kernel thread object, and a user-mode thread environment block. All of this takes time and memory, so you do not want to create and destroy threads at a high rate. Furthermore, once you have more than one thread, context switching consumes resources as well.
The thread pool is where the CLR keeps unused threads in case your application needs them. The pool initially contains 0 threads; once you request a thread, the pool quickly creates threads up to the minimum number defined for the pool. After around 2 minutes, unused threads are killed. If the load increases and you need more threads, the pool slowly creates new ones until the maximum is reached. You cannot have more threads than the maximum; all new requests are queued and executed once a worker thread returns to the pool. In the worst case you can get an OutOfMemoryException.
If a thread taken from a pool is blocked, it:
Holds the resources
Does not do any valuable work, while an application may need this thread for a new request
Breaks scalability by introducing blocks
As far as I understand, the .NET CLR creates a thread pool for each process, so each process has its own thread pool. Every thread pool holds a certain number of available threads. That number may be increased or decreased as the framework sees fit, but it starts with a predetermined number of threads for each process.
I wanted to find out the number of threads it will start with for a simple WPF application. When I used the System.Threading.ThreadPool.GetMaxThreads(out worker, out io) and System.Threading.ThreadPool.GetAvailableThreads(out worker, out io), I got the same result of 2047 worker threads and 1000 io threads. But I assume this can't be right, so this is not the right way to find the currently reserved threads in the thread pool.
So I looked at the thread count using Windows Task Manager and it showed 10 threads for the application. That seemed sensible and I came to the conclusion that the thread pool has 9 threads since one of the 10 is the main UI thread.
First of all, is my conclusion of 9 threads in the thread pool correct? Second, what is the right way of querying it using C#?
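As a hedged sketch of how the counts can be queried: GetMaxThreads returns the upper bound, and GetAvailableThreads returns that bound minus the threads currently active, so identical values only mean that no pool thread is busy at that moment. The class and method names below are illustrative, not part of any framework API.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PoolInspector
{
    // Number of pool worker threads currently running work items.
    public static int BusyWorkerThreads()
    {
        int maxWorker, maxIo, freeWorker, freeIo;
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);
        return maxWorker - freeWorker;
    }

    public static void Main()
    {
        int minWorker, minIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);

        Console.WriteLine("Minimum worker threads: " + minWorker);
        Console.WriteLine("Busy worker threads:    " + BusyWorkerThreads());

        // Counts every OS thread in the process, not only pool threads.
        Console.WriteLine("OS threads in process:  " + Process.GetCurrentProcess().Threads.Count);
    }
}
```

The process-wide count that Task Manager shows also includes threads the runtime and UI framework create for their own purposes, so subtracting one for the UI thread does not necessarily give the pool size.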
I use ThreadPool.QueueUserWorkItem for creating a thread on Windows CE (I use .NET Framework 3.5). Sometimes the thread waits for something and starts too late. In the QueueUserWorkItem documentation it says that the delegate will be executed "when a thread pool thread becomes available".
Is there a way to force the ThreadPool to execute my delegate immediately? Would Thread.Start() be a solution for this?
Thank you!
First off, QueueUserWorkItem doesn't create a thread, it merely places a "task" in the ThreadPool's queue for the workers to pick up and execute. In case of saturation (more tasks than available threads), there is no guarantee of when a worker will become available to execute the task. If you want immediate execution use an instance of Thread instead. The only way to improve your odds with the ThreadPool is to increase the number of workers.
Edit: Just to be clear, if thread pool threads are indeed free, they will pick up work and execute it usually faster than starting a fresh thread.
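A minimal sketch of the two options side by side (the helper name RunOnDedicatedThread is made up for illustration, and this has not been checked against the Compact Framework): the queued delegate waits for a free pool worker, while the dedicated thread is created and started right away.

```csharp
using System;
using System.Threading;

public static class ImmediateStart
{
    // Starts the work on its own thread instead of waiting for a pool worker.
    public static Thread RunOnDedicatedThread(Action work)
    {
        Thread thread = new Thread(() => work());
        thread.IsBackground = true; // do not keep the process alive on exit
        thread.Start();
        return thread;
    }

    public static void Main()
    {
        using (ManualResetEvent queuedDone = new ManualResetEvent(false))
        {
            // Runs whenever a pool worker becomes available.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Console.WriteLine("pool work item ran");
                queuedDone.Set();
            });

            // Runs as soon as the OS schedules the new thread.
            Thread dedicated = RunOnDedicatedThread(() => Console.WriteLine("dedicated thread ran"));

            dedicated.Join();
            queuedDone.WaitOne();
        }
    }
}
```

The order of the two output lines is not guaranteed; it depends on scheduling and on whether the pool has an idle worker.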
A ThreadPool has a limited size, so you can't launch as many threads as you want at the same time. If all the threads are busy, you have to wait for one to become available.
Check the number of threads you want to launch and compare it to the ThreadPool size, which you can read with GetMaxThreads().
If you want more threads, resize the pool with SetMaxThreads(int workerThreads, int completionPortThreads).
If you start a lot of work items on the pool, you can reach a situation where no thread is free and your request is queued; that is why it sometimes starts too late. Try increasing the maximum number of worker threads in the pool. Use ThreadPool.SetMaxThreads and ThreadPool.SetMinThreads to configure the pool.
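The configuration calls mentioned above could look roughly like this on the desktop framework. EnsureMinWorkers is a hypothetical helper, and whether SetMinThreads is available on the Compact Framework is not verified here.

```csharp
using System;
using System.Threading;

public static class PoolSizing
{
    // Raises the minimum worker count so a burst of work does not wait for
    // the pool's slow thread injection. Returns false if the pool rejects the value.
    public static bool EnsureMinWorkers(int desired)
    {
        int minWorker, minIo, maxWorker, maxIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        if (desired <= minWorker) return true;        // already high enough
        if (desired > maxWorker) desired = maxWorker; // the minimum cannot exceed the maximum

        // Leave the I/O completion thread minimum unchanged.
        return ThreadPool.SetMinThreads(desired, minIo);
    }

    public static void Main()
    {
        bool ok = EnsureMinWorkers(16);
        Console.WriteLine("Minimum raised or already sufficient: " + ok);
    }
}
```

Raising the minimum addresses start-up latency under bursts; raising the maximum only matters once the pool has actually hit its upper bound.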
I have a method void DoWork(object input) that takes roughly 5 seconds to complete. I have read that Thread is better suited than ThreadPool for these longer operations but I have encountered a problem.
I click a button which calls threadRun.Start(input) which runs and completes fine. I click the button again and receive the following exception:
Thread is running or terminated; it cannot restart.
Can you not "reuse" a Thread? Should I use ThreadPool? Why is Thread "better suited for longer operations" compared to ThreadPool? If you can't reuse a thread, why use it at all (i.e. what advantages does it offer)?
Can you not "reuse" a Thread?
You can. But you have to code the thread not to terminate but to instead wait for more work. That's what a thread pool does.
Should I use ThreadPool?
If you want to re-use a thread, yes.
Why is Thread "better suited for longer operations" compared to ThreadPool?
Imagine a thread pool that is serving a large number of quick operations. You don't want to have too many threads, because the computer can only do so many things at a time. Each long operation you make the thread pool do ties up a thread from the pool. So the pool either has to have lots of extra threads or may run short of threads. Neither leads to an efficient thread pool design.
For longer operations, the overhead of creating and destroying a thread is very small in comparison to the cost of the operation. So the normal downside of using a thread just for the operation doesn't apply.
If you can't reuse a thread, why use it at all (i.e. what advantages does it offer)?
I'm assuming you mean using a thread dedicated to a job that then terminates over using a thread pool. The advantage is that the number of threads will always equal the number of jobs this way. This means you have to create a thread every time you start a job and destroy a thread every time you finish one, but you never have extra threads nor do you ever run short on threads. (This can be a good thing with I/O bound threads but can be a bad thing if most threads are CPU bound most of the time.)
Thread.Start documentation says:
Once the thread terminates, it cannot be restarted with another call to Start.
Threads are not reusable. I have already faced this problem a while ago, the solution was to create a new Thread instance whenever needed.
It looks like this is by design.
I encountered the same problem and the only solution I could find was to recreate the thread. In my case I wasn't restarting the thread very often so I didn't look any further.
A search now has turned up this thread on social.msdn where the accepted answer states:
a stopped or aborted thread cannot be stated again.
MSDN repeats this as well:
trying to restart an aborted thread by calling Start on a thread that has terminated throws a ThreadStateException.
As the message states, you cannot restart the thread. You can simply create a new thread for your next operation. Or, you might consider a design where the background thread keeps working until it completes all of your tasks, rather than launch a new thread for each one.
for(;;){} or while(true){} are useful constructs to 'reuse' a thread. Typically, the thread waits on some synchronization object at the top of these loops. In your example, you could wait on an event or semaphore and signal it from your button OnClick() handler.
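One way to sketch such a reusable thread is below. Note the swap: instead of a hand-written loop over an event or semaphore, it uses BlockingCollection<T> (available from .NET 4.0), whose consuming enumerator blocks while the queue is empty. ReusableWorker is an illustrative name, not a framework type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class ReusableWorker : IDisposable
{
    private readonly BlockingCollection<Action> _jobs = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public ReusableWorker()
    {
        _thread = new Thread(Run);
        _thread.IsBackground = true;
        _thread.Start();
    }

    // Call this from the button click handler; it returns immediately.
    public void Enqueue(Action job)
    {
        _jobs.Add(job);
    }

    private void Run()
    {
        // Blocks while the queue is empty; the loop ends after CompleteAdding().
        foreach (Action job in _jobs.GetConsumingEnumerable())
        {
            job();
        }
    }

    public void Dispose()
    {
        _jobs.CompleteAdding(); // let the worker drain the queue and exit
        _thread.Join();
        _jobs.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        using (ReusableWorker worker = new ReusableWorker())
        {
            worker.Enqueue(() => Console.WriteLine("first job"));
            worker.Enqueue(() => Console.WriteLine("second job"));
        } // Dispose waits until both queued jobs have run
    }
}
```

Jobs run one at a time in the order they were queued. A job that throws would end the worker thread in this sketch, so real code would need error handling around job().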
It's just in background mode. It sounds like you need to use the ThreadPool because re-starting and re-creating Thread objects are very expensive operations. If you have a long running job that may last longer than your main process, then consider the use of a Windows Service.
I know that the CLR gives each AppDomain a ThreadPool time slice to work with, yet I wanted to know: if I create a new thread like so, Thread t = new Thread(...);
is it managed by the CLR or by the AppDomain ThreadPool?
A thread created with Thread t = new Thread(...); is not managed by the ThreadPool. It is an abstraction the CLR provides over operating system threads. The ThreadPool is an additional abstraction that facilitates reusing threads and sharing thread resources.
Here is an excellent resource on threads in .NET: http://www.albahari.com/threading/
If you're using .NET 4.0 consider using TPL.
When you create threads with the Thread class, you are in control. You create them as you need them, and you define whether they are background or foreground (keeps the calling process alive), you set their Priority, you start and stop them.
With ThreadPool or Task (which uses the ThreadPool behind the scenes), you let the ThreadPool class manage the creation of threads and maximize their reuse, which saves you the time needed to create a new thread. One thing to notice is that, unlike threads created with Thread by default, threads created by the ThreadPool don't keep the calling process alive.
A huge advantage of using the ThreadPool is that you can have a small number of threads handle lots of tasks. Conversely, given that the pool doesn't kill threads (because it's designed for reusability), if you had a bunch of threads created by the ThreadPool, but later the number of items shrinks, the ThreadPool idles a lot, wasting resources.
When you create new threads they are not managed by the thread pool.
If you create a thread manually then you control its life time, this is independent from the Thread pool.
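The distinction can be checked at run time with Thread.IsThreadPoolThread and Thread.IsBackground. The following is a small sketch; the ThreadKinds class and its Describe method are illustrative names.

```csharp
using System;
using System.Threading;

public static class ThreadKinds
{
    // Describes the thread the caller is currently running on.
    public static string Describe()
    {
        Thread current = Thread.CurrentThread;
        return "pool=" + current.IsThreadPoolThread + ", background=" + current.IsBackground;
    }

    public static void Main()
    {
        // A manually created thread: not a pool thread, foreground by default.
        string manual = null;
        Thread thread = new Thread(() => manual = Describe());
        thread.Start();
        thread.Join();
        Console.WriteLine("new Thread:        " + manual); // pool=False, background=False

        // A queued work item: runs on a pool thread, which is a background thread.
        string pooled = null;
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(_ => { pooled = Describe(); done.Set(); });
            done.WaitOne();
        }
        Console.WriteLine("QueueUserWorkItem: " + pooled); // pool=True, background=True
    }
}
```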
I have a thread that I fire off every time the user scans a barcode.
Most of the time it is a fairly short-running thread. But sometimes it can take a very long time (waiting on an invoke to the GUI thread).
I have read that it may be a good idea to use the ThreadPool for this rather than just creating my own thread for each scan.
But I have also read that if the ThreadPool runs out of threads then it will just wait until some other thread exits (not OK for what I am doing).
So, how likely is it that I am going to run out of threads? And is the benefit of the ThreadPool really worth it? (When I scan it does not seem to take too long for the scan to "run" the thread logic.)
It depends on what you mean by "a very long time" and how common that scenario is.
The MSDN topic "The Managed Thread Pool" offers good guidelines for when not to use thread pool threads:
There are several scenarios in which it is appropriate to create and manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
Since the user will never scan more than one barcode at a time, the memory costs of the threadpool might not be worth it - I'd stick with a single thread just waiting in the background.
The point of the thread pool is to amortize the cost of creating threads, which are not inexpensive to spin up and tear down. If you have a short-running task, the cost of creating/destroying the thread can be a significant portion of the overall run-time. The maximum number of threads in the thread pool depends on the version of the .NET Framework, typically dozens to hundreds per processor. The number of threads is scaled depending on available work.
Will you run out of threads and have to wait for a thread to become available? It depends on your workload. You can get the maximum number of threads available via ThreadPool.GetMaxThreads(). Chances are (based on the description of your problem) that this number is sufficiently high.
http://msdn.microsoft.com/en-us/library/system.threading.threadpool.getmaxthreads.aspx
Another option would be to manage your own pool of scan threads and assign them work rather than creating a new thread for every scan. Personally I would try the threadpool first and only manage your own threads if it proved necessary. Even better, I would look into async programming techniques in .NET. The methods will be run on the thread pool, but give you a much nicer programming experience than manual thread management.
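A rough sketch of the async approach, assuming .NET 4.5 or later. LookUp is a hypothetical stand-in for whatever the real scan logic does, and ScanHandler is an illustrative name.

```csharp
using System;
using System.Threading.Tasks;

public static class ScanHandler
{
    // Hypothetical placeholder for the real barcode processing.
    public static string LookUp(string barcode)
    {
        return "item-" + barcode;
    }

    // Runs the lookup on a pool thread; the caller is not blocked while it runs.
    public static async Task<string> HandleScanAsync(string barcode)
    {
        string result = await Task.Run(() => LookUp(barcode));
        return result;
    }

    public static void Main()
    {
        // Blocking on .Result is acceptable in a console Main.
        Console.WriteLine(HandleScanAsync("12345").Result); // prints "item-12345"
    }
}
```

In a GUI event handler you would await HandleScanAsync rather than read .Result, because blocking the UI thread on the task can deadlock. Code after the await then continues on the UI thread, which can remove the need for an explicit invoke.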
If most of the time it is short running threads you could use the thread pool or a BackgroundWorker which draws threads from the pool.
An advantage I can see in your case is that the ThreadPool class puts an upper limit on the number of threads that may be active. Whether you will exhaust system resources depends on the context of your application. Exhausting a modern desktop system is VERY hard to do.
If the software is used at a supermarket till, it is highly unlikely that you will have more than 5 barcodes being analysed at the same time. If it runs on a back-end server for a whole row of supermarket tills, then perhaps 30-100 concurrent requests might be active.
With this sort of theory crafting, it is highly unlikely that you will run out of threads, even on embedded hardware. If you have a dozen or so requests active at a time and your code works, it's OK to just leave it as it is.
A thread pool is just an abstraction, though, and you could put a queue in the middle that feeds requests to the thread pool. In the row-of-tills example above, I'd feel comfortable queueing 100-1000 requests against a thread pool with 10 threads.
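The "queue in front of a small number of workers" idea can be sketched as follows. This is a swapped-in technique rather than a literal 10-thread pool: a SemaphoreSlim caps how many requests run at once, and the rest wait their turn without holding a thread. It assumes .NET 4.5 or later, and Task.Delay stands in for the real request work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedProcessor
{
    // Processes all requests with at most maxConcurrent running at once.
    // Returns the highest concurrency actually observed.
    public static async Task<int> ProcessAllAsync(int requestCount, int maxConcurrent)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();

        using (SemaphoreSlim slots = new SemaphoreSlim(maxConcurrent))
        {
            Task[] tasks = Enumerable.Range(0, requestCount).Select(async i =>
            {
                await slots.WaitAsync(); // wait for a free slot without blocking a thread
                try
                {
                    lock (gate) { running++; if (running > peak) peak = running; }
                    await Task.Delay(10); // placeholder for the real request work
                    lock (gate) { running--; }
                }
                finally
                {
                    slots.Release();
                }
            }).ToArray();

            await Task.WhenAll(tasks);
        }

        return peak;
    }

    public static void Main()
    {
        int peak = ProcessAllAsync(100, 10).Result;
        Console.WriteLine("Peak concurrency: " + peak);
    }
}
```

The observed peak never exceeds the limit passed in, which is the property the 10-thread example relies on.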
In .NET (and on Windows in general), the question should always be reversed: "Is creating a new thread worth it in this scenario?"
Creating a new thread is expensive, and doing it over and over again is almost certainly not worth it. The thread pool is cheap, and really should be the first thing you turn to when you need a new thread.
If you decide to spin up a new thread, soon you will start worrying about re-using the thread if it's already running. Then you will start worrying that sometimes the thread is running but it seems to be taking too long, and so you should make a new one. Then you're going to decide to have a thread not exit immediately upon finishing work, but to wait a little while in case new work comes in. And then... bam! You've created your own thread pool. At which point you should just back up and use the system-provided one.
The folks who mentioned that the thread pool might "run out of threads" were well-intentioned, but they did you a disservice. The limit on the number of threads in the thread pool is quite large. If you run into it, you have other problems.
(And, of course, since .NET 2.0, you can set the maximum number of threads, so you can tweak the number if you absolutely have to.)
Others have directed you to MSDN: "The Managed Thread Pool". I will repeat that direction, as the article is good, but in my mind does not sell the thread pool hard enough. :)