Effect of creating large amounts of system threads and waiting on MRE? - c#

I'm trying to fix memory spikes in a very large application. While I'm not sure how much of an effect this would have on memory, I noticed the following:
Application uses a custom thread pool to do all expensive tasks
Application will execute all incoming tasks
Tasks can be composed of thousands of sub tasks
While the thread pool will only execute {T} tasks at a time, and finishes a task completely before starting a new one, it does create a new system thread (Thread class) and start it for every sub task added to it
The sub task system threads are started with a thread start that instantly blocks on a manual reset event (MRE) pending a thread pool slot freeing up
So, this thread pool can create thousands of threads, but all but 30 (or whatever you configure) will be blocked on an MRE while other tasks complete.
My Question:
What impact on the memory/processor will a thousand threads blocked on MREs have? I don't have a lot of time to fix this spike, so if it's minimal, I'd rather leave the issue and work to fix it in a later patch when I have more time.
Also, is this behavior typical in thread pools, or does this sound flawed (I'm leaning towards flawed, but I don't have a good enough background to be sure).

What impact on the memory/processor will a thousand threads blocked on MREs have? I don't have a lot of time to fix this spike, so if it's minimal, I'd rather leave the issue and work to fix it in a later patch when I have more time.
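Each thread, when created manually, has its own stack allocated to it. By default, this will be 1MB per thread, though it is possible to make threads with a smaller stack via a constructor parameter. At the default size, a thousand blocked threads reserve roughly 1GB of virtual address space for stacks alone.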
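A minimal sketch of the smaller-stack constructor parameter mentioned above. The `Thread(ThreadStart, int)` overload is part of the framework; the 256 KB figure and the `SmallStackDemo` names are assumptions chosen for illustration, not values from the original application.

```csharp
using System;
using System.Threading;

public static class SmallStackDemo
{
    // Assumption: 256 KB is enough stack for the sub task; tune for real workloads.
    public const int StackBytes = 256 * 1024;

    // Runs 'work' on a manually created thread with a reduced stack and returns its result.
    public static int RunOnSmallStack(Func<int> work)
    {
        int result = 0;
        // The second constructor argument is the maximum stack size in bytes.
        var thread = new Thread(() => result = work(), StackBytes);
        thread.Start();
        thread.Join();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(RunOnSmallStack(() => 6 * 7));
    }
}
```

This only reduces the per-thread cost; it does not address the design problem of creating one thread per sub task.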
You'd be much better off redesigning this, from the description of your problem, to use the standard ThreadPool, and a class like BlockingCollection<T> to handle your throttling. This is designed to directly allow bounding input, with blocking. Making a custom "thread pool" of an infinite number of threads is going to be far less efficient than using the highly tuned ThreadPool included with the framework.
Also, is this behavior typical in thread pools, or does this sound flawed (I'm leaning towards flawed, but I don't have a good enough background to be sure).
This is definitely flawed. The entire point of a ThreadPool is to avoid making a thread per request, and "pool" the threads (reusing them) for multiple requests without having to recreate them.
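The throttling suggested above could look roughly like the following: a bounded BlockingCollection<T> feeds a fixed number of consumers running on the standard pool, so the producer blocks instead of creating a thread per sub task. The `BoundedWorkQueue` class, its parameters and the integer work items are hypothetical stand-ins for the application's real sub tasks.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedWorkQueue
{
    // Runs every item through 'work' using a fixed number of consumers.
    // 'capacity' bounds the queue, so Add blocks the producer when it is full.
    public static int ProcessAll(int itemCount, int workers, int capacity, Action<int> work)
    {
        int processed = 0;
        using (var queue = new BlockingCollection<int>(capacity))
        {
            var consumers = new Task[workers];
            for (int i = 0; i < workers; i++)
            {
                consumers[i] = Task.Run(() =>
                {
                    // Blocks until an item arrives; ends once CompleteAdding
                    // has been called and the queue is empty.
                    foreach (int item in queue.GetConsumingEnumerable())
                    {
                        work(item);
                        Interlocked.Increment(ref processed);
                    }
                });
            }

            for (int item = 0; item < itemCount; item++)
                queue.Add(item); // blocks while 'capacity' items are pending

            queue.CompleteAdding();
            Task.WaitAll(consumers);
        }
        return processed;
    }

    public static void Main()
    {
        int done = ProcessAll(1000, 4, 16, item => Thread.SpinWait(1000));
        Console.WriteLine(done);
    }
}
```

With this shape, the number of live threads is set by `workers` rather than by the number of queued sub tasks.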

Related

ThreadPool.QueueUserWorkItem causing massive delay to UI thread due to lack of resources - better method to use? [duplicate]

Scenario
I have a Windows Forms Application. Inside the main form there is a loop that iterates around 3000 times, creating a new instance of a class on a new thread to perform some calculations. Bearing in mind that this setup uses a Thread Pool, the UI does stay responsive when there are only around 100 iterations of this loop (100 Assets to process). But as soon as this number begins to increase heavily, the UI locks up into eggtimer mode and thus the log that is writing out to the listbox on the form becomes unreadable.
Question
Am I right in thinking that the best way around this is to use a Background Worker?
And is the UI locking up because even though I'm using lots of different threads (for speed), the UI itself is not on its own separate thread?
Suggested Implementations greatly appreciated.
EDIT!!
So let's say that instead of just firing off and queuing up 3000 assets to process, I decide to do them in batches of 100. How would I go about doing this efficiently? I made an attempt earlier at adding "Thread.Sleep(5000);" after every batch of 100 was fired off, but the whole thing seemed to crap out....
If you are creating 3000 separate threads, you are pushing a documented limitation of the ThreadPool class:
If an application is subject to bursts of activity in which large numbers of thread pool tasks are queued, use the SetMinThreads method to increase the minimum number of idle threads. Otherwise, the built-in delay in creating new idle threads could cause a bottleneck.
See that MSDN topic for suggestions to configure the thread pool for your situation.
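As a rough illustration of the SetMinThreads advice quoted above, the sketch below raises the worker-thread minimum only if it is currently lower than the requested value. `ThreadPool.GetMinThreads` and `ThreadPool.SetMinThreads` are real framework calls; the `PoolTuning` helper and the choice of twice the processor count are assumptions.

```csharp
using System;
using System.Threading;

public static class PoolTuning
{
    // Raises the worker-thread minimum to at least 'desiredWorkers' and
    // returns the minimum in effect afterwards.
    public static int RaiseMinWorkerThreads(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int workers, out int ioThreads);
        if (desiredWorkers > workers)
        {
            // SetMinThreads returns false (and changes nothing) if the value
            // is rejected, for example when it exceeds the current maximum.
            ThreadPool.SetMinThreads(desiredWorkers, ioThreads);
        }
        ThreadPool.GetMinThreads(out workers, out ioThreads);
        return workers;
    }

    public static void Main()
    {
        // Assumption: twice the core count is a reasonable burst allowance.
        Console.WriteLine(RaiseMinWorkerThreads(Environment.ProcessorCount * 2));
    }
}
```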
If your work is CPU intensive, having that many separate threads will cause more overhead than it's worth. However, if it's very IO intensive, having a large number of threads may help things somewhat.
.NET 4 introduces outstanding support for parallel programming. If that is an option for you, I suggest you have a look at that.
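One possible shape for the .NET 4 approach is Parallel.ForEach with a capped degree of parallelism, so 3000 assets are worked through by a handful of threads rather than 3000. `Calculate` is a hypothetical stand-in for the per-asset calculation, and capping at the processor count is an assumption that suits CPU-bound work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AssetProcessor
{
    // Hypothetical stand-in for the per-asset calculation in the question.
    private static long Calculate(int assetId)
    {
        long sum = 0;
        for (int i = 1; i <= 1000; i++) sum += (long)assetId * i % 7;
        return sum;
    }

    // Processes every asset and returns how many were completed.
    public static int ProcessAssets(int assetCount)
    {
        int completed = 0;
        var options = new ParallelOptions
        {
            // Never run more calculations at once than there are cores.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        Parallel.ForEach(Enumerable.Range(0, assetCount), options, assetId =>
        {
            Calculate(assetId);
            Interlocked.Increment(ref completed);
        });
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(ProcessAssets(3000));
    }
}
```

Parallel.ForEach blocks its caller until all items finish, so in a Windows Forms application it would need to be started from a background task rather than directly from the UI thread.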
More threads does not equal top speed. In fact, too many threads equals less speed. If your task is simply CPU related, you should only be using as many threads as you have cores; otherwise you're wasting resources.
With 3,000 iterations and your form thread attempting to create a thread each time what's probably happening is you are maxing out the thread pool and the form is hanging because it needs to wait for a prior thread to complete before it can allocate a new one.
Apparently ThreadPool doesn't work this way. I have never checked it with threads before so I am not sure. Another possibility is that the tasks begin flooding the UI thread with invocations at which point it will give up on the GUI.
It's difficult to tell without seeing code - but, based on what you're describing, there is one suspect.
You mentioned that you have this running on the ThreadPool now. Switching to a BackgroundWorker won't change anything, dramatically, since it also uses the ThreadPool to execute. (BackgroundWorker just simplifies the invoke calls...)
That being said, I suspect the problem is your notifications back to the UI thread for your ListBox. If you're invoking too frequently, your UI may become unresponsive while it tries to "catch up". This can happen if you're feeding too much status info back to the UI thread via Control.Invoke.
Otherwise, make sure that ALL of your work is being done on the ThreadPool, and you're not blocking on the UI thread, and it should work.
If every thread logs something to your ui, every written log line must invoke the main thread. Better to cache the log-output and update the gui only every 100 iterations or something like that.
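The caching idea can be sketched without any UI code: workers append to a thread-safe buffer, and the UI thread drains it in batches. The `LogBuffer` class and the batch size of 100 are assumptions for illustration; in a real form, `Drain` would be called from something like a UI timer's Tick handler.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class LogBuffer
{
    private readonly ConcurrentQueue<string> _pending = new ConcurrentQueue<string>();

    // Called from worker threads; never touches the UI.
    public void Add(string line) => _pending.Enqueue(line);

    // Called on the UI thread; returns up to 'max' buffered lines
    // so the UI is updated once per batch instead of once per line.
    public List<string> Drain(int max)
    {
        var batch = new List<string>();
        while (batch.Count < max && _pending.TryDequeue(out string line))
            batch.Add(line);
        return batch;
    }
}

public static class LogBufferDemo
{
    public static void Main()
    {
        var log = new LogBuffer();
        for (int i = 0; i < 250; i++) log.Add("asset " + i + " done");

        // In a WinForms app this loop body would run in a timer's Tick handler.
        List<string> batch;
        while ((batch = log.Drain(100)).Count > 0)
            Console.WriteLine("flushing " + batch.Count + " lines");
    }
}
```

Each drained batch could then be handed to the list box in a single call, which keeps the number of UI updates independent of the number of worker threads.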
Since I haven't seen your code, this is mostly conjecture with some hopefully educated guessing.
All a threadpool does is queue up your requests and then fire new threads off as others complete their work. Now 3000 threads doesn't sound like a lot, but if there's a ton of processing going on you could be destroying your CPU.
I'm not convinced a background worker would help out, since you will end up re-creating a manager to handle all the pooling the threadpool gives you. I think your issue is more that you've got too much data chunking going on. I think a good place to start would be to throttle the number of threads you start and maintain. The threadpool manager easily allows you to do this. Find a balance that allows you to process data while still keeping the UI responsive.

threading higher priority for a not known in advance thread

I create about 5000 background workers that do intensive work in a console app. I'm also using an external library that instantiates an object, say ObjectX. At some point, say t0, ObjectX tries to obtain a thread from an os thread pool and start it, but I have no control on how it obtains this thread. Things work fine for 100 background workers. For 1000 background workers it takes about 10 minutes after t0 for ObjectX to obtain and start a thread.
1. Is there a way to set, in advance, a high priority for any threads that will be started in the future by an object?
2. As I think the answer to 1 is "no", is there a way to limit the priority of the background workers so as to somehow favor everything else? Even though I only want to 'favor' ObjectX.
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
I'm using C# and the .NET Framework 3.5, on a 64-bit Windows machine.
The way threads work is that they are given processor time by the OS. Handing the CPU from one thread to another is called a context switch. A context switch takes about 2000-8000 cycles, depending on the processor. If the machine has many CPUs or cores, the OS may not need to take the CPU away from one thread and give it to another, avoiding a context switch. Only one thread can run on a CPU at a time; when you have more threads that need CPU than you have CPUs, you're forcing context switches. Context switches are performed no faster than the system quantum (every 20ms for client and 120ms for server).
If you have 5000 background workers you effectively have 5000 threads, each potentially vying for CPU time. On a client version of Windows, with a 20ms quantum (50 quanta per second), switching every thread in once per quantum would mean 5000 × 50 = 250,000 context switches per second. At 2000-8000 cycles each, that is 500,000,000 to 2,000,000,000 cycles per second devoted simply to switching between threads (over and above the work your threads are performing), if the system could even process that many context switches per second.
The recommended practice is to only have one CPU-bound thread per processor. A CPU-bound thread is one that spends very little time "waiting". The UI thread is not a CPU-bound thread. If your background workers are spending a lot of time waiting for locks, then they may not be CPU-bound either--but, in general, background worker threads are CPU-bound. (otherwise, what would be the point of using a background worker?).
Also, the OS spends a lot of time figuring out what thread needs to get the CPU next. When you start changing thread priorities you interfere with that and most of the time end up making your entire system slower (not just your application) rather than faster.
Update:
On a related note, it takes about 200,000 cycles to create a new thread and about 100,000 cycles to destroy a thread.
Update 2:
If the impetus of the question isn't simply "can it be done" but to be able to scale workload, then, as @JoshW/@Servy mention, using something like the Producer/Consumer Pattern would allow for scalability that could facilitate horizontal scaling to multiple computers/nodes via a queue or a service bus. Simply starting up an inordinate number of threads is not scalable beyond the number of CPUs. What you truly want is an architecture that can be scaled out, because having "available resources...how overloaded the machine is" is simply impossible.
Personally I think this is a bad idea, however... given the comments you have made on other answers and your request that "No matter how many background workers are create that ObjectX runs as soon as possible"... You could conceivably force your background workers to block using a ManualResetEvent.
For example at the top of your worker code you could block on a Manual reset event with the WaitOne method. This manual reset could be static or passed as an input parameter and wherever your ObjectX gets instantiated/called or whatever, you call the .Reset method on your ManualResetEvent. This would block all your workers at the WaitOne line. Next at the bottom of the code that runs ObjectX, call the ManualResetEvent.Set() method and that will unblock the workers.
Note this is NOT an efficient way to manage your threads, but if you "just have to make it work" and have time later to improve it... I suppose it's one possible solution.
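The Reset/WaitOne/Set sequence described above might be sketched as follows. `ManualResetEvent` and its methods are real; the `WorkerGate` class, the counter and the 100 ms sleep standing in for ObjectX's work are hypothetical.

```csharp
using System;
using System.Threading;

public static class WorkerGate
{
    // Signalled means workers may run; Reset() makes them wait at WaitOne.
    private static readonly ManualResetEvent Gate = new ManualResetEvent(true);
    private static int _completed;

    public static void Worker()
    {
        Gate.WaitOne();                         // blocks while the gate is reset
        Interlocked.Increment(ref _completed);  // stand-in for the real work
    }

    // Starts 'workerCount' workers while the gate is closed, reports how many
    // finished while it was closed, then opens the gate and waits for all.
    public static int RunDemo(int workerCount, out int completedWhileHeld)
    {
        _completed = 0;
        Gate.Reset();                           // ObjectX is about to run: hold workers

        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            threads[i] = new Thread(Worker);
            threads[i].Start();
        }

        Thread.Sleep(100);                      // stand-in for ObjectX's work
        completedWhileHeld = Volatile.Read(ref _completed);

        Gate.Set();                             // ObjectX is done: release workers
        foreach (Thread t in threads) t.Join();
        return _completed;
    }

    public static void Main()
    {
        int total = RunDemo(8, out int held);
        Console.WriteLine(held + " finished while held, " + total + " finished in total");
    }
}
```

The blocked workers still exist and still hold their stacks while they wait, which is why the answer treats this as a stopgap rather than a design.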
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
Then thread priorities might not be the right tool. Remember, thread priorities are evil.
In general, windows is not a real-time OS; especially, win32 does not even attempt to be soft real-time (IIRC, the NT kernel tried, at some point, to have at least support for soft real time subsystems, but I may be wrong). So there is no guarantee about available resources, or timing.
Also, are you worried about other threads in the system? Those threads are out of your control (what if the other threads are already at the system max priority?).
If you are worried about threads in your app... you can control and throttle them, using fewer threads/workers to do more work (batching work in bigger units and submitting it to a worker, for example, or using TPL or other tools that will handle and throttle thread usage for you).
That said, you could intercept when a thread is created (see, for example, this answer: https://stackoverflow.com/a/3802316/863564), check whether it was created for ObjectX (for example, by checking its name) and use SetThreadPriority to boost it.

Is ThreadPool worth it in this scenario?

I have a thread that I fire off every time the user scans a barcode.
Most of the time it is a fairly short running thread. But sometimes it can take a very long time (waiting on an invoke to the GUI thread).
I have read that it may be a good idea to use the ThreadPool for this rather than just creating my own thread for each scan.
But I have also read that if the ThreadPool runs out of threads then it will just wait until some other thread exits (not OK for what I am doing).
So, how likely is it that I am going to run out of threads? And is the benefit of the ThreadPool really worth it? (When I scan it does not seem to take too long for the scan to "run" the thread logic.)
It depends on what you mean by "a very long time" and how common that scenario is.
The MSDN topic "The Managed Thread Pool" offers good guidelines for when not to use thread pool threads:
There are several scenarios in which it is appropriate to create and manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
Since the user will never scan more than one barcode at a time, the memory costs of the threadpool might not be worth it - I'd stick with a single thread just waiting in the background.
The point of the thread pool is to amortize the cost of creating threads, which are not inexpensive to spin up and tear down. If you have a short-running task, the cost of creating/destroying the thread can be a significant portion of the overall run-time. The maximum number of threads in the thread pool depends on the version of the .NET Framework, typically dozens to hundreds per processor. The number of threads is scaled depending on available work.
Will you run out of threads and have to wait for a thread to become available? It depends on your workload. You can get the maximum number of threads available via ThreadPool.GetMaxThreads(). Chances are (based on the description of your problem) that this number is sufficiently high.
http://msdn.microsoft.com/en-us/library/system.threading.threadpool.getmaxthreads.aspx
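A small sketch of checking the limits mentioned above. `ThreadPool.GetMaxThreads` and `ThreadPool.GetAvailableThreads` are real framework calls; the `PoolLimits` helper is a hypothetical wrapper, and the numbers it reports depend on the runtime version and machine.

```csharp
using System;
using System.Threading;

public static class PoolLimits
{
    // Returns the maximum number of worker threads the pool may create.
    public static int MaxWorkerThreads()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIoThreads);
        return maxWorkers;
    }

    // Returns how many more worker threads could be started right now.
    public static int FreeWorkerThreads()
    {
        ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIoThreads);
        return freeWorkers;
    }

    public static void Main()
    {
        Console.WriteLine("max worker threads:  " + MaxWorkerThreads());
        Console.WriteLine("free worker threads: " + FreeWorkerThreads());
    }
}
```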
Another option would be to manage your own pool of scan threads and assign them work rather than creating a new thread for every scan. Personally I would try the threadpool first and only manage your own threads if it proved necessary. Even better, I would look into async programming techniques in .NET. The methods will be run on the thread pool, but give you a much nicer programming experience than manual thread management.
If most of the time it is short running threads you could use the thread pool or a BackgroundWorker which draws threads from the pool.
An advantage I can see in your case is that the threadpool class puts an upper limit on the number of threads that may be active. It depends on the context of your application whether you will exhaust system resources. Exhausting a modern desktop system is VERY hard to do really.
If the software is used in a supermarket till, it is highly unlikely that you will have more than 5 barcodes being analysed at the same time. If it's run on a back-end server for a whole row of supermarket tills, then perhaps 30-100 concurrent requests might be active.
With this sort of theory crafting it is highly unlikely that you will run out of threads, even on embedded hardware. If you have a dozen or so requests active at a time, and your code works, it's ok to just leave it as it is.
A thread pool is just an abstraction though, and you could have a queue in the middle that queues requests onto a thread pool. In this scenario, for the row-of-tills example above, I'd feel comfortable queueing 100-1000 requests against a threadpool with 10 threads.
In .net (and on windows in general), the question should always be reversed: "Is creating a new thread worth it in this scenario?"
Creating a new thread is expensive, and doing it over and over again is almost certainly not worth it. The thread pool is cheap, and really should be the first thing you turn to when you need a new thread.
If you decide to spin up a new thread, soon you will start worrying about re-using the thread if it's already running. Then you will start worrying that sometimes the thread is running but it seems to be taking too long, and so you should make a new one. Then you're going to decide to have a thread not exit immediately upon finishing work, but to wait a little while in case new work comes in. And then... bam! You've created your own thread pool. At which point you should just back up and use the system-provided one.
The folks who mentioned that the thread pool might "run out of threads" were well-intentioned, but they did you a disservice. The limit on the number of threads in the thread pool is quite large. If you run into it, you have other problems.
(And, of course, since .net 2.0, you can set the maximum number of threads, so you can tweak the number if you absolutely have to.)
Others have directed you to MSDN: "The Managed Thread Pool". I will repeat that direction, as the article is good, but in my mind does not sell the thread pool hard enough. :)

Thread.Sleep(Timeout.Infinite) performance issues

The main execution path (main thread) is going to be forked into two execution paths (two new threads on different jobs), but the main thread is no longer needed. I can assign one of the tasks to the main thread and save one thread (one task by the main thread and another by a new thread), but I was wondering whether putting the main thread in an infinite sleep with Thread.Sleep(Timeout.Infinite) is a good approach or not. My class is going to be instantiated many times, and if a thread in infinite sleep takes resources from the OS, it's bad news for me.
Each thread you create takes up stack space. On Windows, that's 1MB by default. There are also other internal house-keeping data structures that the operating system uses to keep track of threads which will take up a bit of memory as well, but the 1MB stack is definitely going to be the biggest consumer of resources.
Having said that, if we're only talking about 2 vs. 3 threads, then the difference is quite small. If it was 200 vs. 300 then you might have something to worry about. But if you're spawning a lot of threads, you'd be better off using some kind of thread pool (like, say, the one built-in to the .NET framework) rather than spawning individual threads anyway.
All threads tie up resources, regardless of if they're sleeping or not.
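Since a sleeping thread still holds its stack, the alternative raised in the question (letting the main thread do one of the two jobs) avoids the idle thread entirely. A minimal sketch, with the `ForkDemo` helper and the two `Action` jobs as hypothetical placeholders:

```csharp
using System;
using System.Threading;

public static class ForkDemo
{
    // Runs jobA on a new thread and jobB on the calling thread,
    // so no thread sits in Thread.Sleep(Timeout.Infinite).
    public static void RunBoth(Action jobA, Action jobB)
    {
        var worker = new Thread(() => jobA());
        worker.Start();
        jobB();          // the calling thread does useful work instead of sleeping
        worker.Join();   // wait for the other path before returning
    }

    public static void Main()
    {
        RunBoth(
            () => Console.WriteLine("job A on thread " + Thread.CurrentThread.ManagedThreadId),
            () => Console.WriteLine("job B on thread " + Thread.CurrentThread.ManagedThreadId));
    }
}
```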

Why .net Threadpool is used only for short time span tasks?

I've read in many places that the .NET ThreadPool is meant for short time span tasks (maybe no more than 3 seconds). In all these mentions I've not found a concrete reason why it should not be used for longer ones.
Some people even said that it leads to nasty results if we use it for long-running tasks, and that it can also lead to deadlocks.
Can somebody explain, in plain English and with technical reasons, why we should not use the thread pool for long time span tasks?
To be specific, I would like to give a scenario and want to know why the ThreadPool should not be used in this scenario, with proper reasons behind it.
Scenario: I need to process data for some thousands of users. Each user's processing data is retrieved from a local database, and using that information I need to connect to an API hosted at some other location; the response from the API will be stored in the local database after processing it.
Can someone explain the pitfalls in this scenario if I use the ThreadPool with a thread limit of 20? Processing time of each user may range from 3 sec to 1 min (or more).
The point of the threadpool is to avoid the situation where the time spent creating the thread is longer than the time spent using it. By reusing existing threads, we get to avoid that overhead.
The downside is that the threadpool is a shared resource: if you're using a thread, something else can't. So if you have lots of long-running tasks, you could end up with thread-pool starvation, possibly even leading to deadlock.
Don't forget that your application's code may not be the only code using the thread pool... the system code uses it a lot too.
It sounds like you might want to have your own producer/consumer queue, with a small number of threads processing it. Alternatively, if you could talk to your other service using an asynchronous API, you may find that each bit of processing on your computer would be short-lived.
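If the remote service can be called asynchronously, one way to keep the limit of 20 without holding 20 pool threads is to throttle with a semaphore, as in the sketch below. `SemaphoreSlim.WaitAsync` and `Task.WhenAll` are real APIs; `CallApiAsync` is a hypothetical stand-in that simulates the remote call with `Task.Delay`, and the doubled user ID is a placeholder result.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class UserProcessor
{
    // Hypothetical stand-in for the remote API call; Task.Delay simulates
    // network latency without occupying a thread while waiting.
    private static async Task<int> CallApiAsync(int userId)
    {
        await Task.Delay(10);
        return userId * 2;
    }

    // Processes every user with at most 'maxConcurrent' calls in flight.
    // Results come back in the same order as 'userIds'.
    public static async Task<int[]> ProcessUsersAsync(int[] userIds, int maxConcurrent)
    {
        using (var throttle = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = userIds.Select(async id =>
            {
                await throttle.WaitAsync();      // waits without blocking a thread
                try { return await CallApiAsync(id); }
                finally { throttle.Release(); }
            }).ToArray();

            return await Task.WhenAll(tasks);
        }
    }

    public static void Main()
    {
        int[] users = Enumerable.Range(1, 200).ToArray();
        int[] results = ProcessUsersAsync(users, 20).GetAwaiter().GetResult();
        Console.WriteLine(results.Length + " users processed");
    }
}
```

Because each wait is asynchronous, a pool thread is only used for the short bursts of processing between awaits, which matches the answer's point about keeping each bit of work short-lived.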
It is related to the way the threadpool scheduler works. It tries hard to ensure that it won't release more waiting threads than you have CPU cores. That is a good idea: running more threads than cores is wasteful, as Windows spends time switching context between threads, making the overall time needed to complete the jobs longer.
As soon as a TP thread completes, another one is allowed to run. Two times per second, the TP scheduler steps in when the running threads do not complete. It cannot tell why these threads are taking so much time to get their job done. Half a second is a lot of CPU cycles, a cool billion or so. It therefore assumes that the threads are blocking, waiting for some kind of I/O to complete. Like a dbase query, a disk read, a socket connection attempt, stuff like that.
And it allows another thread to run. You've now got more threads than you have cores. Which isn't really a problem if those original threads are indeed blocking, as they're not consuming any CPU cycles.
You can see where this leads: if your thread runs for 3 seconds then it's creating a bit of a logjam. It delays, but won't block, other TP threads that are waiting to run. If your thread needs to spend so much time because it is constantly blocking then you are better off creating a regular Thread. And if you really care that the thread does not get delayed by the TP scheduler then you should use a Thread as well.
The TP scheduler was tinkered with in .NET 4.0 btw, what I wrote is really only true for earlier releases. The basics are still there, it just uses a smarter scheduling algorithm, based on feedback: it schedules dynamically by measuring throughput. This really only matters if you have a lot of TP threads going.
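A minimal sketch of moving long, blocking work onto a regular Thread so it never occupies a pool thread. The `DedicatedWorker` helper and the thread name are hypothetical; marking the thread as a background thread is an assumption that suits work which should not keep the process alive.

```csharp
using System;
using System.Threading;

public static class DedicatedWorker
{
    // Starts 'longRunningWork' on its own thread, outside the thread pool.
    public static Thread StartDedicated(Action longRunningWork)
    {
        var thread = new Thread(() => longRunningWork())
        {
            IsBackground = true,          // assumption: do not keep the process alive
            Name = "long-running-worker"  // hypothetical name, useful in a debugger
        };
        thread.Start();
        return thread;
    }

    public static void Main()
    {
        Thread t = StartDedicated(() =>
            Console.WriteLine("on pool thread: " + Thread.CurrentThread.IsThreadPoolThread));
        t.Join();
    }
}
```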
Two reasons not really touched upon:
The threadpool is used as the normal means of handling I/O callback functions, which are usually supposed to happen very soon after associated I/O operation completes. In general, timeliness is more important with short tasks than long ones, but long-running tasks in the threadpool will delay the execution of notification tasks which could have (and should have) started up, run, and completed quickly.
If a threadpool task becomes blocked until such time as some other threadpool task runs, it may hog a threadpool thread, thus delaying or in some cases blocking altogether the start of that other task (or any others).
Generally, having a threadpool thread acquire a lock (waiting if necessary) isn't a problem. If it's necessary for one threadpool thread to wait for another threadpool thread to release a lock, the fact that the latter thread acquired the lock in the first place implies that it got started. On the other hand, waiting for e.g. some data to arrive from a connection may cause deadlock if an I/O callback routine is used to flag the arrival of data. If too many threadpool threads are waiting for the I/O callback to signal that data has arrived, the system may decide to defer the callback until one of the threadpool threads completes.
