I built an app that performs work on thousands of files, then writes modified copies of those files to disk. I am using a ThreadPool, but it was spawning so many threads that the PC became unresponsive (260 total), so I changed the max from the default of 250 down to 50. That fixed the thread count (the app now spawns only about 60 threads total), but because the files now become ready so quickly, the writes are tying up the UI to the point where the PC is unresponsive.
Is there a way to limit the amount of simultaneous I/O? I like using 50 threads to perform the work on the files, but I don't want 50 threads writing at the same time once the files are processed. I would rather not re-architect the file-writing part if I can avoid it; I was hoping I could just limit how much simultaneous I/O the threads from this pool consume.
Use a semaphore to limit the number of threads that can write to disk simultaneously.
http://msdn.microsoft.com/en-us/library/system.threading.semaphore.aspx
Limits the number of threads that can access a resource or pool of resources concurrently.
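A minimal sketch of that idea, assuming the processing stays on your 50 pool threads and only the writes are gated; the limit of 4 concurrent writers and the WriteResult helper are illustrative, not from the original app:

    using System.IO;
    using System.Threading;

    static class GatedWriter
    {
        // At most 4 threads may be inside the write section at once.
        private static readonly Semaphore writeGate = new Semaphore(4, 4);

        public static void WriteResult(string path, byte[] data)
        {
            writeGate.WaitOne();           // block until a write slot is free
            try
            {
                File.WriteAllBytes(path, data);
            }
            finally
            {
                writeGate.Release();       // free the slot even if the write throws
            }
        }
    }

Each worker thread calls WriteResult when its file is done; no matter how many workers exist, at most 4 of them ever touch the disk at the same time.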
You really don't need so many threads. A disk can only support its maximum read and write throughput, which a single thread can easily max out if it is dedicated to IO, i.e. reading or writing. You also cannot read from and write to a hard disk simultaneously (although this is complicated by OS caching layers, etc.), so having concurrent threads reading and writing can be very counter-productive. There is also little to be gained from having more threads than processors\cores for your non-IO tasks, as any additional threads will spend much of their time waiting for a core to become available; e.g. if you have 50 threads and 4 cores, a minimum of 46 of the threads will be idle at any given time. The wasted threads contribute to memory consumption and also incur a performance overhead, as they all fight for a crack at some time on a core and the OS has to arbitrate this fight.
A more straightforward approach would be to have a single thread whose job is to read in the files and add the data to a blocking queue (e.g. a BlockingCollection over a ConcurrentQueue); meanwhile have a number of worker threads waiting on file data in that queue (e.g. a number of threads equal to the number of processors\cores). These worker threads will munch their way through the queue as items are added, and block when it is empty. When a worker thread finishes a piece of work, it can add the result to another blocking queue, which is monitored either by the reader thread or by a dedicated writer thread whose job is to write the files out.
This pattern seeks to balance IO and CPU amongst a much smaller group of co-operating threads, where the number of IO threads is limited to what the hard drive is physically capable of, and the number of CPU worker threads is sensible for the number of processors\cores you have. In essence it separates IO work from CPU work so that things behave more predictably.
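A hedged sketch of that pipeline using BlockingCollection (which wraps a ConcurrentQueue by default); the "input"/"output" directories, the FileItem type, and the Transform method are placeholders for your real work:

    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading;

    class FileItem
    {
        public string Path;
        public byte[] Data;
    }

    class Pipeline
    {
        // Bounded queues so the reader can't race too far ahead of the workers.
        static readonly BlockingCollection<FileItem> toProcess = new BlockingCollection<FileItem>(64);
        static readonly BlockingCollection<FileItem> toWrite = new BlockingCollection<FileItem>(64);

        static void Main()
        {
            // Single reader: the only thread that reads from disk.
            var reader = new Thread(() =>
            {
                foreach (var path in Directory.EnumerateFiles("input"))
                    toProcess.Add(new FileItem { Path = path, Data = File.ReadAllBytes(path) });
                toProcess.CompleteAdding();
            });
            reader.Start();

            // One CPU worker per core; each blocks when the queue is empty.
            var workers = new Thread[Environment.ProcessorCount];
            for (int i = 0; i < workers.Length; i++)
            {
                workers[i] = new Thread(() =>
                {
                    foreach (var item in toProcess.GetConsumingEnumerable())
                    {
                        item.Data = Transform(item.Data);   // the CPU-bound work
                        item.Path = Path.Combine("output", Path.GetFileName(item.Path));
                        toWrite.Add(item);
                    }
                });
                workers[i].Start();
            }

            // Single writer: the only thread that writes to disk.
            var writer = new Thread(() =>
            {
                foreach (var item in toWrite.GetConsumingEnumerable())
                    File.WriteAllBytes(item.Path, item.Data);
            });
            writer.Start();

            foreach (var w in workers) w.Join();
            toWrite.CompleteAdding();
            writer.Join();
            reader.Join();
        }

        static byte[] Transform(byte[] data) { return data; }   // placeholder for real work
    }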
Further to this, if IO really is the problem (and not a huge number of threads all fighting each other), then you can place some pauses (e.g. Thread.Sleep) in your file reading and writing threads to limit how much work they do.
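For instance, a crude throttle for the writer thread in the sketch above; the 50 ms pause is arbitrary and would need tuning against your own disk:

    // Replaces the writer loop in the sketch above: pause between
    // writes so the disk never sees a saturating stream of requests.
    foreach (var item in toWrite.GetConsumingEnumerable())
    {
        File.WriteAllBytes(item.Path, item.Data);
        Thread.Sleep(50);    // arbitrary pause; tune empirically
    }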
Update
Perhaps it is worth explaining why so many threads get created in the first place. This is a degenerate case for threadpool use, and it is centred around queueing workitems that have an IO component.
The threadpool executes work items from its queue and monitors how long they take. If the currently executing workitems are taking a long time to complete (I think half a second, from memory) then it starts adding more threads to the pool, as it believes this will get the queue processed quicker\more fairly. However, if the additional concurrent workitems are also performing IO against a shared disk, the performance of the disk will actually decrease, meaning that workitems take even longer to execute. Because workitems are taking longer to execute, the threadpool adds more threads. This is the degenerate case, where performance gets worse and worse as more threads are added.
The use of a semaphore as suggested would have to be done carefully: the semaphore could block threadpool threads, the threadpool would see workitems taking a long time to execute, and it would still start adding more threads.
Related
I have a console application (C#) where I have to call various third-party APIs and collect data. I have to do this simultaneously for different users, and I am using threads for it. But as the number of users increases, this service is eating into the CPU performance and affecting other processes. Is there a way to use threads for parallel processing without hurting CPU performance in a huge way?
I assume from your question that you're creating threads manually, and so the quick way to answer this is to suggest that you use an API like the Task Parallel Library, because this will take an arbitrary number of tasks and try to use a sensible number of threads to process them - so given 500 API requests, it would limit itself to just a few threads.
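As a hedged sketch of that suggestion, assuming a users collection and a CallApi helper (both illustrative): Parallel.ForEach lets the TPL pick a sensible thread count, and ParallelOptions caps it explicitly if needed.

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    class ApiRunner
    {
        static void CollectAll(IEnumerable<string> users)
        {
            // Let the TPL schedule the calls rather than creating one
            // thread per user; the cap of 4 concurrent calls is illustrative.
            Parallel.ForEach(
                users,
                new ParallelOptions { MaxDegreeOfParallelism = 4 },
                user => CallApi(user));
        }

        static void CallApi(string user)
        {
            // placeholder for the real third-party API call
        }
    }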
However, to answer in more detail: the typical reason that you would see this problem is that code is creating too many threads. Threads are not free resources - they are expensive.
A made-up example based on your question might be this:
you have 5 3rd party APIs that you need to call, and each is going to return ~1MB of data per user
you call each API on a separate background thread, for each user
you have 100 users
you therefore have created 500 threads in total, each of which is waiting on data from the network
The problem here is that there are 500 threads the program is trying to manage, and they are all waiting on the slowest piece of the system - the network.
More simply, we are trying to download 500 pieces of data at once (which in this example would mean everything finishes slowly), rather than downloading them one at a time so that individual items finish earlier. Because each thread does nothing but wait for the network, the CPU switches between idle threads continually. As you increase your number of users, the number of threads increases, which increases the CPU usage spent just on switching between threads, even though each thread is actually downloading more slowly. This is (approximately) why you'll see slower performance as your user count goes up.
A better example would be to take the same scenario and use just one background thread:
you have 5 3rd party APIs that you need to call, and each is going to return ~1MB of data per user
each API call is put into a queue and the queue is processed by a single thread
you have 100 users
you therefore have 1 thread running in the background which is using the full available bandwidth of the network for each request
In this example, your CPU usage will be pretty consistent - no matter how many users you have, there is only one background thread running, so context switching is minimised. Each individual API call runs at the maximum rate of the network card and so finishes as quickly as possible.
The reality is that one thread is probably not enough: a single request is unlikely to saturate the network, as there will be limiting factors elsewhere. But this is something you can tune later: maybe 2 or 3 threads would be more performant, but 4 threads would be slower again. The general rule when threading is to start small and work up, not to create a thread for each piece of work.
First, run a profiler and check out some refactoring tools to see whether code optimization can resolve the issue. If your application is still overloading the server, then set up or purchase load balancing. In the meantime, if you are running the latest OSes, you could try setting a hacky CPU rate limit... however, that may not work for the needs you described.
I have a dual core processor, now let's say that I want to make a spam bot program, which will spam messages such as "Hey, how are you?".
My question is: what number of threads would pop up these messages the fastest, 5 threads or 100 threads? (Of course, these numbers aren't special, just examples.) All of the threads will run thread-safe code.
EDIT: As for the downvotes, I'm not really writing a spambot program; I just mentioned it as an example for my question. Sorry for the misunderstanding.
The ideal number of threads depends on your hardware (in this case a dual core processor), and on what those threads are doing. If they are CPU intensive, more than 1 thread per core will probably slow things down.
If the threads do some IO, you will see an overall increase in performance by adding threads. The point of diminishing returns depends entirely on the nature of the non-CPU tasks and on the specific hardware.
To find that point, you will have to test various thread totals.
You can design your system to self-tune the number of threads in use. I once designed a system that ran best (most total throughput) when the total CPU load was about 70%. To optimize for that value, I added threads (with a delay between threads) until the CPU was at 70%, +/- 5%. If it went above 80%, I signaled one or more threads to finish their current work and terminate. If it went below 60%, I gradually added threads. Worked like a charm.
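A hedged sketch of that self-tuning idea, assuming .NET Framework on Windows (PerformanceCounter); the 60-80% band, the 2-second sampling delay, and all names are illustrative, not the original system's code:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading;

    class SelfTuningPool
    {
        static readonly List<CancellationTokenSource> workers =
            new List<CancellationTokenSource>();

        static void Main()
        {
            var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");
            cpu.NextValue();                 // first sample always reads 0; prime it

            while (true)
            {
                Thread.Sleep(2000);          // delay between adjustments
                float load = cpu.NextValue();

                if (load < 60)
                    AddWorker();             // under the target band: grow
                else if (load > 80 && workers.Count > 1)
                {
                    // Over the band: signal one worker to finish and exit.
                    var last = workers[workers.Count - 1];
                    workers.RemoveAt(workers.Count - 1);
                    last.Cancel();
                }
            }
        }

        static void AddWorker()
        {
            var cts = new CancellationTokenSource();
            workers.Add(cts);
            new Thread(() =>
            {
                while (!cts.Token.IsCancellationRequested)
                    DoOneUnitOfWork();       // placeholder for the real job
            }).Start();
        }

        static void DoOneUnitOfWork()
        {
            Thread.Sleep(100);               // placeholder
        }
    }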
Deliberately creating more threads than processors is a standard technique for making use of "spare cycles", where a thread is blocked waiting for something (I/O, a mutex, or anything else), by providing some other useful work for the processor to do.
If your threads are doing I/O then this is a strong contender for the speed-up: as each thread blocks waiting for the I/O, the processor can run the other threads until they too block for I/O, hopefully by which time the data for the first thread is ready, and so forth.
Source: Anthony Williams
In my .NET multithreaded program, I am wondering whether all these threads run in the same process or in different processes?
If they are in the same process, then I assume one process runs on one core, so how can multithreading utilize all four cores of my quad-core CPU?
But if they are in different processes, then (as I understand it, different processes have a different data-sharing mechanism than threads in the same process) how come I don't need to write different code to handle this in my multithreading program? Would anyone shed some light on this?
I want to ask two more similar questions:
When I open the task manager, oftentimes I can see around 800 threads and 54 processes, yet my CPU usage is only 5%, and I was told that each core only executes one thread at a time.
Is my CPU running these 800 threads all the time, or does that only mean 800 threads are queued, waiting for the CPU to process them?
If I want my multithreading program to fully utilize my quad-core CPU, can I raise the CPU usage by creating more threads (this seems to contradict the theory that only one thread runs on a core at a time)?
Multithreading means multiple threads in the same process.
Each thread can be assigned to a different core.
But all the threads belong to the same process; for example, if one of the threads throws an unhandled exception, the process will crash with all its threads.
You could read a bit more about it; just search Google or Wikipedia for "Software Multithreading".
A single process may use a number of threads; even a basic .NET "hello world" console exe probably uses 4 or 5. So yes, a single process can potentially use all your available cores if you write it to do so.
Because it is the same process, data sharing is direct, but care must be taken if you are changing the values, as otherwise very bad things can happen. Access must be carefully synchronized (lock etc.) if you are changing the data within the threaded code.
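A minimal sketch of that synchronization; the counter and all names are illustrative:

    class SharedCounter
    {
        private readonly object gate = new object();
        private int count;

        public void Increment()
        {
            lock (gate)          // only one thread mutates count at a time
            {
                count++;
            }
        }

        public int Read()
        {
            lock (gate) { return count; }
        }
    }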
You do, however, usually have to write different code to support multiple threads. An exception to this is when the framework is doing that for you; for example, ASP.NET or WCF may take incoming requests and hand them to different worker threads, allowing multiple concurrent operations even though you didn't explicitly code it that way. Which means that in ASP.NET or WCF you need to be careful with shared state, for exactly the reasons already discussed.
As a minor addition, note also that a process can host multiple AppDomains; in that scenario, the threads of the process are shared between all the AppDomains at the whim of the scheduler.
Threads created by that process are part of that process. Different threads within the one process can and often do run on different processors or processor cores.
In my .NET multithreaded program, I am wondering whether all these threads run in the same process or in different processes?
A thread always runs in a process; however, multiple threads can run in a single process, and each thread can be handled by a different core.
If you have a single core, it doesn't mean that it can't run multiple threads; it just means that the core can't execute multiple threads at the same time. Consider a timeline on a single core:
Thread #1 executes for some time.
Thread #1 "stops".
Thread #2 executes for some time.
Thread #2 "stops".
Thread #1 executes for some time, again.
This illustrates what happens when a core runs multiple threads: the core only executes one thread at a time, but in order for both threads to run, the core must perform context switching. In other words: the core runs a few commands from Thread 1, switches to Thread 2 and runs a few commands from it, then it switches back to Thread 1 to execute some more commands.
Juggling Oranges:
A good metaphor is juggling oranges: technically, you only have two hands and you can only hold one orange in each hand at a time, so the maximum you can hold is two oranges. In this case the taxing part is holding the oranges. However, if you throw an orange up in the air, you can hold a 3rd orange while the 2nd one is in the air. The higher you throw the oranges, the more oranges you can juggle. To be more precise: the longer it takes for an orange to come back to your hand, the more oranges you can juggle. Of course, you probably can't juggle an enormous number of oranges, because throwing an orange requires more energy than simply holding it.
In essence, your CPU is juggling threads: the longer a thread stays away from executing code on the CPU, the more threads a CPU can "juggle." If a thread is waiting on I/O (e.g. a database request), then the CPU can execute the code of another thread at the same time. This is the same reason why you see 54 processes and 800 threads in the task manager: many of those threads are doing things that are not CPU-bound.
Sleep:
Is my CPU running these 800 threads all the time, or does that only mean 800 threads are queued, waiting for the CPU to process them?
Many of the threads you're noticing in your task manager are idle/sleeping, so they use very little (if any) CPU. However, the ones that are running are executed with context switching (if there are more threads than cores, which is the case most of the time). There are many things that can cause a thread to idle/sleep, see the orange juggling for an example.
CPU Utilization:
If I want my multithreading program to fully utilize my quad-core CPU, can I raise the CPU usage by creating more threads (this seems to contradict the theory that only one thread runs on a core at a time)?
It gets tricky :). Imagine that instead of oranges, you have bowling balls: it's VERY taxing on your hands, so even if you tried, you probably couldn't hold more than 2 bowling balls, let alone juggle a 3rd one. At maximum load, you can only hold as many objects as you have hands. The same is true for the CPU: at maximum load, the CPU can only execute as many threads as there are cores.
The reason why you can run more threads than the number of cores is that the threads are not putting the maximum load on the cores. If your threads are CPU-bound, i.e. they do some heavy computational stuff and tax the cores 100%, then you can only run as many threads as you have cores. However, the CPU is the fastest thing in your computer, and your threads may be accessing other parts of your computer that are significantly slower than the CPU (hard disk, network card, etc.), so you can run more threads.
I've read in many places that the .NET ThreadPool is meant for short-lived tasks (maybe no more than 3 seconds), but in all these mentions I've not found a concrete reason why it should not be used otherwise.
Some people even said that using it for long-running tasks leads to nasty results and can lead to deadlocks.
Can somebody explain, in plain English and with the technical reasons, why we should not use the thread pool for long-running tasks?
To be specific, I'll give a scenario and I'd like to know why the ThreadPool should not be used in it, with proper reasons behind it.
Scenario: I need to process some thousands of users' data. Each user's processing data is retrieved from a local database; using that information I need to connect to an API hosted at some other location, and the response from the API will be stored in the local database after processing it.
Can someone explain the pitfalls in this scenario if I use the ThreadPool with a thread limit of 20? The processing time for each user may range from 3 seconds to 1 minute (or more).
The point of the threadpool is to avoid the situation where the time spent creating the thread is longer than the time spent using it. By reusing existing threads, we get to avoid that overhead.
The downside is that the threadpool is a shared resource: if you're using a thread, something else can't. So if you have lots of long-running tasks, you could end up with thread-pool starvation, possibly even leading to deadlock.
Don't forget that your application's code may not be the only code using the thread pool... the system code uses it a lot too.
It sounds like you might want to have your own producer/consumer queue, with a small number of threads processing it. Alternatively, if you could talk to your other service using an asynchronous API, you may find that each bit of processing on your computer would be short-lived.
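For illustration, a hedged sketch of the asynchronous-API idea using the classic Begin/End pattern on HttpWebRequest (the URL is a placeholder); no thread sits blocked while the response is in flight:

    using System;
    using System.IO;
    using System.Net;

    class AsyncCall
    {
        static void StartRequest()
        {
            var request = (HttpWebRequest)WebRequest.Create("http://example.com/api");

            // Returns immediately; the callback fires on a pool thread only
            // once the response has arrived, so the time spent on your own
            // machine per work item stays short.
            request.BeginGetResponse(OnResponse, request);
        }

        static void OnResponse(IAsyncResult ar)
        {
            var request = (HttpWebRequest)ar.AsyncState;
            using (var response = request.EndGetResponse(ar))
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                string body = reader.ReadToEnd();   // process / store the result here
            }
        }
    }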
It is related to the way the threadpool scheduler works. It tries hard to ensure that it won't release more waiting threads than you have CPU cores. Which is a good idea: running more threads than cores is wasteful, as Windows spends time switching context between threads, making the overall time needed to complete the jobs longer.
As soon as a TP thread completes, another one is allowed to run. Twice per second, the TP scheduler steps in when the running threads have not completed. It cannot tell why those threads are taking so long to get their job done. Half a second is a lot of CPU cycles, a cool billion or so, so it assumes the threads are blocking, waiting for some kind of I/O to complete: a dbase query, a disk read, a socket connection attempt, stuff like that.
And it allows another thread to run. You've now got more threads than you have cores. This isn't really a problem if those original threads are indeed blocking; they're not consuming any CPU cycles.
You can see where this leads: if your thread runs for 3 seconds then it's creating a bit of a logjam. It delays, but won't block, other TP threads that are waiting to run. If your thread needs to spend so much time because it is constantly blocking, then you are better off creating a regular Thread. And if you really care that the thread does not get delayed by the TP scheduler, then you should use a Thread as well.
The TP scheduler was tinkered with in .NET 4.0, by the way; what I wrote is really only true for earlier releases. The basics are still there; it just uses a smarter scheduling algorithm, one that is feedback-based and schedules dynamically by measuring throughput. This really only matters if you have a lot of TP threads going.
Two reasons not really touched upon:
The threadpool is used as the normal means of handling I/O callback functions, which are usually supposed to happen very soon after associated I/O operation completes. In general, timeliness is more important with short tasks than long ones, but long-running tasks in the threadpool will delay the execution of notification tasks which could have (and should have) started up, run, and completed quickly.
If a threadpool task becomes blocked until such time as some other threadpool task runs, it may hog a threadpool thread, thus delaying or in some cases blocking altogether the start of that other task (or any others).
Generally, having a threadpool thread acquire a lock (waiting if necessary) isn't a problem. If it's necessary for one threadpool thread to wait for another threadpool thread to release a lock, the fact that the latter thread acquired the lock in the first place implies that it got started. On the other hand, waiting for e.g. some data to arrive from a connection may cause deadlock if an I/O callback routine is used to flag the arrival of data. If too many threadpool threads are waiting for the I/O callback to signal that data has arrived, the system may decide to defer the callback until one of the threadpool threads completes.
Are there any benefits to limiting the number of concurrent threads doing a given task to equal the number of processors on the host system? Or better to simply trust libraries such as .NET's ThreadPool to do the right thing ... even if there are 25 different concurrent threads happening at any one given moment?
Most threads are not CPU bound, they end up waiting on IO or other events. If you look at your system now, I imagine you have 100's (if not 1000's) of threads executing with no problems. By that measure, you're probably best just leaving the .NET thread pool to do the right thing!
However, if the threads were all CPU bound (e.g. something like ray tracing) then it would be a good idea to limit the number of threads to the number of cores, otherwise chances are that context switching will begin to hurt performance.
The threadpool already does a reasonably good job at this. It tries to limit the number of running threads to the number of CPU cores in your machine. When one thread ends, it immediately schedules another eligible thread for execution.
Every 0.5 seconds, it evaluates what is going on with the running threads. When the threads have been running too long, it assumes they are stalled and allows another thread to start executing. You'll now have more threads running than you have CPU cores. This can go up to the maximum number of allowed threads, as set by ThreadPool.SetMaxThreads().
Starting around .NET 2.0 SP1, the default maximum number of threads was increased considerably, to 250 times the number of cores. You should never ever get there. If you do, you would have wasted about 2 minutes of time in which a possibly non-optimal number of threads were running. Those threads would all have to be blocking for that long, not exactly a typical execution pattern for a thread. On the other hand, if these threads are all waiting on the same kind of resource, they are likely to just take turns; adding more threads cannot improve throughput.
Long story short, the thread pool will work well if you run threads that execute quickly (seconds at most) and don't block for a long time. You probably ought to consider creating your own Thread objects when your code doesn't match that pattern.
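If you do want to inspect or cap the pool's limits yourself, a minimal sketch (the cap of 50 echoes the first question on this page and is otherwise arbitrary):

    using System;
    using System.Threading;

    class PoolLimits
    {
        static void Main()
        {
            int workers, io;
            ThreadPool.GetMaxThreads(out workers, out io);
            Console.WriteLine("max worker threads: {0}, max IO threads: {1}", workers, io);

            // Cap the worker threads; returns false if the request is
            // rejected (e.g. below the number of cores).
            bool ok = ThreadPool.SetMaxThreads(50, io);
            Console.WriteLine("cap applied: {0}", ok);
        }
    }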
Well, if your bottleneck is ONLY processors, then it might make sense, but that would ignore all memory and other I/O bottlenecks, and chances are at least your cache memory is throwing page faults and other events that slow the threads.
I'd trust the library myself. Threads wait for all kinds of things, and you don't want your application to slow down because it can't spawn a new thread, even though most of the rest are just sleeping, waiting for some event or resource.
Measure your application under a variety of thread:processor ratios. Come to conclusions based on hard data about your application. Accept no arguments from first principles about what performance you should get, only what you do get matters.