When do we go for multithreading in C#?

I know how to implement multithreading in C#, but I want to understand how it actually works.
Will only one thread run at a time, and when that thread is waiting, will the second thread execute? If the second thread is executing and the first thread becomes ready, what will happen? Which thread will be given priority?
I am confused about the concept. I want to understand why we go for multithreading and when we use it.
Thanks in advance.

Threads may or may not be running at the same time. On a single-processor machine only one thread is running at a time. On a multiprocessor system (multi-processor, multi-core, hyper-threading) multiple threads can be running at the same time, one thread per processor.
The operating system scheduler determines when a thread gets to run. Windows is a preemptive multitasking system: it will run a thread for a certain amount of time, called a time slice (10 ms or 15 ms on Windows), stop the thread, then determine which thread to run next, which could be the same thread that was just running. The actual algorithm is complex.
Threads do have priorities, so that affects this as well: all things being equal, a higher-priority thread will get more time than a lower-priority thread. If you don't manually set a priority on a thread, it defaults to Normal. In the simple case of two threads of the same priority that are ready to run, both threads will run an equal amount of time, probably round-robin.
As for why we do multithreading, there are two basic reasons:
Speed: On a multiprocessor system, since more than one thread can run at a time, our code can perform more than one task at a time. For example, if we are processing an image, we can split the image into pieces and have different threads work on each piece.
Asynchronous operations: Some task will take a while (e.g. downloading a file from the Internet) and we want to let it go on in the background while we do something else, so we create a thread to do the download while we go about our business. One of the big draws of this is in GUI applications: we don't want to block the UI thread, so the user interface still responds to the user while the processing is occurring.
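For example, here is a minimal sketch of the asynchronous-operation case, with a console host standing in for a GUI and an illustrative URL (the slow WebClient.DownloadString call happens off the main thread):
using System;
using System.Net;
using System.Threading;

class BackgroundDownload
{
    static void Main()
    {
        // Run the slow download on a background thread so the main thread stays free.
        var worker = new Thread(() =>
        {
            using (var client = new WebClient())
            {
                string page = client.DownloadString("http://example.com/"); // illustrative URL
                Console.WriteLine("Downloaded {0} characters.", page.Length);
            }
        });
        worker.IsBackground = true;
        worker.Start();

        Console.WriteLine("Main thread is free to do other work meanwhile.");
        worker.Join(); // wait for the download to finish before exiting
    }
}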

Multithreading is useful in environments where one action needs to not BLOCK another action.
The primary example of that is in the case of a background process that shouldn't lock up the main user interface thread.
The operating system is generally going to decide who can do what, when. If a computer has only one core, multithreading has little benefit except the one listed above. But, as more cores are added, more actions can be performed concurrently.
However, even on a single-core system, multithreading can facilitate non-blocking I/O, which is very important for the responsiveness of your application.
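As a sketch of what non-blocking I/O can look like (the file name is illustrative; FileStream.BeginRead hands the wait to the OS instead of to a blocked thread):
using System;
using System.IO;
using System.Text;

class NonBlockingRead
{
    static void Main()
    {
        // The final 'true' requests asynchronous (overlapped) I/O from the OS.
        var fs = new FileStream("data.txt", FileMode.Open, FileAccess.Read,
                                FileShare.Read, 4096, true); // "data.txt" is illustrative
        var buffer = new byte[4096];
        fs.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int read = fs.EndRead(ar);
            Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, read));
            fs.Close();
        }, null);

        Console.WriteLine("Read requested; no thread is blocked while the disk works.");
        Console.ReadLine(); // keep the process alive long enough for the callback
    }
}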

Multithreading speeds up program execution if there are parallelizable parts of the program.
You may want to have a look at different resources for multithreading to understand more about it.
Imagine you have a problem that needs to be solved as quickly as possible. You have an easy one: count to a billion. You can write a loop: for (var i = 0; i < 1000000000; i++) {} and this will execute on one core only. It will take x amount of time. Now imagine doing it on two cores instead:
// Execute action a concurrently across the range [from, to), where a receives the current index.
void Execute(Action<int> a, int from, int to)
{
    // Assumes a != null and to > from; ideally (to - from) >= number of CPUs.
    var pllItems = Environment.ProcessorCount;
    var range = to - from;
    var step = range / pllItems;
    // Calculate the sub-range each thread should handle (end index is exclusive).
    var ranges = new int[pllItems, 2];
    for (var i = 0; i < pllItems; i++)
    {
        var s = from + i * step;                            // where thread i starts
        ranges[i, 0] = s;
        ranges[i, 1] = (i == pllItems - 1) ? to : s + step; // where thread i ends; the last thread absorbs any remainder
    }
    var ts = new Thread[pllItems];
    for (var i = 0; i < pllItems; i++)
    {
        var currT = i; // copy the loop variable to avoid closure capture problems
        ts[currT] = new Thread(() =>
        {
            for (var x = ranges[currT, 0]; x < ranges[currT, 1]; x++)
            {
                a(x);
                // Could also collect failures instead of letting one exception kill the thread:
                // try { a(x); } catch (Exception e) { lock (ecs) { ecs.Add(e); } break; }
                // and return the collected exceptions (ecs) at the end of the method.
            }
        });
        ts[currT].Start();
    }
    for (var i = 0; i < pllItems; i++) ts[i].Join();
}
Thankfully, if you download the MS Threading library from 2008 you will get this for free with
Parallel.For(0, 1000000000, i => { });
There's also a new tool for VS2010 which displays in graphical form how the threads are blocking, waiting for I/O, etc.
There's a scheduler in .Net/the OS that allows threads to have different interleavings.
A few days ago, MS released documentation on how to do parallel operations in .Net 4.
Have a download/read here

If you look at the Processes tab in Task Manager on your Windows machine, you will see the processes that are currently active on the machine. If you add the Threads column to the view, you will see the number of threads that currently exist in each process. The operating system (OS) is the one that determines how all of these threads across all of these processes are scheduled for execution on the processor. So in effect, the OS is constantly determining which threads have work to do and scheduling those threads for execution on the processor.
Let's assume a single processor, single core machine for now.
In this example, your application is the only process that is doing anything. Say your application has two threads of equal priority (more on this below). In this case, the OS will alternate between these two threads, scheduling one for execution and then the other until the work that they are doing is complete. To accomplish this, the OS grants a timeslice to the first scheduled thread. For example purposes, let's say the timeslice is 10 milliseconds (it's actually much shorter than this). So thread A will execute for 10 milliseconds. The OS will then preempt thread A so thread B can execute for its timeslice, also 10 milliseconds.
This back-and-forth will continue uninterrupted until both threads have finished their work or until certain events occur. For example, let's say that thread A finishes its work before thread B. In this case, thread A has nothing else to do, so the OS will continue to grant timeslices to thread B since it is the only one with work to do. Another thing that can happen is that thread A can wait on an event, such as a System.Threading.ManualResetEvent, or an asynchronous read of a socket. Until that event is signaled or data is received on the socket, thread A is essentially dead in its tracks, so the OS will continue to grant timeslices to thread B until the event/socket that thread A is waiting on occurs. At that point, the OS will resume switching between thread A and thread B for execution.
A good example of this is the background printing that most applications do today. An application's main thread is dedicated to processing UI events - button clicks, keyboard presses, drag-and-drop, etc. If you print a document from your favorite word processor, what happens conceptually is that the task of sending the print instructions to the printer is delegated to a secondary thread. So at this point, your application has two threads that are running - one thread servicing the UI and the other thread handling the print job. Since this is on a single processor, single core machine, the OS swaps between the two threads, granting timeslices to each. In this case, the print job thread will end after it finishes sending the print instructions, and then only your UI thread will be left.
A question you may have at this point is this: doesn't it take longer to print this way on a single-processor, single-core machine, since the OS is having to swap between the print job thread and the UI thread?
And the answer is YES. It does take longer this way. But consider the alternative. If the print job were executed on the UI thread, the user interface would be unresponsive to your input, i.e., button clicks, keyboard presses, etc., until the print job was complete. And this would frustrate you as the user because the application isn't responding to your input. So, in effect, multithreading is really an illusion of parallelism, at least on a single processor, single core machine. However, you get the satisfaction of being able to interact with your application while the print job is accomplished on another thread, even though the print job takes longer doing it this way.
Now let's move to a multicore machine. If your process has the same two threads, A and B, to execute, then each thread can be scheduled on a separate core. In this case, both threads run simultaneously without the interruption. The OS doesn't have to swap between the threads because each thread has its own core to run on. Make sense?
Finally, let's consider the priority associated with threads (assume a single processor, single core again). Each thread in a given application has, by default, the same priority. This means that the OS will consider all threads equal with regard to scheduling. If you have two threads to be executed, they will get roughly the same amount of time on the processor. You can adjust this, however, by increasing or decreasing the priority of one thread over the other. In this case, the thread with the higher priority is favored for scheduling purposes over the thread with the lower priority, meaning that it gets more timeslices than the other thread. In some limited cases, adjusting the priority of threads can improve your application's performance, but for most applications it is not necessary. The thing to be cautious of is not to "starve" a thread, especially the UI thread. The OS helps by never starving a thread altogether, but adjusting priorities can still make your application appear sluggish, if not altogether unresponsive, if the UI thread is "put on a diet," so to speak.
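As a small illustration of adjusting priorities (the names are illustrative; on a busy machine the higher-priority thread simply tends to get scheduled more often):
using System;
using System.Threading;

class PriorityDemo
{
    static void Count(object name)
    {
        for (long i = 0; i < 500000000; i++) { } // burn some CPU
        Console.WriteLine("{0} finished", name);
    }

    static void Main()
    {
        var low = new Thread(Count) { Priority = ThreadPriority.BelowNormal };
        var high = new Thread(Count) { Priority = ThreadPriority.AboveNormal };
        low.Start("low-priority thread");
        high.Start("high-priority thread");
        low.Join();
        high.Join();
    }
}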
You can read more about thread priorities here and here.
I hope this helps.

Purposes of a thread
Hide latency (i.e. do something else while waiting)
Exploit the concurrency of the hardware (in case of multiple cores, this gives better performance)
Discriminate importance levels (i.e. high and low priority threads)
Organize structure (i.e. thread per event, thread per resource, thread per process)
There are others, but I think these are the basic uses of a thread.


Sending messages to Tasks

I am writing a program that shows the Mandelbrot set depending on some conditions provided by the user. As the calculation takes long (more than 500 ms), I have decided to use more than one thread. Without any previous experience, I have managed to do it by using the System.Threading.Tasks Task class, which works just fine. The only thing that I don't like is that every time the Mandelbrot is generated, the threads are created and then destroyed.
This is an example of how it works. It creates the threads (Tasks) every time that the method is called.
for (int i = 0; i < maxThreads; i++) {
    int a = i;
    tasks[a] = Task.Factory.StartNew(() => generateSector(a));
}
I don't really know how that affects performance, but it looks like creating and destroying threads is time-expensive, and it seems it would be more efficient to have the threads ready and waiting for a trigger message, going back to that waiting state when they are done. Maybe the following example code is useful to understand this idea.
for (int i = 0; i < maxThreads; i++)
tasks[i].sendMessage("Start"); // Tells the running thread to begin its work
So each thread would execute an infinite loop in which it waits until it is required to do calculations, then goes back to waiting. Something like this:
// Inside the method that the thread executes
while (true) {
    Wait();      // Wait for the start signal
    calculate(); // Do some calculations
}                // Go back to waiting
Would that be more efficient? Is there any way to do that?
Leave your code as it is.
1) Tasks use ThreadPool threads, so there is no problem
2) "I don't know really how that affects performance" - this is where you should start. Never optimize before measuring. Do you have performance issues? Is your code running slow? I guess no, so you should not be bothered.
When you use Task.Factory.StartNew(...), you are not necessarily creating and destroying threads. The task library uses a ThreadPool to do this, so you don't need to manage it yourself, like you would if you created new Thread()s yourself.
It sounds like you're trying to use a set of threads and set up a system for scheduling work to run on those threads. This is a great idea; in fact, it's so great an idea that it's built into the .NET Framework and you don't need to build it yourself. This is actually exactly what Tasks are made for.
Tasks are a relatively lightweight abstraction over the thread pool, which is managed by the .NET runtime. Threads are an operating-system construct that is relatively heavy, and it's somewhat expensive to start, stop, and context-switch between threads. When you create a Task, it is scheduled to execute on the next available thread in the pool, and the .NET runtime will automatically increase and decrease the size of the pool based on whether there's work getting queued up and waiting for a thread to execute. You can customize the minimum and maximum thread counts if you need to, but usually this is not necessary.
So by simply creating short-lived Tasks that exist for the lifetime of a single unit of work, your work is already being run on a managed collection of actual threads.
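A quick way to see this for yourself is to ask each task whether it is running on a pool thread; a minimal sketch:
using System;
using System.Threading;
using System.Threading.Tasks;

class PoolCheck
{
    static void Main()
    {
        var tasks = new Task[4];
        for (int i = 0; i < tasks.Length; i++)
        {
            int n = i; // copy the loop variable for the closure
            tasks[n] = Task.Factory.StartNew(() =>
                Console.WriteLine("Task {0} on pool thread: {1}",
                    n, Thread.CurrentThread.IsThreadPoolThread)); // prints True
        }
        Task.WaitAll(tasks);
    }
}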

Higher priority for a thread that is not known in advance

I create about 5000 background workers that do intensive work in a console app. I'm also using an external library that instantiates an object, say ObjectX. At some point, say t0, ObjectX tries to obtain a thread from an os thread pool and start it, but I have no control on how it obtains this thread. Things work fine for 100 background workers. For 1000 background workers it takes about 10 minutes after t0 for ObjectX to obtain and start a thread.
Is there a way to set, in advance, a high priority for any threads that will be started in the future by an object?
As I think the answer to 1 is "no", is there a way to limit the priority of the background workers so as to somehow favor everything else? Even though I only want to 'favor' ObjectX.
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
I'm using C# and .NET Framework 3.5, on a 64-bit Windows machine.
The way threads work is that they are given processor time by the OS; switching the processor from one thread to another is called a context switch. A context switch takes about 2000-8000 cycles (i.e. depending on the processor, 2000-8000 instructions). If the OS has many CPUs or cores, it may not need to take the CPU away from one thread and give it to another, avoiding a context switch. Only one thread per CPU can run at a time; when you have more threads that need CPU than you have CPUs, you force context switches. Context switches are performed no faster than the system quantum (every 20 ms for client versions of Windows and 120 ms for server).
If you have 5000 background workers, you effectively have 5000 threads, each potentially vying for CPU time. On a client version of Windows, that means 250,000 context switches per second, i.e. 500,000,000 to 2,000,000,000 cycles per second devoted simply to switching between threads (over and above the work your threads are performing), if the system could even process that many context switches per second.
The recommended practice is to only have one CPU-bound thread per processor. A CPU-bound thread is one that spends very little time "waiting". The UI thread is not a CPU-bound thread. If your background workers are spending a lot of time waiting for locks, then they may not be CPU-bound either--but, in general, background worker threads are CPU-bound. (otherwise, what would be the point of using a background worker?).
Also, the OS spends a lot of time figuring out what thread needs to get the CPU next. When you start changing thread priorities you interfere with that and most of the time end up making your entire system slower (not just your application) rather than faster.
Update:
On a related note, it takes about 200,000 cycles to create a new thread and about 100,000 cycles to destroy a thread.
Update 2:
If the impetus of the question isn't simply "can it be done" but the ability to scale workload, then, as @JoshW/@Servy mention, using something like the producer/consumer pattern would allow for scalability that could facilitate horizontal scaling to multiple computers/nodes via a queue or a service bus. Simply starting up an inordinate number of threads is not scalable beyond the number of CPUs. And if what you truly want is "available resources... no matter how overloaded the machine is", that is simply impossible; what you need is an architecture that can scale out.
Personally I think this is a bad idea; however... given the comments you have made on other answers and your request that "no matter how many background workers are created, ObjectX runs as soon as possible"... you could conceivably force your background workers to block using a ManualResetEvent.
For example, at the top of your worker code you could block on a ManualResetEvent with the WaitOne method. The event could be static or passed as an input parameter. Wherever your ObjectX gets instantiated/called, you call the Reset method on your ManualResetEvent, which blocks all your workers at the WaitOne line. Then, at the bottom of the code that runs ObjectX, call the ManualResetEvent.Set() method and that will unblock the workers.
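A minimal sketch of that gate (the names and the Sleep calls are illustrative stand-ins for real work):
using System;
using System.Threading;

class GateDemo
{
    // Signaled by default: workers run freely until ObjectX needs the machine.
    static readonly ManualResetEvent gate = new ManualResetEvent(true);

    static void Worker(object id)
    {
        for (int i = 0; i < 5; i++)
        {
            gate.WaitOne();    // blocks here whenever the gate is Reset
            Thread.Sleep(100); // stand-in for a chunk of real work
            Console.WriteLine("worker {0} finished chunk {1}", id, i);
        }
    }

    static void Main()
    {
        for (int i = 0; i < 4; i++)
            new Thread(Worker).Start(i);

        Thread.Sleep(150);
        gate.Reset();          // pause the workers while ObjectX runs
        Console.WriteLine("ObjectX running with less competition...");
        Thread.Sleep(500);     // stand-in for ObjectX's work
        gate.Set();            // let the workers continue
    }
}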
Note this is NOT an efficient way to manage your threads, but if you "just have to make it work" and have time later to improve it... I suppose it's one possible solution.
The goal would be to always have available resources to run the thread launched by ObjectX, no matter how overloaded the machine is.
Then thread priorities might not be the right tool. Remember, thread priorities are evil.
In general, Windows is not a real-time OS; in particular, Win32 does not even attempt to be soft real-time (IIRC, the NT kernel tried, at some point, to have at least some support for soft real-time subsystems, but I may be wrong). So there are no guarantees about available resources or timing.
Also, are you worried about other threads in the system? Those threads are out of your control (what if the other threads are already at the system max priority?).
If you are worried about threads in your app, you can control and throttle them, using fewer threads/workers to do more work (batching work into bigger units and submitting it to a worker, for example, or using the TPL or other tools that will handle and throttle thread usage for you).
That said, you could intercept when a thread is created (see for example this question: https://stackoverflow.com/a/3802316/863564), check whether it was created for ObjectX (for example, by checking its name) and use SetThreadPriority to boost it.

ThreadPool not starting new Thread instantly

I have a C# Windows Service that starts up various objects (Class libraries). Each of these objects has its own "processing" logic that start up multiple long running processing threads by using the ThreadPool. I have one example, just like this:
System.Threading.ThreadPool.QueueUserWorkItem(new System.Threading.WaitCallback(WorkerThread_Processing));
This works great. My app works with no issues, and my threads work well.
Now, for regression testing, I am starting those same objects up, but from a C# Console app rather than a Windows Service. It calls the same exact code (because it is invoking the same objects), however the WorkerThread_Processing method delays for up to 20 seconds before starting.
I have gone in and switched from the ThreadPool to a Thread, and the issue goes away. What could be happening here? I know that I am not over the MaxThreads count (I am starting 20 threads max).
The ThreadPool is specifically not intended for long-running items (more specifically, you aren't even necessarily starting up new threads when you use the ThreadPool, as its purpose is to spread the tasks over a limited number of threads).
If your task is long running, you should either break it up into logical sections that are put on the ThreadPool (or use the new Task framework), or spin up your own Thread object.
As to why you're experiencing the delay, the MSDN Documentation for the ThreadPool class says the following:
As part of its thread management strategy, the thread pool delays before creating threads. Therefore, when a number of tasks are queued in a short period of time, there can be a significant delay before all the tasks are started.
You only know that the ThreadPool hasn't reached its maximum thread count, not how many threads (if any) it actually has sitting idle.
The thread pool's maximum number of threads value is the maximum number that it can create. It is not the maximum number that are already created. The thread pool has logic that prevents it from spinning up a whole bunch of threads instantly.
If you call ThreadPool.QueueUserWorkItem 10 times in quick succession, the thread pool will not create 10 threads immediately. It will start a thread, delay, start another, etc.
I seem to recall that the delay was 500 milliseconds, but I can't find the documentation to verify that.
Here it is: The Managed Thread Pool:
The thread pool has a built-in delay (half a second in the .NET Framework version 2.0) before starting new idle threads. If your application periodically starts many tasks in a short time, a small increase in the number of idle threads can produce a significant increase in throughput. Setting the number of idle threads too high consumes system resources needlessly.
You can control the number of idle threads maintained by the thread pool by using the GetMinThreads and SetMinThreads methods.
Note that this quote is taken from the .NET 3.5 version of the documentation. The .NET 4.0 version does not mention a delay.
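If you do need the pool to skip that ramp-up delay, raising the minimum is straightforward; a minimal sketch (the value 20 is illustrative and should match your expected burst size):
using System;
using System.Threading;

class MinThreadsDemo
{
    static void Main()
    {
        int worker, io;
        ThreadPool.GetMinThreads(out worker, out io);
        Console.WriteLine("Current minimums: worker={0}, I/O={1}", worker, io);

        // Threads up to this minimum are created on demand without the built-in delay.
        ThreadPool.SetMinThreads(20, io); // 20 is illustrative
    }
}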

Multithreading: which would be the best to use? (ThreadPool or threads)

Hopefully this is a better question than my previous one. I have a .exe to which I will be passing different parameters (file paths), which it will then take in and parse. So I will have a loop going through the file paths in a list and passing each one to this .exe file.
For this to be more efficient, I want to spread the execution across multiple cores which I think you do through threading.
My question is, should I use the threadpool, or multiple threads to run this .exe asynchronously?
Also, depending on which one of those you guys think is the best, if you can point me to a tutorial that will have some info on what I want to do. Thank you!
EDIT:
I need to limit the number of executions of the .exe to ONE execution PER CORE. This is the most efficient because if I am parsing 100,000 files I can't just fire up 100000 processes. So I am using threads to limit the number of executions at one time to one execution per core. If there is another way (other than threads) to find out if a processor isn't tied up in execution, or if the .exe has finished please explain.
But if there isn't another way, my FINAL question is how would I use a thread to call a parse method and then call back when that thread is no longer in use?
SECOND UPDATE (VERY IMPORTANT):
I went through what everyone told me, and found out a key element that I left out that I thought didn't matter. So I am using a GUI and I don't want it to be locked up. THAT is why I wanted to use threads. My main question now is, how do I send back information from a thread so I know when the execution is over?
As I said in my answer to your previous question, I think you don't understand the difference between processes and threads. Processes are incredibly "heavy" (*); each process can contain many threads. If you are spawning new processes from a parent process, that parent process doesn't need to create new threads; each process will have its own collection of threads.
Only create threads in the parent process if all the work is being done in the same process.
Think of a thread as a worker, and a process as a building containing one or more workers.
One strategy is "build a single building and populate it with ten workers who each do some amount of work". You get the expense of building one process and ten threads.
If your strategy is "build a building. Then have the one worker in that building order the construction of a thousand more buildings, each of which contains a worker that does their bidding", then you get the expense of building 1001 buildings and hiring 1001 workers.
The strategy you do not want to pursue is "build a building. Hire 1000 workers in that building. Then instruct each worker to build a building, which then has one worker to go do the real work." There is no point in making a thread whose sole job is creating a process that then creates a thread! You have 1001 buildings and 2001 workers, half of whom are immediately idle but still have to be paid.
Looking at your specific problem: the key question is "where is the bottleneck?" Spawning off new processes or new threads only helps when the performance problem is that the perf is gated on the processor. If the performance of your parser is gated not on how fast you can parse the file but rather on how fast you can get it off disk, then parallelizing it is going to make things far, far worse. You'll have a huge amount of system resources devoted to all hammering on the same disk controller at the same time, and the disk controller will get slower as more load piles up on it.
UPDATE:
I need to limit the number of executions of the .exe to ONE execution PER CORE. This is the most efficient because if I am parsing 100,000 files I can't just fire up 100000 processes. So I am using threads to limit the number of executions at one time to one execution per core. If there is another way (other than threads) to find out if a processor isn't tied up in execution, or if the .exe has finished please explain
This seems like an awfully complicated way to go about it. Suppose you have n processors. Your proposed strategy, as I understand it, is to fire up n threads, then have each thread fire up one process, and you know that since the operating system will probably schedule one thread per CPU that somehow the processor will magically also schedule the new thread in each new process on a different CPU?
That seems like a tortuous chain of reasoning that depends on implementation details of the operating system. This is craziness. If you want to set the processor affinity of a particular process, just set the processor affinity on the process! Don't be doing this crazy thing with threads and hope that it works out.
I say that if you want to have no more than n instances of an executable running, one per processor, don't mess around with threads at all. Rather, just have one thread sit in a loop, constantly monitoring what processes are running. If there are fewer than n copies of the executable running, spawn another and set its processor affinity to be the CPU you like best. If there are n or more copies of the executable running, go to sleep for a second (or a minute, or whatever makes sense), and when you wake up, check again. Keep doing that until you're done. That seems like a much easier approach.
(*) Threads are also heavy, but they are lighter than processes.
Spontaneously I would push your file paths into a thread-safe queue and then fire up a number of threads (say one per core). Each thread would repeatedly pop one item from the queue and process it accordingly. The work is done when the queue is empty.
Implementation suggestions (to answer some of the questions in comments):
Queue:
In C# you could have a look at the Queue Class and the Queue.Synchronized Method for the implementation of the queue:
"Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
To guarantee the thread safety of the Queue, all operations must be done through the wrapper returned by the Synchronized method.
Enumerating through a collection is intrinsically not a thread-safe procedure. Even when a collection is synchronized, other threads can still modify the collection, which causes the enumerator to throw an exception. To guarantee thread safety during enumeration, you can either lock the collection during the entire enumeration or catch the exceptions resulting from changes made by other threads."
Threading:
For the threading part, I suppose that any of the examples in the MSDN threading tutorial would do (the tutorial is a bit old, but should be valid). You should not need to worry about synchronizing the threads as they can work independently of each other. The queue above is the only common resource they should need to access (hence the importance of the thread safety of the queue).
Start the external process (.exe):
The following code is borrowed (and tweaked) from How to wait for a shelled application to finish by using Visual C#. You need to edit for your own needs, but as a starter:
//How to Wait for a Shelled Process to Finish
//Create a new process info structure.
ProcessStartInfo pInfo = new ProcessStartInfo();
//Set the file name member of the process info structure.
pInfo.FileName = @"mypath\myfile.exe"; // verbatim string: "\m" would be an invalid escape sequence
//Start the process.
Process p = Process.Start(pInfo);
//Wait for the process to end.
p.WaitForExit();
Pseudo code:
Main thread:
    Create a thread-safe queue
    Populate the queue with all the file paths
    Create child threads and wait for them to finish
Child threads:
    While queue is not empty      << this section is critical: no more than one
        Pop file path from queue  << thread can check and pop at a time
        Start external exe
        Wait for it to end
    End while
    Child thread exits
Main thread waits for all child threads to finish
Program finishes.
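Putting the pseudo code together in C# might look roughly like this (parser.exe and the file paths are illustrative; a plain lock is used instead of Queue.Synchronized because the check-and-pop must be atomic anyway):
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class ParseRunner
{
    static readonly Queue<string> paths = new Queue<string>();
    static readonly object gate = new object();

    static void Worker()
    {
        while (true)
        {
            string path;
            lock (gate) // the emptiness check and the pop must happen atomically
            {
                if (paths.Count == 0) return;
                path = paths.Dequeue();
            }
            using (var p = Process.Start("parser.exe", path)) // illustrative exe
                p.WaitForExit();
        }
    }

    static void Main()
    {
        paths.Enqueue(@"C:\files\a.txt"); // illustrative paths
        paths.Enqueue(@"C:\files\b.txt");

        var threads = new Thread[Environment.ProcessorCount]; // one thread per core
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(Worker);
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
    }
}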
See this question for how to find out the number of cores.
Then use Parallel.ForEach with ParallelOptions with MaxDegreeOfParallelism set to the number of cores.
Parallel.ForEach(args, new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount }, (element) => Console.WriteLine(element));
If you're targeting the .Net 4 framework the Parallel.For or Parallel.Foreach are extremely helpful. If those don't meet your requirements I've found the Task.Factory to be useful and straightforward to use as well.
To answer your revised question, you want processes. You just need to create the correct number of processes running the exe. Don't worry about forcing them onto specific cores. Windows will do that automatically.
How to do this:
You want to determine the number of cores on the machine. You may simply know it, and hardcode it, or you might want to use something like System.Environment.ProcessorCount.
Create a List<Process> object.
Then you want to start that many processes using System.Diagnostics.Process.Start. The return value will be a process object, which you will want to add to the List.
Now repeat the following until you are finished:
Call Thread.Sleep to wait for a while. Perhaps a minute or so.
Loop through each Process in the list, but be sure to use a for loop rather than a foreach loop. For each process, call Refresh(), then check its HasExited property; if it is true, create a new process using Process.Start and replace the exited process in the list with the newly created one.
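A rough sketch of that monitoring loop (work.exe and the queue of pending arguments are hypothetical):
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class ProcessMonitor
{
    static void Main()
    {
        var pending = new Queue<string>(new[] { "a.txt", "b.txt", "c.txt" }); // hypothetical arguments
        var running = new List<Process>();
        for (int i = 0; i < Environment.ProcessorCount && pending.Count > 0; i++)
            running.Add(Process.Start("work.exe", pending.Dequeue())); // hypothetical exe

        while (pending.Count > 0)
        {
            Thread.Sleep(60000); // wait a minute between checks
            for (int i = 0; i < running.Count; i++) // for, not foreach: we replace elements
            {
                running[i].Refresh();
                if (running[i].HasExited && pending.Count > 0)
                    running[i] = Process.Start("work.exe", pending.Dequeue());
            }
        }
        foreach (var p in running) p.WaitForExit(); // let the last batch finish
    }
}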
If you're launching a .exe, then you have no choice. You will be running this asynchronously in a separate process. For the program which does the launching, I would recommend that you use a single thread and keep a list of the processes you launched.
Each exe launched will occur in its own process. You don't need to use a threadpool or multiple threads; the OS manages the processes (and since they're processes and not threads, they're very independent; completely separate memory space, etc.).

Is this a right case for using the Threadpool?

Here's the setup: I'm trying to make a relatively simple WinForms app, a feed reader using the FeedDotNet library. The question I have is about using the thread pool. Since FeedDotNet is making synchronous HttpWebRequests, it is blocking the GUI thread. So the best thing seemed to be putting the synchronous call on a ThreadPool thread and, while it is working, invoking the controls that need updating on the form. Some rough code:
private void ThreadProc(object state)
{
    Interlocked.Increment(ref updatesPending);
    // Check that the main form isn't closed/closing so that we don't get an ObjectDisposedException.
    if (this.IsDisposed || !this.IsHandleCreated) return;
    if (this.InvokeRequired)
        this.Invoke((MethodInvoker)delegate
        {
            if (!marqueeProgressBar.Visible)
                this.marqueeProgressBar.Visible = true;
        });
    ThreadAction t = state as ThreadAction;
    Feed feed = FeedReader.Read(t.XmlUri);
    Interlocked.Decrement(ref updatesPending);
    if (this.IsDisposed || !this.IsHandleCreated) return;
    if (this.InvokeRequired)
        this.Invoke((MethodInvoker)delegate { ProcessFeedResult(feed, t.Action, t.Node); });
    // Finished everything, hide the progress bar.
    if (updatesPending == 0)
    {
        if (this.IsDisposed || !this.IsHandleCreated) return;
        if (this.InvokeRequired)
            this.Invoke((MethodInvoker)delegate { this.marqueeProgressBar.Visible = false; });
    }
}
this = main form instance
updatesPending = volatile int in the main form
ProcessFeedResult = method that does some operations on the Feed object. Since a threadpool thread can't return a result, is this an acceptable way of processing the result via the main thread?
The main thing I'm worried about is how this scales. I've tried ~250 requests at once. The max number of threads I've seen was around 53 and once all threads were completed, back to 21. I recall in one exceptional instance of me playing around with the code, I had seen it rise as high as 120. This isn't normal, is it? Also, being on Windows XP, I reckon that with such high number of connections, there would be a bottleneck somewhere. Am I right?
What can I do to ensure maximum efficiency of threads/connections?
Having all these questions also made me wonder whether this is the right case for ThreadPool use. MSDN and other sources say it should be used for "short-lived" tasks. Is 1-2 seconds "short-lived" enough, considering I'm on a relatively fast connection? What if the user is on 56K dial-up and one request could take 5-12 seconds or even more? Would the threadpool be an efficient solution then too?
The ThreadPool, unchecked, is probably a bad idea.
Out of the box you get 250 threads in the threadpool per cpu.
Imagine if in a single burst you flatten out someone's net connection and get them banned from getting notifications from a site because they are suspected of running a DoS attack.
Instead, when downloading stuff from the net you should build in tons of control. The user should be able to decide how many concurrent requests they make (and how many concurrent requests per domain), ideally you also want to offer controls for the amount of bandwidth.
Though this could be orchestrated with the ThreadPool, having dedicated threads or using something like a bunch of instances of the BackgroundWorker class is a better option.
My understanding of the ThreadPool is that it is designed for this type of situation. I think the definition of short-lived is of this order of time - perhaps even up to minutes. A "long-lived" thread would be one that was alive for the lifetime of the application.
Don't forget Microsoft would have spent some time getting the efficiency of the ThreadPool as high as it could. Do you think that you could write something that was more efficient? I know I couldn't.
The .NET thread pool is designed specifically for executing short-running tasks for which the overhead of creating a new thread would negate the benefits of creating a new thread. It is not designed for tasks which block for prolonged periods or have a long execution time.
The idea is for a task to hop onto a thread, run quickly, complete, and hop off.
The BackgroundWorker class provides an easy way to execute tasks on a thread pool thread, and provides mechanisms for the task to report progress and handle cancel requests.
In this MSDN article on the BackgroundWorker Component, file downloads are explicitly given as examples of the appropriate use of this class. That should hopefully encourage you to use this class to perform the work you need.
If you're worried about overusing the thread pool, you can be assured the runtime does manage the number of available threads based on demand. Tasks are queued on the thread pool for execution. When a thread becomes available to do work, the task is loaded onto the thread. At regular intervals, a monitoring process checks the state of the thread pool. If there are tasks waiting to be executed, it can create more threads. If there are several idle threads, it can shut down some to release resources.
In a worst-case scenario, where all threads are busy and you have work queued up, the runtime will be adding threads to deal with the extra workload. The application will run more slowly as it has to wait for more threads to be made available, but it will continue to run.
A few points, and to combine info from a few other answers:
Your ThreadProc does not contain exception handling. You should add that, or one I/O error will halt your process.
Sam Saffron is quite right that you should limit the number of threads. You could use a (thread-safe) queue to push your feeds into (work items) and have one or more threads reading from the queue in a loop.
The BackgroundWorker might be a good idea; it would provide you with both the exception handling and the synchronization you need.
And the BackgroundWorker uses the ThreadPool, and that is fine.
You may want to take a look at the "BackgroundWorker" class.
