I am writing a program that renders the Mandelbrot set depending on some conditions provided by the user. As the calculation takes a long time (more than 500 ms), I have decided to use more than one thread. Without any previous experience, I have managed to do it by using the System.Threading.Tasks namespace, which works just fine. The only thing that I don't like is that every time the Mandelbrot is generated, the threads are created and then destroyed.
This is an example of how it works. It creates the threads (Tasks) every time that the method is called.
for (int i = 0; i < maxThreads; i++) {
    int a = i; // copy the loop variable so each lambda captures its own value
    tasks[a] = Task.Factory.StartNew(() => generateSector(a));
}
I don't really know how that affects performance, but it looks like creating and destroying threads is expensive, and that it would be more efficient to have the threads ready and waiting for a trigger message, going back to that waiting state when they are done. Maybe the following example code is useful to understand this idea.
for (int i = 0; i < maxThreads; i++)
tasks[i].sendMessage("Start"); // Tells the running thread to begin its work
So each thread would execute an infinite loop in which it waits until it is required to do calculations, does them, and then goes back to waiting. Something like this:
// Into the method that a thread executes
while(true) {
Wait(); // Waits for the start signal
calculate(); // Do some calculations
} // Go back to waiting
Would that be more efficient? Is there any way to do that?
Leave your code as it is.
1) Tasks use ThreadPool threads, so there is no problem
2) "I don't know really how that affects performance" - this is where you should start. Never optimize before measuring. Do you have performance issues? Is your code running slow? I guess no, so you should not be bothered.
When you use Task.Factory.StartNew(...), you are not necessarily creating and destroying threads. The task library uses a ThreadPool to do this, so you don't need to manage it yourself, like you would if you created new Thread()s yourself.
It sounds like you're trying to use a set of threads and setting up a system for scheduling work to run on those threads. This is a great idea, but in fact it's so great an idea that it's built into the .NET Framework and you don't need to build it yourself. This is exactly what Tasks are made for.
Tasks are a relatively lightweight abstraction over the thread pool, which is managed by the .NET runtime. Threads are an operating-system construct that is relatively heavy, and it's somewhat expensive to start, stop, and context-switch between threads. When you create a Task, it is scheduled to execute on the next available thread in the pool, and the .NET runtime will automatically grow and shrink the pool based on whether work is getting queued up and waiting for a thread to execute. You can customize the minimum and maximum thread counts if you need to, but usually this is not necessary.
So by simply creating short-lived Tasks that exist for the lifetime of a single unit of work, your work is already being run on a managed collection of actual threads.
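If you ever do need to adjust the pool bounds, here is a minimal sketch (the numbers here are purely illustrative, not recommendations):
using System;
using System.Threading;

ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
Console.WriteLine($"Workers: min {minWorkers}, max {maxWorkers}");

// Raise the floor so a burst of tasks doesn't wait for the pool's slow ramp-up.
ThreadPool.SetMinThreads(Environment.ProcessorCount * 2, minIo);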
Related
The source code portion below makes CPU consumption very high! Is there a better way to implement multithreading?
// This application will run 24 hours 7 days per week
// in server
static void Main(string[] args)
{
const int NUM = 10;
Thread[] t = new Thread[NUM];
for (int i = 0; i < t.Length; i++)
t[i] = new Thread(new ThreadStart(DoWork));
foreach (Thread u in t)
u.Start();
}
static void DoWork()
{
while (true)
{
// Perform scanning work
// If some conditions are satisfied,
// it will perform some actions
}
}
The question is hard to answer if it isn't clear what's inside the while(true) spinning loop. If you start 10 threads where each one spins without any waits (no AutoResetEvent.WaitOne(), Thread.Sleep(), etc.), the CPU will be consumed - since you are asking the CPU for it.
To improve overall performance, threads shouldn't spin unless it is absolutely necessary - and in most cases it isn't. Threads should do their work and then, if there is no more work to do, go to sleep. They should be woken up only when they have more work items to process. If your thread is running all the time - checking some conditions that are met only from time to time - and you have a mechanism to inform the thread that the conditions are true, then the spinning is a waste of your CPU cycles.
Conceptually, a thread should work in this way:
AutoResetEvent signal = new AutoResetEvent(false); // set by a producer when work arrives

while (true)
{
    // Wait until the thread has something meaningful to do,
    // e.g. by calling AutoResetEvent.WaitOne().
    signal.WaitOne();
    // Do the meaningful work here.
}
If you write your thread code in this way, then when a thread is waiting it doesn't consume the CPU, so the system is not overworked and other threads/processes can do their work.
The principal question here is whether you have some kind of notification mechanism that allows your thread to wake up when a new work item arrives. For instance, most IO operations like TCP sockets, HTTP communication, reading from files, etc. support asynchronous communication that allows your thread to go to sleep and wake up only when new data arrives.
On the other hand, if you don't have such a mechanism - for instance you are using a 3rd party library that doesn't notify you when something meaningful happened - you have to do some kind of spinning. But even in this case, the question is how often you need to check whether the conditions were met - i.e. whether there is some work to do.
If, let's say, you need to check only every second whether the conditions were met, add Thread.Sleep(1000) calls to your thread code. This will greatly increase overall performance. Even Thread.Sleep(0) is much better than wait-less spinning.
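A minimal sketch of such a throttled polling loop (keepRunning, ConditionsMet, and DoActions are hypothetical placeholders):
while (keepRunning)
{
    if (ConditionsMet())  // hypothetical: poll the 3rd party library
        DoActions();      // hypothetical: react to the satisfied conditions
    Thread.Sleep(1000);   // yield the CPU between polls
}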
One important note here. In modern C#, threads should not be your primary asynchronous programming mechanism. Task-based asynchronous programming using the Task class is much easier to implement, especially if you are using the C# async/await keywords, and in most cases it leads to better performance.
Tasks use threads internally, creating new ones if there are too many work items to process with the current number of threads, and releasing threads if there isn't enough work to do. An optimal number of threads reduces overall memory consumption.
These days virtually all standard .NET APIs support task-based asynchronous programming, so it should be your primary tool for achieving parallel execution of your code.
Here is an example of task-based asynchronous programming - a minimal sketch, assuming a .NET version that has File.ReadAllTextAsync and an async Main:
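using System;
using System.IO;
using System.Threading.Tasks;

class Example
{
    // Reads a file without blocking the calling thread; while the read is
    // in flight, the thread is returned to the pool for other work.
    static async Task<int> CountCharsAsync(string path)
    {
        string text = await File.ReadAllTextAsync(path);
        return text.Length;
    }

    static async Task Main()
    {
        int length = await CountCharsAsync("input.txt");
        Console.WriteLine($"Read {length} characters.");
    }
}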
I've got an application where there are several threads that provide data that needs to go through some heavy math. The math part needs a lot of initialization; afterwards it's pretty fast - as such I can't just spawn a thread every time I need to do the calculation, nor should every source thread have its own solver (there can be a LOT of such threads; beyond a certain point the memory requirements are obscene, and the overhead gets in the way of processing power).
I would like to use the following model: the data gathering and consuming threads would call into a single object, through one thread-safe interface function, like
public OutData DoMath(InData data) {...}
that would take care of the rest. This would involve finding a free worker thread (or waiting and blocking until one is available), passing the data in a thread-safe manner to one of the free worker threads, waiting (blocking) for it to do its job, and gathering the result and returning it.
The worker thread(s) would then go into some sleep/blocked state, until a new input item would appear on its interface (or a command to clean up and die).
I know how to do this by means of various convoluted locks, queues and waits in a very horrible nasty way. I'm guessing there's a better, more elegant way.
My questions are:
Is this a good architecture for this?
Are there commonly used elegant means of doing this?
The target framework is .NET 4.5 or higher.
Thank you,
David
The math part needs a lot of initialization; afterwards it's pretty fast - as such I can't just spawn a thread every time I need to do the calculation, nor should every source thread have its own solver (there can be a LOT of such threads; beyond a certain point the memory requirements are obscene, and the overhead gets in the way of processing power).
Sounds like a pool of lazy-initialized items. You can use a basic BlockingCollection for this, but I recommend overriding the default queue-like behavior with a stack-like behavior to avoid initializing contexts you may not ever need.
I'll call the expensive-to-initialize type MathContext:
private static readonly BlockingCollection<Lazy<MathContext>> Pool;
// Static constructor of the containing class (its name is assumed to be MathService here).
static MathService()
{
    Pool = new BlockingCollection<Lazy<MathContext>>(new ConcurrentStack<Lazy<MathContext>>());
    for (int i = 0; i != 100; ++i) // or whatever you want your upper limit to be
        Pool.Add(new Lazy<MathContext>());
}
This would involve finding a free worker thread (or waiting and blocking till one is available)
Actually, there's no point in using a worker thread here. Since your interface is synchronous, the calling thread can just do the work itself.
OutData DoMath(InData data)
{
// First, take a context from the pool.
var lazyContext = Pool.Take();
try
{
// Initialize the context if necessary.
var context = lazyContext.Value;
return ... // Do the actual work.
}
finally
{
// Ensure the context is returned to the pool.
Pool.Add(lazyContext);
}
}
I also think you should check out the TPL Dataflow library. It would require a bit of code restructuring, but it sounds like it may be a good fit for your problem domain.
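For instance, DoMath could become a TransformBlock with bounded parallelism - a rough sketch inside an async method, assuming your InData/OutData types and a hypothetical DoMathCore computation:
using System.Threading.Tasks.Dataflow; // NuGet package: System.Threading.Tasks.Dataflow

var mathBlock = new TransformBlock<InData, OutData>(
    data => DoMathCore(data), // hypothetical: the expensive computation itself
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

mathBlock.Post(input);                           // producers post from any thread
OutData result = await mathBlock.ReceiveAsync(); // consumers await their results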
Investigate the Task Parallel Library. It has a set of methods for creating and managing threads. Classes such as ReaderWriterLock and ManualResetEvent and their derivatives may also help in synchronizing threads.
Don't use locks. This problem sounds like a good fit for a proper, nearly lock-free approach.
I think what you need to look into is the BlockingCollection. This class is a powerful collection for multiple consumers and producers. If you think about using it with Parallel.ForEach, you may want to look into writing your own Partitioner to get some more performance out of it. Parallel contains a couple of very nice methods if you only need a couple of threads for a relatively short time, which sounds like something you need to do. There are also overloads that provide initialization and finalization methods for each spawned thread, along with passing thread-local variables from one stage of the function to the next (see the sketch below). That may really help you.
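A minimal sketch of those Parallel.For overloads with per-thread initialization, thread-local state, and finalization (summing a range here purely for illustration):
using System;
using System.Threading;
using System.Threading.Tasks;

long total = 0;

// localInit runs once per worker thread; the body threads the local value
// through each iteration; localFinally merges each thread's partial result.
Parallel.For(0, 1000000,
    () => 0L,                                   // per-thread initial value
    (i, state, local) => local + i,             // accumulate without locking
    local => Interlocked.Add(ref total, local)  // one synchronized merge per thread
);

Console.WriteLine(total); // 499999500000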
The general tips apply here too, of course. Try to split your application up into as many small parts as possible. That usually clears things up nicely, and the way to do things becomes clearer.
All in all, from what you've told about the problem at hand, I do not think you need a lot of blocking synchronization. The BlockingCollection only blocks the consumer threads until new data is ready to be consumed - and the producer, if you limit the collection's size.
I can't think of anything beyond that off the top of my head. This is a very general question, and without some specific issues it is hard to help further.
I still hope that helps.
You've pretty much described a thread pool - fortunately, there's quite a few simple APIs you can use for that. The simplest is probably
await Task.Run(() => DoMath(inData));
or just call Task.Run(() => DoMath(inData)).GetAwaiter().GetResult() if you don't mind blocking the requesting thread.
Instead of starting a whole new thread, it will simply borrow a thread from the .NET thread pool for the computation, and then return the result. Since you're doing almost pure CPU work, the thread pool will have only as many threads as you really need (that is, about the same as (or double) the number of CPU cores you have).
Using the await based version is a bit trickier - you need to ensure your whole call chain returns Tasks - but it has a major advantage in avoiding the need to keep the calling thread alive while you wait for the results to be done. And even better, if you make sure the original thread is also a thread-pool thread, you don't even need the Task.Run - the threads will be balanced automatically. Since you're only doing synchronous work anyway, this turns your whole problem into simply avoiding any manual new Thread, and using Task.Run(...) instead.
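A minimal sketch of keeping the call chain asynchronous (DoMath, InData, and OutData are from the question):
async Task<OutData> DoMathAsync(InData data)
{
    // Borrow a pool thread for the CPU-bound part; the caller's thread is not blocked.
    return await Task.Run(() => DoMath(data));
}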
First, create a pool of N such "math service objects" that are heavy. Then, guard usage of that pool with a new SemaphoreSlim(N, N). Accessing those objects is then as easy as:
SemaphoreSlim sem = new SemaphoreSlim(N, N);
//...
await sem.WaitAsync();
try
{
    var obj = TakeFromPool();
    try { DoWork(obj); }
    finally { Return(obj); } // always return the object, even if DoWork throws
}
finally
{
    sem.Release(); // always free the slot
}
You can vary this pattern in many ways. The core of it is the pool plus a semaphore that can be used to wait if the pool is empty at the time.
Consider a queue holding a lot of jobs that need processing. The queue's limitation is that you can only pop one job at a time, and there is no way of knowing how many jobs there are. The jobs take 10 s to complete and involve a lot of waiting for responses from web services, so the work is not CPU bound.
If I use something like this
while (true)
{
var job = Queue.PopJob();
if (job == null)
break;
Task.Factory.StartNew(job.Execute);
}
Then it will furiously pop jobs from the queue much faster than it can complete them, run out of memory and fall on its ass. >.<
I can't use (I don't think) ParallelOptions.MaxDegreeOfParallelism because I can't use Parallel.Invoke or Parallel.ForEach
3 alternatives I've found
Replace Task.Factory.StartNew with
Task task = new Task(job.Execute, TaskCreationOptions.LongRunning);
task.Start();
Which seems to somewhat solve the problem but I am not clear exactly what this is doing and if this is the best method.
Create a custom task scheduler that limits the degree of concurrency
Use something like BlockingCollection to add jobs to collection when started and remove when finished to limit number that can be running.
With #1 I've got to trust that the right decision is automatically made, #2/#3 I've got to work out the max number of tasks that can be running myself.
Have I understood this correctly - which is the better way, or is there another way?
EDIT - This is what I've come up with from the answers below, producer-consumer pattern.
As well as overall throughput, the aim was not to dequeue jobs faster than they could be processed, and not to have multiple threads polling the queue (not shown here, but that's a non-blocking op and would lead to huge transaction costs if polled at high frequency from multiple places).
// BlockingCollection<>(1) will block if we try to add more than 1 job to the
// queue (no point in being greedy!), or block on Take when it is empty.
BlockingCollection<Job> jobs = new BlockingCollection<Job>(1);
// Set up a number of consumer threads.
// Determine MAX_CONSUMER_THREADS empirically; with a 4-core CPU and 50% of job
// time blocked waiting on IO, it will likely be 8.
for (int numConsumers = 0; numConsumers < MAX_CONSUMER_THREADS; numConsumers++)
{
    Thread consumer = new Thread(() =>
    {
        // GetConsumingEnumerable() blocks until an item is available and
        // finishes cleanly once CompleteAdding() is called and the queue drains,
        // avoiding the race between checking IsCompleted and calling Take().
        foreach (var job in jobs.GetConsumingEnumerable())
        {
            job.Execute();
        }
    });
    consumer.Start();
}
// Producer: take items off the queue and put them in the blocking collection
// ready for processing.
while (true)
{
    var job = Queue.PopJob();
    if (job != null)
        jobs.Add(job);
    else
    {
        jobs.CompleteAdding();
        // May need to wait for running jobs to finish.
        break;
    }
}
I just gave an answer which is very applicable to this question.
Basically, the TPL Task class is made to schedule CPU-bound work. It is not made for blocking work.
You are working with a resource that is not CPU: waiting for service replies. This means the TPL will mismanage your resource because it assumes CPU boundedness to a certain degree.
Manage the resources yourself: Start a fixed number of threads or LongRunning tasks (which is basically the same). Decide on the number of threads empirically.
You can't put unreliable systems into production. For that reason, I recommend #1, but throttled. Don't create as many threads as there are work items. Create as many threads as are needed to saturate the remote service. Write yourself a helper function which spawns N threads and uses them to process M work items (a sketch follows below). You get totally predictable and reliable results that way.
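A minimal sketch of such a helper, assuming the M work items fit in memory (all names here are illustrative):
// requires System, System.Collections.Concurrent, System.Collections.Generic, System.Threading
static void ProcessWithFixedThreads<T>(IEnumerable<T> workItems, int threadCount, Action<T> process)
{
    var queue = new ConcurrentQueue<T>(workItems);
    var threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        threads[i] = new Thread(() =>
        {
            // Each worker drains the shared queue; blocking inside process is
            // fine here because these are dedicated threads, not pool threads.
            while (queue.TryDequeue(out T item))
                process(item);
        });
        threads[i].Start();
    }
    foreach (var t in threads)
        t.Join(); // block until all work items are processed
}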
Potential flow splits and continuations caused by await, later on in your code or in a 3rd party library, won't play nicely with long running tasks (or threads), so don't bother using long running tasks. In the async/await world, they're useless. More details here.
You can call ThreadPool.SetMaxThreads but before you make this call, make sure you set the minimum number of threads with ThreadPool.SetMinThreads, using values below or equal to the max ones. And by the way, the MSDN documentation is wrong. You CAN go below the number of cores on your machine with those method calls, at least in .NET 4.5 and 4.6 where I used this technique to reduce the processing power of a memory limited 32 bit service.
If however you don't wish to restrict the whole app but just the processing part of it, a custom task scheduler will do the job. A long time ago, MS released samples with several custom task schedulers, including a LimitedConcurrencyLevelTaskScheduler. Spawn the main processing task manually with Task.Factory.StartNew, providing the custom task scheduler, and every other task spawned by it will use it, including async/await continuations and even Task.Yield, used for achieving asynchrony early in an async method.
But for your particular case, neither solution will stop your queue of jobs from being exhausted before the jobs complete. That might not be desirable, depending on the implementation and purpose of that queue of yours. They are more like "fire a bunch of tasks and let the scheduler find the time to execute them" type solutions. So perhaps something a bit more appropriate here could be a stricter method of controlling the execution of the jobs via semaphores. The code would look like this:
var semaphore = new SemaphoreSlim(max_concurrent_jobs);
//...
while (...)
{
    var job = Queue.PopJob();
    semaphore.Wait();     // throttle: block until a slot frees up
    ProcessJobAsync(job); // fire and forget; each job releases its slot when done
}

async Task ProcessJobAsync(Job job)
{
    await Task.Yield(); // return to the loop immediately; continue on the pool
    // ... Process the job here ...
    semaphore.Release();
}
There's more than one way to skin a cat. Use what you believe is appropriate.
Microsoft has a very cool library called TPL Dataflow which does exactly what you want (and much more). Details here.
You should use the ActionBlock class and set the MaxDegreeOfParallelism of the ExecutionDataflowBlockOptions object. ActionBlock plays nicely with async/await, so even when your external calls are awaited, no new jobs will begin processing.
ExecutionDataflowBlockOptions actionBlockOptions = new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 10
};
this.sendToAzureActionBlock = new ActionBlock<List<Item>>(
    async items => await ProcessItems(items),
    actionBlockOptions);
...
this.sendToAzureActionBlock.Post(itemsToProcess);
The problem here doesn't seem to be too many running Tasks, it's too many scheduled Tasks. Your code will try to schedule as many Tasks as it can, no matter how fast they are executed. And if you have too many jobs, this means you will get OOM.
Because of this, none of your proposed solutions will actually solve your problem. If it seems that simply specifying LongRunning solves your problem, then that's most likely because creating a new Thread (which is what LongRunning does) takes some time, which effectively throttles getting new jobs. So, this solution only works by accident, and will most likely lead to other problems later on.
Regarding the solution, I mostly agree with usr: the simplest solution that works reasonably well is to create a fixed number of LongRunning tasks and have one loop that calls Queue.PopJob() (protected by a lock if that method is not thread-safe) and Execute()s the job.
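In code, that fixed-worker approach could look something like this (a sketch; workerCount is chosen empirically, and Queue/Job are from the question):
object queueLock = new object();
var workers = new Task[workerCount];
for (int i = 0; i < workerCount; i++)
{
    workers[i] = Task.Factory.StartNew(() =>
    {
        while (true)
        {
            Job job;
            lock (queueLock) { job = Queue.PopJob(); } // guard a non-thread-safe PopJob
            if (job == null)
                break;     // queue drained; let this worker finish
            job.Execute(); // blocking is fine on a dedicated thread
        }
    }, TaskCreationOptions.LongRunning);
}
Task.WaitAll(workers);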
UPDATE: After some more thinking, I realized the following attempt will most likely behave terribly. Use it only if you're really sure it will work well for you.
But the TPL tries to figure out the best degree of parallelism, even for IO-bound Tasks. So you might try to use that to your advantage. LongRunning tasks won't work here because, from the point of view of the TPL, it seems like no work is done and it will start new Tasks over and over. What you can do instead is start a new Task at the end of each Task. This way, the TPL will know what's going on and its algorithm may work well. Also, to let the TPL decide the degree of parallelism, at the start of a Task that is first in its line, start another line of Tasks.
This algorithm may work well. But it's also possible that the TPL will make a bad decision regarding the degree of parallelism, I haven't actually tried anything like this.
In code, it would look like this:
void ProcessJobs(bool isFirst)
{
var job = Queue.PopJob(); // assumes PopJob() is thread-safe
if (job == null)
return;
if (isFirst)
Task.Factory.StartNew(() => ProcessJobs(true));
job.Execute();
Task.Factory.StartNew(() => ProcessJobs(false));
}
And start it with
Task.Factory.StartNew(() => ProcessJobs(true));
TaskCreationOptions.LongRunning is useful for blocking tasks, and using it here is legitimate. What it does is suggest to the scheduler that it dedicate a thread to the task. The scheduler itself tries to keep the number of threads at the same level as the number of CPU cores to avoid excessive context switching.
It is well described in Threading in C# by Joseph Albahari
I use a message queue/mailbox mechanism to achieve this. It's akin to the actor model. I have a class that has a MailBox. I call this class my "worker." It can receive messages. Those messages are queued, and they, essentially, define tasks that I want the worker to run. The worker will use Task.Wait() to wait for its Task to finish before dequeueing the next message and starting the next task.
By limiting the number of workers I have, I am able to limit the number of concurrent threads/tasks that are being run.
This is outlined, with source code, in my blog post on a distributed compute engine. If you look at the code for IActor and the WorkerNode, I hope it makes sense.
https://long2know.com/2016/08/creating-a-distributed-computing-engine-with-the-actor-model-and-net-core/
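A minimal sketch of the mailbox idea (not the blog's actual code; names are illustrative): each worker drains its own message queue, running one task at a time, so the number of workers caps the concurrency.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Worker
{
    private readonly BlockingCollection<Action> mailbox = new BlockingCollection<Action>();

    public Worker()
    {
        // One long-running loop per worker caps its concurrency at 1.
        Task.Factory.StartNew(() =>
        {
            foreach (var message in mailbox.GetConsumingEnumerable())
                message(); // run the task this message describes
        }, TaskCreationOptions.LongRunning);
    }

    public void Post(Action message) => mailbox.Add(message);
}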
Here's the setup: I'm trying to make a relatively simple WinForms app, a feed reader using the FeedDotNet library. The question I have is about using the thread pool. Since FeedDotNet makes synchronous HttpWebRequests, it blocks the GUI thread. So the best thing seemed to be putting the synchronous call on a ThreadPool thread and, while it is working, invoking the controls that need updating on the form. Some rough code:
private void ThreadProc(object state)
{
Interlocked.Increment(ref updatesPending);
// check that main form isn't closed/closing so that we don't get an ObjectDisposedException exception
if (this.IsDisposed || !this.IsHandleCreated) return;
if (this.InvokeRequired)
this.Invoke((MethodInvoker)delegate
{
if (!marqueeProgressBar.Visible)
this.marqueeProgressBar.Visible = true;
});
ThreadAction t = state as ThreadAction;
Feed feed = FeedReader.Read(t.XmlUri);
Interlocked.Decrement(ref updatesPending);
if (this.IsDisposed || !this.IsHandleCreated) return;
if (this.InvokeRequired)
this.Invoke((MethodInvoker)delegate { ProcessFeedResult(feed, t.Action, t.Node); });
// finished everything, hide progress bar
if (updatesPending == 0)
{
if (this.IsDisposed || !this.IsHandleCreated) return;
if (this.InvokeRequired)
this.Invoke((MethodInvoker)delegate { this.marqueeProgressBar.Visible = false; });
}
}
this = main form instance
updatesPending = volatile int in the main form
ProcessFeedResult = method that does some operations on the Feed object. Since a threadpool thread can't return a result, is this an acceptable way of processing the result via the main thread?
The main thing I'm worried about is how this scales. I've tried ~250 requests at once. The max number of threads I've seen was around 53 and once all threads were completed, back to 21. I recall in one exceptional instance of me playing around with the code, I had seen it rise as high as 120. This isn't normal, is it? Also, being on Windows XP, I reckon that with such high number of connections, there would be a bottleneck somewhere. Am I right?
What can I do to ensure maximum efficiency of threads/connections?
Having all these questions also made me wonder whether this is the right case for ThreadPool use. MSDN and other sources say it should be used for "short-lived" tasks. Is 1-2 seconds "short-lived" enough, considering I'm on a relatively fast connection? What if the user is on 56K dial-up and one request could take from 5-12 seconds or even more? Would the threadpool be an efficient solution then too?
The ThreadPool, unchecked, is probably a bad idea.
Out of the box you get 250 threads in the threadpool per CPU.
Imagine if in a single burst you flatten out someone's net connection and get them banned from getting notifications from a site because they are suspected of running a DoS attack.
Instead, when downloading stuff from the net you should build in tons of control. The user should be able to decide how many concurrent requests they make (and how many concurrent requests per domain), ideally you also want to offer controls for the amount of bandwidth.
Though this could be orchestrated with the ThreadPool, having dedicated threads or using something like a bunch of instances of the BackgroundWorker class is a better option.
My understanding of the ThreadPool is that it is designed for this type of situation. I think the definition of short-lived is of this order of time - perhaps even up to minutes. A "long-lived" thread would be one that was alive for the lifetime of the application.
Don't forget Microsoft would have spent some time getting the efficiency of the ThreadPool as high as it could be. Do you think that you could write something more efficient? I know I couldn't.
The .NET thread pool is designed specifically for executing short-running tasks for which the overhead of creating a new thread would negate the benefits of creating a new thread. It is not designed for tasks which block for prolonged periods or have a long execution time.
The idea is for a task to hop onto a thread, run quickly, complete, and hop off.
The BackgroundWorker class provides an easy way to execute tasks on a thread pool thread, and provides mechanisms for the task to report progress and handle cancel requests.
In this MSDN article on the BackgroundWorker Component, file downloads are explicitly given as examples of the appropriate use of this class. That should hopefully encourage you to use this class to perform the work you need.
If you're worried about overusing the thread pool, you can be assured the runtime does manage the number of available threads based on demand. Tasks are queued on the thread pool for execution. When a thread becomes available to do work, the task is loaded onto the thread. At regular intervals, a monitoring process checks the state of the thread pool. If there are tasks waiting to be executed, it can create more threads. If there are several idle threads, it can shut down some to release resources.
In a worst-case scenario, where all threads are busy and you have work queued up, the runtime will add threads to deal with the extra workload. The application will run more slowly as it has to wait for more threads to be made available, but it will continue to run.
A few points, and to combine info from a few other answers:
your ThreadProc does not contain exception handling. You should add that, or one I/O error will halt your process.
Sam Saffron is quite right that you should limit the number of threads. You could use a thread-safe queue to push your feeds (work items) into and have one or more threads reading from the queue in a loop.
The BackgroundWorker might be a good idea; it would provide you with both the exception handling and the synchronization you need.
And the BackgroundWorker uses the ThreadPool, and that is fine.
You may want to take a look to the "BackgroundWorker" class.
I know how to implement multithreading using C#. But I want to know how it works.
Will only one thread run at a time, and when that thread is waiting, will it execute the second thread?
If the second thread is executing and the first thread becomes ready, what will happen?
Which thread will be given priority?
I am confused about the concept. I want to understand why we go for multithreading and when we use it.
Thanks in advance.
Threads may or may not be running at the same time. On a single-processor machine, only one thread is running at a time. On a multiprocessor system (multi-processor, multi-core, hyper-threading), multiple threads can run at the same time, one thread per processor.
The operating system scheduler determines when a thread gets to run. Windows is a preemptive multitasking system. It will run a thread for a certain amount of time, called a time slice (10 ms or 15 ms on Windows), stop the thread, then determine which thread to run next - which could be the same thread that was just running. The actual algorithm is complex.
Threads have priorities, so that affects this as well: all things being equal, a higher-priority thread will get more time than a lower-priority thread. If you don't manually set a priority on a thread, it defaults to Normal priority. In a simple case with two threads of the same priority that are ready to run, both threads will run an equal amount of time, probably round-robin.
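For instance, a sketch of lowering a background thread's priority (a hint to the scheduler, not a guarantee; DoBackgroundWork is a placeholder):
Thread background = new Thread(DoBackgroundWork);
background.Priority = ThreadPriority.BelowNormal; // the scheduler favors Normal threads
background.Start();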
As to why we do multithreading, there are two basic reasons:
Speed: On a multiprocessor system, since more than one thread can run at a time, our code can perform more than one task at a time. For example, if we are processing an image, we split the image into pieces and have different threads work on each piece.
Asynchronous operations: There is some task that will take a while (e.g. reading a file from the Internet) and we want to be able to let that go on in the background while we do something else, so we create a thread to do the download while we go about our business. One of the big draws of this is in a GUI application: we don't want to block the UI thread, so the user interface still responds to the user while processing is occurring.
Multithreading is useful in environments where one action needs to not BLOCK another action.
The primary example of that is in the case of a background process that shouldn't lock up the main user interface thread.
The operating system is generally going to decide who can do what, when. If a computer has only one core, multithreading has little benefit except the one listed above. But, as more cores are added, more actions can be performed concurrently.
However, even in a single core system, multithreading can facilitate non-blocking-IO which is very important in increasing the responsiveness of your application.
Multithreading speeds up program execution if there are parallelizable parts of the program.
You may want to have a look at different resources for multithreading to understand more about it.
Imagine you have a problem that needs to be done as quickly as possible. You have an easy one: count to a billion. You can do a loop: for (var i = 0; i < Math.Pow(10,9); i++) {} and this will execute on one core only, taking x amount of time. Now imagine doing it on multiple cores instead:
// Execute action a concurrently across the range [from, to), where a takes the current index.
void Execute(Action<int> a, int from, int to)
{
    // assert a != null, to > from, and (to - from) >= number of CPUs
    var pllItems = Environment.ProcessorCount;
    var range = to - from;
    var step = range / pllItems;
    var ranges = new int[pllItems, 2];

    // calculate the sub-range each thread should handle
    for (var i = 0; i < pllItems; i++)
    {
        ranges[i, 0] = from + i * step;            // where thread i starts
        ranges[i, 1] = (i == pllItems - 1)
            ? to                                   // last thread picks up the remainder
            : from + (i + 1) * step;               // exclusive end for thread i
    }

    var ts = new Thread[pllItems];
    for (var i = 0; i < pllItems; i++)
    {
        var currT = i; // copy the loop variable to avoid closure-capture problems
        ts[currT] = new Thread(() =>
        {
            for (var x = ranges[currT, 0]; x < ranges[currT, 1]; x++)
            {
                a(x);
                // could also have:
                // try { a(x); } catch (Exception e) { lock (ecs) ecs.Add(e); /* stop thread */ break; }
                // and return the failed threads' exceptions (ecs) at the end of the method
            }
        });
        ts[currT].Start();
    }

    foreach (var t in ts) t.Join(); // wait for all threads to finish
}
Thankfully, if you download the MS Threading library from 2008, you will get this for free with
Parallel.For(0, 1000000000, i => { });
There's also a new tool for VS2010 which displays in graphical form how the threads are blocking, waiting for IO, etc.
There's a scheduler in .Net/the OS that allows threads to have different interleavings.
A few days ago, MS released documentation on how to do parallel operations in .Net 4.
Have a download/read here
If you look at the Processes tab in Task Manager on your Windows machine, you will see the processes that are currently active on the machine. If you add the Threads column to the view, you will see the number of threads that currently exist in each process. The operating system (OS) is the one that determines how all of these threads across all of these processes are scheduled for execution on the processor. So in effect, the OS is constantly determining which threads have work to do and scheduling those threads for execution on the processor.
Let's assume a single processor, single core machine for now.
In this example, your application is the only process that is doing anything. Say your application has two threads of equal priority (more on this below). In this case, the OS will alternate between these two threads, scheduling one for execution and then the other until the work that they are doing is complete. To accomplish this, the OS grants a timeslice to the first scheduled thread. For example purposes, let's say the timeslice is 10 milliseconds (it's actually much shorter than this). So thread A will execute for 10 milliseconds. The OS will then preempt thread A so thread B can execute for its timeslice, also 10 milliseconds.
This back-and-forth will continue uninterrupted until both threads have finished their work or until certain events occur. For example, let's say that thread A finishes its work before thread B. In this case, thread A has nothing else to do, so the OS will continue to grant timeslices to thread B since it is the only one with work to do. Another thing that can happen is that thread A can wait on an event, such as a System.Threading.ManualResetEvent, or an asynchronous read of a socket. Until that event is signaled or data is received on the socket, thread A is essentially dead in its tracks, so the OS will continue to grant timeslices to thread B until the event/socket that thread A is waiting on occurs. At that point, the OS will resume switching between thread A and thread B for execution.
A good example of this is the background printing that most applications do today. An application's main thread is dedicated to processing UI events - button clicks, keyboard presses, drag-and-drop, etc. If you print a document from your favorite word processor, what happens conceptually is that the task of sending the print instructions to the printer is delegated to a secondary thread. So at this point, your application has two threads that are running - one thread servicing the UI and the other thread handling the print job. Since this is on a single processor, single core machine, the OS swaps between the two threads, granting timeslices to each. In this case, the print job thread will end after it finishes sending the print instructions, and then only your UI thread will be left.
A question you may have at this point is this:
Doesn't it take longer to print this way on a single processor, single core machine, since the OS is having to swap between the print job thread and the UI thread?
And the answer is YES. It does take longer this way. But consider the alternative. If the print job were executed on the UI thread, the user interface would be unresponsive to your input, i.e., button clicks, keyboard presses, etc., until the print job was complete. And this would frustrate you as the user because the application isn't responding to your input. So, in effect, multithreading is really an illusion of parallelism, at least on a single processor, single core machine. However, you get the satisfaction of being able to interact with your application while the print job is accomplished on another thread, even though the print job takes longer doing it this way.
Now let's move to a multicore machine. If your process has the same two threads, A and B, to execute, then each thread can be scheduled on a separate core. In this case, both threads run simultaneously without the interruption. The OS doesn't have to swap between the threads because each thread has its own core to run on. Make sense?
Finally, let's consider the priority associated with threads (assume single processor, single core again). Each thread in a given application has, by default, the same priority. What this means is that the OS will consider all threads equal with regard to scheduling. If you have two threads to be executed, they will get roughly the same amount of time on the processor. You can adjust this, however, by increasing/decreasing the priority of one thread over the other. In this case, the thread with the higher priority is favored for scheduling purposes over the thread with the lower priority, meaning that it gets more timeslices than the other thread. In some limited cases, adjusting the priority of threads can improve your application's performance, but for most applications it is not necessary. The thing to be cautious of is not to "starve" a thread, especially the UI thread. The OS helps prevent this by not starving a thread altogether. Still, adjusting priorities can make your application appear sluggish, if not altogether unresponsive, if the UI thread is "put on a diet," so to speak.
You can read more about thread priorities here and here.
I hope this helps.
Purposes of a thread
Hide latency (i.e. do something else while waiting)
Exploit the concurrency of the hardware (in case of multiple cores, this gives better performance)
Discriminate importance levels (i.e. high and low priority threads)
Organize structure (i.e. thread per event, thread per resource, thread per process)
There are others, but I think these are the basic uses of a thread.