ThreadPool frustrations - Thread creation exceeding SetMaxThreads

ThreadPool frustrations - Thread creation exceeding SetMaxThreads - c#

I've got an I/O intensive operation.
I only want a MAX of 5 threads ever running at one time.
I've got 8000 tasks to queue and complete.
Each task takes approximately 15-20seconds to execute.
I've looked around at ThreadPool, but
ThreadPool.SetMaxThreads(5, 0);
List<task> tasks = GetTasks();
int toProcess = tasks.Count;
ManualResetEvent resetEvent = new ManualResetEvent(false);
for (int i = 0; i < tasks.Count; i++)
{
ReportGenerator worker = new ReportGenerator(tasks[i].Code, id);
ThreadPool.QueueUserWorkItem(x =>
{
worker.Go();
if (Interlocked.Decrement(ref toProcess) == 0)
resetEvent.Set();
});
}
resetEvent.WaitOne();
I cannot figure out why... my code is executing more than 5 threads at one time. I've tried to setmaxthreads, setminthreads, but it keeps executing more than 5 threads.
What is happening? What am I missing? Should I be doing this in another way?
Thanks

There is a limitation in SetMaxThreads in that you can never set it lower than the number of processors on the system. If you have 8 processors, setting it to 5 is the same as not calling the function at all.

Task Parallel Library can help you:
List<task> tasks = GetTasks();
Parallel.ForEach(tasks, new ParallelOptions { MaxDegreeOfParallelism = 5 },
task => {ReportGenerator worker = new ReportGenerator(task.Code, id);
worker.Go();});
What does MaxDegreeOfParallelism do?

I think there's a different and better way to approach this. (Pardon me if I accidentally Java-ize some of the syntax)
The main thread here has a lists of things to do in "Tasks" -- instead of creating threads for each task, which is really not efficient when you have so many items, create the desired number of threads and then have them request tasks from the list as needed.
The first thing to do is add a variable to the class this code comes from, for use as a pointer into the list. We'll also add one for the maximum desired thread count.
// New variable in your class definition
private int taskStackPointer;
private final static int MAX_THREADS = 5;
Create a method that returns the next task in the list and increments the stack pointer. Then create a new interface for this:
// Make sure that only one thread has access at a time
[MethodImpl(MethodImplOptions.Synchronized)]
public task getNextTask()
{
if( taskStackPointer < tasks.Count )
return tasks[taskStackPointer++];
else
return null;
}
Alternately, you could return tasks[taskStackPointer++].code, if there's a value you can designate as meaning "end of list". Probably easier to do it this way, however.
The interface:
public interface TaskDispatcher
{
[MethodImpl(MethodImplOptions.Synchronized)] public task getNextTask();
}
Within the ReportGenerator class, change the constructor to accept the dispatcher object:
public ReportGenerator( TaskDispatcher td, int idCode )
{
...
}
You'll also need to alter the ReportGenerator class so that the processing has an outer loop that starts off by calling td.getNextTask() to request a new task, and which exits the loop when it gets back a NULL.
Finally, alter the thread creation code to something like this: (this is just to give you an idea)
taskStackPointer = 0;
for (int i = 0; i < MAX_THREADS; i++)
{
ReportGenerator worker = new ReportGenerator(this,id);
worker.Go();
}
That way you create the desired number of threads and keep them all working at max capacity.
(I'm not sure I got the usage of "[MethodImpl(MethodImplOptions.Synchronized)]" exactly right... I am more used to Java than C#)

Your tasks list will have 8k items in it because you told the code to put them there:
List<task> tasks = GetTasks();
That said, this number has nothing to do with how many threads are being used in the sense that the debugger is always going to show how many items you added to the list.
There are various ways to determine how many threads are in use. Perhaps one of the simplest is to break into the application with the debugger and take a look at the threads window. Not only will you get a count, but you'll see what each thread is doing (or not) which leads me to...
There is significant discussion to be had about what your tasks are doing and how you arrived at a number to 'throttle' the thread pool. In most use cases, the thread pool is going to do the right thing.
Now to answer your specific question...
To explicitly control the number of concurrent tasks, consider a trivial implementation that would involve changing your task collection from a List to BlockingCollection (that will internally use a ConcurrentQueue) and the following code to 'consume' the work:
var parallelOptions = new ParallelOptions
{
MaxDegreeOfParallelism = 5
};
Parallel.ForEach(collection.GetConsumingEnumerable(), options, x =>
{
// Do work here...
});
Change MaxDegreeOfParallelism to whatever concurrent value you have determined is appropriate for the work you are doing.
The following might be of interest to you:
Parallel.ForEach Method
BlockingCollection
Chris

Its works for me. This way you can't use a number of workerthreads smaller than "minworkerThreads". The problem is if you need five "workerthreads" maximum and the "minworkerThreads" is six doesn't work.
{
ThreadPool.GetMinThreads(out minworkerThreads,out minportThreads);
ThreadPool.SetMaxThreads(minworkerThreads, minportThreads);
}
MSDN
Remarks
You cannot set the maximum number of worker threads or I/O completion threads to a number smaller than the number of processors on the computer. To determine how many processors are present, retrieve the value of the Environment.ProcessorCount property. In addition, you cannot set the maximum number of worker threads or I/O completion threads to a number smaller than the corresponding minimum number of worker threads or I/O completion threads. To determine the minimum thread pool size, call the GetMinThreads method.
If the common language runtime is hosted, for example by Internet Information Services (IIS) or SQL Server, the host can limit or prevent changes to the thread pool size.
Use caution when changing the maximum number of threads in the thread pool. While your code might benefit, the changes might have an adverse effect on code libraries you use.
Setting the thread pool size too large can cause performance problems. If too many threads are executing at the same time, the task switching overhead becomes a significant factor.

Related

understanding Parallel.Invoke, creation and reusing of threads

I am trying to understand how Parallel.Invoke creates and reuses threads.
I ran the following example code (from MSDN, https://msdn.microsoft.com/en-us/library/dd642243(v=vs.110).aspx):
using System;
using System.Threading;
using System.Threading.Tasks;
class ThreadLocalDemo
{
static void Main()
{
// Thread-Local variable that yields a name for a thread
ThreadLocal<string> ThreadName = new ThreadLocal<string>(() =>
{
return "Thread" + Thread.CurrentThread.ManagedThreadId;
});
// Action that prints out ThreadName for the current thread
Action action = () =>
{
// If ThreadName.IsValueCreated is true, it means that we are not the
// first action to run on this thread.
bool repeat = ThreadName.IsValueCreated;
Console.WriteLine("ThreadName = {0} {1}", ThreadName.Value, repeat ? "(repeat)" : "");
};
// Launch eight of them. On 4 cores or less, you should see some repeat ThreadNames
Parallel.Invoke(action, action, action, action, action, action, action, action);
// Dispose when you are done
ThreadName.Dispose();
}
}
As I understand it, Parallel.Invoke tries to create 8 threads here - one for each action. So it creates the first thread, runs the first action, and by that gives a ThreadName to the thread. Then it creates the next thread (which gets a different ThreadName) and so on.
If it cannot create a new thread, it will reuse one of the threads created before. In this case, the value of repeat will be true and we can see this in the console output.
Is this correct until here?
The second-last comment ("Launch eight of them. On 4 cores or less, you should see some repeat ThreadNames") implies that the threads created by Invoke correspond to the available cpu threads of the processor: on 4 cores we have 8 cpu threads, at least one is busy (running the operating system and stuff), so Invoke can only use 7 different threads, so we must get at least one "repeat".
Is my interpretation of this comment correct?
I ran this code on my PC which has an Intel® Core™ i7-2860QM processor (i.e. 4 cores, 8 cpu threads). I expected to get at least one "repeat", but I didn't. When I changed the Invoke to take 10 instead of 8 actions, I got this output:
ThreadName = Thread6
ThreadName = Thread8
ThreadName = Thread6 (repeat)
ThreadName = Thread5
ThreadName = Thread3
ThreadName = Thread1
ThreadName = Thread10
ThreadName = Thread7
ThreadName = Thread4
ThreadName = Thread9
So I have at least 9 different threads in the console application. This contradicts the fact that my processor only has 8 threads.
So I guess some of my reasoning from above is wrong. Does Parallel.Invoke work differently than what I described above? If yes, how?

If you pass less then 10 items to Parallel.Invoke, and you don't specify MaxDegreeOfParallelism in options (so - your case), it will just run them all in parallel on thread pool sheduler using rougly the following code:
var actions = new [] { action, action, action, action, action, action, action, action };
var tasks = new Task[actions.Length];
for (int index = 1; index < tasks.Length; ++index)
tasks[index] = Task.Factory.StartNew(actions[index]);
tasks[0] = new Task(actions[0]);
tasks[0].RunSynchronously();
Task.WaitAll(tasks);
So just a regular Task.Factory.StartNew. If you will look at max number of threads in thread pool
int th, io;
ThreadPool.GetMaxThreads(out th, out io);
Console.WriteLine(th);
You will see some big number, like 32767. So, number of threads on which Parallel.Invoke will be executed (in your case) are not limited to number of cpu cores at all. Even on 1-core cpu it might run 8 threads in parallel.
You might then think, why some threads are reused at all? Because when work is done on thread pool thread - that thread is returned to the pool and is ready to accept new work. Actions from your example basically do no work at all and complete very fast. So sometimes first thread started via Task.Factory.StartNew has already completed your action and is returned to the pool before all subsequent threads were started. So that thread is reused.
By the way, you can see (repeat) in your example with 8 actions, and even with 7 if you try hard enough, on a 8 core (16 logical cores) processor.
UPDATE to answer your comment. Thread pool scheduler will not necessary create new threads immediately. There is min and max number of threads in thread pool. How to see max I already shown above. To see min number:
int th, io;
ThreadPool.GetMinThreads(out th, out io);
This number will usually be equal to the number of cores (so for example 8). Now, when you request new action to be performed on thread pool thread, and number of threads in a thread pool is less than minimum - new thread will be created immeditely. However, if number of available threads is greater than minimum - certain delay will be introduced before creating new thread (I don't remember how long exactly unfortunately, about 500ms).
Statement you added in your comment I highly doubt can execute in 2-3 seconds. For me it executes for 0.3 seconds max. So when first 8 threads are created by thread pool, there is that 500ms delay before creating 9th. During that delay, some (or all) of first 8 threads are completed their job and are available for new work, so there is no need to create new thread and they can be reused.
To verify this, introduce bigger delay:
static void Main()
{
// Thread-Local variable that yields a name for a thread
ThreadLocal<string> ThreadName = new ThreadLocal<string>(() =>
{
return "Thread" + Thread.CurrentThread.ManagedThreadId;
});
// Action that prints out ThreadName for the current thread
Action action = () =>
{
// If ThreadName.IsValueCreated is true, it means that we are not the
// first action to run on this thread.
bool repeat = ThreadName.IsValueCreated;
Console.WriteLine("ThreadName = {0} {1}", ThreadName.Value, repeat ? "(repeat)" : "");
Thread.Sleep(1000000);
};
int th, io;
ThreadPool.GetMinThreads(out th, out io);
Console.WriteLine("cpu:" + Environment.ProcessorCount);
Console.WriteLine(th);
Parallel.Invoke(Enumerable.Repeat(action, 100).ToArray());
// Dispose when you are done
ThreadName.Dispose();
Console.ReadKey();
}
You will see that now thread pool has to create new threads every time (much more than there are cores), because it cannot reuse previous threads while they are busy.
You can also increase number of min threads in thread pool, like this:
int th, io;
ThreadPool.GetMinThreads(out th, out io);
ThreadPool.SetMinThreads(100, io);
This will remove the delay (until 100 threads are created) and in above example you will notice that.

Behind the scenes, threads are organized (and possessed by) the task scheduler. Primary purpose of the task scheduler is to keep all CPU cores used as much as possible with useful work.
Under the hood, scheduler is using the thread pool, and then size of the thread pool is the way to fine-tune usefulness of operations executed on CPU cores.
Now this requires some analysis. For instance, thread switching costs CPU cycles and it is not useful work. On the other hand, when one thread executes one task on a core, all other tasks are stalled and they are not progressing on that core. I believe that is the core reason why the scheduler is usually starting two threads per core, so that at least some movement is visible in case that one task takes longer to complete (like several seconds).
There are corollaries to this basic mechanism. When some tasks take long time to complete, scheduler starts new threads to compensate. That means that long-running task will now have to compete for the core with short-running tasks. In that way, short tasks will be completed one after another, and long task will slowly progress to its completion as well.
Bottom line is that your observations about threads are generally correct, but not entirely true in specific situations. In concrete execution of a number of tasks, scheduler might choose to raise more threads, or to keep going with the default. That is why you will sometimes notice that number of threads differs.
Remember the goal of the game: Utilize CPU cores with useful work as much as possible, while at the same time making all tasks move, so that the application doesn't look like frozen. Historically, people used to try to reach these goals with many different techniques. Analysis had shown that many of those techniques were applied randomly and didn't really increase CPU utilization. That analysis has lead to introduction of task schedulers in .NET, so that fine-tuning can be coded once and be done well.

So I have at least 9 different threads in the console application. This contradicts the fact that my processor only has 8 threads.
A thread is a very much overloaded term. It can mean, at the very least: (1) something you sew with, (2) a bunch of code with associated state, that is represented by an OS handle, and (3) an execution pipeline of a CPU. The Thread.CurrentThread refers to (2), the "processor thread" that you mentioned refers to (3).
The existence of a (2)-thread is not predicated on the existence of (3)-thread, and the number of (2)-threads that exist on any particular system is pretty much limited by available memory and OS design. The existence of (2)-thread doesn't imply execution of (2)-thread at any given time (unless you use APIs that guarantee that).
Furthermore, if a (2)-thread executes at some point - implying a temporary 1:1 binding between (2)-thread and (3)-thread, there is no implication that the thread will continue executing in general, and of course neither is there an implication that the thread will continue executing on the same (3)-thread if it continues executing at all.
So, even if you have "caught" the execution of a (2)-thread on a (3)-thread by some side effect, e.g. console output, as you did, that doesn't necessarily imply anything about any other (2)-threads and (3)-threads at that point.
On to your code:
// If ThreadName.IsValueCreated is true, it means that we are not the
// first action to run on this thread. <-- this refers to (2)-thread, NOT (3)-thread.
Parallel.Invoke is not precluded from (in terms of specifications) creating as many new (2)-threads as there are arguments passed to it. The actual number of (2)-threads created may be all the way from zero to a hero, since to call Parallel.Invoke there must be an existing (2)-thread with some code that calls this API. So, no new (2)-threads need to be created at all, for example. Whether the (2)-threads created by Parallel.Invoke execute on any particular number of (3)-threads concurrently is beyond your control either.
So that explains the behavior you see. You conflated (2)-threads with (3)-threads, and assumed that Parallel.Invoke does something specific it in fact is not guaranteed to do. Citing documentation:
No guarantees are made about the order in which the operations execute or whether they execute in parallel.
This implies that Invoke is free to run the actions on dedicated (2)-threads if it so wishes. And that is what you observed.

What determines the number of threads for a TaskFactory spawned jobs?

I have the following code:
var factory = new TaskFactory();
for (int i = 0; i < 100; i++)
{
var i1 = i;
factory.StartNew(() => foo(i1));
}
static void foo(int i)
{
Thread.Sleep(1000);
Console.WriteLine($"foo{i} - on thread {Thread.CurrentThread.ManagedThreadId}");
}
I can see it only does 4 threads at a time (based on observation). My questions:
What determines the number of threads used at a time?
How can I retrieve this number?
How can I change this number?
P.S. My box has 4 cores.
P.P.S. I needed to have a specific number of tasks (and no more) that are concurrently processed by the TPL and ended up with the following code:
private static int count = 0; // keep track of how many concurrent tasks are running
private static void SemaphoreImplementation()
{
var s = new Semaphore(20, 20); // allow 20 tasks at a time
for (int i = 0; i < 1000; i++)
{
var i1 = i;
Task.Factory.StartNew(() =>
{
try
{
s.WaitOne();
Interlocked.Increment(ref count);
foo(i1);
}
finally
{
s.Release();
Interlocked.Decrement(ref count);
}
}, TaskCreationOptions.LongRunning);
}
}
static void foo(int i)
{
Thread.Sleep(100);
Console.WriteLine($"foo{i:00} - on thread " +
$"{Thread.CurrentThread.ManagedThreadId:00}. Executing concurently: {count}");
}

When you are using a Task in .NET, you are telling the TPL to schedule a piece of work (via TaskScheduler) to be executed on the ThreadPool. Note that the work will be scheduled at its earliest opportunity and however the scheduler sees fit. This means that the TaskScheduler will decide how many threads will be used to run n number of tasks and which task is executed on which thread.
The TPL is very well tuned and continues to adjust its algorithm as it executes your tasks. So, in most cases, it tries to minimize contention. What this means is if you are running 100 tasks and only have 4 cores (which you can get using Environment.ProcessorCount), it would not make sense to execute more than 4 threads at any given time, as otherwise it would need to do more context switching. Now there are times where you want to explicitly override this behaviour. Let's say in the case where you need to wait for some sort of IO to finish, which is a whole different story.
In summary, trust the TPL. But if you are adamant to spawn a thread per task (not always a good idea!), you can use:
Task.Factory.StartNew(
() => /* your piece of work */,
TaskCreationOptions.LongRunning);
This tells the DefaultTaskscheduler to explicitly spawn a new thread for that piece of work.
You can also use your own Scheduler and pass it in to the TaskFactory. You can find a whole bunch of Schedulers HERE.
Note another alternative would be to use PLINQ which again by default analyses your query and decides whether parallelizing it would yield any benefit or not, again in the case of a blocking IO where you are certain starting multiple threads will result in a better execution you can force the parallelism by using WithExecutionMode(ParallelExecutionMode.ForceParallelism) you then can use WithDegreeOfParallelism, to give hints on how many threads to use but remember there is no guarantee you would get that many threads, as MSDN says:
Sets the degree of parallelism to use in a query. Degree of
parallelism is the maximum number of concurrently executing tasks that
will be used to process the query.
Finally, I highly recommend having a read of THIS great series of articles on Threading and TPL.

If you increase the number of tasks to for example 1000000 you will see a lot more threads spawned over time. The TPL tends to inject one every 500ms.
The TPL threadpool does not understand IO-bound workloads (sleep is IO). It's not a good idea to rely on the TPL for picking the right degree of parallelism in these cases. The TPL is completely clueless and injects more threads based on vague guesses about throughput. Also to avoid deadlocks.
Here, the TPL policy clearly is not useful because the more threads you add the more throughput you get. Each thread can process one item per second in this contrived case. The TPL has no idea about that. It makes no sense to limit the thread count to the number of cores.
What determines the number of threads used at a time?
Barely documented TPL heuristics. They frequently go wrong. In particular they will spawn an unlimited number of threads over time in this case. Use task manager to see for yourself. Let this run for an hour and you'll have 1000s of threads.
How can I retrieve this number? How can I change this number?
You can retrieve some of these numbers but that's not the right way to go. If you need a guaranteed DOP you can use AsParallel().WithDegreeOfParallelism(...) or a custom task scheduler. You also can manually start LongRunning tasks. Do not mess with process global settings.

I would suggest using SemaphoreSlim because it doesn't use Windows kernel (so it can be used in Linux C# microservices) and also has a property SemaphoreSlim.CurrentCount that tells how many remaining threads are left so you don't need the Interlocked.Increment or Interlocked.Decrement. I also removed i1 because i is value type and it won't be changed by the call of foo method passing the i argument so it's no need to copy it into i1 to ensure it never changes (if that was the reasoning for adding i1):
private static void SemaphoreImplementation()
{
var maxThreadsCount = 20; // allow 20 tasks at a time
var semaphoreSlim = new SemaphoreSlim(maxTasksCount, maxTasksCount);
var taskFactory = new TaskFactory();
for (int i = 0; i < 1000; i++)
{
taskFactory.StartNew(async () =>
{
try
{
await semaphoreSlim.WaitAsync();
var count = maxTasksCount-semaphoreSlim.CurrentCount; //SemaphoreSlim.CurrentCount tells how many threads are remaining
await foo(i, count);
}
finally
{
semaphoreSlim.Release();
}
}, TaskCreationOptions.LongRunning);
}
}
static async void foo(int i, int count)
{
await Task.Wait(100);
Console.WriteLine($"foo{i:00} - on thread " +
$"{Thread.CurrentThread.ManagedThreadId:00}. Executing concurently: {count}");
}

Is it the correct implementation?

I am having a Windows Service that needs to pick the jobs from database and needs to process it.
Here, each job is a scanning process that would take approx 10 mins to complete.
I am very new to Task Parallel Library. I have implemented in the following way as sample logic:
Queue queue = new Queue();
for (int i = 0; i < 10000; i++)
{
queue.Enqueue(i);
}
for (int i = 0; i < 100; i++)
{
Task.Factory.StartNew((Object data ) =>
{
var Objdata = (Queue)data;
Console.WriteLine(Objdata.Dequeue());
Console.WriteLine(
"The current thread is " + Thread.CurrentThread.ManagedThreadId);
}, queue, TaskCreationOptions.LongRunning);
}
Console.ReadLine();
But, this is creating lot of threads. Since loop is repeating 100 times, it is creating 100 threads.
Is it right approach to create that many number of parallel threads ?
Is there any way to limit the number of threads to 10 (concurrency level)?

An important factor to remember when allocating new Threads is that the OS has to allocate a number of logical entities in order for that current thread to run:
Thread kernel object - an object for describing the thread,
including the thread's context, cpu registers, etc
Thread environment block - For exception handling and thread local
storage
User-mode stack - 1MB of stack
Kernel-mode stack - For passing arguments from user mode to kernel
mode
Other than that, the number of concurrent Threads that may run depend on the number of cores your machine is packing, and creating an amount of threads that is larger than the number of cores your machine owns will start causing Context Switching, which in the long run may slow your work down.
So after the long intro, to the good stuff. What we actually want to do is limit the number of threads running and reuse them as much as possible.
For this kind of job, i would go with TPL Dataflow which is based on the Producer-Consumer pattern. Just a small example of what can be done:
// a BufferBlock is an equivalent of a ConcurrentQueue to buffer your objects
var bufferBlock = new BufferBlock<object>();
// An ActionBlock to process each object and do something with it
var actionBlock = new ActionBlock<object>(obj =>
{
// Do stuff with the objects from the bufferblock
});
bufferBlock.LinkTo(actionBlock);
bufferBlock.Completion.ContinueWith(t => actionBlock.Complete());
You may pass each Block a ExecutionDataflowBlockOptions which may limit the Bounded Capacity (The number of objects inside the BufferBlock) and MaxDegreeOfParallelism which tells the block the number of maximum concurrency you may want.
There is a good example here to get you started.

Glad you asked, because you're right in the sense that - this is not the best approach.
The concept of Task should not be confused with a Thread. A Thread can be compared to a chef in a kitchen, while a Task is a dish ordered by a customer. You have a bunch of chefs, and they process the dish orders in some ordering (usually FIFO). A chef finishes a dish then moves on to the next. The concept of Thread Pool is the same. You create a bunch of Tasks to be completed, but you do not need to assign a new thread to each task.
Ok so the actual bits to do it. There are a few. The first one is ThreadPoll.QueueUserWorkItem. (http://msdn.microsoft.com/en-us/library/system.threading.threadpool.queueuserworkitem(v=vs.110).aspx). Using the Parallel library, Parallel.For can also be used, it will automatically spawn threads based on the number of actual CPU cores available in the system.
Parallel.For(0, 100, i=>{
//here, this method will be called 100 times, and i will be 0 to 100
WaitForGrassToGrow();
Console.WriteLine(string.Format("The {0}-th task has completed!",i));
});
Note that there is no guarantee that the method called by Parallel.For is called in sequence (0,1,2,3,4,5...). The actual sequence depends on the execution.

N number of threads asynchronously getting/executing tasks

I have an unlimited number of tasks in a db queue somewhere. What is the best way to have a program working on n tasks simultaneously on n different threads, starting new tasks as old ones get done? When one task finishes, another task should asynchronously begin. The currently-running count should always be n.
My initial thought was to use a thread pool, but that seems unnecessary considering that the tasks to be worked on will be retrieved within the individual threads. In other words, each thread will on its own go get its next task rather than having a main thread get tasks and then distribute them.
I see multiple options for doing this, and I don't know which one I should use for optimal performance.
1) Thread Pool - In light of there not necessarily being any waiting threads, I'm not sure this is necessary.
2) Semaphore - Same as 1. What's the benefit of a semaphore if there aren't tasks waiting to be allocated by the main thread?
3) Same Threads Forever - Kick the program off with n threads. When a thread is done working, it gets the next task itself. The main thread just monitors to makes sure the n threads are still alive.
4) Event Handling - Same as 3, except that when a thread finishes a task, it fires off an ImFinished event before dying. An ImFinished event handler kicks off a new thread. This seems just like 3 but with more overhead (since new threads are constantly being created)
5) Something else?

BlockingCollection makes this whole thing pretty trivial:
var queue = new BlockingCollection<Action>();
int numWorkers = 5;
for (int i = 0; i < numWorkers; i++)
{
Thread t = new Thread(() =>
{
foreach (var action in queue.GetConsumingEnumerable())
{
action();
}
});
t.Start();
}
You can then have the main thread add items to the blocking collection after starting the workers (or before, if you want). You can even spawn multiple producer threads to add items to the queue.
Note that the more conventional approach would be to use Tasks instead of using Thread classes directly. The primary reasons that I didn't suggest it first is that you specifically requested an exact number of threads to be running (rather than a maximum) and you just don't have as much control over how Task objects are run (which is good; they can be optimized on your behalf). If that control isn't as important as you have stated the following may end up being preferable:
var queue = new BlockingCollection<Action>();
int numWorkers = 5;
for (int i = 0; i < numWorkers; i++)
{
Task.Factory.StartNew(() =>
{
foreach (var action in queue.GetConsumingEnumerable())
{
action();
}
}, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
}

I like model #3, and have used it before; it reduces the number of threads starting and stopping, and makes the main thread a true "supervisor", reducing the work it has to do.
As Servy has indicated, the System.Collections.Concurrent namespace has a few constructs that are extremely valuable here. ConcurrentQueue is a thread-safe FIFO collection implementation designed to be used in just such a model; one or more "producer" threads add elements to the "input" side of the queue, while one or more "consumers" take elements out of the other end. If there is nothing in the queue, the call to get the item simply returns false; you can react to that by exiting out of the task method (the supervisor can then decide whether to start another task, probably by monitoring the input to the queue and ramping up when more items come in).
BlockingCollection adds the behavior of causing threads to wait when they attempt to get a value from the queue, if the queue doesn't have anything. It can also be configured to have a maximum capacity, above which it will block the "producer" threads adding any more elements until there is available capacity. BlockingCollection uses a ConcurrentQueue by default, but you can set it up to be a Stack, Dictionary or Bag if you wish. Using this model, you can have the tasks run indefinitely; when there's nothing to do they'll simply block until there is something for at least one of them to work on, so all the supervisor has to check for is tasks erroring out (a critical element of any robust threaded workflow pattern).

This is easily achieved with the TPL Dataflow library.
First, let's assume you have a BufferBlock<T>, this is your queue:
var queue = new BufferBlock<T>();
Then, you need the action to perform on the block, this is represented by the ActionBlock<T> class:
var action = new ActionBlock<T>(t => { /* Process t here */ },
new ExecutionDataflowBlockOptions {
// Number of concurrent tasks.
MaxDegreeOfParallelism = ...,
});
Note the constructor above, it takes an instance of ExecutionDataflowBlockOptions and sets the MaxDegreeOfParallelism property to however many concurrent items you want to be processed at the same time.
Underneath the surface, the Task Parallel Library is being used to handle allocating threads for tasks, etc. TPL Dataflow is meant to be a higher level abstraction which allows you to tweak just how much parallelism/throttling/etc that you want.
For example, if you didn't want the ActionBlock<TInput> to buffer any items (preferring them to live in the BufferBlock<T>), you can also set the BoundedCapacity property, which will limit the number of items that the ActionBlock<TInput> will hold onto at once (which includes the number of items being processed, as well as reserved items):
var action = new ActionBlock<T>(t => { /* Process t here */ },
new ExecutionDataflowBlockOptions {
// Number of concurrent tasks.
MaxDegreeOfParallelism = ...,
// Set to MaxDegreeOfParallelism to not buffer.
BoundedCapacity ...,
});
Also, if you want a new, fresh Task<TResult> instance to process every item, then you can set the MaxMessagesPerTask property to one, indicating that each and every Task<TResult> will process one item:
var action = new ActionBlock<T>(t => { /* Process t here */ },
new ExecutionDataflowBlockOptions {
// Number of concurrent tasks.
MaxDegreeOfParallelism = ...,
// Set to MaxDegreeOfParallelism to not buffer.
BoundedCapacity ...,
// Process once item per task.
MaxMessagesPerTask = 1,
});
Note that depending on how many other tasks your application is running, this might or might not be optimal for you, and you might also want to think of the cost of spinning up a new task for every item that comes through the ActionBlock<TInput>.
From there, it's a simple matter of linking the BufferBlock<T> to the ActionBlock<TInput> with a call to the LinkTo method:
IDisposable connection = queue.LinkTo(action, new DataflowLinkOptions {
PropagateCompletion = true;
});
You set the PropogateCompletion property to true here so that when waiting on the ActionBlock<T>, the completion will be sent to the ActionBlock<T> (if/when there are no more items to process) which you might subsequently wait on.
Note the you can call the Dispose method on the IDisposable interface implementation returned from the call to LinkTo if you want the link between the blocks to be removed.
Finally, you post items to the buffer using the Post method:
queue.Post(new T());
And when you're done (if you are ever done), you call the Complete method:
queue.Complete();
Then, on the action block, you can wait until it's done by waiting on the Task instance exposed by the Completion property:
action.Completion.Wait();
Hopefully, the elegance of this is clear:
You don't have to manage the creation of new Task instances/threads/etc to manage the work, the blocks do it for you based on the settings you provide (and this is on a per-block basis).
Cleaner separation of concerns. The buffer is separated from the action, as are all the other blocks. You build the blocks and then link them together.

I'm a VB guy, but you can easily translate:
Private Async Sub foo()
Dim n As Integer = 16
Dim l As New List(Of Task)
Dim jobs As New Queue(Of Integer)(Enumerable.Range(1, 100))
For i = 1 To n
Dim j = jobs.Dequeue
l.Add(Task.Run((Sub()
Threading.Thread.Sleep(500)
Console.WriteLine(j)
End Sub)))
Next
While l.Count > 0
Dim t = Await Task.WhenAny(l)
If jobs.Count > 0 Then
Dim j = jobs.Dequeue
l(l.IndexOf(t)) = (Task.Run((Sub()
Threading.Thread.Sleep(500)
Console.WriteLine(j)
End Sub)))
Else
l.Remove(t)
End If
End While
End Sub
There's an article from Stephen Toub, why you shouldn't use Task.WhenAny in this way ... WITH A LARGE LIST OF TASKS, but with "some" tasks you usually dont run into a problem
The idea is quite simple: You have a list, where you add as many (running) tasks as you would like to run in parallel. Then you (a)wait for the first one to finish. If there are still jobs in the queue, you assign the job to a new task and then (a)wait again. If there are no jobs in the queue, you simply remove the finished task. If both your tasklist and the queue is empty, you are done.
The Stephen Toub article: http://blogs.msdn.com/b/pfxteam/archive/2012/08/02/processing-tasks-as-they-complete.aspx

Threadpool QueueUserWorkItem without using SetMaxThreads,GetMaxThreads

I'm using Threadpool to do some parallel processing in c# .NET 2.0.
Code :
int MAXThreads=GetConfigValue("MaxThreadLimit"); //This value is read from app.config
ManualResetEvent[] doneEvents=new ManualResetEvent[MAXThreads];
for(int i=0;i<MaxThreads,i++)
{
doneEvents[i]=new ManualResetEvent(false);
//create workload
DoProcess job=new DoProcess(workload,doneEvents[i]);
ThreadPool.QueueUserWorkItem(job.ThreadPoolCallBack,i);
}
WaitHandle.WaitAll(doneEvents);
//proceed
Class DoProcess
{
private WorkLoad load;
private ManualResetEvent doneEvent;
public DoProcess(WorkLoad load,ManualResetEvent doneEvent)
{
this.load=load;
this.doneEvent=doneEvent;
}
public void ThreadPoolCallBack(object index)
{
//Do Processing
doneEvent.Set();
}
}
MAXThreads value is being read from config but I guess this has nothing to do with the actual number of threads generated. Only few ~4-5 threads handle all the workload. I want thread count to be fixed somewhere around 20. How can I achieve this? Am I missing on something?..Does SetMaxThreads address this issue?..The above code will run on quad core cpu.

You'd have to set the minimum number of threads instead.
That's not a very good idea in general, running more threads than you have processor cores usually gets less work done since the operating system is spending time swapping them in and out. These context switches are not cheap. The threadpool manager does its best to limit the number of active threads to the number of cores. Only allowing more threads to run when the existing ones don't complete in time. Up to the maximum number of threads. An enormous value by default, 1000 in your case.
Only increase the min threads when those worker threads are not performing enough work because they are blocked on I/O too often. In which case you really ought to consider Thread objects instead of thread pool threads.

There is also SetMinThreads in the ThreadPool class. Setting both min and max to the same value "should" fix the number of threads, but what actually happens under the hood is anybody's guess.
MSDN:
The thread pool provides new worker threads or I/O completion threads
on demand until it reaches the minimum for each category.
So setting the minimum number of threads to 20 should give you no less than 20 threads in the pool.

Assuming this is a C# application, you can use .NET Framework 4's Parallel programming model and limit the threads to 20.
Parallel.For(0, n, new ParallelOptions { MaxDegreeOfParallelism = 20 },
i =>
{
DoWork(i);
});
However, if this is a web site/app, it's best to stay away from the ThreadPool all together, except for really small jobs, 1-2 threads, because you don't want to starve the ThreadPool. And NEVER set the max or min threads in the code, because it will affect your entire site, and all other sites using the same thread pool.
In this case, I recommend using SmartThreadPool instead.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.