I'm writing an application that should simulate the behavior of a PLC. This means I have to run several threads making sure only one thread at a time is active and all others are suspended.
For example:
Thread 1 repeats every 130ms and blocks all other threads. The effective runtime is 30ms, and the remaining 100ms before the thread restarts can be used by other threads.
Thread 2 repeats every 300ms and blocks all threads except for thread 1. The effective runtime is 50ms (the remaining 250ms can be used by other threads). Thread 2 is paused until thread 1 has finished executing code (the remaining 100ms of thread 1), and once thread 1 is asleep it resumes from where it was paused.
Thread 3 repeats every 1000ms. The effective runtime is 100ms. This thread continues execution only if all other threads are suspended.
The top priority is that each task completes before it is called again; if it does not, I have to react. A thread that should be blocked must therefore not run at all. Blocking it only at some later point is not enough, because on a multicore machine it would otherwise execute its code in parallel and merely wait before passing on the results.
I read several posts and learned that Thread.Suspend is not recommended, and that semaphore or monitor operations mean the code runs until a specific, fixed point in the code, whereas I have to pause the threads exactly where execution has arrived when another thread (with higher "priority") is called.
I also looked at the priority setting but it doesn't seem to be 100% relevant since the system can override priorities.
Is there a correct or at least solid way to code the blocking mechanism?
I don't think you need to burden yourself with Threads at all. Instead, you can use Tasks with a prioritised TaskScheduler (it's not too hard to write or find by googling).
This makes the code quite easy to write, for example the highest priority thread might be something like:
while (!cancellationRequested)
{
    var repeatTask = Task.Delay(130);
    // Do your high priority work
    await repeatTask;
}
Your other tasks will have a similar basic layout, but they will be given a lower priority in the task scheduler (this is usually handled by the task scheduler having a separate queue for each of the task priorities). Once in a while, they can check whether there is a higher priority task, and if so, they can do await Task.Yield();. In fact, in your case, it seems like you don't even need real queues - that makes this a lot easier, and even better, allows you to use Task.Yield really efficiently.
The end result is that all three of your periodic tasks are efficiently run on just a single thread (or even no thread at all if they're all waiting).
This does rely on coöperative multi-tasking, of course. It's not really possible to handle full blown real-time like pre-emption on Windows - and partial pre-emption solutions tend to be full of problems. If you're in control of most of the time spent in the task (and offload any other work using asynchronous I/O), the coöperative solution is actually far more efficient, and can give you a lot less latency (though it really shouldn't matter much).
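To make the cooperative idea concrete, here is a minimal sketch that uses the periods and run times from the question. It is not the Task/TaskScheduler implementation described above: it swaps in C# iterators so that the behaviour is deterministic, and it simulates time in 10 ms slices instead of using real timers. All names (Job, CooperativeDemo, Simulate) and the slice size are assumptions made for illustration. Each yield return is a checkpoint where a lower-priority job can be paused and later resumed exactly where it stopped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
sealed class Job
{
    public string Name;
    public int PeriodMs;                 // time between releases
    public int Slices;                   // run time, in 10 ms slices
    public int NextReleaseMs;            // next release time (starts at 0)
    public IEnumerator<int> Running;     // null while the job is idle
}

static class CooperativeDemo
{
    const int SliceMs = 10;              // assumed granularity of a checkpoint

    // One MoveNext() call performs one slice of work and then pauses at the
    // yield, which is the cooperative checkpoint.
    static IEnumerator<int> Work(int slices)
    {
        for (int i = 0; i < slices; i++)
        {
            // ... one slice of the job's real work would go here ...
            yield return i;
        }
    }

    // jobs must be ordered from highest to lowest priority.
    public static List<string> Simulate(Job[] jobs, int horizonMs)
    {
        var timeline = new List<string>();
        for (int now = 0; now < horizonMs; now += SliceMs)
        {
            // Release every job whose period has elapsed. A job that is still
            // running at its next release is simply restarted in this sketch.
            foreach (Job job in jobs)
            {
                if (now >= job.NextReleaseMs)
                {
                    job.Running = Work(job.Slices);
                    job.NextReleaseMs += job.PeriodMs;
                }
            }

            // Resume the highest-priority job that still has work pending.
            string ran = "idle";
            foreach (Job job in jobs)
            {
                if (job.Running == null) continue;
                if (job.Running.MoveNext()) { ran = job.Name; break; }
                job.Running = null;      // cycle finished; try the next job
            }
            timeline.Add(ran);
        }
        return timeline;
    }

    public static Job[] ExampleJobs() => new[]
    {
        new Job { Name = "T1", PeriodMs = 130,  Slices = 3 },
        new Job { Name = "T2", PeriodMs = 300,  Slices = 5 },
        new Job { Name = "T3", PeriodMs = 1000, Slices = 10 },
    };

    static void Main()
    {
        List<string> timeline = Simulate(ExampleJobs(), 300);
        Console.WriteLine(string.Join(" ", timeline));
    }
}
```

In this simulation T3 gets its first slices only after T1 and T2 have finished, is interrupted when T1 is released again at 130 ms, and then resumes from its checkpoint. A real implementation would also have to detect a job that has not finished by its next release, which the sketch does not do.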
I hope I don't misunderstand your question :)
One possible solution to your problem might be to use a concurrent queue: https://msdn.microsoft.com/de-de/library/dd267265(v=vs.110).aspx
For example, you create an enum to track your state and initialise the queue:
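private readonly ConcurrentQueue<Action> _clientActions = new ConcurrentQueue<Action>();
private enum Statuskatalog
{
    Idle,
    Busy
};
// Current state, read when enqueuing to decide whether the timer must be restarted
private Statuskatalog _status = Statuskatalog.Idle;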
Create a timer and a timer callback function:
Timer _taskTimer = new Timer(ProcessPendingTasks, null, 100, 333);
private void ProcessPendingTasks(object x)
{
    _status = Statuskatalog.Busy;
    _taskTimer.Change(Timeout.Infinite, Timeout.Infinite);
    Action currentTask;
    while (_clientActions.TryDequeue(out currentTask))
    {
        var task = new Task(currentTask);
        task.Start();
        task.Wait();
    }
    _status = Statuskatalog.Idle;
}
Now you only have to add your tasks as delegates to the queue:
_clientActions.Enqueue(delegate { **Your task** });
if (_status == Statuskatalog.Idle) _taskTimer.Change(0, 333);
On this basis, you can implement the special requirements you were asking about.
I hope this is what you were looking for.
I am trying to understand how Parallel.Invoke creates and reuses threads.
I ran the following example code (from MSDN, https://msdn.microsoft.com/en-us/library/dd642243(v=vs.110).aspx):
using System;
using System.Threading;
using System.Threading.Tasks;
class ThreadLocalDemo
{
    static void Main()
    {
        // Thread-Local variable that yields a name for a thread
        ThreadLocal<string> ThreadName = new ThreadLocal<string>(() =>
        {
            return "Thread" + Thread.CurrentThread.ManagedThreadId;
        });
        // Action that prints out ThreadName for the current thread
        Action action = () =>
        {
            // If ThreadName.IsValueCreated is true, it means that we are not the
            // first action to run on this thread.
            bool repeat = ThreadName.IsValueCreated;
            Console.WriteLine("ThreadName = {0} {1}", ThreadName.Value, repeat ? "(repeat)" : "");
        };
        // Launch eight of them. On 4 cores or less, you should see some repeat ThreadNames
        Parallel.Invoke(action, action, action, action, action, action, action, action);
        // Dispose when you are done
        ThreadName.Dispose();
    }
}
As I understand it, Parallel.Invoke tries to create 8 threads here - one for each action. So it creates the first thread, runs the first action, and by that gives a ThreadName to the thread. Then it creates the next thread (which gets a different ThreadName) and so on.
If it cannot create a new thread, it will reuse one of the threads created before. In this case, the value of repeat will be true and we can see this in the console output.
Is this correct until here?
The second-last comment ("Launch eight of them. On 4 cores or less, you should see some repeat ThreadNames") implies that the threads created by Invoke correspond to the available cpu threads of the processor: on 4 cores we have 8 cpu threads, at least one is busy (running the operating system and stuff), so Invoke can only use 7 different threads, so we must get at least one "repeat".
Is my interpretation of this comment correct?
I ran this code on my PC which has an Intel® Core™ i7-2860QM processor (i.e. 4 cores, 8 cpu threads). I expected to get at least one "repeat", but I didn't. When I changed the Invoke to take 10 instead of 8 actions, I got this output:
ThreadName = Thread6
ThreadName = Thread8
ThreadName = Thread6 (repeat)
ThreadName = Thread5
ThreadName = Thread3
ThreadName = Thread1
ThreadName = Thread10
ThreadName = Thread7
ThreadName = Thread4
ThreadName = Thread9
So I have at least 9 different threads in the console application. This contradicts the fact that my processor only has 8 threads.
So I guess some of my reasoning from above is wrong. Does Parallel.Invoke work differently than what I described above? If yes, how?
If you pass fewer than 10 items to Parallel.Invoke and you don't specify MaxDegreeOfParallelism in the options (which is your case), it will just run them all in parallel on the thread pool scheduler, using roughly the following code:
var actions = new [] { action, action, action, action, action, action, action, action };
var tasks = new Task[actions.Length];
for (int index = 1; index < tasks.Length; ++index)
    tasks[index] = Task.Factory.StartNew(actions[index]);
tasks[0] = new Task(actions[0]);
tasks[0].RunSynchronously();
Task.WaitAll(tasks);
So it is just a regular Task.Factory.StartNew. If you look at the maximum number of threads in the thread pool:
int th, io;
ThreadPool.GetMaxThreads(out th, out io);
Console.WriteLine(th);
You will see some big number, like 32767. So the number of threads on which Parallel.Invoke executes (in your case) is not limited to the number of CPU cores at all. Even on a 1-core CPU it might run 8 threads concurrently.
You might then wonder why some threads are reused at all. When work is done on a thread pool thread, that thread is returned to the pool and is ready to accept new work. The actions in your example do basically no work and complete very quickly. So sometimes the first thread started via Task.Factory.StartNew has already completed your action and been returned to the pool before all subsequent threads have started, and that thread is reused.
By the way, you can see (repeat) in your example with 8 actions, and even with 7 if you try hard enough, on an 8-core (16 logical cores) processor.
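For contrast with the default behaviour described above, the following sketch shows what changes when MaxDegreeOfParallelism is specified. It passes a ParallelOptions instance to Parallel.Invoke and records how many actions were observed running at the same time. The class and method names, the 50 ms sleep and the limit of 2 are assumptions chosen for the demonstration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class InvokeLimitDemo
{
    // Runs `count` short actions and returns the highest number that were
    // observed running at the same time.
    public static int MaxObservedConcurrency(int count, int maxDegree)
    {
        int current = 0, max = 0;
        object gate = new object();

        Action action = () =>
        {
            lock (gate) { current++; if (current > max) max = current; }
            Thread.Sleep(50);                      // simulate some work
            lock (gate) { current--; }
        };

        var actions = new Action[count];
        for (int i = 0; i < count; i++) actions[i] = action;

        // Cap the number of actions that may run concurrently.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        Parallel.Invoke(options, actions);
        return max;
    }

    static void Main()
    {
        Console.WriteLine(MaxObservedConcurrency(8, 2));
    }
}
```

The cap applies to how many actions run at once, not to which thread pool threads end up executing them, so thread reuse can still vary from run to run.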
UPDATE to answer your comment: the thread pool scheduler will not necessarily create new threads immediately. The thread pool has a minimum and a maximum number of threads. I showed how to see the maximum above. To see the minimum:
int th, io;
ThreadPool.GetMinThreads(out th, out io);
This number will usually be equal to the number of cores (for example, 8). When you request a new action to be performed on a thread pool thread and the number of threads in the pool is less than the minimum, a new thread is created immediately. However, if the number of threads has already reached the minimum, a certain delay is introduced before a new thread is created (unfortunately I don't remember exactly how long, about 500ms).
I highly doubt that the statement you added in your comment takes 2-3 seconds to execute. For me it takes 0.3 seconds at most. So after the first 8 threads are created by the thread pool, there is that 500ms delay before the 9th is created. During that delay, some (or all) of the first 8 threads complete their job and become available for new work, so there is no need to create a new thread and they can be reused.
To verify this, introduce a longer delay:
// Requires "using System.Linq;" (for Enumerable) in addition to the earlier usings
static void Main()
{
    // Thread-Local variable that yields a name for a thread
    ThreadLocal<string> ThreadName = new ThreadLocal<string>(() =>
    {
        return "Thread" + Thread.CurrentThread.ManagedThreadId;
    });
    // Action that prints out ThreadName for the current thread
    Action action = () =>
    {
        // If ThreadName.IsValueCreated is true, it means that we are not the
        // first action to run on this thread.
        bool repeat = ThreadName.IsValueCreated;
        Console.WriteLine("ThreadName = {0} {1}", ThreadName.Value, repeat ? "(repeat)" : "");
        Thread.Sleep(1000000);
    };
    int th, io;
    ThreadPool.GetMinThreads(out th, out io);
    Console.WriteLine("cpu:" + Environment.ProcessorCount);
    Console.WriteLine(th);
    Parallel.Invoke(Enumerable.Repeat(action, 100).ToArray());
    // Dispose when you are done
    ThreadName.Dispose();
    Console.ReadKey();
}
You will see that the thread pool now has to create a new thread every time (many more than there are cores), because it cannot reuse previous threads while they are busy.
You can also increase the minimum number of threads in the thread pool, like this:
int th, io;
ThreadPool.GetMinThreads(out th, out io);
ThreadPool.SetMinThreads(100, io);
This removes the delay (until 100 threads have been created), and you will notice the difference in the example above.
Behind the scenes, threads are organized and owned by the task scheduler. The primary purpose of the task scheduler is to keep all CPU cores busy with useful work as much as possible.
Under the hood, the scheduler uses the thread pool, and the size of the thread pool is the way to fine-tune how much useful work is executed on the CPU cores.
Now this requires some analysis. For instance, thread switching costs CPU cycles and is not useful work. On the other hand, when one thread executes one task on a core, all other tasks are stalled and make no progress on that core. I believe that is the main reason why the scheduler usually starts two threads per core, so that at least some movement is visible when one task takes longer to complete (like several seconds).
There are corollaries to this basic mechanism. When some tasks take a long time to complete, the scheduler starts new threads to compensate. That means that a long-running task now has to compete for the core with short-running tasks. In that way, short tasks are completed one after another, and the long task slowly progresses to completion as well.
The bottom line is that your observations about threads are generally correct, but not entirely true in specific situations. In a concrete execution of a number of tasks, the scheduler might choose to raise more threads or to keep going with the default. That is why you will sometimes notice that the number of threads differs.
Remember the goal of the game: utilize the CPU cores with useful work as much as possible, while at the same time keeping all tasks moving, so that the application doesn't look frozen. Historically, people tried to reach these goals with many different techniques. Analysis showed that many of those techniques were applied randomly and didn't really increase CPU utilization. That analysis has led to the introduction of task schedulers in .NET, so that the fine-tuning can be coded once and done well.
So I have at least 9 different threads in the console application. This contradicts the fact that my processor only has 8 threads.
A thread is a very much overloaded term. It can mean, at the very least: (1) something you sew with, (2) a bunch of code with associated state, that is represented by an OS handle, and (3) an execution pipeline of a CPU. The Thread.CurrentThread refers to (2), the "processor thread" that you mentioned refers to (3).
The existence of a (2)-thread is not predicated on the existence of (3)-thread, and the number of (2)-threads that exist on any particular system is pretty much limited by available memory and OS design. The existence of (2)-thread doesn't imply execution of (2)-thread at any given time (unless you use APIs that guarantee that).
Furthermore, if a (2)-thread executes at some point - implying a temporary 1:1 binding between (2)-thread and (3)-thread, there is no implication that the thread will continue executing in general, and of course neither is there an implication that the thread will continue executing on the same (3)-thread if it continues executing at all.
So, even if you have "caught" the execution of a (2)-thread on a (3)-thread by some side effect, e.g. console output, as you did, that doesn't necessarily imply anything about any other (2)-threads and (3)-threads at that point.
On to your code:
// If ThreadName.IsValueCreated is true, it means that we are not the
// first action to run on this thread. <-- this refers to (2)-thread, NOT (3)-thread.
Parallel.Invoke is not precluded from (in terms of specifications) creating as many new (2)-threads as there are arguments passed to it. The actual number of (2)-threads created may be all the way from zero to a hero, since to call Parallel.Invoke there must be an existing (2)-thread with some code that calls this API. So, no new (2)-threads need to be created at all, for example. Whether the (2)-threads created by Parallel.Invoke execute on any particular number of (3)-threads concurrently is beyond your control either.
So that explains the behavior you see. You conflated (2)-threads with (3)-threads, and assumed that Parallel.Invoke does something specific it in fact is not guaranteed to do. Citing documentation:
No guarantees are made about the order in which the operations execute or whether they execute in parallel.
This implies that Invoke is free to run the actions on dedicated (2)-threads if it so wishes. And that is what you observed.
I have been trying to figure out how to solve a requirement I have, but for the life of me I just can't come up with a solution.
I have a database of items which stores them as a kind of queue.
(The database has already been implemented and other processes will be adding items to this queue.)
The items require a lot of work/time to "process" so I need to be able to:
Constantly de-queue items from the database.
For each item, run a new thread, process the item and then return true/false depending on whether it was successfully processed. (This will be used to decide whether to re-add it to the database queue.)
But only do this while the current number of active threads (one per item being processed) is less than a maximum-number-of-threads parameter.
Once the maximum number of threads has been reached I need to stop de-queuing items from the database until the current number of threads is less than the maximum number of threads.
At which point it needs to continue de-queuing items.
It feels like this should be something I can come up with but it is just not coming to me.
To clarify: I only need to implement the threading. The database has already been implemented.
One really easy way to do this is with a Semaphore. You have one thread that dequeues items and creates threads to process them. For example:
const int MaxThreads = 4;
Semaphore sem = new Semaphore(MaxThreads, MaxThreads);
while (Queue.HasItems())
{
    sem.WaitOne();
    var item = Queue.Dequeue();
    ThreadPool.QueueUserWorkItem(ProcessItem, item); // see below
}
// When the queue is empty, you have to wait for all processing
// threads to complete.
// If you can acquire the semaphore MaxThreads times, all workers are done
int count = 0;
while (count < MaxThreads)
{
    sem.WaitOne();
    ++count;
}
// the code to process an item
void ProcessItem(object item)
{
    try
    {
        // cast the item to whatever type you need,
        // and process it.
    }
    finally
    {
        // when done processing, release the semaphore
        // (in a finally block, so a failed item cannot leak a slot)
        sem.Release();
    }
}
The above technique works quite well. It's simple to code, easy to understand, and very effective.
One change is that you might want to use the Task API rather than ThreadPool.QueueUserWorkItem. Task gives you more control over the asynchronous processing, including cancellation. I used QueueUserWorkItem in my example because I'm more familiar with it. I would use Task in a production program.
Although this does use N+1 threads (where N is the number of items you want processed concurrently), that extra thread isn't often doing anything. The only time it's running is when it's assigning work to worker threads. Otherwise, it's doing a non-busy wait on the semaphore.
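As a rough illustration of the Task-based variant mentioned above, here is a minimal sketch that keeps the same semaphore idea but uses SemaphoreSlim and Task.Run. The ConcurrentQueue<int> is only a hypothetical stand-in for the database queue, and ThrottledProcessor, ProcessAllAsync and the processItem callback are made-up names, not part of any existing API. The method returns how many items reported success, which is where the re-queue decision from the question would hook in.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledProcessor
{
    // Processes every queued item with at most maxConcurrent workers and
    // returns the number of items for which processItem returned true.
    public static async Task<int> ProcessAllAsync(
        ConcurrentQueue<int> queue, int maxConcurrent, Func<int, bool> processItem)
    {
        var slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        var running = new List<Task<bool>>();

        while (true)
        {
            await slots.WaitAsync();               // wait for a free slot first
            if (!queue.TryDequeue(out int item))
            {
                slots.Release();                   // nothing left to hand out
                break;
            }
            running.Add(Task.Run(() =>
            {
                try { return processItem(item); }
                finally { slots.Release(); }       // free the slot even on failure
            }));
        }

        bool[] results = await Task.WhenAll(running);
        int succeeded = 0;
        foreach (bool ok in results) if (ok) succeeded++;
        return succeeded;
    }

    static async Task Main()
    {
        var queue = new ConcurrentQueue<int>();
        for (int i = 0; i < 10; i++) queue.Enqueue(i);

        int succeeded = await ProcessAllAsync(queue, 4, item =>
        {
            Thread.Sleep(100);                     // simulate slow processing
            return item % 2 == 0;                  // pretend odd items fail
        });
        Console.WriteLine("Succeeded: " + succeeded);
    }
}
```

Unlike the loop above, this sketch stops when the queue is empty. A long-running service would instead poll the database again after a pause.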
Do you just not know where to start?
Consider a thread pool with a max number of threads. http://msdn.microsoft.com/en-us/library/y5htx827.aspx
Consider spinning up your max number of threads immediately and monitoring the DB. http://msdn.microsoft.com/en-us/library/system.threading.threadpool.queueuserworkitem.aspx is convenient.
Remember that you can't guarantee your process will be ended safely...crashes happen. Consider logging of processing state.
Remember that your select and remove-from-queue operations should be atomic.
Ok, so the architecture of the solution is going to depend on one thing: does the processing time per queue item vary according to the item's data?
If not then you can have something that merely round-robins between the processing threads. This will be fairly simple to implement.
If the processing time does vary, then you're going to need something with more of a 'next available' feel to it, so that whichever of your threads happens to be free first is given the job of processing the data item.
Having worked that out you're then going to have the usual run around with how to synchronise between a queue reader and the processing threads. The difference between 'next-available' and 'round-robin' is how you do that synchronisation.
I'm not overly familiar with C#, but I've heard tell of a beast called a background worker. That is likely to be an acceptable means of bringing this about.
For round robin, just start up a background worker per queue item, storing the workers' references in an array. Limit yourself to, say, 16 in progress background workers. The idea is that having started 16 you would then wait for the first to complete before starting the 17th, and so on. I believe that background workers actually run as jobs on the thread pool, so that will automatically limit the number of threads that are actually running at any one time to something appropriate for the underlying hardware. To wait for a background worker see this. Having waited for a background worker to complete you'd then handle its result and start another up.
For the next-available approach it's not so different. Instead of waiting for the first to complete, you would use WaitAny() to wait for any of the workers to complete. You handle the return from whichever one completed, and then start another one up and go back to WaitAny().
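A minimal sketch of that next-available loop is shown here, using tasks and Task.WhenAny (the awaitable counterpart of WaitAny) rather than background workers. NextAvailableDemo, RunAsync and the process callback are hypothetical names, and the items are plain integers only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class NextAvailableDemo
{
    // Keeps at most maxInFlight items being processed at once. Results are
    // collected in completion order, which may differ from the input order.
    public static async Task<List<int>> RunAsync(
        IEnumerable<int> items, int maxInFlight, Func<int, int> process)
    {
        var inFlight = new List<Task<int>>();
        var results = new List<int>();

        foreach (int item in items)
        {
            if (inFlight.Count == maxInFlight)
            {
                // Wait for whichever worker finishes first, then reuse its slot.
                Task<int> done = await Task.WhenAny(inFlight);
                inFlight.Remove(done);
                results.Add(await done);
            }
            inFlight.Add(Task.Run(() => process(item)));
        }

        // Drain the workers that are still running.
        while (inFlight.Count > 0)
        {
            Task<int> done = await Task.WhenAny(inFlight);
            inFlight.Remove(done);
            results.Add(await done);
        }
        return results;
    }

    static async Task Main()
    {
        List<int> results = await RunAsync(
            new[] { 5, 1, 4, 2, 3 }, 2,
            n => { Thread.Sleep(n * 20); return n * n; });
        Console.WriteLine(string.Join(", ", results));
    }
}
```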
The general philosophy of both approaches is to keep a number of threads on the boil all the time. A feature of the next-available approach is that the order in which you emit the results is not necessarily the same as the order of the input items. If that matters, then the round-robin approach with more background workers than CPU cores will be reasonably efficient (the thread pool will just start commissioned but not yet running workers anyway). However, the latency will vary with the processing time.
BTW 16 is an arbitrary number chosen on the basis of how many cores you think will be on the PC running the software. More cores, bigger number.
Of course, in the seemingly restless and ever changing world of .NET there may now be a better way of doing this.
Good luck!
In the console, because the threads sleep for random times, the order in which the threads report varies:
3,2,1 or 1,2,3 or ...
How can I get a fixed order?
And why does setting the priority have no effect on the code?
// ThreadTester.cs
// Multiple threads printing at different intervals.
using System;
using System.Threading;
namespace threadTester
{
    // class ThreadTester demonstrates basic threading concepts
    class ThreadTester
    {
        static void Main(string[] args)
        {
            // Create and name each thread. Use MessagePrinter's
            // Print method as argument to ThreadStart delegate.
            MessagePrinter printer1 = new MessagePrinter();
            Thread thread1 =
                new Thread(new ThreadStart(printer1.Print));
            thread1.Name = "thread1";
            MessagePrinter printer2 = new MessagePrinter();
            Thread thread2 =
                new Thread(new ThreadStart(printer2.Print));
            thread2.Name = "thread2";
            MessagePrinter printer3 = new MessagePrinter();
            Thread thread3 =
                new Thread(new ThreadStart(printer3.Print));
            thread3.Name = "thread3";
            Console.WriteLine("Starting threads");
            // call each thread's Start method to place each
            // thread in Started state
            thread1.Priority = ThreadPriority.Lowest;
            thread2.Priority = ThreadPriority.Normal;
            thread3.Priority = ThreadPriority.Highest;
            thread1.Start();
            thread2.Start();
            thread3.Start();
            Console.WriteLine("Threads started\n");
            Console.ReadLine();
        } // end method Main
    } // end class ThreadTester
    // Print method of this class used to control threads
    class MessagePrinter
    {
        private int sleepTime;
        private static Random random = new Random();
        // constructor to initialize a MessagePrinter object
        public MessagePrinter()
        {
            // pick random sleep time between 0 and 5 seconds
            sleepTime = random.Next(5001);
        }
        // method Print controls thread that prints messages
        public void Print()
        {
            // obtain reference to currently executing thread
            Thread current = Thread.CurrentThread;
            // put thread to sleep for sleepTime amount of time
            Console.WriteLine(
                current.Name + " going to sleep for " + sleepTime);
            Thread.Sleep(sleepTime);
            // print thread name
            Console.WriteLine(current.Name + " done sleeping");
        } // end method Print
    } // end class MessagePrinter
}
You use threads precisely because you do not care about having things happen in a particular order, but want either:
At the same time, if there are enough cores to allow them to happen together.
With some making progress while others are waiting for something.
Interleaved with paying attention to I/O or user-input, so as to continue being responsive.
In each of these cases, you just don't care that you don't know just which bit of what will happen when.
However:
You may still care about the order of certain sequences. In the simplest case, you just have these things happen in sequence within the same thread, while other things happen in other threads. More complicated cases can be served by chaining tasks together.
You may want the results from different threads to finally be put into a different order. The simplest approach is to put them all into order after they've all finished, though you can also sort results as they come (tricky though).
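As a small illustration of putting results into a fixed order after the work has run concurrently, the sketch below starts several tasks that finish in a different order from the one they were started in and collects their results with Task.WhenAll. The returned array follows the order of the tasks passed in, not the order of completion. OrderedResultsDemo, RunAsync and the delays are made up for the example.

```csharp
using System;
using System.Threading.Tasks;

static class OrderedResultsDemo
{
    // Starts one task per delay; task i finishes after delaysMs[i] milliseconds.
    public static async Task<string[]> RunAsync(int[] delaysMs)
    {
        var tasks = new Task<string>[delaysMs.Length];
        for (int i = 0; i < delaysMs.Length; i++)
        {
            int index = i;                          // capture a per-iteration copy
            tasks[i] = Task.Run(async () =>
            {
                await Task.Delay(delaysMs[index]);  // simulated work
                return "result " + index;
            });
        }
        // Results come back in the order of the tasks array.
        return await Task.WhenAll(tasks);
    }

    static async Task Main()
    {
        // The last task finishes first, yet the results stay in input order.
        string[] results = await RunAsync(new[] { 60, 30, 0 });
        Console.WriteLine(string.Join(", ", results));
    }
}
```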
For ideal performance, there should be one thread running on each core (or possibly two on a hyperthreaded core, but that has further complications) at all times. Let's say you have a machine with 4 cores and 8 tasks you need done.
If the tasks involve a lot of waiting on I/O, then four will start, each will reach a point where it's waiting on that I/O, and allow one of the other tasks to make some progress. Chances are that even with the number of tasks being twice the number of cores, it'll still end up with plenty of idle time. If each task was going to take 20 seconds, then doing them on different threads will probably have them all done in just a little over 20 seconds, since all of them were spending most of their 20 seconds waiting on something else.
If you are doing tasks that keep the CPU busy all the time (not much waiting for memory and certainly not for I/O), then you will be able to have four such tasks going at a time, while the others are waiting for them to either finish or give up their slice of time. Here, if each took 20 seconds, the best you could hope for is a total time of about 40 seconds (and that's assuming that no other thread from any process on the system wants the CPU, that you've a perfect lack of overhead in setting up the threads, etc.).
In cases where there is more work to do (active work to do, rather than waiting for I/O to complete, another thread to release a lock, etc.) than cores, the OS's scheduler will swap around between the different threads that want to be active. The exact details differ from OS to OS: different Windows versions (including some important differences between desktop and server setups), different Linux versions (with some particularly big changes from 2.4 to 2.6), the different Unixes and so on all have different strategies.
One thing they all have in common is the common goal of making sure stuff gets done.
Thread priorities and process priorities are ways to influence this scheduling. With Windows, whenever there's more threads waiting to work than cores to work, those of the highest priority get given CPU time in a round-robin fashion. Should there be no threads of that priority, then those of the next lowest are given CPU time, then the next and so on.
This is a great way to make things grind to a halt. It can lead to complications where a thread that was given high priority (presumably because its work is considered particularly crucial) is waiting on a thread given low priority (presumably because its work is considered less important and one wants it to always cede time to the others), and the low-priority thread keeps not being given CPU time, because there are always more threads of higher priority than available cores. Hence the supposedly high-priority thread gets no CPU time at all.
To fix this situation, Windows will occasionally promote threads that haven't run in a long time. This fixes things, but now means you've got the supposedly low-priority threads bursting along as super-high priority, to the detriment not just of the rest of the application but also the rest of the system.
(One of the best things about having a multi-core system, is it means your computing experience is less affected by people who set the priority of threads!)
If you use a debugger to stop a multi-threaded .NET application and examine the threads, you'll probably find that all of them are at normal priority except for one at highest. The one at highest will be the finalizer thread, and its running at highest priority is one of the reasons it's important that finalizers should not take a long time to execute: having work done at highest priority is a bad thing, and while it is justified in this case, it must end as soon as possible.
At least 95% of all other cases where someone sets the priority of a thread is a logical bug - it'll do nothing most of the time and allows things to get very messed up the rest. They can be used well (or we wouldn't have that ability at all), but should definitely be put in the "advanced techniques" category. (I like to spend my free time experimenting with multi-threading techniques that would count as excessive and premature optimisation most of the time, and I still hardly ever touch priorities).
In your example, priority will have little effect because each thread spends most of its time sleeping, so whichever thread does want CPU time can get it for the few nanoseconds it needs to run. What it could do, though, is make the whole thing needlessly slower should you run it on a machine where the cores are also busy with other normal-priority threads. In this case thread1 wouldn't get any CPU time at first (because there's always a higher-priority thread that wants the CPU); then, after 3 seconds, the scheduler would realise it's been starved for an eternity in terms of CPU speeds (9 billion CPU cycles or so) and give it a burst to highest priority for long enough to let it screw with the timing of vital Windows services! Luckily it then sleeps and does a minute amount of work before finishing, so it does no harm, but if it were doing anything real it could have some really nasty effects on the entire system's performance.
You can't guarantee when Windows will execute a particular thread. You can make suggestions to the OS (i.e. the priority level), but ultimately Windows will decide when, what and where.
If you want to ensure that 1 starts before 2 which in turns starts before 3 you should make thread 1 start thread 2 and thread 2 start thread 3.
Threads are considered lightweight processes, in that they run completely independent of each other. If your task relies heavily on the order in which threads execute, you probably shouldn't be using threads.
Otherwise, you need to look at the thread synchronization constructs that the .NET framework provides.
You cannot synchronize threads like this. If you need the work done in a certain order, don't use separate threads, or use ResetEvents or something similar.
Thread scheduling is never guaranteed. Order is never preserved unless you explicitly force it through your code via locks/etc.
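To show what forcing an order through code can look like, here is a minimal sketch in which three threads still sleep concurrently but report "done sleeping" strictly in the order thread1, thread2, thread3. Each thread waits on a ManualResetEventSlim that its predecessor sets. OrderedThreadsDemo, Run and the sleep times are assumptions for the example, and this is only one of several possible synchronization constructs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class OrderedThreadsDemo
{
    // Starts one thread per entry in sleepMs and returns the messages in the
    // order in which they were reported.
    public static List<string> Run(int[] sleepMs)
    {
        var output = new List<string>();

        // turn[i] is set when thread i is allowed to report.
        var turn = new ManualResetEventSlim[sleepMs.Length + 1];
        for (int i = 0; i < turn.Length; i++) turn[i] = new ManualResetEventSlim(false);
        turn[0].Set();                            // the first thread may report at once

        var threads = new Thread[sleepMs.Length];
        for (int i = 0; i < threads.Length; i++)
        {
            int id = i;                           // capture a per-iteration copy
            threads[i] = new Thread(() =>
            {
                Thread.Sleep(sleepMs[id]);        // the sleeps still overlap
                turn[id].Wait();                  // wait until the predecessor has reported
                lock (output) output.Add("thread" + (id + 1) + " done sleeping");
                turn[id + 1].Set();               // let the next thread report
            });
            threads[i].Start();
        }

        foreach (Thread t in threads) t.Join();
        return output;
    }

    static void Main()
    {
        // thread3 wakes up first but still reports last.
        foreach (string line in Run(new[] { 300, 200, 100 }))
            Console.WriteLine(line);
    }
}
```

The cost is that a thread which finishes early sits idle until its turn comes, which is the price of any imposed ordering.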
I have asked a similar question before here, but after much thought, and implementations from those that answered me, I found that my approach might have been incorrect.
When I implement the solution given to me on this previous question the following test result appeared:
When I 'simulate' multiple tasks running concurrently on multiple threads from the threadpool (by making the threads sleep at random times from 1 to 20 seconds for instance), then the model seems to work fine. I set the system to poll every 1 second to see if it can spawn another thread and all seems fine. Longer running (sleeping) threads would complete later on and threads would start and die all over the place. If I happen to run out of threads (I set it to spawn no more than 10) it would sit and wait for one to become available.
However, when I make the system do actual processing in each thread (which takes anything from 3 seconds upwards), involving reading data, generating XML, saving data, sending emails and the like, the system spawns 1, 2 or 3 threads, does the processing and then just closes the threads (3...2...1...) and reports 0 threads running (I added Console.WriteLine calls everywhere to document the process). It then hangs around at 0 threads, never picking up any more work!
So I decided to state my issue again in the hope that someone has a solution. I have tried various solutions so far:
ThreadPool: There's always the mention that you shouldn't overwork the ThreadPool and that jobs have to be 'quick', but what is the definition of 'quick'? How do I know how big/busy the ThreadPool is?
Threads: It's always stated that threads are expensive and that you have to handle starting and ending them yourself, but how do I limit them? I have tried semaphores, 'lock' objects and public variables, but to no avail.
So here is what I would like to accomplish:
1. I have the same job that needs to run at regular intervals, e.g. like Gmail checking its server for new email for you every 5 seconds.
2. If there is work to be done (e.g. there are new emails to be delivered to your inbox), then spawn an async thread and make it start the work. This work will typically take longer than the interval stated in (1), hence the async thread. If an interval passes and the system checks again and sees that there is more work, it will spawn another thread and make it start the work.
3. As in my example, all the jobs are the same kind of job (check for new mail) and are totally independent of each other; they do not influence each other. If one of them fails, the rest can continue working with no issue.
4. I need a limit on the maximum number of concurrent threads I can have. If I pick '10', then the system should start checking for jobs as in (1) and keep on spawning threads as in (2) until it reaches 10 threads. All new attempts to spawn a thread on an interval should just fail (do nothing) until a thread is released again. Here I suppose the choice will be: (a) when a thread is released, there will already be some work queued, waiting to be given to the newly free thread, or (b) on the next interval, check whether there's new work and assign it to the newly free thread.
5. If there is no work, then the system should typically sit and wait with no threads; in essence, the only thing that should be running is some sort of timer.
I currently use the sample in the previous question to do the following:
I start a timer, that ticks every 1 sec
On every tick I call 'ThreadPool.QueueUserWorkItem(new WaitCallback(DoWork))'
In DoWork I instantiate a class and call various methods that do some work
...but this leads to what I mentioned before, only 3 threads that die off and then nothing.
I was thinking of doing the following:
Set the ThreadPool to 10 threads
Start a timer and in each tick call 'ThreadPool.QueueUserWorkItem', and just keep on doing this, hoping that the ThreadPool will handle everything else. Isn't this what the ThreadPool is supposed to do?
Any help will be fantastic! (Sorry for the involved explanation!)
Try to have a look at the Semaphore class. You can use that to set a limit to how many threads can concurrently access a particular resource (and when I say resource, it can be anything).
Ok, edited for details:
In your class managing the threads, you create:
Semaphore concurrentThreadsEnforcer = new Semaphore(value1, value2);
Then, each thread you start will call:
concurrentThreadsEnforcer.WaitOne();
That will either take one slot from the semaphore and give it to the new thread, or block the new thread until a slot becomes available.
Whenever your new thread finishes its work, he (I like personalizing) MUST call, for obvious reasons:
concurrentThreadsEnforcer.Release();
Now, regarding the constructor, the second parameter (maximumCount) is fairly simple: it states how many threads can hold the semaphore at any given time.
The first one (initialCount) is a bit trickier: it is the number of slots that are free when the semaphore is created. The difference between the second parameter and the first one is the number of slots reserved for the calling thread, as if the thread that created the semaphore had already called WaitOne that many times. Your newly spawned threads only have access to the slots counted by the first parameter until the creating thread calls Release for the reserved ones.
In your case, for 10 max threads, you would use:
... = new Semaphore(10, 10);
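To make the two constructor arguments a bit more concrete, here is a minimal sketch. The class and method names (SemaphoreCountsDemo, CountFreeSlots) are made up for illustration; only Semaphore, WaitOne(int) and Release(int) are real framework calls. It counts how many slots can be taken without blocking right after construction:

```csharp
using System;
using System.Threading;

public static class SemaphoreCountsDemo
{
    // Hypothetical helper: takes slots until none are free, then returns them.
    public static int CountFreeSlots(Semaphore semaphore)
    {
        int taken = 0;
        while (semaphore.WaitOne(0)) // timeout 0: try once, never block
        {
            taken++;
        }
        if (taken > 0)
        {
            semaphore.Release(taken); // hand every slot back
        }
        return taken;
    }

    public static void Main()
    {
        using (var full = new Semaphore(10, 10))   // all 10 slots free at start
        using (var partial = new Semaphore(3, 10)) // only 3 of 10 free at start
        {
            Console.WriteLine(CountFreeSlots(full));    // prints 10
            Console.WriteLine(CountFreeSlots(partial)); // prints 3
        }
    }
}
```

With Semaphore(3, 10) the other 7 slots stay unavailable until some thread calls Release for them, which is why Semaphore(10, 10) is the natural choice for a plain "max 10 workers" limit.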
Since I posted a story anyway, let me give more details.
The way I would do it in the new threads is like this:
bool acquired = false;
try
{
    // Blocks until a slot is free, then takes it.
    acquired = concurrentThreadsEnforcer.WaitOne();
    // Do some work here
} // Optional catch statements
finally
{
    // Only give the slot back if we actually took one.
    if (acquired)
        concurrentThreadsEnforcer.Release();
}
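The question also asks that a tick should simply do nothing when all 10 threads are busy. One way that could be sketched, assuming the timer calls a handler like the hypothetical OnTick below, is to take the slot on the timer side with WaitOne(0), which returns false immediately instead of blocking. ThrottledPoller, OnTick and MaxConcurrentJobs are illustrative names, not part of the original answer:

```csharp
using System;
using System.Threading;

public static class ThrottledPoller
{
    private const int MaxConcurrentJobs = 10; // assumed limit from the question
    private static readonly Semaphore Slots =
        new Semaphore(MaxConcurrentJobs, MaxConcurrentJobs);

    // Hypothetical tick handler: returns true if the job was started,
    // false if every slot was busy and the tick was skipped.
    public static bool OnTick(Action work)
    {
        if (!Slots.WaitOne(0)) // try once, do not block the timer
        {
            return false;
        }
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try { work(); }
            finally { Slots.Release(); } // always free the slot, even on failure
        });
        return true;
    }

    public static void Main()
    {
        int started = 0;
        for (int tick = 0; tick < 12; tick++)
        {
            // Simulated long-running job; real work would go here.
            if (OnTick(() => Thread.Sleep(500))) started++;
        }
        Console.WriteLine("Started " + started + " of 12 ticks");
    }
}
```

Taking the slot before queueing means the pool never accumulates a backlog of blocked work items, which corresponds to option (b) in the question: skipped work is simply picked up on a later tick.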
I would use a combination of BlockingCollection and Parallel.ForEach.
Something like this:
// Queue of pending jobs: the timer adds to it, RunJobs drains it.
private BlockingCollection<Job> jobs = new BlockingCollection<Job>();
private Task jobprocessor;

public void StartWork() {
    timer.Start(); // timer is assumed to call TimerTick on every interval
    jobprocessor = Task.Factory.StartNew(RunJobs);
}

public void EndWork() {
    timer.Stop();
    jobs.CompleteAdding(); // lets the consuming loop end once the queue is empty
    jobprocessor.Wait();
}

public void TimerTick() {
    var job = new Job();
    if (job.NeedsMoreWork())
        jobs.Add(job);
}

public void RunJobs() {
    var options = new ParallelOptions { MaxDegreeOfParallelism = 10 };
    // GetConsumingPartitioner is an extension method from the
    // ParallelExtensionsExtras samples, not part of the framework itself.
    Parallel.ForEach(jobs.GetConsumingPartitioner(), options,
        job => job.DoSomething());
}
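Note that GetConsumingPartitioner does not ship with the framework; it comes from the ParallelExtensionsExtras samples. If pulling that in is not an option, a rough equivalent can be sketched with only built-in calls: a fixed number of worker tasks that each loop over GetConsumingEnumerable. This swaps Parallel.ForEach for plain tasks, and BoundedJobRunner and Drain are made-up names; jobs are modelled as Action delegates purely for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedJobRunner
{
    // Runs queued jobs on at most maxWorkers tasks. Blocks until
    // CompleteAdding has been called and the queue is empty, then
    // returns the number of jobs that were processed.
    public static int Drain(BlockingCollection<Action> jobs, int maxWorkers)
    {
        int processed = 0;
        Task[] workers = Enumerable.Range(0, maxWorkers)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                // Each worker blocks here while the queue is empty.
                foreach (Action job in jobs.GetConsumingEnumerable())
                {
                    job();
                    Interlocked.Increment(ref processed);
                }
            }, TaskCreationOptions.LongRunning))
            .ToArray();
        Task.WaitAll(workers);
        return processed;
    }

    public static void Main()
    {
        var jobs = new BlockingCollection<Action>();
        for (int i = 0; i < 25; i++)
        {
            jobs.Add(() => Thread.Sleep(10)); // stand-in for real work
        }
        jobs.CompleteAdding(); // no more jobs will arrive
        Console.WriteLine(Drain(jobs, 10)); // prints 25
    }
}
```

In a real service the timer would keep calling Add while the workers run, and CompleteAdding would only be called on shutdown.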
When waiting on multiple threads, can anyone compare the pros and cons of using WaitHandle.WaitAll and Thread.Join?
WaitHandle.WaitAll has a 64 handle limit so that is obviously a huge limitation. On the other hand, it is a convenient way to wait for many signals in only a single call. Thread.Join does not require creating any additional WaitHandle instances. And since it could be called individually on each thread the 64 handle limit does not apply.
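For a side-by-side feel of the two approaches, here is a small sketch. WaitComparison and both method names are invented for the example; the work done in each thread is just a counter increment standing in for real work:

```csharp
using System;
using System.Threading;

public static class WaitComparison
{
    // Thread.Join: one call per thread, no extra handles, no 64 limit.
    public static int WaitWithJoin(int threadCount)
    {
        int done = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() => Interlocked.Increment(ref done));
            threads[i].Start();
        }
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return done;
    }

    // WaitHandle.WaitAll: a single call, but one handle per work item
    // and at most 64 handles.
    public static int WaitWithWaitAll(int handleCount)
    {
        int done = 0;
        var handles = new WaitHandle[handleCount];
        for (int i = 0; i < handleCount; i++)
        {
            var evt = new ManualResetEvent(false); // one event per work item
            handles[i] = evt;
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Interlocked.Increment(ref done);
                evt.Set();
            });
        }
        WaitHandle.WaitAll(handles);
        foreach (WaitHandle h in handles)
        {
            h.Dispose();
        }
        return done;
    }

    public static void Main()
    {
        Console.WriteLine(WaitWithJoin(100));  // prints 100
        Console.WriteLine(WaitWithWaitAll(8)); // prints 8
    }
}
```

Note that the WaitAll variant works with thread pool items, which Join cannot do because there is no Thread object to join on.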
Personally, I have never used WaitHandle.WaitAll. I prefer a more scalable pattern when I want to wait on multiple signals. You can create a counting mechanism that counts up or down, and once a specific value is reached you signal a single shared event. The CountdownEvent class conveniently packages all of this into a single class.
// Start at 1 so the main thread counts as a work item itself.
var finished = new CountdownEvent(1);
for (int i = 0; i < NUM_WORK_ITEMS; i++)
{
    finished.AddCount(); // one more pending work item
    SpawnAsynchronousOperation(
        () =>
        {
            try
            {
                // Place logic to run in parallel here.
            }
            finally
            {
                finished.Signal(); // this work item is done
            }
        });
}
finished.Signal(); // the main thread's own count
finished.Wait();   // blocks until every work item has signalled
Update:
The reason why you want to signal the event from the main thread is subtle. Basically, you want to treat the main thread as if it were just another work item. After all, it is running concurrently along with the other, real work items.
Consider for a moment what might happen if we did not treat the main thread as a work item. It will go through one iteration of the for loop and add a count to our event (via AddCount), indicating that we have one pending work item, right? Let's say SpawnAsynchronousOperation completes and gets the work item queued on another thread. Now, imagine the main thread gets preempted before swinging around to the next iteration of the loop. The thread executing the work item gets its fair share of the CPU, starts humming along and actually completes the work item. The Signal call in the work item runs and decrements our pending work item count to zero, which changes the state of the CountdownEvent to signalled. In the meantime the main thread wakes up, goes through the remaining iterations of the loop and hits the Wait call, but since the event got prematurely signalled it passes right on by even though there are still pending work items.
Again, avoiding this subtle race condition is easy when you treat the main thread as a work item. That is why the CountdownEvent is initialized with one count and the Signal method is called before the Wait.
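For anyone who wants to run the pattern as-is, here is a self-contained sketch with ThreadPool.QueueUserWorkItem standing in for SpawnAsynchronousOperation. That substitution, and the names CountdownDemo and RunAll, are assumptions made for the example:

```csharp
using System;
using System.Threading;

public static class CountdownDemo
{
    // Queues workItems jobs and returns how many had completed by the
    // time Wait returned.
    public static int RunAll(int workItems)
    {
        int completed = 0;
        // Initial count of 1 represents the main thread itself.
        using (var finished = new CountdownEvent(1))
        {
            for (int i = 0; i < workItems; i++)
            {
                finished.AddCount(); // register before the work is queued
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        Interlocked.Increment(ref completed); // stand-in for real work
                    }
                    finally
                    {
                        finished.Signal();
                    }
                });
            }
            finished.Signal(); // main thread is done queueing
            finished.Wait();   // returns only when the count reaches zero
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(RunAll(50)); // prints 50
    }
}
```

Because the count starts at 1, it cannot reach zero while the loop is still adding work, so AddCount never runs against an already-signalled event.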
I like @Brian's answer as a comparison of the two mechanisms.
If you are on .NET 4, it would be worthwhile exploring the Task Parallel Library to achieve task parallelism via System.Threading.Tasks, which lets you manage tasks across multiple threads at a higher level of abstraction. The signalling you asked about in this question to manage thread interactions is hidden or much simplified, and you can concentrate on properly defining what each Task consists of and how to coordinate them.
This may seem off-topic, but as Microsoft themselves say in the MSDN docs:
in the .NET Framework 4, tasks are the preferred API for writing multi-threaded, asynchronous, and parallel code.
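As a rough idea of what the task-based equivalent of "wait for several threads" looks like, here is a minimal sketch using only .NET 4 calls (Task.Factory.StartNew and Task.WaitAll). The class name, method name and the squaring workload are placeholders:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskWaitDemo
{
    // Starts one task per input and waits for all of them with one call.
    public static int[] SquareAll(int[] inputs)
    {
        Task<int>[] tasks = inputs
            .Select(n => Task.Factory.StartNew(() => n * n))
            .ToArray();
        Task.WaitAll(tasks); // no handles to create, no 64 item limit
        return tasks.Select(t => t.Result).ToArray();
    }

    public static void Main()
    {
        int[] squares = SquareAll(new[] { 1, 2, 3, 4 });
        Console.WriteLine(string.Join(",", squares)); // prints 1,4,9,16
    }
}
```

Exceptions thrown inside the tasks surface from Task.WaitAll as an AggregateException, so failures are not silently lost as they can be with raw threads.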
The WaitAll mechanism involves kernel-mode objects. I don't think the same is true for the Join mechanism. I would prefer Join, given the opportunity.
Technically though, the two are not equivalent. IIRC, Join can only operate on one thread at a time. WaitAll can wait for the signalling of multiple kernel objects in a single call.