TaskScheduler for parallel Async Tasks

Feel free to ask for more context.
I have a requirement where tasks can arrive in large numbers, so I want to throttle the queuing of tasks and also control the concurrency. So far I have found that I can create a custom TaskScheduler that limits the concurrency (I read here: http://msdn.microsoft.com/en-us/library/ee789351(v=vs.110).aspx), and when the number of queued tasks exceeds a maximum limit I block task scheduling in the TaskFactory.StartNew method. Here are my questions:
Is there a better or simpler way to do this? Please feel free to suggest one.
Also, I see that the implementation of LimitedConcurrencyLevelTaskScheduler uses the ThreadPool underneath. Can I use something other than the thread pool to schedule tasks? Doesn't extending TaskScheduler give the behavior of executing the tasks in a thread pool, so that we just need to say how to queue and dequeue tasks?
The default TaskScheduler uses the ThreadPool underneath as well. There are two things: MaximumConcurrencyLevel and the thread count in the ThreadPool. So if we set MaximumConcurrencyLevel to some integer, the concurrency level will still depend on the thread pool count. How are the two related in the default implementation? If the concurrency eventually depends on the thread pool count, then what is MaximumConcurrencyLevel used for in the framework?

Have you considered any of the specialized task schedulers from parallel extensions extras?
http://blogs.msdn.com/b/pfxteam/archive/2010/04/09/9990424.aspx
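If a custom TaskScheduler turns out to be more than you need, one simpler way to get both behaviours from the question (block producers when too much is queued, and cap the concurrency) is a bounded BlockingCollection drained by a fixed number of workers. This is only a minimal sketch and not the approach from the linked articles; the class name BoundedWorkQueue and its parameters are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper: a bounded queue drained by a fixed number of workers.
public sealed class BoundedWorkQueue : IDisposable
{
    private readonly BlockingCollection<Action> _queue;
    private readonly Task[] _workers;

    public BoundedWorkQueue(int maxQueued, int maxConcurrency)
    {
        // Add() blocks once maxQueued items are waiting, which throttles producers.
        _queue = new BlockingCollection<Action>(maxQueued);
        _workers = new Task[maxConcurrency];
        for (int i = 0; i < maxConcurrency; i++)
        {
            // Each worker blocks while the queue is empty, so give it its own thread.
            _workers[i] = Task.Factory.StartNew(
                () => { foreach (Action work in _queue.GetConsumingEnumerable()) work(); },
                TaskCreationOptions.LongRunning);
        }
    }

    // Blocks the caller while the queue is full.
    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    // Stops accepting work and waits for everything already queued to finish.
    public void Dispose()
    {
        _queue.CompleteAdding();
        Task.WaitAll(_workers);
        _queue.Dispose();
    }
}
```

Note that an exception thrown by a work item would fault its worker in this sketch, so real code would want a try/catch around work().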

Related

Task.Factory.StartNew or Parallel.ForEach for many long-running tasks? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Parallel.ForEach vs Task.Factory.StartNew
I need to run about 1,000 tasks in a ThreadPool on a nightly basis (the number may grow in the future). Each task is performing a long running operation (reading data from a web service) and is not CPU intensive. Async I/O is not an option for this particular use case.
Given an IList<string> of parameters, I need to DoSomething(string x). I am trying to pick between the following two options:
IList<Task> tasks = new List<Task>();
foreach (var p in parameters)
{
    tasks.Add(Task.Factory.StartNew(() => DoSomething(p), TaskCreationOptions.LongRunning));
}
Task.WaitAll(tasks.ToArray());
OR
Parallel.ForEach(parameters, new ParallelOptions {MaxDegreeOfParallelism = Environment.ProcessorCount*32}, DoSomething);
Which option is better and why?
Note :
The answer should include a comparison between the usage of TaskCreationOptions.LongRunning and MaxDegreeOfParallelism = Environment.ProcessorCount * SomeConstant.

Perhaps you aren't aware of this, but the members in the Parallel class are simply (complicated) wrappers around Task objects. In case you're wondering, the Parallel class creates the Task objects with TaskCreationOptions.None. However, the MaxDegreeOfParallelism would affect those task objects no matter what creation options were passed to the task object's constructor.
TaskCreationOptions.LongRunning gives a "hint" to the underlying TaskScheduler that it might perform better with oversubscription of the threads. Oversubscription is good for threads with high latency, for example I/O, because it will assign more than one thread (yes thread, not task) to a single core so that the core always has something to do, instead of waiting around for an operation to complete while the thread is in a waiting state. On the TaskScheduler that uses the ThreadPool, LongRunning tasks run on their own dedicated thread (the only case where you have a thread per task); otherwise tasks run normally, with scheduling and work stealing (really, what you want here anyway).
MaxDegreeOfParallelism controls the number of concurrent operations run. It's similar to specifying the maximum number of partitions that the data will be split into and processed from. If TaskCreationOptions.LongRunning could be specified, all this would do is limit the number of tasks running at a single time, similar to a TaskScheduler whose maximum concurrency level is set to that value, similar to this example.
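To see that MaxDegreeOfParallelism is an upper bound rather than a target, you could measure the peak number of loop bodies running at once. A rough sketch follows; DegreeOfParallelismDemo and MeasurePeakConcurrency are names invented here, and Thread.Sleep stands in for the blocking web-service call from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class DegreeOfParallelismDemo
{
    // Returns the highest number of loop bodies observed running at the same time.
    public static int MeasurePeakConcurrency(IEnumerable<string> parameters, int maxDegree)
    {
        int running = 0, peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(parameters, options, p =>
        {
            int now = Interlocked.Increment(ref running);

            // Record a new peak with a compare-and-swap retry loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            Thread.Sleep(10); // stand-in for the blocking call (assumption)
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

The returned peak should never exceed maxDegree, but it can be lower, because the scheduler decides how many threads it actually hands to the loop.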
You might want the Parallel.ForEach. However, setting MaxDegreeOfParallelism to such a high number won't actually guarantee that there will be that many threads running at once, since the tasks will still be controlled by the ThreadPoolTaskScheduler. That scheduler will keep the number of threads running at once to the smallest amount possible, which I suppose is the biggest difference between the two methods. You could write (and specify) your own TaskScheduler that would mimic the max degree of parallelism behavior, and have the best of both worlds, but I doubt that's something you're interested in doing.
My guess is that, depending on latency and the number of actual requests you need to do, using tasks will perform better in many(?) cases, though wind up using more memory, while Parallel will be more consistent in resource usage. Of course, async I/O will perform monstrously better than either of these two options, but I understand you can't do that because you're using legacy libraries. So, unfortunately, you'll be stuck with mediocre performance no matter which one of those you choose.
A real solution would be to figure out a way to make async I/O happen; since I don't know the situation, I don't think I can be more helpful than that. With async I/O, your program (read, thread) will continue execution, and the kernel will wait for the I/O operation to complete (this is also known as using I/O completion ports). Because the thread is not in a waiting state, the runtime can do more work on fewer threads, which usually ends up in an optimal relationship between the number of cores and number of threads. Adding more threads, as much as I wish it would, does not equate to better performance (actually, it can often hurt performance, because of things like context switching).
However, this entire answer is useless in determining a final answer for your question, though I hope it will give you some needed direction. You won't know what performs better until you profile it. If you don't try them both (I should clarify that I mean the Task without the LongRunning option, letting the scheduler handle thread switching) and profile them to determine what is best for your particular use case, you're selling yourself short.

Both options are entirely inappropriate for your scenario.
TaskCreationOptions.LongRunning is certainly a better choice for tasks that are not CPU-bound, as the TPL (Parallel classes/extensions) are almost exclusively meant for maximizing the throughput of a CPU-bound operation by running it on multiple cores (not threads).
However, 1000 tasks is an unacceptable number for this. Whether or not they're all running at once isn't exactly the issue; even 100 threads waiting on synchronous I/O is an untenable situation. As one of the comments suggests, your application will be using an enormous amount of memory and end up spending almost all of its time in context-switching. The TPL is not designed for this scale.
If your operations are I/O bound - and if you are using web services, they are - then async I/O is not only the correct solution, it's the only solution. If you have to re-architect some of your code (such as, for example, adding asynchronous methods to major interfaces where there were none originally), do it, because I/O completion ports are the only mechanism in Windows or .NET that can properly support this particular type of concurrency.
I've never heard of a situation where async I/O was somehow "not an option". I cannot even conceive of any valid use case for this constraint. If you are unable to use async I/O then this would indicate a serious design problem that must be fixed, ASAP.

While this is not a direct comparison, I think it may help you. I do something similar to what you describe (in my case I know there is a load-balanced server cluster on the other end serving REST calls). I get good results using Parallel.ForEach to spin up an optimal number of worker threads, provided that I also use the following code to allow more simultaneous connections to the endpoint than the default.
// 'uri' is the Uri of the service endpoint being called.
var servicePoint = System.Net.ServicePointManager.FindServicePoint(uri);
servicePoint.ConnectionLimit = 250;
Note you have to call that once for each unique URL you connect to.

Tasks vs ThreadPool

I have an application in C# with a list of work to do. I'm looking to do as much of that work as possible in parallel. However I need to be able to control the maximum amount of parallel tasks.
From what I understand this is possible with a ThreadPool or with Tasks. Is there a difference in which one I use? My main concern is being able to control how many threads are active at one time.

Please take a look at ParallelOptions.MaxDegreeOfParallelism for Tasks.
I would advise you to use Tasks, because they provide a higher level abstraction than the ThreadPool.
A very good read on the topic can be found here. Really, a must-have book and it's free on top of that :)

In TPL you can use WithDegreeOfParallelism on a ParallelEnumerable, or ParallelOptions.MaxDegreeOfParallelism.
There is also the CountdownEvent, which may be a better option if you are just using custom threads or tasks.
In the ThreadPool, when you use SetMaxThreads it's global for the whole process, so you could potentially be limiting unrelated code unnecessarily.
You cannot set the number of worker threads or the number of I/O completion threads to a number smaller than the number of processors in the computer.
If the common language runtime is hosted, for example by Internet Information Services (IIS) or SQL Server, the host can limit or prevent changes to the thread pool size.
Use caution when changing the maximum number of threads in the thread pool. While your code might benefit, the changes might have an adverse effect on code libraries you use.
Setting the thread pool size too large can cause performance problems. If too many threads are executing at the same time, the task switching overhead becomes a significant factor.
I agree with the other answer that you should use TPL over the ThreadPool as it's a better abstraction of multi-threading, but it's possible to accomplish what you want in both.

In this article on msdn, they explain why they recommend Tasks instead of ThreadPool for Parallelism.

Tasks have a very charming feature to me: you can build chains of tasks, which are executed depending on the result of the preceding task.
A feature I often use is the following: Task A runs in the background to do some long-running work. I chain Task B after it, executing only when Task A has finished regularly, and I configure it to run in the foreground, so I can easily update my controls with the result of the long-running Task A.
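A chain like that could look roughly as follows. This is a sketch under assumptions: ContinuationDemo and RunThenReport are invented names, and the scheduler for the second task is passed in so the same code works with or without a UI.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Runs 'work' in the background (Task A), then runs 'onSuccess' (Task B) on the
    // given scheduler, but only if Task A completed without faulting or cancelling.
    public static Task RunThenReport(Func<int> work, Action<int> onSuccess, TaskScheduler reportOn)
    {
        Task<int> taskA = Task.Factory.StartNew(work);

        return taskA.ContinueWith(
            a => onSuccess(a.Result),
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion,
            reportOn);
    }
}
```

In a WinForms or WPF application you would call this from the UI thread and pass TaskScheduler.FromCurrentSynchronizationContext() as reportOn so that onSuccess can touch controls. If Task A faults, the returned continuation ends up cancelled rather than run.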

You can also create a semaphore to control how many threads can execute at a single time. You can create a new semaphore and in the constructor specify how many simultaneous threads are able to use that semaphore at a single time. Since I don't know how you are going to be using the threads, this would be a good starting point.
MSDN Article on the Semaphore class
-Wesley
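For illustration, a semaphore-based limit might be sketched like this. SemaphoreSlim is used here as the lightweight in-process variant of Semaphore; SemaphoreThrottle and RunAll are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreThrottle
{
    // Runs every action, but lets at most maxConcurrency of them execute at once.
    public static void RunAll(IEnumerable<Action> actions, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency, maxConcurrency))
        {
            Task[] tasks = actions.Select(action => Task.Run(() =>
            {
                gate.Wait();               // blocks while maxConcurrency actions are running
                try { action(); }
                finally { gate.Release(); } // always free the slot, even on failure
            })).ToArray();

            Task.WaitAll(tasks);           // keep the semaphore alive until all work is done
        }
    }
}
```

One caveat of this simple form: tasks waiting on the semaphore still occupy thread-pool threads, so it limits how many actions run, not how many threads are tied up.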

C# - ThreadPool vs Tasks

As some may have seen, .NET 4.0 added a new namespace, System.Threading.Tasks, which is basically what its name says: tasks. I've only been using it for a few days, having used ThreadPool before.
Which one is more efficient and less resource consuming? (Or just better overall?)

The objective of the Tasks namespace is to provide a pluggable architecture to make multi-tasking applications easier to write and more flexible.
The implementation uses a TaskScheduler object to control the handling of tasks. This has abstract and virtual members that you can override to create your own task handling. Members include for instance
protected internal abstract void QueueTask(Task task)
public virtual int MaximumConcurrencyLevel { get; }
There will be a tiny overhead to using the default implementation as there's a wrapper around the .NET threads implementation, but I'd not expect it to be huge.
There is a (draft) implementation of a custom TaskScheduler that implements multiple tasks on a single thread here.
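To give an idea of what overriding those members involves, here is a small sketch of a scheduler that runs every task on one dedicated thread. It is not the draft implementation linked above; SingleThreadTaskScheduler is a name made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example scheduler: executes all queued tasks on a single thread.
public sealed class SingleThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly Thread _thread;

    public SingleThreadTaskScheduler()
    {
        _thread = new Thread(() =>
        {
            // Runs until CompleteAdding is called and the queue is drained.
            foreach (Task task in _tasks.GetConsumingEnumerable())
                TryExecuteTask(task);
        }) { IsBackground = true };
        _thread.Start();
    }

    public override int MaximumConcurrencyLevel
    {
        get { return 1; }
    }

    protected override void QueueTask(Task task)
    {
        _tasks.Add(task);
    }

    // Only allow inline execution when already on the scheduler's own thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return Thread.CurrentThread == _thread && TryExecuteTask(task);
    }

    // Used by debuggers to list pending tasks.
    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return _tasks.ToArray();
    }

    public void Dispose()
    {
        _tasks.CompleteAdding();
        _thread.Join();
        _tasks.Dispose();
    }
}
```

Tasks are sent to it with the StartNew overload that takes a scheduler, e.g. Task.Factory.StartNew(action, CancellationToken.None, TaskCreationOptions.None, scheduler).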

which one is more efficient and less resource consuming?
Irrelevant, there will be very little difference.
(Or just better overall)
The Task class will be the easier one to use, as it offers a very clean interface for starting and joining threads, and transfers exceptions. It also supports a (limited) form of load balancing.

"Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code."
http://msdn.microsoft.com/en-us/library/dd460717.aspx

Thread
The bare-metal thing. You probably don't need to use it; you can probably use a LongRunning Task and benefit from its facilities.
Tasks
Abstraction above the Threads. It uses the thread pool (unless you specify the task as a LongRunning operation, if so, a new thread is created under the hood for you).
Thread Pool
As the name suggests: a pool of threads. The .NET Framework manages a limited number of threads for you. Why? Because opening 100 threads to execute expensive CPU operations on a CPU with just 8 cores is definitely not a good idea. The framework maintains this pool for you, reusing the threads (not creating/killing them for each operation), and executing some of them in parallel in a way that will not overload your CPU.
OK, but when to use each one?
In summary: always use tasks.
A Task is an abstraction, so it is a lot easier to use. I advise you to always try to use Tasks, and if you face some problem that makes you need to handle a thread by yourself (probably 1% of the time), then use threads.
BUT be aware that:
I/O Bound: For I/O-bound operations (database calls, reading/writing files, API calls, etc.) never use normal tasks; use LongRunning tasks, or threads if you need to, but not normal tasks. Otherwise you end up with a thread pool whose few threads are busy waiting while a lot of other tasks wait for their turn in the pool.
CPU Bound: For CPU bound operations just use the normal tasks and be happy.
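You can check the "LongRunning gets its own thread" claim yourself by asking the running task whether it is on a thread-pool thread. A small sketch follows (LongRunningDemo and RanOnThreadPoolThread are invented names); note that what it reports is an implementation detail of the default scheduler, not a documented guarantee.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts a task on the default scheduler with the given options and reports
    // whether its body ran on a thread-pool thread.
    public static bool RanOnThreadPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);

        return task.Result;
    }
}
```

With the default scheduler I would expect TaskCreationOptions.LongRunning to report a non-pool thread and TaskCreationOptions.None to normally report a pool thread.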

Scheduling is an important aspect of parallel tasks.
Unlike threads, new tasks don't necessarily begin executing immediately. Instead, they are placed in a work queue. Tasks run when their associated task scheduler removes them from the queue, usually as cores become available. The task scheduler attempts to optimize overall throughput by controlling the system's degree of concurrency. As long as there are enough tasks and the tasks are sufficiently free of serializing dependencies, the program's performance scales with the number of available cores. In this way, tasks embody the concept of potential parallelism
As I saw on msdn http://msdn.microsoft.com/en-us/library/ff963549.aspx

The difference between ThreadPool and Task is very simple.
To understand tasks, you should first know about the thread pool.
The ThreadPool basically helps to manage and reuse free threads. In other words, a thread pool is a collection of background threads.
A simple definition of a task can be:
A Task asynchronously manages a unit of work. In simple words, a Task doesn’t create new threads. Instead, it efficiently manages the threads of a thread pool. Tasks are executed by a TaskScheduler, which queues tasks onto threads.

Another good point to consider about task is, when you use ThreadPool, you don't have any way to abort or wait on the running threads (unless you do it manually in the method of thread), but using task it is possible. Please correct me if I'm wrong

C# lower thread priority in thread pool

I have several low-importance tasks to be performed when some CPU time is available. I don't want these tasks to run if other, more important tasks are running. I.e. if a normal/high priority task comes in, I want the low-importance tasks to pause until the important task is done.
There is a pretty big number of low-importance tasks to be performed (50 to 1000), so I don't want to create one thread per task. However, I believe that the thread pool does not allow any priority specification, does it?
How would you solve this?

You can new up a Thread and use a Dispatcher to send it tasks of various priorities.
The priorities are a bit UI-centric but that doesn't really matter.

You shouldn't mess with the priority of the regular ThreadPool, since you aren't the only consumer. I suppose the logical approach would be to write your own - perhaps as simple as a producer/consumer queue, using your own Thread(s) as the consumer(s) - setting the thread priority yourself.
.NET 4.0 includes new libraries (the TPL etc) to make all this easier - until then you need additional code to create a custom thread pool or work queue.
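A bare-bones version of that producer/consumer idea, with one dedicated consumer thread whose priority you set yourself, might look like this. It is a sketch only; LowPriorityWorker is a hypothetical name, and thread priority is a hint to the OS scheduler rather than a strict "pause until important work is done".

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: one dedicated low-priority thread consuming a work queue.
public sealed class LowPriorityWorker : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public LowPriorityWorker(ThreadPriority priority = ThreadPriority.BelowNormal)
    {
        // This thread is ours, so changing its priority affects nobody else.
        _thread = new Thread(Consume) { IsBackground = true, Priority = priority };
        _thread.Start();
    }

    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    private void Consume()
    {
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            try { work(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); } // keep the worker alive
        }
    }

    // Finishes the queued work, then stops the thread.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}
```

For more throughput you could start a few such consumer threads over the same queue instead of one.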

When you are using the built-in ThreadPool, all threads execute with the default priority. If you mess with this setting it will be ignored. This is a case where you should roll your own ThreadPool. A few years ago I extended the SmartThreadPool to meet my needs. This may satisfy yours as well.

I'd create a shared Queue of pending task objects, with each object specifying its priority. Then write a dispatcher thread that watches the Queue and launches a new thread for each task, up to some max thread limit, specifying the thread priority as it creates it. It's only a small amount of work to do that, and you can have the dispatcher report activity and even dynamically adjust the number of running threads. That concept has worked very well for me, and can be wrapped in a Windows service to boot if you make your queue a database table.

Design Pattern Alternative to Coroutines

Currently, I have a large number of C# computations (method calls) residing in a queue that will be run sequentially. Each computation will use some high-latency service (network, disk...).
I was going to use Mono coroutines to allow the next computation in the computation queue to continue while a previous computation is waiting for the high latency service to return. However, I prefer to not depend on Mono coroutines.
Is there a design pattern that's implementable in pure C# that will enable me to process additional computations while waiting for high latency services to return?
Thanks
Update:
I need to execute a huge number (>10000) of tasks, and each task will be using some high-latency service. On Windows, you can't create that many threads.
Update:
Basically, I need a design pattern that emulates the advantages (as follows) of tasklets in Stackless Python (http://www.stackless.com/)
Huge # of tasks
If a task blocks the next task in the queue executes
No wasted cpu cycle
Minimal overhead switching between tasks

You can simulate cooperative microthreading using IEnumerable. Unfortunately this won't work with blocking APIs, so you need to find APIs that you can poll, or which have callbacks that you can use for signalling.
Consider a method
IEnumerable Thread ()
{
    //do some stuff
    Foo ();
    //co-operatively yield
    yield return null;
    //do some more stuff
    Bar ();
    //sleep 2 seconds
    yield return TimeSpan.FromSeconds (2);
}
The C# compiler will unwrap this into a state machine - but the appearance is that of a co-operative microthread.
The pattern is quite straightforward. You implement a "scheduler" that keeps a list of all the active IEnumerators. As it cycles through the list, it "runs" each one using MoveNext (). If the value of MoveNext is false, the thread has ended, and the scheduler removes it from the list. If it's true, then the scheduler accesses the Current property to determine the current state of the thread. If it's a TimeSpan, the thread wishes to sleep, and the scheduler moves it onto some queue that can be flushed back into the main list when the sleep timespans have ended.
You can use other return objects to implement other signalling mechanisms. For example, define some kind of WaitHandle. If the thread yields one of these, it can be moved to a waiting queue until the handle is signalled. Or you could support WaitAll by yielding an array of wait handles. You could even implement priorities.
I did a simple implementation of this scheduler in about 150LOC but I haven't got round to blogging the code yet. It was for our PhyreSharp PhyreEngine wrapper (which won't be public), where it seems to work pretty well for controlling a couple of hundred characters in one of our demos. We borrowed the concept from the Unity3D engine -- they have some online docs that explain it from a user point of view.
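For readers who want something concrete, a stripped-down scheduler along the lines described above could look like the sketch below. It is not the PhyreSharp code (which isn't public); MicroThreadScheduler and its members are names invented for this illustration, and it only supports plain yields and TimeSpan sleeps.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical minimal scheduler for iterator-based microthreads.
public sealed class MicroThreadScheduler
{
    private readonly List<IEnumerator> _active = new List<IEnumerator>();
    private readonly List<KeyValuePair<DateTime, IEnumerator>> _sleeping =
        new List<KeyValuePair<DateTime, IEnumerator>>();

    public void Start(IEnumerable microThread)
    {
        _active.Add(microThread.GetEnumerator());
    }

    public void RunToCompletion()
    {
        while (_active.Count > 0 || _sleeping.Count > 0)
        {
            // Move microthreads whose sleep has elapsed back to the active list.
            DateTime now = DateTime.UtcNow;
            for (int i = _sleeping.Count - 1; i >= 0; i--)
            {
                if (_sleeping[i].Key > now) continue;
                _active.Add(_sleeping[i].Value);
                _sleeping.RemoveAt(i);
            }

            // Step each active microthread once.
            for (int i = 0; i < _active.Count; )
            {
                IEnumerator current = _active[i];
                if (!current.MoveNext())
                {
                    _active.RemoveAt(i);   // finished
                }
                else if (current.Current is TimeSpan delay)
                {
                    _active.RemoveAt(i);   // asked to sleep
                    _sleeping.Add(new KeyValuePair<DateTime, IEnumerator>(
                        DateTime.UtcNow + delay, current));
                }
                else
                {
                    i++;                   // plain yield: stays active
                }
            }

            // Everything is asleep: avoid spinning the CPU.
            if (_active.Count == 0 && _sleeping.Count > 0) Thread.Sleep(1);
        }
    }
}
```

Usage would be scheduler.Start(SomeIteratorMethod()) for each microthread, followed by scheduler.RunToCompletion(). Wait handles or priorities would be added as further cases on Current.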

.NET 4.0 comes with extensive support for Task parallelism:
How to: Use Parallel.Invoke to Execute Simple Parallel Tasks
How to: Return a Value from a Task
How to: Chain Multiple Tasks with Continuations

I'd recommend using the Thread Pool to execute multiple tasks from your queue at once in manageable batches using a list of active tasks that feeds off of the task queue.
In this scenario your main worker thread would initially pop N tasks from the queue into the active tasks list to be dispatched to the thread pool (most likely using QueueUserWorkItem), where N represents a manageable amount that won't overload the thread pool, bog your app down with thread scheduling and synchronization costs, or suck up available memory due to the combined I/O memory overhead of each task.
Whenever a task signals completion to the worker thread, you can remove it from the active tasks list and add the next one from your task queue to be executed.
This will allow you to have a rolling set of N tasks from your queue. You can manipulate N to affect the performance characteristics and find what is best in your particular circumstances.
Since you are ultimately bottlenecked by hardware operations (disk I/O and network I/O, CPU) I imagine smaller is better. Two thread pool tasks working on disk I/O most likely won't execute faster than one.
You could also implement flexibility in the size and contents of the active task list by restricting it to a set number of particular type of task. For example if you are running on a machine with 4 cores, you might find that the highest performing configuration is four CPU-bound tasks running concurrently along with one disk-bound task and a network task.
If you already have one task classified as a disk IO task, you may choose to wait until it is complete before adding another disk IO task, and you may choose to schedule a CPU-bound or network-bound task in the meanwhile.
Hope this makes sense!
PS: Do you have any dependencies on the order of tasks?

You should definitely check out the Concurrency and Coordination Runtime. One of their samples describes exactly what you're talking about: you call out to long-latency services, and the CCR efficiently allows some other task to run while you wait. It can handle a huge number of tasks because it doesn't need to spawn a thread for each one, though it will use all your cores if you ask it to.

Isn't this a conventional use of multi-threaded processing?
Have a look at patterns such as Reactor here

Writing it to use Async IO might be sufficient.
This can lead to nasty, hard-to-debug code without strong structure in the design.

You should take a look at this:
http://www.replicator.org/node/80
This should do exactly what you want. It is a hack, though.

Some more information about the "Reactive" pattern (as mentioned by another poster) with respect to an implementation in .NET; aka "Linq to Events"
http://themechanicalbride.blogspot.com/2009/07/introducing-rx-linq-to-events.html
-Oisin

In fact, if you use one thread per task, you will lose the game. Think about why Node.js can support a huge number of connections: it uses a small number of threads with async I/O! The async and await keywords can help with this.
foreach (var task in tasks)
{
    await SendAsync(task.value);
    await ReadAsync();
}
SendAsync() and ReadAsync() are fake functions standing in for async I/O calls.
Task parallelism is also a good choice, but I am not sure which one is faster. You can test both of them in your case.
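If you want many async operations in flight at once instead of one after another, but still with an upper limit, one possible shape is the following. It is a sketch only; AsyncThrottle and RunAsync are made-up names, and the operation delegate stands in for whatever async I/O call you actually have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncThrottle
{
    // Starts 'operation' for every item, with at most maxInFlight running at once,
    // and returns the results in the same order as the input.
    public static async Task<TResult[]> RunAsync<T, TResult>(
        IEnumerable<T> items, Func<T, Task<TResult>> operation, int maxInFlight)
    {
        using (var gate = new SemaphoreSlim(maxInFlight, maxInFlight))
        {
            List<Task<TResult>> pending = items.Select(async item =>
            {
                await gate.WaitAsync();    // waits without blocking a thread
                try { return await operation(item); }
                finally { gate.Release(); }
            }).ToList();                   // ToList forces every operation to be started

            return await Task.WhenAll(pending);
        }
    }
}
```

Because nothing blocks while waiting, this works with a handful of threads even for thousands of items, which is the point made above about Node.js.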

Yes, of course you can. You just need to build a dispatcher mechanism that takes a lambda you provide, puts it into a queue, and calls it back later. All the code I write in Unity uses this approach and I never use coroutines. I wrap methods that use coroutines, such as the WWW stuff, just to get rid of them. In theory, coroutines can be faster because there is less overhead. Practically, they introduce new syntax to the language to do a fairly trivial task, and furthermore you can't follow the stack trace properly on an error in a coroutine because all you'll see is ->Next. You'll then have to implement the ability to run the tasks in the queue on another thread. However, there are parallel functions in the latest .NET, and you'd essentially be writing similar functionality. It wouldn't be many lines of code really.
If anyone is interested I would send the code, don't have it on me.
