Limit number of notifications in scheduler - c#

I have a cold observable and an observer.
Both are slow, but the observer is slower than the observable.
They process a very large number of notifications, so I don't want notifications to be buffered without limit.
This sample takes about 30 seconds to complete, which is very slow. I believe it could finish in about 21 seconds.
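var subject = new Subject<int>();
subject.Subscribe(i =>
{
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - {i}");
    Thread.Sleep(2000); // slow observer (delay inferred from the timings quoted above)
});
Task.Run(() =>
{
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - Start");
    foreach (var i in Enumerable.Range(0, 10))
    {
        Thread.Sleep(1000); // slow observable (delay inferred from the timings quoted above)
        subject.OnNext(i);
    }
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - End");
});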
This sample finishes in about 20 seconds, but the observable ends before the observer shows "4".
That indicates the scheduler stored 4 to 9 somewhere. I am afraid that it could store over 1,000,000 notifications and throw an OutOfMemoryException.
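var subject = new Subject<int>();
subject.ObserveOn(ThreadPoolScheduler.Instance).Subscribe(i =>
{
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - {i}");
    Thread.Sleep(2000); // slow observer (delay inferred from the timings quoted above)
});
Task.Run(() =>
{
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - Start");
    foreach (var i in Enumerable.Range(0, 10))
    {
        Thread.Sleep(1000); // slow observable (delay inferred from the timings quoted above)
        subject.OnNext(i);
    }
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - End");
});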
That's why I want to limit the number of notifications held in the scheduler.
Edit: diagram

x : calculation or other work
S : send notification
R : receive notification

-- time -->

thread1: xxxS      xxxS      xxxS
thread2:    Rxxxxxx   Rxxxxxx   Rxxxxxx

thread1: xxxSxxxSxxxSxxxSxxxSxxxSxxxS
thread2:    RxxxxxxRxxxxxxRxxxxxxRxxxxxx

I want

thread1: xxxSxxxS   xxxS   xxxS   xxxS
thread2:    RxxxxxxRxxxxxxRxxxxxxRxxxxxx

The Rx contract requires that notifications are serialized, so even though you may specify a Scheduler, it's more akin to saying "here, use this scheduler for managing concurrency".
ThreadPoolScheduler will still serialize the notifications, so the end result is that it won't call your method in parallel.
If you want asynchronous execution, you could rewrite it to this:
subject.Subscribe(async i =>
{
    await Task.Delay(1000);
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - {i}");
});
But the underlying problem is that your consumer lags behind your producer.
You could look into using backpressure, or if your application is a series of data processing tasks, you could also look into the excellent TPL Dataflow.
What happens in that pipeline between the source of events and its final sink, that's where Rx shines best.
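As a rough illustration of the backpressure idea with TPL Dataflow, here is a minimal sketch, assuming the System.Threading.Tasks.Dataflow package is referenced. The names BoundedConsumerDemo and RunAsync, the delays, and the capacity are hypothetical and not taken from the question. The block's BoundedCapacity limits how many items it holds, and SendAsync waits asynchronously while the block is full, so the producer cannot run arbitrarily far ahead of the consumer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedConsumerDemo
{
    // Sends `count` integers to a slow consumer and returns them
    // in the order in which the consumer handled them.
    public static async Task<List<int>> RunAsync(int count, int capacity)
    {
        var consumed = new List<int>();

        var consumer = new ActionBlock<int>(async i =>
        {
            await Task.Delay(20);   // simulated slow observer
            consumed.Add(i);        // safe: the default MaxDegreeOfParallelism is 1
        }, new ExecutionDataflowBlockOptions { BoundedCapacity = capacity });

        for (int i = 0; i < count; i++)
        {
            await Task.Delay(10);           // simulated producer work
            await consumer.SendAsync(i);    // waits while the block already holds `capacity` items
        }

        consumer.Complete();                // no more items will be sent
        await consumer.Completion;          // wait until everything has been processed
        return consumed;
    }
}
```

Calling `await BoundedConsumerDemo.RunAsync(10, 2)` would keep at most two notifications inside the block at any moment; whether that fits the original Rx pipeline depends on where the subject's notifications come from.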

You could try implementing a custom IScheduler that throttles the requests (the method Schedule) using a SemaphoreSlim.
Alternatively you could create a BlockingThrottle extension method that accepts and returns an IObservable, that you could chain to the original IObservable before subscribing to it. Here is an implementation that uses a BlockingCollection as a throttling mechanism:
private static IObservable<T> BlockingThrottle<T>(this IObservable<T> source,
    int boundedCapacity)
{
    return Observable.Create<T>(observer =>
    {
        var queue = new BlockingCollection<T>(boundedCapacity);
        var cts = new CancellationTokenSource();
        var locker = new object();
        Exception exception = null;
        new Thread(() =>
        {
            try
            {
                // Forward queued items to the observer on a dedicated thread.
                foreach (var item in queue.GetConsumingEnumerable(cts.Token))
                    observer.OnNext(item);
            }
            catch (OperationCanceledException)
            {
                return; // The subscription was disposed.
            }
            Exception ex; lock (locker) ex = exception;
            if (ex != null) observer.OnError(ex);
            else observer.OnCompleted();
            // Leave all other exceptions unhandled.
            // The responsibility for catching them belongs to the caller.
        })
        { IsBackground = true }.Start();
        var subscription = source.Subscribe(x =>
        {
            try
            {
                queue.Add(x, cts.Token); // Blocks the producer while the queue is full.
            }
            catch (OperationCanceledException) { } // Ignore this exception too
        }, ex =>
        {
            lock (locker) exception = ex;
            queue.CompleteAdding();
        }, () =>
        {
            queue.CompleteAdding();
        });
        return Disposable.Create(() =>
        {
            cts.Cancel();
            subscription.Dispose();
        });
    });
}
Usage example:
subject.BlockingThrottle(boundedCapacity: 10).Subscribe(i =>
{
    Console.WriteLine($"{DateTime.Now:HH:mm:ss} - {i}");
    Thread.Sleep(2000); // slow observer, as in the question
});
Note: If you are planning to use this inside an ASP.NET application, consider replacing the BlockingCollection with an async queue (like a BufferBlock<T> or a Channel<T>), to avoid blocking threads.


C# .NET Parallel I/O operation (with throttling) [duplicate]

I would like to run a bunch of async tasks, with a limit on how many tasks may be pending completion at any given time.
Say you have 1000 URLs, and you only want to have 50 requests open at a time; but as soon as one request completes, you open up a connection to the next URL in the list. That way, there are always exactly 50 connections open at a time, until the URL list is exhausted.
I also want to utilize a given number of threads if possible.
I came up with an extension method, ThrottleTasksAsync that does what I want. Is there a simpler solution already out there? I would assume that this is a common scenario.
class Program
{
    static void Main(string[] args)
    {
        Enumerable.Range(1, 10).ThrottleTasksAsync(5, 2, async i => { Console.WriteLine(i); return i; }).Wait();
        Console.WriteLine("Press a key to exit...");
        Console.ReadKey(true);
    }
}
Here is the code:
static class IEnumerableExtensions
{
    public static async Task<Result_T[]> ThrottleTasksAsync<Enumerable_T, Result_T>(this IEnumerable<Enumerable_T> enumerable, int maxConcurrentTasks, int maxDegreeOfParallelism, Func<Enumerable_T, Task<Result_T>> taskToRun)
    {
        var blockingQueue = new BlockingCollection<Enumerable_T>(new ConcurrentBag<Enumerable_T>());
        var semaphore = new SemaphoreSlim(maxConcurrentTasks);

        // Run the throttler on a separate thread.
        var t = Task.Run(() =>
        {
            foreach (var item in enumerable)
            {
                // Wait for the semaphore
                semaphore.Wait();
                blockingQueue.Add(item);
            }
            blockingQueue.CompleteAdding();
        });

        var taskList = new List<Task<Result_T>>();
        Parallel.ForEach(IterateUntilTrue(() => blockingQueue.IsCompleted), new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism },
        _ =>
        {
            Enumerable_T item;
            if (blockingQueue.TryTake(out item, 100))
            {
                taskList.Add(
                    // Run the task
                    taskToRun(item)
                    .ContinueWith(tsk =>
                    {
                        // For effect
                        Thread.Sleep(2000);
                        // Release the semaphore
                        semaphore.Release();
                        return tsk.Result;
                    }));
            }
        });

        // Await all the tasks.
        return await Task.WhenAll(taskList);
    }

    static IEnumerable<bool> IterateUntilTrue(Func<bool> condition)
    {
        while (!condition()) yield return true;
    }
}
The method utilizes BlockingCollection and SemaphoreSlim to make it work. The throttler is run on one thread, and all the async tasks are run on the other thread. To achieve parallelism, I added a maxDegreeOfParallelism parameter that's passed to a Parallel.ForEach loop re-purposed as a while loop.
The old version was:
foreach (var master = ...)
{
    var details = ...;
    Parallel.ForEach(details, new ParallelOptions { MaxDegreeOfParallelism = 15 }, detail =>
    {
        // Process each detail record here
    });
    // Perform the final batch updates here
}
But, the thread pool gets exhausted fast, and you can't do async/await.
To get around the problem in BlockingCollection where an exception is thrown in Take() when CompleteAdding() is called, I'm using the TryTake overload with a timeout. If I didn't use the timeout in TryTake, it would defeat the purpose of using a BlockingCollection since TryTake won't block. Is there a better way? Ideally, there would be a TakeAsync method.
As suggested, use TPL Dataflow.
A TransformBlock<TInput, TOutput> may be what you're looking for.
You define a MaxDegreeOfParallelism to limit how many strings can be transformed (i.e., how many urls can be downloaded) in parallel. You then post urls to the block, and when you're done you tell the block you're done adding items and you fetch the responses.
var downloader = new TransformBlock<string, HttpResponse>(
    url => Download(url),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 50 }
);
var buffer = new BufferBlock<HttpResponse>();
downloader.LinkTo(buffer);

foreach (var url in urls)
    downloader.Post(url);
    //or await downloader.SendAsync(url);

downloader.Complete();
await downloader.Completion;

IList<HttpResponse> responses;
if (buffer.TryReceiveAll(out responses))
{
    //process responses
}
Note: The TransformBlock buffers both its input and output. Why, then, do we need to link it to a BufferBlock?
Because the TransformBlock won't complete until all items (HttpResponse) have been consumed, and await downloader.Completion would hang. Instead, we let the downloader forward all its output to a dedicated buffer block - then we wait for the downloader to complete, and inspect the buffer block.
"Say you have 1000 URLs, and you only want to have 50 requests open at a time; but as soon as one request completes, you open up a connection to the next URL in the list. That way, there are always exactly 50 connections open at a time, until the URL list is exhausted."
The following simple solution has surfaced many times here on SO. It doesn't use blocking code and doesn't create threads explicitly, so it scales very well:
const int MAX_DOWNLOADS = 50;

static async Task DownloadAsync(string[] urls)
{
    using (var semaphore = new SemaphoreSlim(MAX_DOWNLOADS))
    using (var httpClient = new HttpClient())
    {
        var tasks = urls.Select(async url =>
        {
            await semaphore.WaitAsync();
            try
            {
                var data = await httpClient.GetStringAsync(url);
                Console.WriteLine(data);
            }
            finally
            {
                semaphore.Release();
            }
        });
        await Task.WhenAll(tasks);
    }
}
The thing is, the processing of the downloaded data should be done on a different pipeline, with a different level of parallelism, especially if it's a CPU-bound processing.
E.g., you'd probably want to have 4 threads concurrently doing the data processing (the number of CPU cores), and up to 50 pending requests for more data (which do not use threads at all). AFAICT, this is not what your code is currently doing.
That's where TPL Dataflow or Rx may come in handy as a preferred solution. Yet it is certainly possible to implement something like this with plain TPL. Note, the only blocking code here is the one doing the actual data processing inside Task.Run:
const int MAX_DOWNLOADS = 50;
const int MAX_PROCESSORS = 4;

// process data
class Processing
{
    SemaphoreSlim _semaphore = new SemaphoreSlim(MAX_PROCESSORS);
    HashSet<Task> _pending = new HashSet<Task>();
    object _lock = new Object();

    async Task ProcessAsync(string data)
    {
        await _semaphore.WaitAsync();
        try
        {
            await Task.Run(() =>
            {
                // simulate work
                Thread.Sleep(1000);
                Console.WriteLine(data);
            });
        }
        finally
        {
            _semaphore.Release();
        }
    }

    public async void QueueItemAsync(string data)
    {
        var task = ProcessAsync(data);
        lock (_lock)
            _pending.Add(task);
        try
        {
            await task;
        }
        catch
        {
            if (!task.IsCanceled && !task.IsFaulted)
                throw; // not the task's exception, rethrow
            // don't remove faulted/cancelled tasks from the list
            return;
        }
        // remove successfully completed tasks from the list
        lock (_lock)
            _pending.Remove(task);
    }

    public async Task WaitForCompleteAsync()
    {
        Task[] tasks;
        lock (_lock)
            tasks = _pending.ToArray();
        await Task.WhenAll(tasks);
    }
}

// download data
static async Task DownloadAsync(string[] urls)
{
    var processing = new Processing();
    using (var semaphore = new SemaphoreSlim(MAX_DOWNLOADS))
    using (var httpClient = new HttpClient())
    {
        var tasks = urls.Select(async (url) =>
        {
            await semaphore.WaitAsync();
            try
            {
                var data = await httpClient.GetStringAsync(url);
                // put the result on the processing pipeline
                processing.QueueItemAsync(data);
            }
            finally
            {
                semaphore.Release();
            }
        });
        await Task.WhenAll(tasks.ToArray());
        await processing.WaitForCompleteAsync();
    }
}
As requested, here's the code I ended up going with.
The work is set up in a master-detail configuration, and each master is processed as a batch. Each unit of work is queued up in this fashion:
var success = true;

// Start processing all the master records.
Master master;
while (null != (master = await StoredProcedures.ClaimRecordsAsync(...)))
{
    await masterBuffer.SendAsync(master);
}

// Finished sending master records
masterBuffer.Complete();

// Now, wait for all the batches to complete.
await batchAction.Completion;

return success;
Masters are buffered one at a time to save work for other outside processes. The details for each master are dispatched for work via the masterTransform TransformManyBlock. A BatchedJoinBlock is also created to collect the details in one batch.
The actual work is done in the detailTransform TransformBlock, asynchronously, 150 at a time. BoundedCapacity is set to 300 to ensure that too many Masters don't get buffered at the beginning of the chain, while also leaving room for enough detail records to be queued to allow 150 records to be processed at one time. The block outputs an object to its targets, because it's filtered across the links depending on whether it's a Detail or Exception.
The batchAction ActionBlock collects the output from all the batches, and performs bulk database updates, error logging, etc. for each batch.
There will be several BatchedJoinBlocks, one for each master. Since each ISourceBlock is output sequentially and each batch only accepts the number of detail records associated with one master, the batches will be processed in order. Each block only outputs one group, and is unlinked on completion. Only the last batch block propagates its completion to the final ActionBlock.
The dataflow network:
// The dataflow network
BufferBlock<Master> masterBuffer = null;
TransformManyBlock<Master, Detail> masterTransform = null;
TransformBlock<Detail, object> detailTransform = null;
ActionBlock<Tuple<IList<object>, IList<object>>> batchAction = null;

// Buffer master records to enable efficient throttling.
masterBuffer = new BufferBlock<Master>(new DataflowBlockOptions { BoundedCapacity = 1 });

// Sequentially transform master records into a stream of detail records.
masterTransform = new TransformManyBlock<Master, Detail>(async masterRecord =>
{
    var records = await StoredProcedures.GetObjectsAsync(masterRecord);

    // Filter the master records based on some criteria here
    var filteredRecords = records;

    // Only propagate completion to the last batch
    var propagateCompletion = masterBuffer.Completion.IsCompleted && masterTransform.InputCount == 0;

    // Create a batch join block to encapsulate the results of the master record.
    var batchjoinblock = new BatchedJoinBlock<object, object>(records.Count(), new GroupingDataflowBlockOptions { MaxNumberOfGroups = 1 });

    // Add the batch block to the detail transform pipeline's link queue, and link the batch block to the batch action block.
    var detailLink1 = detailTransform.LinkTo(batchjoinblock.Target1, detailResult => detailResult is Detail);
    var detailLink2 = detailTransform.LinkTo(batchjoinblock.Target2, detailResult => detailResult is Exception);
    var batchLink = batchjoinblock.LinkTo(batchAction, new DataflowLinkOptions { PropagateCompletion = propagateCompletion });

    // Unlink batchjoinblock upon completion.
    // (the returned task does not need to be awaited, despite the warning.)
    batchjoinblock.Completion.ContinueWith(task =>
    {
        detailLink1.Dispose();
        detailLink2.Dispose();
        batchLink.Dispose();
    });

    return filteredRecords;
}, new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

// Process each detail record asynchronously, 150 at a time.
detailTransform = new TransformBlock<Detail, object>(async detail =>
{
    try
    {
        // Perform the action for each detail here asynchronously
        await DoSomethingAsync();
        return detail;
    }
    catch (Exception e)
    {
        success = false;
        return e;
    }
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 150, BoundedCapacity = 300 });

// Perform the proper action for each batch
batchAction = new ActionBlock<Tuple<IList<object>, IList<object>>>(async batch =>
{
    var details = batch.Item1.Cast<Detail>();
    var errors = batch.Item2.Cast<Exception>();

    // Do something with the batch here
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

masterBuffer.LinkTo(masterTransform, new DataflowLinkOptions { PropagateCompletion = true });
masterTransform.LinkTo(detailTransform, new DataflowLinkOptions { PropagateCompletion = true });

How to limit the Maximum number of parallel tasks in c#

I have a collection of 1000 input messages to process. I'm looping over the input collection and starting a new task for each message to be processed.
//Assume this messages collection contains 1000 items
var messages = new List<string>();
foreach (var msg in messages)
{
    Task.Factory.StartNew(() =>
    {
        Process(msg);
    });
}
Can we guess the maximum number of messages that get processed simultaneously (assuming a normal quad-core processor), or can we limit the maximum number of messages processed at a time?
How can I ensure these messages get processed in the same sequence/order as the collection?
You could use Parallel.ForEach and rely on MaxDegreeOfParallelism instead.
Parallel.ForEach(messages, new ParallelOptions { MaxDegreeOfParallelism = 10 },
    msg =>
    {
        // logic
        Process(msg);
    });
SemaphoreSlim is a very good solution in this case and I highly recommend that the OP try it, but @Manoj's answer has a flaw, as mentioned in the comments: the semaphore should be waited on before spawning the task, like this.
Updated answer: as @Vasyl pointed out, the semaphore may be disposed before the tasks complete, and Release() will then throw an exception, so you must wait for all created tasks to complete before exiting the using block.
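int maxConcurrency = 10;
var messages = new List<string>();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
    List<Task> tasks = new List<Task>();
    foreach (var msg in messages)
    {
        concurrencySemaphore.Wait(); // wait before spawning the task
        var t = Task.Factory.StartNew(() =>
        {
            try { Process(msg); }
            finally { concurrencySemaphore.Release(); }
        });
        tasks.Add(t);
    }
    Task.WaitAll(tasks.ToArray()); // finish all tasks before the semaphore is disposed
}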
Answer to comments
For those who want to see how the semaphore can be disposed too early without Task.WaitAll:
Run the code below in a console app and this exception will be raised.
System.ObjectDisposedException: 'The semaphore has been disposed.'
static void Main(string[] args)
{
    int maxConcurrency = 5;
    List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
    using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
    {
        List<Task> tasks = new List<Task>();
        foreach (var msg in messages)
        {
            concurrencySemaphore.Wait();
            var t = Task.Factory.StartNew(() =>
            {
                try { Process(msg); }
                finally { concurrencySemaphore.Release(); }
            });
            tasks.Add(t);
        }
        // Task.WaitAll(tasks.ToArray());
    }
    Console.WriteLine("Exited using block");
    Console.ReadKey();
}

private static void Process(string msg)
{
    Thread.Sleep(2000);
    Console.WriteLine(msg);
}
I think it would be better to use Parallel.ForEach:
Parallel.ForEach(messages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    x => Process(x));
where 4 is the MaxDegreeOfParallelism.
Channels (System.Threading.Channels) are available in-box from .NET Core 3.0 and .NET 5.0 onwards.
The main benefit of this producer/consumer concurrency pattern is that you can also limit the input data processing to reduce resource impact.
This is especially helpful when processing millions of data records.
Instead of reading the whole dataset into memory at once, you can query the data chunk by chunk and wait for the workers to process each chunk before querying more.
Code sample with a queue capacity of 10 messages and 3 consumer tasks:
/// <exception cref="System.AggregateException">Thrown on Consumer Task exceptions.</exception>
public static async Task ProcessMessages(List<string> messages)
{
    const int producerCapacity = 10, consumerTaskLimit = 3;
    var channel = Channel.CreateBounded<string>(producerCapacity);

    _ = Task.Run(async () =>
    {
        foreach (var msg in messages)
        {
            await channel.Writer.WriteAsync(msg);
            // blocking when channel is full
            // waiting for the consumer tasks to pop messages from the queue
        }
        channel.Writer.Complete();
        // signaling the end of queue so that
        // WaitToReadAsync will return false to stop the consumer tasks
    });

    var tokenSource = new CancellationTokenSource();
    CancellationToken ct = tokenSource.Token;

    var consumerTasks = Enumerable
        .Range(1, consumerTaskLimit)
        .Select(_ => Task.Run(async () =>
        {
            try
            {
                while (await channel.Reader.WaitToReadAsync(ct))
                {
                    ct.ThrowIfCancellationRequested();
                    while (channel.Reader.TryRead(out var message))
                    {
                        await Task.Delay(500);
                        Console.WriteLine(message);
                    }
                }
            }
            catch (OperationCanceledException) { }
            catch
            {
                tokenSource.Cancel(); // stop the remaining consumers
                throw;
            }
        }))
        .ToArray();

    Task waitForConsumers = Task.WhenAll(consumerTasks);
    try { await waitForConsumers; }
    catch
    {
        foreach (var e in waitForConsumers.Exception.Flatten().InnerExceptions)
            Console.WriteLine(e.ToString());
        throw waitForConsumers.Exception.Flatten();
    }
}
As pointed out by Theodor Zoulias:
On multiple consumer exceptions, the remaining tasks will continue to run and have to take the load of the killed tasks. To avoid this, I implemented a CancellationToken to stop all the remaining tasks and handle the exceptions combined in the AggregateException of waitForConsumers.Exception.
Side note:
The Task Parallel Library (TPL) might be good at automatically limiting the tasks based on your local resources. But when you are processing data remotely via RPC, it's necessary to manually limit your RPC calls to avoid filling the network/processing stack!
If your Process method is async you can't use Task.Factory.StartNew as it doesn't play well with an async delegate. Also there are some other nuances when using it (see this for example).
The proper way to do it in this case is to use Task.Run. Here's @ClearLogic's answer modified for an async Process method.
static void Main(string[] args)
{
    int maxConcurrency = 5;
    List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
    using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
    {
        List<Task> tasks = new List<Task>();
        foreach (var msg in messages)
        {
            concurrencySemaphore.Wait();
            var t = Task.Run(async () =>
            {
                try { await Process(msg); }
                finally { concurrencySemaphore.Release(); }
            });
            tasks.Add(t);
        }
        Task.WaitAll(tasks.ToArray());
    }
    Console.WriteLine("Exited using block");
    Console.ReadKey();
}

private static async Task Process(string msg)
{
    await Task.Delay(2000);
    Console.WriteLine(msg);
}
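For an async Process method on .NET 6 or later, Parallel.ForEachAsync may be a shorter alternative to managing the semaphore by hand. The following is only a sketch: ForEachAsyncDemo, RunAsync and the 50 ms delay are hypothetical stand-ins, and the peak counter exists just to make the concurrency limit observable.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncDemo
{
    // Processes every message with at most `maxConcurrency` running at once
    // and returns the highest concurrency level that was actually observed.
    public static async Task<int> RunAsync(IEnumerable<string> messages, int maxConcurrency)
    {
        var gate = new object();
        int running = 0, peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrency };

        await Parallel.ForEachAsync(messages, options, async (msg, ct) =>
        {
            lock (gate) { running++; peak = Math.Max(peak, running); }
            await Task.Delay(50, ct);   // stand-in for an async Process(msg)
            lock (gate) { running--; }
        });

        return peak;
    }
}
```

Here the library does the throttling, so there is no semaphore to dispose too early; the trade-off is that this API is not available on .NET Framework or older .NET Core versions.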
You can create your own TaskScheduler and override QueueTask there.
protected internal abstract void QueueTask(Task task);
Then you can do anything you like.
One example here:
Limited concurrency level task scheduler (with task priority) handling wrapped tasks
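To make the idea concrete, here is a minimal sketch of such a scheduler, loosely modelled on the well-known limited-concurrency sample but simplified. The class name LimitedConcurrencyScheduler is hypothetical, it has no priority support, and it refuses inline execution so that the limit is never exceeded. Note that it only limits synchronous delegates; the continuations of an async delegate may run elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Runs at most `maxConcurrency` queued tasks at a time on the thread pool.
public sealed class LimitedConcurrencyScheduler : TaskScheduler
{
    private readonly LinkedList<Task> _queue = new LinkedList<Task>(); // guarded by lock (_queue)
    private readonly int _maxConcurrency;
    private int _activeWorkers;                                        // guarded by lock (_queue)

    public LimitedConcurrencyScheduler(int maxConcurrency)
    {
        if (maxConcurrency < 1) throw new ArgumentOutOfRangeException(nameof(maxConcurrency));
        _maxConcurrency = maxConcurrency;
    }

    public override int MaximumConcurrencyLevel => _maxConcurrency;

    protected override void QueueTask(Task task)
    {
        lock (_queue)
        {
            _queue.AddLast(task);
            if (_activeWorkers >= _maxConcurrency) return; // an existing worker will pick it up
            _activeWorkers++;
        }
        ThreadPool.QueueUserWorkItem(_ => Drain());
    }

    // Worker loop: executes queued tasks until the queue is empty.
    private void Drain()
    {
        while (true)
        {
            Task next;
            lock (_queue)
            {
                if (_queue.Count == 0) { _activeWorkers--; return; }
                next = _queue.First.Value;
                _queue.RemoveFirst();
            }
            TryExecuteTask(next);
        }
    }

    // Refuse inline execution so a waiting thread cannot push the count above the limit.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        lock (_queue) return _queue.ToArray();
    }
}
```

A possible usage would be `var factory = new TaskFactory(new LimitedConcurrencyScheduler(10));` followed by `factory.StartNew(() => Process(msg))` for each message.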
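You can simply set the maximum degree of concurrency like this:
int maxConcurrency = 10;
var messages = new List<string>(); // assume 1000 items
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
    foreach (var msg in messages)
    {
        Task.Factory.StartNew(() =>
        {
            concurrencySemaphore.Wait();
            try { Process(msg); }
            finally { concurrencySemaphore.Release(); }
        });
    }
}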
If you need in-order queuing (processing might finish in any order), there is no need for a semaphore. Old fashioned if statements work fine:
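const int maxConcurrency = 5;
List<Task> tasks = new List<Task>();
foreach (var arg in args)
{
    var t = Task.Run(() => { Process(arg); });
    tasks.Add(t);
    if (tasks.Count >= maxConcurrency)
    {
        Task.WaitAny(tasks.ToArray());
        tasks.RemoveAll(x => x.IsCompleted); // keep only tasks that are still running
    }
}
Task.WaitAll(tasks.ToArray());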
I ran into a similar problem where I wanted to produce 5000 results while calling APIs, etc. So, I ran some speed tests.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
    // note: creating ParallelOptions here has no effect; it must be passed as an argument to Parallel.ForEach
    new ParallelOptions { MaxDegreeOfParallelism = 100 };
    GetProductMetaData(productsMetaData, client, id).GetAwaiter().GetResult();
});
produced 100 results in 30 seconds.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
    new ParallelOptions { MaxDegreeOfParallelism = 100 };
    GetProductMetaData(productsMetaData, client, id);
});
Moving the GetAwaiter().GetResult() to the individual async API calls inside GetProductMetaData produced 100 results in 14.09 seconds.
foreach (var id in ids.Take(100))
{
    GetProductMetaData(productsMetaData, client, id);
}
Completely non-async code, with GetAwaiter().GetResult() in the API calls, took 13.417 seconds.
var tasks = new List<Task>();
while (y < ids.Count())
{
    foreach (var id in ids.Skip(y).Take(100))
    {
        tasks.Add(GetProductMetaData(productsMetaData, client, id));
    }
    y += 100;
    Task.WhenAll(tasks).GetAwaiter().GetResult(); // wait for the current batch
    Console.WriteLine($"Finished {y}, {sw.Elapsed}");
}
Forming a task list and working through 100 at a time resulted in a speed of 7.36 seconds.
using (SemaphoreSlim cons = new SemaphoreSlim(10))
{
    var tasks = new List<Task>();
    foreach (var id in ids.Take(100))
    {
        cons.Wait();
        var t = Task.Factory.StartNew(() =>
        {
            try { GetProductMetaData(productsMetaData, client, id); }
            finally { cons.Release(); }
        });
        tasks.Add(t);
    }
    Task.WaitAll(tasks.ToArray());
}
Using SemaphoreSlim took 13.369 seconds, and it also took a moment to start up.
var throttler = new SemaphoreSlim(initialCount: take);
foreach (var id in ids)
{
    throttler.WaitAsync().GetAwaiter().GetResult();
    tasks.Add(Task.Run(async () =>
    {
        try
        {
            skip += 1;
            await GetProductMetaData(productsMetaData, client, id);
            if (skip % 100 == 0)
            {
                Console.WriteLine($"started {skip}/{count}, {sw.Elapsed}");
            }
        }
        finally
        {
            throttler.Release();
        }
    }));
}
Using SemaphoreSlim as a throttler for my async tasks took 6.12 seconds.
The answer for me in this specific project was to use a throttler built on SemaphoreSlim. Although the while/foreach task list did sometimes beat the throttler, the throttler won 4 out of 6 times for 1000 records.
I realize I'm not using the OP's code, but I think this is important and adds to the discussion, because "how" is sometimes not the only question that should be asked, and the answer is sometimes "It depends on what you are trying to do."
Now to answer the specific questions:
How to limit the maximum number of parallel tasks in C#: I showed how to limit the number of tasks that run at a time.
Can we guess the maximum number of messages that get processed simultaneously (assuming a normal quad-core processor), or can we limit the maximum number processed at a time? I cannot guess how many will be processed at a time unless I set an upper limit, but I can set an upper limit. Different computers run at different speeds depending on CPU, RAM, etc., on how many threads and cores the program itself has access to, and on which other programs are running on the same machine.
How to ensure the messages get processed in the same sequence/order as the collection? If you want to process everything in a specific order, that is synchronous programming. The point of running things asynchronously is that they can proceed without a fixed order. As you can see from my code, the time difference for 100 records is minimal unless you use async code. If some part of the work needs an order, run asynchronously up to that point, then await and continue synchronously from there. For example: task1a.start, task2a.start, then later task1a.await, task2a.await, and after that task1b.start, task1b.await and task2b.start, task2b.await.
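If only the order of the results matters, rather than the order of execution, one option is to throttle with SemaphoreSlim and rely on Task.WhenAll, which returns results in the order of the tasks passed to it. This is a minimal sketch; OrderedThrottleDemo and RunAsync are hypothetical names that do not appear in the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OrderedThrottleDemo
{
    // Runs `work` for every input with at most `maxConcurrency` in flight.
    // The returned array is in input order, even if the work finishes out of order.
    public static async Task<TResult[]> RunAsync<T, TResult>(
        IEnumerable<T> inputs, int maxConcurrency, Func<T, Task<TResult>> work)
    {
        using var throttler = new SemaphoreSlim(maxConcurrency);

        var tasks = inputs.Select(async input =>
        {
            await throttler.WaitAsync();
            try { return await work(input); }
            finally { throttler.Release(); }
        }).ToList(); // materialise so that each task is created exactly once

        return await Task.WhenAll(tasks);
    }
}
```

For example, `await OrderedThrottleDemo.RunAsync(messages, 10, ProcessAsync)` would return one result per message in collection order, assuming a hypothetical `ProcessAsync` that returns a `Task<TResult>`.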
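public static void RunTasks(List<NamedTask> importTaskList)
{
    List<NamedTask> runningTasks = new List<NamedTask>();
    try
    {
        foreach (NamedTask currentTask in importTaskList)
        {
            currentTask.Start();
            runningTasks.Add(currentTask);
            if (runningTasks.Where(x => x.Status == TaskStatus.Running).Count() >= MaxCountImportThread)
            {
                Task.WaitAny(runningTasks.ToArray());
            }
        }
        Task.WaitAll(runningTasks.ToArray());
    }
    catch (Exception ex)
    {
        Log.Fatal("ERROR!", ex);
    }
}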
You can use a BlockingCollection: once the collection's bounded capacity has been reached, the producer stops producing until the consumer takes an item. I find this pattern easier to understand and implement than SemaphoreSlim.
int TasksLimit = 10;
BlockingCollection<Task> tasks = new BlockingCollection<Task>(new ConcurrentBag<Task>(), TasksLimit);

void ProduceAndConsume()
{
    var producer = Task.Factory.StartNew(RunProducer);
    var consumer = Task.Factory.StartNew(RunConsumer);
    try
    {
        Task.WaitAll(new[] { producer, consumer });
    }
    catch (AggregateException ae) { }
}

void RunConsumer()
{
    foreach (var task in tasks.GetConsumingEnumerable())
    {
        task.Start();
    }
}

void RunProducer()
{
    for (int i = 0; i < 1000; i++)
    {
        tasks.Add(new Task(() => Thread.Sleep(1000), TaskCreationOptions.AttachedToParent));
    }
    tasks.CompleteAdding(); // lets the consumer loop end once the queue is drained
}
Note that RunProducer and RunConsumer are started as two independent tasks.

Throttling asynchronous tasks

I would like to run a bunch of async tasks, with a limit on how many tasks may be pending completion at any given time.
Say you have 1000 URLs, and you only want to have 50 requests open at a time; but as soon as one request completes, you open up a connection to the next URL in the list. That way, there are always exactly 50 connections open at a time, until the URL list is exhausted.
I also want to utilize a given number of threads if possible.
I came up with an extension method, ThrottleTasksAsync that does what I want. Is there a simpler solution already out there? I would assume that this is a common scenario.
class Program
static void Main(string[] args)
Enumerable.Range(1, 10).ThrottleTasksAsync(5, 2, async i => { Console.WriteLine(i); return i; }).Wait();
Console.WriteLine("Press a key to exit...");
Here is the code:
static class IEnumerableExtensions
public static async Task<Result_T[]> ThrottleTasksAsync<Enumerable_T, Result_T>(this IEnumerable<Enumerable_T> enumerable, int maxConcurrentTasks, int maxDegreeOfParallelism, Func<Enumerable_T, Task<Result_T>> taskToRun)
var blockingQueue = new BlockingCollection<Enumerable_T>(new ConcurrentBag<Enumerable_T>());
var semaphore = new SemaphoreSlim(maxConcurrentTasks);
// Run the throttler on a separate thread.
var t = Task.Run(() =>
foreach (var item in enumerable)
// Wait for the semaphore
var taskList = new List<Task<Result_T>>();
Parallel.ForEach(IterateUntilTrue(() => blockingQueue.IsCompleted), new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism },
_ =>
Enumerable_T item;
if (blockingQueue.TryTake(out item, 100))
// Run the task
.ContinueWith(tsk =>
// For effect
// Release the semaphore
return tsk.Result;
// Await all the tasks.
return await Task.WhenAll(taskList);
static IEnumerable<bool> IterateUntilTrue(Func<bool> condition)
while (!condition()) yield return true;
The method utilizes BlockingCollection and SemaphoreSlim to make it work. The throttler is run on one thread, and all the async tasks are run on the other thread. To achieve parallelism, I added a maxDegreeOfParallelism parameter that's passed to a Parallel.ForEach loop re-purposed as a while loop.
The old version was:
foreach (var master = ...)
var details = ...;
Parallel.ForEach(details, detail => {
// Process each detail record here
}, new ParallelOptions { MaxDegreeOfParallelism = 15 });
// Perform the final batch updates here
But, the thread pool gets exhausted fast, and you can't do async/await.
To get around the problem in BlockingCollection where an exception is thrown in Take() when CompleteAdding() is called, I'm using the TryTake overload with a timeout. If I didn't use the timeout in TryTake, it would defeat the purpose of using a BlockingCollection since TryTake won't block. Is there a better way? Ideally, there would be a TakeAsync method.
As suggested, use TPL Dataflow.
A TransformBlock<TInput, TOutput> may be what you're looking for.
You define a MaxDegreeOfParallelism to limit how many strings can be transformed (i.e., how many urls can be downloaded) in parallel. You then post urls to the block, and when you're done you tell the block you're done adding items and you fetch the responses.
var downloader = new TransformBlock<string, HttpResponse>(
url => Download(url),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 50 }
var buffer = new BufferBlock<HttpResponse>();
foreach(var url in urls)
//or await downloader.SendAsync(url);
await downloader.Completion;
IList<HttpResponse> responses;
if (buffer.TryReceiveAll(out responses))
//process responses
Note: The TransformBlock buffers both its input and output. Why, then, do we need to link it to a BufferBlock?
Because the TransformBlock won't complete until all items (HttpResponse) have been consumed, and await downloader.Completion would hang. Instead, we let the downloader forward all its output to a dedicated buffer block - then we wait for the downloader to complete, and inspect the buffer block.
Say you have 1000 URLs, and you only want to have 50 requests open at
a time; but as soon as one request completes, you open up a connection
to the next URL in the list. That way, there are always exactly 50
connections open at a time, until the URL list is exhausted.
The following simple solution has surfaced many times here on SO. It doesn't use blocking code and doesn't create threads explicitly, so it scales very well:
const int MAX_DOWNLOADS = 50;
static async Task DownloadAsync(string[] urls)
using (var semaphore = new SemaphoreSlim(MAX_DOWNLOADS))
using (var httpClient = new HttpClient())
var tasks = urls.Select(async url =>
await semaphore.WaitAsync();
var data = await httpClient.GetStringAsync(url);
await Task.WhenAll(tasks);
The thing is, the processing of the downloaded data should be done on a different pipeline, with a different level of parallelism, especially if it's a CPU-bound processing.
E.g., you'd probably want to have 4 threads concurrently doing the data processing (the number of CPU cores), and up to 50 pending requests for more data (which do not use threads at all). AFAICT, this is not what your code is currently doing.
That's where TPL Dataflow or Rx may come in handy as a preferred solution. Yet it is certainly possible to implement something like this with plain TPL. Note, the only blocking code here is the one doing the actual data processing inside Task.Run:
const int MAX_DOWNLOADS = 50;
const int MAX_PROCESSORS = 4;
// process data
class Processing
{
    SemaphoreSlim _semaphore = new SemaphoreSlim(MAX_PROCESSORS);
    HashSet<Task> _pending = new HashSet<Task>();
    object _lock = new Object();
    async Task ProcessAsync(string data)
    {
        await _semaphore.WaitAsync();
        try
        {
            await Task.Run(() =>
            {
                // simulate work (placeholder delay)
                Thread.Sleep(1000);
            });
        }
        finally { _semaphore.Release(); }
    }
    public async void QueueItemAsync(string data)
    {
        var task = ProcessAsync(data);
        lock (_lock)
            _pending.Add(task);
        try
        {
            await task;
        }
        catch
        {
            if (!task.IsCanceled && !task.IsFaulted)
                throw; // not the task's exception, rethrow
            // don't remove faulted/cancelled tasks from the list
            return;
        }
        // remove successfully completed tasks from the list
        lock (_lock)
            _pending.Remove(task);
    }
    public async Task WaitForCompleteAsync()
    {
        Task[] tasks;
        lock (_lock)
            tasks = _pending.ToArray();
        await Task.WhenAll(tasks);
    }
}
// download data
static async Task DownloadAsync(string[] urls)
{
    var processing = new Processing();
    using (var semaphore = new SemaphoreSlim(MAX_DOWNLOADS))
    using (var httpClient = new HttpClient())
    {
        var tasks = urls.Select(async (url) =>
        {
            await semaphore.WaitAsync();
            try
            {
                var data = await httpClient.GetStringAsync(url);
                // put the result on the processing pipeline
                processing.QueueItemAsync(data);
            }
            finally { semaphore.Release(); }
        });
        await Task.WhenAll(tasks.ToArray());
        await processing.WaitForCompleteAsync();
    }
}
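For comparison, the same two-stage shape (many concurrent downloads feeding a few processors) can be sketched with TPL Dataflow, which the answer mentions as an alternative. This is only an illustration under assumptions: `PipelineDemo` is a hypothetical name, the download is replaced by `Task.Delay`, and "processing" just records the string length.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class PipelineDemo
{
    public static async Task<int[]> RunAsync(string[] urls, int maxDownloads, int maxProcessors)
    {
        var results = new ConcurrentBag<int>();

        // Stage 1: I/O-bound, many concurrent requests, no thread held while waiting.
        var download = new TransformBlock<string, string>(async url =>
        {
            await Task.Delay(10);           // assumption: placeholder for the HTTP request
            return "data from " + url;
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxDownloads });

        // Stage 2: CPU-bound, limited to roughly the number of cores.
        var process = new ActionBlock<string>(data =>
        {
            results.Add(data.Length);       // assumption: placeholder for real processing
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxProcessors });

        download.LinkTo(process, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var url in urls)
            await download.SendAsync(url);
        download.Complete();                // no more input
        await process.Completion;           // completion flows through the link

        return results.OrderBy(n => n).ToArray();
    }
}
```

Each block carries its own `MaxDegreeOfParallelism`, so the two limits are configured independently instead of being managed with two semaphores.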
As requested, here's the code I ended up going with.
The work is set up in a master-detail configuration, and each master is processed as a batch. Each unit of work is queued up in this fashion:
var success = true;
// Start processing all the master records.
Master master;
while (null != (master = await StoredProcedures.ClaimRecordsAsync(...)))
{
    await masterBuffer.SendAsync(master);
}
// Finished sending master records
masterBuffer.Complete();
// Now, wait for all the batches to complete.
await batchAction.Completion;
return success;
Masters are buffered one at a time to save work for other outside processes. The details for each master are dispatched for work via the masterTransform TransformManyBlock. A BatchedJoinBlock is also created to collect the details in one batch.
The actual work is done in the detailTransform TransformBlock, asynchronously, 150 at a time. BoundedCapacity is set to 300 to ensure that too many Masters don't get buffered at the beginning of the chain, while also leaving room for enough detail records to be queued to allow 150 records to be processed at one time. The block outputs an object to its targets, because it's filtered across the links depending on whether it's a Detail or Exception.
The batchAction ActionBlock collects the output from all the batches, and performs bulk database updates, error logging, etc. for each batch.
There will be several BatchedJoinBlocks, one for each master. Since each ISourceBlock is output sequentially and each batch only accepts the number of detail records associated with one master, the batches will be processed in order. Each block only outputs one group, and is unlinked on completion. Only the last batch block propagates its completion to the final ActionBlock.
The dataflow network:
// The dataflow network
BufferBlock<Master> masterBuffer = null;
TransformManyBlock<Master, Detail> masterTransform = null;
TransformBlock<Detail, object> detailTransform = null;
ActionBlock<Tuple<IList<object>, IList<object>>> batchAction = null;
// Buffer master records to enable efficient throttling.
masterBuffer = new BufferBlock<Master>(new DataflowBlockOptions { BoundedCapacity = 1 });
// Sequentially transform master records into a stream of detail records.
masterTransform = new TransformManyBlock<Master, Detail>(async masterRecord =>
{
    var records = await StoredProcedures.GetObjectsAsync(masterRecord);
    // Filter the master records based on some criteria here
    var filteredRecords = records;
    // Only propagate completion to the last batch
    var propagateCompletion = masterBuffer.Completion.IsCompleted && masterTransform.InputCount == 0;
    // Create a batch join block to encapsulate the results of the master record.
    var batchjoinblock = new BatchedJoinBlock<object, object>(records.Count(), new GroupingDataflowBlockOptions { MaxNumberOfGroups = 1 });
    // Add the batch block to the detail transform pipeline's link queue, and link the batch block to the batch action block.
    var detailLink1 = detailTransform.LinkTo(batchjoinblock.Target1, detailResult => detailResult is Detail);
    var detailLink2 = detailTransform.LinkTo(batchjoinblock.Target2, detailResult => detailResult is Exception);
    var batchLink = batchjoinblock.LinkTo(batchAction, new DataflowLinkOptions { PropagateCompletion = propagateCompletion });
    // Unlink batchjoinblock upon completion.
    // (the returned task does not need to be awaited, despite the warning.)
    batchjoinblock.Completion.ContinueWith(task =>
    {
        detailLink1.Dispose();
        detailLink2.Dispose();
        batchLink.Dispose();
    });
    return filteredRecords;
}, new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
// Process each detail record asynchronously, 150 at a time.
detailTransform = new TransformBlock<Detail, object>(async detail => {
    try
    {
        // Perform the action for each detail here asynchronously
        await DoSomethingAsync();
        return detail;
    }
    catch (Exception e)
    {
        success = false;
        return e;
    }
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 150, BoundedCapacity = 300 });
// Perform the proper action for each batch
batchAction = new ActionBlock<Tuple<IList<object>, IList<object>>>(async batch =>
{
    var details = batch.Item1.Cast<Detail>();
    var errors = batch.Item2.Cast<Exception>();
    // Do something with the batch here
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });
masterBuffer.LinkTo(masterTransform, new DataflowLinkOptions { PropagateCompletion = true });
masterTransform.LinkTo(detailTransform, new DataflowLinkOptions { PropagateCompletion = true });
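The key trick in the network above is routing results and exceptions to the two targets of a `BatchedJoinBlock` with filtered links. A stripped-down sketch of just that part, using integers instead of `Detail` records (the `BatchedJoinDemo` name and the negative-number failure rule are assumptions made for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class BatchedJoinDemo
{
    public static async Task<Tuple<IList<object>, IList<object>>> RunAsync(int[] inputs)
    {
        // Emits either a result or the exception that was thrown, both typed as object.
        var transform = new TransformBlock<int, object>(n =>
        {
            try
            {
                if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
                return (object)(n * 2);
            }
            catch (Exception e)
            {
                return e; // failures travel through the pipeline as data
            }
        });

        // One batch is produced once the two targets together hold inputs.Length items.
        var join = new BatchedJoinBlock<object, object>(inputs.Length,
            new GroupingDataflowBlockOptions { MaxNumberOfGroups = 1 });

        // Filtered links: successes go to Target1, exceptions to Target2.
        transform.LinkTo(join.Target1, r => !(r is Exception));
        transform.LinkTo(join.Target2, r => r is Exception);

        foreach (var n in inputs) transform.Post(n);
        transform.Complete();

        return await join.ReceiveAsync();
    }
}
```

With the input `{ 1, -1, 2 }` the batch should hold two results in `Item1` and one exception in `Item2`.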

How do I create an Observable Timer that calls a method and blocks on cancellation if the method is running until it finishes?

My requirements:
Run method DoWork on a specified interval.
If stop is called between calls to DoWork just stop the timer.
If stop is called while DoWork is running, block until DoWork is finished.
If DoWork takes too long to finish after stop is called, timeout.
I have a solution that seems to work so far, but I'm not super happy with it and think I may be missing something. The following is the void Main from my test app:
var source = new CancellationTokenSource();
// Create an observable sequence for the Cancel event.
var cancelObservable = Observable.Create<Int64>(o =>
source.Token.Register(() =>
Console.WriteLine("Start on canceled handler.");
Console.WriteLine("End on canceled handler.");
return Disposable.Empty;
var observable =
// Create observable timer.
Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), Scheduler.Default)
// Merge with the cancel observable so we have a composite that
// generates an event every 10 seconds AND immediately when a cancel is requested.
// This is what I ended up doing instead of disposing the timer so that I could wait
// for the sequence to finish, including DoWork.
.TakeWhile(i => !source.IsCancellationRequested)
// I could put this in an observer, but this way exceptions could be caught and handled
// or the results of the work could be fed to a subscriber.
.Do(l =>
Console.WriteLine("Start DoWork.");
Console.WriteLine("Finish DoWork.");
var published = observable.Publish();
var disposable = published.Connect();
// Press key between Start DoWork and Finish DoWork to test the cancellation while
// running DoWork.
// Press key between Finish DoWork and Start DoWork to test cancellation between
// events.
// I doubt this is good practice, but I was finding that o.OnNext was blocking
// inside of register, and the timeout wouldn't work if I blocked here before
// I set it up.
// Is there a preferred way to block until a sequence is finished? My experience
// is there's a timing issue if Cancel finishes fast enough the sequence may already
// be finished by the time I get here and .Wait() complains that the sequence contains
// no elements.
.ForEach(i => { });
Console.WriteLine("All finished! Press any key to continue.");
First, in your cancelObservable, make sure to return the result of Token.Register as your disposable instead of returning Disposable.Empty.
Here's a good extension method for turning CancellationTokens into observables:
public static IObservable<Unit> AsObservable(this CancellationToken token, IScheduler scheduler)
{
    return Observable.Create<Unit>(observer =>
    {
        var d1 = new SingleAssignmentDisposable();
        return new CompositeDisposable(d1, token.Register(() =>
            d1.Disposable = scheduler.Schedule(() =>
            {
                observer.OnNext(Unit.Default); // signal the cancellation
                observer.OnCompleted();
            })));
    });
}
Now, to your actual request:
public IObservable<Unit> ScheduleWork(IObservable<Unit> cancelSignal)
{
    // Performs work on an interval
    // stops the timer (but finishes any work in progress) when the cancelSignal is received
    var workTimer = Observable
        .Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10))
        .TakeUntil(cancelSignal)
        .Select(_ =>
        {
            DoWork();
            return Unit.Default;
        })
        .IgnoreElements();
    // starts a timer after cancellation that will eventually throw a timeout exception.
    var timeoutAfterCancelSignal = cancelSignal
        .SelectMany(c => Observable.Never<Unit>().Timeout(TimeSpan.FromSeconds(5)));
    // Use Amb to listen to both the workTimer
    // and the timeoutAfterCancelSignal.
    // Since neither produces any data we are really just
    // listening to see which will complete first.
    // If the workTimer completes before the timeout
    // then Amb will complete without error.
    // However, if the timeout expires first, then Amb
    // will produce an error.
    return Observable.Amb(workTimer, timeoutAfterCancelSignal);
}
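// Usage
var cts = new CancellationTokenSource();
var s = ScheduleWork(cts.Token.AsObservable(Scheduler.Default));
using (var finishedSignal = new ManualResetEventSlim())
{
    s.Finally(finishedSignal.Set).Subscribe(
        _ => { /* will never be called */},
        error => { /* handle error */ },
        () => { /* canceled without error */ } );
    // .. to cancel ..
    cts.Cancel();
    finishedSignal.Wait(); // block until the sequence has finished
}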
Note, instead of cancellation tokens you can also do:
var cancelSignal = new AsyncSubject<Unit>();
var s = ScheduleWork(cancelSignal);
// .. to cancel ..
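One detail worth keeping in mind with an `AsyncSubject` as the cancel signal: it only releases its value when it completes, so `OnNext` alone does not stop anything. The sketch below shows this with a plain `Subject<long>` standing in for the timer so the behaviour is deterministic. It assumes the System.Reactive package; `CancelSignalDemo` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

static class CancelSignalDemo
{
    public static Tuple<long[], bool> Run()
    {
        var ticks = new Subject<long>();            // assumption: stands in for Observable.Timer
        var cancelSignal = new AsyncSubject<Unit>();
        var received = new List<long>();
        var completed = false;

        using (ticks.TakeUntil(cancelSignal)
                    .Subscribe(x => received.Add(x), () => completed = true))
        {
            ticks.OnNext(0);
            ticks.OnNext(1);
            cancelSignal.OnNext(Unit.Default);  // held back until OnCompleted
            ticks.OnNext(2);                    // still delivered
            cancelSignal.OnCompleted();         // releases the value; TakeUntil completes
            ticks.OnNext(3);                    // ignored, the sequence has ended
        }
        return Tuple.Create(received.ToArray(), completed);
    }
}
```

So cancelling through an `AsyncSubject` takes both calls, `OnNext` followed by `OnCompleted`.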

Task.WaitAll and Exceptions

I have a problem with exception handling and parallel tasks.
The code shown below starts two tasks and waits for them to finish. My problem is that if a task throws an exception, the catch handler is never reached.
List<Task> tasks = new List<Task>();
var arr = tasks.ToArray();
catch (AggregateException e)
// do something
However when I use the following code to wait for the tasks with a timeout, the exception is caught.
I seem to be missing something, as the documentation for WaitAll describes my first attempt to be the correct one. Please help me in understanding why it is not working.
Can't reproduce this - it works fine for me:
using System;
using System.Threading;
using System.Threading.Tasks;
class Test
{
    static void Main()
    {
        Task t1 = Task.Factory.StartNew(() => Thread.Sleep(1000));
        Task t2 = Task.Factory.StartNew(() => {
            throw new Exception("Oops");
        });
        try
        {
            Task.WaitAll(t1, t2);
            Console.WriteLine("All done");
        }
        catch (AggregateException)
        {
            Console.WriteLine("Something went wrong");
        }
    }
}
That prints "Something went wrong" just as I'd expect.
Is it possible that one of your tasks isn't finished? WaitAll really does wait for all the tasks to complete, even if some have already failed.
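The "waits for all tasks" point can be checked directly. A small sketch (the `WaitAllDemo` name and the 300 ms sleep are arbitrary choices for illustration) that confirms the exception only surfaces once the slower, healthy task has finished as well:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class WaitAllDemo
{
    // Returns true when the AggregateException was observed only after
    // the slow task had completed too, and it wraps exactly one failure.
    public static bool Run()
    {
        Task slow = Task.Factory.StartNew(() => Thread.Sleep(300));
        Task failing = Task.Factory.StartNew(() =>
        {
            throw new InvalidOperationException("Oops");
        });

        try
        {
            Task.WaitAll(slow, failing);
            return false; // not expected: one of the tasks faulted
        }
        catch (AggregateException ae)
        {
            return slow.IsCompleted
                && ae.InnerExceptions.Count == 1
                && ae.InnerExceptions[0] is InvalidOperationException;
        }
    }
}
```

If one of the tasks never finishes, `WaitAll` without a timeout never returns, and the catch block is never reached even though another task has already failed.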
Here's how I solved the problem, as alluded to in the comments on my answer/question (above):
The caller catches any exceptions raised by the tasks being coordinated by the barrier, and signals the other tasks with a forced cancellation:
CancellationTokenSource cancelSignal = new CancellationTokenSource();
// do work
List<Task> workerTasks = new List<Task>();
foreach (Worker w in someArray)
while (!Task.WaitAll(workerTasks.ToArray(), 100, cancelSignal.Token)) ;
catch (Exception)
I was trying to create a call for each item in a collection, which turned out something like this:
var parent = Task.Factory.StartNew(() => {
foreach (var acct in AccountList)
var currAcctNo = acct.Number;
Task.Factory.StartNew(() =>
}, TaskCreationOptions.AttachedToParent);
I had to add the Thread.Sleep after each addition of a child task because if I didn't, the process would tend to overwrite the currAcctNo with the next iteration. I would have 3 or 4 distinct account numbers in my list, and when it processed each, the ProcessThisAccount call would show the last account number for all calls. Once I put the Sleep in, the process works great.
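Values being "overwritten with the next iteration" is the usual symptom of several lambdas capturing one shared variable. The sketch below is not the code from this answer (`CaptureDemo` and its methods are hypothetical); it only contrasts a shared loop variable with a per-iteration copy, which may help narrow down where such an overwrite comes from.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CaptureDemo
{
    public static int[] SharedVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);          // every lambda captures the same variable i
        return funcs.Select(f => f()).ToArray();   // all see the final value of i
    }

    public static int[] PerIterationCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;                // a fresh variable for each iteration
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToArray();   // each lambda keeps its own value
    }
}
```

A local declared inside the loop body, like `currAcctNo` above, normally gives each child task its own copy, so if the overwrite still appears it may be worth checking whether the value is shared somewhere further down, for example in state used by `ProcessThisAccount`.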
