Async Producer / Consumer with throttled duration and batched consumption - c#

I am trying to build a service that provides a queue for many asynchronous clients to make requests and await a response. I need to be able to throttle the queue processing by X requests per Y duration. For example: 50 web requests per second. It is for a 3rd party REST Service where I can only issue X requests per second.
I found many SO questions, which led me down the path of using TPL Dataflow. I've used a TransformBlock to provide my custom throttling and then X number of ActionBlocks to complete the tasks in parallel. The implementation of the Action seems a bit clunky, so I'm wondering if there is a better way for me to pass Tasks into the pipeline that notify the callers once completed.
Is there a better or simpler way to do what I want? Are there any glaring issues with my implementation? I know it is missing cancellation and exception handling, and I'll be adding these next, but your comments are most welcome.
I've extended Stephen Cleary's example for my Dataflow pipeline and used svick's concept of a time-throttled TransformBlock. I am wondering if what I've built could be easily achieved with a pure SemaphoreSlim design; it's the time-based throttling with max operations that I think will complicate things.
Here is the latest implementation: a FIFO async queue where I can pass in custom actions.
public class ThrottledProducerConsumer<T>
private class TimerState<T1>
public SemaphoreSlim Sem;
public T1 Value;
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval, Int32 MaxPerInterval)
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
var sw = new Stopwatch();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
var now = DateTime.UtcNow;
var releaseTime = now.Add(Interval) - now;
//-- Using timer as opposed to Task.Delay as I do not want to await or wait for it to complete
var tm = new Timer((s) => {
var state = (TimerState<T1>)s;
//Console.WriteLine($"RELEASE: {state.Value} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
}, new TimerState<T1> { Sem = _sem, Value = x }, (int)Interval.TotalMilliseconds,
Console.WriteLine($"RELEASE(FAKE): {x} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
//Console.WriteLine($"{x} was transformed in {sw.ElapsedMilliseconds}ms. Will release {now.Add(Interval):mm:ss:ff}");
return x;
//new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
new ExecutionDataflowBlockOptions { BoundedCapacity = 5, MaxDegreeOfParallelism = 10 });
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval, Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1)
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
//-- TODO: Add some cancellation tokens to shut this thing down
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
Console.WriteLine($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
public async Task EnqueueAsync(T ItemToEnqueue)
await this._queue.SendAsync(ItemToEnqueue);
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
foreach (var item in ItemsToEnqueue)
await this._queue.SendAsync(item);
public async Task CompleteAsync()
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
The test method
public class WorkItem<T>
public TaskCompletionSource<T> tcs;
//public T respone;
public string url;
public WorkItem(string Url)
tcs = new TaskCompletionSource<T>();
url = Url;
public override string ToString()
return $"{url}";
public static void TestQueue()
Console.WriteLine("Created the queue");
var defaultAction = new Action<WorkItem<String>>(async i => {
var taskItem = ((WorkItem<String>)i);
Console.WriteLine($"Consuming: {taskItem.url} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(5000);
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
var queue = new ThrottledProducerConsumer<WorkItem<String>>(TimeSpan.FromMilliseconds(2000), 5, 2, defaultAction);
var results = new List<Task>();
foreach (var no in Enumerable.Range(0, 20))
var workItem = new WorkItem<String>($"http://someurl{no}.com");
results.Add(workItem.tcs.Task.ContinueWith(response =>
Console.WriteLine($"Received: {response.Result} {DateTime.UtcNow:mm:ss:ff}");
Console.WriteLine("All Work Items Have Been Processed");

Since asking, I have created a ThrottledProducerConsumer class based on TPL Dataflow. It was tested over a number of days, including with concurrent producers whose items were queued and completed in order (approx. 281k requests) without any problems; however, there may be bugs I've not discovered.
I am using a BufferBlock as an asynchronous queue. This is linked to:
A TransformBlock which provides the throttling and blocking I need. It is used in conjunction with a SemaphoreSlim to control the max requests. As each item is passed through the block, it acquires a slot from the semaphore (decrementing its count) and schedules a task to run X duration later that releases the slot again. This way I have a sliding window of X requests per duration; exactly what I wanted. Because of TPL Dataflow I am also leveraging parallelism in the connected:
ActionBlock(s) which are responsible for performing the task I need.
The classes are generic, so it might be useful to others if they need something similar. I have not written cancellation or error handling, but thought I should just mark this as answered to move it along. I would be quite happy to see some alternatives and feedback, rather than mark mine as an accepted answer. Thanks for reading.
NOTE: I removed the Timer from the original implementation because it was causing the semaphore to be released more than the maximum number of times. I am assuming it was a dynamic context error; it occurred when I started running concurrent requests. I worked around it by using Task.Delay to schedule the release of a semaphore slot.
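The acquire-then-release-later idea described above can also be sketched on its own, without any Dataflow blocks. This is a minimal sketch, not the class from this post: IntervalThrottle and IntervalThrottleDemo are hypothetical names introduced only for illustration, and cancellation of the delayed release is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the original classes): lets at most
// maxPerInterval callers pass within any window of length interval.
public sealed class IntervalThrottle
{
    private readonly SemaphoreSlim _slots;
    private readonly TimeSpan _interval;

    public IntervalThrottle(int maxPerInterval, TimeSpan interval)
    {
        _slots = new SemaphoreSlim(maxPerInterval, maxPerInterval);
        _interval = interval;
    }

    // Waits for a free slot, then schedules its release one interval later.
    public async Task WaitAsync(CancellationToken cancellationToken = default)
    {
        await _slots.WaitAsync(cancellationToken).ConfigureAwait(false);
        // Fire and forget: the caller continues immediately, and the slot
        // becomes available again only after the interval has elapsed.
        _ = ReleaseLaterAsync();
    }

    private async Task ReleaseLaterAsync()
    {
        await Task.Delay(_interval).ConfigureAwait(false);
        _slots.Release();
    }
}

public static class IntervalThrottleDemo
{
    // Sends count callers through the throttle one after another and
    // records the elapsed time at which each one was let through.
    public static async Task<List<TimeSpan>> RunAsync(
        int count, int maxPerInterval, TimeSpan interval)
    {
        var throttle = new IntervalThrottle(maxPerInterval, interval);
        var clock = Stopwatch.StartNew();
        var passed = new List<TimeSpan>();
        for (int i = 0; i < count; i++)
        {
            await throttle.WaitAsync().ConfigureAwait(false);
            passed.Add(clock.Elapsed);
        }
        return passed;
    }
}
```

Calling IntervalThrottleDemo.RunAsync(6, 3, TimeSpan.FromMilliseconds(200)) should let the first three callers through immediately and hold the fourth until roughly one interval after the first. Because each slot is tied to the moment it was acquired, the limit applies to a sliding window rather than to fixed one-second buckets.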
Throttled Producer Consumer
public class ThrottledProducerConsumer<T>
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval,
Int32 MaxPerInterval, Int32 BlockBoundedMax = 2, Int32 BlockMaxDegreeOfParallelism = 2)
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval, MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
//Log($"Transform blk: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count: {_sem.CurrentCount}");
var sw = new Stopwatch();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
var delayTask = Task.Delay(Interval).ContinueWith((t) =>
//Log($"Pre-RELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count {_sem.CurrentCount}");
_sem.Release(); //-- Give the slot back once the interval has elapsed
//Log($"PostRELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count {_sem.CurrentCount}");
//Log($"Transformed: {x} in queue {sw.ElapsedMilliseconds}ms. {DateTime.Now:mm:ss:ff} will release {DateTime.Now.Add(Interval):mm:ss:ff} Semaphore Count {_sem.CurrentCount}");
return x;
//-- Might be better to keep Bounded Capacity in sync with the semaphore
new ExecutionDataflowBlockOptions { BoundedCapacity = BlockBoundedMax,
MaxDegreeOfParallelism = BlockMaxDegreeOfParallelism });
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval,
Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1,
Int32 MaxThrottleBuffer = 20, Int32 MaxDegreeOfParallelism = 10)
//-- Probably best to link MaxPerInterval and MaxThrottleBuffer
// and MaxConsumers with MaxDegreeOfParallelism
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
//-- TODO: Add some cancellation tokens to shut this thing down
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
Log($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
public async Task EnqueueAsync(T ItemToEnqueue)
await this._queue.SendAsync(ItemToEnqueue);
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
foreach (var item in ItemsToEnqueue)
await this._queue.SendAsync(item);
public async Task CompleteAsync()
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
private static void Log(String messageToLog)
- Example Usage -
A Generic WorkItem
public class WorkItem<Toutput,Tinput>
private TaskCompletionSource<Toutput> _tcs;
public Task<Toutput> Task { get { return _tcs.Task; } }
public Tinput InputData { get; private set; }
public Toutput OutputData { get; private set; }
public WorkItem(Tinput inputData)
_tcs = new TaskCompletionSource<Toutput>();
InputData = inputData;
public void Complete(Toutput result)
public void Failed(Exception ex)
public override string ToString()
return InputData.ToString();
Creating the action block executed in the pipeline
private Action<WorkItem<Location,PointToLocation>> CreateProcessingAction()
return new Action<WorkItem<Location,PointToLocation>>(async i => {
var sw = new Stopwatch();
var taskItem = ((WorkItem<Location,PointToLocation>)i);
var inputData = taskItem.InputData;
//Log($"Consuming: {inputData.Latitude},{inputData.Longitude} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(500);
Location outData = new Location()
Latitude = inputData.Latitude,
Longitude = inputData.Longitude,
StreetAddress = $"Consumed: {inputData.Latitude},{inputData.Longitude} Duration(ms): {sw.ElapsedMilliseconds}"
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
Test Method
You'll need to provide your own implementation for PointToLocation and Location. Just an example of how you'd use it with your own classes.
int startRange = 0;
int nextRange = 1000;
ThrottledProducerConsumer<WorkItem<Location,PointToLocation>> tpc;
private void cmdTestPipeline_Click(object sender, EventArgs e)
Log($"Pipeline test started {DateTime.Now:HH:mm:ss:ff}");
if(tpc == null)
tpc = new ThrottledProducerConsumer<WorkItem<Location, PointToLocation>>(
//1010, 2, 20000,
TimeSpan.FromMilliseconds(1010), 45, 100000,
var workItems = new List<WorkItem<Models.Location, PointToLocation>>();
foreach (var i in Enumerable.Range(startRange, nextRange))
var ptToLoc = new PointToLocation() { Latitude = i + 101, Longitude = i + 100 };
var wrkItem = new WorkItem<Location, PointToLocation>(ptToLoc);
wrkItem.Task.ContinueWith(t =>
var loc = t.Result;
string line = $"[Simulated:{DateTime.Now:HH:mm:ss:ff}] - {loc.StreetAddress}";
//txtResponse.Text = String.Concat(txtResponse.Text, line, System.Environment.NewLine);
//var lines = txtResponse.Text.Split(new string[] { System.Environment.NewLine},
// StringSplitOptions.RemoveEmptyEntries).LongCount();
//lblLines.Text = lines.ToString();
//}, TaskScheduler.FromCurrentSynchronizationContext());
startRange += nextRange;
Log($"Pipeline test completed {DateTime.Now:HH:mm:ss:ff}");


How to span MaxDegreeOfParallelism across multiple TPL Dataflow blocks?

I want to limit the total number of queries that I submit to my database server across all Dataflow blocks to 30. In the following scenario, the throttling of 30 concurrent tasks is per block so it always hits 60 concurrent tasks during execution. Obviously I could limit my parallelism to 15 per block to achieve a system wide total of 30 but this wouldn't be optimal.
How do I make this work? Do I limit (and block) my awaits using SemaphoreSlim, etc, or is there an intrinsic Dataflow approach that works better?
public class TPLTest
private long AsyncCount = 0;
private long MaxAsyncCount = 0;
private long TaskId = 0;
private object MetricsLock = new object();
public async Task Start()
ExecutionDataflowBlockOptions execOption
= new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 30 };
DataflowLinkOptions linkOption = new DataflowLinkOptions()
{ PropagateCompletion = true };
var doFirstIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doCPUWork = new TransformBlock<Data, Data>(
data => DoCPUBoundWork(data));
var doSecondIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doProcess = new TransformBlock<Data, string>(
i => $"Task finished, ID = {i.TaskId}");
var doPrint = new ActionBlock<string>(
s => Debug.WriteLine(s));
doFirstIOWorkAsync.LinkTo(doCPUWork, linkOption);
doCPUWork.LinkTo(doSecondIOWorkAsync, linkOption);
doSecondIOWorkAsync.LinkTo(doProcess, linkOption);
doProcess.LinkTo(doPrint, linkOption);
int taskCount = 150;
for (int i = 0; i < taskCount; i++)
await doFirstIOWorkAsync.SendAsync(new Data() { Delay = 2500 });
await doPrint.Completion;
Debug.WriteLine("Max concurrent tasks: " + MaxAsyncCount.ToString());
private async Task<Data> DoIOBoundWorkAsync(Data data)
if (AsyncCount > MaxAsyncCount)
MaxAsyncCount = AsyncCount;
if (data.TaskId <= 0)
data.TaskId = Interlocked.Increment(ref TaskId);
await Task.Delay(data.Delay);
lock (MetricsLock)
return data;
private Data DoCPUBoundWork(Data data)
data.Step = 1;
return data;
Data Class:
public class Data
public int Delay { get; set; }
public long TaskId { get; set; }
public int Step { get; set; }
Starting point:
TPLTest tpl = new TPLTest();
await tpl.Start();
Why don't you marshal everything to an action block that has the actual limitation?
var count = 0;
var ab1 = new TransformBlock<int, string>(l => $"1:{l}");
var ab2 = new TransformBlock<int, string>(l => $"2:{l}");
var doPrint = new ActionBlock<string>(
async s =>
var c = Interlocked.Increment(ref count);
await Task.Delay(5);
Interlocked.Decrement(ref count);
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 15 });
for (var i = 100; i > 0; i--)
if (i % 3 == 0) await ab1.SendAsync(i);
if (i % 5 == 0) await ab2.SendAsync(i);
await ab1.Completion;
await ab2.Completion;
This is the solution I ended up going with (unless I can figure out how to use a single generic DataFlow block for marshalling every type of database access):
I defined a SemaphoreSlim at the class level:
private SemaphoreSlim ThrottleDatabaseQuerySemaphore = new SemaphoreSlim(30, 30);
I modified the I/O method to call a throttling method:
private async Task<Data> DoIOBoundWorkAsync(Data data)
if (data.TaskId <= 0)
data.TaskId = Interlocked.Increment(ref TaskId);
Task t = Task.Delay(data.Delay);
await ThrottleDatabaseQueryAsync(t);
return data;
The throttling method: (I also have a generic version of the throttling routine because I couldn't figure out how to write one routine to handle both Task and Task<TResult>)
private async Task ThrottleDatabaseQueryAsync(Task task)
await ThrottleDatabaseQuerySemaphore.WaitAsync();
lock (MetricsLock)
if (AsyncCount > MaxAsyncCount)
MaxAsyncCount = AsyncCount;
await task;
lock (MetricsLock)
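One possible way to cover both Task and Task<TResult> with a single routine is to accept a delegate instead of an already-created task, and let the non-generic overload forward to the generic one. The sketch below is an assumption-laden illustration, not the code from this answer: QueryThrottle and QueryThrottleDemo are hypothetical names. Taking a delegate also means the operation is started only after a semaphore slot has been acquired.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: limits how many operations run at the same time.
public sealed class QueryThrottle
{
    private readonly SemaphoreSlim _semaphore;

    public QueryThrottle(int maxConcurrency)
    {
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    // Generic version: the operation is invoked only after a slot is acquired,
    // and the slot is released even if the operation throws.
    public async Task<TResult> RunAsync<TResult>(Func<Task<TResult>> operation)
    {
        await _semaphore.WaitAsync().ConfigureAwait(false);
        try
        {
            return await operation().ConfigureAwait(false);
        }
        finally
        {
            _semaphore.Release();
        }
    }

    // Non-generic version: reuses the generic one with a dummy result.
    public Task RunAsync(Func<Task> operation)
    {
        return RunAsync<bool>(async () =>
        {
            await operation().ConfigureAwait(false);
            return true;
        });
    }
}

public static class QueryThrottleDemo
{
    // Starts all operations at once and reports the highest number
    // that were observed running concurrently.
    public static async Task<int> MeasureMaxConcurrencyAsync(int operations, int limit)
    {
        var throttle = new QueryThrottle(limit);
        var gate = new object();
        int current = 0, max = 0;
        var tasks = new Task<int>[operations];
        for (int i = 0; i < operations; i++)
        {
            int id = i;
            tasks[i] = throttle.RunAsync(async () =>
            {
                lock (gate) { current++; if (current > max) max = current; }
                await Task.Delay(20).ConfigureAwait(false); // stands in for a database call
                lock (gate) { current--; }
                return id;
            });
        }
        await Task.WhenAll(tasks).ConfigureAwait(false);
        return max;
    }
}
```

In this sketch a single QueryThrottle instance would be shared by every Dataflow block that touches the database, so the limit applies across blocks regardless of each block's own MaxDegreeOfParallelism.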
The simplest solution to this problem is to configure all your blocks with a limited-concurrency TaskScheduler:
TaskScheduler scheduler = new ConcurrentExclusiveSchedulerPair(
TaskScheduler.Default, maxConcurrencyLevel: 30).ConcurrentScheduler;
ExecutionDataflowBlockOptions execOption = new()
TaskScheduler = scheduler,
MaxDegreeOfParallelism = scheduler.MaximumConcurrencyLevel,
TaskSchedulers can only limit the concurrency of work done on threads. They can't throttle asynchronous operations that are not running on threads. So in order to enforce the MaximumConcurrencyLevel policy, unfortunately you must pass synchronous delegates to all the Dataflow blocks. For example:
TransformBlock<Data, Data> doFirstIOWorkAsync = new(data =>
return DoIOBoundWorkAsync(data).GetAwaiter().GetResult();
}, execOption);
This change will increase the demand for ThreadPool threads, so you'd better increase the number of threads that the ThreadPool creates instantly on demand to a higher value than the default Environment.ProcessorCount:
ThreadPool.SetMinThreads(100, 100); // At the start of the program
I am proposing this solution not because it is optimal, but because it is easy to implement. My understanding is that wasting some RAM on ~30 threads that are going to be blocked most of the time, won't have any measurable negative effect on the type of application that you are working with.

TPL dataflow process N latest messages

I'm trying to create some sort of queue that will process the N latest messages received. Right now I have this:
private static void SetupMessaging()
_messagingBroadcastBlock = new BroadcastBlock<string>(msg => msg, new ExecutionDataflowBlockOptions
//BoundedCapacity = 1,
EnsureOrdered = true,
MaxDegreeOfParallelism = 1,
MaxMessagesPerTask = 1
_messagingActionBlock = new ActionBlock<string>(msg =>
}, new ExecutionDataflowBlockOptions
BoundedCapacity = 2,
EnsureOrdered = true,
MaxDegreeOfParallelism = 1,
MaxMessagesPerTask = 1
_messagingBroadcastBlock.LinkTo(_messagingActionBlock, new DataflowLinkOptions { PropagateCompletion = true });
The problem is that if I post 1,2,3,4,5 to it I will get 1,2,5, but I'd like it to be 1,4,5. Any suggestions are welcome.
I was able to make the following solution work
class FixedCapacityActionBlock<T>
private readonly ActionBlock<CancellableMessage<T>> _actionBlock;
private readonly ConcurrentQueue<CancellableMessage<T>> _inputCollection = new ConcurrentQueue<CancellableMessage<T>>();
private readonly int _maxQueueSize;
private readonly object _syncRoot = new object();
public FixedCapacityActionBlock(Action<T> act, ExecutionDataflowBlockOptions opt)
var options = new ExecutionDataflowBlockOptions
EnsureOrdered = opt.EnsureOrdered,
CancellationToken = opt.CancellationToken,
MaxDegreeOfParallelism = opt.MaxDegreeOfParallelism,
MaxMessagesPerTask = opt.MaxMessagesPerTask,
NameFormat = opt.NameFormat,
SingleProducerConstrained = opt.SingleProducerConstrained,
TaskScheduler = opt.TaskScheduler,
//we intentionally ignore this value
//BoundedCapacity = opt.BoundedCapacity
_actionBlock = new ActionBlock<CancellableMessage<T>>(cmsg =>
if (cmsg.CancellationTokenSource.IsCancellationRequested)
}, options);
_maxQueueSize = opt.BoundedCapacity;
public bool Post(T msg)
var fullMsg = new CancellableMessage<T>(msg);
//what if next task starts here?
lock (_syncRoot)
var itemsToDrop = _inputCollection.Skip(1).Except(_inputCollection.Skip(_inputCollection.Count - _maxQueueSize + 1));
foreach (var item in itemsToDrop)
CancellableMessage<T> temp;
_inputCollection.TryDequeue(out temp);
return _actionBlock.Post(fullMsg);
class CancellableMessage<T> : IDisposable
public CancellationTokenSource CancellationTokenSource { get; set; }
public T Message { get; set; }
public CancellableMessage(T msg)
CancellationTokenSource = new CancellationTokenSource();
Message = msg;
public void Dispose()
While this works and does the job, the implementation looks dirty and is possibly not thread-safe.
Here is a TransformBlock and ActionBlock implementation that drops the oldest messages in its queue whenever newer messages are received and the BoundedCapacity limit has been reached. It behaves quite similarly to a Channel configured with BoundedChannelFullMode.DropOldest.
public static IPropagatorBlock<TInput, TOutput>
CreateTransformBlockDropOldest<TInput, TOutput>(
Func<TInput, Task<TOutput>> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
if (transform == null) throw new ArgumentNullException(nameof(transform));
dataflowBlockOptions = dataflowBlockOptions ?? new ExecutionDataflowBlockOptions();
var boundedCapacity = dataflowBlockOptions.BoundedCapacity;
var cancellationToken = dataflowBlockOptions.CancellationToken;
var queue = new Queue<TInput>(Math.Max(0, boundedCapacity));
var outputBlock = new BufferBlock<TOutput>(new DataflowBlockOptions()
BoundedCapacity = boundedCapacity,
CancellationToken = cancellationToken
if (boundedCapacity != DataflowBlockOptions.Unbounded)
dataflowBlockOptions.BoundedCapacity = checked(boundedCapacity * 2);
// After testing, at least boundedCapacity + 1 is required.
// Make it double to be sure that all non-dropped messages will be processed.
var transformBlock = new ActionBlock<object>(async _ =>
TInput item;
lock (queue)
if (queue.Count == 0) return;
item = queue.Dequeue();
var result = await transform(item).ConfigureAwait(false);
await outputBlock.SendAsync(result, cancellationToken).ConfigureAwait(false);
}, dataflowBlockOptions);
dataflowBlockOptions.BoundedCapacity = boundedCapacity; // Restore initial value
var inputBlock = new ActionBlock<TInput>(item =>
var droppedEntry = (Exists: false, Item: (TInput)default);
lock (queue)
if (queue.Count == boundedCapacity) droppedEntry = (true, queue.Dequeue());
if (droppedEntry.Exists) droppedMessages?.Report(droppedEntry.Item);
}, new ExecutionDataflowBlockOptions()
CancellationToken = cancellationToken
PropagateCompletion(inputBlock, transformBlock);
PropagateFailure(transformBlock, inputBlock);
PropagateCompletion(transformBlock, outputBlock);
_ = transformBlock.Completion.ContinueWith(_ => { lock (queue) queue.Clear(); },
return DataflowBlock.Encapsulate(inputBlock, outputBlock);
async void PropagateCompletion(IDataflowBlock source, IDataflowBlock target)
try { await source.Completion.ConfigureAwait(false); } catch { }
var exception = source.Completion.IsFaulted ? source.Completion.Exception : null;
if (exception != null) target.Fault(exception); else target.Complete();
async void PropagateFailure(IDataflowBlock source, IDataflowBlock target)
try { await source.Completion.ConfigureAwait(false); } catch { }
if (source.Completion.IsFaulted) target.Fault(source.Completion.Exception);
// Overload with synchronous lambda
public static IPropagatorBlock<TInput, TOutput>
CreateTransformBlockDropOldest<TInput, TOutput>(
Func<TInput, TOutput> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
return CreateTransformBlockDropOldest(item => Task.FromResult(transform(item)),
dataflowBlockOptions, droppedMessages);
// ActionBlock equivalent
public static ITargetBlock<TInput>
CreateActionBlockDropOldest<TInput>(
Func<TInput, Task> action,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
if (action == null) throw new ArgumentNullException(nameof(action));
var block = CreateTransformBlockDropOldest<TInput, object>(
async item => { await action(item).ConfigureAwait(false); return null; },
dataflowBlockOptions, droppedMessages);
return block;
// ActionBlock equivalent with synchronous lambda
public static ITargetBlock<TInput>
CreateActionBlockDropOldest<TInput>(
Action<TInput> action,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
return CreateActionBlockDropOldest(
item => { action(item); return Task.CompletedTask; },
dataflowBlockOptions, droppedMessages);
The idea is to store the queued items in an auxiliary Queue, and pass dummy (null) values to an internal ActionBlock<object>. The block ignores the items passed as arguments, and instead takes an item from the queue, if there is any. A lock is used to ensure that all non-dropped items in the queue will eventually be processed (unless, of course, an exception occurs).
There is also an extra feature: an optional IProgress<TInput> droppedMessages argument allows you to receive a notification every time a message is dropped.
Usage example:
_messagingActionBlock = CreateActionBlockDropOldest<string>(msg =>
Console.WriteLine($"Processing: {msg}");
}, new ExecutionDataflowBlockOptions
BoundedCapacity = 2,
}, new Progress<string>(msg =>
Console.WriteLine($"Message dropped: {msg}");
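For comparison, the Channel behavior mentioned above can be observed directly. This is a small sketch using System.Threading.Channels, with no Dataflow involved; DropOldestChannelDemo and KeepLatest are hypothetical names used only for this illustration, and no reader is attached while the items are written.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

public static class DropOldestChannelDemo
{
    // Writes all items into a bounded channel while nothing is reading,
    // then drains whatever is still buffered.
    public static List<int> KeepLatest(IEnumerable<int> items, int capacity)
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.DropOldest,
            SingleReader = true
        });

        foreach (var item in items)
        {
            // With DropOldest, TryWrite succeeds even when the channel is full:
            // the oldest buffered item is discarded to make room.
            channel.Writer.TryWrite(item);
        }
        channel.Writer.Complete();

        // Items that are still buffered remain readable after completion.
        var survivors = new List<int>();
        while (channel.Reader.TryRead(out var item))
        {
            survivors.Add(item);
        }
        return survivors;
    }
}
```

With the items 1 to 5 and a capacity of 2, only 4 and 5 remain in the buffer. An item that a consumer has already taken out of the channel is no longer subject to dropping, which is how a live consumer could end up seeing 1, 4, 5 as in the question.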
TPL Dataflow doesn't fit well with "last N messages", as it's meant to be a queue or pipeline (FIFO), not a stack (LIFO). Do you really need to do this with a dataflow library?
It's much easier with ConcurrentStack<T>: you introduce one producer task, which pushes to the stack, and one consumer task, which pops messages from the stack while the number of handled ones is less than N (this is the producer-consumer pattern).
If you need TPL Dataflow, you can use it in the consumer task to start handling the latest messages, but not in the producer, as that's really not the way it was meant to be used. Moreover, there are other libraries with an event-based architecture which may fit your problem more naturally.

TaskFactory, Starting a new Task when one ends

I have found many methods of using the TaskFactory but I could not find anything about starting more tasks and watching when one ends and starting another one.
I always want to have 10 tasks working.
I want something like this
int nTotalTasks=10;
int nCurrentTask=0;
Task<bool>[] tasks=new Task<bool>[nTotalTasks];
for (int i=0; i<1000; i++)
string param1="test";
string param2="test";
if (nCurrentTask<nTotalTasks) // if there are less than 10 tasks then start another one
tasks[nCurrentTask++] = Task.Factory.StartNew<bool>(() =>
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
// How can I stop the for loop until a new task is finished and start a new one?
Check out the Task.WaitAny method:
Waits for any of the provided Task objects to complete execution.
Example from the documentation:
var t1 = Task.Factory.StartNew(() => DoOperation1());
var t2 = Task.Factory.StartNew(() => DoOperation2());
Task.WaitAny(t1, t2);
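To keep a fixed number of tasks running, the same idea can be applied in a loop: once the limit is reached, wait for any running task to finish before starting the next one. The sketch below uses Task.WhenAny, the awaitable counterpart of Task.WaitAny, rather than WaitAny itself. SlidingTaskRunner and the work delegate are hypothetical names standing in for the question's Method1 call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SlidingTaskRunner
{
    // Runs work(0) .. work(totalItems - 1) with at most maxConcurrent
    // invocations in flight, and returns one result per item.
    public static async Task<bool[]> RunAllAsync(
        int totalItems, int maxConcurrent, Func<int, bool> work)
    {
        var results = new bool[totalItems];
        var running = new List<Task>();

        for (int i = 0; i < totalItems; i++)
        {
            if (running.Count >= maxConcurrent)
            {
                // Wait until any running task finishes, then free its place.
                Task finished = await Task.WhenAny(running).ConfigureAwait(false);
                running.Remove(finished);
                await finished.ConfigureAwait(false); // rethrows if the task failed
            }

            int index = i; // copy the loop variable so each task sees its own value
            running.Add(Task.Run(() => { results[index] = work(index); }));
        }

        // Wait for the tasks that are still running at the end.
        await Task.WhenAll(running).ConfigureAwait(false);
        return results;
    }
}
```

Each task writes only to its own array element, so no locking is needed for the results. Copying i into index matters here: the lambda would otherwise capture the shared loop variable of the for loop.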
I would use a combination of Microsoft's Reactive Framework (NuGet "Rx-Main") and TPL for this. It becomes very simple.
Here's the code:
int nTotalTasks = 10;
string param1 = "test";
string param2 = "test";

IDisposable subscription =
    Observable
        .Range(0, 1000)
        .Select(i => Observable.FromAsync(() => Task.Factory.StartNew<bool>(() =>
        {
            MyClass cls = new MyClass();
            bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
            return bRet;
        })))
        .Merge(nTotalTasks) // at most nTotalTasks tasks run concurrently
        .ToArray()          // collect all results into a single array
        .Subscribe((bool[] results) =>
        {
            /* Do something with the results. */
        });
The key part here is the .Merge(nTotalTasks) which limits the number of concurrent tasks.
If you need to stop the processing part way thru just call subscription.Dispose() and everything gets cleaned up for you.
If you want to process each result as they are produced you can change the code from the .Merge(...) like this:
.Subscribe((bool result) =>
/* Do something with each result. */
This is not complete, but it should be all you need: wait on the first task to complete and then run the second.
Task.WaitAny(task to wait on);
Have you seen the BlockingCollection class? It allows you to have multiple threads running in parallel, and you can wait for results from one task before executing another.
The answer depends on whether the tasks to be scheduled are CPU or I/O bound.
For CPU-intensive work I would use the Parallel.For() API, setting the number of threads/tasks through the MaxDegreeOfParallelism property of ParallelOptions.
For I/O bound work the number of concurrently executing tasks can be significantly larger than the number of available CPUs, so the strategy is to rely on async methods as much as possible, which reduces the total number of threads waiting for completion.
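For the CPU-bound case, the Parallel.For() approach might look like the following. This is a minimal sketch: ParallelForDemo and the work delegate are hypothetical names, with work standing in for the question's Method1 call.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForDemo
{
    // Runs work(0) .. work(totalItems - 1) on at most maxConcurrent
    // threads at a time and returns one result per item.
    public static bool[] RunAll(int totalItems, int maxConcurrent, Func<int, bool> work)
    {
        var results = new bool[totalItems];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrent };

        // Parallel.For blocks the calling thread until every iteration is done.
        Parallel.For(0, totalItems, options, i =>
        {
            results[i] = work(i); // each iteration writes only its own slot
        });

        return results;
    }
}
```

Since Parallel.For blocks until all iterations complete, it suits synchronous CPU-bound work; it does not await asynchronous delegates, so for I/O-bound work an await-based approach is the better fit.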
How can I stop the for loop until a new task is finished and start a new one?
The loop can be throttled by using await:
static void Main(string[] args)
var task = DoWorkAsync();
// handle results
// task.Result;
async static Task<bool> DoWorkAsync()
const int NUMBER_OF_SLOTS = 10;
string param1="test";
string param2="test";
var results = new bool[NUMBER_OF_SLOTS];
AsyncWorkScheduler ws = new AsyncWorkScheduler(NUMBER_OF_SLOTS);
for (int i = 0; i < 1000; ++i)
await ws.ScheduleAsync((slotNumber) => DoWorkAsync(i, slotNumber, param1, param2, results));
await ws.Completion;
async static Task DoWorkAsync(int index, int slotNumber, string param1, string param2, bool[] results)
results[slotNumber] = results[slotNumber] && await Task.Factory.StartNew<bool>(() =>
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, index); // takes up to 2 minutes to finish
return bRet;
A helper class AsyncWorkScheduler uses TPL Dataflow components as well as Task.WhenAll():
class AsyncWorkScheduler
public AsyncWorkScheduler(int numberOfSlots)
m_slots = new Task[numberOfSlots];
m_availableSlots = new BufferBlock<int>();
m_errors = new List<Exception>();
m_tcs = new TaskCompletionSource<bool>();
m_completionPending = 0;
// Initial state: all slots are available
for(int i = 0; i < m_slots.Length; ++i)
m_slots[i] = Task.FromResult(false);
public async Task ScheduleAsync(Func<int, Task> action)
if (Volatile.Read(ref m_completionPending) != 0)
throw new InvalidOperationException("Unable to schedule new items.");
// Acquire a slot
int slotNumber = await m_availableSlots.ReceiveAsync().ConfigureAwait(false);
// Schedule a new task for a given slot
var task = action(slotNumber);
// Store a continuation on the task to handle completion events
m_slots[slotNumber] = task.ContinueWith(t => HandleCompletedTask(t, slotNumber), TaskContinuationOptions.ExecuteSynchronously);
public async void Complete()
if (Interlocked.CompareExchange(ref m_completionPending, 1, 0) != 0)
// Signal the queue's completion
await Task.WhenAll(m_slots).ConfigureAwait(false);
// Set completion
if (m_errors.Count != 0)
public Task Completion
return m_tcs.Task;
void SetFailed(Exception error)
void HandleCompletedTask(Task task, int slotNumber)
if (task.IsFaulted || task.IsCanceled)
if (Volatile.Read(ref m_completionPending) == 1)
// Release a slot
int m_completionPending;
List<Exception> m_errors;
BufferBlock<int> m_availableSlots;
TaskCompletionSource<bool> m_tcs;
Task[] m_slots;

Guarantee TransformBlock output sequence

From the TPL documentation
As with ActionBlock<TInput>, TransformBlock<TInput,TOutput> defaults
to processing one message at a time, maintaining strict FIFO ordering.
However, in a multi-threaded scenario, i.e. if multiple threads are "simultaneously" doing SendAsync and then "awaiting" for a result by calling ReceiveAsync, how do we guarantee that the thread that posted something into the TransformBlock<TInput,TOutput> actually gets the intended result that it is waiting for?
In my experiments, it seems like the way to "guarantee" my desired outcome is to add the option BoundedCapacity = 1. At least the thread(s) still don't get blocked when sending and receiving.
If I don't do this, some threads will receive the result intended for another thread.
Is this the right approach in this particular use case?
Here is some code that illustrates my concern:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace ConsoleTransformBlock
class Program
private readonly static TransformBlock<int, int> _pipeline;
static Program()
_pipeline = new TransformBlock<int, int>(async (input) =>
await Task.Delay(RandomGen2.Next(5, 100)).ConfigureAwait(false);
return input;
new ExecutionDataflowBlockOptions() { BoundedCapacity = 1 }); // this is the fix???
static void Main(string[] args)
var dop = System.Environment.ProcessorCount;// 8-core
Parallel.For(0, dop, new ParallelOptions() { MaxDegreeOfParallelism = dop },
(d) =>
Console.WriteLine("Parallel For Done ...");
var tasks = new Task[dop];
for (var i = 0; i < dop; i++)
var temp = i;
tasks[temp] = Task.Factory.StartNew
(async () => await DoStuff().ConfigureAwait(false),
private static async Task DoStuff()
for (var i = 0; i < 100; i++)
var temp = RandomGen2.Next();
await _pipeline.SendAsync(temp).ConfigureAwait(false);
Console.WriteLine("Just sent {0}, now waiting {1}...", new object[] { temp, System.Threading.Thread.CurrentThread.ManagedThreadId });
await Task.Delay(RandomGen2.Next(5, 50)).ConfigureAwait(false);
var result = await _pipeline.ReceiveAsync().ConfigureAwait(false);
Console.WriteLine("Received {0}... {1}", new object[] { result, System.Threading.Thread.CurrentThread.ManagedThreadId });
if (result != temp)
var error = string.Format("************** Sent {0} But Received {1}", temp, result, System.Threading.Thread.CurrentThread.ManagedThreadId);
/// <summary>
/// Thread-Safe Random Generator
/// </summary>
public static class RandomGen2
private static Random _global = new Random();
[ThreadStatic]
private static Random _local;
public static int Next()
return Next(0, int.MaxValue);
public static int Next(int max)
return Next(0, max);
public static int Next(int min, int max)
Random inst = _local;
if (inst == null)
int seed;
lock (_global) seed = _global.Next();
_local = inst = new Random(seed);
return inst.Next(min, max);
TransformBlock already maintains FIFO order. The order in which you post items to the block is the exact order in which the items will be returned from the block.
When you specify a maximum degree of parallelism that is larger than 1, multiple messages are processed simultaneously, and therefore, messages might not be processed in the order in which they are received. The order in which the messages are output from the block will, however, be correctly ordered.
From Dataflow (Task Parallel Library)
You can see that with this example:
private static async Task MainAsync()
{
    var transformBlock = new TransformBlock<int, int>(async input =>
    {
        await Task.Delay(RandomGen2.Next(5, 100));
        return input;
    }, new ExecutionDataflowBlockOptions {MaxDegreeOfParallelism = 10});
    foreach (var number in Enumerable.Range(0,100))
    {
        await transformBlock.SendAsync(number);
    }
    for (int i = 0; i < 100; i++)
    {
        var result = await transformBlock.ReceiveAsync();
    }
}
Here the results come back in order, 0 through 99.
However, what you seem to want is some correlation with threads, so that a thread posts an item to the block and then receives the result for that item. This doesn't really fit TPL Dataflow, which is meant to be used as a pipeline of blocks. You can hack it with BoundedCapacity = 1 but you probably shouldn't.
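If each caller really does need its own result back, one common alternative to the BoundedCapacity = 1 hack is to send a TaskCompletionSource along with each input and complete it inside the block. The following is only a minimal sketch of that idea, not something taken from the question or from the TPL documentation: CorrelatedProcessor, Request and ProcessAsync are hypothetical names, and TaskCreationOptions.RunContinuationsAsynchronously assumes .NET 4.6 or later.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical wrapper: pairs every input with the TaskCompletionSource
// that the caller awaits, so results cannot be picked up by another caller.
public sealed class CorrelatedProcessor
{
    private sealed class Request
    {
        public int Input;
        public TaskCompletionSource<int> Completion;
    }

    private readonly ActionBlock<Request> _block;

    public CorrelatedProcessor(Func<int, Task<int>> transform, int maxDegreeOfParallelism)
    {
        _block = new ActionBlock<Request>(async request =>
        {
            try
            {
                // Complete the caller's own task with the result for its input.
                var output = await transform(request.Input).ConfigureAwait(false);
                request.Completion.TrySetResult(output);
            }
            catch (Exception ex)
            {
                // Report the failure to the caller instead of faulting the block.
                request.Completion.TrySetException(ex);
            }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism });
    }

    public async Task<int> ProcessAsync(int input)
    {
        var request = new Request
        {
            Input = input,
            Completion = new TaskCompletionSource<int>(
                TaskCreationOptions.RunContinuationsAsynchronously)
        };

        // SendAsync returns false if the block has already been completed.
        if (!await _block.SendAsync(request).ConfigureAwait(false))
            throw new InvalidOperationException("The block is no longer accepting requests.");

        return await request.Completion.Task.ConfigureAwait(false);
    }
}
```

A caller would then write something like var result = await processor.ProcessAsync(temp); and never call ReceiveAsync at all, so the block can run with any degree of parallelism without results getting mixed up between callers.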

Use Task.Run instead of Delegate.BeginInvoke

I have recently upgraded my projects to ASP.NET 4.5 and I have been waiting a long time to use 4.5's asynchronous capabilities. After reading the documentation I'm not sure whether I can improve my code at all.
I want to execute a task asynchronously and then forget about it. The way that I'm currently doing this is by creating delegates and then using BeginInvoke.
Here's one of the filters in my project which creates an audit in our database every time a user accesses a resource that must be audited:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
    var request = filterContext.HttpContext.Request;
    var id = WebSecurity.CurrentUserId;
    var invoker = new MethodInvoker(delegate
    {
        var audit = new Audit
        {
            Id = Guid.NewGuid(),
            IPAddress = request.UserHostAddress,
            UserId = id,
            Resource = request.RawUrl,
            Timestamp = DateTime.UtcNow
        };
        var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
    });
    invoker.BeginInvoke(StopAsynchronousMethod, invoker);
}
But in order to finish this asynchronous task, I need to always define a callback, which looks like this:
public void StopAsynchronousMethod(IAsyncResult result)
var state = (MethodInvoker)result.AsyncState;
catch (Exception e)
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(e, username);
I would rather not use the callback at all due to the fact that I do not need a result from the task that I am invoking asynchronously.
How can I improve this code with Task.Run() (or async and await)?
If I understood your requirements correctly, you want to kick off a task and then forget about it. When the task completes, and if an exception occurred, you want to log it.
I'd use Task.Run to create a task, followed by ContinueWith to attach a continuation task. This continuation task will log any exception that was thrown from the parent task. Also, use TaskContinuationOptions.OnlyOnFaulted to make sure the continuation only runs if an exception occurred.
Task.Run(() =>
{
    var audit = new Audit
    {
        Id = Guid.NewGuid(),
        IPAddress = request.UserHostAddress,
        UserId = id,
        Resource = request.RawUrl,
        Timestamp = DateTime.UtcNow
    };
    var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
}).ContinueWith(task =>
{
    task.Exception.Handle(ex =>
    {
        var username = WebSecurity.CurrentUserName;
        Debugging.DispatchExceptionEmail(ex, username);
        return true; // Handle expects a bool: true marks the exception as handled
    });
}, TaskContinuationOptions.OnlyOnFaulted);
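The same pattern (Task.Run followed by a continuation that only runs on failure) can be pulled out into a small reusable helper. This is a rough sketch under my own assumptions rather than part of the original filter: FireAndForget.Run and its onError callback are hypothetical names, and the logging itself is left to the caller.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper illustrating Task.Run + ContinueWith(OnlyOnFaulted).
public static class FireAndForget
{
    public static Task Run(Action work, Action<Exception> onError)
    {
        return Task.Run(work).ContinueWith(
            t =>
            {
                // t.Exception is an AggregateException; report each inner exception.
                foreach (var ex in t.Exception.Flatten().InnerExceptions)
                {
                    onError(ex);
                }
            },
            TaskContinuationOptions.OnlyOnFaulted);
    }
}
```

Usage would look like FireAndForget.Run(() => SaveAudit(), ex => Log(ex)); where SaveAudit and Log stand in for your own code. Note that when the work succeeds, the OnlyOnFaulted continuation is canceled rather than run, so the returned task ends in the Canceled state; that is harmless if you ignore it, but awaiting it would throw.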
As a side-note, background tasks and fire-and-forget scenarios in ASP.NET are highly discouraged. See The Dangers of Implementing Recurring Background Tasks In ASP.NET
It may sound a bit out of scope, but if you just want to forget about the work after you launch it, why not use the ThreadPool directly?
Something like:
ThreadPool.QueueUserWorkItem(x =>
{
    try { /* Do something */ }
    catch (Exception e) { /* Log something */ }
});
I had to do some performance benchmarking for different async call methods and I found that (not surprisingly) ThreadPool works much better, but also that, actually, BeginInvoke is not that bad (I am on .NET 4.5). That's what I found out with the code at the end of the post. I did not find something like this online, so I took the time to check it myself. Each call is not exactly equal, but it is more or less functionally equivalent in terms of what it does:
ThreadPool: 70.80ms
Task: 90.88ms
BeginInvoke: 121.88ms
Thread: 4657.52ms
public class Program
public delegate void ThisDoesSomething();
// Perform a very simple operation to see the overhead of
// different async calls types.
public static void Main(string[] args)
const int repetitions = 25;
const int calls = 1000;
var results = new List<Tuple<string, double>>();
"{0} parallel calls, {1} repetitions for better statistics\n",
// Threads
Console.Write("Running Threads");
results.Add(new Tuple<string, double>("Threads", RunOnThreads(repetitions, calls)));
// BeginInvoke
Console.Write("Running BeginInvoke");
results.Add(new Tuple<string, double>("BeginInvoke", RunOnBeginInvoke(repetitions, calls)));
// Tasks
Console.Write("Running Tasks");
results.Add(new Tuple<string, double>("Tasks", RunOnTasks(repetitions, calls)));
// Thread Pool
Console.Write("Running Thread pool");
results.Add(new Tuple<string, double>("ThreadPool", RunOnThreadPool(repetitions, calls)));
// Show results
results = results.OrderBy(rs => rs.Item2).ToList();
foreach (var result in results)
"{0}: Done in {1}ms avg",
(result.Item2 / repetitions).ToString("0.00"));
Console.WriteLine("Press a key to exit");
/// <summary>
/// The do stuff.
/// </summary>
public static void DoStuff()
public static double RunOnThreads(int repetitions, int calls)
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
var toProcess = calls;
var stopwatch = new Stopwatch();
var resetEvent = new ManualResetEvent(false);
var threadList = new List<Thread>();
for (var i = 0; i < calls; i++)
threadList.Add(new Thread(() =>
// Do something
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
foreach (var thread in threadList)
totalMs += stopwatch.ElapsedMilliseconds;
return totalMs;
public static double RunOnThreadPool(int repetitions, int calls)
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
var toProcess = calls;
var resetEvent = new ManualResetEvent(false);
var stopwatch = new Stopwatch();
var list = new List<int>();
for (var i = 0; i < calls; i++)
for (var i = 0; i < calls; i++)
x =>
// Do something
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
totalMs += stopwatch.ElapsedMilliseconds;
return totalMs;
public static double RunOnBeginInvoke(int repetitions, int calls)
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
var beginInvokeStopwatch = new Stopwatch();
var delegateList = new List<ThisDoesSomething>();
var resultsList = new List<IAsyncResult>();
for (var i = 0; i < calls; i++)
foreach (var delegateToCall in delegateList)
resultsList.Add(delegateToCall.BeginInvoke(null, null));
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(rs => !rs.IsCompleted))
totalMs += beginInvokeStopwatch.ElapsedMilliseconds;
return totalMs;
public static double RunOnTasks(int repetitions, int calls)
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
var resultsList = new List<Task>();
var stopwatch = new Stopwatch();
for (var i = 0; i < calls; i++)
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(task => !task.IsCompleted))
totalMs += stopwatch.ElapsedMilliseconds;
return totalMs;
Here's one of the filters in my project which creates an audit in our database every time a user accesses a resource that must be audited
Auditing is certainly not something I would call "fire and forget". Remember, on ASP.NET, "fire and forget" means "I don't care whether this code actually executes or not". So, if your desired semantics are that audits may occasionally be missing, then (and only then) you can use fire and forget for your audits.
If you want to ensure your audits are all correct, then either wait for the audit save to complete before sending the response, or queue the audit information to reliable storage (e.g., Azure queue or MSMQ) and have an independent backend (e.g., Azure worker role or Win32 service) process the audits in that queue.
But if you want to live dangerously (accepting that occasionally audits may be missing), you can mitigate the problems by registering the work with the ASP.NET runtime. Using the BackgroundTaskManager from my blog:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
    var request = filterContext.HttpContext.Request;
    var id = WebSecurity.CurrentUserId;
    BackgroundTaskManager.Run(() =>
    {
        try
        {
            var audit = new Audit
            {
                Id = Guid.NewGuid(),
                IPAddress = request.UserHostAddress,
                UserId = id,
                Resource = request.RawUrl,
                Timestamp = DateTime.UtcNow
            };
            var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
        }
        catch (Exception e)
        {
            var username = WebSecurity.CurrentUserName;
            Debugging.DispatchExceptionEmail(e, username);
        }
    });
}
