Retry policy within ITargetBlock<TInput> - c#

I need to introduce a retry policy to the workflow. Let's say there are 3 blocks that are connected in such a way:
var executionOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 3 };
var buffer = new BufferBlock<int>();
var processing = new TransformBlock<int, int>(..., executionOptions);
var send = new ActionBlock<int>(...);
buffer.LinkTo(processing);
processing.LinkTo(send);
So there is a buffer that accumulates data and sends it to the transform block, which processes no more than 3 items at a time; the result is then sent to the action block.
Transient errors can occur while the transform block is processing, and I want to retry an item several times if its error is transient.
I know that blocks are generally not retryable, but the delegates passed into them can be made retryable. So one option is to wrap the delegate in retry logic.
I also know that there is a very good library, TransientFaultHandling.Core, that provides retry mechanisms for transient faults. It is an excellent library, but it doesn't fit my case. If I wrap the delegate passed to the transform block in the RetryPolicy.ExecuteAsync method, the message stays locked inside the transform block, and until the retry either completes or fails, that slot of the transform block can't accept a new message. Imagine that all 3 messages enter retrying (say the next attempt is in 2 minutes) and fail: the transform block will be stuck until at least one message leaves it.
The only solution I see is to extend the TransformBlock (actually, ITargetBlock would be enough) and do the retry manually, like this:
do
{
try { return await transform(input); }
catch
{
if( numRetries <= 0 ) throw;
else Task.Delay(timeout).ContinueWith(t => processing.Post(message));
}
} while( numRetries-- > 0 );
i.e. put the message back into the transform block after a delay. But in that case the retry context (number of retries left, etc.) also has to be passed into the block. That sounds too complex...
Does anyone see a simpler approach to implementing a retry policy for a workflow block?

I think you pretty much have to do that: you have to track the remaining number of retries for a message, and you have to schedule the retry attempt somehow.
But you could make this better by encapsulating it in a separate method. Something like:
// it's a private class, so public fields are okay
private class RetryingMessage<T>
{
public T Data;
public int RetriesRemaining;
public readonly List<Exception> Exceptions = new List<Exception>();
}
public static IPropagatorBlock<TInput, TOutput>
CreateRetryingBlock<TInput, TOutput>(
Func<TInput, Task<TOutput>> transform, int numberOfRetries,
TimeSpan retryDelay, Action<IEnumerable<Exception>> failureHandler)
{
var source = new TransformBlock<TInput, RetryingMessage<TInput>>(
input => new RetryingMessage<TInput>
{ Data = input, RetriesRemaining = numberOfRetries });
// TransformManyBlock, so that we can propagate zero results on failure
TransformManyBlock<RetryingMessage<TInput>, TOutput> target = null;
target = new TransformManyBlock<RetryingMessage<TInput>, TOutput>(
async message =>
{
try
{
return new[] { await transform(message.Data) };
}
catch (Exception ex)
{
message.Exceptions.Add(ex);
if (message.RetriesRemaining == 0)
{
failureHandler(message.Exceptions);
}
else
{
message.RetriesRemaining--;
Task.Delay(retryDelay)
.ContinueWith(_ => target.Post(message));
}
return null;
}
});
source.LinkTo(
target, new DataflowLinkOptions { PropagateCompletion = true });
return DataflowBlock.Encapsulate(source, target);
}
I have added code to track the exceptions because I think failures should not be ignored; at the very least they should be logged.
Also, this code doesn't work very well with completion: if there are retries waiting for their delay and you Complete() the block, it will complete immediately and the retries will be lost. If that's a problem for you, you will have to track outstanding retries and complete target only when source has completed and no retries are waiting.
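The bookkeeping for outstanding retries could look roughly like the sketch below. PendingTracker and its member names are hypothetical (they are not part of TPL Dataflow); it only counts messages that are still in flight and signals once completion has been requested and the count has dropped to zero.

```csharp
using System.Threading.Tasks;

// Hypothetical helper (not part of TPL Dataflow): counts messages that are
// still in flight, including those waiting for a retry delay.
public sealed class PendingTracker
{
    private readonly object _gate = new object();
    private readonly TaskCompletionSource<bool> _drained =
        new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
    private int _pending;
    private bool _completionRequested;

    // Completes once completion was requested and nothing is in flight.
    public Task Drained => _drained.Task;

    // Call when a message first enters the block.
    public void Enter() { lock (_gate) _pending++; }

    // Call when a message succeeds or fails for the last time.
    public void Exit()
    {
        lock (_gate)
        {
            _pending--;
            if (_completionRequested && _pending == 0)
                _drained.TrySetResult(true);
        }
    }

    // Call when the source block has completed.
    public void RequestCompletion()
    {
        lock (_gate)
        {
            _completionRequested = true;
            if (_pending == 0) _drained.TrySetResult(true);
        }
    }
}
```

One way to wire it in, under these assumptions: drop PropagateCompletion from the source-to-target link, call Enter() where source creates the RetryingMessage, call Exit() after a successful transform or after failureHandler has run, call RequestCompletion() from a continuation on source.Completion, and call target.Complete() from a continuation on Drained. Propagating a fault from source would need separate handling.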

In addition to svick's excellent answer, there are a couple of other options:
You can use TransientFaultHandling.Core - just set MaxDegreeOfParallelism to Unbounded so the other messages can get through.
You can modify the block output type to include failure indication and a retry count, and create a dataflow loop, passing a filter to LinkTo that examines whether another retry is necessary. This approach is more complex; you'd have to add a delay to your block if it is doing a retry, and add a TransformBlock to remove the failure/retry information for the rest of the mesh.
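The first option can be sketched without committing to a particular retry library. In the minimal sketch below, RetryAsync and CreateBlock are hypothetical hand-written helpers (they are not part of TransientFaultHandling.Core or TPL Dataflow), and the attempt count and delay are illustrative parameters. With MaxDegreeOfParallelism set to Unbounded, an item waiting out its retry delay does not keep other items from starting.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class RetryWrapping
{
    // Hypothetical helper: runs the operation up to maxAttempts times,
    // waiting 'delay' between attempts; the last failure is not caught.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                await Task.Delay(delay).ConfigureAwait(false);
            }
        }
    }

    // Wraps the transform in RetryAsync and lifts the parallelism limit,
    // so delayed retries do not hold back newly arriving items.
    public static TransformBlock<TIn, TOut> CreateBlock<TIn, TOut>(
        Func<TIn, Task<TOut>> transform, int maxAttempts, TimeSpan delay)
    {
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
        };
        return new TransformBlock<TIn, TOut>(
            item => RetryAsync(() => transform(item), maxAttempts, delay),
            options);
    }
}
```

Note that this also removes the cap on how many transform calls run at the same time, so the transform itself would need its own throttle if that matters.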

Here are two methods CreateRetryTransformBlock and CreateRetryActionBlock that operate under these assumptions:
The caller wants all items to be processed, even if some of them have repeatedly failed.
The caller wants to know about all exceptions that occurred, even for items that eventually succeeded (not applicable to CreateRetryActionBlock).
The caller may want to set an upper limit to the number of total retries, after which the block should transition to a faulted state.
The caller wants to be able to set all available options of a normal block, including the MaxDegreeOfParallelism, BoundedCapacity, CancellationToken and EnsureOrdered, on top of the options related to the retry functionality.
The implementation below uses a SemaphoreSlim to control the level of concurrency between operations that are attempted for the first time, and previously faulted operations that are retried after their delay duration has elapsed.
public class RetryExecutionDataflowBlockOptions : ExecutionDataflowBlockOptions
{
/// <summary>The limit after which an item is returned as failed.</summary>
public int MaxAttemptsPerItem { get; set; } = 1;
/// <summary>The delay duration before retrying an item.</summary>
public TimeSpan RetryDelay { get; set; } = TimeSpan.Zero;
/// <summary>The limit after which the block transitions to a faulted
/// state (unlimited is the default).</summary>
public int MaxRetriesTotal { get; set; } = -1;
}
public readonly struct RetryResult<TInput, TOutput>
{
public readonly TInput Input { get; }
public readonly TOutput Output { get; }
public readonly bool Success { get; }
public readonly Exception[] Exceptions { get; }
public bool Failed => !Success;
public Exception FirstException => Exceptions != null ? Exceptions[0] : null;
public int Attempts =>
Exceptions != null ? Exceptions.Length + (Success ? 1 : 0) : 1;
public RetryResult(TInput input, TOutput output, bool success,
Exception[] exceptions)
{
Input = input;
Output = output;
Success = success;
Exceptions = exceptions;
}
}
public class RetryLimitException : Exception
{
public RetryLimitException(string message, Exception innerException)
: base(message, innerException) { }
}
public static IPropagatorBlock<TInput, RetryResult<TInput, TOutput>>
CreateRetryTransformBlock<TInput, TOutput>(
Func<TInput, Task<TOutput>> transform,
RetryExecutionDataflowBlockOptions dataflowBlockOptions)
{
if (transform == null) throw new ArgumentNullException(nameof(transform));
if (dataflowBlockOptions == null)
throw new ArgumentNullException(nameof(dataflowBlockOptions));
int maxAttemptsPerItem = dataflowBlockOptions.MaxAttemptsPerItem;
int maxRetriesTotal = dataflowBlockOptions.MaxRetriesTotal;
TimeSpan retryDelay = dataflowBlockOptions.RetryDelay;
if (maxAttemptsPerItem < 1) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.MaxAttemptsPerItem));
if (maxRetriesTotal < -1) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.MaxRetriesTotal));
if (retryDelay < TimeSpan.Zero) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.RetryDelay));
var cancellationToken = dataflowBlockOptions.CancellationToken;
var exceptionsCount = 0;
var semaphore = new SemaphoreSlim(
dataflowBlockOptions.MaxDegreeOfParallelism);
async Task<(TOutput, Exception)> ProcessOnceAsync(TInput item)
{
await semaphore.WaitAsync(); // Preserve the SynchronizationContext
try
{
var result = await transform(item).ConfigureAwait(false);
return (result, null);
}
catch (Exception ex)
{
if (maxRetriesTotal != -1)
{
if (Interlocked.Increment(ref exceptionsCount) > maxRetriesTotal)
{
throw new RetryLimitException($"The max retry limit " +
$"({maxRetriesTotal}) has been reached.", ex);
}
}
return (default, ex);
}
finally
{
semaphore.Release();
}
}
async Task<Task<RetryResult<TInput, TOutput>>> ProcessWithRetryAsync(
TInput item)
{
// Creates a two-stages operation. Preserves the context on every await.
var (result, firstException) = await ProcessOnceAsync(item);
if (firstException == null) return Task.FromResult(
new RetryResult<TInput, TOutput>(item, result, true, null));
return RetryStageAsync();
async Task<RetryResult<TInput, TOutput>> RetryStageAsync()
{
var exceptions = new List<Exception>();
exceptions.Add(firstException);
for (int i = 2; i <= maxAttemptsPerItem; i++)
{
await Task.Delay(retryDelay, cancellationToken);
var (result, exception) = await ProcessOnceAsync(item);
if (exception != null)
exceptions.Add(exception);
else
return new RetryResult<TInput, TOutput>(item, result,
true, exceptions.ToArray());
}
return new RetryResult<TInput, TOutput>(item, default, false,
exceptions.ToArray());
};
}
// The input block awaits the first stage of each operation
var input = new TransformBlock<TInput, Task<RetryResult<TInput, TOutput>>>(
item => ProcessWithRetryAsync(item), dataflowBlockOptions);
// The output block awaits the second (and final) stage of each operation
var output = new TransformBlock<Task<RetryResult<TInput, TOutput>>,
RetryResult<TInput, TOutput>>(t => t, dataflowBlockOptions);
input.LinkTo(output, new DataflowLinkOptions { PropagateCompletion = true });
// In case of failure ensure that the input block is faulted too,
// so that its input/output queues are emptied, and any pending
// SendAsync operations are aborted
PropagateFailure(output, input);
return DataflowBlock.Encapsulate(input, output);
async void PropagateFailure(IDataflowBlock block1, IDataflowBlock block2)
{
try { await block1.Completion.ConfigureAwait(false); }
catch (Exception ex) { block2.Fault(ex); }
}
}
public static ITargetBlock<TInput> CreateRetryActionBlock<TInput>(
Func<TInput, Task> action,
RetryExecutionDataflowBlockOptions dataflowBlockOptions)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var block = CreateRetryTransformBlock<TInput, object>(async input =>
{
await action(input).ConfigureAwait(false); return null;
}, dataflowBlockOptions);
var nullTarget = DataflowBlock.NullTarget<RetryResult<TInput, object>>();
block.LinkTo(nullTarget);
return block;
}
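The role of the SemaphoreSlim can be seen in isolation. In the implementation above each attempt holds the semaphore only while the delegate runs, so the time spent in Task.Delay between attempts does not occupy a concurrency slot. The sketch below shows just the gating part; Throttle and RunAsync are hypothetical names, and the peak counter exists only to make the limit observable.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs all operations but lets at most maxConcurrency execute at once.
    // Returns the highest number of operations observed running together.
    public static async Task<int> RunAsync(
        Func<Task>[] operations, int maxConcurrency)
    {
        var semaphore = new SemaphoreSlim(maxConcurrency);
        var gate = new object();
        int running = 0, peak = 0;

        async Task RunOneAsync(Func<Task> operation)
        {
            await semaphore.WaitAsync().ConfigureAwait(false);
            try
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }
                await operation().ConfigureAwait(false);
            }
            finally
            {
                lock (gate) running--;
                semaphore.Release();
            }
        }

        await Task.WhenAll(operations.Select(op => RunOneAsync(op)))
            .ConfigureAwait(false);
        return peak;
    }
}
```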

Related

TPL Dataflow, how to discard previous (first) messages if BoundedCapacity is full [duplicate]

I'm trying to create some sort of queue that will process the N latest messages received. Right now I have this:
private static void SetupMessaging()
{
_messagingBroadcastBlock = new BroadcastBlock<string>(msg => msg, new ExecutionDataflowBlockOptions
{
//BoundedCapacity = 1,
EnsureOrdered = true,
MaxDegreeOfParallelism = 1,
MaxMessagesPerTask = 1
});
_messagingActionBlock = new ActionBlock<string>(msg =>
{
Console.WriteLine(msg);
Thread.Sleep(5000);
}, new ExecutionDataflowBlockOptions
{
BoundedCapacity = 2,
EnsureOrdered = true,
MaxDegreeOfParallelism = 1,
MaxMessagesPerTask = 1
});
_messagingBroadcastBlock.LinkTo(_messagingActionBlock, new DataflowLinkOptions { PropagateCompletion = true });
_messagingBroadcastBlock.LinkTo(DataflowBlock.NullTarget<string>());
}
The problem is that if I post 1,2,3,4,5 to it I get 1,2,5, but I'd like it to be 1,4,5. Any suggestions are welcome.
UPD 1
I was able to make the following solution work
class FixedCapacityActionBlock<T>
{
private readonly ActionBlock<CancellableMessage<T>> _actionBlock;
private readonly ConcurrentQueue<CancellableMessage<T>> _inputCollection = new ConcurrentQueue<CancellableMessage<T>>();
private readonly int _maxQueueSize;
private readonly object _syncRoot = new object();
public FixedCapacityActionBlock(Action<T> act, ExecutionDataflowBlockOptions opt)
{
var options = new ExecutionDataflowBlockOptions
{
EnsureOrdered = opt.EnsureOrdered,
CancellationToken = opt.CancellationToken,
MaxDegreeOfParallelism = opt.MaxDegreeOfParallelism,
MaxMessagesPerTask = opt.MaxMessagesPerTask,
NameFormat = opt.NameFormat,
SingleProducerConstrained = opt.SingleProducerConstrained,
TaskScheduler = opt.TaskScheduler,
//we intentionally ignore this value
//BoundedCapacity = opt.BoundedCapacity
};
_actionBlock = new ActionBlock<CancellableMessage<T>>(cmsg =>
{
if (cmsg.CancellationTokenSource.IsCancellationRequested)
{
return;
}
act(cmsg.Message);
}, options);
_maxQueueSize = opt.BoundedCapacity;
}
public bool Post(T msg)
{
var fullMsg = new CancellableMessage<T>(msg);
//what if next task starts here?
lock (_syncRoot)
{
_inputCollection.Enqueue(fullMsg);
var itemsToDrop = _inputCollection.Skip(1).Except(_inputCollection.Skip(_inputCollection.Count - _maxQueueSize + 1));
foreach (var item in itemsToDrop)
{
item.CancellationTokenSource.Cancel();
CancellableMessage<T> temp;
_inputCollection.TryDequeue(out temp);
}
return _actionBlock.Post(fullMsg);
}
}
}
And
class CancellableMessage<T> : IDisposable
{
public CancellationTokenSource CancellationTokenSource { get; set; }
public T Message { get; set; }
public CancellableMessage(T msg)
{
CancellationTokenSource = new CancellationTokenSource();
Message = msg;
}
public void Dispose()
{
CancellationTokenSource?.Dispose();
}
}
While this works and does the job, the implementation looks dirty and is possibly not thread-safe.
Here is a TransformBlock and ActionBlock implementation that drops the oldest messages in its queue whenever newer messages are received and the BoundedCapacity limit has been reached. It behaves much like a Channel configured with BoundedChannelFullMode.DropOldest.
public static IPropagatorBlock<TInput, TOutput>
CreateTransformBlockDropOldest<TInput, TOutput>(
Func<TInput, Task<TOutput>> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
{
if (transform == null) throw new ArgumentNullException(nameof(transform));
dataflowBlockOptions = dataflowBlockOptions ?? new ExecutionDataflowBlockOptions();
var boundedCapacity = dataflowBlockOptions.BoundedCapacity;
var cancellationToken = dataflowBlockOptions.CancellationToken;
var queue = new Queue<TInput>(Math.Max(0, boundedCapacity));
var outputBlock = new BufferBlock<TOutput>(new DataflowBlockOptions()
{
BoundedCapacity = boundedCapacity,
CancellationToken = cancellationToken
});
if (boundedCapacity != DataflowBlockOptions.Unbounded)
dataflowBlockOptions.BoundedCapacity = checked(boundedCapacity * 2);
// After testing, at least boundedCapacity + 1 is required.
// Make it double to be sure that all non-dropped messages will be processed.
var transformBlock = new ActionBlock<object>(async _ =>
{
TInput item;
lock (queue)
{
if (queue.Count == 0) return;
item = queue.Dequeue();
}
var result = await transform(item).ConfigureAwait(false);
await outputBlock.SendAsync(result, cancellationToken).ConfigureAwait(false);
}, dataflowBlockOptions);
dataflowBlockOptions.BoundedCapacity = boundedCapacity; // Restore initial value
var inputBlock = new ActionBlock<TInput>(item =>
{
var droppedEntry = (Exists: false, Item: (TInput)default);
lock (queue)
{
transformBlock.Post(null);
if (queue.Count == boundedCapacity) droppedEntry = (true, queue.Dequeue());
queue.Enqueue(item);
}
if (droppedEntry.Exists) droppedMessages?.Report(droppedEntry.Item);
}, new ExecutionDataflowBlockOptions()
{
CancellationToken = cancellationToken
});
PropagateCompletion(inputBlock, transformBlock);
PropagateFailure(transformBlock, inputBlock);
PropagateCompletion(transformBlock, outputBlock);
_ = transformBlock.Completion.ContinueWith(_ => { lock (queue) queue.Clear(); },
TaskScheduler.Default);
return DataflowBlock.Encapsulate(inputBlock, outputBlock);
async void PropagateCompletion(IDataflowBlock source, IDataflowBlock target)
{
try { await source.Completion.ConfigureAwait(false); } catch { }
var exception = source.Completion.IsFaulted ? source.Completion.Exception : null;
if (exception != null) target.Fault(exception); else target.Complete();
}
async void PropagateFailure(IDataflowBlock source, IDataflowBlock target)
{
try { await source.Completion.ConfigureAwait(false); } catch { }
if (source.Completion.IsFaulted) target.Fault(source.Completion.Exception);
}
}
// Overload with synchronous lambda
public static IPropagatorBlock<TInput, TOutput>
CreateTransformBlockDropOldest<TInput, TOutput>(
Func<TInput, TOutput> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
{
return CreateTransformBlockDropOldest(item => Task.FromResult(transform(item)),
dataflowBlockOptions, droppedMessages);
}
// ActionBlock equivalent
public static ITargetBlock<TInput>
CreateActionBlockDropOldest<TInput>(
Func<TInput, Task> action,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var block = CreateTransformBlockDropOldest<TInput, object>(
async item => { await action(item).ConfigureAwait(false); return null; },
dataflowBlockOptions, droppedMessages);
block.LinkTo(DataflowBlock.NullTarget<object>());
return block;
}
// ActionBlock equivalent with synchronous lambda
public static ITargetBlock<TInput>
CreateActionBlockDropOldest<TInput>(
Action<TInput> action,
ExecutionDataflowBlockOptions dataflowBlockOptions = null,
IProgress<TInput> droppedMessages = null)
{
return CreateActionBlockDropOldest(
item => { action(item); return Task.CompletedTask; },
dataflowBlockOptions, droppedMessages);
}
The idea is to store the queued items in an auxiliary Queue, and pass dummy (null) values to an internal ActionBlock<object>. The block ignores the items passed as arguments and instead takes an item from the queue, if there is one. A lock is used to ensure that all non-dropped items in the queue will eventually be processed (unless, of course, an exception occurs).
There is also an extra feature: an optional IProgress<TInput> droppedMessages argument lets the caller receive a notification every time a message is dropped.
Usage example:
_messagingActionBlock = CreateActionBlockDropOldest<string>(msg =>
{
Console.WriteLine($"Processing: {msg}");
Thread.Sleep(5000);
}, new ExecutionDataflowBlockOptions
{
BoundedCapacity = 2,
}, new Progress<string>(msg =>
{
Console.WriteLine($"Message dropped: {msg}");
}));
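For comparison, the Channel behaviour that these blocks imitate can be sketched with System.Threading.Channels. DropOldestDemo and KeepLatest are hypothetical names used only for this illustration; the channel options are the real BoundedChannelOptions and BoundedChannelFullMode.DropOldest.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

public static class DropOldestDemo
{
    // Writes the items into a bounded channel that evicts its oldest entry
    // when full, then returns whatever is still buffered.
    public static List<int> KeepLatest(IEnumerable<int> items, int capacity)
    {
        var channel = Channel.CreateBounded<int>(
            new BoundedChannelOptions(capacity)
            {
                FullMode = BoundedChannelFullMode.DropOldest
            });

        // With DropOldest, TryWrite makes room by discarding the oldest item.
        foreach (var item in items) channel.Writer.TryWrite(item);
        channel.Writer.Complete();

        var kept = new List<int>();
        while (channel.Reader.TryRead(out var item)) kept.Add(item);
        return kept;
    }
}
```

With nothing reading while the five items are written, KeepLatest(new[] { 1, 2, 3, 4, 5 }, 2) returns 4 and 5. The blocks above differ in that an item already being processed (1 in the question) has left the queue and is therefore not dropped.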
TPL Dataflow doesn't fit the "last N messages" scenario well, as it's meant to be a queue or pipeline (FIFO), not a stack (LIFO). Do you really need to do this with a dataflow library?
It's much easier with ConcurrentStack<T>: you introduce one producer task, which pushes to the stack, and one consumer task, which pops messages from the stack while the number of handled messages is less than N (see the Producer-Consumer pattern).
If you need TPL Dataflow, you can use it in the consumer task to start handling the latest messages, but not in the producer, as that's really not the way it was meant to be used. Moreover, there are other libraries with an event-based architecture that may fit your problem more naturally.
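A minimal sketch of the consumer side of the ConcurrentStack<T> idea follows. LatestFirst and TakeLatest are hypothetical names; the only library call assumed is ConcurrentStack<T>.TryPopRange, which pops several items at once with the most recently pushed item first.

```csharp
using System;
using System.Collections.Concurrent;

public static class LatestFirst
{
    // Pops up to n of the most recently pushed items, newest first.
    public static T[] TakeLatest<T>(ConcurrentStack<T> stack, int n)
    {
        if (n <= 0) return Array.Empty<T>();
        var buffer = new T[n];
        int count = stack.TryPopRange(buffer);
        Array.Resize(ref buffer, count); // fewer than n items may exist
        return buffer;
    }
}
```

Older items stay on the stack after the call, so a consumer that only cares about the latest N would also have to discard them, for example with stack.Clear().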

Handle exceptions with TPL Dataflow blocks

I have a simple TPL Dataflow pipeline that basically runs some tasks.
I noticed that when an exception is thrown in any of the blocks, it isn't caught in the method that set up and started the pipeline.
I have added some manual code to check for exceptions, but it doesn't seem like the right approach.
if (readBlock.Completion.Exception != null
|| saveBlockJoinedProcess.Completion.Exception != null
|| processBlock1.Completion.Exception != null
|| processBlock2.Completion.Exception != null)
{
throw readBlock.Completion.Exception;
}
I had a look online to see what's a suggested approach but didn't see anything obvious.
So I created some sample code below and was hoping to get some guidance on a better solution:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace TPLDataflow
{
class Program
{
static void Main(string[] args)
{
try
{
//ProcessB();
ProcessA();
}
catch (Exception e)
{
Console.WriteLine("Exception in Process!");
throw new Exception($"exception:{e}");
}
Console.WriteLine("Processing complete!");
Console.ReadLine();
}
private static void ProcessB()
{
Task.WhenAll(Task.Run(() => DoSomething(1, "ProcessB"))).Wait();
}
private static void ProcessA()
{
var random = new Random();
var readBlock = new TransformBlock<int, int>(x =>
{
try { return DoSomething(x, "readBlock"); }
catch (Exception e) { throw e; }
}); //1
var braodcastBlock = new BroadcastBlock<int>(i => i); // ⬅ Here
var processBlock1 = new TransformBlock<int, int>(x =>
DoSomethingAsync(5, "processBlock1")); //2
var processBlock2 = new TransformBlock<int, int>(x =>
DoSomethingAsync(2, "processBlock2")); //3
//var saveBlock =
// new ActionBlock<int>(
// x => Save(x)); //4
var saveBlockJoinedProcess =
new ActionBlock<Tuple<int, int>>(
x => SaveJoined(x.Item1, x.Item2)); //4
var saveBlockJoin = new JoinBlock<int, int>();
readBlock.LinkTo(braodcastBlock, new DataflowLinkOptions
{ PropagateCompletion = true });
braodcastBlock.LinkTo(processBlock1,
new DataflowLinkOptions { PropagateCompletion = true }); //5
braodcastBlock.LinkTo(processBlock2,
new DataflowLinkOptions { PropagateCompletion = true }); //6
processBlock1.LinkTo(
saveBlockJoin.Target1); //7
processBlock2.LinkTo(
saveBlockJoin.Target2); //8
saveBlockJoin.LinkTo(saveBlockJoinedProcess,
new DataflowLinkOptions { PropagateCompletion = true });
readBlock.Post(1); //10
//readBlock.Post(2); //10
Task.WhenAll(processBlock1.Completion,processBlock2.Completion)
.ContinueWith(_ => saveBlockJoin.Complete());
readBlock.Complete(); //12
saveBlockJoinedProcess.Completion.Wait(); //13
if (readBlock.Completion.Exception != null
|| saveBlockJoinedProcess.Completion.Exception != null
|| processBlock1.Completion.Exception != null
|| processBlock2.Completion.Exception != null)
{
throw readBlock.Completion.Exception;
}
}
private static int DoSomething(int i, string method)
{
Console.WriteLine($"Do Something, callng method : { method}");
throw new Exception("Fake Exception!");
return i;
}
private static async Task<int> DoSomethingAsync(int i, string method)
{
Console.WriteLine($"Do SomethingAsync");
throw new Exception("Fake Exception!");
await Task.Delay(new TimeSpan(0, 0, i));
Console.WriteLine($"Do Something : {i}, callng method : { method}");
return i;
}
private static void Save(int x)
{
Console.WriteLine("Save!");
}
private static void SaveJoined(int x, int y)
{
Thread.Sleep(new TimeSpan(0, 0, 10));
Console.WriteLine("Save Joined!");
}
}
}
I had a look online to see what's a suggested approach but didn't see anything obvious.
If you have a pipeline (more or less), then the common approach is to use PropagateCompletion to shut down the pipe. If you have more complex topologies, then you would need to complete blocks by hand.
In your case, you have an attempted propagation here:
Task.WhenAll(
processBlock1.Completion,
processBlock2.Completion)
.ContinueWith(_ => saveBlockJoin.Complete());
But this code will not propagate exceptions. When both processBlock1.Completion and processBlock2.Completion complete, saveBlockJoin is completed successfully.
A better solution would be to use await instead of ContinueWith:
async Task PropagateToSaveBlockJoin()
{
try
{
await Task.WhenAll(processBlock1.Completion, processBlock2.Completion);
saveBlockJoin.Complete();
}
catch (Exception ex)
{
((IDataflowBlock)saveBlockJoin).Fault(ex);
}
}
_ = PropagateToSaveBlockJoin();
Using await encourages you to handle exceptions, which you can do by passing them to Fault to propagate the exception.
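The same pattern can be pulled into a reusable helper. The sketch below is one possible shape, not an API of TPL Dataflow: JoinCompletion and PropagateAsync are hypothetical names, and the helper simply completes the target when all sources finish, or faults it with the first exception that await rethrows.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class JoinCompletion
{
    // Hypothetical helper: waits for every source block to finish, then
    // completes the target, or faults it if any source failed.
    public static async Task PropagateAsync(
        IDataflowBlock target, params IDataflowBlock[] sources)
    {
        try
        {
            await Task.WhenAll(sources.Select(s => s.Completion))
                .ConfigureAwait(false);
            target.Complete();
        }
        catch (Exception ex)
        {
            // A canceled source also ends up here and faults the target.
            target.Fault(ex);
        }
    }
}
```

Under these assumptions the call for the question's mesh would be _ = JoinCompletion.PropagateAsync(saveBlockJoin, processBlock1, processBlock2);, and because saveBlockJoin is linked with PropagateCompletion, the fault then reaches saveBlockJoinedProcess.Completion.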
Propagating errors backward in the pipeline is not supported in TPL Dataflow out of the box, which is especially annoying when the blocks have a bounded capacity. In that case an error in a downstream block may cause the blocks in front of it to block indefinitely. The only solution I know is to use the cancellation feature and cancel all blocks if any one of them fails. Here is how it can be done. First create a CancellationTokenSource:
var cts = new CancellationTokenSource();
Then create the blocks one by one, embedding the same CancellationToken in the options of all of them:
var options = new ExecutionDataflowBlockOptions()
{ BoundedCapacity = 10, CancellationToken = cts.Token };
var block1 = new TransformBlock<double, double>(Math.Sqrt, options);
var block2 = new ActionBlock<double>(Console.WriteLine, options);
Then link the blocks together, including the PropagateCompletion setting:
block1.LinkTo(block2, new DataflowLinkOptions { PropagateCompletion = true });
Finally use an extension method to trigger the cancellation of the CancellationTokenSource in case of an exception:
block1.OnFaultedCancel(cts);
block2.OnFaultedCancel(cts);
The OnFaultedCancel extension method is shown below:
public static class DataflowExtensions
{
public static void OnFaultedCancel(this IDataflowBlock dataflowBlock,
CancellationTokenSource cts)
{
dataflowBlock.Completion.ContinueWith(_ => cts.Cancel(), default,
TaskContinuationOptions.OnlyOnFaulted |
TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
}
}
At first glance I have only some minor points (not looking at your architecture). It seems to me that you have mixed some newer and some older constructs, and some parts of the code are unnecessary.
For example:
private static void ProcessB()
{
Task.WhenAll(Task.Run(() => DoSomething(1, "ProcessB"))).Wait();
}
With the Wait() method, any exceptions that occur are wrapped in a System.AggregateException. In my opinion, this is better:
private static async Task ProcessBAsync()
{
await Task.Run(() => DoSomething(1, "ProcessB"));
}
With async-await, if an exception occurs, the await statement rethrows the first exception wrapped in the System.AggregateException. This allows you to try-catch concrete exception types and handle only the cases you really can handle.
Another thing is this part of your code:
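The difference between the two can be made visible with a small sketch. ExceptionShapes and its methods are hypothetical names; FailAsync stands in for the question's failing DoSomething call.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionShapes
{
    // Stand-in for a failing operation.
    private static async Task FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("Fake Exception!");
    }

    // Blocking wait: the failure surfaces wrapped in an AggregateException.
    public static Type CaughtByWait()
    {
        try { FailAsync().Wait(); return null; }
        catch (Exception ex) { return ex.GetType(); }
    }

    // await: the original exception is rethrown directly.
    public static async Task<Type> CaughtByAwaitAsync()
    {
        try { await FailAsync(); return null; }
        catch (Exception ex) { return ex.GetType(); }
    }
}
```

CaughtByWait() returns typeof(AggregateException), while CaughtByAwaitAsync() yields typeof(InvalidOperationException), which is why a catch clause for a concrete exception type only works as expected with await.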
private static void ProcessA()
{
var random = new Random();
var readBlock = new TransformBlock<int, int>(
x =>
{
try { return DoSomething(x, "readBlock"); }
catch (Exception e)
{
throw e;
}
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 }); //1
Why catch an exception only to rethrow it? In this case, the try-catch is redundant. Note also that throw e; resets the stack trace, whereas a bare throw; preserves it.
And this here:
private static void SaveJoined(int x, int y)
{
Thread.Sleep(new TimeSpan(0, 0, 10));
Console.WriteLine("Save Joined!");
}
It is much better to use await Task.Delay(...). Unlike Thread.Sleep, Task.Delay does not block the thread while waiting, so your application will not freeze.
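A minimal sketch of what that change could look like follows. AsyncSave, SaveJoinedAsync and CreateSaveBlock are hypothetical names, and the delay is a parameter here only so the sketch is easy to try out. ActionBlock accepts a delegate that returns a Task, so the block awaits the method instead of tying up a thread.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class AsyncSave
{
    // Async counterpart of SaveJoined: waits without blocking a thread.
    public static async Task SaveJoinedAsync(int x, int y, TimeSpan delay)
    {
        await Task.Delay(delay).ConfigureAwait(false);
        Console.WriteLine("Save Joined!");
    }

    // The lambda returns a Task, so the Func<TInput, Task> overload is used
    // and the block treats an item as done only when that Task completes.
    public static ActionBlock<Tuple<int, int>> CreateSaveBlock(TimeSpan delay) =>
        new ActionBlock<Tuple<int, int>>(
            pair => SaveJoinedAsync(pair.Item1, pair.Item2, delay));
}
```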

dataflowBlockOptions, droppedMessages);
}
The idea is to store the queued items in an auxiliary Queue, and pass dummy (null) values to an internal ActionBlock<object>. The block ignores the items passed as arguments and instead takes an item from the queue, if there is one. A lock is used to ensure that all non-dropped items in the queue will eventually be processed (unless, of course, an exception occurs).
There is also an extra feature: an optional IProgress<TInput> droppedMessages argument makes it possible to receive a notification every time a message is dropped.
Usage example:
_messagingActionBlock = CreateActionBlockDropOldest<string>(msg =>
{
Console.WriteLine($"Processing: {msg}");
Thread.Sleep(5000);
}, new ExecutionDataflowBlockOptions
{
BoundedCapacity = 2,
}, new Progress<string>(msg =>
{
Console.WriteLine($"Message dropped: {msg}");
}));
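For comparison, the Channel behavior that this block imitates can be sketched as follows. This is only an illustration of BoundedChannelFullMode.DropOldest from System.Threading.Channels; the class and method names (DropOldestChannelDemo, WriteThenDrain) are made up for this example and are not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

public static class DropOldestChannelDemo
{
    // Writes all items into a bounded channel that evicts the oldest
    // buffered item when it is full, then returns what is still buffered.
    public static List<int> WriteThenDrain(int capacity, IEnumerable<int> items)
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.DropOldest
        });

        foreach (var item in items)
            channel.Writer.TryWrite(item); // Evicts the oldest item when full
        channel.Writer.Complete();

        // Buffered items can still be read after the writer has completed.
        var remaining = new List<int>();
        while (channel.Reader.TryRead(out var buffered))
            remaining.Add(buffered);
        return remaining;
    }
}
```

With a capacity of 2 and the inputs 1 to 5, only the two newest items should remain in the channel, which matches what the Dataflow block does when BoundedCapacity = 2 and nothing is being consumed.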
TPL Dataflow is not a natural fit for a "last N messages" requirement, because it is designed around queues and pipelines (FIFO), not stacks (LIFO). Do you really need a dataflow library for this?
It is much easier with ConcurrentStack<T>: introduce one producer task that pushes messages onto the stack, and one consumer task that pops messages from it while the number handled is fewer than N (see the Producer-Consumer pattern).
If you do need TPL Dataflow, you can use it in the consumer task to process the latest messages, but not in the producer, as that is not how it was meant to be used. There are also other libraries with an event-based architecture that may fit your problem more naturally.
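A minimal sketch of the consumer side of the ConcurrentStack<T> idea, assuming the producer simply calls Push for every incoming message. The helper name TakeNewest is hypothetical, and what to do with the older messages left on the stack (for example, calling Clear) is left to the caller.

```csharp
using System;
using System.Collections.Concurrent;

public static class LastNMessages
{
    // Pops up to n of the most recently pushed messages, newest first.
    // Older messages stay on the stack.
    public static T[] TakeNewest<T>(ConcurrentStack<T> stack, int n)
    {
        var buffer = new T[n];
        int popped = stack.TryPopRange(buffer); // Pops at most n items atomically
        Array.Resize(ref buffer, popped);       // Trim if fewer than n were available
        return buffer;
    }
}
```

If the producer has pushed 1, 2, 3, 4, 5, then TakeNewest(stack, 3) should return 5, 4, 3, because the stack hands back the latest messages first.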

BatchBlock produces batch with elements sent after TriggerBatch()

I have a Dataflow pipeline consisting of several blocks.
When elements are flowing through my processing pipeline, I want to group them by field A. To do this I have a BatchBlock with a high BoundedCapacity. I store my elements in it until I decide that they should be released, and then I invoke the TriggerBatch() method.
private void Forward(TStronglyTyped data)
{
if (ShouldCreateNewGroup(data))
{
GroupingBlock.TriggerBatch();
}
GroupingBlock.SendAsync(data).Wait(SendTimeout);
}
This is how it looks.
The problem is that the batch produced sometimes contains the next posted element, which shouldn't be there.
To illustrate:
BatchBlock.InputQueue = {A,A,A}
NextElement = B //we should trigger a Batch!
BatchBlock.TriggerBatch()
BatchBlock.SendAsync(B);
At this point I expect my batch to be {A,A,A}, but it is {A,A,A,B}.
It is as if TriggerBatch() were asynchronous, and SendAsync was in fact executed before the batch was actually made.
How can I solve this?
I obviously don't want to put Task.Wait(x) in there (I tried, and it works, but then performance is poor, of course).
I also encountered this issue by trying to call TriggerBatch in the wrong place. As mentioned, the SlidingWindow example using DataflowBlock.Encapsulate is the answer here, but it took some time to adapt so I thought I'd share my completed block.
My ConditionalBatchBlock creates batches up to a maximum size, possibly sooner if a certain condition is met. In my specific scenario I needed to create batches of 100, but always create new batches when certain changes in the data were detected.
public static IPropagatorBlock<T, T[]> CreateConditionalBatchBlock<T>(int batchSize, Func<Queue<T>, T, bool> condition)
{
var queue = new Queue<T>();
var source = new BufferBlock<T[]>();
var target = new ActionBlock<T>(async item =>
{
// start a new batch if required by the condition
if (condition(queue, item))
{
await source.SendAsync(queue.ToArray());
queue.Clear();
}
queue.Enqueue(item);
// always send a batch when the max size has been reached
if (queue.Count == batchSize)
{
await source.SendAsync(queue.ToArray());
queue.Clear();
}
});
// send any remaining items
target.Completion.ContinueWith(async t =>
{
if (queue.Any())
await source.SendAsync(queue.ToArray());
source.Complete();
});
return DataflowBlock.Encapsulate(target, source);
}
The condition parameter may be simpler in your case. I needed to look at the queue as well as the current item to make the determination whether to create a new batch.
I used it like this:
public async Task RunExampleAsync<T>()
{
var conditionalBatchBlock = CreateConditionalBatchBlock<T>(100, (queue, currentItem) => ShouldCreateNewBatch(queue, currentItem));
var actionBlock = new ActionBlock<T[]>(async x => await PerformActionAsync(x));
conditionalBatchBlock.LinkTo(actionBlock, new DataflowLinkOptions { PropagateCompletion = true });
await ReadDataAsync<T>(conditionalBatchBlock);
await actionBlock.Completion;
}
Here is a specialized version of Loren Paulsen's CreateConditionalBatchBlock method. This one accepts a Func<TItem, TKey> keySelector argument, and emits a new batch every time an item with a different key is received.
public static IPropagatorBlock<TItem, TItem[]> CreateConditionalBatchBlock<TItem, TKey>(
Func<TItem, TKey> keySelector,
DataflowBlockOptions dataflowBlockOptions = null,
int maxBatchSize = DataflowBlockOptions.Unbounded,
IEqualityComparer<TKey> keyComparer = null)
{
if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
if (maxBatchSize < 1 && maxBatchSize != DataflowBlockOptions.Unbounded)
throw new ArgumentOutOfRangeException(nameof(maxBatchSize));
keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;
var options = new ExecutionDataflowBlockOptions();
if (dataflowBlockOptions != null)
{
options.BoundedCapacity = dataflowBlockOptions.BoundedCapacity;
options.CancellationToken = dataflowBlockOptions.CancellationToken;
options.MaxMessagesPerTask = dataflowBlockOptions.MaxMessagesPerTask;
options.TaskScheduler = dataflowBlockOptions.TaskScheduler;
}
var output = new BufferBlock<TItem[]>(options);
var queue = new Queue<TItem>(); // Synchronization is not needed
TKey previousKey = default;
var input = new ActionBlock<TItem>(async item =>
{
var key = keySelector(item);
if (queue.Count > 0 && !keyComparer.Equals(key, previousKey))
{
await output.SendAsync(queue.ToArray()).ConfigureAwait(false);
queue.Clear();
}
queue.Enqueue(item);
previousKey = key;
if (queue.Count == maxBatchSize)
{
await output.SendAsync(queue.ToArray()).ConfigureAwait(false);
queue.Clear();
}
}, options);
_ = input.Completion.ContinueWith(async t =>
{
if (queue.Count > 0)
{
await output.SendAsync(queue.ToArray()).ConfigureAwait(false);
queue.Clear();
}
if (t.IsFaulted)
{
((IDataflowBlock)output).Fault(t.Exception.InnerException);
}
else
{
output.Complete();
}
}, TaskScheduler.Default);
return DataflowBlock.Encapsulate(input, output);
}
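To make the batching rule easier to follow, here is the same logic separated from the Dataflow plumbing, as a plain synchronous iterator. This is only a sketch for illustration; the names ConsecutiveKeyBatching and BatchByKey are hypothetical, and -1 is used to mean "no maximum batch size", mirroring DataflowBlockOptions.Unbounded.

```csharp
using System;
using System.Collections.Generic;

public static class ConsecutiveKeyBatching
{
    // Groups consecutive items that share a key into arrays. A batch is
    // emitted when the key changes or when maxBatchSize is reached.
    public static IEnumerable<T[]> BatchByKey<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector, int maxBatchSize = -1)
    {
        var comparer = EqualityComparer<TKey>.Default;
        var batch = new List<T>();
        TKey previousKey = default;
        foreach (var item in source)
        {
            var key = keySelector(item);
            // The key changed: release what has been collected so far.
            if (batch.Count > 0 && !comparer.Equals(key, previousKey))
            {
                yield return batch.ToArray();
                batch.Clear();
            }
            batch.Add(item);
            previousKey = key;
            // Never true when maxBatchSize is -1 (unbounded).
            if (batch.Count == maxBatchSize)
            {
                yield return batch.ToArray();
                batch.Clear();
            }
        }
        // Flush the remaining items, like the completion continuation above.
        if (batch.Count > 0) yield return batch.ToArray();
    }
}
```

For the input A, A, A, B with the identity function as key selector, this should yield {A,A,A} followed by {B}, which is the grouping the question asks for.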

cancel async task if running

I have the following method, called on several occasions (e.g. on the KeyUp event of a text box), which asynchronously filters the items in a list box.
private async void filterCats(string category,bool deselect)
{
List<Category> tempList = new List<Category>();
//Wait for categories
var tokenSource = new CancellationTokenSource();
var token = tokenSource.Token;
//HERE,CANCEL TASK IF ALREADY RUNNING
tempList= await _filterCats(category,token);
//Show results
CAT_lb_Cats.DataSource = tempList;
CAT_lb_Cats.DisplayMember = "strCategory";
CAT_lb_Cats.ValueMember = "idCategory";
}
and the following task
private async Task<List<Category>> _filterCats(string category,CancellationToken token)
{
List<Category> result = await Task.Run(() =>
{
return getCatsByStr(category);
},token);
return result;
}
and I would like to test whether the task is already running and, if so, cancel it and start it again with the new value. I know how to cancel a task, but how can I check whether it is already running?
This is the code that I use to do this:
if (_tokenSource != null)
{
_tokenSource.Cancel();
}
_tokenSource = new CancellationTokenSource();
try
{
await loadPrestatieAsync(_bedrijfid, _projectid, _medewerkerid, _prestatieid, _startDate, _endDate, _tokenSource.Token);
}
catch (OperationCanceledException)
{
// This call was superseded by a newer one; nothing to do.
}
and the called procedure looks like this (simplified, of course):
private async Task loadPrestatieAsync(int bedrijfId, int projectid, int medewerkerid, int prestatieid,
DateTime? startDate, DateTime? endDate, CancellationToken token)
{
await Task.Delay(100, token).ConfigureAwait(true);
try{
//do stuff
token.ThrowIfCancellationRequested();
}
catch (OperationCanceledException ex)
{
throw;
}
catch (Exception Ex)
{
throw;
}
}
I add a delay of 100 ms because the same action is triggered quickly and repeatedly; a short delay of 100 ms actually makes the GUI feel more responsive.
It appears you are looking for a way to get an "autocomplete list" from text entered in a text box, where an ongoing async search is canceled when the text has changed since the search was started.
As was mentioned in the comments, Rx (Reactive Extensions) provides very nice patterns for this, allowing you to easily connect your UI elements to cancellable asynchronous tasks, build in retry logic, etc.
The program below, fewer than 90 lines long, shows a "full UI" sample (unfortunately excluding any cats ;-). It includes some reporting on the search status.
I have created this using a number of static methods in the RxAutoComplete class, to show how this is achieved in small documented steps, and how the steps can be combined to accomplish a more complex task.
namespace TryOuts
{
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Reactive.Linq;
using System.Threading;
// Simulated async search service, that can fail.
public class FakeWordSearchService
{
private static Random _rnd = new Random();
private static string[] _allWords = new[] {
"gideon", "gabby", "joan", "jessica", "bob", "bill", "sam", "johann"
};
public async Task<string[]> Search(string searchTerm, CancellationToken cancelToken)
{
await Task.Delay(_rnd.Next(600), cancelToken); // simulate async call.
if ((_rnd.Next() % 5) == 0) // roughly one search in five fails on purpose
throw new Exception(string.Format("Search for '{0}' failed on purpose", searchTerm));
return _allWords.Where(w => w.StartsWith(searchTerm)).ToArray();
}
}
public static class RxAutoComplete
{
// Returns an observable that pushes the 'txt' TextBox text when it has changed.
static IObservable<string> TextChanged(TextBox txt)
{
return from evt in Observable.FromEventPattern<EventHandler, EventArgs>(
h => txt.TextChanged += h,
h => txt.TextChanged -= h)
select ((TextBox)evt.Sender).Text.Trim();
}
// Throttles the source.
static IObservable<string> ThrottleInput(IObservable<string> source, int minTextLength, TimeSpan throttle)
{
return source
.Where(t => t.Length >= minTextLength) // Wait until we have at least 'minTextLength' characters
.Throttle(throttle) // We don't start when the user is still typing
.DistinctUntilChanged(); // We only fire, if after throttling the text is different from before.
}
// Provides search results and performs asynchronous,
// cancellable search with automatic retries on errors
static IObservable<string[]> PerformSearch(IObservable<string> source, FakeWordSearchService searchService)
{
return from term in source // term from throttled input
from result in Observable.FromAsync(async token => await searchService.Search(term, token))
.Retry(3) // Perform up to 3 tries on failure
.TakeUntil(source) // Cancel pending request if new search request was made.
select result;
}
// Putting it all together.
public static void RunUI()
{
// Our simple search GUI.
var inputTextBox = new TextBox() { Width = 300 };
var searchResultLB = new ListBox { Top = inputTextBox.Height + 10, Width = inputTextBox.Width };
var searchStatus = new Label { Top = searchResultLB.Height + 30, Width = inputTextBox.Width };
var mainForm = new Form { Controls = { inputTextBox, searchResultLB, searchStatus }, Width = inputTextBox.Width + 20 };
// Our UI update handlers.
var syncContext = SynchronizationContext.Current;
Action<Action> onUIThread = (x) => syncContext.Post(_ => x(), null);
Action<string> onSearchStarted = t => onUIThread(() => searchStatus.Text = (string.Format("searching for '{0}'.", t)));
Action<string[]> onSearchResult = w => {
searchResultLB.Items.Clear();
searchResultLB.Items.AddRange(w);
searchStatus.Text += string.Format(" {0} matches found.", w.Length > 0 ? w.Length.ToString() : "No");
};
// Connecting input to search
var input = ThrottleInput(TextChanged(inputTextBox), 1, TimeSpan.FromSeconds(0.5)).Do(onSearchStarted);
var result = PerformSearch(input, new FakeWordSearchService());
// Running it
using (result.ObserveOn(syncContext).Subscribe(onSearchResult, ex => Console.WriteLine(ex)))
Application.Run(mainForm);
}
}
}
