TPL Dataflow SendAsync task never completes when block is linked - c#

I was hoping for a clean solution for throttling a specific type of producer while the consumer is busy processing, without writing a custom block of my own. I had hoped the code below would do exactly that, but once SendAsync blocks after the capacity limit is hit, its task never completes, suggesting that the postponed message is never consumed.
_block = new TransformBlock<int, string>(async i =>
{
// Send the next request to own input queue
// before processing this request, or block
// while pipeline is full.
// Do not start processing if pipeline is full!
await _block.SendAsync(i + 1);
// Process this request and pass it on to the
// next block in the pipeline.
return i.ToString();
},
// A TransformBlock has input and output buffers. Limit both,
// otherwise requests that cannot be passed on to the next
// block in the pipeline will be cached in this block's output
// buffer, never throttling this block.
new ExecutionDataflowBlockOptions { BoundedCapacity = 5 });
// This block is linked to the output of the
// transform block.
var action = new ActionBlock<string>(async i =>
{
// Do some very long processing on the transformed element.
await Task.Delay(1000);
},
// Limit buffer size, and consequently throttle previous blocks
// in the pipeline.
new ExecutionDataflowBlockOptions { BoundedCapacity = 5 });
_block.LinkTo(action);
// Start running.
_block.Post(0);
I was wondering if there is any reason why the linked ActionBlock does not consume the postponed message.

I faced the same problem as you. I didn't dig deep into the implementation of LinkTo, but I think it propagates a message only when the source block receives one. In other words, the source block may have messages sitting in its input buffer, but it will not process them until the next Post/SendAsync it receives. And that's your case.
Here is my solution, and it's working for me.
First, declare the "engine":
/// <summary>
/// Engine class (like a car engine) that produces a given count (or an infinite number) of actions.
/// </summary>
public class Engine
{
private BufferBlock<int> _bufferBlock;
/// <summary>
/// Creates a source block that produces stub data.
/// </summary>
/// <param name="count">Count of actions. If count = 0 then it is an infinite loop.</param>
/// <param name="boundedCapacity">Bounded capacity (throttling).</param>
/// <param name="cancellationToken">Cancellation token (used to stop the infinite loop).</param>
/// <returns>Source block that constantly produces the value 0.</returns>
public ISourceBlock<int> CreateEngine(int count, int boundedCapacity, CancellationToken cancellationToken)
{
_bufferBlock = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = boundedCapacity });
Task.Run(async () =>
{
var counter = 0;
while (count == 0 || counter < count)
{
await _bufferBlock.SendAsync(0);
if (cancellationToken.IsCancellationRequested)
return;
counter++;
}
}, cancellationToken).ContinueWith((task) =>
{
_bufferBlock.Complete();
});
return _bufferBlock;
}
}
And then a Producer that uses the engine:
/// <summary>
/// Producer that generates random byte blobs with specified size.
/// </summary>
public class Producer
{
private static Random random = new Random();
/// <summary>
/// Returns a source block that produces byte arrays.
/// </summary>
/// <param name="blobSize">Size of byte arrays.</param>
/// <param name="count">Total count of blobs (if 0 then infinite).</param>
/// <param name="boundedCapacity">Bounded capacity (throttling).</param>
/// <param name="cancellationToken">Cancellation token (used to stop infinite loop).</param>
/// <returns>Source block.</returns>
public static ISourceBlock<byte[]> BlobsSourceBlock(int blobSize, int count, int boundedCapacity, CancellationToken cancellationToken)
{
// Creating engine with specified bounded capacity.
var engine = new Engine().CreateEngine(count, boundedCapacity, cancellationToken);
// Creating a transform block that uses our engine as a source.
var block = new TransformBlock<int, byte[]>(
// Useful work.
i => CreateBlob(blobSize),
new ExecutionDataflowBlockOptions
{
// Here you can specify your own throttling.
BoundedCapacity = boundedCapacity,
MaxDegreeOfParallelism = Environment.ProcessorCount,
});
// Linking engine (and engine is already working at that time).
engine.LinkTo(block, new DataflowLinkOptions { PropagateCompletion = true });
return block;
}
/// <summary>
/// Simple random byte[] generator.
/// </summary>
/// <param name="size">Array size.</param>
/// <returns>byte[]</returns>
private static byte[] CreateBlob(int size)
{
var buffer = new byte[size];
random.NextBytes(buffer);
return buffer;
}
}
Now you can use the producer with a consumer (e.g. an ActionBlock):
var blobsProducer = Producer.BlobsSourceBlock(1024 * 1024, 0, 10, cancellationTokenSource.Token);
var md5Hash = MD5.Create();
var actionBlock = new ActionBlock<byte[]>(b =>
{
Console.WriteLine(GetMd5Hash(md5Hash, b));
},
new ExecutionDataflowBlockOptions() { BoundedCapacity = 10 });
blobsProducer.LinkTo(actionBlock);
Hope it will help you!

Related

RateGate class to Polly policy

I'm trying to replace the RateGate logic with a Polly policy. However, there is no status code or anything, and I'm not sure whether it's possible to achieve the same idea with Polly.
Usage
// Binance allows 5 messages per second, but we still get rate limited if we send a lot of messages at that rate
// By sending 3 messages per second, evenly spaced out, we can keep sending messages without being limited
private readonly RateGate _webSocketRateLimiter = new RateGate(1, TimeSpan.FromMilliseconds(330));
private void Send(IWebSocket webSocket, object obj)
{
var json = JsonConvert.SerializeObject(obj);
_webSocketRateLimiter.WaitToProceed();
Log.Trace("Send: " + json);
webSocket.Send(json);
}
RateGate class
public class RateGate : IDisposable
{
// Semaphore used to count and limit the number of occurrences per
// unit time.
private readonly SemaphoreSlim _semaphore;
// Times (in millisecond ticks) at which the semaphore should be exited.
private readonly ConcurrentQueue<int> _exitTimes;
// Timer used to trigger exiting the semaphore.
private readonly Timer _exitTimer;
// Whether this instance is disposed.
private bool _isDisposed;
/// <summary>
/// Number of occurrences allowed per unit of time.
/// </summary>
public int Occurrences
{
get; private set;
}
/// <summary>
/// The length of the time unit, in milliseconds.
/// </summary>
public int TimeUnitMilliseconds
{
get; private set;
}
/// <summary>
/// Flag indicating we are currently being rate limited
/// </summary>
public bool IsRateLimited
{
get { return !WaitToProceed(0); }
}
/// <summary>
/// Initializes a <see cref="RateGate"/> with a rate of <paramref name="occurrences"/>
/// per <paramref name="timeUnit"/>.
/// </summary>
/// <param name="occurrences">Number of occurrences allowed per unit of time.</param>
/// <param name="timeUnit">Length of the time unit.</param>
/// <exception cref="ArgumentOutOfRangeException">
/// If <paramref name="occurrences"/> or <paramref name="timeUnit"/> is negative.
/// </exception>
public RateGate(int occurrences, TimeSpan timeUnit)
{
// Check the arguments.
if (occurrences <= 0)
throw new ArgumentOutOfRangeException(nameof(occurrences), "Number of occurrences must be a positive integer");
if (timeUnit != timeUnit.Duration())
throw new ArgumentOutOfRangeException(nameof(timeUnit), "Time unit must be a positive span of time");
if (timeUnit >= TimeSpan.FromMilliseconds(UInt32.MaxValue))
throw new ArgumentOutOfRangeException(nameof(timeUnit), "Time unit must be less than 2^32 milliseconds");
Occurrences = occurrences;
TimeUnitMilliseconds = (int)timeUnit.TotalMilliseconds;
// Create the semaphore, with the number of occurrences as the maximum count.
_semaphore = new SemaphoreSlim(Occurrences, Occurrences);
// Create a queue to hold the semaphore exit times.
_exitTimes = new ConcurrentQueue<int>();
// Create a timer to exit the semaphore. Use the time unit as the original
// interval length because that's the earliest we will need to exit the semaphore.
_exitTimer = new Timer(ExitTimerCallback, null, TimeUnitMilliseconds, -1);
}
// Callback for the exit timer that exits the semaphore based on exit times
// in the queue and then sets the timer for the next exit time.
// Credit to Jim: http://www.jackleitch.net/2010/10/better-rate-limiting-with-dot-net/#comment-3620
// for providing the code below, fixing issue #3499 - https://github.com/QuantConnect/Lean/issues/3499
private void ExitTimerCallback(object state)
{
try
{
// While there are exit times in the queue that are past due,
// exit the semaphore and dequeue the exit time.
var exitTime = 0;
var exitTimeValid = _exitTimes.TryPeek(out exitTime);
while (exitTimeValid)
{
if (unchecked(exitTime - Environment.TickCount) > 0)
{
break;
}
_semaphore.Release();
_exitTimes.TryDequeue(out exitTime);
exitTimeValid = _exitTimes.TryPeek(out exitTime);
}
// We are already holding the next item from the queue, so do not peek again,
// although this exit time may have already passed by this statement.
var timeUntilNextCheck = exitTimeValid
? Math.Min(TimeUnitMilliseconds, Math.Max(0, exitTime - Environment.TickCount))
: TimeUnitMilliseconds;
_exitTimer.Change(timeUntilNextCheck, -1);
}
catch (Exception)
{
// can throw if called when disposing
}
}
/// <summary>
/// Blocks the current thread until allowed to proceed or until the
/// specified timeout elapses.
/// </summary>
/// <param name="millisecondsTimeout">Number of milliseconds to wait, or -1 to wait indefinitely.</param>
/// <returns>true if the thread is allowed to proceed, or false if timed out</returns>
public bool WaitToProceed(int millisecondsTimeout)
{
// Check the arguments.
if (millisecondsTimeout < -1)
throw new ArgumentOutOfRangeException(nameof(millisecondsTimeout));
CheckDisposed();
// Block until we can enter the semaphore or until the timeout expires.
var entered = _semaphore.Wait(millisecondsTimeout);
// If we entered the semaphore, compute the corresponding exit time
// and add it to the queue.
if (entered)
{
var timeToExit = unchecked(Environment.TickCount + TimeUnitMilliseconds);
_exitTimes.Enqueue(timeToExit);
}
return entered;
}
/// <summary>
/// Blocks the current thread until allowed to proceed or until the
/// specified timeout elapses.
/// </summary>
/// <param name="timeout"></param>
/// <returns>true if the thread is allowed to proceed, or false if timed out</returns>
public bool WaitToProceed(TimeSpan timeout)
{
return WaitToProceed((int)timeout.TotalMilliseconds);
}
/// <summary>
/// Blocks the current thread indefinitely until allowed to proceed.
/// </summary>
public void WaitToProceed()
{
WaitToProceed(Timeout.Infinite);
}
// Throws an ObjectDisposedException if this object is disposed.
private void CheckDisposed()
{
if (_isDisposed)
throw new ObjectDisposedException("RateGate is already disposed");
}
/// <summary>
/// Releases unmanaged resources held by an instance of this class.
/// </summary>
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
/// <summary>
/// Releases unmanaged resources held by an instance of this class.
/// </summary>
/// <param name="isDisposing">Whether this object is being disposed.</param>
protected virtual void Dispose(bool isDisposing)
{
if (!_isDisposed)
{
if (isDisposing)
{
// The semaphore and timer both implement IDisposable and
// therefore must be disposed.
_semaphore.Dispose();
_exitTimer.Dispose();
_isDisposed = true;
}
}
}
}
GitHub source code
Rate gate
Disclaimer: I haven't used this component so what I describe here is what I understand from the code.
It is an intrusive policy, which means it modifies/alters the execution/data flow in order to slow down a fast producer or smooth out bursts. It blocks the flow to avoid resource abuse.
Here you can specify the "sleep duration" between subsequent calls, which is enforced by the gate itself.
Polly's Rate limiter
This policy is designed to avoid resource abuse as well. That means if the consumer issues too many requests against the resource within a predefined time window, then it simply short-circuits the execution by throwing a RateLimitRejectedException.
So, if you want to allow 20 executions within 1 second
RateLimitPolicy rateLimiter = Policy
.RateLimit(20, TimeSpan.FromSeconds(1));
and you do not want to exceed the limit, you have to do the waiting yourself
rateLimiter.Execute(() =>
{
//Your Action delegate which runs <1ms
Thread.Sleep(50);
});
So, the executions should be distributed evenly across the allowed period. If your manually injected delay is shorter, let's say 10 ms, then it will throw an exception.
Conclusion
According to my understanding, both work like a proxy object: they sit between the consumer and producer to control the consumption rate.
The rate gate does that by injecting artificial delays, whereas the rate limiter short-circuits the execution if abuse is detected.
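If you want to keep RateGate's blocking behaviour while moving to Polly, one option is to catch the rejection and sleep for the interval Polly suggests before retrying. Below is a minimal sketch assuming Polly v7's rate-limit API; the RateLimitedSender class and the 3-per-second figure are illustrative, not taken from the question.
using System;
using System.Threading;
using Polly;
using Polly.RateLimit;
public class RateLimitedSender
{
    // Roughly 3 executions per second, mirroring the RateGate(1, 330 ms) from the question.
    private readonly RateLimitPolicy _rateLimiter =
        Policy.RateLimit(3, TimeSpan.FromSeconds(1));
    public void Send(Action send)
    {
        while (true)
        {
            try
            {
                // Runs the delegate if the limiter has capacity left in the current window.
                _rateLimiter.Execute(send);
                return;
            }
            catch (RateLimitRejectedException ex)
            {
                // Polly short-circuits instead of blocking, so block here for the
                // interval it suggests and then try again.
                Thread.Sleep(ex.RetryAfter);
            }
        }
    }
}
This keeps the call site blocking the same way WaitToProceed() did, at the cost of a small retry loop around Execute.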

c# run timer event every 1 milisecond

Hi, I have a question. I have a simple timed event that looks like this:
public override async Task Execute(uint timedIntervalInMs = 1)
{
timer.Interval = timedIntervalInMs;
timer.Elapsed += OnTimedEvent;
timer.AutoReset = true;
timer.Enabled = true;
}
protected override void OnTimedEvent(object source, ElapsedEventArgs evrntArgs)
{
Task.Run(async () =>
{
var message = await BuildFrame();
await sender.Send(message, null);
});
}
What it does is build a simple byte array of about 27 bytes and send it via UDP, and I want to send that message every 1 ms. But as I checked, with the timer, sending 1000 requests takes about 2-3 seconds (so about 330 frames per second), and that is not what I am aiming for. I suspect that the timer is waiting for the event to finish its work. Is this true, and can it be avoided so I can start sending a frame every ms no matter whether the previous event has finished or not?
Something like this might be quite useful: the PeriodicYield<T> function will return a sequence of results from a generator function.
These results are delivered at the end of the next full period that hasn't completed yet.
Alter SimpleGenerator to mimic whatever delay in generation you would like.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace AsynchronouslyDelayedEnumerable
{
internal class Program
{
private static int Counter;
private static async Task Main(string[] args)
{
await foreach (var value in PeriodicYield(SimpleGenerator, 1000))
{
Console.WriteLine(
$"Time\"{DateTimeOffset.UtcNow}\", Value:{value}");
}
}
private static async Task<int> SimpleGenerator()
{
await Task.Delay(1500);
return Interlocked.Increment(ref Counter);
}
/// <summary>
/// Yield a result periodically.
/// </summary>
/// <param name="generatorAsync">Some generator delegate.</param>
/// <param name="periodMilliseconds">
/// The period in milliseconds at which results should be yielded.
/// </param>
/// <param name="token">A cancellation token.</param>
/// <typeparam name="T">The type of the value to yield.</typeparam>
/// <returns>A sequence of values.</returns>
private static async IAsyncEnumerable<T> PeriodicYield<T>(
Func<Task<T>> generatorAsync,
int periodMilliseconds,
CancellationToken token = default)
{
// Set up a starting point.
var last = DateTimeOffset.UtcNow;
// Continue until cancelled.
while (!token.IsCancellationRequested)
{
// Get the next value.
var nextValue = await generatorAsync();
// Work out the end of the next whole period.
var now = DateTimeOffset.UtcNow;
var gap = (int)(now - last).TotalMilliseconds;
var head = gap % periodMilliseconds;
var tail = periodMilliseconds - head;
var next = now.AddMilliseconds(tail);
// Wait for the end of the next whole period with
// logarithmically shorter delays.
while (next >= DateTimeOffset.Now)
{
var delay = (int)(next - DateTimeOffset.Now).TotalMilliseconds;
delay = (int)Math.Max(1.0, delay * 0.1);
await Task.Delay(delay, token);
}
// Check if cancelled.
if (token.IsCancellationRequested)
{
continue;
}
// return the value and update the last time.
yield return nextValue;
last = DateTimeOffset.UtcNow;
}
}
}
}
As @harol said, Timer doesn't have such high resolution, because Windows and Linux are not real-time operating systems. It is not possible to trigger an event at a precise time; you can only trigger it at approximately that time.
Also, the operating system or your network card driver may decide to wait until the network buffer is full or reaches a specific level.
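If roughly 1 ms pacing matters more than exact timer ticks, one workaround is to drop the Timer entirely and pace a dedicated loop against a Stopwatch, firing each send without awaiting it. A rough sketch; the PacedSender class is hypothetical, and the two delegates stand in for the question's BuildFrame() and sender.Send(message, null).
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
public class PacedSender
{
    private readonly Func<Task<byte[]>> _buildFrame; // stands in for the question's BuildFrame()
    private readonly Func<byte[], Task> _send;       // stands in for sender.Send(message, null)
    public PacedSender(Func<Task<byte[]>> buildFrame, Func<byte[], Task> send)
    {
        _buildFrame = buildFrame;
        _send = send;
    }
    public async Task RunAsync(double intervalMs, CancellationToken token)
    {
        var stopwatch = Stopwatch.StartNew();
        long frameCount = 0;
        while (!token.IsCancellationRequested)
        {
            // Build the frame and fire the send without awaiting it,
            // so a slow send cannot delay the next frame.
            var message = await _buildFrame();
            _ = _send(message);
            frameCount++;
            // Spin on the stopwatch until the next frame is due. Task.Yield keeps the
            // loop cooperative while avoiding the coarse resolution of Timer/Task.Delay.
            var nextDueMs = frameCount * intervalMs;
            while (stopwatch.Elapsed.TotalMilliseconds < nextDueMs && !token.IsCancellationRequested)
            {
                await Task.Yield();
            }
        }
    }
}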

How can I read messages from a queue in parallel?

Situation
We have one message queue. We would like to process messages in parallel and limit the number of simultaneously processed messages.
Our trial code below does process messages in parallel, but it only starts a new batch of processes when the previous one is finished. We would like to restart Tasks as they finish.
In other words: The maximum number of Tasks should always be active as long as the message queue is not empty.
Trial code
static string queue = @".\Private$\concurrenttest";
private static void Process(CancellationToken token)
{
Task.Factory.StartNew(async () =>
{
while (true)
{
IEnumerable<Task> consumerTasks = ConsumerTasks();
await Task.WhenAll(consumerTasks);
await PeekAsync(new MessageQueue(queue));
}
});
}
private static IEnumerable<Task> ConsumerTasks()
{
for (int i = 0; i < 15; i++)
{
Command1 message;
try
{
MessageQueue msMq = new MessageQueue(queue);
msMq.Formatter = new XmlMessageFormatter(new Type[] { typeof(Command1) });
Message msg = msMq.Receive();
message = (Command1)msg.Body;
}
catch (MessageQueueException mqex)
{
if (mqex.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout)
yield break; // nothing in queue
else throw;
}
yield return Task.Run(() =>
{
Console.WriteLine("id: " + message.id + ", name: " + message.name);
Thread.Sleep(1000);
});
}
}
private static Task<Message> PeekAsync(MessageQueue msMq)
{
return Task.Factory.FromAsync<Message>(msMq.BeginPeek(), msMq.EndPeek);
}
EDIT
I spent a lot of time thinking about reliability of the pump - specifically if a message is received from the MessageQueue, cancellation becomes tricky - so I provided two ways to terminate the queue:
Signaling the CancellationToken stops the pipeline as quickly as possible and will likely result in dropped messages.
Calling MessagePump.Stop() terminates the pump but allows all messages which have already been taken from the queue to be fully processed before the MessagePump.Completion task transitions to RanToCompletion.
The solution uses TPL Dataflow (NuGet: Microsoft.Tpl.Dataflow).
Full implementation:
using System;
using System.Messaging;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace StackOverflow.Q34437298
{
/// <summary>
/// Pumps the message queue and processes messages in parallel.
/// </summary>
public sealed class MessagePump
{
/// <summary>
/// Creates a <see cref="MessagePump"/> and immediately starts pumping.
/// </summary>
public static MessagePump Run(
MessageQueue messageQueue,
Func<Message, Task> processMessage,
int maxDegreeOfParallelism,
CancellationToken ct = default(CancellationToken))
{
if (messageQueue == null) throw new ArgumentNullException(nameof(messageQueue));
if (processMessage == null) throw new ArgumentNullException(nameof(processMessage));
if (maxDegreeOfParallelism <= 0) throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));
ct.ThrowIfCancellationRequested();
return new MessagePump(messageQueue, processMessage, maxDegreeOfParallelism, ct);
}
private readonly TaskCompletionSource<bool> _stop = new TaskCompletionSource<bool>();
/// <summary>
/// <see cref="Task"/> which completes when this instance
/// stops due to a <see cref="Stop"/> or cancellation request.
/// </summary>
public Task Completion { get; }
/// <summary>
/// Maximum number of parallel message processors.
/// </summary>
public int MaxDegreeOfParallelism { get; }
/// <summary>
/// <see cref="MessageQueue"/> that is pumped by this instance.
/// </summary>
public MessageQueue MessageQueue { get; }
/// <summary>
/// Creates a new <see cref="MessagePump"/> instance.
/// </summary>
private MessagePump(MessageQueue messageQueue, Func<Message, Task> processMessage, int maxDegreeOfParallelism, CancellationToken ct)
{
MessageQueue = messageQueue;
MaxDegreeOfParallelism = maxDegreeOfParallelism;
// Kick off the loop.
Completion = RunAsync(processMessage, ct);
}
/// <summary>
/// Soft-terminates the pump so that no more messages will be pumped.
/// Any messages already removed from the message queue will be
/// processed before this instance fully completes.
/// </summary>
public void Stop()
{
// Multiple calls to Stop are fine.
_stop.TrySetResult(true);
}
/// <summary>
/// Pump implementation.
/// </summary>
private async Task RunAsync(Func<Message, Task> processMessage, CancellationToken ct = default(CancellationToken))
{
using (CancellationTokenSource producerCTS = ct.CanBeCanceled
? CancellationTokenSource.CreateLinkedTokenSource(ct)
: new CancellationTokenSource())
{
// This CancellationToken will either be signaled
// externally, or if our consumer errors.
ct = producerCTS.Token;
// Handover between producer and consumer.
DataflowBlockOptions bufferOptions = new DataflowBlockOptions {
// There is no point in dequeuing more messages than we can process,
// so we'll throttle the producer by limiting the buffer capacity.
BoundedCapacity = MaxDegreeOfParallelism,
CancellationToken = ct
};
BufferBlock<Message> buffer = new BufferBlock<Message>(bufferOptions);
Task producer = Task.Run(async () =>
{
try
{
while (_stop.Task.Status != TaskStatus.RanToCompletion)
{
// This line and next line are the *only* two cancellation
// points which will not cause dropped messages.
ct.ThrowIfCancellationRequested();
Task<Message> peekTask = WithCancellation(PeekAsync(MessageQueue), ct);
if (await Task.WhenAny(peekTask, _stop.Task).ConfigureAwait(false) == _stop.Task)
{
// Stop was signaled before PeekAsync returned. Wind down the producer gracefully
// by breaking out and propagating completion to the consumer blocks.
break;
}
await peekTask.ConfigureAwait(false); // Observe Peek exceptions.
ct.ThrowIfCancellationRequested();
// Zero timeout means that we will error if someone else snatches the
// peeked message from the queue before we get to it (due to a race).
// I deemed this better than getting stuck waiting for a message which
// may never arrive, or, worse yet, let this ReceiveAsync run unobserved
// due to a cancellation (if we choose to abandon it like we do PeekAsync).
// You will have to restart the pump if this throws.
// Omit timeout if this behaviour is undesired.
Message message = await ReceiveAsync(MessageQueue, timeout: TimeSpan.Zero).ConfigureAwait(false);
await buffer.SendAsync(message, ct).ConfigureAwait(false);
}
}
finally
{
buffer.Complete();
}
},
ct);
// Wire up the parallel consumers.
ExecutionDataflowBlockOptions executionOptions = new ExecutionDataflowBlockOptions {
CancellationToken = ct,
MaxDegreeOfParallelism = MaxDegreeOfParallelism,
SingleProducerConstrained = true, // We don't require thread safety guarantees.
BoundedCapacity = MaxDegreeOfParallelism,
};
ActionBlock<Message> consumer = new ActionBlock<Message>(async message =>
{
ct.ThrowIfCancellationRequested();
await processMessage(message).ConfigureAwait(false);
},
executionOptions);
buffer.LinkTo(consumer, new DataflowLinkOptions { PropagateCompletion = true });
if (await Task.WhenAny(producer, consumer.Completion).ConfigureAwait(false) == consumer.Completion)
{
// If we got here, consumer probably errored. Stop the producer
// before we throw so we don't go dequeuing more messages.
producerCTS.Cancel();
}
// Task.WhenAll checks faulted tasks before checking any
// canceled tasks, so if our consumer threw a legitimate
// exception, that's what will be rethrown, not the OCE.
await Task.WhenAll(producer, consumer.Completion).ConfigureAwait(false);
}
}
/// <summary>
/// APM -> TAP conversion for MessageQueue.Begin/EndPeek.
/// </summary>
private static Task<Message> PeekAsync(MessageQueue messageQueue)
{
return Task.Factory.FromAsync(messageQueue.BeginPeek(), messageQueue.EndPeek);
}
/// <summary>
/// APM -> TAP conversion for MessageQueue.Begin/EndReceive.
/// </summary>
private static Task<Message> ReceiveAsync(MessageQueue messageQueue, TimeSpan timeout)
{
return Task.Factory.FromAsync(messageQueue.BeginReceive(timeout), messageQueue.EndReceive);
}
/// <summary>
/// Allows abandoning tasks which do not natively
/// support cancellation. Use with caution.
/// </summary>
private static async Task<T> WithCancellation<T>(Task<T> task, CancellationToken ct)
{
ct.ThrowIfCancellationRequested();
TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();
using (ct.Register(s => ((TaskCompletionSource<bool>)s).TrySetResult(true), tcs, false))
{
if (task != await Task.WhenAny(task, tcs.Task).ConfigureAwait(false))
{
// Cancellation task completed first.
// We are abandoning the original task.
throw new OperationCanceledException(ct);
}
}
// Task completed: synchronously return result or propagate exceptions.
return await task.ConfigureAwait(false);
}
}
}
Usage:
using (MessageQueue msMq = GetQueue())
{
MessagePump pump = MessagePump.Run(
msMq,
async message =>
{
await Task.Delay(50);
Console.WriteLine($"Finished processing message {message.Id}");
},
maxDegreeOfParallelism: 4
);
for (int i = 0; i < 100; i++)
{
msMq.Send(new Message());
Thread.Sleep(25);
}
pump.Stop();
await pump.Completion;
}
Untidy but functional unit tests:
https://gist.github.com/KirillShlenskiy/7f3e2c4b28b9f940c3da
ORIGINAL ANSWER
As mentioned in my comment, there are established producer/consumer patterns in .NET, one of which is pipeline. An excellent example of such can be found in "Patterns of Parallel Programming" by Microsoft's own Stephen Toub (full text here: https://www.microsoft.com/en-au/download/details.aspx?id=19222, page 55).
The idea is simple: producers continuously throw stuff in a queue, and consumers pull it out and process (in parallel to producers and possibly one another).
Here's an example of a message pipeline where the consumer uses synchronous, blocking methods to process the items as they arrive (I've parallelised the consumer to suit your scenario):
void MessageQueueWithBlockingCollection()
{
// If your processing is continuous and never stops throughout the lifetime of
// your application, you can ignore the fact that BlockingCollection is IDisposable.
using (BlockingCollection<Message> messages = new BlockingCollection<Message>())
{
Task producer = Task.Run(() =>
{
try
{
for (int i = 0; i < 10; i++)
{
// Hand over the message to the consumer.
messages.Add(new Message());
// Simulated arrival delay for the next message.
Thread.Sleep(10);
}
}
finally
{
// Notify consumer that there is no more data.
messages.CompleteAdding();
}
});
Task consumer = Task.Run(() =>
{
ParallelOptions options = new ParallelOptions {
MaxDegreeOfParallelism = 4
};
Parallel.ForEach(messages.GetConsumingEnumerable(), options, message => {
ProcessMessage(message);
});
});
Task.WaitAll(producer, consumer);
}
}
void ProcessMessage(Message message)
{
Thread.Sleep(40);
}
The above code completes in approx 130-140 ms, which is exactly what you would expect given the parallelisation of the consumers.
Now, in your scenario you are using Tasks and async/await, which are better suited to TPL Dataflow (an official Microsoft-supported library tailored to parallel and asynchronous sequence processing).
Here's a little demo showing the different types of TPL Dataflow processing blocks that you would use for the job:
async Task MessageQueueWithTPLDataflow()
{
// Set up our queue.
BufferBlock<Message> queue = new BufferBlock<Message>();
// Set up our processing stage (consumer).
ExecutionDataflowBlockOptions options = new ExecutionDataflowBlockOptions {
CancellationToken = CancellationToken.None, // Plug in your own in case you need to support cancellation.
MaxDegreeOfParallelism = 4
};
ActionBlock<Message> consumer = new ActionBlock<Message>(m => ProcessMessageAsync(m), options);
// Link the queue to the consumer.
queue.LinkTo(consumer, new DataflowLinkOptions { PropagateCompletion = true });
// Wire up our producer.
Task producer = Task.Run(async () =>
{
try
{
for (int i = 0; i < 10; i++)
{
queue.Post(new Message());
await Task.Delay(10).ConfigureAwait(false);
}
}
finally
{
// Signal to the consumer that there are no more items.
queue.Complete();
}
});
await consumer.Completion.ConfigureAwait(false);
}
Task ProcessMessageAsync(Message message)
{
return Task.Delay(40);
}
It's not hard to adapt the above to use your MessageQueue and you can be sure that the end result will be free of threading issues. I'll do just that if I get a bit more time today/tomorrow.
You have one collection of things you want to process.
You create another collection for things being processed (this could be your task objects or items of some sort that reference a task).
You create a loop that will repeat as long as you have work to do. That is, work items are waiting to be started or you still have work items being processed.
At the start of the loop you populate your active task collection with as many tasks as you want to run concurrently and you start them as you add them.
You let the things run for a while (like Thread.Sleep(10);).
You create an inner loop that checks all your started tasks for completion. If one has completed, you remove it and report the results or do whatever seems appropriate.
That's it. On the next turn the upper part of your outer loop will add tasks to your running tasks collection until the number equals the maximum you have set, keeping your work-in-progress collection full.
You may want to do all this on a worker thread and monitor cancel requests in your loop.
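A rough sketch of the loop described above; the Queue<T>, the process delegate, and the 10 ms sleep are placeholders rather than code from the question.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
public static class ThrottledRunner
{
    // Runs work items with at most maxConcurrent tasks in flight at any time.
    public static void Run<T>(Queue<T> workItems, Action<T> process, int maxConcurrent, CancellationToken token)
    {
        var running = new List<Task>();
        while ((workItems.Count > 0 || running.Count > 0) && !token.IsCancellationRequested)
        {
            // Top up the running collection until the concurrency limit is reached.
            while (running.Count < maxConcurrent && workItems.Count > 0)
            {
                var item = workItems.Dequeue();
                running.Add(Task.Run(() => process(item), token));
            }
            // Let the things run for a while.
            Thread.Sleep(10);
            // Remove any tasks that have completed; report results or handle faults here.
            for (var i = running.Count - 1; i >= 0; i--)
            {
                if (running[i].IsCompleted)
                {
                    running.RemoveAt(i);
                }
            }
        }
    }
}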
The task library in .NET is made to execute a number of tasks in parallel. While there are ways to limit the number of active tasks, the library itself will limit the number of active tasks according to the computer's CPUs.
The first question that needs to be answered is: why do you need to create another limit? If the limit imposed by the task library is OK, then you can just keep creating tasks and rely on the task library to execute them with good performance.
If that is OK, then as soon as you get a message from MSMQ, just start a task to process the message, skip the waiting (the WhenAll call), start over, and wait for the next message.
You can limit the number of concurrent tasks by using a custom task scheduler. More on MSDN: https://msdn.microsoft.com/en-us/library/system.threading.tasks.taskscheduler%28v=vs.110%29.aspx.
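For completeness, a sketch of the custom-scheduler approach. It assumes you have copied the LimitedConcurrencyLevelTaskScheduler sample class from the linked MSDN page (it is sample code, not a framework type); ProcessMessage is a placeholder for your own handler, while queue and Command1 come from the question.
// LimitedConcurrencyLevelTaskScheduler is the sample class from the MSDN page above.
var scheduler = new LimitedConcurrencyLevelTaskScheduler(4);
var factory = new TaskFactory(scheduler);
MessageQueue msMq = new MessageQueue(queue);
msMq.Formatter = new XmlMessageFormatter(new Type[] { typeof(Command1) });
while (true)
{
    // Receive blocks until a message arrives; the scheduler then caps how many
    // ProcessMessage tasks run at the same time (4 here).
    Message msg = msMq.Receive();
    factory.StartNew(() => ProcessMessage((Command1)msg.Body));
}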
My colleague came up with the solution below. This solution works, but I'll let this code be reviewed on Code Review.
Based on answers given and some research of our own, we've come to a solution. We're using a SemaphoreSlim to limit our number of parallel Tasks.
static string queue = @".\Private$\concurrenttest";
private static async Task Process(CancellationToken token)
{
MessageQueue msMq = new MessageQueue(queue);
msMq.Formatter = new XmlMessageFormatter(new Type[] { typeof(Command1) });
SemaphoreSlim s = new SemaphoreSlim(15, 15);
while (true)
{
await s.WaitAsync();
await PeekAsync(msMq);
Command1 message = await ReceiveAsync(msMq);
Task.Run(async () =>
{
try
{
await HandleMessage(message);
}
catch (Exception)
{
// Exception handling
}
s.Release();
});
}
}
private static Task HandleMessage(Command1 message)
{
Console.WriteLine("id: " + message.id + ", name: " + message.name);
Thread.Sleep(1000);
return Task.FromResult(1);
}
private static Task<Message> PeekAsync(MessageQueue msMq)
{
return Task.Factory.FromAsync<Message>(msMq.BeginPeek(), msMq.EndPeek);
}
public class Command1
{
public int id { get; set; }
public string name { get; set; }
}
private static async Task<Command1> ReceiveAsync(MessageQueue msMq)
{
var receiveAsync = await Task.Factory.FromAsync<Message>(msMq.BeginReceive(), msMq.EndReceive);
return (Command1)receiveAsync.Body;
}
You should look at using Microsoft's Reactive Framework for this.
You code could look like this:
var query =
from command1 in FromQueue<Command1>(queue)
from text in Observable.Start(() =>
{
Thread.Sleep(1000);
return "id: " + command1.id + ", name: " + command1.name;
})
select text;
var subscription =
query
.Subscribe(text => Console.WriteLine(text));
This does all of the processing in parallel, and ensures that the processing is properly distributed across all cores. When one value ends another starts.
To cancel the subscription just call subscription.Dispose().
The code for FromQueue is:
static IObservable<T> FromQueue<T>(string serverQueue)
{
return Observable.Create<T>(observer =>
{
var responseQueue = Environment.MachineName + "\\Private$\\" + Guid.NewGuid().ToString();
var queue = MessageQueue.Create(responseQueue);
var frm = new System.Messaging.BinaryMessageFormatter();
var srv = new MessageQueue(serverQueue);
srv.Formatter = frm;
queue.Formatter = frm;
srv.Send("S " + responseQueue);
var loop = NewThreadScheduler.Default.ScheduleLongRunning(cancel =>
{
while (!cancel.IsDisposed)
{
var msg = queue.Receive();
observer.OnNext((T)msg.Body);
}
});
return new CompositeDisposable(
loop,
Disposable.Create(() =>
{
srv.Send("D " + responseQueue);
MessageQueue.Delete(responseQueue);
})
);
});
}
Just NuGet "Rx-Main" to get the bits.
In order to limit the concurrency you can do this:
int maxConcurrent = 2;
var query =
FromQueue<Command1>(queue)
.Select(command1 => Observable.Start(() =>
{
Thread.Sleep(1000);
return "id: " + command1.id + ", name: " + command1.name;
}))
.Merge(maxConcurrent);

BufferBlock deadlock with OutputAvailableAsync after TryReceiveAll

While working on an answer to this question, I wrote this snippet:
var buffer = new BufferBlock<object>();
var producer = Task.Run(async () =>
{
while (true)
{
await Task.Delay(TimeSpan.FromMilliseconds(100));
buffer.Post(null);
Console.WriteLine("Post " + buffer.Count);
}
});
var consumer = Task.Run(async () =>
{
while (await buffer.OutputAvailableAsync())
{
IList<object> items;
buffer.TryReceiveAll(out items);
Console.WriteLine("TryReceiveAll " + buffer.Count);
}
});
await Task.WhenAll(consumer, producer);
The producer should post items to the buffer every 100 ms and the consumer should clear all items out of the buffer and asynchronously wait for more items to show up.
What actually happens is that the consumer clears all items once, and then never again moves beyond OutputAvailableAsync. If I switch the consumer to remove items one by one, it works as expected:
while (await buffer.OutputAvailableAsync())
{
object item;
while (buffer.TryReceive(out item)) ;
}
Am I misunderstanding something? If not, what is the problem?
This is a bug in SourceCore being used internally by BufferBlock. Its TryReceiveAll method doesn't turn on the _enableOffering boolean data member while TryReceive does. That results in the task returned from OutputAvailableAsync never completing.
Here's a minimal reproduce:
var buffer = new BufferBlock<object>();
buffer.Post(null);
IList<object> items;
buffer.TryReceiveAll(out items);
var outputAvailableAsync = buffer.OutputAvailableAsync();
buffer.Post(null);
await outputAvailableAsync; // Never completes
I've just fixed it in the .Net core repository with this pull request. Hopefully the fix finds itself in the nuget package soon.
Alas, it's the end of September 2015, and although i3arnon fixed the error, it is not solved in the version that was released two days after the fix: Microsoft TPL Dataflow version 4.5.24.
However IReceivableSourceBlock.TryReceive(...) works correctly.
An extension method will solve the problem. After a new release of TPL Dataflow it will be easy to change the extension method.
/// <summary>
/// This extension method returns all available items in the IReceivableSourceBlock
/// or an empty sequence if nothing is available. The function does not wait.
/// </summary>
/// <typeparam name="T">The type of items stored in the IReceivableSourceBlock</typeparam>
/// <param name="buffer">the source where the items should be extracted from </param>
/// <returns>The IList with the received items. Empty if no items were available</returns>
public static IList<T> TryReceiveAllEx<T>(this IReceivableSourceBlock<T> buffer)
{
/* Microsoft TPL Dataflow version 4.5.24 contains a bug in TryReceiveAll
* Hence this function uses TryReceive until nothing is available anymore
* */
IList<T> receivedItems = new List<T>();
T receivedItem = default(T);
while (buffer.TryReceive<T>(out receivedItem))
{
receivedItems.Add(receivedItem);
}
return receivedItems;
}
usage:
while (await this.bufferBlock.OutputAvailableAsync())
{
// some data available
var receivedItems = this.bufferBlock.TryReceiveAllEx();
if (receivedItems.Any())
{
ProcessReceivedItems(receivedItems);
}
}

How to properly run multiple async tasks in parallel? [duplicate]

This question already has answers here:
How to limit the amount of concurrent async I/O operations?
(11 answers)
What if you need to run multiple asynchronous I/O tasks in parallel, but need to make sure that no more than X I/O operations are running at the same time, while the pre- and post-I/O processing tasks shouldn't have such a limitation?
Here is a scenario: let's say there are 1000 tasks; each of them accepts a text string as an input parameter, transforms that text (pre-I/O processing), then writes that transformed text into a file. The goal is to make the pre-processing logic utilize 100% of the CPU/cores while the I/O portion of the tasks runs with a max degree of parallelism of 10 (at most 10 files open for writing at a time).
Can you provide a sample code how to do it with C# / .NET 4.5?
http://blogs.msdn.com/b/csharpfaq/archive/2012/01/23/using-async-for-file-access-alan-berman.aspx
I think using TPL Dataflow for this would be a good idea: you create pre- and post-process blocks with unbounded parallelism, a file-writing block with limited parallelism and link them together. Something like:
var unboundedParallelismOptions =
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
};
var preProcessBlock = new TransformBlock<string, string>(
s => PreProcess(s), unboundedParallelismOptions);
var writeToFileBlock = new TransformBlock<string, string>(
async s =>
{
await WriteToFile(s);
return s;
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });
var postProcessBlock = new ActionBlock<string>(
s => PostProcess(s), unboundedParallelismOptions);
var propagateCompletionOptions =
new DataflowLinkOptions { PropagateCompletion = true };
preProcessBlock.LinkTo(writeToFileBlock, propagateCompletionOptions);
writeToFileBlock.LinkTo(postProcessBlock, propagateCompletionOptions);
// use something like await preProcessBlock.SendAsync("text") here
preProcessBlock.Complete();
await postProcessBlock.Completion;
Where WriteToFile() could look like this:
private static async Task WriteToFile(string s)
{
using (var writer = new StreamWriter(GetFileName()))
await writer.WriteAsync(s);
}
It sounds like you'd want to consider a Dijkstra semaphore to control access to the starting of tasks.
However, this sounds like a typical queue/fixed number of consumers kind of problem, which may be a more appropriate way to structure it.
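To illustrate that shape: a bounded hand-over queue plus a fixed pool of consumer tasks gives you the 10-writer limit while the producers keep every core busy. A rough sketch; inputs, Transform, and WriteToFile are placeholders, not code from the question.
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
// Bounded hand-over queue: producers block on Add when it is full (back-pressure).
var workItems = new BlockingCollection<string>(boundedCapacity: 100);
// Exactly 10 consumers, so at most 10 files are being written at once.
var consumers = Enumerable.Range(0, 10)
    .Select(_ => Task.Run(async () =>
    {
        foreach (var item in workItems.GetConsumingEnumerable())
        {
            await WriteToFile(item); // the I/O-bound part of the work
        }
    }))
    .ToArray();
// Producer side: CPU-bound transform on all cores, then hand over to the queue.
Parallel.ForEach(inputs, text => workItems.Add(Transform(text)));
workItems.CompleteAdding();
await Task.WhenAll(consumers);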
I would create an extension method in which one can set the maximum degree of parallelism. SemaphoreSlim will be the savior here.
/// <summary>
/// Concurrently executes async actions for each item of <see cref="IEnumerable{T}"/>.
/// </summary>
/// <typeparam name="T">Type of IEnumerable</typeparam>
/// <param name="enumerable">instance of <see cref="IEnumerable{T}"/></param>
/// <param name="action">an async <see cref="Action" /> to execute</param>
/// <param name="maxDegreeOfParallelism">Optional, an integer that represents the maximum degree of parallelism.
/// Must be greater than 0.</param>
/// <returns>A Task representing an async operation</returns>
/// <exception cref="ArgumentOutOfRangeException">If maxDegreeOfParallelism is less than 1</exception>
public static async Task ForEachAsyncConcurrent<T>(
this IEnumerable<T> enumerable,
Func<T, Task> action,
int? maxDegreeOfParallelism = null)
{
if (maxDegreeOfParallelism.HasValue)
{
using (var semaphoreSlim = new SemaphoreSlim(
maxDegreeOfParallelism.Value, maxDegreeOfParallelism.Value))
{
var tasksWithThrottler = new List<Task>();
foreach (var item in enumerable)
{
// Increment the number of currently running tasks and wait if they are more than limit.
await semaphoreSlim.WaitAsync();
tasksWithThrottler.Add(Task.Run(async () =>
{
await action(item).ContinueWith(res =>
{
// action is completed, so decrement the number of currently running tasks
semaphoreSlim.Release();
});
}));
}
// Wait for all tasks to complete.
await Task.WhenAll(tasksWithThrottler.ToArray());
}
}
else
{
await Task.WhenAll(enumerable.Select(item => action(item)));
}
}
Sample Usage:
await enumerable.ForEachAsyncConcurrent(
async item =>
{
await SomeAsyncMethod(item);
},
5);
