Task sequencing and re-entrancy - C#

I've got the following scenario, which I think might be quite common:
There is a task (a UI command handler) which can complete either synchronously or asynchronously.
Commands may arrive faster than they are getting processed.
If there is already a pending task for a command, the new command handler task should be queued and processed sequentially.
Each new task's result may depend on the result of the previous task.
Cancellation should be observed, but I'd like to leave it outside the scope of this question for simplicity. Also, thread-safety (concurrency) is not a requirement, but re-entrancy must be supported.
Here's a basic example of what I'm trying to achieve (as a console app, for simplicity):
using System;
using System.Threading.Tasks;
namespace ConsoleApp
{
class Program
{
static void Main(string[] args)
{
var asyncOp = new AsyncOp<int>();
Func<int, Task<int>> handleAsync = async (arg) =>
{
Console.WriteLine("this task arg: " + arg);
//await Task.Delay(arg); // make it async
return await Task.FromResult(arg); // sync
};
Console.WriteLine("Test #1...");
asyncOp.RunAsync(() => handleAsync(1000));
asyncOp.RunAsync(() => handleAsync(900));
asyncOp.RunAsync(() => handleAsync(800));
asyncOp.CurrentTask.Wait();
Console.WriteLine("\nPress any key to continue to test #2...");
Console.ReadLine();
asyncOp.RunAsync(() =>
{
asyncOp.RunAsync(() => handleAsync(200));
return handleAsync(100);
});
asyncOp.CurrentTask.Wait();
Console.WriteLine("\nPress any key to exit...");
Console.ReadLine();
}
// AsyncOp
class AsyncOp<T>
{
Task<T> _pending = Task.FromResult(default(T));
public Task<T> CurrentTask { get { return _pending; } }
public Task<T> RunAsync(Func<Task<T>> handler)
{
var pending = _pending;
Func<Task<T>> wrapper = async () =>
{
// await the prev task
var prevResult = await pending;
Console.WriteLine("\nprev task result: " + prevResult);
// start and await the handler
return await handler();
};
_pending = wrapper();
return _pending;
}
}
}
}
The output:
Test #1...
prev task result: 0
this task arg: 1000
prev task result: 1000
this task arg: 900
prev task result: 900
this task arg: 800
Press any key to continue to test #2...
prev task result: 800
prev task result: 800
this task arg: 200
this task arg: 100
Press any key to exit...
It works in accordance with the requirements, until re-entrancy is introduced in test #2:
asyncOp.RunAsync(() =>
{
asyncOp.RunAsync(() => handleAsync(200));
return handleAsync(100);
});
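The desired output should be 100, 200, rather than 200, 100, because there's already a pending outer task for 100. The order is reversed because the inner RunAsync call executes synchronously inside the outer wrapper, before the outer call has reached _pending = wrapper(). The inner call therefore captures the old _pending in var pending = _pending and runs ahead of the outer task, breaking the chaining logic.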
How to make it work for test #2, too?
One solution would be to enforce asynchrony for every task, with Task.Factory.StartNew(..., TaskScheduler.FromCurrentSynchronizationContext()). However, I don't want to impose asynchronous execution on command handlers that might be synchronous internally. Also, I don't want to depend on the behavior of any particular synchronization context (i.e. rely on Task.Factory.StartNew returning before the created task has actually started).
In the real-life project, I'm responsible for what AsyncOp is above, but have no control over the command handlers (i.e., whatever is inside handleAsync).

I almost forgot it's possible to construct a Task manually, without starting or scheduling it. Then, "Task.Factory.StartNew" vs "new Task(...).Start" put me back on track. I think this is one of those few cases when the Task<TResult> constructor may actually be useful, along with nested tasks (Task<Task<T>>) and Task.Unwrap():
// AsyncOp
class AsyncOp<T>
{
Task<T> _pending = Task.FromResult(default(T));
public Task<T> CurrentTask { get { return _pending; } }
public Task<T> RunAsync(Func<Task<T>> handler, bool useSynchronizationContext = false)
{
var pending = _pending;
Func<Task<T>> wrapper = async () =>
{
// await the prev task
var prevResult = await pending;
Console.WriteLine("\nprev task result: " + prevResult);
// start and await the handler
return await handler();
};
var task = new Task<Task<T>>(wrapper);
var inner = task.Unwrap();
_pending = inner;
task.RunSynchronously(useSynchronizationContext ?
TaskScheduler.FromCurrentSynchronizationContext() :
TaskScheduler.Current);
return inner;
}
}
The output:
Test #1...
prev task result: 0
this task arg: 1000
prev task result: 1000
this task arg: 900
prev task result: 900
this task arg: 800
Press any key to continue to test #2...
prev task result: 800
this task arg: 100
prev task result: 100
this task arg: 200
It's now also very easy to make AsyncOp thread-safe by adding a lock to protect _pending, if needed.
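As a rough illustration of that remark, here is a minimal sketch of a locked variant. The class name LockedAsyncOp and the _gate field are made up for this example, the Console output and the useSynchronizationContext switch are left out, and the lock is deliberately released before the handler runs so that a slow handler never holds it.

```csharp
using System;
using System.Threading.Tasks;

// Sketch only: AsyncOp<T> from above with a lock around _pending.
// LockedAsyncOp and _gate are hypothetical names, not from the original answer.
public class LockedAsyncOp<T>
{
    private readonly object _gate = new object();
    private Task<T> _pending = Task.FromResult(default(T));

    public Task<T> CurrentTask
    {
        get { lock (_gate) { return _pending; } }
    }

    public Task<T> RunAsync(Func<Task<T>> handler)
    {
        Task<Task<T>> task;
        Task<T> inner;
        lock (_gate)
        {
            // Capture the previous task and publish the new one atomically.
            var pending = _pending;
            task = new Task<Task<T>>(async () =>
            {
                await pending;          // wait for the previous task
                return await handler(); // then start and await the handler
            });
            inner = task.Unwrap();
            _pending = inner;
        }
        // Run outside the lock; a re-entrant RunAsync call made by the handler
        // already sees the updated _pending and queues behind this task.
        task.RunSynchronously(TaskScheduler.Current);
        return inner;
    }
}
```

With this shape, the re-entrant call from test #2 still produces 100 before 200, because _pending is updated before the handler is invoked.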
Updated, this has been further improved with cancel/restart logic.

Here is a solution that is worse in every respect than the accepted answer, except for being thread-safe (which is not a requirement of the question). Disadvantages:
All lambdas are executed asynchronously (there is no fast path).
The executeOnCurrentContext configuration affects all lambdas (it's not a per-lambda configuration).
This solution uses an ActionBlock from the TPL Dataflow library as its processing engine.
public class AsyncOp<T>
{
private readonly ActionBlock<Task<Task<T>>> _actionBlock;
public AsyncOp(bool executeOnCurrentContext = false)
{
var options = new ExecutionDataflowBlockOptions();
if (executeOnCurrentContext)
options.TaskScheduler = TaskScheduler.FromCurrentSynchronizationContext();
_actionBlock = new ActionBlock<Task<Task<T>>>(async taskTask =>
{
try
{
taskTask.RunSynchronously();
await await taskTask;
}
catch { } // Ignore exceptions
}, options);
}
public Task<T> RunAsync(Func<Task<T>> taskFactory)
{
var taskTask = new Task<Task<T>>(taskFactory);
if (!_actionBlock.Post(taskTask))
throw new InvalidOperationException("Not accepted"); // Should never happen
return taskTask.Unwrap();
}
}

Microsoft's Rx does provide an easy way to do this kind of thing. Here's a simple (perhaps overly simple) way of doing it:
var subject = new BehaviorSubject<int>(0);
IDisposable subscription =
subject
.Scan((x0, x1) =>
{
Console.WriteLine($"previous value {x0}");
return x1;
})
.Skip(1)
.Subscribe(x => Console.WriteLine($"current value {x}\r\n"));
subject.OnNext(1000);
subject.OnNext(900);
subject.OnNext(800);
Console.WriteLine("\r\nPress any key to continue to test #2...\r\n");
Console.ReadLine();
subject.OnNext(200);
subject.OnNext(100);
Console.WriteLine("\r\nPress any key to exit...");
Console.ReadLine();
The output I get is this:
previous value 0
current value 1000
previous value 1000
current value 900
previous value 900
current value 800
Press any key to continue to test #2...
previous value 800
current value 200
previous value 200
current value 100
Press any key to exit...
It's easy to cancel at any time by calling subscription.Dispose().
Error handling in Rx is generally a little more bespoke than normal. It's not just a matter of throwing a try/catch around things. You also can repeat steps that error with a Retry operator in the case of things like IO errors.
In this circumstance, because I've used a BehaviorSubject (which repeats its last value whenever it is subscribed to) you can easily just resubscribe using a Catch operator.
var subject = new BehaviorSubject<int>(0);
var random = new Random();
IDisposable subscription =
subject
.Select(x =>
{
if (random.Next(10) == 0)
throw new Exception();
return x;
})
.Catch<int, Exception>(ex => subject.Select(x => -x))
.Scan((x0, x1) =>
{
Console.WriteLine($"previous value {x0}");
return x1;
})
.Skip(1)
.Subscribe(x => Console.WriteLine($"current value {x}\r\n"));
Now with the .Catch<int, Exception>(ex => subject.Select(x => -x)) it inverts the value of the query should an exception be raised.
A typical output may be like this:
previous value 0
current value 1000
previous value 1000
current value 900
previous value 900
current value 800
Press any key to continue to test #2...
previous value 800
current value -200
previous value -200
current value -100
Press any key to exit...
Note the negative numbers in the second half. An exception was handled and the query was able to continue.

Related

Multiple Async Calls with Pause Between Calls

I have an IEnumerable<Task>, where each Task will call the same endpoint. However, the endpoint can only handle so many calls per second. How can I put, say, a half second delay between each call?
I have tried adding Task.Delay(), but of course awaiting them simply means that the app waits a half second before sending all the calls at once.
Here is a code snippet:
var resultTasks = orders
.Select(async task =>
{
var result = new VendorTaskResult();
try
{
result.Response = await result.CallVendorAsync();
}
catch(Exception ex)
{
result.Exception = ex;
}
return result;
} );
var results = Task.WhenAll(resultTasks);
I feel like I should do something like
Task.WhenAll(resultTasks.EmitOverTime(500));
... but how exactly do I do that?
What you describe in your question is in other words rate limiting. You'd like to apply rate limiting policy to your client, because the API you use enforces such a policy on the server to protect itself from abuse.
While you could implement rate limiting yourself, I'd recommend going with a well-established solution. Rate Limiter from David Desmaisons was the one that I picked at random, and I instantly liked it. It has solid documentation, superior coverage and is easy to use. It is also available as a NuGet package.
Check out the simple snippet below, which demonstrates running semi-overlapping tasks in sequence while deferring the start of each task until half a second after the immediately preceding task started. Each task lasts at least 750 ms.
using ComposableAsync;
using RateLimiter;
using System;
using System.Threading.Tasks;
namespace RateLimiterTest
{
class Program
{
static void Main(string[] args)
{
Log("Starting tasks ...");
var constraint = TimeLimiter.GetFromMaxCountByInterval(1, TimeSpan.FromSeconds(0.5));
var tasks = new[]
{
DoWorkAsync("Task1", constraint),
DoWorkAsync("Task2", constraint),
DoWorkAsync("Task3", constraint),
DoWorkAsync("Task4", constraint)
};
Task.WaitAll(tasks);
Log("All tasks finished.");
Console.ReadLine();
}
static void Log(string message)
{
Console.WriteLine(DateTime.Now.ToString("HH:mm:ss.fff ") + message);
}
static async Task DoWorkAsync(string name, IDispatcher constraint)
{
await constraint;
Log(name + " started");
await Task.Delay(750);
Log(name + " finished");
}
}
}
Sample output:
10:03:27.121 Starting tasks ...
10:03:27.154 Task1 started
10:03:27.658 Task2 started
10:03:27.911 Task1 finished
10:03:28.160 Task3 started
10:03:28.410 Task2 finished
10:03:28.680 Task4 started
10:03:28.913 Task3 finished
10:03:29.443 Task4 finished
10:03:29.443 All tasks finished.
If you change the constraint to allow maximum two tasks per second (var constraint = TimeLimiter.GetFromMaxCountByInterval(2, TimeSpan.FromSeconds(1));), which is not the same as one per half a second, then the output could be like:
10:06:03.237 Starting tasks ...
10:06:03.264 Task1 started
10:06:03.268 Task2 started
10:06:04.026 Task2 finished
10:06:04.031 Task1 finished
10:06:04.275 Task3 started
10:06:04.276 Task4 started
10:06:05.032 Task4 finished
10:06:05.032 Task3 finished
10:06:05.033 All tasks finished.
Note that the current version of Rate Limiter targets .NETFramework 4.7.2+ or .NETStandard 2.0+.
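If taking a dependency is not an option, the "one start per interval" behaviour can be approximated by hand. The following is only a sketch under that assumption: IntervalGate is a hypothetical helper, not part of the RateLimiter package, and it spaces out starts without capping how many calls are in flight.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper (not part of the RateLimiter package): releases callers
// no closer together than the configured interval.
public sealed class IntervalGate
{
    private readonly object _sync = new object();
    private readonly Stopwatch _clock = Stopwatch.StartNew();
    private readonly TimeSpan _interval;
    private TimeSpan _nextSlot = TimeSpan.Zero;

    public IntervalGate(TimeSpan interval)
    {
        _interval = interval;
    }

    // The returned task completes when the caller may start its work.
    public Task WaitAsync()
    {
        TimeSpan wait;
        lock (_sync)
        {
            var now = _clock.Elapsed;
            // Take the next free slot, or start right away if the gate is idle.
            var slot = _nextSlot > now ? _nextSlot : now;
            _nextSlot = slot + _interval;
            wait = slot - now;
        }
        return wait > TimeSpan.Zero ? Task.Delay(wait) : Task.CompletedTask;
    }
}
```

In DoWorkAsync above, await gate.WaitAsync() would take the place of await constraint, with one shared IntervalGate instance passed to every task.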
This is just a thought, but another approach could be to create a queue and add another thread that runs polling the queue for calls that need to go out to your endpoint.
Have you considered just turning that into a foreach-loop with a Task.Delay call? You seem to want to explicitly call them sequentially, it won't hurt if that is obvious from your code.
var results = new List<YourResultType>();
foreach (var order in orders)
{
    var result = new VendorTaskResult();
    try
    {
        result.Response = await result.CallVendorAsync();
        results.Add(result.Response);
    }
    catch (Exception ex)
    {
        result.Exception = ex;
    }
    // pause before the next call so the endpoint is not flooded
    await Task.Delay(500);
}
Instead of selecting from orders you could loop over them, and inside the loop start each call, put its task into a list and pause, and then call Task.WhenAll.
Would look something like:
var resultTasks = new List<Task<VendorTaskResult>>();
foreach (var item in orders)
{
    // Wrap one call so that it can be started without being awaited here.
    Func<Task<VendorTaskResult>> callAsync = async () =>
    {
        var result = new VendorTaskResult();
        try
        {
            result.Response = await result.CallVendorAsync();
        }
        catch (Exception ex)
        {
            result.Exception = ex;
        }
        return result;
    };
    resultTasks.Add(callAsync());
    await Task.Delay(x); // x = pause between calls, in milliseconds
}
var results = await Task.WhenAll(resultTasks);
If you want to control the number of requests executed simultaneously, you have to use a semaphore.
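A minimal sketch of that idea, using SemaphoreSlim from the base class library. The Throttle class and its RunAsync signature are assumptions made for this example. Note that this limits how many calls run at once; it does not enforce a pause between them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs work(item) for every item with at most
// maxConcurrency calls in flight at any moment.
public static class Throttle
{
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxConcurrency, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();   // wait for a free slot
                try
                {
                    return await work(item);
                }
                finally
                {
                    gate.Release();       // free the slot, even on failure
                }
            }).ToList(); // materialize once so every task is started exactly once

            // Results come back in the order of the input items.
            return await Task.WhenAll(tasks);
        }
    }
}
```

A call might look like var pages = await Throttle.RunAsync(urls, 4, url => client.GetStringAsync(url)); where urls and client are whatever the calling code already has.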
I have something very similar, and it works fine with me. Please note that I call ToArray() after the Linq query finishes, that triggers the tasks:
using (HttpClient client = new HttpClient()) {
IEnumerable<Task<string>> _downloads = _group
.Select(async job => {
await Task.Delay(300);
return await client.GetStringAsync(<url with variable job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
}
Now please note that this creates n tasks that all run in parallel: every task waits the same 300 ms and then all requests go out together, so the Task.Delay does nothing to space the calls out. If you want to call the pages sequentially (which is what putting a delay between the calls suggests), then this code may be better:
using (HttpClient client = new HttpClient()) {
foreach (string job in _group) {
await Task.Delay(300);
_pages.Add(await client.GetStringAsync(<url with variable job>));
}
}
The download of each page is still asynchronous (the thread is free to do other work while a download is in progress), but the calls are made sequentially, so each download finishes before the next one starts.
The code can easily be changed to download the pages asynchronously in chunks, for example 20 pages at a time with a 5 second wait between chunks, as in this sample:
IEnumerable<string[]> toParse = myData
.Select((v, i) => new { v.code, group = i / 20 })
.GroupBy(x => x.group)
.Select(g => g.Select(x => x.code).ToArray());
using (HttpClient client = new HttpClient()) {
foreach (string[] _group in toParse) {
string[] _pages = null;
IEnumerable<Task<string>> _downloads = _group
.Select(job => {
return client.GetStringAsync(<url with job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
await Task.Delay(5000);
}
}
All this does is group your pages in chunks of 20, iterate through the chunks, download all pages of the chunk asynchronously, wait 5 seconds, move on to the next chunk.
I hope that is what you were waiting for :)
The proposed method EmitOverTime is doable, but only by blocking the current thread:
public static IEnumerable<Task<TResult>> EmitOverTime<TResult>(
this IEnumerable<Task<TResult>> tasks, int delay)
{
foreach (var item in tasks)
{
Thread.Sleep(delay); // Delay by blocking
yield return item;
}
}
Usage:
var results = await Task.WhenAll(resultTasks.EmitOverTime(500));
It is probably better to create a variant of Task.WhenAll that accepts a delay argument and delays asynchronously:
public static async Task<TResult[]> WhenAllWithDelay<TResult>(
IEnumerable<Task<TResult>> tasks, int delay)
{
var tasksList = new List<Task<TResult>>();
foreach (var task in tasks)
{
await Task.Delay(delay).ConfigureAwait(false);
tasksList.Add(task);
}
return await Task.WhenAll(tasksList).ConfigureAwait(false);
}
Usage:
var results = await WhenAllWithDelay(resultTasks, 500);
This design implies that the enumerable of tasks should be enumerated only once. It is easy to forget this during development, and start enumerating it again, spawning a new set of tasks. For this reason I propose to make it an OnlyOnce enumerable, as it is shown in this question.
Update: I should mention why the above methods work, and under what premise. The premise is that the supplied IEnumerable<Task<TResult>> is deferred, in other words non-materialized. At the method's start there are no tasks created yet. The tasks are created one after the other during the enumeration of the enumerable, and the trick is that the enumeration is slow and controlled. The delay inside the loop ensures that the tasks are not created all at once. They are created hot (in other words already started), so at the time the last task has been created some of the first tasks may have already been completed. The materialized list of half-running/half-completed tasks is then passed to Task.WhenAll, that waits for all to complete asynchronously.
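The premise about deferred enumeration can be checked with a small experiment. This is only a sketch: DeferredTasksDemo and its parameters are invented for the illustration, and Task.FromResult stands in for a real call. It records how many tasks exist each time the loop receives one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredTasksDemo
{
    // Returns, for each loop iteration, the number of tasks created so far.
    public static async Task<List<int>> CreatedCountsAsync(
        int count, int delayMs, bool materializeFirst)
    {
        var created = 0;

        // Deferred: the lambda runs only when the enumerator advances.
        IEnumerable<Task<int>> tasks = Enumerable.Range(0, count)
            .Select(i => { created++; return Task.FromResult(i); });

        if (materializeFirst)
        {
            tasks = tasks.ToList(); // runs the lambda for every element right now
        }

        var snapshots = new List<int>();
        foreach (var task in tasks)
        {
            snapshots.Add(created);
            await Task.Delay(delayMs);
        }
        return snapshots;
    }
}
```

For three items, the deferred sequence reports 1, 2, 3 (one task created per iteration, separated by the delay), while the materialized one reports 3, 3, 3 (all tasks created up front, so the delay no longer spaces anything out).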

Unwrapping IObservable<Task<T>> into IObservable<T> with order preservation

Is there a way to unwrap the IObservable<Task<T>> into IObservable<T> keeping the same order of events, like this?
Tasks: ----a-------b--c----------d------e---f---->
Values: -------A-----------B--C------D-----E---F-->
Let's say I have a desktop application that consumes a stream of messages, some of which require heavy post-processing:
IObservable<Message> streamOfMessages = ...;
IObservable<Task<Result>> streamOfTasks = streamOfMessages
.Select(async msg => await PostprocessAsync(msg));
IObservable<Result> streamOfResults = ???; // unwrap streamOfTasks
I imagine two ways of dealing with that.
First, I can subscribe to streamOfTasks using the asynchronous event handler:
streamOfTasks.Subscribe(async task =>
{
var result = await task;
Display(result);
});
Second, I can convert streamOfTasks using Observable.Create, like this:
var streamOfResults =
from task in streamOfTasks
from value in Observable.Create<T>(async (obs, cancel) =>
{
var v = await task;
obs.OnNext(v);
// TODO: don't know when to call obs.OnCompleted()
})
select value;
streamOfResults.Subscribe(result => Display(result));
Either way, the order of messages is not preserved: some later messages that
don't need any post-processing come out faster than earlier messages that
require post-processing. Both my solutions handle the incoming messages
in parallel, but I'd like them to be processed sequentially, one by one.
I can write a simple task queue to process just one task at a time,
but perhaps it's an overkill. Seems to me that I'm missing something obvious.
UPD. I wrote a sample console program to demonstrate my approaches. None of the solutions so far preserves the original order of events. Here is the output of the program:
Timer: 0
Timer: 1
Async handler: 1
Observable.Create: 1
Observable.FromAsync: 1
Timer: 2
Async handler: 2
Observable.Create: 2
Observable.FromAsync: 2
Observable.Create: 0
Async handler: 0
Observable.FromAsync: 0
Here is the complete source code:
// "C:\Program Files (x86)\MSBuild\14.0\Bin\csc.exe" test.cs /r:System.Reactive.Core.dll /r:System.Reactive.Linq.dll /r:System.Reactive.Interfaces.dll
using System;
using System.Reactive;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading.Tasks;
class Program
{
static void Main()
{
Console.WriteLine("Press ENTER to exit.");
// the source stream
var timerEvents = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(1));
timerEvents.Subscribe(x => Console.WriteLine($"Timer: {x}"));
// solution #1: using async event handler
timerEvents.Subscribe(async x =>
{
var result = await PostprocessAsync(x);
Console.WriteLine($"Async handler: {x}");
});
// solution #2: using Observable.Create
var processedEventsV2 =
from task in timerEvents.Select(async x => await PostprocessAsync(x))
from value in Observable.Create<long>(async (obs, cancel) =>
{
var v = await task;
obs.OnNext(v);
})
select value;
processedEventsV2.Subscribe(x => Console.WriteLine($"Observable.Create: {x}"));
// solution #3: using FromAsync, as answered by @Enigmativity
var processedEventsV3 =
from msg in timerEvents
from result in Observable.FromAsync(() => PostprocessAsync(msg))
select result;
processedEventsV3.Subscribe(x => Console.WriteLine($"Observable.FromAsync: {x}"));
Console.ReadLine();
}
static async Task<long> PostprocessAsync(long x)
{
// some messages require long post-processing
if (x % 3 == 0)
{
await Task.Delay(TimeSpan.FromSeconds(2.5));
}
// and some don't
return x;
}
}
Combining @Enigmativity's simple approach with @VMAtm's idea of attaching the counter and some code snippets from this SO question, I came up with this solution:
// usage
var processedStream = timerEvents.SelectAsync(async t => await PostprocessAsync(t));
processedStream.Subscribe(x => Console.WriteLine($"Processed: {x}"));
// my sample console program prints the events ordered properly:
Timer: 0
Timer: 1
Timer: 2
Processed: 0
Processed: 1
Processed: 2
Timer: 3
Timer: 4
Timer: 5
Processed: 3
Processed: 4
Processed: 5
....
Here is my SelectAsync extension method to transform IObservable<Task<TSource>> into IObservable<TResult> keeping the original order of events:
public static IObservable<TResult> SelectAsync<TSource, TResult>(
this IObservable<TSource> src,
Func<TSource, Task<TResult>> selectorAsync)
{
// using local variable for counter is easier than src.Scan(...)
var counter = 0;
var streamOfTasks =
from source in src
from result in Observable.FromAsync(async () => new
{
Index = Interlocked.Increment(ref counter) - 1,
Result = await selectorAsync(source)
})
select result;
// buffer the results coming out of order
return Observable.Create<TResult>(observer =>
{
var index = 0;
var buffer = new Dictionary<int, TResult>();
return streamOfTasks.Subscribe(item =>
{
buffer.Add(item.Index, item.Result);
TResult result;
while (buffer.TryGetValue(index, out result))
{
buffer.Remove(index);
observer.OnNext(result);
index++;
}
});
});
}
I'm not particularly satisfied with my solution as it looks too complex to me, but at least it doesn't require any external dependencies. I'm using a simple Dictionary here to buffer and reorder task results because the subscriber need not be thread-safe (the subscriptions are never called concurrently).
Any comments or suggestions are welcome. I'm still hoping to find the native RX way of doing this without custom buffering extension method.
The RX library contains three operators that can unwrap an observable sequence of tasks, the Concat, Merge and Switch. All three accept a single source argument of type IObservable<Task<T>>, and return an IObservable<T>. Here are their descriptions from the documentation:
Concat
Concatenates all task results, as long as the previous task terminated successfully.
Merge
Merges results from all source tasks into a single observable sequence.
Switch
Transforms an observable sequence of tasks into an observable sequence producing values only from the most recent observable sequence. Each time a new task is received, the previous task's result is ignored.
In other words the Concat returns the results in their original order, the Merge returns the results in order of completion, and the Switch filters out any results from tasks that didn't complete before the next task was emitted. So your problem can be solved by just using the built-in Concat operator. No custom operator is needed.
var streamOfResults = streamOfTasks
.Select(async task =>
{
var result1 = await task;
var result2 = await PostprocessAsync(result1);
return result2;
})
.Concat();
The tasks are already started before they are emitted by the streamOfTasks. In other words they emerge in a "hot" state. So the fact that the Concat operator awaits them one after the other has no consequence for the concurrency of the operations. It only affects the order of their results. This would be a consideration if, instead of hot tasks, you had cold observables, like those created by the Observable.FromAsync and Observable.Create methods, in which case the Concat would execute the operations sequentially.
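To make the difference between Concat and Merge concrete, here is a small sketch that assumes the System.Reactive package is referenced. TaskUnwrapDemo and SlowForZeroAsync are hypothetical names; the latter stands in for PostprocessAsync and is slow only for the value 0.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class TaskUnwrapDemo
{
    // Hypothetical stand-in for PostprocessAsync: only the value 0 is slow.
    private static async Task<int> SlowForZeroAsync(int x)
    {
        if (x == 0) await Task.Delay(200);
        return x;
    }

    private static IObservable<Task<int>> HotTasks()
    {
        // Each task is started as soon as Select runs, before Concat or Merge sees it.
        return new[] { 0, 1, 2 }.ToObservable().Select(x => SlowForZeroAsync(x));
    }

    // Results in the original order of the tasks.
    public static IList<int> WithConcat()
    {
        return HotTasks().Concat().ToList().Wait();
    }

    // Results in order of completion.
    public static IList<int> WithMerge()
    {
        return HotTasks().Merge().ToList().Wait();
    }
}
```

WithConcat() yields 0, 1, 2 even though 0 finishes last, whereas WithMerge() delivers the slow 0 after the two fast values. Both take roughly the same time, since all three tasks are already running in either case.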
Is the following simple approach an answer for you?
IObservable<Result> streamOfResults =
from msg in streamOfMessages
from result in Observable.FromAsync(() => PostprocessAsync(msg))
select result;
To maintain the order of events you can funnel your stream into a TransformBlock from TPL Dataflow. The TransformBlock would execute your post-processing logic and will maintain the order of its output by default.
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
using NUnit.Framework;
namespace HandlingStreamInOrder {
[TestFixture]
public class ItemHandlerTests {
[Test]
public async Task Items_Are_Output_In_The_Same_Order_As_They_Are_Input() {
var itemHandler = new ItemHandler();
var timerEvents = Observable.Timer(TimeSpan.Zero, TimeSpan.FromMilliseconds(250));
timerEvents.Subscribe(async x => {
var data = (int)x;
Console.WriteLine($"Value Produced: {x}");
var dataAccepted = await itemHandler.SendAsync((int)data);
if (dataAccepted) {
InputItems.Add(data);
}
});
await Task.Delay(5000);
itemHandler.Complete();
await itemHandler.Completion;
CollectionAssert.AreEqual(InputItems, itemHandler.OutputValues);
}
private IList<int> InputItems {
get;
} = new List<int>();
}
public class ItemHandler {
public ItemHandler() {
var options = new ExecutionDataflowBlockOptions() {
BoundedCapacity = DataflowBlockOptions.Unbounded,
MaxDegreeOfParallelism = Environment.ProcessorCount,
EnsureOrdered = true
};
PostProcessBlock = new TransformBlock<int, int>((Func<int, Task<int>>)PostProcess, options);
var output = PostProcessBlock.AsObservable().Subscribe(x => {
Console.WriteLine($"Value Output: {x}");
OutputValues.Add(x);
});
}
public async Task<bool> SendAsync(int data) {
return await PostProcessBlock.SendAsync(data);
}
public void Complete() {
PostProcessBlock.Complete();
}
public Task Completion {
get { return PostProcessBlock.Completion; }
}
public IList<int> OutputValues {
get;
} = new List<int>();
private IPropagatorBlock<int, int> PostProcessBlock {
get;
}
private async Task<int> PostProcess(int data) {
if (data % 3 == 0) {
await Task.Delay(TimeSpan.FromSeconds(2));
}
return data;
}
}
}
Rx and TPL Dataflow can be easily combined here, and TPL Dataflow preserves the order of events by default, so your code could be something like this:
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
static async Task<long> PostprocessAsync(long x) { ... }
IObservable<Message> streamOfMessages = ...;
var streamOfTasks = new TransformBlock<long, long>(async msg =>
await PostprocessAsync(msg)
// set the concurrency level for messages to handle
, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Environment.ProcessorCount });
// easily convert block into observable
IObservable<long> streamOfResults = streamOfTasks.AsObservable();
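The snippet above does not show how messages get from the observable into the block. One possible wiring, sketched here under the assumption that the source sequence eventually completes, uses DataflowBlock.AsObserver. OrderedPipeline, the local PostprocessAsync stand-in and the degree of parallelism of 4 are all made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class OrderedPipeline
{
    // Hypothetical stand-in for PostprocessAsync: every third message is slow.
    private static async Task<long> PostprocessAsync(long x)
    {
        if (x % 3 == 0) await Task.Delay(100);
        return x;
    }

    // Collects the processed messages; completes when the source completes.
    public static async Task<List<long>> RunAsync(IObservable<long> messages)
    {
        var block = new TransformBlock<long, long>(
            x => PostprocessAsync(x),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        var results = new List<long>();
        // ActionBlock processes one item at a time by default, so the list is safe.
        var sink = new ActionBlock<long>(x => results.Add(x));
        block.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        // AsObserver forwards OnNext to the block and OnCompleted to block.Complete().
        using (messages.Subscribe(block.AsObserver()))
        {
            await sink.Completion;
        }
        return results;
    }
}
```

Messages are processed up to four at a time, yet the results list ends up in the original order because the TransformBlock keeps its output ordered by default.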
Edit: Rx was meant to be a reactive pipeline of events for UI. Since this type of application is generally single-threaded, messages are handled in their original order. In general, though, events in C# aren't thread-safe, so you have to provide some additional logic to preserve the order.
If you don't like the idea to introduce another dependency, you need to store the operation number with Interlocked class, something like this:
// counter for operations that have started
int operationNumber = 0;
// counter for operations that have finished
int doneNumber = 0;
...
// Increment returns the new value, so subtract 1 to get a zero-based ticket
var currentOperationNumber = Interlocked.Increment(ref operationNumber) - 1;
...
// wait until every earlier operation has been handled
var spinner = new SpinWait();
while (Volatile.Read(ref doneNumber) != currentOperationNumber)
{
    spinner.SpinOnce();
}
// handle event
// then let the next operation through
Interlocked.Increment(ref doneNumber);

How do I create an Observable Timer that calls a method and blocks on cancellation if the method is running until it finishes?

My requirements:
Run method DoWork on a specified interval.
If stop is called between calls to DoWork just stop the timer.
If stop is called while DoWork is running, block until DoWork is finished.
If DoWork takes too long to finish after stop is called, timeout.
I have a solution that seems to work so far, but I'm not super happy with it and think I may be missing something. The following is the void Main from my test app:
var source = new CancellationTokenSource();
// Create an observable sequence for the Cancel event.
var cancelObservable = Observable.Create<Int64>(o =>
{
source.Token.Register(() =>
{
Console.WriteLine("Start on canceled handler.");
o.OnNext(1);
Console.WriteLine("End on canceled handler.");
});
return Disposable.Empty;
});
var observable =
// Create observable timer.
Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), Scheduler.Default)
// Merge with the cancel observable so we have a composite that
// generates an event every 10 seconds AND immediately when a cancel is requested.
.Merge(cancelObservable)
// This is what I ended up doing instead of disposing the timer so that I could wait
// for the sequence to finish, including DoWork.
.TakeWhile(i => !source.IsCancellationRequested)
// I could put this in an observer, but this way exceptions could be caught and handled
// or the results of the work could be fed to a subscriber.
.Do(l =>
{
Console.WriteLine("Start DoWork.");
Thread.Sleep(TimeSpan.FromSeconds(5));
Console.WriteLine("Finish DoWork.");
});
var published = observable.Publish();
var disposable = published.Connect();
// Press key between Start DoWork and Finish DoWork to test the cancellation while
// running DoWork.
// Press key between Finish DoWork and Start DoWork to test cancellation between
// events.
Console.ReadKey();
// I doubt this is good practice, but I was finding that o.OnNext was blocking
// inside of register, and the timeout wouldn't work if I blocked here before
// I set it up.
Task.Factory.StartNew(source.Cancel);
// Is there a preferred way to block until a sequence is finished? My experience
// is there's a timing issue if Cancel finishes fast enough the sequence may already
// be finished by the time I get here and .Wait() complains that the sequence contains
// no elements.
published.Timeout(TimeSpan.FromSeconds(1))
.ForEach(i => { });
disposable.Dispose();
Console.WriteLine("All finished! Press any key to continue.");
Console.ReadKey();
First, in your cancelObservable, make sure to return the result of Token.Register as your disposable instead of returning Disposable.Empty.
Here's a good extension method for turning CancellationTokens into observables:
public static IObservable<Unit> AsObservable(this CancellationToken token, IScheduler scheduler)
{
return Observable.Create<Unit>(observer =>
{
var d1 = new SingleAssignmentDisposable();
return new CompositeDisposable(d1, token.Register(() =>
{
d1.Disposable = scheduler.Schedule(() =>
{
observer.OnNext(Unit.Default);
observer.OnCompleted();
});
}));
});
}
Now, to your actual request:
public IObservable<Unit> ScheduleWork(IObservable<Unit> cancelSignal)
{
// Performs work on an interval
// stops the timer (but finishes any work in progress) when the cancelSignal is received
var workTimer = Observable
.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10))
.TakeUntil(cancelSignal)
.Select(_ =>
{
DoWork();
return Unit.Default;
})
.IgnoreElements();
// starts a timer after cancellation that will eventually throw a timeout exception.
var timeoutAfterCancelSignal = cancelSignal
.SelectMany(c => Observable.Never<Unit>().Timeout(TimeSpan.FromSeconds(5)));
// Use Amb to listen to both the workTimer
// and the timeoutAfterCancelSignal
// Since neither produce any data we are really just
// listening to see which will complete first.
// if the workTimer completes before the timeout
// then Amb will complete without error.
// However if the timeout expires first, then Amb
// will produce an error
return Observable.Amb(workTimer, timeoutAfterCancelSignal);
}
// Usage
var cts = new CancellationTokenSource();
var s = ScheduleWork(cts.Token.AsObservable(Scheduler.Default));
using (var finishedSignal = new ManualResetEventSlim())
{
s.Finally(finishedSignal.Set).Subscribe(
_ => { /* will never be called */},
error => { /* handle error */ },
() => { /* canceled without error */ } );
Console.ReadKey();
cts.Cancel();
finishedSignal.Wait();
}
Note, instead of cancellation tokens you can also do:
var cancelSignal = new AsyncSubject<Unit>();
var s = ScheduleWork(cancelSignal);
// .. to cancel ..
Console.ReadKey();
cancelSignal.OnNext(Unit.Default);
cancelSignal.OnCompleted();

Automatic repeat of one task until another is finished (TAP)

I have two operations: a long-running OperationA and a much quicker OperationB. I was running them in parallel using TAP and returning the results once they have both finished:
var taskA = Task.Factory.StartNew(()=>OperationA());
var taskB = Task.Factory.StartNew(()=>OperationB());
var tasks = new Task[] { taskA, taskB };
Task.WaitAll(tasks);
// processing taskA.Result, taskB.Result
No magic here. Now I want to repeat OperationB indefinitely, restarting it each time it finishes, for as long as OperationA is still running. The whole procedure should finish when OperationA has finished and the last pass of OperationB has finished. I'm looking for an effective pattern that, if possible, does not involve polling OperationA's Status in a while loop. I'm looking at improving the WaitAllOneByOne pattern proposed in this Pluralsight course, or something similar.
Try this
// Get cancellation support.
CancellationTokenSource source = new CancellationTokenSource();
CancellationToken token = source.Token;
// Start off A and set continuation to cancel B when finished.
bool taskAFinished = false;
var taskA = Task.Factory.StartNew(() => OperationA());
Task contA = taskA.ContinueWith(ant => source.Cancel());
// Set off B and perform your loop inside the method. Cancellation will be
// requested when A has completed.
var taskB = Task.Factory.StartNew(() => OperationB(token), token);
Task contB = taskB.ContinueWith(ant =>
{
switch (ant.Status)
{
// Handle any exceptions to prevent UnobservedTaskException.
case TaskStatus.RanToCompletion:
// Do stuff.
break;
case TaskStatus.Canceled:
// You know TaskA is finished.
break;
case TaskStatus.Faulted:
// Something bad.
break;
}
});
Then, in the OperationB method, you can perform your loop and check for cancellation upon TaskA's completion:
private void OperationB(CancellationToken token)
{
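foreach (var v in items) // items: placeholder for whatever collection you iterate over
{
...
token.ThrowIfCancellationRequested(); // This must be handled: it surfaces as an AggregateException when the task is waited on.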
}
}
Note: instead of complicating things with cancellation, you can just set a bool from within the continuation of TaskA and check for it in TaskB's loop; this avoids any faffing about with cancellations.
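A minimal sketch of that flag-based variant, assuming both operations can be passed in as parameterless delegates. The names RepeatUntil and Run are made up for illustration and are not part of the answer above. The continuation of task A sets the flag, and the loop in task B checks it after each pass, so the pass that is in progress when A finishes always runs to completion:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class RepeatUntil
{
    // Runs operationA once and keeps repeating operationB until A has finished.
    // (Hypothetical helper; both delegates stand in for the real operations.)
    public static (TA ResultA, List<TB> ResultsB) Run<TA, TB>(
        Func<TA> operationA, Func<TB> operationB)
    {
        bool aFinished = false; // written by A's continuation, read by B's loop

        var taskA = Task.Run(operationA);
        taskA.ContinueWith(_ => Volatile.Write(ref aFinished, true));

        var taskB = Task.Run(() =>
        {
            var results = new List<TB>();
            do
            {
                results.Add(operationB()); // one full pass of B
            }
            while (!Volatile.Read(ref aFinished));
            return results;
        });

        // Throws an AggregateException if either operation failed.
        Task.WaitAll(taskA, taskB);
        return (taskA.Result, taskB.Result);
    }
}
```

Because the flag is only checked between passes, B always runs at least once and may run one extra pass if A finishes while the check is being made.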
I hope this helps
I took your approach as a basis and adapted it a bit:
var source = new CancellationTokenSource();
var token = source.Token;
var taskA = Task.Factory.StartNew(
() => OperationA()
);
var taskAFinished = taskA.ContinueWith(antecedent =>
{
source.Cancel();
return antecedent.Result;
});
var taskB = Task.Factory.StartNew(
() => OperationB(token), token
);
var taskBFinished = taskB.ContinueWith(antecedent =>
{
switch (antecedent.Status)
{
case TaskStatus.RanToCompletion:
case TaskStatus.Canceled:
try
{
return antecedent.Result;
}
catch (AggregateException ae)
{
// Operation was canceled before start if OperationA is short
return null;
}
case TaskStatus.Faulted:
return null;
}
return null;
});
I made two continuations that return the results of their respective operations so that I could wait for both of them to finish (I tried to do that with only the second one and it didn't work).
var tasks = new Task[] { taskAFinished, taskBFinished };
Task.WaitAll(tasks);
The first one just passes the antecedent task's Result along; the second takes the aggregate results available at that point in OperationB (both the RanToCompletion and Canceled statuses are considered a correct end of the process). OperationB now looks like this:
public static List<Result> OperationB(CancellationToken token)
{
var resultsList = new List<Result>();
while (true)
{
foreach (var op in operations)
{
resultsList.Add(op.GetResult());
}
if (token.IsCancellationRequested)
{
return resultsList;
}
}
}
This changes the logic a bit: all loops inside OperationB are now treated as a single task, but that is easier than keeping them atomic and writing some sort of coordination primitive to gather the results from each run. Since I don't really care which loop produced which results, this seems to be a decent solution. I may move to a more flexible implementation later if needed (what I was actually looking for is a way to chain multiple operations recursively: OperationB itself may contain smaller repeating OperationC's with the same behavior, OperationC may contain multiple OperationD's that run while C is active, etc.).
edit
Added exception handling in taskBFinished for the case when OperationA is quick and cancellation is issued before OperationB has even started.

Rx: How can I respond immediately, and throttle subsequent requests

I would like to set up an Rx subscription that can respond to an event right away, and then ignore subsequent events that happen within a specified "cooldown" period.
The out of the box Throttle/Buffer methods respond only once the timeout has elapsed, which is not quite what I need.
Here is some code that sets up the scenario, and uses a Throttle (which isn't the solution I want):
class Program
{
static Stopwatch sw = new Stopwatch();
static void Main(string[] args)
{
var subject = new Subject<int>();
var timeout = TimeSpan.FromMilliseconds(500);
subject
.Throttle(timeout)
.Subscribe(DoStuff);
var factory = new TaskFactory();
sw.Start();
factory.StartNew(() =>
{
Console.WriteLine("Batch 1 (no delay)");
subject.OnNext(1);
});
factory.StartNewDelayed(1000, () =>
{
Console.WriteLine("Batch 2 (1s delay)");
subject.OnNext(2);
});
factory.StartNewDelayed(1300, () =>
{
Console.WriteLine("Batch 3 (1.3s delay)");
subject.OnNext(3);
});
factory.StartNewDelayed(1600, () =>
{
Console.WriteLine("Batch 4 (1.6s delay)");
subject.OnNext(4);
});
Console.ReadKey();
sw.Stop();
}
private static void DoStuff(int i)
{
Console.WriteLine("Handling {0} at {1}ms", i, sw.ElapsedMilliseconds);
}
}
The output of running this right now is:
Batch 1 (no delay)
Handling 1 at 508ms
Batch 2 (1s delay)
Batch 3 (1.3s delay)
Batch 4 (1.6s delay)
Handling 4 at 2114ms
Note that batch 2 isn't handled (which is fine!) because we wait for 500ms to elapse between requests due to the nature of throttle. Batch 3 is also not handled, due to its proximity to batch 4 (which is less alright, because it happened more than 500ms after batch 2... wait, it happened only 300ms after batch 2, but it is still dropped purely because of batch 4).
What I'm looking for is something more like this:
Batch 1 (no delay)
Handling 1 at ~0ms
Batch 2 (1s delay)
Handling 2 at ~1000ms
Batch 3 (1.3s delay)
Batch 4 (1.6s delay)
Handling 4 at ~1600ms
Note that batch 3 wouldn't be handled in this scenario (which is fine!) because it occurs within 500ms of Batch 2.
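To pin down the desired semantics independently of Rx, here is a small model of the cooldown rule in plain C#. It is only a sketch: CooldownModel and Handled are made-up names, events are assumed to arrive in time order, and times are plain millisecond offsets rather than real timers.

```csharp
using System;
using System.Collections.Generic;

public static class CooldownModel
{
    // Returns the values that would be handled: an event is handled only if
    // at least cooldownMs have passed since the previously handled event.
    public static List<int> Handled(
        IEnumerable<(int Value, int TimeMs)> events, int cooldownMs)
    {
        var handled = new List<int>();
        int? lastHandledAt = null; // time of the most recent handled event

        foreach (var (value, timeMs) in events)
        {
            if (lastHandledAt == null || timeMs - lastHandledAt.Value >= cooldownMs)
            {
                handled.Add(value);
                lastHandledAt = timeMs;
            }
        }
        return handled;
    }
}
```

For the four batches above (arriving at 0, 1000, 1300 and 1600 ms) and a 500ms cooldown, this returns 1, 2, 4, which matches the wanted output.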
EDIT:
Here is the implementation for the "StartNewDelayed" extension method that I use:
/// <summary>Creates a Task that will complete after the specified delay.</summary>
/// <param name="factory">The TaskFactory.</param>
/// <param name="millisecondsDelay">The delay after which the Task should transition to RanToCompletion.</param>
/// <returns>A Task that will be completed after the specified duration.</returns>
public static Task StartNewDelayed(
this TaskFactory factory, int millisecondsDelay)
{
return StartNewDelayed(factory, millisecondsDelay, CancellationToken.None);
}
/// <summary>Creates a Task that will complete after the specified delay.</summary>
/// <param name="factory">The TaskFactory.</param>
/// <param name="millisecondsDelay">The delay after which the Task should transition to RanToCompletion.</param>
/// <param name="cancellationToken">The cancellation token that can be used to cancel the timed task.</param>
/// <returns>A Task that will be completed after the specified duration and that's cancelable with the specified token.</returns>
public static Task StartNewDelayed(this TaskFactory factory, int millisecondsDelay, CancellationToken cancellationToken)
{
// Validate arguments
if (factory == null) throw new ArgumentNullException("factory");
if (millisecondsDelay < 0) throw new ArgumentOutOfRangeException("millisecondsDelay");
// Create the timed task
var tcs = new TaskCompletionSource<object>(factory.CreationOptions);
var ctr = default(CancellationTokenRegistration);
// Create the timer but don't start it yet. If we start it now,
// it might fire before ctr has been set to the right registration.
var timer = new Timer(self =>
{
// Clean up both the cancellation token and the timer, and try to transition to completed
ctr.Dispose();
((Timer)self).Dispose();
tcs.TrySetResult(null);
});
// Register with the cancellation token.
if (cancellationToken.CanBeCanceled)
{
// When cancellation occurs, cancel the timer and try to transition to cancelled.
// There could be a race, but it's benign.
ctr = cancellationToken.Register(() =>
{
timer.Dispose();
tcs.TrySetCanceled();
});
}
if (millisecondsDelay > 0)
{
// Start the timer and hand back the task...
timer.Change(millisecondsDelay, Timeout.Infinite);
}
else
{
// Just complete the task, and keep execution on the current thread.
ctr.Dispose();
tcs.TrySetResult(null);
timer.Dispose();
}
return tcs.Task;
}
Here's my approach. It's similar to others that have gone before, but it doesn't suffer the over-zealous window production problem.
The desired function works a lot like Observable.Throttle but emits qualifying events as soon as they arrive rather than delaying for the duration of the throttle or sample period. For a given duration after a qualifying event, subsequent events are suppressed.
Given as a testable extension method:
public static class ObservableExtensions
{
public static IObservable<T> SampleFirst<T>(
this IObservable<T> source,
TimeSpan sampleDuration,
IScheduler scheduler = null)
{
scheduler = scheduler ?? Scheduler.Default;
return source.Publish(ps =>
ps.Window(() => ps.Delay(sampleDuration,scheduler))
.SelectMany(x => x.Take(1)));
}
}
The idea is to use the overload of Window that creates non-overlapping windows using a windowClosingSelector that uses the source time-shifted back by the sampleDuration. Each window will therefore: (a) be closed by the first element in it and (b) remain open until a new element is permitted. We then simply select the first element from each window.
Rx 1.x Version
The Publish extension method used above is not available in Rx 1.x. Here is an alternative:
public static class ObservableExtensions
{
public static IObservable<T> SampleFirst<T>(
this IObservable<T> source,
TimeSpan sampleDuration,
IScheduler scheduler = null)
{
scheduler = scheduler ?? Scheduler.Default;
var sourcePub = source.Publish().RefCount();
return sourcePub.Window(() => sourcePub.Delay(sampleDuration,scheduler))
.SelectMany(x => x.Take(1));
}
}
The solution I found after a lot of trial and error was to replace the throttled subscription with the following:
subject
.Window(() => { return Observable.Interval(timeout); })
.SelectMany(x => x.Take(1))
.Subscribe(i => DoStuff(i));
Edited to incorporate Paul's clean-up.
Awesome solution Andrew! We can take this a step further though and clean up the inner Subscribe:
subject
.Window(() => { return Observable.Interval(timeout); })
.SelectMany(x => x.Take(1))
.Subscribe(DoStuff);
The initial answer I posted has a flaw: namely that the Window method, when used with an Observable.Interval to denote the end of the window, sets up an infinite series of 500ms windows. What I really need is a window that starts when the first result is pumped into the subject, and ends after the 500ms.
My sample data masked this problem because the data broke down nicely into the windows that were already going to be created. (i.e. 0-500ms, 501-1000ms, 1001-1500ms, etc.)
Consider instead this timing:
factory.StartNewDelayed(300,() =>
{
Console.WriteLine("Batch 1 (300ms delay)");
subject.OnNext(1);
});
factory.StartNewDelayed(700, () =>
{
Console.WriteLine("Batch 2 (700ms delay)");
subject.OnNext(2);
});
factory.StartNewDelayed(1300, () =>
{
Console.WriteLine("Batch 3 (1.3s delay)");
subject.OnNext(3);
});
factory.StartNewDelayed(1600, () =>
{
Console.WriteLine("Batch 4 (1.6s delay)");
subject.OnNext(4);
});
What I get is:
Batch 1 (300ms delay)
Handling 1 at 356ms
Batch 2 (700ms delay)
Handling 2 at 750ms
Batch 3 (1.3s delay)
Handling 3 at 1346ms
Batch 4 (1.6s delay)
Handling 4 at 1644ms
This is because the windows begin at 0ms, 500ms, 1000ms, and 1500ms and so each Subject.OnNext fits nicely into its own window.
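The effect can be reproduced without Rx by bucketing arrival times into fixed windows. This is a rough model only (FixedWindowModel and Handled are made-up names, and timer drift is ignored), but it shows why the first data set hid the problem:

```csharp
using System;
using System.Collections.Generic;

public static class FixedWindowModel
{
    // Approximates Window(() => Observable.Interval(timeout)) followed by
    // Take(1) per window: windows are [0, w), [w, 2w), ... measured from
    // subscription, and only the first event in each window is handled.
    public static List<int> Handled(
        IEnumerable<(int Value, int TimeMs)> events, int windowMs)
    {
        var handled = new List<int>();
        var seenWindows = new HashSet<int>();

        foreach (var (value, timeMs) in events)
        {
            int windowIndex = timeMs / windowMs; // which fixed window this event falls into
            if (seenWindows.Add(windowIndex))    // true only for the first event in that window
            {
                handled.Add(value);
            }
        }
        return handled;
    }
}
```

With arrival times of 300, 700, 1300 and 1600 ms and 500ms windows, the events fall into windows 0, 1, 2 and 3, so all four are handled. With the original times of 0, 1000, 1300 and 1600 ms they fall into windows 0, 2, 2 and 3, so batch 3 happens to be dropped.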
What I want is:
Batch 1 (300ms delay)
Handling 1 at ~300ms
Batch 2 (700ms delay)
Batch 3 (1.3s delay)
Handling 3 at ~1300ms
Batch 4 (1.6s delay)
After a lot of struggling and an hour banging on it with a co-worker, we arrived at a better solution using pure Rx and a single local variable:
bool isCoolingDown = false;
subject
.Where(_ => !isCoolingDown)
.Subscribe(
i =>
{
DoStuff(i);
isCoolingDown = true;
Observable
.Interval(cooldownInterval)
.Take(1)
.Subscribe(_ => isCoolingDown = false);
});
Our assumption is that calls to the subscription method are synchronized. If they are not, then a simple lock could be introduced.
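If OnNext calls can arrive concurrently, the check and the update of the cooldown state have to happen under the same lock. A minimal sketch of one way to do that, swapping the timer-reset flag for a stored timestamp so that the locked region is a single method. CooldownGate and TryEnter are hypothetical names, not something from Rx:

```csharp
using System;

public sealed class CooldownGate
{
    private readonly object _sync = new object();
    private readonly TimeSpan _cooldown;

    // The gate lets callers through from this time onwards.
    private DateTimeOffset _openAt = DateTimeOffset.MinValue;

    public CooldownGate(TimeSpan cooldown)
    {
        _cooldown = cooldown;
    }

    // Returns true if the caller may proceed, and starts a new cooldown period.
    // The caller supplies the current time, which keeps the class easy to test.
    public bool TryEnter(DateTimeOffset now)
    {
        lock (_sync)
        {
            if (now < _openAt)
            {
                return false; // still cooling down
            }
            _openAt = now + _cooldown;
            return true;
        }
    }
}
```

It could then stand in for the flag in the query above, along the lines of subject.Where(_ => gate.TryEnter(DateTimeOffset.UtcNow)).Subscribe(DoStuff), where gate is a CooldownGate created with the cooldown interval.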
Use .Scan() !
This is what I use for Throttling when I need the first hit (after a certain period) immediately, but delay (and group/ignore) any subsequent hits.
It basically works like Throttle, but fires immediately if the previous OnNext was at least interval ago; otherwise, it schedules the hit at exactly interval after the previous one. And of course, if multiple hits arrive within the 'cooling down' period, the additional ones are ignored, just as Throttle does.
The difference with your use case is that if you get an event at 0 ms and 100 ms, they will both be handled (at 0ms and 500ms), which might be what you actually want (otherwise, the accumulator is easy to adapt to ignore ANY hit closer than interval to the previous one).
public static IObservable<T> QuickThrottle<T>(this IObservable<T> src, TimeSpan interval, IScheduler scheduler)
{
return src
.Scan(new ValueAndDueTime<T>(), (prev, id) => AccumulateForQuickThrottle(prev, id, interval, scheduler))
.Where(vd => !vd.Ignore)
.SelectMany(sc => Observable.Timer(sc.DueTime, scheduler).Select(_ => sc.Value));
}
private static ValueAndDueTime<T> AccumulateForQuickThrottle<T>(ValueAndDueTime<T> prev, T value, TimeSpan interval, IScheduler s)
{
var now = s.Now;
// Ignore this completely if there is already a future item scheduled
// but do keep the dueTime for accumulation!
if (prev.DueTime > now) return new ValueAndDueTime<T> { DueTime = prev.DueTime, Ignore = true };
// Schedule this item at at least interval from the previous
var min = prev.DueTime + interval;
var nextTime = (now < min) ? min : now;
return new ValueAndDueTime<T> { DueTime = nextTime, Value = value };
}
private class ValueAndDueTime<T>
{
public DateTimeOffset DueTime;
public T Value;
public bool Ignore;
}
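The adaptation mentioned above (ignoring any hit that is closer than interval to the previous one) might look roughly like this. It is a sketch under assumptions: "previous" is taken to mean the previously accepted hit, the current time is passed in explicitly instead of being read from the scheduler, and Stamped and StrictCooldown are made-up names.

```csharp
using System;

public sealed class Stamped<T>
{
    public DateTimeOffset AcceptedAt; // time of the most recent accepted hit
    public T Value;
    public bool Ignore;
}

public static class StrictCooldown
{
    // Accumulator intended for Scan: accepts a hit only if at least `interval`
    // has passed since the previously accepted hit; otherwise marks it ignored
    // while carrying the last accepted time forward.
    public static Stamped<T> Accumulate<T>(
        Stamped<T> prev, T value, TimeSpan interval, DateTimeOffset now)
    {
        if (now - prev.AcceptedAt < interval)
        {
            return new Stamped<T> { AcceptedAt = prev.AcceptedAt, Ignore = true };
        }
        return new Stamped<T> { AcceptedAt = now, Value = value };
    }
}
```

Seeded with new Stamped<T>(), whose AcceptedAt defaults to DateTimeOffset.MinValue, the first hit is always accepted. Since accepted hits are never deferred in this variant, the Timer step after the Where filter would presumably no longer be needed.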
I've got another one for you. This one uses neither Repeat() nor Interval(), so it might be what you are after:
subject
.Window(() => Observable.Timer(TimeSpan.FromMilliseconds(500)))
.SelectMany(x => x.Take(1));
Well, the most obvious thing would be to use Repeat() here. However, as far as I know, Repeat() might introduce a problem: notifications can be lost between the moment the stream stops and the moment we subscribe again. In practice this has never been a problem for me.
subject
.Take(1)
.Concat(Observable.Empty<long>().Delay(TimeSpan.FromMilliseconds(500)))
.Repeat();
Remember to replace long with the actual type of your source.
UPDATE:
Updated query to use Concat instead of Merge
I have stumbled upon this question while trying to re-implement my own solution to the same or similar problem using .Window
Take a look, it seems to be the same as this one and solved quite elegantly:
https://stackoverflow.com/a/3224723/58463
It's an old post, but no answer really met my needs, so I'm giving my own solution:
public static IObservable<T> ThrottleOrImmediate<T>(this IObservable<T> source, TimeSpan delay, IScheduler scheduler)
{
return Observable.Create<T>((obs, token) =>
{
// The next item cannot be sent before this time
DateTime nextItemTime = default;
return Task.FromResult(source.Subscribe(async item =>
{
var currentTime = DateTime.Now;
// If we have already reached the next item time
if (currentTime - nextItemTime >= TimeSpan.Zero)
{
// The following item will be sent only after the set delay
nextItemTime = currentTime + delay;
// send current item with scheduler
scheduler.Schedule(() => obs.OnNext(item));
}
// There is still time before we can send an item
else
{
// we schedule the time for the following item
nextItemTime = currentTime + delay;
try
{
await Task.Delay(delay, token);
}
catch (TaskCanceledException)
{
return;
}
// If the next item schedule was changed by another item, then we stop here
if (nextItemTime > currentTime + delay)
return;
else
{
// Set next possible time for an item and send item with scheduler
nextItemTime = currentTime + delay;
scheduler.Schedule(() => obs.OnNext(item));
}
}
}));
});
}
The first item is sent immediately, then the following items are throttled. If a later item arrives after the delay has elapsed, it is sent immediately too.
