await on observable to complete - c#

I have a method that does some work asynchronously using an observable. I would like to know the best way to make this method async, so that I can await it and do some work after the observable completes.
My first try was to await the observable directly.
public async Task DoWorkAsync()
{
    var observable = Observable.Create<int>(o =>
    {
        Task.Run(() =>
        {
            Thread.Sleep(1000);
            Console.WriteLine("OnNext");
            o.OnNext(1);
            o.OnError(new Exception("exception in observable logic"));
            //o.OnCompleted();
        });
        return Disposable.Empty;
    });
    //observable = observable.Publish().RefCount();
    observable.Subscribe(i => Console.WriteLine(i));
    Console.WriteLine("Awaiting...");
    await observable;
    Console.WriteLine("After Awaiting...");
}
Depending on the scenario I had different issues with that approach (+/- means that this part of code is uncommented/commented):
+OnNext +OnCompleted -OnError -RefCount: OnNext was invoked 2 times (observable was subscribed 2 times). This is what I would like to avoid.
+OnNext +OnCompleted -OnError +RefCount: When I use RefCount() method the code works.
-OnNext +OnCompleted -OnError +RefCount: "Sequence contains no element" exception is thrown when my observable doesn't raise OnNext.
+OnNext -OnCompleted +OnError -RefCount: OnNext was invoked 2 times. Exception raised.
+OnNext -OnCompleted +OnError +RefCount: Hangs after displaying 1 (probably because it wants to return to thread that is awaited). We can make it work (and raise exception) by using SubscribeOn(ThreadPoolScheduler.Instance)
In any case, when the observable is empty (no OnNext raised), we get an exception even if OnError is not called and there is no exception inside the observable logic. That's why awaiting the observable is not a good solution.
That is why I tried another solution using TaskCompletionSource:
public async Task DoWorkAsync()
{
    var observable = Observable.Create<int>(o =>
    {
        Task.Run(() =>
        {
            Thread.Sleep(1000);
            Console.WriteLine("OnNext");
            o.OnNext(1);
            o.OnError(new Exception("exception in observable logic"));
            //o.OnCompleted();
        });
        return Disposable.Empty;
    });
    var tcs = new TaskCompletionSource<bool>();
    observable.Subscribe(i => Console.WriteLine(i),
        e =>
        {
            //tcs.TrySetException(e);
            tcs.SetResult(false);
        },
        () => tcs.TrySetResult(true));
    Console.WriteLine("Awaiting...");
    await tcs.Task;
    Console.WriteLine("After Awaiting...");
}
It works OK in all scenarios. When OnError is invoked, we can either use tcs.SetResult(false), and have no information about the exception details in the outer method, or use tcs.TrySetException(e) and be able to catch the exception in the outer method.
Can you suggest a better/cleaner solution, or is my second solution the way to go?
EDIT
So I would like to know if there is a better solution than my second solution that will:
not require .Publish().RefCount()
not require an additional subscription (which is what happens under the hood in await observable: OnNext is invoked 2 times)
Of course I could wrap my solution in some async extension method for subscribing that returns a Task.
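The wrapper mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions: SubscribeAsync, CompletionObserver and DemoObservable are hypothetical names, not part of Rx, and the code relies only on the IObservable<T>/IObserver<T> interfaces from the base class library. It subscribes exactly once, completes the task on OnCompleted (even for an empty sequence) and faults it on OnError. The subscription's IDisposable is ignored here for brevity.

```csharp
using System;
using System.Threading.Tasks;

public static class ObservableTaskExtensions
{
    // Hypothetical helper (not an Rx API): one subscription, one Task.
    public static Task SubscribeAsync<T>(this IObservable<T> source, Action<T> onNext)
    {
        var tcs = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        source.Subscribe(new CompletionObserver<T>(onNext, tcs));
        return tcs.Task;
    }

    private sealed class CompletionObserver<T> : IObserver<T>
    {
        private readonly Action<T> _onNext;
        private readonly TaskCompletionSource<bool> _tcs;

        public CompletionObserver(Action<T> onNext, TaskCompletionSource<bool> tcs)
        {
            _onNext = onNext;
            _tcs = tcs;
        }

        public void OnNext(T value) => _onNext(value);

        // OnError faults the task, so the caller's await rethrows the exception.
        public void OnError(Exception error) => _tcs.TrySetException(error);

        // OnCompleted finishes the task whether or not any value was emitted.
        public void OnCompleted() => _tcs.TrySetResult(true);
    }
}

// Tiny cold observable used only to try the helper without Rx;
// with Rx referenced, Observable.Create would play this role.
public sealed class DemoObservable : IObservable<int>
{
    private readonly Action<IObserver<int>> _produce;

    public DemoObservable(Action<IObserver<int>> produce) => _produce = produce;

    public IDisposable Subscribe(IObserver<int> observer)
    {
        _produce(observer);
        return new NoopDisposable();
    }

    private sealed class NoopDisposable : IDisposable
    {
        public void Dispose() { }
    }
}
```

With this in place, the body of DoWorkAsync would reduce to something like await observable.SubscribeAsync(i => Console.WriteLine(i));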

EDIT:
If you remove the subscription you can do the following:
await observable.Do(i => Console.WriteLine(i)).LastOrDefaultAsync();
As for your arbitrary requirements... Not having multiple subscriptions for a cold observable makes sense; so you publish it. Refusing to use .Publish().RefCount() doesn't make sense. I don't understand why you're rejecting a solution that solves your problem.
There's a lot there, but I'm assuming this is your main question:
Anyway in case when observable is empty (no OnNext rised) we get
exception even if OnError is not called and we don't have any
exception inside observable logic. Thats why awaiting on observable is
not good solution.
await observable is the same as await observable.LastAsync(). So if there is no element, you get an exception. Imagine changing that statement to int result = await observable; What should the value of result be if there are no elements?
If you change await observable; to await observable.LastOrDefaultAsync(); everything should run smoothly.
And yes, you should use .Publish().RefCount()
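To make the difference on an empty sequence concrete, here is a minimal sketch. It assumes the System.Reactive package is referenced; EmptySequenceSketch and DescribeAsync are illustrative names only.

```csharp
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class EmptySequenceSketch
{
    public static async Task<string> DescribeAsync()
    {
        var empty = Observable.Empty<int>();

        // LastOrDefaultAsync yields default(int) when nothing was emitted.
        int fallback = await empty.LastOrDefaultAsync();

        try
        {
            // Awaiting the observable directly needs a last element,
            // so an empty sequence throws instead of returning.
            int last = await empty;
            return $"last={last}";
        }
        catch (InvalidOperationException)
        {
            return $"empty; LastOrDefaultAsync gave {fallback}";
        }
    }
}
```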

I'd clearly prefer the 2nd solution, because it only subscribes once.
But out of curiosity: what's the purpose of writing a method like this?
If it's to allow for configurable side effects, this would be equivalent:
public async Task DoWorkAsync()
{
    Action<int> onNext = Console.WriteLine;
    await Task.Delay(1000);
    onNext(1);
    throw new Exception("exception in DoWork logic"); // ... or don't
}

You could use ToTask extension method:
await observable.ToTask();

Related

How Task.WhenAll works under the hood

How does Task.WhenAll work under the hood? Does it create a separate thread that finishes once all of the tasks have signaled completion? My guess is that under the hood it creates a new thread, passes the work for each task to system drivers, and waits for them at the end, but I am not sure whether that is correct.
No, Task.WhenAll doesn't create a thread. It is possible that some of the element tasks passed to Task.WhenAll have created threads (but optimally they would not). Task.WhenAll itself just calls ContinueWith on the element tasks, passing a piece of code that checks the other task states. There is no "wait".
Here is an example of how Task.WhenAll may be implemented. (It is not the Microsoft code)
Task MyWhenAll(IEnumerable<Task> tasks)
{
    var a = tasks.ToArray();
    var tcs = new TaskCompletionSource<bool>();
    Array.ForEach(a, WatchTask);
    return tcs.Task;

    async void WatchTask(Task t)
    {
        try {
            await t;
        }
        catch {}
        if (a.All(element => element.IsCompleted)) {
            if (a.Any(element => element.IsFaulted))
                // omitted logic for adding each individual exception
                // to the aggregate
                tcs.TrySetException(new AggregateException());
            else
                tcs.TrySetResult(true);
        }
    }
}
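The ContinueWith-based description above can also be sketched directly. As with the previous example, this is an illustration and not the Microsoft implementation; WhenAllSketch and WhenAllViaContinuations are made-up names. Each continuation decrements a shared counter, and whichever continuation brings it to zero completes the result task, so no thread is ever blocked waiting. Unlike the example above, this sketch also completes immediately for an empty input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Illustration only; the framework's Task.WhenAll is implemented differently.
    public static Task WhenAllViaContinuations(IEnumerable<Task> tasks)
    {
        var all = tasks.ToArray();
        var tcs = new TaskCompletionSource<bool>();
        if (all.Length == 0)
        {
            tcs.SetResult(true);
            return tcs.Task;
        }

        int remaining = all.Length;
        foreach (var task in all)
        {
            // The continuation is a callback invoked when the task finishes.
            task.ContinueWith(_ =>
            {
                // Only the last task to finish goes on to complete the result.
                if (Interlocked.Decrement(ref remaining) != 0) return;

                var errors = all.Where(t => t.IsFaulted)
                                .SelectMany(t => t.Exception.InnerExceptions)
                                .ToList();
                if (errors.Count > 0) tcs.TrySetException(errors);
                else if (all.Any(t => t.IsCanceled)) tcs.TrySetCanceled();
                else tcs.TrySetResult(true);
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }
        return tcs.Task;
    }
}
```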

Create list of ActionBlock<T> that will complete when any fail

I have a scenario where await may be called on an 'empty' list of tasks.
How do I await a list of Task<T>, and then add new tasks to the awaited list until one fails or completes?
I am sure there must be an Awaiter or CancellationTokenSource solution for this problem.
public class LinkerThingBob
{
    private List<Task> ofmyactions = new List<Task>();
    public void LinkTo<T>(BufferBlock<T> messages) where T : class
    {
        var action = new ActionBlock<IMsg>(_ => this.Tx(messages, _));
        // this would not actually work, because the WhenAny
        // will not include subsequent actions.
        ofmyactions.Add(action.Completion);
        // link the new action block.
        this._inboundMessageBuffer.LinkTo(action);
    }
    // used to catch exceptions since these blocks typically don't end.
    public async Task CompletionAsync()
    {
        // how do i make the awaiting thread add a new action
        // to the list of waiting tasks without interrupting it
        // or graciously interrupting it to let it know there's one more
        // more importantly, this CompletionAsync might actually be called
        // before the first action is added to the list, so I actually need
        // WhenAny(INFINITE + ofmyactions)
        await Task.WhenAny(ofmyactions);
    }
}
My problem is that I need a mechanism where I can add each of the action instances created above to a Task<T> that will complete when there is an exception.
I am not sure how best to explain this but:
The task must not complete until at least one call to LinkTo<T> has been made, so I need to start with an infinite task.
Each time LinkTo<T> is called, the new action must be added to the list of tasks, which may already be awaited on another thread.
There isn't anything built-in for this, but it's not too hard to build one using TaskCompletionSource<T>. TCS is the type to use when you want to await something and there isn't already a construct for it. (Custom awaiters are for more advanced scenarios).
In this case, something like this should suffice:
public class LinkerThingBob
{
    private readonly TaskCompletionSource<object> _tcs = new TaskCompletionSource<object>();
    private async Task ObserveAsync(Task task)
    {
        try
        {
            await task;
            _tcs.TrySetResult(null);
        }
        catch (Exception ex)
        {
            _tcs.TrySetException(ex);
        }
    }
    public void LinkTo<T>(BufferBlock<T> messages) where T : class
    {
        var action = new ActionBlock<IMsg>(msg => this.Tx(messages, msg));
        // start observing; the returned task is intentionally not awaited
        var _ = ObserveAsync(action.Completion);
        this._inboundMessageBuffer.LinkTo(action);
    }
    public Task Completion { get { return _tcs.Task; } }
}
Completion starts in a non-completed state. Any number of blocks can be linked to it using ObserveAsync. As soon as one of the blocks completes, Completion also completes. I wrote ObserveAsync here in a way so that if the first completed block completes without error, then so will Completion; and if the first completed block completes with an exception, then Completion will complete with that same exception. Feel free to tweak for your specific needs. :)
This is a solution that uses exclusively tools of the TPL Dataflow library itself. You can create a TransformBlock that will "process" the ActionBlocks you want to observe. Processing a block means simply awaiting for its completion. So the TransformBlock takes incomplete blocks, and outputs the same blocks as completed. The TransformBlock must be configured with unlimited parallelism and capacity, and with ordering disabled, so that all blocks are observed concurrently, and each one that completes is returned instantly.
var allBlocks = new TransformBlock<ActionBlock<IMsg>, ActionBlock<IMsg>>(async block =>
{
    try { await block.Completion; }
    catch { }
    return block;
}, new ExecutionDataflowBlockOptions()
{
    MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded,
    EnsureOrdered = false
});
Then inside the LinkerThingBob.LinkTo method, send the created ActionBlocks to the TransformBlock.
var actionBlock = new ActionBlock<IMsg>(_ => this.Tx(messages, _));
allBlocks.Post(actionBlock);
Now you need a target to receive the first faulted block. A WriteOnceBlock is quite suitable for this role, since it ensures that it will receive at most one faulted block.
var firstFaulted = new WriteOnceBlock<ActionBlock<IMsg>>(x => x);
allBlocks.LinkTo(firstFaulted, block => block.Completion.IsFaulted);
Finally you can await at any place for the completion of the WriteOnceBlock. It will complete immediately after receiving a faulted block, or it may never complete if it never receives a faulted block.
await firstFaulted.Completion;
After the awaiting you can also get the faulted block if you want.
ActionBlock<IMsg> faultedBlock = firstFaulted.Receive();
The WriteOnceBlock is special in how it behaves when it forwards messages. Unlike most other blocks, you can call its Receive method multiple times, and you'll always get the same single item it contains (it is not removed from its buffer after the first Receive).

How to use async countdown event instead of collecting tasks and awaiting on them?

I have the following code:
var tasks = await taskSeedSource
    .Select(taskSeed => GetPendingOrRunningTask(taskSeed, createTask, onFailed, onSuccess, sem))
    .ToList()
    .ToTask();
if (tasks.Count == 0)
{
    return;
}
if (tasks.Contains(null))
{
    tasks = tasks.Where(t => t != null).ToArray();
    if (tasks.Count == 0)
    {
        return;
    }
}
await Task.WhenAll(tasks);
Here taskSeedSource is a Reactive Observable. This code may well have many problems, but I see at least two:
I am collecting tasks whereas I could do without it.
Somehow, the returned tasks list may contain nulls, even though GetPendingOrRunningTask is an async method and hence never returns null. I failed to understand why it happens, so I had to defend against it without understanding the cause of the problem.
I would like to use the AsyncCountdownEvent from the AsyncEx framework instead of collecting the tasks and then awaiting on them.
So, I can pass the countdown event to GetPendingOrRunningTask which will increment it immediately and signal before returning after awaiting for the completion of its internal logic. However, I do not understand how to integrate the countdown event into the monad (that is the Reactive jargon, isn't it?).
What is the right way to do it?
EDIT
Guys, let us forget about the mysterious nulls in the returned list. Suppose everything is green and the code is
var tasks = await taskSeedSource
    .Select(taskSeed => GetPendingOrRunningTask(taskSeed, ...))
    .ToList()
    .ToTask();
await Task.WhenAll(tasks);
Now the question is how do I do it with the countdown event? So, suppose I have:
var c = new AsyncCountdownEvent(1);
and
async Task GetPendingOrRunningTask<T>(AsyncCountdownEvent c, T taskSeed, ...)
{
    c.AddCount();
    try
    {
        await ....
    }
    catch (Exception exc)
    {
        // The exception is handled
    }
    c.Signal();
}
My problem is that I no longer need the returned task. These tasks were collected and awaited to detect the moment when all the work items are over, but now the countdown event can be used to indicate when the work is over.
My problem is that I am not sure how to integrate it into the Reactive chain. Essentially, the GetPendingOrRunningTask can be async void. And here I am stuck.
EDIT 2
Strange appearance of a null entry in the list of tasks
@Servy is correct that you need to solve the null Task problem at the source. Nobody wants to answer a question about how to work around a problem that violates the contracts of a method that you've defined yourself and yet haven't provided the source for examination.
As for the issue about collecting tasks, it's easy to avoid with Merge if your method returns a generic Task<T>:
await taskSeedSource
    .Select(taskSeed => GetPendingOrRunningTask(taskSeed, createTask, onFailed, onSuccess, sem))
    .Where(task => task != null) // According to you, this shouldn't be necessary.
    .Merge();
Unfortunately, there's no official Merge overload for the non-generic Task, but that's easy enough to define:
public static IObservable<Unit> Merge(this IObservable<Task> sources)
{
    return sources.Select(async source =>
    {
        await source.ConfigureAwait(false);
        return Unit.Default;
    })
    .Merge();
}

Is it in general dubious to call Task.Factory.StartNew(async () => {}) in Subscribe?

I have a situation where I need to use a custom scheduler to run tasks (these need to be tasks) and the scheduler does not set a synchronization context (so no ObserveOn, SubscribeOn, SynchronizationContextScheduler etc. I gather). The following is how I ended up doing it. Now, I wonder, I'm not really sure if this is the fittest way of doing asynchronous calls and awaiting their results. Is this all right or is there a more robust or idiomatic way?
var orleansScheduler = TaskScheduler.Current;
var someObservable = ...;
someObservable.Subscribe(i =>
{
    Task.Factory.StartNew(async () =>
    {
        return await AsynchronousOperation(i);
    }, CancellationToken.None, TaskCreationOptions.None, orleansScheduler);
});
What if awaiting wouldn't be needed?
Edit: I found a concrete and simplified example of what I'm doing here. Basically I'm using Rx in Orleans and the above code is a bare-bones illustration of what I'm up to. I'm also interested in this situation in general, though.
The final code
It turns out this was a bit tricky in the Orleans context. I don't see how I could get to use ObserveOn, which would be just the thing I'd like to use. The problem is that by using it, the Subscribe would never get called. The code:
var orleansScheduler = TaskScheduler.Current;
var factory = new TaskFactory(orleansScheduler);
var rxScheduler = new TaskPoolScheduler(factory);
var someObservable = ...;
someObservable
    //.ObserveOn(rxScheduler) This doesn't look useful since...
    .SelectMany(i =>
    {
        //... we need to set the custom scheduler here explicitly anyway.
        //See Async SelectMany at http://log.paulbetts.org/rx-and-await-some-notes/.
        //Doing the "shorthand" form of .SelectMany(async... would call Task.Run, which
        //in turn runs always on .NET ThreadPool and not on Orleans scheduler and hence
        //the following .Subscribe wouldn't be called.
        return Task.Factory.StartNew(async () =>
        {
            //In reality this is an asynchronous grain call. Doing the "shorthand way"
            //(and optionally using ObserveOn) would get the grain called, but not the
            //following .Subscribe.
            return await AsynchronousOperation(i);
        }, CancellationToken.None, TaskCreationOptions.None, orleansScheduler).Unwrap().ToObservable();
    })
    .Subscribe(i =>
    {
        Trace.WriteLine(i);
    });
Also, a link to a related thread at Codeplex Orleans forums.
I strongly recommend against StartNew for any modern code. It does have a use case, but it's very rare.
If you have to use a custom task scheduler, I recommend using ObserveOn with a TaskPoolScheduler constructed from a TaskFactory wrapper around your scheduler. That's a mouthful, so here's the general idea:
var factory = new TaskFactory(customScheduler);
var rxScheduler = new TaskPoolScheduler(factory);
someObservable.ObserveOn(rxScheduler)...
Then you could use SelectMany to start an asynchronous operation for each event in a source stream as they arrive.
An alternative, less ideal solution is to use async void for your subscription "events". This is acceptable, but you have to watch your error handling. As a general rule, don't allow exceptions to propagate out of an async void method.
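A minimal sketch of that error-handling rule follows. The names are hypothetical: AsyncVoidHandlerSketch is made up, and the AsynchronousOperation shown here is only a stand-in for the real call. The handler has the shape of an OnNext callback (it returns void, so nobody can await it), which is why every exception is caught inside it and routed to an explicit error callback.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncVoidHandlerSketch
{
    // Hypothetical stand-in for the real asynchronous operation.
    private static async Task<int> AsynchronousOperation(int i)
    {
        await Task.Delay(10);
        if (i < 0) throw new ArgumentOutOfRangeException(nameof(i));
        return i * 2;
    }

    // async void: callers cannot observe completion or failure,
    // so nothing is allowed to escape this method.
    public static async void OnNext(int i, Action<int> onResult, Action<Exception> onError)
    {
        try
        {
            onResult(await AsynchronousOperation(i));
        }
        catch (Exception ex)
        {
            // An exception escaping an async void method is rethrown on the
            // captured context (or the thread pool) and usually ends the process.
            onError(ex);
        }
    }
}
```

A subscription would then look something like someObservable.Subscribe(i => AsyncVoidHandlerSketch.OnNext(i, r => Trace.WriteLine(r), ex => Trace.WriteLine(ex)));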
There is a third alternative, where you hook an observable into a TPL Dataflow block. A block like ActionBlock can specify its task scheduler, and Dataflow naturally understands asynchronous handlers. Note that by default, Dataflow blocks will throttle the processing to a single element at a time.
Generally speaking, instead of subscribing to execute, it's better/more idiomatic to project the task parameters into the task execution and subscribe just for the results. That way you can compose with further Rx downstream.
e.g. Given a random task like:
static async Task<int> DoubleAsync(int i, Random random)
{
    Console.WriteLine("Started");
    await Task.Delay(TimeSpan.FromSeconds(random.Next(10) + 1));
    return i * 2;
}
Then you might do:
void Main()
{
    var random = new Random();
    // stream of task parameters
    var source = Observable.Range(1, 5);
    // project the task parameters into the task execution, collect and flatten results
    source.SelectMany(i => DoubleAsync(i, random))
        // subscribe just for results, which turn up as they are done
        // gives you flexibility to continue the rx chain here
        .Subscribe(result => Console.WriteLine(result),
                   () => Console.WriteLine("All done."));
}

Should I use a regular Task or a continuation Task?

Suppose the following method is defined:
Task<TResult> DoStuffAsync()
{
// ...
}
Consider the following code (snippet 1):
void MyFunction()
{
    Task<TResult> task = DoStuffAsync();
    // ContinueWith passes the antecedent task to the lambda
    task.ContinueWith(async _ => {
        TResult result = await task;
        // do stuff with result
    });
    // poll `task` status...
}
in comparison to the following code (snippet 2):
void MyFunction()
{
    Task<TResult> task = DoStuffAsync();
    Task.Run(async () => {
        TResult result = await task;
        // do stuff with result
    });
    // poll `task` status...
}
(Note how I do not care about the status of the lambda function (fire-and-forget). But I do care if it raises any exceptions.)
The first difference between the two options seems clear: in (snippet 1) the code in the lambda will only begin execution after DoStuffAsync() has completed, whereas in (snippet 2) the code in the lambda will attempt to begin execution immediately and proceed when DoStuffAsync() has completed.
However, apart from this difference, when should you use ContinueWith and when should you use Task.Run? What happens if an exception is raised in DoStuffAsync() or in the lambda function? Will it be swallowed or is every potential exception guaranteed to be raised to a block where it can be handled?
In your first case there's no need for the lambda to be async. There is no need to await the task. You can just use Result because you know that the task has already completed by that point in time.
For your second example, you're scheduling the thread pool thread to perform the creation of a state machine that will merely schedule some code to run when the task finishes. There's no real need for this at all. The Task.Run is adding nothing here.
Most likely the method itself should be an async method:
private async Task MyFunction() {
    var result = await DoStuffAsync();
    // do stuff with result
}
All that said, while both of your solutions have a lot of superfluous work, both will propagate exceptions from your underlying work to the tasks that each operation computes (although you don't store that task anywhere in your first example, so you have no way of inspecting that task to see if it faulted).
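The propagation difference can be seen in a small sketch. ExceptionFlowSketch and its methods are hypothetical names, and the DoStuffAsync here is a stand-in that always fails. With a plain await the caller sees the original exception; with ContinueWith the failure lands only on the continuation task, and goes unobserved if that task is thrown away.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionFlowSketch
{
    // Hypothetical stand-in for DoStuffAsync that always fails.
    private static async Task<int> DoStuffAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("failure in DoStuffAsync");
    }

    // async/await shape: awaiting this rethrows the InvalidOperationException.
    public static async Task<int> AwaitDirectlyAsync()
    {
        int result = await DoStuffAsync();
        return result + 1;
    }

    // ContinueWith shape: reading antecedent.Result throws an AggregateException
    // that wraps the original failure, which faults the continuation task.
    // The caller must keep and observe the returned task to see it.
    public static Task<int> ContinueWithAsync()
    {
        Task<int> task = DoStuffAsync();
        return task.ContinueWith(antecedent => antecedent.Result + 1);
    }
}
```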
