Async CTP Task.WhenAll issue


I am stuck on something which appears simple, but I cannot see what I'm doing wrong. I have a simple class, 'StaticQuote', which describes the values returned from a complex calculation, and I am trying to find the lowest value. Because the Calculate call is expensive, I am creating an array of tasks so the calculations execute in parallel, and then using Task.WhenAll to wait until they have all finished before comparing the results (which are stored in StaticQuote objects). The problem is that when I try to return the array of StaticQuotes I get the following error:
Cannot implicitly convert type 'System.Threading.Tasks.Task<Services.QuoteGeneratorAsync.StaticQuote[]>' to 'Services.QuoteGeneratorAsync.StaticQuote[]'
I have seen similar examples with strings etc. where this assignment works perfectly, so I cannot understand why the right-hand side is not returning an array of StaticQuote results. I am new to multi-threaded code and the Async CTP. Can anyone provide the answer? Many thanks.
Example problem:
List<Task<StaticQuote>> Calculations = new List<Task<StaticQuote>>();
foreach()
{
Calculations.Add(TaskEx.RunEx(() => Calculate(...my params....)));
}
StaticQuote[] Quotes = TaskEx.WhenAll<StaticQuote>(Calculations); // this line won't compile

TaskEx.WhenAll returns a Task<T[]> which indicates when all the other tasks have finished. So you want:
StaticQuote[] quotes = await TaskEx.WhenAll(Calculations);
The await expression "unwraps" a Task<T> to a T. So elsewhere if you've got:
Task<string> downloadTask = webClient.DownloadStringTaskAsync(url);
string result = await downloadTask;
it's exactly the same thing - it's just that the WhenAll version is slightly more complicated because it's got a collection of task inputs and outputs, instead of a single one.
Obviously in order to use await you have to be in an async method to start with.
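As a minimal sketch of the pattern described in this answer, the following uses Task.Run and Task.WhenAll, which are the released equivalents of the CTP's TaskEx.RunEx and TaskEx.WhenAll. The shape of StaticQuote, the Calculate stand-in, and the input values are assumptions made up for illustration; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical shape; the question does not show the real StaticQuote.
public sealed record StaticQuote(string Provider, decimal Price);

public static class QuoteDemo
{
    // Stand-in for the expensive Calculate call (assumed signature).
    public static StaticQuote Calculate(string provider, decimal basePrice)
    {
        return new StaticQuote(provider, basePrice * 1.1m);
    }

    public static async Task<StaticQuote> FindCheapestAsync()
    {
        var inputs = new (string Provider, decimal BasePrice)[]
        {
            ("A", 120m), ("B", 95m), ("C", 110m)
        };

        // Start every calculation on the thread pool without waiting for it.
        var calculations = new List<Task<StaticQuote>>();
        foreach (var input in inputs)
        {
            calculations.Add(Task.Run(() => Calculate(input.Provider, input.BasePrice)));
        }

        // WhenAll returns Task<StaticQuote[]>; await unwraps it to StaticQuote[].
        StaticQuote[] quotes = await Task.WhenAll(calculations);
        return quotes.OrderBy(q => q.Price).First();
    }

    public static async Task Main()
    {
        StaticQuote cheapest = await FindCheapestAsync();
        Console.WriteLine(cheapest.Provider);
    }
}
```

Note that the method doing the await is itself marked async and returns Task<StaticQuote>, so its callers need to await it in turn.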
If all of this is still confusing, you might want to read my blog posts about async, as well as those of Eric Lippert. (There are plenty of others available too, of course.)

Related

Async LINQ - not lazy? Multithreaded?

I have the following code:
var things = await GetDataFromApi(cancellationToken);
var builder = new StringBuilder(JsonSerializer.Serialize(things));
await things
.GroupBy(x => x.Category)
.ToAsyncEnumerable()
.SelectManyAwaitWithCancellation(async (category, ct) =>
{
var thingsWithColors = await _colorsApiClient.GetColorsFor(category.Select(thing => thing.Name).ToList(), ct);
return category
.Select(thing => ChooseBestColor(thingsWithColors))
.ToAsyncEnumerable();
})
.ForEachAsync(thingAndColor =>
{
Console.WriteLine(Thread.CurrentThread.ManagedThreadId); // prints different IDs
builder.Replace(thingAndColor.Thing, $"{thingAndColor.Color} {thingAndColor.Thing}");
}, cancellationToken);
It uses System.Linq.Async and I find it difficult to understand.
In "classic"/synchronous LINQ, the whole thing would get executed only when I call ToList() or ToArray() on it. In the example above, there is no such call, but the lambdas get executed anyway. How does it work?
The other concern I have is about multi-threading. I have heard many times that async != multithreading. Then how is it possible that Console.WriteLine(Thread.CurrentThread.ManagedThreadId); prints various IDs? Some of the IDs get printed multiple times, but overall there are about 5 thread IDs in the output. None of my code creates any threads explicitly. It's all async-await.
The StringBuilder does not support multi-threading, and I'd like to understand if the implementation above is valid.
Please ignore the algorithm of my code, it does not really matter, it's just an example. What matters is the usage of System.Linq.Async.

ForEachAsync would have a similar effect as ToList/ToArray since it forces evaluation of the entire list.
By default, anything after an await continues on the same execution context, meaning if the code runs on the UI thread, it will continue running on the UI thread. If it runs on a background thread, it will continue to run on a background thread, but not necessarily the same one.
However, none of your code should run in parallel. That does not necessarily mean it is thread-safe: there probably need to be some memory barriers to ensure data is flushed correctly, but I would assume these barriers are issued by the framework code itself.

System.Linq.Async, like the whole dotnet/reactive repository, is currently a semi-abandoned project. The issues on GitHub are piling up, and nobody has answered them officially for almost a year. There is no documentation published, apart from the XML documentation in the source code on top of each method. You can't really use this library without studying the source code, which is generally easy to do because the code is short, readable, and honestly doesn't do too much. The functionality offered by this library is similar to the functionality found in System.Linq, with the main difference being that the input is IAsyncEnumerable<T> instead of IEnumerable<T>, and the delegates can return values wrapped in ValueTask<T>s.
With the exception of a few operators like Merge (and only one of its overloads), System.Linq.Async doesn't introduce concurrency. The asynchronous operations are invoked one at a time, and each one is awaited before the next operation is invoked. The SelectManyAwaitWithCancellation operator is not one of the exceptions. The selector is invoked sequentially for each element, the resulting IAsyncEnumerable<TResult> is enumerated sequentially, and its values are yielded one after the other. So it's unlikely to create thread-safety issues.
The ForEachAsync operator is just a substitute for a standard await foreach loop, and was included in the library at a time when C# language support for await foreach did not exist (before C# 8). I would recommend against using this operator, because its resemblance to the new Parallel.ForEachAsync API could create confusion. Here is what is written inside the source code of the ForEachAsync operator:
// REVIEW: Once we have C# 8.0 language support, we may want to do away with these
// methods. An open question is how to provide support for cancellation,
// which could be offered through WithCancellation on the source. If we still
// want to keep these methods, they may be a candidate for
// System.Interactive.Async if we consider them to be non-standard
// (i.e. IEnumerable<T> doesn't have a ForEach extension method either).
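To make the sequential behaviour concrete, here is a minimal sketch that uses only the base class library (an async iterator plus await foreach) rather than System.Linq.Async. The GetColoredThingsAsync method and its data are hypothetical stand-ins for the API calls in the question. Nothing in the iterator runs until the await foreach loop starts pulling items, and each item is produced and consumed before the next one is requested, even though the thread may change between awaits.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitForeachDemo
{
    // Hypothetical source: yields each item after a simulated asynchronous call.
    public static async IAsyncEnumerable<string> GetColoredThingsAsync(IEnumerable<string> things)
    {
        foreach (var thing in things)
        {
            await Task.Delay(10); // stands in for a remote API call
            yield return $"red {thing}";
        }
    }

    public static async Task<List<string>> CollectAsync()
    {
        var results = new List<string>();

        // Plain await foreach: the loop body runs once per item, never concurrently.
        await foreach (var item in GetColoredThingsAsync(new[] { "car", "bike" }))
        {
            results.Add(item);
        }

        return results;
    }

    public static async Task Main()
    {
        foreach (var item in await CollectAsync())
        {
            Console.WriteLine(item);
        }
    }
}
```

Because the loop body never overlaps with itself, appending to a non-thread-safe object such as a List<T> or StringBuilder inside it does not involve concurrent access.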

Not sure where to use an await operator in an async method

After reading a lot of material about async/await, I'm still not sure where to use the await operator in my async method:
public async Task<IActionResult> DailySchedule(int professionalId, DateTime date)
{
var professional = professionalServices.Find(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
return PartialView(appointments);
}
Should I create an async version of all three methods and call them like this?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsAsync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);
Or should I make only the first one async?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
What's the difference?

I'm still not sure where to use the await operator in my async method
You're approaching the problem from the wrong end.
The first thing to change when converting to async is the lowest-level API calls. There are some operations that are naturally asynchronous - specifically, I/O operations. Other operations are naturally synchronous - e.g., CPU code.
Based on the names "Find", "SearchForAppointments", and "SomeCalculation", I'd suspect that Find and SearchForAppointments are I/O-based, possibly hitting a database, while SomeCalculation is just doing some CPU calculation.
But don't change Find to FindAsync just yet. That's still going the wrong way. The right way is to start at the lowest API, so whatever I/O that Find eventually does. For example, if this is an EF query, then use the EF6 asynchronous methods within Find. Then Find should become FindAsync, which should drive DailySchedule to be DailyScheduleAsync, which causes its callers to become async, etc. That's the way that async should grow.
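The direction of growth described above might look like the following sketch. Everything here is hypothetical: the I/O is simulated with Task.Delay instead of a real EF query, and the return types are simplified to strings and integers. The point is only the shape: the lowest-level I/O call is asynchronous first, the methods above it become async because they await it, and the CPU-bound calculation stays synchronous.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ScheduleDemo
{
    // Lowest level: naturally asynchronous I/O (simulated; a real version
    // might await an EF asynchronous query here instead).
    private static async Task<string> QueryProfessionalAsync(int professionalId)
    {
        await Task.Delay(10);
        return $"Professional #{professionalId}";
    }

    // Find becomes FindAsync because the I/O underneath it is asynchronous.
    public static async Task<string> FindAsync(int professionalId)
    {
        return await QueryProfessionalAsync(professionalId);
    }

    // Also I/O-based in this sketch, so it is asynchronous too.
    public static async Task<List<int>> SearchForAppointmentsAsync(DateTime date, string professional)
    {
        await Task.Delay(10);
        return new List<int> { 14, 9, 11 }; // appointment hours, made up
    }

    // CPU-bound work: no I/O, so it stays synchronous.
    public static List<int> SomeCalculation(List<int> appointments)
    {
        return appointments.OrderBy(hour => hour).ToList();
    }

    // The caller becomes async only because it awaits the methods above.
    public static async Task<List<int>> DailyScheduleAsync(int professionalId, DateTime date)
    {
        var professional = await FindAsync(professionalId);
        var appointments = await SearchForAppointmentsAsync(date, professional);
        return SomeCalculation(appointments);
    }

    public static async Task Main()
    {
        var schedule = await DailyScheduleAsync(1, DateTime.Today);
        Console.WriteLine(string.Join(",", schedule));
    }
}
```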

As VSG24 has said, you should await each and every call that can be awaited, because this keeps the main thread free from long-running tasks such as downloading data from the internet or saving data to disk. The problem used to be that a long-running task always froze the UI. The asynchronous pattern was introduced to overcome this: async/await lets you write a function that performs the long-running work in the background while your main thread keeps responding to the user.
I'm still not sure where to use the await operator in my async method
The intuition is that every function whose name ends with Async can be awaited, provided its signature matches: it must return either a Task or a Task<T>. A function that returns void cannot be awaited. So you should apply await wherever a function may take a while to respond. My own opinion is that a function that might take more than 1 second should be awaited, because 1 second on your device may be 5 seconds, or worse 10 seconds, for your client. They are just going to close the application and walk away saying, "It doesn't work!"
Ok, if one of these is very very fast, it does not need to be async, right?
Fast in what manner? Don't forget that your clients may not have very fast machines, and they are the ones who will suffer from a frozen application. So do them a favor and always await the functions.
What's the difference?
The difference is that in the last code sample only the first call will be awaited and the others will execute synchronously. Visual Studio will also explain this to you by showing green squiggly lines under the code, with details available when you hover over them. That is why you should await every async call where possible.
Tip: If the work in the function is required before anything else can proceed, such as loading all of the data before starting the application, then you should avoid the async/await pattern and instead use the synchronous approach to download the data and perform other tasks. So it depends entirely on what you are doing rather than on what the documentation says. :-)

Every async method call needs to be awaited, so that each call releases the thread instead of blocking it. Therefore, in your case, this is how you should do it:
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsAsync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);

Flow context/state through generated continuations

First, some context (pardon the pun). Consider the following two async methods:
public async Task Async1() {
PreWork();
await Async2();
PostWork();
}
public async Task Async2() {
await Async3();
}
Thanks to the async and await keywords, this creates the illusion of a nice simple call stack:
Async1
    PreWork
    Async2
        Async3
    PostWork
But, if I understand correctly, in the generated code the call to PostWork is tacked on as a continuation to the Async2 task, so at runtime the flow of execution is actually more like this:
Async1
    PreWork
    Async2
        Async3
            PostWork
(and actually it's even more complicated than that, because in reality the use of async and await causes the compiler to generate state machines for each async method, but we might be able to ignore that detail for this question)
Now, my question: Is there any way to flow some sort of context through these auto-generated continuations, such that by the time I hit PostWork, I have accumulated state from Async1 and Async2?
I can get something similar to what I want with tools like AsyncLocal and CallContext.LogicalSetData, but they aren't quite what I need because the contexts are "rolled back" as you work your way back up the async chain. For example, calling Async1 in the following code will print "1,3", not "1,2,3":
private static readonly AsyncLocal<ImmutableQueue<String>> _asyncLocal = new AsyncLocal<ImmutableQueue<String>>();
public async Task Async1() {
_asyncLocal.Value = ImmutableQueue<String>.Empty.Enqueue("1");
await Async2();
_asyncLocal.Value = _asyncLocal.Value.Enqueue("3");
Console.WriteLine(String.Join(",", _asyncLocal.Value));
}
public async Task Async2() {
_asyncLocal.Value = _asyncLocal.Value.Enqueue("2");
await Async3();
}
I understand why this prints "1,3" (the execution context flows down to Async2 but not back up to Async1) but that isn't what I'm looking for. I really want to accumulate state through the actual execution chain, such that I'm able to print "1,2,3" at the end because that was the actual way in which the methods were executed leading up to the call to Console.WriteLine.
Note that I don't want to blindly accumulate all state across all async work, I only want to accumulate state that is causally related. In this scenario I want to print "1,2,3" because that "2" came from a dependent (awaited) task. If instead the call to Async2 was just a fire-and-forget task then I wouldn't expect to see "2" because its execution would not be in the actual chain leading up to the Console.WriteLine.
Edit: I do not want to solve this problem just passing around parameters and return values because I need a generic solution that will work across a large code base without having to modify every method to pass around this metadata.
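For reference, the "rolled back" behaviour described in the question can be reproduced with the following self-contained sketch. It is only a demonstration of the problem, not a solution to it. The question does not show Async3, so the version here (a short Task.Delay) is an assumption, and the method returns the joined string instead of printing it so that the result is easy to inspect.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalDemo
{
    private static readonly AsyncLocal<ImmutableQueue<string>> Trail =
        new AsyncLocal<ImmutableQueue<string>>();

    public static async Task<string> Async1()
    {
        Trail.Value = ImmutableQueue<string>.Empty.Enqueue("1");
        await Async2();

        // Back in Async1 the value is still the queue containing only "1":
        // the assignment made inside Async2 did not flow back up.
        Trail.Value = Trail.Value.Enqueue("3");
        return string.Join(",", Trail.Value);
    }

    private static async Task Async2()
    {
        // This assignment is visible inside Async2 and anything it calls,
        // but is discarded when control returns to the caller.
        Trail.Value = Trail.Value.Enqueue("2");
        await Async3();
    }

    // Assumed body: the question does not show Async3.
    private static async Task Async3()
    {
        await Task.Delay(10);
    }

    public static async Task Main()
    {
        Console.WriteLine(await Async1()); // prints "1,3"
    }
}
```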

Does IEnumerable<T>.Count() actually work for IObservable<T>?

Is there an example out there showing me how the Observable.Count<TSource> Method actually works? The examples I come up with appear to return a count wrapped in an observable instead of the expected count.
For example, I expect 1 to be returned from this:
System.Diagnostics.Debug.WriteLine((Observable.Return<string>("Hello world!")).Count());
Will 1 be returned in the future (because, after all, it is an asynchronous sequence)? Or am I missing something fundamental? As of this writing, I actually assume that .Count() will return the results of T and grow over time as long as results are pushed out. Really? Yes.

The aggregate operators in Rx work a bit differently than in LINQ - they do not immediately return a value, they return a future result (i.e. we can only know what the final Count is once the Observable completes).
So if you write:
Observable.Return("foo").Count().Subscribe(x => Console.WriteLine(x));
>>> 1
because, after all, it is an asynchronous sequence
This actually isn't exactly true. Here, everything will be run immediately, as soon as somebody calls Subscribe. There is nothing asynchronous about this code above, there are no extra threads, everything happens on the Subscribe.

I think that using an observable that returns immediately and also using the async/await syntax as rasx did in the comments is confusing matters rather too much.
Let's create a stream with 5 elements that come back one every second and then complete:
private IObservable<long> StreamWith5Elements()
{
return Observable.Interval(TimeSpan.FromSeconds(1))
.Take(5);
}
We can call it using async/await magic as in this LINQPad friendly example:
void Main()
{
CountExampleAsync().Wait();
}
private async Task CountExampleAsync()
{
int result = await StreamWith5Elements().Count();
Console.WriteLine(result);
}
But what's going on here is a little misleading: Count() returns an IObservable<int>, but Rx is super-friendly with await and converts that result stream into a Task<int>, and the await then hands back that task's int result.
When you use await against an IObservable<T>, you are implicitly saying that you expect that observable to call OnNext() with a single result and then call OnCompleted(). What actually happens is that you will get a Task<T> that returns the last value sent before the stream terminated. (Similar to how AsyncSubject<T> behaves.)
This is useful because it means any stream can be mapped to a Task, but it does require some careful thought.
Now, the above example is equivalent to the following more traditional Rx:
void Main()
{
PlainRxCountExample();
}
private void PlainRxCountExample()
{
IObservable<int> countResult = StreamWith5Elements().Count();
countResult.Subscribe(count => Console.WriteLine(count));
/* block until completed for the sake of the example */
countResult.Wait();
}
Here you can see that Count() is indeed returning a stream of int - to provide an asynchronous count. It will return only when the source stream completes.
In the early days of Rx, Count() was in fact synchronous.
However, that's not a terribly useful state of affairs since it "Exits the Monad" - i.e. brings you out of IObservable<T> and prevents you from further composition with Rx operators.
Once you start "thinking in streams", the asynchronous nature of Count() is quite intuitive really, since of course you can only provide a count of a stream when it's finished - and why hang around for that?? :)

What happens when you await a Task before returning it?

I am experimenting with the new async and await keywords. I produced the following asynchronous function:
private async static Task<string> GetStringAsync(string pageAddress)
{
HttpClient client = new HttpClient();
return client.GetStringAsync(pageAddress);
}
I understand that I am returning a Task<String> and can await the result from another method. This method works fine. My question is what happens (under the hood as it were) when I replace the second line of the above function with the following (notice the introduction of the await keyword):
return await client.GetStringAsync(pageAddress);
The function behaves in exactly the same way! Remember the function returns Task<string> not string. Is the await keyword here degenerate? Does the compiler simply strip it from my code?

The answer to this question is too large to post here given your likely current level of understanding. What you should do is start by reading my MSDN article and then Mads' MSDN article; they are a good beginner introduction to the feature and Mads describes how it is implemented. You can find links here:
http://blogs.msdn.com/b/ericlippert/archive/2011/10/03/async-articles.aspx
Then if you are interested in the theory underlying the feature you should start by reading all my articles on continuation passing style:
http://blogs.msdn.com/b/ericlippert/archive/tags/continuation+passing+style/
Start from the bottom. Once you understand the notion of continuation, you can then read my series of articles on how we designed the async feature:
http://blogs.msdn.com/b/ericlippert/archive/tags/async/

As Eric Lippert pointed out, the first version won't compile; you have to remove the async keyword or you'll get a type error.
Here's a useful mental model regarding how the async and await keywords work with the return type:
Any value T returned by an async method is "wrapped" into a Task<T>.
The await keyword (which you can think of as an operator), when applied to a Task<T>, will "unwrap" it, resulting in a value of type T.
Now, that's an extreme simplification; what's actually happening is more complicated. E.g., this simplification skips over how await works with the current SynchronizationContext: in the second example, the method will attempt to return to the original context after the await completes, so you will observe different behavior if that context is busy.
But for the most part, the two examples are almost equivalent. The second one is less efficient due to the async state machine and resuming on the context.
I have an async/await intro that you may find helpful; in that post I try to explain async in a way that is not too complex but also not actually incorrect. :)

Eric's obviously the expert here and his advice is sound, but to answer your specific question:
In the first version, the async keyword on the method is irrelevant and your GetStringAsync method returns the same Task<string> awaitable that's returned by client.GetStringAsync.
In the second version, the async keyword on the method is required because you're using await in the method and the await keyword creates and returns a separate Task<string> awaitable that completes once the awaitable from client.GetStringAsync completes. When that occurs, the await then evaluates to the string that was asynchronously obtained by client.GetStringAsync which is returned as the result of your asynchronous method.
So to the caller of GetStringAsync, they're functionally the same, but the first version is cleaner.
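The two variants discussed in these answers can be put side by side in a small sketch. DownloadAsync is a hypothetical stand-in for client.GetStringAsync so that the example runs without network access; the method names GetStringDirect and GetStringAwaited are likewise made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ReturnAwaitDemo
{
    // Hypothetical stand-in for client.GetStringAsync(pageAddress).
    private static async Task<string> DownloadAsync(string pageAddress)
    {
        await Task.Delay(10);
        return $"<html>{pageAddress}</html>";
    }

    // No async keyword: the inner Task<string> is handed straight back to the caller.
    public static Task<string> GetStringDirect(string pageAddress)
    {
        return DownloadAsync(pageAddress);
    }

    // async + await: await unwraps the inner task to a string, and the
    // compiler-generated state machine wraps that string in a new Task<string>.
    public static async Task<string> GetStringAwaited(string pageAddress)
    {
        return await DownloadAsync(pageAddress);
    }

    public static async Task Main()
    {
        string direct = await GetStringDirect("example");
        string awaited = await GetStringAwaited("example");
        Console.WriteLine(direct == awaited); // prints True
    }
}
```

Both methods expose the same Task<string> signature to the caller; the second one simply adds an extra state machine around the same underlying operation.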
