await/async and going outside the box - c#

I have a question regarding await/async and using async methods in slightly different scenarios than expected, for example not directly awaiting them. Let's say I have two routines I need to complete in parallel, where both are async methods (they have awaits inside). I am using await Task.WhenAll(...), which in turn expects some sort of list of tasks to wait for. What I did is something like this:
await Task.WhenAll(new Task[]
{
Task.Run(async () => await internalLoadAllEmailTargets()),
Task.Run(async () => await internalEnumerateInvoices())
});
This seems overly elaborate to me, in the sense that I am creating async tasks whose sole purpose is to invoke another task. Can't I just use the tasks which are returned from the async method state engine? Yet I am failing to do that, since the compiler treats every direct mention of an async method as an invocation point:
// this doesn't seem to work ok
await Task.WhenAll(new Task[]
{
internalLoadAllEmailTargets(),
internalEnumerateInvoices()
});
Written like this, it seems to call them synchronously one after another, and if I place await in front of the methods, the expression is no longer a Task. Is there some rule book on how async methods should be handled outside plain await?

Every async method starts executing synchronously, but when it hits its first await, it may behave asynchronously. So this line:
await Task.WhenAll(internalLoadAllEmailTargetsAsync(), internalEnumerateInvoicesAsync());
should work just fine. It is roughly equivalent to this:
var _1 = internalLoadAllEmailTargetsAsync();
var _2 = internalEnumerateInvoicesAsync();
await Task.WhenAll(_1, _2);
If your methods are truly asynchronous, then this should be fine.
Now, if your methods are actually doing some synchronous work - say, heavy CPU-bound code - then you may want to use Task.Run to invoke them (if your calling code is on a UI thread).
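To make the "starts synchronously, then yields at the first await" point concrete, here is a minimal sketch. WorkAsync is a hypothetical stand-in for the two methods from the question, with Task.Delay simulating the I/O; it assumes a console-style app without a custom SynchronizationContext.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-in for an I/O-bound method such as internalLoadAllEmailTargetsAsync.
    private static async Task WorkAsync(string name, List<string> log)
    {
        lock (log) log.Add(name + " started");   // runs synchronously, inside the caller's call
        await Task.Delay(100);                   // simulated I/O; control returns to the caller here
        lock (log) log.Add(name + " finished");  // continuation, runs once the delay is over
    }

    public static async Task<List<string>> RunBothAsync()
    {
        var log = new List<string>();
        Task first = WorkAsync("first", log);    // already in flight when this call returns
        Task second = WorkAsync("second", log);  // starts while "first" is still pending
        await Task.WhenAll(first, second);
        return log;
    }
}
```

The log should begin with "first started" and "second started", before either "finished" entry, which is what you would expect if both operations are in flight at the same time without any Task.Run.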

Calling an async method is an ordinary method call: the code that creates the Task runs synchronously, and control returns to the invoking code only once the Task has been created, which for an async method means at the first await that actually has to wait.
So, if it's a problem that the first part of your method runs in a blocking manner, you could use Task.Yield at the beginning; just be careful with SynchronizationContext and thread switches.
But in most cases there is nothing wrong with that scenario, because the code that creates the Task is small and fast, while the actual time is spent in some sort of IO operation.
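A small sketch of the Task.Yield idea, under the assumption that there is no SynchronizationContext (console app), so the continuation after Task.Yield lands on the thread pool. WorkAsync and the 200 ms sleep are made up for illustration.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class YieldSketch
{
    // Hypothetical method whose body is entirely blocking work.
    private static async Task WorkAsync(bool yieldFirst)
    {
        if (yieldFirst)
            await Task.Yield();  // hand control back to the caller right away
        Thread.Sleep(200);       // stand-in for the blocking part
    }

    // Returns true if the task was still pending when the call returned.
    public static async Task<bool> ReturnsEarlyAsync(bool yieldFirst)
    {
        Task work = WorkAsync(yieldFirst);
        bool pendingOnReturn = !work.IsCompleted;
        await work;
        return pendingOnReturn;
    }
}
```

Without the yield, the whole body runs inside the call and the returned task is already completed. With it, the caller gets a pending task back immediately. Under a UI SynchronizationContext the continuation would be posted back to the UI thread instead, so this trick alone would not keep the UI free there.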

Related

Running async functions in parallel from list of interfaces

I have a similar question to Running async methods in parallel, in that I wish to run a number of functions from a list of functions in parallel.
I have noticed that a number of comments online mention that if you have another await in your methods, Task.WhenAll() will not help, as async methods are not parallel.
I then went ahead and created a thread for each function call using the code below (the number of parallel functions will be small, typically 1 to 5):
public interface IChannel
{
Task SendAsync(IMessage message);
}
public class SendingChannelCollection
{
protected List<IChannel> _channels = new List<IChannel>();
/* snip methods to add channels to list etc */
public async Task SendAsync(IMessage message)
{
var tasks = SendAll(message);
await Task.WhenAll(tasks.AsParallel().Select(async task => await task));
}
private IEnumerable<Task> SendAll(IMessage message)
{
foreach (var channel in _channels)
yield return channel.SendAsync(message); // IChannel.SendAsync takes only the message
}
}
I would like to double check I am not doing anything horrendous with code smells or bugs as I get to grips with what I have patched together from what I have found online. Many thanks in advance.
Let's compare the behaviour of your line:
await Task.WhenAll(tasks.AsParallel().Select(async task => await task));
in contrast with:
await Task.WhenAll(tasks);
What are you delegating to PLINQ in the first case? Only the await operation, which does basically nothing - it invokes the async/await machinery to wait for one task. So you're setting up a PLINQ query that does all the heavy work of partitioning and merging the results of an operation that amounts to "do nothing until this task completes". I doubt that is what you want.
If you have another await in your methods, Task.WhenAll() will not help as Async methods are not parallel.
I couldn't find that in any of the answers to the linked questions, except for one comment under the question itself. I'd say that it's probably a misconception, stemming from the fact that async/await doesn't magically turn your code into concurrent code. But, assuming you're in an environment without a custom SynchronizationContext (so not an ASP.NET or WPF app), continuations of async functions will be scheduled on the thread pool and possibly run in parallel. I'll refer you to this answer to shed some light on that. That basically means that if your SendAsync looks something like this:
async Task SendAsync(IMessage message)
{
// Synchronous initialization code.
await something;
// Continuation code.
}
Then:
The first part before await runs synchronously. If this part is heavyweight, you should introduce parallelism in SendAll so that the initialization code is run in parallel.
await works as usual, waiting for work to complete without using up any threads.
The continuation code will be scheduled on the thread pool, so if a few awaits finish up at the same time their continuations might be run in parallel if there are enough threads in the thread pool.
All of the above is assuming that await something actually awaits asynchronously. If there's a chance that await something completes synchronously, then the continuation code will also run synchronously.
Now there is a catch. In the question you linked one of the answers states:
Task.WhenAll() has a tendency to become unperformant with large scale/amount of tasks firing simultaneously - without moderation/throttling.
Now I don't know if that's true, since I wasn't able to find any other source claiming that. I guess it's possible, and in that case it might actually be beneficial to invoke PLINQ to deal with partitioning and throttling for you. However, you said you typically handle 1-5 functions, so you shouldn't worry about this.
So to summarize, parallelism is hard and the correct approach depends on what exactly your SendAsync method looks like. If it has heavyweight initialization code and that's what you want to parallelise, you should run all the calls to SendAsync in parallel. Otherwise, async/await will be implicitly using the thread pool anyway, so your call to PLINQ is redundant.
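For what it's worth, the two options from the summary could look roughly like this. IChannel mirrors the interface in the question; SlowStartChannel, Ping, ChannelSender and the delays are hypothetical and only there to make the sketch self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public interface IMessage { }

public interface IChannel
{
    Task SendAsync(IMessage message);
}

// Hypothetical message type, used only for illustration.
public sealed class Ping : IMessage { }

// Hypothetical channel with a heavy synchronous part before its first await.
public sealed class SlowStartChannel : IChannel
{
    private int _sent;
    public int Sent => Volatile.Read(ref _sent);

    public async Task SendAsync(IMessage message)
    {
        Thread.Sleep(50);                  // stand-in for heavyweight synchronous initialization
        await Task.Delay(50);              // stand-in for the real asynchronous send
        Interlocked.Increment(ref _sent);
    }
}

public static class ChannelSender
{
    // Enough when SendAsync reaches its first await quickly.
    public static Task SendAllAsync(IEnumerable<IChannel> channels, IMessage message)
        => Task.WhenAll(channels.Select(c => c.SendAsync(message)));

    // Pushes each call, including its synchronous prefix, onto the thread pool.
    public static Task SendAllInParallelAsync(IEnumerable<IChannel> channels, IMessage message)
        => Task.WhenAll(channels.Select(c => Task.Run(() => c.SendAsync(message))));
}
```

In the first version the synchronous prefixes run one after another on the calling thread while the sends themselves overlap. The second version only pays off when that prefix is genuinely expensive.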

Should I use Task.Run to wait tasks in a synchronous context?

I have an ASPX page which I cannot convert to async but which uses some async methods in a synchronous context. The way it invokes them is like so:
public void MySyncMethod()
{
var myTask = Task.Run(() => _myField.DoSomethingAsync());
myTask.Wait();
//use myTask.Result
}
Is there any difference between doing that and the following as far as async/await and/or blocking goes?
public void MySyncMethod()
{
var myTask = _myField.DoSomethingAsync(); //just get the Task direct, no Task.Run
myTask.Wait();
//use myTask.Result
}
I assume a previous developer added the Task.Run for a reason. But I am having issues with accessing things in HttpContext, as the work is being run on a different thread.
Is there a reason to use Task.Run here?
Is there any difference between doing that and the following as far as
async/await and/or blocking goes?
Yes, the first block of code uses a thread pool thread and then waits for it to return, so you're using two threads, not one. They both block.
I assume a previous developer added the Task.Run for a reason.
Yes, blocking (directly) on async code from an ASP.NET context is a bad idea and can cause deadlocks. So your second block of code is more efficient (in thread usage) but suffers from serious deadlock issues.
The correct solution here is to make public void MySyncMethod() async itself (public async Task MySyncMethod()). Both these solutions have drawbacks and the only real way out is to make the whole call stack async. If you can do this, do it.
If you can't call an async method from another async method then Task.Run is the way to go. See How to call asynchronous method from synchronous method in C#? for more details.
If you want HttpContext inside your thread, have a read through Using HttpContext in Async Task. Of the options in those answers, I would definitely favour either:
Make everything async
Or read the values from the context, then pass them
And keep in mind:
First off, you're not creating a copy of the object, you're just
copying the reference to the object. HttpContext isn't a
struct.....etc
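As a rough illustration of the "read the values first, then pass them" option combined with Task.Run: SyncBridge, DoSomethingAsync and the userName parameter are invented for this sketch, and in the real page the value would be read from HttpContext on the request thread before Task.Run is called.

```csharp
using System.Threading.Tasks;

public static class SyncBridge
{
    // Hypothetical async operation; it receives the value it needs as a
    // parameter instead of reaching into HttpContext on another thread.
    public static async Task<string> DoSomethingAsync(string userName)
    {
        await Task.Delay(50);  // simulated I/O
        return "Hello, " + userName;
    }

    public static string MySyncMethod(string userNameFromContext)
    {
        // The value was read on the calling thread; only the plain string
        // crosses over to the thread pool.
        Task<string> myTask = Task.Run(() => DoSomethingAsync(userNameFromContext));

        // Blocks the calling thread. GetAwaiter().GetResult() rethrows the
        // original exception rather than wrapping it in an AggregateException.
        return myTask.GetAwaiter().GetResult();
    }
}
```

This still ties up the calling thread while the work runs, so it is a workaround for code that cannot be made async, not a replacement for going async all the way.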
The internal workings of asynchronous code based on async/await is fundamentally different than tasks started by Task.Run. async/await tasks are promise based and depend on the caller cooperating with returning the execution back to the asynchronous method when appropriate. Tasks started by Task.Run however are usually started on a parallel thread taken from the thread pool and do not depend on the caller's cooperation to continue execution when appropriate.
This constellation leads to the problem that you can not treat a promise based task the same as the other tasks, since the promise based task might wait for the callers cooperation to return the execution, which might never occur since the other task is executed independently and might wait for the caller. The result is a deadlock.
The solution is a specific Task.Run overload that will create a proxy for an existing task-based method that allows proper execution of a promise based task. It is safe to call Wait on this proxy. That's why the other developer used this construct. He could have also simplified the call and avoided an anonymous method like this:
var myTask = Task.Run(_myField.DoSomethingAsync);
Task.Run is used to run code asynchronously.
Be clear that it returns Task and needs to be awaited. Here's an example:
Task myTask = Task.Run(() => DoSomething());
await myTask;

Why use async when I have to use await?

I've been stuck on this question for a while and haven't really found any useful clarification as to why this is.
If I have an async method like:
public async Task<bool> MyMethod()
{
// Some logic
return true;
}
public async void MyMethod2()
{
var status = MyMethod(); // Visual studio green lines this and recommends using await
}
If I use await here, what's the point of the asynchronous method? Doesn't it make the async useless that VS is telling me to call await? Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Yes, of course. But that's not the purpose of await/async. The purpose is to allow you to write synchronous code that uses asynchronous operations without wasting threads, or more generally, to give the caller a measure of control over the more or less asynchronous operations.
The basic idea is that as long as you use await and async properly, the whole operation will appear to be synchronous. This is usually a good thing, because most of the things you do are synchronous - say, you don't want to create a user before you request the user name. So you'd do something like this:
var name = await GetNameAsync();
var user = await RemoteService.CreateUserAsync(name);
The two operations are synchronous with respect to each other; the second doesn't (and cannot!) happen before the first. But they aren't (necessarily) synchronous with respect to their caller. A typical example is a Windows Forms application. Imagine you have a button, and the click handler contains the code above - all the code runs on the UI thread, but at the same time, while you're awaiting, the UI thread is free to do other tasks (similar to using Application.DoEvents until the operation completes).
Synchronous code is easier to write and understand, so this allows you to get most of the benefits of asynchronous operations without making your code harder to understand. And you don't lose the ability to do things asynchronously, since Task itself is just a promise, and you don't always have to await it right away. Imagine that GetNameAsync takes a lot of time, but at the same time, you have some CPU work to do before it's done:
var nameTask = GetNameAsync();
for (int i = 0; i < 100; i++) Thread.Sleep(100); // Important busy-work!
var name = await nameTask;
var user = await RemoteService.CreateUserAsync(name);
And now your code is still beautifully synchronous - await is the synchronization point - while you can do other things in parallel with the asynchronous operations. Another typical example would be firing off multiple asynchronous requests in parallel but keeping the code synchronous with the completion of all of the requests:
var tasks = urls.Select(i => httpClient.GetAsync(i)).ToArray();
await Task.WhenAll(tasks);
The tasks are asynchronous with respect to each other, but not to their caller, which is still beautifully synchronous.
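Both patterns can be tried without a network. In this sketch GetNameAsync and FetchLengthAsync are hypothetical stand-ins that use Task.Delay instead of real requests.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class OverlapSketch
{
    // Hypothetical stand-in for a slow GetNameAsync.
    private static async Task<string> GetNameAsync()
    {
        await Task.Delay(100);
        return "alice";
    }

    // Hypothetical stand-in for an HTTP request; just returns the URL length.
    private static async Task<int> FetchLengthAsync(string url)
    {
        await Task.Delay(50);
        return url.Length;
    }

    public static async Task<string> OverlapAsync()
    {
        Task<string> nameTask = GetNameAsync();        // started, not awaited yet
        int checksum = 0;
        for (int i = 0; i < 1000; i++) checksum += i;  // CPU work that overlaps the delay
        string name = await nameTask;                  // the synchronization point
        return name + ":" + checksum;
    }

    public static async Task<int[]> FetchAllAsync(IEnumerable<string> urls)
    {
        // ToArray forces the Select to run now, so every request is started here.
        Task<int>[] tasks = urls.Select(u => FetchLengthAsync(u)).ToArray();
        return await Task.WhenAll(tasks);              // results come back in input order
    }
}
```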
I've made a (incomplete) networking sample that uses await in just this way. The basic idea is that while most of the code is logically synchronous (there's a protocol to be followed - ask for login->verify login->read loop...; you can even see the part where multiple tasks are awaited in parallel), you only use a thread when you actually have CPU work to do. Await makes this almost trivial - doing the same thing with continuations or the old Begin/End async model would be much more painful, especially with respect to error handling. Await makes it look very clean.
If I use await here, what's the point of the asynchronous method?
await does not block the thread. MyMethod2 will run synchronously until it reaches an await expression. Then MyMethod2 will be suspended until the awaited task (MyMethod) is complete. While MyMethod is not completed, control returns to the caller of MyMethod2. That's the point of await - the caller will continue doing its job.
Doesn't it make the async useless that VS is telling me to call await?
async is just a flag which means 'somewhere in the method you have one or more await'.
Does that not defeat the purpose of offloading a task to a thread
without waiting for it to finish?
As described above, you don't have to wait for task to finish. Nothing is blocked here.
NOTE: To follow framework naming standards I suggest you to add Async suffix to asynchronous method names.
An async method is not automatically executed on a different thread. Actually, the opposite is true: an async method always starts executing on the calling thread. async means that this is a method that can yield to an asynchronous operation. That means it can return control to the caller while waiting for the other execution to complete. So async methods are a way to wait for other asynchronous operations.
Since you are doing nothing to wait for in MyMethod2, async makes no sense here, so your compiler warns you.
Interestingly, the team that implemented async methods has acknowledged that marking a method async is not really necessary, since it would be enough to just use await in the method body for the compiler to recognize it as async. The requirement of using the async keyword has been added to avoid breaking changes to existing code that uses await as a variable name.

await Task.Run vs await

I've searched the web and seen a lot of questions regarding Task.Run vs await async, but there is this specific usage scenario where I don't really understand the difference. The scenario is quite simple, I believe.
await Task.Run(() => LongProcess());
vs
await LongProcess();
where LongProcess is an async method with a few asynchronous calls in it, like calling the db with await ExecuteReaderAsync() for instance.
Question:
Is there any difference between the two in this scenario? Any help or input appreciated, thanks!
Task.Run may post the operation to be processed at a different thread. That's the only difference.
This may be of use - for example, if LongProcess isn't truly asynchronous, it will make the caller return faster. But for a truly asynchronous method, there's no point in using Task.Run, and it may result in unnecessary waste.
Be careful, though, because the behaviour of Task.Run will change based on overload resolution. In your example, the Func<Task> overload will be chosen, which will (correctly) wait for LongProcess to finish. However, if a non-task-returning delegate was used, Task.Run will only wait for execution up to the first await (note that this is how TaskFactory.StartNew will always behave, so don't use that).
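The StartNew half of that warning is easy to see in a small experiment. This is only a sketch (OverloadSketch and the 100 ms delay are arbitrary choices): with an async lambda, Task.Run hands back a task that covers the whole lambda, while Task.Factory.StartNew hands back a Task<Task> whose outer task completes as soon as the lambda reaches its first real await.

```csharp
using System.Threading.Tasks;

public static class OverloadSketch
{
    public static async Task<(bool runWaited, bool startNewWaited)> CompareAsync()
    {
        bool finishedViaRun = false;
        // Func<Task> overload: the returned task completes when the lambda's task completes.
        await Task.Run(async () => { await Task.Delay(100); finishedViaRun = true; });
        bool runWaited = finishedViaRun;

        bool finishedViaStartNew = false;
        // StartNew has no Task-aware overload, so this yields a Task<Task>.
        Task<Task> outer = Task.Factory.StartNew(
            async () => { await Task.Delay(100); finishedViaStartNew = true; });
        Task inner = await outer;                   // only waits up to the lambda's first await
        bool startNewWaited = finishedViaStartNew;  // the delay is still running at this point

        await inner;                                // outer.Unwrap() would give an equivalent task
        return (runWaited, startNewWaited);
    }
}
```

I would expect runWaited to come out true and startNewWaited to come out false, which matches the "don't use StartNew for async delegates" advice unless you remember to unwrap.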
Quite often people think that async-await is done by several threads. In fact it is all done by one thread.
See the addition below about this one thread statement
The thing that helped me a lot to understand async-await is this interview with Eric Lippert about async-await. Somewhere in the middle he compares async await with a cook who has to wait for some water to boil. Instead of doing nothing, he looks around to see if there is still something else to do like slicing the onions. If that is finished, and the water still doesn't boil he checks if there is something else to do, and so forth until he has nothing to do but wait. In that case he returns to the first thing he waited for.
If your procedure calls an awaitable function, we are certain that somewhere in this awaitable function there is a call to an awaitable function, otherwise the function wouldn't be awaitable. In fact, your compiler will warn you if you forget to await somewhere in your awaitable function.
If your awaitable function calls the other awaitable function, then the thread enters this other function and starts doing the things in this function and goes deeper into other functions until he meets an await.
Instead of waiting for the results, the thread goes up in his call stack to see if there are other pieces of code he can process until he sees an await. Go up again in the call stack, process until await, etc. Once everyone is awaiting the thread looks for the bottom await and continues once that is finished.
This has the advantage, that if the caller of your awaitable function does not need the result of your function, but can do other things before the result is needed, these other things can be done by the thread instead of waiting inside your function.
A call without waiting immediately for the result would look like this:
private async Task MyFunction()
{
Task<ReturnType> taskA = SomeFunctionAsync(...);
// I don't need the result yet, I can do something else
DoSomethingElse();
// now I need the result of SomeFunctionAsync, await for it:
ReturnType result = await taskA;
// now you can use object result
}
Note that in this scenario everything is done by one thread. As long as your thread has something to do he will be busy.
Addition. It is not true that only one thread is involved. Any thread who has nothing to do might continue processing your code after an await. If you check the thread id, you can see that this id can be changed after the await. The continuing thread has the same context as the original thread, so you can act as if it was the original thread. No need to check for InvokeRequired, no need to use mutexes or critical sections. For your code this is as if there is one thread involved.
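If you want to check the thread id yourself, something like the following would do. It assumes a console app with no SynchronizationContext; in a WinForms or WPF event handler the continuation would normally come back to the UI thread instead. ThreadIdSketch is just a name for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadIdSketch
{
    // Returns the managed thread id before and after an await that really suspends.
    public static async Task<(int before, int after)> ObserveAsync()
    {
        int before = Environment.CurrentManagedThreadId;
        await Task.Delay(50);  // the method is suspended here; no thread is held
        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }
}
```

In a console app the two ids will often differ, because the continuation is picked up by whichever thread pool thread is free, but that is not guaranteed on any given run.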
The link to the article in the end of this answer explains a bit more about thread context
You'll see awaitable functions mainly where some other process has to do things, while your thread just has to wait idly until the other thing is finished. Examples are sending data over the internet, saving a file, communicating with a database etc.
However, sometimes some heavy calculations have to be done, and you want your thread to be free to do something else, like respond to user input. In that case you can start an awaitable action as if you called an async function.
async Task<ResultType> LetSomeoneDoHeavyCalculations(...)
{
DoSomePreparations();
// start a different thread that does the heavy calculations:
var myTask = Task.Run( () => DoHeavyCalculations(...));
// now you are free to do other things
DoSomethingElse();
// once you need the result of the HeavyCalculations await for it
var myResult = await myTask;
// use myResult
...
}
Now a different thread is doing the heavy calculations while your thread is free to do other things. Once it starts awaiting, your caller can do things until he starts awaiting. Effectively your thread will be fairly free to react to user input. However, this will only be the case if everyone is awaiting. While your thread is busy doing things, it can't react to user input. Therefore, if you think your UI thread has to do some busy processing that takes some time, always use Task.Run and let another thread do it.
Another article that helped me: Async-Await by the brilliant explainer Stephen Cleary
This answer deals with the specific case of awaiting an async method in the event handler of a GUI application. In this case the first approach has a significant advantage over the second. Before explaining why, let's rewrite the two approaches in a way that reflects clearly the context of this answer. What follows is only relevant for event handlers of GUI applications.
private async void Button1_Click(object sender, EventArgs args)
{
await Task.Run(async () => await LongProcessAsync());
}
vs
private async void Button1_Click(object sender, EventArgs args)
{
await LongProcessAsync();
}
I added the suffix Async to the method's name, to comply with the guidelines. I also made the anonymous delegate async, just for readability reasons. The overhead of creating a state machine is minuscule, and is dwarfed by the value of communicating clearly that this Task.Run returns a promise-style Task, not an old-school delegate Task intended for background processing of CPU-bound workloads.
The advantage of the first approach is that it guarantees that the UI will remain responsive. The second approach offers no such guarantee. As long as you are using the built-in async APIs of the .NET platform, the probability of the UI being blocked by the second approach is pretty small. After all, these APIs are implemented by experts¹. The moment you start awaiting your own async methods, all guarantees are off. Unless of course your first name is Stephen, and your surname is Toub or Cleary. If that's not the case, it is quite possible that sooner or later you'll write code like this:
public static async Task LongProcessAsync()
{
TeenyWeenyInitialization(); // Synchronous
await SomeBuildInAsyncMethod().ConfigureAwait(false); // Asynchronous
CalculateAndSave(); // Synchronous
}
The problem obviously is with the method TeenyWeenyInitialization(). This method is synchronous, and comes before the first await inside the body of the async method, so it won't be awaited. It will run synchronously every time you call the LongProcessAsync(). So if you follow the second approach (without Task.Run), the TeenyWeenyInitialization() will run on the UI thread.
How bad can this be? The initialization is teeny-weeny after all! Just a quick trip to the database to get a value, read the first line of a small text file, get a value from the registry. It's all over in a couple of milliseconds. At the time you wrote the program. On your PC. Before moving the data folder to a shared drive. Before the amount of data in the database became huge.
But suppose you get lucky and TeenyWeenyInitialization() remains fast forever. What about the second synchronous method, CalculateAndSave()? This one comes after an await that is configured to not capture the context, so it runs on a thread-pool thread. It should never run on the UI thread, right? Wrong. It depends on the Task returned by SomeBuildInAsyncMethod(). If the Task is already completed, a thread switch will not occur, and CalculateAndSave() will run on the same thread that called the method. If you follow the second approach, this will be the UI thread. You may never experience a case where SomeBuildInAsyncMethod() returned a completed Task in your development environment, but the production environment may be different in ways difficult to predict.
Having an application that performs badly is unpleasant. Having an application that performs badly and freezes the UI is even worse. Do you really want to risk it? If you don't, please always use Task.Run(async () => ...) inside your event handlers. Especially when awaiting methods you have coded yourself!
¹ Disclaimer, some built-in async APIs are not properly implemented.
Important: Task.Run runs the supplied asynchronous delegate on a ThreadPool thread, so it's required that LongProcessAsync has no affinity to the UI thread. If it involves interaction with UI controls, then Task.Run is not an option. Thanks to @Zmaster for pointing out this important subtlety in the comments.
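The effect of the synchronous prefix can be reproduced outside a GUI. The sketch below is a console-friendly approximation, not real event-handler code: Probe and PrefixSketch are hypothetical, and Thread.Sleep(200) plays the role of a TeenyWeenyInitialization() that has become slow.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper that records whether the synchronous prefix has run.
public sealed class Probe
{
    public volatile bool PrefixDone;
}

public static class PrefixSketch
{
    private static async Task LongProcessAsync(Probe probe)
    {
        Thread.Sleep(200);                           // synchronous prefix, before the first await
        probe.PrefixDone = true;
        await Task.Delay(50).ConfigureAwait(false);  // the genuinely asynchronous part
    }

    // True if the caller had to sit through the prefix before getting its Task back.
    public static async Task<bool> DirectCallRunsPrefixOnCallerAsync()
    {
        var probe = new Probe();
        Task task = LongProcessAsync(probe);         // the caller is stuck here for about 200 ms
        bool prefixDoneOnReturn = probe.PrefixDone;
        await task;
        return prefixDoneOnReturn;
    }

    public static async Task<bool> TaskRunRunsPrefixOnCallerAsync()
    {
        var probe = new Probe();
        Task task = Task.Run(() => LongProcessAsync(probe));  // prefix runs on the thread pool
        bool prefixDoneOnReturn = probe.PrefixDone;           // read right after Task.Run returns
        await task;
        return prefixDoneOnReturn;
    }
}
```

The direct call reports true: the prefix ran inside the caller's call, which in a click handler means on the UI thread. The Task.Run variant should report false, because the caller gets control back before the prefix has finished.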

Task.Run vs. direct async call for starting long-running async methods

Several times, I have found myself writing long-running async methods for things like polling loops. These methods might look something like this:
private async Task PollLoop()
{
while (this.KeepPolling)
{
var response = await someHttpClient.GetAsync(...).ConfigureAwait(false);
var content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
// do something with content
await Task.Delay(timeBetweenPolls).ConfigureAwait(false);
}
}
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly (also, no need to worry about reentrance).
My question is, what is the preferred method for launching such a loop from a synchronous context? I can think of at least 2 approaches:
var pollingTask = Task.Run(async () => await this.PollLoop());
// or
var pollingTask = this.PollLoop();
In either case, I can respond to exceptions using ContinueWith(). My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true? Are there other things to consider or better approaches to try?
My main understanding of the difference between these two methods is
that the first will initially start looping on a thread-pool thread,
whereas the second will run on the current thread until the first
await. Is this true?
Yes. An async method returns its task to its caller on the first await of an awaitable that is not already completed.
By convention most async methods return very quickly. Yours does as well because await someHttpClient.GetAsync will be reached very quickly.
There is no point in moving the beginning of this async method onto the thread-pool. It adds overhead and saves almost no latency. It certainly does not help throughput or scaling behavior.
Using an async lambda here (Task.Run(async () => await this.PollLoop())) is especially useless. It just wraps the task returned by PollLoop with another layer of tasks. It would be better to say Task.Run(() => this.PollLoop()).
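The "first await of an awaitable that is not already completed" rule can be checked with two tiny methods. The names are made up for this sketch, and Task.CompletedTask stands in for any awaitable that happens to be finished already (a cached result, for instance).

```csharp
using System.Threading.Tasks;

public static class FirstAwaitSketch
{
    private static async Task<int> AwaitCompletedAsync()
    {
        await Task.CompletedTask;  // already completed: execution simply continues
        return 1;
    }

    private static async Task<int> AwaitPendingAsync()
    {
        await Task.Delay(100);     // not completed: the method returns to its caller here
        return 2;
    }

    // Reports whether each task was already completed when the call returned.
    public static (bool completedCase, bool pendingCase) Observe()
    {
        Task<int> a = AwaitCompletedAsync();
        Task<int> b = AwaitPendingAsync();
        return (a.IsCompleted, b.IsCompleted);
    }
}
```

The first method runs to the end before its caller gets anything back, so its task is already completed on return. The second hands back a pending task at the delay.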
My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true?
Yes, that's true.
In your scenario, there seems to be no need for using Task.Run though; there's practically no code between the method call and the first await, and so PollLoop() will return almost immediately. Needlessly wrapping a task in another task only makes the code less readable and adds overhead. I would rather use the second approach.
Regarding other considerations (e.g. exception handling), I think the two approaches are equivalent.
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly
As a side-note, this is more or less what a timer would do anyway. In fact Task.Delay is implemented using a timer!
