I have a piece of code and I'm not quite sure whether it would run asynchronously. Below I've made up a sample that accurately reflects the situation. Please note that the GetAllAsync methods are proper async methods that use the async/await keywords and return a Task of the related object.
public async Task<SomeResults> MyMethod()
{
var customers = _customerApi.GetAllAsync("some_url");
var orders = _orderApi.GetAllAsync("some_url");
var products = _productApi.GetAllAsync("some_url");
await Task.WhenAll(customers, orders, products);
// some more processing and returning the results
}
Question 1: Would the three above API calls run asynchronously even though there's no await before them? But, we have the await before Task.WhenAll?
Question 2: Would the above code run asynchronously if the await keyword is removed from before the Task.WhenAll?
I've tried Googling it but couldn't find a proper answer to this specific situation. I've started reading Parallel Programming in Microsoft .NET, but I still have a long way to go before finishing it, so I couldn't just wait.
Question 1: Would the three above API calls run asynchronously even though there's no await before them? But, we have the await before Task.WhenAll?
If the methods are actually doing something asynchronously, then yes.
Question 2: Would the above code run asynchronously if the await keyword is removed from before the Task.WhenAll?
If the methods are actually doing something asynchronously, then yes. However, it would be pointless to use Task.WhenAll without await.
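To make the point about starting the calls without awaiting them concrete, here is a minimal sketch. FakeGetAllAsync is a hypothetical stand-in for the real API calls (it only delays and returns its name), so the timings are illustrative rather than a statement about the actual APIs.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for an API call such as _customerApi.GetAllAsync.
    private static async Task<string> FakeGetAllAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);   // genuinely asynchronous: yields an incomplete Task
        return name;
    }

    public static async Task<(string[] Results, long ElapsedMs)> RunAsync()
    {
        var stopwatch = Stopwatch.StartNew();

        // Each call starts its operation right away; nothing is awaited yet.
        Task<string> customers = FakeGetAllAsync("customers", 300);
        Task<string> orders = FakeGetAllAsync("orders", 300);
        Task<string> products = FakeGetAllAsync("products", 300);

        // Suspend this method until all three tasks have completed.
        string[] results = await Task.WhenAll(customers, orders, products);

        stopwatch.Stop();
        return (results, stopwatch.ElapsedMilliseconds);
    }

    public static void Main()
    {
        var (results, elapsedMs) = RunAsync().GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", results));
        Console.WriteLine("elapsed ms: " + elapsedMs);
    }
}
```

Because the three delays overlap, the elapsed time should be close to one delay (about 300 ms) rather than the sum of the three. Dropping the await before Task.WhenAll would not stop the operations, but the method would carry on before their results are available.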
Why I say "if": The async keyword doesn't magically make a method asynchronous, neither does the await operator. The methods still have to actually do something asynchronously. They do that by returning an incomplete Task.
All async methods start out running synchronously, just like any other method. The magic happens at await. If await is given an incomplete Task, then the method returns its own incomplete Task, with the rest of the method signed up as a continuation of that Task. That happens all the way up the call stack as long as you're using await all the way up the call stack.
Once the Task completes, then the continuation runs (the rest of the methods after await).
But at the top of the call stack needs to be something that's actually asynchronous. If you have an async method that calls an async method that calls a synchronous method, then nothing will actually run asynchronously, even if you use await.
For example, this will run completely synchronously (i.e. the thread will block) because an incomplete Task is never returned anywhere:
async Task Method1() {
await Method2();
}
async Task Method2() {
await Method3();
}
Task Method3() {
Thread.Sleep(2000);
return Task.CompletedTask;
}
However, this will run asynchronously (i.e. during the delay, the thread is freed to do other work):
async Task Method1() {
await Method2();
}
async Task Method2() {
await Method3();
}
async Task Method3() {
await Task.Delay(2000);
}
The key is in what Task.Delay returns. If you look at that source code, you can see it returns a DelayPromise (which inherits from Task), immediately (before the time is up). Since it's awaited, that triggers Method3 to return an incomplete Task. Since Method2 awaits that, it returns an incomplete Task, etc. all the way up the call stack.
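One way to observe the difference between the two versions is to look at the Task right after the call returns. This is a small sketch with hypothetical method names (BlockingMethod, TrulyAsyncMethod) standing in for the two variants of Method3.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class IncompleteTaskDemo
{
    // Synchronous work dressed up as a Task: the caller is blocked for the whole sleep.
    private static Task BlockingMethod()
    {
        Thread.Sleep(200);
        return Task.CompletedTask;
    }

    // Truly asynchronous: returns an incomplete Task at the first await.
    private static async Task TrulyAsyncMethod()
    {
        await Task.Delay(500);
    }

    public static bool BlockingReturnsCompletedTask()
    {
        Task task = BlockingMethod();      // returns only after the sleep has finished
        return task.IsCompleted;
    }

    public static bool AsyncReturnsCompletedTask()
    {
        Task task = TrulyAsyncMethod();    // returns immediately, delay still pending
        bool completedOnReturn = task.IsCompleted;
        task.Wait();                       // blocking here only to let the demo finish cleanly
        return completedOnReturn;
    }

    public static void Main()
    {
        Console.WriteLine("blocking version completed on return: " + BlockingReturnsCompletedTask());
        Console.WriteLine("async version completed on return: " + AsyncReturnsCompletedTask());
    }
}
```

The blocking version hands back a Task that is already complete, so there is nothing left for any await further up the stack to wait for. The Task.Delay version hands back a Task that is still pending, which is what lets the callers yield.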
YES, to both questions with a lot of caveats.
await / async is just syntactical sugar that allows you to write async code in a synchronous way. It doesn't magically spin up threads to make things run in parallel. It just allows the currently executing thread to be freed up to do other chunks of work.
Think of the await keyword as a pair of scissors that snips the current chunk of work into two, meaning the current thread can go and do another chunk while waiting for the result.
In order to do these chunks of work, there needs to be some kind of TaskScheduler. WinForms and WPF both provide TaskSchedulers that allow a single thread to process chunks one by one, but you can also use the default scheduler (via Task.Run()) which will use the thread pool, meaning lots of threads will run lots of chunks at once.
Assuming you are using a single thread, your example code would run as follows:
_customerApi.GetAllAsync() would run until it either completes, or hits an await. At that point it would return a Task to your calling function, which gets stuffed into customers.
_orderApi.GetAllAsync() would then run in exactly the same way. A Task will be assigned to orders which may or may not be complete.
ditto _productApi.GetAllAsync()
then your thread hits await Task.WhenAll(customers, orders, products); this means it can go and do other things, so the TaskScheduler might give it some other chunks of work to do, such as continuing to do the next bit of _customerApi.GetAllAsync().
Eventually all the chunks of work will be done, and your three tasks inside customers, orders, and products will be complete. At this point the scheduler knows that it can run the bit after WhenAll()
So you can see that in this case a SINGLE thread has run all the code, but not necessarily synchronously.
Whether your code runs asynchronously depends on your definition of asynchronous. If you look really close what happens, there will only be one thread that will do the stuff. However this thread won't be waiting idly as long as it has something to do.
A thing that really helped me to understand async-await was the cook-making-breakfast analogy in this interview with Eric Lippert. Search somewhere in the middle for async await.
Suppose a cook has to make breakfast. He starts boiling water for the tea. Instead of waiting idly for the water to cook, he inserts bread in the toaster. Not waiting idly again he starts boiling water for the eggs. As soon as the tea water boils he makes the tea and waits for the toast or the eggs.
Async-await is similar. Whenever your thread would have to wait for another process to finish, like a file to be written, a database query to return data, internet data to load, the thread won't be waiting idly for the other process to finish, but it will go up the call stack to see if any of the callers isn't awaiting and starts executing statements until it sees an await. Go up the call stack again and execute until the await. etc.
Because GetAllAsync is declared async, you can be certain that there is an await in it. In fact, your compiler will warn you if you declare a function async without an await in it.
Your thread will go into _customerApi.GetAllAsync("some_url"); and execute statements until it sees an await. If the task that your thread is awaiting isn't complete, the thread goes up the call stack (your procedure) and starts executing the next statement: _orderApi.GetAllAsync("some_url"). It executes statements until it sees an await. Your function gets control again and calls the next method.
This goes on until your procedure starts awaiting. In this case, the awaitable method Task.WhenAll (not to be confused with the non-awaitable Task.WaitAll).
Even now, the thread won't be waiting idly; it will go up the call stack and execute statements until it meets an await, go up the call stack again, etc.
So note: no new threads will be started. While your thread is busy executing statements of the first method call, no statements of the second call will be executed, and while statements of the second call are being executed, no statements of the first call will be executed, not even if the first await is ready.
This is similar to the one and only cook: while he is inserting bread in the toaster, he can't process the boiling water for the tea: only after the bread is inserted and he would start waiting idly for it to be toasted he can continue making tea.
The await Task.WhenAll is not different to other awaits, except that the task is completed when all tasks are completed. So as long as any of the tasks is not ready, your thread won't execute statements after the WhenAll. However: your thread won't be waiting idly, it will go up the call stack and start executing statements.
So although it seems that two pieces of code are executed at the same time, they are not. If you really want two pieces of code to run simultaneously you'll have to hire a new cook using Task.Run(() => SliceTomatoes());
Hiring a new cook (starting a new thread) is only meaningful if the other task is not async and your thread has other meaningful things to do, like keeping the UI responsive. Normally your cook would Slice the Tomatoes himself. Let your caller decide whether he hires a new cook (=you) to make the breakfast and slice the tomatoes.
I oversimplified it a bit, by telling you that there is only one thread (cook) involved. In fact, it can be any thread that continues executing the statements after your await. You can see that in the debugger by examining the thread ID, quite often it will be a different thread that will continue. However this thread has the same context as your original thread, so for you it will be as if it is the same thread: no need for a mutex, no need for IsInvokeRequired for user interface threads. More information about this can be found in articles from Stephen Cleary
I've recently been learning asynchronous programming and I think I've mastered it. Asynchronous programming is simply just allowing our program to multitask.
The confusion comes with the await and async keywords, which seem to have confused me a little more. Could somebody help answer some of my concerns?
I don't see the async keyword as much, just something you chuck on a method to let Visual Studio know that the method may await something and for you to allow it to warn you. If it has some other special meaning that actually affects something, could someone explain?
Moving onto await, after talking to a friend I was told I had 1 major thing wrong, await doesn't block the current method, it simply executes the code left in that method and does the asynchronous operation in its own time.
Now, I'm not sure how often this happens, but let's say you have some code like this.
Console.WriteLine("Started checking a players data.");
var player = await GetPlayerAsync();
foreach (var uPlayer in Players.Values) {
uPlayer.SendMessage("Checking another players data");
}
if (player.Username == "SomeUsername") {
ExecuteSomeOperation();
}
Console.WriteLine("Finished checking a players data.");
As you can see, I run some asynchronous code on GetPlayerAsync, what happens if we get deeper into the scope and we need to access player, but it hasn't returned the player yet?
If it doesn't block the method, how does it know that player isn't null, does it do some magic and wait for us if we got to that situation, or do we just forbid ourselves from writing methods this way and handle it ourselves.
I've recently been learning asynchronous programming and I think I've mastered it.
I was one of the designers of the feature and I don't feel like I've even come close to mastering it, and you are asking beginner level questions and have some very, very wrong ideas, so there's some hubris going on here I suspect.
Asynchronous programming is simply just allowing our program to multitask.
Suppose you asked "why are some substances hard and some soft?" and I answered "substances are made of arrangements of atoms, and some atom arrangements are hard and some are soft". Though that is undoubtedly true, I hope you would push back on this unhelpful non-explanation.
Similarly, you've just replaced the vague word "asynchronous" with another vague word "multitask". This is an explanation that explains nothing, since you haven't clearly defined what it means to multitask.
Asynchronous workflows are undoubtedly about executing multiple tasks. That's why the fundamental unit of work in a workflow is the Task<T> monad. An asynchronous workflow is the composition of multiple tasks by constructing a graph of dependency relationships among them. But that says nothing about how that workflow is actually realized in software. This is a complex and deep subject.
I don't see the async keyword as much, just something you chuck on a method to let Visual Studio know that the method may await something and for you to allow it to warn you.
That's basically correct, though don't think of it as telling Visual Studio; VS doesn't care. It's the C# compiler that you're telling.
If it has some other special meaning that actually affects something, could someone explain?
It just makes await a keyword inside the method, and puts restrictions on the return type, and changes the meaning of return to "signal that the task associated with this invocation is complete", and a few other housekeeping details.
await doesn't block the current method
Of course it does. Why would you suppose that it does not?
It doesn't block the thread, but it surely blocks the method.
it simply executes the code left in that method and does the asynchronous operation in its own time.
ABSOLUTELY NOT. This is completely backwards. await does the opposite of that. Await means if the task is not complete then return to your caller, and sign up the remainder of this method as the continuation of the task.
As you can see, I run some asynchronous code on GetPlayerAsync, what happens if we get deeper into the scope and we need to access player, but it hasn't returned the player yet?
That doesn't ever happen.
If the value assigned to player is not available when the await executes then the await returns, and the remainder of the method is resumed when the value is available (or when the task completes exceptionally.)
Remember, await means asynchronously wait, that's why we called it "await". An await is a point in an asynchronous workflow where the workflow cannot proceed until the awaited task is complete. That is the opposite of how you are describing await.
Again, remember what an asynchronous workflow is: it is a collection of tasks where those tasks have dependencies upon each other. We express that one task has a dependency upon the completion of another task by placing an await at the point of the dependency.
Let's look at your workflow in more detail:
var player = await GetPlayerAsync();
foreach (var uPlayer in Players.Values) ...
if (player.Username == "SomeUsername") ...
The await means "the remainder of this workflow cannot continue until the player is obtained". Is that actually correct? If you want the foreach to not execute until the player is fetched, then this is correct. But the foreach doesn't depend on the player, so we could rewrite this like this:
Task<Player> playerTask = GetPlayerAsync();
foreach (var uPlayer in Players.Values) ...
Player player = await playerTask;
if (player.Username == "SomeUsername") ...
See, we have moved the point of dependency to later in the workflow. We start the "get a player" task, then we do the foreach, and then we check to see if the player is available right before we need it.
If you have the belief that await somehow "takes a call and makes it asynchronous", this should dispel that belief. await takes a task and returns if it is not complete. If it is complete, then it extracts the value of that task and continues. The "get a player" operation is already asynchronous, await does not make it so.
If it doesn't block the method, how does it know that player isn't null
It does block the method, or more accurately, it suspends the method.
The method suspends and does not resume until the task is complete and the value is extracted.
It doesn't block the thread. It returns, so that the caller can keep on doing work in a different workflow. When the task is complete, the continuation will be scheduled onto the current context and the method will resume.
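The suspend-and-resume behaviour described above can be observed with a small sketch. The names here are hypothetical: a TaskCompletionSource stands in for GetPlayerAsync so that the demo controls exactly when the "player" becomes available, and the sketch assumes a plain console program with no synchronization context.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitSuspendDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        // Stand-in for GetPlayerAsync: its task completes only when SetResult is called.
        var source = new TaskCompletionSource<string>();

        Task workflow = CheckPlayerAsync(source.Task, log);

        // The await inside CheckPlayerAsync saw an incomplete task, so control came back here.
        log.Add("caller: method returned, workflow complete = " + workflow.IsCompleted);

        source.SetResult("SomeUsername");      // the player arrives; the continuation can run
        workflow.GetAwaiter().GetResult();     // make sure the remainder has finished
        log.Add("caller: workflow complete = " + workflow.IsCompleted);
        return log;
    }

    private static async Task CheckPlayerAsync(Task<string> playerTask, List<string> log)
    {
        log.Add("method: started");
        string username = await playerTask;    // suspends here; nothing below runs yet
        log.Add("method: resumed with " + username);
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

The log shows the method starting, the caller regaining control while the workflow is still incomplete, and the code after the await running only once the value exists. The value is therefore never read before it is available.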
await doesn't block the current method
Correct.
it simply executes the code left in that method and does the asynchronous operation in its own time.
No, not at all. It schedules the rest of the method to run when the asynchronous operation has finished. It does not run the rest of the method immediately. It's not allowed to run any of the rest of the code in the method until the awaited operation is complete. It just doesn't block the current thread in the process, the current thread is returned back to the caller, and can go off to do whatever it wants to do. The rest of the method will be scheduled by the synchronization context (or the thread pool, if none exists) when the asynchronous operation finishes.
I had 1 major thing wrong, await doesn't block the current method, it simply executes the code left in that method and does the asynchronous operation in its own time.
But it does block the method, in the sense that a method that calls await won't continue until the results are in. It just doesn't block the thread that the method is running on.
... and we need to access player, but it hasn't returned the player yet?
That simply won't happen.
async/await is ideal for doing all kinds of I/O (file, network, database, UI) without wasting a lot of threads. Threads are expensive.
But as a programmer you can write (and think) as if it were all happening synchronously.
In this code, you will not use Await because GetPlayerAsync() runs some asynchronous code. You can consider it from the perspective that Async and Await are different in that "Async" is waiting while "Await" operates asynchronously.
Try to use Task< T > as return data.
I've been stuck on this question for a while and haven't really found any useful clarification as to why this is.
If I have an async method like:
public async Task<bool> MyMethod()
{
// Some logic
return true;
}
public async void MyMethod2()
{
var status = MyMethod(); // Visual studio green lines this and recommends using await
}
If I use await here, what's the point of the asynchronous method? Doesn't it make the async useless that VS is telling me to call await? Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Yes, of course. But that's not the purpose of await/async. The purpose is to allow you to write synchronous code that uses asynchronous operations without wasting threads, or more generally, to give the caller a measure of control over the more or less asynchronous operations.
The basic idea is that as long as you use await and async properly, the whole operation will appear to be synchronous. This is usually a good thing, because most of the things you do are synchronous - say, you don't want to create a user before you request the user name. So you'd do something like this:
var name = await GetNameAsync();
var user = await RemoteService.CreateUserAsync(name);
The two operations are synchronous with respect to each other; the second doesn't (and cannot!) happen before the first. But they aren't (necessarily) synchronous with respect to their caller. A typical example is a Windows Forms application. Imagine you have a button, and the click handler contains the code above - all the code runs on the UI thread, but at the same time, while you're awaiting, the UI thread is free to do other tasks (similar to using Application.DoEvents until the operation completes).
Synchronous code is easier to write and understand, so this allows you to get most of the benefits of asynchronous operations without making your code harder to understand. And you don't lose the ability to do things asynchronously, since Task itself is just a promise, and you don't always have to await it right away. Imagine that GetNameAsync takes a lot of time, but at the same time, you have some CPU work to do before it's done:
var nameTask = GetNameAsync();
for (int i = 0; i < 100; i++) Thread.Sleep(100); // Important busy-work!
var name = await nameTask;
var user = await RemoteService.CreateUserAsync(name);
And now your code is still beautifully synchronous - await is the synchronization point - while you can do other things in parallel with the asynchronous operations. Another typical example would be firing off multiple asynchronous requests in parallel but keeping the code synchronous with the completion of all of the requests:
var tasks = urls.Select(i => httpClient.GetAsync(i)).ToArray();
await Task.WhenAll(tasks);
The tasks are asynchronous with respect to each other, but not to their caller, which is still beautifully synchronous.
I've made a (incomplete) networking sample that uses await in just this way. The basic idea is that while most of the code is logically synchronous (there's a protocol to be followed - ask for login->verify login->read loop...; you can even see the part where multiple tasks are awaited in parallel), you only use a thread when you actually have CPU work to do. Await makes this almost trivial - doing the same thing with continuations or the old Begin/End async model would be much more painful, especially with respect to error handling. Await makes it look very clean.
If I use await here, what's the point of the asynchronous method?
await does not block the thread. MyMethod2 will run synchronously until it reaches an await expression. Then MyMethod2 will be suspended until the awaited task (MyMethod) is complete. While MyMethod is not completed, control will return to the caller of MyMethod2. That's the point of await - the caller will continue doing its job.
Doesn't it make the async useless that VS is telling me to call await?
async is just a flag which means 'somewhere in the method you have one or more await'.
Does that not defeat the purpose of offloading a task to a thread
without waiting for it to finish?
As described above, you don't have to wait for task to finish. Nothing is blocked here.
NOTE: To follow framework naming standards I suggest you to add Async suffix to asynchronous method names.
An async method is not automatically executed on a different thread. Actually, the opposite is true: an async method is always executed in the calling thread. async means that this is a method that can yield to an asynchronous operation. That means it can return control to the caller while waiting for the other execution to complete. So async methods are a way to wait for other asynchronous operations.
Since you are doing nothing to wait for in MyMethod2, async makes no sense here, so your compiler warns you.
Interestingly, the team that implemented async methods has acknowledged that marking a method async is not really necessary, since it would be enough to just use await in the method body for the compiler to recognize it as async. The requirement of using the async keyword has been added to avoid breaking changes to existing code that uses await as a variable name.
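As a small illustration of the point that async alone does not move anything to another thread, here is a sketch modelled on the MyMethod from the question (renamed MyMethodAsync here, which is an assumption for naming consistency). With no await inside, the body runs to completion on the calling thread and the returned task is already finished.

```csharp
using System;
using System.Threading.Tasks;

public static class NoAwaitDemo
{
#pragma warning disable CS1998 // async method lacks 'await'; intentional for this demo
    public static async Task<bool> MyMethodAsync()
    {
        // Some logic; nothing here is asynchronous.
        return true;
    }
#pragma warning restore CS1998

    public static bool ReturnsCompletedTask()
    {
        Task<bool> status = MyMethodAsync();   // the whole body has already run by now
        return status.IsCompleted;
    }

    public static void Main()
    {
        Console.WriteLine("completed on return: " + ReturnsCompletedTask());
    }
}
```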
I've searched the web and seen a lot of questions regarding Task.Run vs await async, but there is this specific usage scenario where I don't really understand the difference. The scenario is quite simple, I believe.
await Task.Run(() => LongProcess());
vs
await LongProcess();
where LongProcess is a async method with a few asynchronous calls in it like calling db with await ExecuteReaderAsync() for instance.
Question:
Is there any difference between the two in this scenario? Any help or input appreciated, thanks!
Task.Run may post the operation to be processed at a different thread. That's the only difference.
This may be of use - for example, if LongProcess isn't truly asynchronous, it will make the caller return faster. But for a truly asynchronous method, there's no point in using Task.Run, and it may result in unnecessary waste.
Be careful, though, because the behaviour of Task.Run will change based on overload resolution. In your example, the Func<Task> overload will be chosen, which will (correctly) wait for LongProcess to finish. However, if a non-task-returning delegate was used, Task.Run will only wait for execution up to the first await (note that this is how TaskFactory.StartNew will always behave, so don't use that).
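The difference between Task.Run and TaskFactory.StartNew mentioned above can be sketched as follows. LongProcessAsync is a hypothetical stand-in that waits on a gate task so the demo controls when it finishes. With an async delegate, StartNew returns a Task<Task> whose outer task completes as soon as the delegate returns its inner task, whereas Task.Run returns a task that tracks the inner work.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskRunOverloadDemo
{
    // Hypothetical long process: stays pending until the gate task completes.
    private static async Task LongProcessAsync(Task gate)
    {
        await gate;
    }

    public static (bool RunDone, bool OuterDone, bool InnerDone) Observe()
    {
        var gate = new TaskCompletionSource<bool>();

        // Func<Task> overload: the returned task represents the whole async operation.
        Task viaRun = Task.Run(() => LongProcessAsync(gate.Task));

        // StartNew does not unwrap: the result is a Task<Task>.
        Task<Task> viaStartNew = Task.Factory.StartNew(() => LongProcessAsync(gate.Task));

        // The outer task finishes once the delegate has returned, i.e. at the first await.
        Task inner = viaStartNew.GetAwaiter().GetResult();

        var snapshot = (viaRun.IsCompleted, viaStartNew.IsCompleted, inner.IsCompleted);

        gate.SetResult(true);            // let the long process finish
        Task.WaitAll(viaRun, inner);     // tidy up before returning
        return snapshot;
    }

    public static void Main()
    {
        var (runDone, outerDone, innerDone) = Observe();
        Console.WriteLine("Task.Run task done early: " + runDone);
        Console.WriteLine("StartNew outer task done early: " + outerDone);
        Console.WriteLine("StartNew inner task done early: " + innerDone);
    }
}
```

Awaiting the outer StartNew task would therefore resume before the long process has finished; calling Unwrap() on it, or using Task.Run, avoids that.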
Quite often people think that async-await is done by several threads. In fact it is all done by one thread.
See the addition below about this one thread statement
The thing that helped me a lot to understand async-await is this interview with Eric Lippert about async-await. Somewhere in the middle he compares async await with a cook who has to wait for some water to boil. Instead of doing nothing, he looks around to see if there is still something else to do like slicing the onions. If that is finished, and the water still doesn't boil he checks if there is something else to do, and so forth until he has nothing to do but wait. In that case he returns to the first thing he waited for.
If your procedure calls an awaitable function, we are certain that somewhere in this awaitable function there is a call to an awaitable function, otherwise the function wouldn't be awaitable. In fact, your compiler will warn you if you forget to await somewhere in your awaitable function.
If your awaitable function calls the other awaitable function, then the thread enters this other function and starts doing the things in this function and goes deeper into other functions until he meets an await.
Instead of waiting for the results, the thread goes up in his call stack to see if there are other pieces of code he can process until he sees an await. Go up again in the call stack, process until await, etc. Once everyone is awaiting the thread looks for the bottom await and continues once that is finished.
This has the advantage, that if the caller of your awaitable function does not need the result of your function, but can do other things before the result is needed, these other things can be done by the thread instead of waiting inside your function.
A call without waiting immediately for the result would look like this:
private async Task MyFunction()
{
Task<ReturnType> taskA = SomeFunctionAsync(...);
// I don't need the result yet, I can do something else
DoSomethingElse();
// now I need the result of SomeFunctionAsync, await for it:
ReturnType result = await taskA;
// now you can use object result
}
Note that in this scenario everything is done by one thread. As long as your thread has something to do he will be busy.
Addition. It is not true that only one thread is involved. Any thread who has nothing to do might continue processing your code after an await. If you check the thread id, you can see that this id can be changed after the await. The continuing thread has the same context as the original thread, so you can act as if it was the original thread. No need to check for InvokeRequired, no need to use mutexes or critical sections. For your code this is as if there is one thread involved.
The link to the article in the end of this answer explains a bit more about thread context
You'll see awaitable functions mainly where some other process has to do things, while your thread just has to wait idly until the other thing is finished. Examples are sending data over the internet, saving a file, communicating with a database etc.
However, sometimes some heavy calculations have to be done, and you want your thread to be free to do something else, like respond to user input. In that case you can start an awaitable action as if you called an async function.
async Task<ResultType> LetSomeoneDoHeavyCalculations(...)
{
DoSomePreparations();
// start a different thread that does the heavy calculations:
var myTask = Task.Run( () => DoHeavyCalculations(...));
// now you are free to do other things
DoSomethingElse();
// once you need the result of the HeavyCalculations await for it
var myResult = await myTask;
// use myResult
...
}
Now a different thread is doing the heavy calculations while your thread is free to do other things. Once it starts awaiting, your caller can do things until he starts awaiting. Effectively your thread will be fairly free to react to user input. However, this will only be the case if everyone is awaiting. While your thread is busy doing things, it can't react to user input. Therefore, if you think your UI thread has to do some busy processing that takes some time, always use Task.Run and let another thread do it.
Another article that helped me: Async-Await by the brilliant explainer Stephen Cleary
This answer deals with the specific case of awaiting an async method in the event handler of a GUI application. In this case the first approach has a significant advantage over the second. Before explaining why, lets rewrite the two approaches in a way that reflects clearly the context of this answer. What follows is only relevant for event handlers of GUI applications.
private async void Button1_Click(object sender, EventArgs args)
{
await Task.Run(async () => await LongProcessAsync());
}
vs
private async void Button1_Click(object sender, EventArgs args)
{
await LongProcessAsync();
}
I added the suffix Async in the method's name, to comply with the guidelines. I also made the anonymous delegate async, just for readability reasons. The overhead of creating a state machine is minuscule, and is dwarfed by the value of communicating clearly that this Task.Run returns a promise-style Task, not an old-school delegate Task intended for background processing of CPU-bound workloads.
The advantage of the first approach is that it guarantees that the UI will remain responsive. The second approach offers no such guarantee. As long as you are using the built-in async APIs of the .NET platform, the probability of the UI being blocked by the second approach is pretty small. After all, these APIs are implemented by experts¹. The moment you start awaiting your own async methods, all guarantees are off. Unless of course your first name is Stephen, and your surname is Toub or Cleary. If that's not the case, it is quite possible that sooner or later you'll write code like this:
public static async Task LongProcessAsync()
{
TeenyWeenyInitialization(); // Synchronous
await SomeBuildInAsyncMethod().ConfigureAwait(false); // Asynchronous
CalculateAndSave(); // Synchronous
}
The problem obviously is with the method TeenyWeenyInitialization(). This method is synchronous, and comes before the first await inside the body of the async method, so it won't be awaited. It will run synchronously every time you call the LongProcessAsync(). So if you follow the second approach (without Task.Run), the TeenyWeenyInitialization() will run on the UI thread.
How bad can this be? The initialization is teeny-weeny after all! Just a quick trip to the database to get a value, read the first line of a small text file, get a value from the registry. It's all over in a couple of milliseconds. At the time you wrote the program. On your PC. Before moving the data folder to a shared drive. Before the amount of data in the database became huge.
But suppose you get lucky and the TeenyWeenyInitialization() remains fast forever. What about the second synchronous method, the CalculateAndSave()? This one comes after an await that is configured to not capture the context, so it runs on a thread-pool thread. It should never run on the UI thread, right? Wrong. It depends on the Task returned by SomeBuildInAsyncMethod(). If the Task is already completed, a thread switch will not occur, and the CalculateAndSave() will run on the same thread that called the method. If you follow the second approach, this will be the UI thread. You may never experience a case where the SomeBuildInAsyncMethod() returned a completed Task in your development environment, but the production environment may be different in ways difficult to predict.
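The claim that an already completed Task causes no thread switch, even with ConfigureAwait(false), can be checked with a minimal sketch. Task.CompletedTask stands in here for a built-in async method that happens to finish synchronously; the class and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletedAwaitDemo
{
    public static async Task<bool> StaysOnCallingThreadAsync()
    {
        int before = Environment.CurrentManagedThreadId;

        // The awaited task is already complete, so the method continues synchronously
        // and ConfigureAwait(false) has no opportunity to move it to the thread pool.
        await Task.CompletedTask.ConfigureAwait(false);

        int after = Environment.CurrentManagedThreadId;
        return before == after;
    }

    public static void Main()
    {
        bool sameThread = StaysOnCallingThreadAsync().GetAwaiter().GetResult();
        Console.WriteLine("continued on the calling thread: " + sameThread);
    }
}
```

In an event handler the calling thread would be the UI thread, so any synchronous work placed after such an await would run there.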
Having an application that performs badly is unpleasant. Having an application that performs badly and freezes the UI is even worse. Do you really want to risk it? If you don't, please always wrap the call in Task.Run with an async delegate inside your event handlers. Especially when awaiting methods you have coded yourself!
¹ Disclaimer, some built-in async APIs are not properly implemented.
Important: The Task.Run runs the supplied asynchronous delegate on a ThreadPool thread, so it's required that the LongProcessAsync has no affinity to the UI thread. If it involves interaction with UI controls, then the Task.Run is not an option. Thanks to #Zmaster for pointing out this important subtlety in the comments.
I don't see the difference between C#'s (and VB's) new async features, and .NET 4.0's Task Parallel Library. Take, for example, Eric Lippert's code from here:
async void ArchiveDocuments(List<Url> urls) {
Task archive = null;
for(int i = 0; i < urls.Count; ++i) {
var document = await FetchAsync(urls[i]);
if (archive != null)
await archive;
archive = ArchiveAsync(document);
}
}
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
Couldn't it just as easily be written like this?
void ArchiveDocuments(List<Url> urls) {
Task archive = null;
for(int i = 0; i < urls.Count; ++i) {
var document = FetchAsync(urls[i]); // removed await
if (archive != null)
archive.Wait(); // changed to .Wait()
archive = ArchiveAsync(document.Result); // added .Result
}
}
I've replaced the first await with a Task.Result where the value is actually needed, and the second await with Task.Wait(), where the wait is actually occurring. The functionality is (1) already implemented, and (2) much closer semantically to what is actually happening in the code.
I do realize that an async method is rewritten as a state machine, similar to iterators, but I also don't see what benefits that brings. Any code that requires another thread to operate (such as downloading) will still require another thread, and any code that doesn't (such as reading from a file) could still utilize the TPL to work with only a single thread.
I'm obviously missing something huge here; can anybody help me understand this a little better?
I think the misunderstanding arises here:
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
This is actually completely incorrect. Both of these have the same meaning.
In your first case:
var document = await FetchAsync(urls[i]);
What happens here, is that the runtime says "Start calling FetchAsync, then return the current execution point to the thread calling this method." There is no "waiting" here - instead, execution returns to the calling synchronization context, and things keep churning. At some point in the future, FetchAsync's Task will complete, and at that point, this code will resume on the calling thread's synchronization context, and the next statement (assigning the document variable) will occur.
Execution will then continue until the second await - at which time, the same thing will happen - if the Task (archive) isn't complete, execution will be released to the calling context - otherwise, execution carries straight on and archive is set to the next ArchiveAsync task.
In the second case, things are very different - here, you're explicitly blocking, which means that the calling synchronization context will never get a chance to execute any code until your entire method completes. Granted, there is still asynchrony, but the asynchrony is completely contained within this block of code - no code outside of this pasted code will happen on this thread until all of your code completes.
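The difference can be observed directly. In the sketch below (made-up names, with a TaskCompletionSource standing in for a fetch that has not finished yet), the awaiting method hands an incomplete Task back to its caller, while the blocking method cannot return until the task is done.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitVersusWaitDemo
{
    // Awaits the gate: if it is not done yet, an incomplete Task goes back to the caller.
    public static async Task AwaitingAsync(Task gate)
    {
        await gate;
    }

    // Blocks on the gate: the calling thread is stuck here until the gate is done.
    public static void Blocking(Task gate)
    {
        gate.Wait();
    }

    public static void Main()
    {
        var gate = new TaskCompletionSource<bool>();

        // Control comes straight back even though the gate has not completed.
        Task pending = AwaitingAsync(gate.Task);
        Console.WriteLine($"Back in the caller; finished yet? {pending.IsCompleted}");

        gate.SetResult(true);
        pending.GetAwaiter().GetResult();
        Console.WriteLine($"After the gate completes; finished? {pending.IsCompleted}");

        // This returns at once only because the gate is already complete.
        // Before SetResult it would have held this thread until another thread completed the gate.
        Blocking(gate.Task);
    }
}
```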
Anders boiled it down to a very succinct answer in the Channel 9 Live interview he did. I highly recommend it.
The new async and await keywords allow you to orchestrate concurrency in your applications. They don't actually introduce any concurrency into your application.
TPL, and more specifically Task, is one way to actually perform operations concurrently. The new async and await keywords allow you to compose these concurrent operations in a "synchronous" or "linear" fashion.
So you can still write a linear flow of control in your programs while the actual computing may or may not happen concurrently. When computation does happen concurrently, await and async allow you to compose these operations.
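As a rough illustration of "compose, not create": in the sketch below the two operations overlap because both are started before either is awaited, and the await merely collects the results in a linear-looking way. StepAsync and the gates are hypothetical stand-ins for real asynchronous work such as I/O.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ComposeDemo
{
    // Hypothetical operation: logs, waits for an externally completed task, logs again.
    public static async Task<string> StepAsync(string name, Task gate, ConcurrentQueue<string> log)
    {
        log.Enqueue($"{name} started");
        await gate;
        log.Enqueue($"{name} finished");
        return name;
    }

    public static async Task<string[]> RunBothAsync(ConcurrentQueue<string> log)
    {
        var gateA = new TaskCompletionSource<bool>();
        var gateB = new TaskCompletionSource<bool>();

        // Calling the methods starts both operations; nothing has been awaited yet.
        Task<string> a = StepAsync("A", gateA.Task, log);
        Task<string> b = StepAsync("B", gateB.Task, log);

        // Something other than await finishes the underlying work (here, this method does).
        gateB.SetResult(true);
        gateA.SetResult(true);

        // await only composes the two results into one linear flow.
        return await Task.WhenAll(a, b);
    }

    public static void Main()
    {
        var log = new ConcurrentQueue<string>();
        string[] names = RunBothAsync(log).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", names));
        foreach (string line in log) Console.WriteLine(line);
    }
}
```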
There is a huge difference:
Wait() blocks, await does not block. If you run the async version of ArchiveDocuments() on your GUI thread, the GUI will stay responsive while the fetching and archiving operations are running.
If you use the TPL version with Wait(), your GUI will be blocked.
Note that async manages to do this without introducing any threads - at the point of the await, control is simply returned to the message loop. Once the task being waited for has completed, the remainder of the method (continuation) is enqueued on the message loop and the GUI thread will continue running ArchiveDocuments where it left off.
The ability to turn the program flow of control into a state machine is what makes these new keywords interesting. Think of it as yielding control, rather than values.
Check out this Channel 9 video of Anders talking about the new feature.
The problem here is that the signature of ArchiveDocuments is misleading. It has an explicit return of void but really the return is Task. To me void implies synchronous as there is no way to "wait" for it to finish. Consider the alternate signature of the function.
async Task ArchiveDocuments(List<Url> urls) {
...
}
To me when it's written this way the difference is much more obvious. The ArchiveDocuments function is not one that completes synchronously but will finish later.
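A short sketch of what the Task-returning signature buys the caller. The names are hypothetical and the body is a placeholder for real work; the point is only that the caller can await completion and observe a failure, neither of which is possible with a void return.

```csharp
using System;
using System.Threading.Tasks;

public static class ReturnTaskDemo
{
    // Returning Task gives the caller a handle on completion and on failures.
    public static async Task ArchiveDocumentsAsync(bool fail)
    {
        await Task.Delay(10); // stand-in for real asynchronous work
        if (fail) throw new InvalidOperationException("archive failed");
    }

    public static async Task<string> CallerAsync(bool fail)
    {
        try
        {
            // Possible only because a Task is returned.
            await ArchiveDocumentsAsync(fail);
            return "completed";
        }
        catch (InvalidOperationException ex)
        {
            return "failed: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CallerAsync(false).GetAwaiter().GetResult());
        Console.WriteLine(CallerAsync(true).GetAwaiter().GetResult());
    }
}
```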
The await keyword does not introduce concurrency. Like the yield keyword, it tells the compiler to restructure your code into lambdas controlled by a state machine.
To see what await code would look like without 'await' see this excellent link: http://blogs.msdn.com/b/windowsappdev/archive/2012/04/24/diving-deep-with-winrt-and-await.aspx
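For a rough feel of it without following the link, here is a hand-written approximation of a one-await method. This is only a sketch of the shape: the real compiler output uses a generated state machine and a method builder rather than a TaskCompletionSource, and the method names below are made up.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // The version with await.
    public static async Task<int> LengthWithAwaitAsync(Task<string> fetch)
    {
        string text = await fetch;
        return text.Length;
    }

    // Approximation of the same method written without await.
    public static Task<int> LengthWithoutAwait(Task<string> fetch)
    {
        var result = new TaskCompletionSource<int>();
        TaskAwaiter<string> awaiter = fetch.GetAwaiter();

        // "The rest of the method" packaged as a callback.
        void Continue()
        {
            try { result.SetResult(awaiter.GetResult().Length); }
            catch (Exception ex) { result.SetException(ex); }
        }

        if (awaiter.IsCompleted) Continue();   // already done: carry on synchronously
        else awaiter.OnCompleted(Continue);    // not done: sign up the rest as a continuation

        return result.Task;
    }

    public static void Main()
    {
        Task<string> fetch = Task.FromResult("hello");
        Console.WriteLine(LengthWithAwaitAsync(fetch).GetAwaiter().GetResult());
        Console.WriteLine(LengthWithoutAwait(fetch).GetAwaiter().GetResult());
    }
}
```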
The call to FetchAsync() will still block until it completes (unless a statement within calls await?) The key is that control is returned to the caller (because the ArchiveDocuments method itself is declared as async). So the caller can happily continue processing UI logic, respond to events, etc.
When FetchAsync() completes, it interrupts the caller to finish the loop. It hits ArchiveAsync() and blocks, but ArchiveAsync() probably just creates a new task, starts it, and returns the task. This allows the second loop to begin, while the task is processing.
The second loop hits FetchAsync() and blocks, returning control to the caller. When FetchAsync() completes, it again interrupts the caller to continue processing. It then hits await archive, which returns control to the caller until the Task created in loop 1 completes. Once that task is complete, the caller is again interrupted, and the second loop calls ArchiveAsync(), which gets a started task and begins loop 3, repeat ad nauseam.
The key is returning control to the caller while the heavy lifters are executing.