Microsoft announced the Visual Studio Async CTP today (October 28, 2010) that introduces the async and await keywords into C#/VB for asynchronous method execution.
First I thought that the compiler translates the keywords into the creation of a thread but according to the white paper and Anders Hejlsberg's PDC presentation (at 31:00) the asynchronous operation happens completely on the main thread.
How can I have an operation executed in parallel on the same thread? How is it technically possible and to what is the feature actually translated in IL?
It works similarly to the yield return keyword in C# 2.0.
An asynchronous method is not actually an ordinary sequential method. It is compiled into a state machine (an object) with some state (local variables are turned into fields of the object). Each block of code between two uses of await is one "step" of the state machine.
This means that when the method starts, it just runs the first step and then the state machine returns and schedules some work to be done - when the work is done, it will run the next step of the state machine. For example this code:
async Task Demo() {
var v1 = foo();
var v2 = await bar();
more(v1, v2);
}
Would be translated to something like:
class _Demo {
int _v1, _v2;
int _state = 0;
Task<int> _await1;
public void Step() {
switch(this._state) {
case 0:
this._v1 = foo();
this._await1 = bar();
// When the async operation completes, it will call this method
this._state = 1;
op.SetContinuation(Step);
case 1:
this._v2 = this._await1.Result; // Get the result of the operation
more(this._v1, this._v2);
}
}
The important part is that it just uses the SetContinuation method to specify that when the operation completes, it should call the Step method again (and the method knows that it should run the second bit of the original code using the _state field). You can easily imagine that the SetContinuation would be something like btn.Click += Step, which would run completely on a single thread.
The asynchronous programming model in C# is very close to F# asynchronous workflows (in fact, it is essentially the same thing, aside from some technical details), and writing reactive single-threaded GUI applications using async is quite an interesting area - at least I think so - see for example this article (maybe I should write a C# version now :-)).
The translation is similar to iterators (and yield return) and in fact, it was possible to use iterators to implement asynchronous programming in C# earlier. I wrote an article about that a while ago - and I think it can still give you some insight on how the translation works.
How can I have an operation executed in parallel on the same thread?
You can't. Asynchrony is not "parallelism" or "concurrency". Asynchrony might be implemented with parallelism, or it might not be. It might be implemented by breaking up the work into small chunks, putting each chunk of work on a queue, and then executing each chunk of work whenever the thread happens to be not doing anything else.
I've got a whole series of articles on my blog about how all this stuff works; the one directly germane to this question will probably go up Thursday of next week. Watch this link for details.
As I understand it, what the async and await keywords do is that every time an async method employs the await keyword, the compiler will turn the remainder of the method into a continuation that is scheduled when the async operation is completed. That allows async methods to return to the caller immediately and resume work when the async part is done.
According to the available papers there are a lot details to it, but unless I am mistaken, that is the gist of it.
As I see it the purpose of the async methods is not to run a lot of code in parallel, but to chop up async methods into a number of small chunks, that can be called as needed. The key point is that the compiler will handle all the complex wiring of callbacks using tasks/continuations. This not only reduces complexity, but allows async method to be written more or less like traditional synchronous code.
Related
With async turtles all the way down ...
Some of the background research:
Async/Await - Best Practices in Asynchronous Programming
Async
In code I've seen:
Developers wrap code in a Task.Run() upon entering a method and awaiting it to force it to be async or (or older .NET 4.0 TaskFactory.StartNew())
I've personally just left it synchronous (but it partially violates turtles all the down as different branches often are sync)
I've also seen the attempt to use ValueTask to short-circuit the return as async given the nature of ValueTask with a return value being akin to:
return new ValueTask<int>(valueWantToReturn);
Example psudeo code:
public async Task ParentMethodAsync()
{
// ... do something upon entering
int neededValue = SyncNestedDryMethod(); // ** this part **
await _nthLayer.DoSomethingWithRepoAsync(); // ** same concept applies here
// if there is no asyncronous logic required **
// ... do anything else needed
}
public int SyncNestedDryMethod()
{
// ... do something that requires no real async method signature
// declaration because it is straight synchronous logic (covers reusing
// code for DRY purposes, method-level SRP abstraction (includes moving it
// to an abstracted class -- for future readers), etc.)
// I've seen making the method async then awaiting Task.Run() or
// Task.StartFactory.New() inside this (sync changed to async method)
// to try and force it to be async or last line being:
// return new ValueTask<int>(valueWantToReturn);
// but everything above it being only syncronous logic
}
Is there a best practice to be aware of not found addressed in the respective links (or overlooked) or elsewhere which I have not come across to address abstracting synchronous code into new methods (for DRY purposes, SRP, etc.) that do not have the ability to be async or are forced async all the while following async through and through?
Your question isn't entirely clear, but I think you're asking how to run synchronous code in an asynchronous way? If so, the answer is: you can't.
Using Task.Run() doesn't make the code asynchronous. It just runs the code on a different thread and gives you an asynchronous way to wait for that thread to complete. That may be what you need if you just need to get the work off of the current thread. For example, you would use that in a desktop application where you want to get the work off of the UI thread. But this wouldn't help you at all in ASP.NET.
ValueTask is used for cases where the work might be done asynchronously, but also may complete synchronously too. For example, if data is usually returned from an in memory cache, but refreshing that cache is done asynchronously. This is described in detail here. But if the work is always synchronous, then there is no point using ValueTask, or Task or async.
After reading lot of stuff about async/await, I still not sure where do I use await operator in my async method:
public async Task<IActionResult> DailySchedule(int professionalId, DateTime date)
{
var professional = professionalServices.Find(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
return PartialView(appointments);
}
Should I create an async version for all 3 method and call like this?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsAsync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);
or Should I make async only the first one ?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
What´s is the difference?
I still not sure where do I use await operator in my async method
You're approaching the problem from the wrong end.
The first thing to change when converting to async is the lowest-level API calls. There are some operations that are naturally asynchronous - specifically, I/O operations. Other operations are naturally synchronous - e.g., CPU code.
Based on the names "Find", "SearchForAppointments, and "SomeCalculation", I'd suspect that Find and SearchForAppointments are I/O-based, possibly hitting a database, while SomeCalculation is just doing some CPU calculation.
But don't change Find to FindAsync just yet. That's still going the wrong way. The right way is to start at the lowest API, so whatever I/O that Find eventually does. For example, if this is an EF query, then use the EF6 asynchronous methods within Find. Then Find should become FindAsync, which should drive DailySchedule to be DailyScheduleAsync, which causes its callers to become async, etc. That's the way that async should grow.
As VSG24 has said, you should await each and every call that can be awaited that is because this will help you keep the main thread free from any long running task — that are for instance, tasks that download data from internet, tasks that save the data to the disk using I/O etc. The problem was that whenever you had to do a long running task, it always froze the UI. To overcome this, asynchronous pattern was used and thus this async/await allows you create a function that does the long running task on the background and your main thread keeps talking to the users.
I still not sure where do I use await operator in my async method
The intuition is that every function ending with Async can be awaited (provided their signature also matches, the following) and that they return either a Task, or Task<T>. Then they can be awaited using a function that returns void — you cannot await void. So the functions where there will be a bit longer to respond, you should apply await there. My own opinion is that a function that might take more than 1 second should be awaited — because that is a 1 second on your device, maybe your client has to wait for 5 seconds, or worse 10 seconds. They are just going to close the application and walk away saying, It doesn't work!.
Ok, if one of these is very very fast, it does not need to be async, right ?
Fast in what manner? Don't forget that your clients may not have very very fast machines and they are definitely going to suffer from the frozen applications. So them a favor and always await the functions.
What´s is the difference?
The difference is that in the last code sample, only the first call will be awaited and the others will execute synchronously. Visual Studio will also explain this part to you by providing green squiggly lines under the code that you can see by hovering over it. That is why, you should await every async call where possible.
Tip: If the process in the function is required, such as loading all of the data before starting the application then you should avoid async/await pattern and instead use the synchronous approach to download the data and perform other tasks. So it entirely depends on what you are doing instead of what the document says. :-)
Every async method call needs to be awaited, so that every method call releases the thread and thus not causing a block. Therefore in your case this is how you should do it:
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsasync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);
For running code asynchronously (eg with async/await) I need a proper Task. Of course there are several predefined methods in the framework which cover the most frequent operations, but sometimes I want to write my own ones. I’m new to C# so it’s quite possible that I’m doing it wrong but I’m at least not fully happy with my current practice.
See the following example for what I’m doing:
public async Task<bool> doHeavyWork()
{
var b1 = await this.Foo();
//var b2 = await Task<bool>.Factory.StartNew(Bar); //using Task.Run
var b2 = await Task.Run(()=>Bar());
return b1 & b2;
}
public Task<bool> Foo()
{
return Task.Factory.StartNew(() =>
{
//do a lot of work without awaiting
//any other Task
return false;
});
}
public bool Bar()
{
//do a lot of work without awaiting any other task
return false;
}
In general I create and consume such methods like the Foo example, but there is an ‘extra’ lambda containing the whole method logic which doesn't look very pretty imho. Another option is to consume any Method like the Bar example, however I think that’s even worse because it isn’t clear that this method should be run async (apart from proper method names like BarAsync) and the Task.Factory.StartNew may has to be repeated several times in the program. I don’t know how to just tell the compiler ‘this method returns a Task, please wrap it as a whole into a Task when called’ which is what I want to do.
Finally my question: What’s the best way to write such a method? Can I get rid of the ‘extra’ lambda (without adding an additional named method of course)?
EDIT
As Servy stated there is always good reason to have a synchronous version of the method. An async version should be provided only if absolutley necessary (Stephen Cleary's link).
Would the world end if someone wanted to call Bar without starting it in a new thread/Task? What if there is another method that is already in a background thread so it doesn't need to start a new task to just run that long running code? What if the method ends up being refactored into a class library that is called from both desktop app context as well as an ASP or console context? In those other contexts that method would probably just need to be run directly, not as a Task.
I would say your code should look like Bar does there, unless it's absolutely imperative that, under no circumstances whatsoever, should that code be run without starting a new Task.
Also note that with newer versions of .NET you should be switching from Task.Factory.StartNew to Task.Run, so the amount of code should go down (a tad).
SO I've been doing a ton of reading lately about the net Async CTP and one thing that keeps coming up are sentences like these: "Asynchrony is not about starting new threads, its about multiplexing work", and "Asynchrony without multithreading is the same idea [as cooperative multitasking]. You do a task for a while, and when it yields control, you do another task for a while on that thread".
I'm trying to understand whether remarks like this are purely (well, mostly) esoteric and academic or whether there is some language construct I'm overlooking that allows the tasks I start via "await" to magically run on the UI thread.
In his blog, Eric Lippert gives this example as a demonstration of how you can have Asyncrhony without multithreading:
async void FrobAll()
{
for(int i = 0; i < 100; ++i)
{
await FrobAsync(i); // somehow get a started task for doing a Frob(i) operation on this thread
}
}
Now, it's the comment that intrigues me here: "...get a started task for doing a Frob(i) operation on this thread".
Just how is this possible? Is this a mostly theoretical remark? The only cases I can see so far where a task appears to not need a separate thread (well, you can't know for sure unless you examine the code) would be something like Task.Delay(), which one can await on without starting another thread. But I consider this a special case, since I didn't write the code for that.
For the average user who wants to offload some of their own long running code from the GUI thread, aren't we talking primarily about doing something like Task.Run to get our work offloaded, and isn't that going to start it in another thread? If so, why all this arm waiving about not confusing asynchorny with multithreading?
Please see my intro to async/await.
In the case of CPU work, you do have to use something like Task.Run to execute it in another thread. However, there's a whole lot of work which is not CPU work (network requests, file system requests, timers, ...), and each of those can be wrapped in a Task without using a thread.
Remember, at the driver level, everything is asynchronous. The synchronous Win32 APIs are just convenience wrappers.
Not 100% sure, but from the article it sounds like what's going on is that it allows windows messages (resize events, mouse clicks, etc) to be interleaved between the calls to FrobAsync. Roughly analogous to:
void FrobAll()
{
for(int i = 0; i < 100; ++i)
{
FrobAsync(i); // somehow get a started task for doing a Frob(i) operation on this thread
System.Windows.Forms.Application.DoEvents();
}
}
or maybe more accurately:
void FrobAll()
{
SynchronizationContext.Current.Post(QueueFrob, 0);
}
void QueueFrob(Object state) {
var i = (int)state;
FrobAsync(i);
if (i == 99) return;
SynchronizationContext.Current.Post(QueueFrob, i+1);
}
Without the nasty call to DoEvents between each iteration. The reason that this occurs is because the call to await sends a Windows message that enqueues the FrobAsync call, allowing any windows messages that occurred between Frobbings to be executed before the next Frobbing begins.
async/await is simply about asynchronous. From the caller side, it doesn't matter if multiple threads are being used or not--simply that it does something asynchronous. IO completion ports, for example, are asychronous but doesn't do any multi-threading (completion of the operation occurs on a background thread; but the "work" wasn't done on that thread).
From the await keyword, "multi-threaded" is meaningless. What the async operation does is an implementation detail.
In the sense of Eric's example, that method could have been implemented as follows:
return Task.Factory.StartNew(SomeMethod,
TaskScheduler.FromCurrentSynchronizationContext());
Which really means queue a call to SomeMethod on the current thread when it's done all the currently queued work. That's asynchronous to the caller of FrobAsync in that FrobAsync returns likely before SomeMethod is executed.
Now, FrobAsync could have been implemented to use multithreading, in which case it could have been written like this:
return Task.Factory.StartNew(SomeMethod);
which would, if the default TaskScheduler was not changed, use the thread pool. But, from the caller's perspective, nothing has changed--you still await the method.
From the perspective of multi-threading, this is what you should be looking at. Using Task.Start, Task.Run, or Task.Factory.StartNew.
I don't see the different between C#'s (and VB's) new async features, and .NET 4.0's Task Parallel Library. Take, for example, Eric Lippert's code from here:
async void ArchiveDocuments(List<Url> urls) {
Task archive = null;
for(int i = 0; i < urls.Count; ++i) {
var document = await FetchAsync(urls[i]);
if (archive != null)
await archive;
archive = ArchiveAsync(document);
}
}
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
Couldn't it just as easily be written like this?
void ArchiveDocuments(List<Url> urls) {
for(int i = 0; i < urls.Count; ++i) {
var document = FetchAsync(urls[i]); // removed await
if (archive != null)
archive.Wait(); // changed to .Wait()
archive = ArchiveAsync(document.Result); // added .Result
}
}
I've replaced the first await with a Task.Result where the value is actually needed, and the second await with Task.Wait(), where the wait is actually occurring. The functionality is (1) already implemented, and (2) much closer semantically to what is actually happening in the code.
I do realize that an async method is rewritten as a state machine, similar to iterators, but I also don't see what benefits that brings. Any code that requires another thread to operate (such as downloading) will still require another thread, and any code that doesn't (such as reading from a file) could still utilize the TPL to work with only a single thread.
I'm obviously missing something huge here; can anybody help me understand this a little better?
I think the misunderstanding arises here:
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
This is actually completely incorrect. Both of these have the same meaning.
In your first case:
var document = await FetchAsync(urls[i]);
What happens here, is that the runtime says "Start calling FetchAsync, then return the current execution point to the thread calling this method." There is no "waiting" here - instead, execution returns to the calling synchronization context, and things keep churning. At some point in the future, FetchAsync's Task will complete, and at that point, this code will resume on the calling thread's synchronization context, and the next statement (assigning the document variable) will occur.
Execution will then continue until the second await call - at which time, the same thing will happen - if the Task<T> (archive) isn't complete, execution will be released to the calling context - otherwise, the archive will be set.
In the second case, things are very different - here, you're explicitly blocking, which means that the calling synchronization context will never get a chance to execute any code until your entire method completes. Granted, there is still asynchrony, but the asynchrony is completely contained within this block of code - no code outside of this pasted code will happen on this thread until all of your code completes.
Anders boiled it down to a very succinct answer in the Channel 9 Live interview he did. I highly recommend it
The new Async and await keywords allow you to orchestrate concurrency in your applications. They don't actually introduce any concurrency in to your application.
TPL and more specifically Task is one way you can use to actually perform operations concurrently. The new async and await keyword allow you to compose these concurrent operations in a "synchronous" or "linear" fashion.
So you can still write a linear flow of control in your programs while the actual computing may or may not happen concurrently. When computation does happen concurrently, await and async allow you to compose these operations.
There is a huge difference:
Wait() blocks, await does not block. If you run the async version of ArchiveDocuments() on your GUI thread, the GUI will stay responsive while the fetching and archiving operations are running.
If you use the TPL version with Wait(), your GUI will be blocked.
Note that async manages to do this without introducing any threads - at the point of the await, control is simply returned to the message loop. Once the task being waited for has completed, the remainder of the method (continuation) is enqueued on the message loop and the GUI thread will continue running ArchiveDocuments where it left off.
The ability to turn the program flow of control into a state machine is what makes these new keywords intresting. Think of it as yielding control, rather than values.
Check out this Channel 9 video of Anders talking about the new feature.
The problem here is that the signature of ArchiveDocuments is misleading. It has an explicit return of void but really the return is Task. To me void implies synchronous as there is no way to "wait" for it to finish. Consider the alternate signature of the function.
async Task ArchiveDocuments(List<Url> urls) {
...
}
To me when it's written this way the difference is much more obvious. The ArchiveDocuments function is not one that completes synchronously but will finish later.
The await keyword does not introduce concurrency. It is like the yield keyword, it tells the compiler to restructure your code into lambda controlled by a state machine.
To see what await code would look like without 'await' see this excellent link: http://blogs.msdn.com/b/windowsappdev/archive/2012/04/24/diving-deep-with-winrt-and-await.aspx
The call to FetchAsync() will still block until it completes (unless a statement within calls await?) The key is that control is returned to the caller (because the ArchiveDocuments method itself is declared as async). So the caller can happily continue processing UI logic, respond to events, etc.
When FetchAsync() completes, it interrupts the caller to finish the loop. It hits ArchiveAsync() and blocks, but ArchiveAsync() probably just creates a new task, starts it, and returns the task. This allows the second loop to begin, while the task is processing.
The second loop hits FetchAsync() and blocks, returning control to the caller. When FetchAsync() completes, it again interrupts the caller to continue processing. It then hits await archive, which returns control to the caller until the Task created in loop 1 completes. Once that task is complete, the caller is again interrupted, and the second loop calls ArchiveAsync(), which gets a started task and begins loop 3, repeat ad nauseum.
The key is returning control to the caller while the heavy lifters are executing.