Could someone please be kind enough to confirm if I have understood the Async await keyword correctly? (Using version 3 of the CTP)
Thus far I have worked out that inserting the await keyword prior to a method call essentially does 2 things, A. It creates an immediate return and B. It creates a "continuation" that is invoked upon the completion of the async method invocation. In any case the continuation is the remainder of the code block for the method.
So what I am wondering is, are these two bits of code technically equivalent, and if so, does this basically mean that the await keyword is identical to creating a ContinueWith Lambda (Ie: it's basically a compiler shortcut for one)? If not, what are the differences?
bool Success =
    await new POP3Connector(
        "mail.server.com", txtUsername.Text, txtPassword.Text).Connect();
// At this point the method will return and following code will
// only be invoked when the operation is complete(?)
MessageBox.Show(Success ? "Logged In" : "Wrong password");
VS
(new POP3Connector(
    "mail.server.com", txtUsername.Text, txtPassword.Text).Connect())
    .ContinueWith((success) =>
        MessageBox.Show(success.Result ? "Logged In" : "Wrong password"));
The general idea is correct - the remainder of the method is made into a continuation of sorts.
The "fast path" blog post has details on how the async/await compiler transformation works.
Differences, off the top of my head:
The await keyword also makes use of a "scheduling context" concept. The scheduling context is SynchronizationContext.Current if it exists, falling back on TaskScheduler.Current. The continuation is then run on the scheduling context. So a closer approximation would be to pass TaskScheduler.FromCurrentSynchronizationContext into ContinueWith, falling back on TaskScheduler.Current if necessary.
The actual async/await implementation is based on pattern matching; it uses an "awaitable" pattern that allows other things besides tasks to be awaited. Some examples are the WinRT asynchronous APIs, some special methods such as Yield, Rx observables, and special socket awaitables that don't hit the GC as hard. Tasks are powerful, but they're not the only awaitables.
One more minor nitpicky difference comes to mind: if the awaitable is already completed, then the async method does not actually return at that point; it continues synchronously. So it's kind of like passing TaskContinuationOptions.ExecuteSynchronously, but without the stack-related problems.
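To make the scheduling-context point concrete, here is a minimal sketch of the two versions side by side. ConnectAsync is a hypothetical stand-in for POP3Connector.Connect() (it only simulates a delay), and the ContinueWith variant picks its scheduler the way described above. It is an approximation under those assumptions, not what the compiler actually emits.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitVersusContinueWith
{
    // Hypothetical stand-in for POP3Connector.Connect(): completes after a short delay.
    public static async Task<bool> ConnectAsync(string password)
    {
        await Task.Delay(50);
        return password == "secret";
    }

    // The await version: the rest of the method becomes the continuation.
    public static async Task<string> WithAwait(string password)
    {
        bool success = await ConnectAsync(password);
        return success ? "Logged In" : "Wrong password";
    }

    // The ContinueWith version, with the scheduler chosen explicitly.
    public static Task<string> WithContinueWith(string password)
    {
        // Use the SynchronizationContext if there is one, else the current scheduler.
        TaskScheduler scheduler = SynchronizationContext.Current != null
            ? TaskScheduler.FromCurrentSynchronizationContext()
            : TaskScheduler.Current;

        return ConnectAsync(password).ContinueWith(
            t => t.Result ? "Logged In" : "Wrong password",
            CancellationToken.None,
            TaskContinuationOptions.None,
            scheduler);
    }

    public static async Task Main()
    {
        Console.WriteLine(await WithAwait("secret"));        // prints "Logged In"
        Console.WriteLine(await WithContinueWith("wrong"));  // prints "Wrong password"
    }
}
```

In a console application there is no SynchronizationContext, so both versions resume on the thread pool. In a UI application the first branch would be taken and the continuation would run on the UI thread.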
It's "essentially" that, but the generated code does strictly more than just that. For lots more detail on the code generated, I'd highly recommend Jon Skeet's Eduasync series:
http://codeblog.jonskeet.uk/category/eduasync/
In particular, post #7 gets into what gets generated (as of CTP 2) and why, so probably a great fit for what you're looking for at the moment:
http://codeblog.jonskeet.uk/2011/05/20/eduasync-part-7-generated-code-from-a-simple-async-method/
EDIT: I think it's likely to be more detail than what you're looking for from the question, but if you're wondering what things look like when you have multiple awaits in the method, that's covered in post #9 :)
http://codeblog.jonskeet.uk/2011/05/30/eduasync-part-9-generated-code-for-multiple-awaits/
I have the following code:
var things = await GetDataFromApi(cancellationToken);
var builder = new StringBuilder(JsonSerializer.Serialize(things));
await things
    .GroupBy(x => x.Category)
    .ToAsyncEnumerable()
    .SelectManyAwaitWithCancellation(async (category, ct) =>
    {
        var thingsWithColors = await _colorsApiClient.GetColorsFor(category.Select(thing => thing.Name).ToList(), ct);
        return category
            .Select(thing => ChooseBestColor(thingsWithColors))
            .ToAsyncEnumerable();
    })
    .ForEachAsync(thingAndColor =>
    {
        Console.WriteLine(Thread.CurrentThread.ManagedThreadId); // prints different IDs
        builder.Replace(thingAndColor.Thing, $"{thingAndColor.Color} {thingAndColor.Thing}");
    }, cancellationToken);
It uses System.Linq.Async and I find it difficult to understand.
In "classic"/synchronous LINQ, the whole thing would get executed only when I call ToList() or ToArray() on it. In the example above, there is no such call, but the lambdas get executed anyway. How does it work?
The other concern I have is about multi-threading. I have heard many times that async != multithreading. Then how is it possible that Console.WriteLine(Thread.CurrentThread.ManagedThreadId); prints various IDs? Some of the IDs get printed multiple times, but overall there are about 5 thread IDs in the output. None of my code creates any threads explicitly. It's all async-await.
The StringBuilder does not support multi-threading, and I'd like to understand if the implementation above is valid.
Please ignore the algorithm of my code, it does not really matter, it's just an example. What matters is the usage of System.Linq.Async.
ForEachAsync would have a similar effect as ToList/ToArray since it forces evaluation of the entire list.
By default, anything after an await continues on the same execution context, meaning if the code runs on the UI thread, it will continue running on the UI thread. If it runs on a background thread, it will continue to run on a background thread, but not necessarily the same one.
However, none of your code should run in parallel. That does not necessarily mean it is thread-safe: there probably need to be some memory barriers to ensure data is flushed correctly, but I would assume these barriers are issued by the framework code itself.
The System.Linq.Async library, as well as the whole dotnet/reactive repository, is currently a semi-abandoned project. The issues on GitHub are piling up, and nobody has answered them officially for almost a year. There is no published documentation apart from the XML documentation comments in the source code on top of each method. You can't really use this library without studying the source code, which is generally easy to do because the code is short, readable, and honestly doesn't do too much. The functionality offered by this library is similar to the functionality found in System.Linq, with the main difference being that the input is IAsyncEnumerable<T> instead of IEnumerable<T>, and the delegates can return values wrapped in ValueTask<T>s.
With the exception of a few operators like Merge (and only one of its overloads), System.Linq.Async doesn't introduce concurrency. The asynchronous operations are invoked one at a time, and each is awaited before the next is invoked. The SelectManyAwaitWithCancellation operator is not one of the exceptions. The selector is invoked sequentially for each element, and the resulting IAsyncEnumerable<TResult> is enumerated sequentially, with its values yielded one after the other. So it's unlikely to create thread-safety issues.
The ForEachAsync operator is just a substitute for a standard await foreach loop, and was included in the library at a time when C# language support for await foreach was nonexistent (before C# 8). I would recommend against using this operator, because its resemblance to the new Parallel.ForEachAsync API could create confusion. Here is what is written inside the source code of the ForEachAsync operator:
// REVIEW: Once we have C# 8.0 language support, we may want to do away with these
// methods. An open question is how to provide support for cancellation,
// which could be offered through WithCancellation on the source. If we still
// want to keep these methods, they may be a candidate for
// System.Interactive.Async if we consider them to be non-standard
// (i.e. IEnumerable<T> doesn't have a ForEach extension method either).
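For comparison, here is a minimal sketch of the plain await foreach loop that ForEachAsync stands in for, with cancellation supplied through WithCancellation on the source. It needs C# 8 and uses only types from the base class library. GetNamesAsync is a hypothetical source invented for the example, not part of System.Linq.Async.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitForeachSketch
{
    // Hypothetical asynchronous source: yields three names with a small delay each.
    public static async IAsyncEnumerable<string> GetNamesAsync(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        foreach (var name in new[] { "apple", "sky", "grass" })
        {
            await Task.Delay(10, ct);
            yield return name;
        }
    }

    public static async Task<string> JoinAsync(CancellationToken ct)
    {
        var builder = new StringBuilder();

        // WithCancellation passes the token to the iterator's [EnumeratorCancellation] parameter.
        await foreach (var name in GetNamesAsync().WithCancellation(ct))
        {
            // The loop body runs for one element at a time, never concurrently,
            // so the StringBuilder is only touched by one thread at any moment.
            builder.Append(name).Append(';');
        }

        return builder.ToString();
    }

    public static async Task Main()
    {
        Console.WriteLine(await JoinAsync(CancellationToken.None)); // prints "apple;sky;grass;"
    }
}
```

The body may still resume on different thread pool threads between elements, which matches the varying thread IDs seen in the question, but the iterations do not overlap.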
I've recently been learning asynchronous programming and I think I've mastered it. Asynchronous programming is simply just allowing our program to multitask.
The confusion comes with the async and await keywords; they confused me a little more. Could somebody help answer some of my concerns?
I don't see the async keyword as much, just something you chuck on a method to let Visual Studio know that the method may await something and for you to allow it to warn you. If it has some other special meaning that actually affects something, could someone explain?
Moving onto await, after talking to a friend I was told I had 1 major thing wrong, await doesn't block the current method, it simply executes the code left in that method and does the asynchronous operation in its own time.
Now, I'm not sure how often this happens, but let's say you have some code like this:
Console.WriteLine("Started checking a players data.");
var player = await GetPlayerAsync();
foreach (var uPlayer in Players.Values) {
    uPlayer.SendMessage("Checking another players data");
}
if (player.Username == "SomeUsername") {
    ExecuteSomeOperation();
}
Console.WriteLine("Finished checking a players data.");
As you can see, I run some asynchronous code on GetPlayerAsync, what happens if we get deeper into the scope and we need to access player, but it hasn't returned the player yet?
If it doesn't block the method, how does it know that player isn't null? Does it do some magic and wait for us if we get to that situation, or do we just forbid ourselves from writing methods this way and handle it ourselves?
I've recently been learning asynchronous programming and I think I've mastered it.
I was one of the designers of the feature and I don't feel like I've even come close to mastering it, and you are asking beginner level questions and have some very, very wrong ideas, so there's some hubris going on here I suspect.
Asynchronous programming is simply just allowing our program to multitask.
Suppose you asked "why are some substances hard and some soft?" and I answered "substances are made of arrangements of atoms, and some atom arrangements are hard and some are soft". Though that is undoubtedly true, I hope you would push back on this unhelpful non-explanation.
Similarly, you've just replaced the vague word "asynchronous" with another vague word "multitask". This is an explanation that explains nothing, since you haven't clearly defined what it means to multitask.
Asynchronous workflows are undoubtedly about executing multiple tasks. That's why the fundamental unit of work in a workflow is the Task<T> monad. An asynchronous workflow is the composition of multiple tasks by constructing a graph of dependency relationships among them. But that says nothing about how that workflow is actually realized in software. This is a complex and deep subject.
I don't see the async keyword as much, just something you chuck on a method to let Visual Studio know that the method may await something and for you to allow it to warn you.
That's basically correct, though don't think of it as telling Visual Studio; VS doesn't care. It's the C# compiler that you're telling.
If it has some other special meaning that actually affects something, could someone explain?
It just makes await a keyword inside the method, and puts restrictions on the return type, and changes the meaning of return to "signal that the task associated with this invocation is complete", and a few other housekeeping details.
await doesn't block the current method
Of course it does. Why would you suppose that it does not?
It doesn't block the thread, but it surely blocks the method.
it simply executes the code left in that method and does the asynchronous operation in its own time.
ABSOLUTELY NOT. This is completely backwards. await does the opposite of that. Await means if the task is not complete then return to your caller, and sign up the remainder of this method as the continuation of the task.
As you can see, I run some asynchronous code on GetPlayerAsync, what happens if we get deeper into the scope and we need to access player, but it hasn't returned the player yet?
That doesn't ever happen.
If the value assigned to player is not available when the await executes then the await returns, and the remainder of the method is resumed when the value is available (or when the task completes exceptionally.)
Remember, await means asynchronously wait, that's why we called it "await". An await is a point in an asynchronous workflow where the workflow cannot proceed until the awaited task is complete. That is the opposite of how you are describing await.
Again, remember what an asynchronous workflow is: it is a collection of tasks where those tasks have dependencies upon each other. We express that one task has a dependency upon the completion of another task by placing an await at the point of the dependency.
Let's look at your workflow in more detail:
var player = await GetPlayerAsync();
foreach (var uPlayer in Players.Values) ...
if (player.Username == "SomeUsername") ...
The await means "the remainder of this workflow cannot continue until the player is obtained". Is that actually correct? If you want the foreach to not execute until the player is fetched, then this is correct. But the foreach doesn't depend on the player, so we could rewrite this like this:
Task<Player> playerTask = GetPlayerAsync();
foreach (var uPlayer in Players.Values) ...
Player player = await playerTask;
if (player.Username == "SomeUsername") ...
See, we have moved the point of dependency to later in the workflow. We start the "get a player" task, then we do the foreach, and then we check to see if the player is available right before we need it.
If you have the belief that await somehow "takes a call and makes it asynchronous", this should dispel that belief. await takes a task and returns if it is not complete. If it is complete, then it extracts the value of that task and continues. The "get a player" operation is already asynchronous, await does not make it so.
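A small sketch of the rewritten workflow, recording the order in which things happen. GetPlayerAsync here is a hypothetical stand-in that only simulates a 200 ms fetch, and the log list is added purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DeferredAwait
{
    // Hypothetical stand-in for GetPlayerAsync(): logs when it starts and finishes.
    public static async Task<string> GetPlayerAsync(List<string> log)
    {
        log.Add("fetch started");
        await Task.Delay(200);           // simulated I/O
        log.Add("fetch finished");
        return "SomeUsername";
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        // Calling the method starts the operation; no await yet.
        Task<string> playerTask = GetPlayerAsync(log);

        // Work that does not depend on the player runs while the fetch is in flight.
        log.Add("messages sent");

        // The point of dependency: suspend here until the player is available.
        string player = await playerTask;
        log.Add("checked " + player);
        return log;
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync()) Console.WriteLine(line);
    }
}
```

The log shows "messages sent" between "fetch started" and "fetch finished": the call started the operation, and the await only marks where the method needs the result.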
If it doesn't block the method, how does it know that player isn't null
It does block the method, or more accurately, it suspends the method.
The method suspends and does not resume until the task is complete and the value is extracted.
It doesn't block the thread. It returns, so that the caller can keep on doing work in a different workflow. When the task is complete, the continuation will be scheduled onto the current context and the method will resume.
await doesn't block the current method
Correct.
it simply executes the code left in that method and does the asynchronous operation in its own time.
No, not at all. It schedules the rest of the method to run when the asynchronous operation has finished. It does not run the rest of the method immediately. It's not allowed to run any of the rest of the code in the method until the awaited operation is complete. It just doesn't block the current thread in the process, the current thread is returned back to the caller, and can go off to do whatever it wants to do. The rest of the method will be scheduled by the synchronization context (or the thread pool, if none exists) when the asynchronous operation finishes.
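The ordering described above can be observed directly. In this sketch a TaskCompletionSource plays the part of the asynchronous operation so that the caller decides when it finishes; the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitReturnsToCaller
{
    public static async Task WorkAsync(Task operation, List<string> log)
    {
        log.Add("method: before await");
        await operation;                 // not complete yet, so control returns to the caller
        log.Add("method: after await");  // runs only once the operation has finished
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var operation = new TaskCompletionSource<bool>();

        Task work = WorkAsync(operation.Task, log);

        // We are back in the caller while the method is still suspended.
        log.Add("caller: running again, work completed = " + work.IsCompleted);

        operation.SetResult(true);       // the "asynchronous operation" finishes now
        work.GetAwaiter().GetResult();   // make sure the rest of the method has run
        return log;
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```

The caller's line is logged with "work completed = False", and "method: after await" appears only after SetResult: the method was suspended, while the thread was free to keep running the caller.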
I had 1 major thing wrong, await doesn't block the current method, it simply executes the code left in that method and does the asynchronous operation in its own time.
But it does block the method, in the sense that a method that calls await won't continue until the results are in. It just doesn't block the thread that the method is running on.
... and we need to access player, but it hasn't returned the player yet?
That simply won't happen.
async/await is ideal for doing all kinds of I/O (file, network, database, UI) without wasting a lot of threads. Threads are expensive.
But as a programmer you can write (and think) as if it were all happening synchronously.
In this code you use await because GetPlayerAsync() returns a task for work that runs asynchronously. Keep the two keywords apart: async marks a method that may contain awaits, while await asynchronously waits for a task to complete.
Try to use Task<T> as the return type.
I have the following method which commits changes to a db (using Entity Framework):
public async Task<int> CommitAsync(Info info)
{
    if (this.Database.Connection.State == ConnectionState.Closed)
        await this.Database.Connection.OpenAsync();
    await SetInfo(info);
    return await base.SaveChangesAsync();
}
Is the above method safe to use as is, or should I:
Avoid using async-await, or
Use ContinueWith
It's absolutely fine to have multiple await expressions in the same async method - it would be a relatively useless feature otherwise.
Basically, the method will execute synchronously until it reaches the first await where the awaitable involved hasn't already completed. It will then return to the caller, having set up a continuation for the awaitable to execute the rest of the async method. If execution later reaches another await expression where the awaitable hasn't already completed, a continuation is set up on that awaitable, etc.
Each time the method "resumes" from an await, it carries on where it left off, with the same local variables etc. This is achieved by the compiler building a state machine on your behalf.
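Here is a minimal sketch of that behaviour with two awaits: the first awaitable is already completed, so the method keeps running synchronously, and the second is not, so the method returns to its caller and later resumes with its locals intact. The names are hypothetical and a TaskCompletionSource stands in for the slow operation.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class MultipleAwaits
{
    public static async Task<int> SumAsync(Task<int> slow, List<string> log)
    {
        int a = await Task.FromResult(1);   // already completed: no return to the caller
        log.Add("after first await, a = " + a);

        int b = await slow;                 // not completed: the method returns here
        log.Add("after second await");

        return a + b;                       // 'a' was preserved across the suspension
    }

    public static int Run(List<string> log)
    {
        var slow = new TaskCompletionSource<int>();

        Task<int> sum = SumAsync(slow.Task, log);
        log.Add("caller: resumed");         // logged while SumAsync is suspended

        slow.SetResult(41);                 // lets SumAsync carry on where it left off
        return sum.GetAwaiter().GetResult();
    }

    public static void Main()
    {
        var log = new List<string>();
        Console.WriteLine(Run(log));        // prints 42
        foreach (var line in log) Console.WriteLine(line);
    }
}
```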
Your code looks perfect. It gives your caller the opportunity to do something useful at moments you are waiting instead of everyone doing a busy wait until everything is finished.
The nice thing about using async-await instead of ContinueWith is that your code looks fairly synchronous. It is easy to see in which order the statements will be executed. ContinueWith also lets you specify the order, but it is a bit more difficult to see.
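To illustrate the readability difference, here is a sketch of the same three sequential steps written both ways. OpenAsync, SetInfoAsync and SaveAsync are hypothetical stand-ins for the calls in CommitAsync; they only wait briefly and record their name.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialSteps
{
    // Hypothetical stand-ins for OpenAsync, SetInfo and SaveChangesAsync.
    static async Task OpenAsync(List<string> log) { await Task.Delay(10); log.Add("open"); }
    static async Task SetInfoAsync(List<string> log) { await Task.Delay(10); log.Add("set info"); }
    static async Task<int> SaveAsync(List<string> log) { await Task.Delay(10); log.Add("save"); return 1; }

    // With await the order of the steps is the order of the statements.
    public static async Task<int> CommitWithAwait(List<string> log)
    {
        await OpenAsync(log);
        await SetInfoAsync(log);
        return await SaveAsync(log);
    }

    // With ContinueWith each step is a callback that returns a task, so the result
    // must be unwrapped. Note: these continuations run even if the previous step
    // failed; real code would have to check for faults explicitly.
    public static Task<int> CommitWithContinueWith(List<string> log)
    {
        return OpenAsync(log)
            .ContinueWith(_ => SetInfoAsync(log)).Unwrap()
            .ContinueWith(_ => SaveAsync(log)).Unwrap();
    }

    public static async Task Main()
    {
        var log = new List<string>();
        Console.WriteLine(await CommitWithAwait(log));        // prints 1
        Console.WriteLine(await CommitWithContinueWith(log)); // prints 1
        Console.WriteLine(string.Join(", ", log));
    }
}
```

Both versions run the steps in the same order, but only the await version also propagates an exception from an earlier step without extra code.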
If a thread enters an async procedure it executes the procedure until it meets an await. Instead of waiting until the awaited procedure is finished, control is given back to your caller who can continue performing the next statements until he meets an await, where control is given to his caller etc. Once everyone is awaiting and your OpenAsync is finished the thread continues doing the statements after OpenAsync until it meets another await.
Someone here on Stack Overflow (alas, I have lost his name) once explained the async-await process to me with a breakfast metaphor.
A very useful introduction is the one by Stephen Cleary about async-await. It also helps you understand how async-await prevents problems with InvokeRequired.
I'm looking at some code written a while back that is making me very nervous. The general shape of the methods in question is like this:
public Task Foo(...){
    SyncMethod();
    SyncMethod();
    ...
    return AsyncMethod();
}
I realize I can just mark the method async and do an await on the last call, but...do I have to? Is this safe to use as is? The sync methods that are called before the last async method do not have asynchronous alternatives and the Foo method does not care about the result of the AsyncMethod, so it looks like it's ok to just return the awaitable to the caller and let the caller deal with it.
Also, FWIW, the code in question might be called from a thread that's running with an active ASP.NET context, so I'm getting that tingling feeling in my brain that there's some invisible evil that I'm not seeing here?
I realize I can just mark the method async and do an await on the last call, but...do I have to?
Actually, if you did this, the only change would be that exceptions thrown by either of those synchronous methods would be wrapped into the returned Task, whereas in the current implementation the method throws without ever returning a Task. What work is done synchronously and what work is done asynchronously is entirely unaffected.
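The exception difference is easy to see in a small sketch. SyncMethod here is a hypothetical method that always throws, and Describe is a helper added only to report what the caller observes.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionTiming
{
    // Hypothetical synchronous step that fails.
    static void SyncMethod() => throw new InvalidOperationException("sync failure");

    // Shape from the question: no async keyword, the Task is returned directly.
    public static Task FooPlain()
    {
        SyncMethod();                 // the exception escapes to the caller right here
        return Task.CompletedTask;
    }

    // Same body, marked async.
    public static async Task FooAsync()
    {
        SyncMethod();                 // the exception is captured into the returned Task
        await Task.CompletedTask;
    }

    // Reports what a caller sees when it invokes the method without awaiting it.
    public static string Describe(Func<Task> call)
    {
        try
        {
            Task task = call();
            return task.IsFaulted ? "returned a faulted Task" : "returned a healthy Task";
        }
        catch (InvalidOperationException)
        {
            return "threw before returning a Task";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(FooPlain)); // prints "threw before returning a Task"
        Console.WriteLine(Describe(FooAsync)); // prints "returned a faulted Task"
    }
}
```

A caller that awaits the result immediately sees the same exception either way. The difference only matters to callers that store the Task and await it later.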
Having said that, both of the options you've mentioned are worrisome. Here you have a method that appears to be asynchronous, meaning someone calling it expects it to return more or less right away, but in reality the method will run synchronously for some amount of time.
If your two synchronous methods are really fast and, as a result, you're confident that this very small amount of synchronous work won't be noticeable to any of your callers, then that's fine. If, however, that work will (even potentially) take a non-trivial amount of time to run, then you have a problem on your hands.
Having actually asynchronous alternatives for those methods would be best, but as a last resort, until you have a better option, often the best you can manage to do is to await Task.Run(() => SyncMethod()); for those methods (which of course means the method now needs to be marked as async).
Microsoft announced the Visual Studio Async CTP today (October 28, 2010) that introduces the async and await keywords into C#/VB for asynchronous method execution.
First I thought that the compiler translates the keywords into the creation of a thread but according to the white paper and Anders Hejlsberg's PDC presentation (at 31:00) the asynchronous operation happens completely on the main thread.
How can I have an operation executed in parallel on the same thread? How is it technically possible and to what is the feature actually translated in IL?
It works similarly to the yield return keyword in C# 2.0.
An asynchronous method is not actually an ordinary sequential method. It is compiled into a state machine (an object) with some state (local variables are turned into fields of the object). Each block of code between two uses of await is one "step" of the state machine.
This means that when the method starts, it just runs the first step and then the state machine returns and schedules some work to be done - when the work is done, it will run the next step of the state machine. For example this code:
async Task Demo() {
    var v1 = foo();
    var v2 = await bar();
    more(v1, v2);
}
Would be translated to something like:
class _Demo {
    int _v1, _v2;
    int _state = 0;
    Task<int> _await1;
    public void Step() {
        switch (this._state) {
            case 0:
                this._v1 = foo();
                this._await1 = bar();
                // When the async operation completes, it will call this method again
                this._state = 1;
                // SetContinuation is a made-up name for "register a callback on the operation"
                this._await1.SetContinuation(Step);
                return; // give control back to the caller until the operation completes
            case 1:
                this._v2 = this._await1.Result; // Get the result of the operation
                more(this._v1, this._v2);
                break;
        }
    }
}
The important part is that it just uses the SetContinuation method to specify that when the operation completes, it should call the Step method again (and the method knows that it should run the second bit of the original code using the _state field). You can easily imagine that the SetContinuation would be something like btn.Click += Step, which would run completely on a single thread.
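For readers who want to run the idea, here is a compilable approximation of such a state machine next to the async method it imitates. It uses the real TaskAwaiter<int>.OnCompleted method where the pseudocode above uses SetContinuation. Foo and BarAsync are hypothetical stand-ins, and the real compiler output differs in many details (builders, context capture, struct state machines), so treat this as a sketch only.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class HandWrittenStateMachine
{
    // Hypothetical stand-ins for foo() and bar().
    static int Foo() => 40;
    static async Task<int> BarAsync() { await Task.Delay(20); return 2; }

    // What we would normally write.
    public static async Task<int> DemoAsync()
    {
        var v1 = Foo();
        var v2 = await BarAsync();
        return v1 + v2;
    }

    // A hand-written approximation of the generated state machine.
    private sealed class DemoMachine
    {
        private int _state;                  // which step to run next
        private int _v1;                     // the local variable, now a field
        private TaskAwaiter<int> _awaiter;   // the operation being awaited
        public readonly TaskCompletionSource<int> Completion = new TaskCompletionSource<int>();

        public void Step()
        {
            try
            {
                switch (_state)
                {
                    case 0:
                        _v1 = Foo();
                        _awaiter = BarAsync().GetAwaiter();
                        _state = 1;
                        if (!_awaiter.IsCompleted)
                        {
                            _awaiter.OnCompleted(Step);  // call Step again when done
                            return;                      // back to the caller for now
                        }
                        goto case 1;                     // already done: continue synchronously
                    case 1:
                        int v2 = _awaiter.GetResult();
                        Completion.SetResult(_v1 + v2);
                        return;
                }
            }
            catch (Exception ex)
            {
                Completion.SetException(ex);             // failures end up in the returned task
            }
        }
    }

    public static Task<int> DemoByHand()
    {
        var machine = new DemoMachine();
        machine.Step();                                  // run the first step synchronously
        return machine.Completion.Task;
    }

    public static async Task Main()
    {
        Console.WriteLine(await DemoAsync());   // prints 42
        Console.WriteLine(await DemoByHand());  // prints 42
    }
}
```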
The asynchronous programming model in C# is very close to F# asynchronous workflows (in fact, it is essentially the same thing, aside from some technical details), and writing reactive single-threaded GUI applications using async is quite an interesting area - at least I think so - see for example this article (maybe I should write a C# version now :-)).
The translation is similar to iterators (and yield return) and in fact, it was possible to use iterators to implement asynchronous programming in C# earlier. I wrote an article about that a while ago - and I think it can still give you some insight on how the translation works.
How can I have an operation executed in parallel on the same thread?
You can't. Asynchrony is not "parallelism" or "concurrency". Asynchrony might be implemented with parallelism, or it might not be. It might be implemented by breaking up the work into small chunks, putting each chunk of work on a queue, and then executing each chunk of work whenever the thread happens to be not doing anything else.
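The "chunks on a queue" implementation can be sketched in a few lines. Two jobs, A and B, are each split into two chunks, and a single loop on a single thread executes whatever is at the front of the queue. The job names and chunk counts are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Generic;

public static class SingleThreadQueueSketch
{
    public static List<string> Run()
    {
        var queue = new Queue<Action>();
        var log = new List<string>();

        // Enqueue one chunk of a job; when it runs, it enqueues the job's next chunk.
        void Schedule(string job, int step, int totalSteps)
        {
            queue.Enqueue(() =>
            {
                log.Add(job + step);
                if (step < totalSteps) Schedule(job, step + 1, totalSteps);
            });
        }

        Schedule("A", 1, 2);
        Schedule("B", 1, 2);

        // The "message loop": one thread, one chunk at a time.
        while (queue.Count > 0) queue.Dequeue()();

        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "A1, B1, A2, B2"
    }
}
```

The two jobs make progress in an interleaved fashion, yet nothing ever runs in parallel and no second thread exists.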
I've got a whole series of articles on my blog about how all this stuff works; the one directly germane to this question will probably go up Thursday of next week. Watch this link for details.
As I understand it, what the async and await keywords do is that every time an async method employs the await keyword, the compiler will turn the remainder of the method into a continuation that is scheduled when the async operation is completed. That allows async methods to return to the caller immediately and resume work when the async part is done.
According to the available papers there are a lot of details to it, but unless I am mistaken, that is the gist of it.
As I see it, the purpose of async methods is not to run a lot of code in parallel, but to chop up async methods into a number of small chunks that can be called as needed. The key point is that the compiler will handle all the complex wiring of callbacks using tasks/continuations. This not only reduces complexity, but allows async methods to be written more or less like traditional synchronous code.