When exactly are synchronous continuations dangerous?


In a blog post, Microsoft's Sergey Tepliakov says:
you should always provide
TaskCreationOptions.RunContinuationsAsynchronously when creating
TaskCompletionSource instances.
But in that article, unless I misunderstood it, he also says that all await continuations basically act the same way as a TaskCompletionSource without TaskCreationOptions.RunContinuationsAsynchronously. So this would mean that async and await are also inherently dangerous and shouldn't be used, unless await Task.Yield() is used as well.
(edit: I did misunderstand it. He's saying that await continuing synchronously is the cause of the problematic SetResult behaviour. await Task.Yield() is recommended as a client-side workaround if the task creation can't be controlled at the source. Unless I misunderstood something again.)
I conclude that there must be specific circumstances that make synchronous continuations dangerous, and apparently that those circumstances are common for TaskCompletionSource use cases. What are those dangerous circumstances?

Thread theft, basically.
A good example here would be a client library that talks over the network to some server that is implemented as a sequence of frames on the same connection (http/2, for example, or RESP). For whatever reason, imagine that each call to the library returns a Task<T> (for some <T>) that has been provided by a TaskCompletionSource<T> that is awaiting results from the network server.
Now you want to write the read loop that dequeues responses from the server (presumably from a socket, stream, or pipeline API) and resolves the corresponding TaskCompletionSource<T> (which could mean "the next in the queue", or could mean "use a unique correlation key/token that is present in the response").
So you effectively have:
while (communicationIsAlive)
{
    var result = await ParseNextResultAsync(); // next message from the server
    TaskCompletionSource<Foo> tcs = TryResolveCorrespondingPendingRequest(result);
    tcs?.TrySetResult(result); // or possibly TrySetException, etc
}
Now, imagine that you didn't use RunContinuationsAsynchronously when you created the TaskCompletionSource<T>. The moment you call TrySetResult, the continuation executes on the current thread. This means that the await in some arbitrary client code (which can do anything) has now interrupted your IO read loop, and no other results will be processed until that client code relinquishes the thread.
Using RunContinuationsAsynchronously allows you to use TrySetResult (et al) without worrying about your thread being stolen.
(even worse: if you combine this with a subsequent sync-over-async call to the same resource, you can cause a hard deadlock)
Related question from before this flag existed: How can I prevent synchronous continuations on a Task?.
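The thread theft described above can be observed with a small experiment. This is a minimal sketch, not the answerer's code: the helper name ContinuationRanOnSetterThreadAsync is hypothetical, a dedicated Thread stands in for the IO read loop, and it assumes a console program with no SynchronizationContext.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: reports whether the awaiting "client" code resumed
// on the very thread that called TrySetResult (the stand-in read loop).
async Task<bool> ContinuationRanOnSetterThreadAsync(TaskCreationOptions options)
{
    var tcs = new TaskCompletionSource<int>(options);
    Thread continuationThread = null;

    // Arbitrary client code awaiting the library's task.
    async Task ClientAsync()
    {
        await tcs.Task.ConfigureAwait(false);
        continuationThread = Thread.CurrentThread; // where did we resume?
    }
    Task client = ClientAsync(); // runs synchronously up to the await

    // Stand-in for the read loop: a dedicated thread completes the task.
    var setter = new Thread(() => tcs.TrySetResult(42));
    setter.Start();
    setter.Join();

    await client;
    return ReferenceEquals(continuationThread, setter);
}

bool stolenByDefault = await ContinuationRanOnSetterThreadAsync(TaskCreationOptions.None);
bool stolenWithFlag = await ContinuationRanOnSetterThreadAsync(
    TaskCreationOptions.RunContinuationsAsynchronously);

Console.WriteLine($"Without the flag, client code ran on the setter thread: {stolenByDefault}");
Console.WriteLine($"With the flag, client code ran on the setter thread: {stolenWithFlag}");
```

Under these assumptions, the first call should report that the client continuation ran inline on the setter thread, while the second should report that it was queued elsewhere, leaving the setter free to keep reading.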

Related

Is calling Task.Yield in Asp.Net Core ensuring continuation to run on the ThreadPool?

So Task.Yield will yield execution back to the caller immediately, essentially turning whatever is below it into a continuation of the awaitable returned by Task.Yield.
Assuming an Asp.Net Core application which does not have a SynchronizationContext, is there any functional difference from awaiting Task.Yield and passing that continuation to Task.Run?

Assuming an Asp.Net Core application which does not have a SynchronizationContext, is there any functional difference from awaiting Task.Yield and passing that continuation to Task.Run?
There's practically no difference. await Task.Yield will queue the continuation to the thread pool and return to the caller. await Task.Run will queue its delegate to the thread pool and return to the caller.
Personally, I prefer Task.Run because it has a very explicit intent and because it's unambiguous (i.e., in other scenarios such as GUI threads, await Task.Yield will not queue to the thread pool, while await Task.Run always does the same thing regardless of context).
As others have noted in the comments, there's no reason to actually do either one on ASP.NET Core, which is already running on thread pool threads and does not have a request context. It would add a little overhead to the request while providing no benefit at all.
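To make the comparison concrete, here is a small sketch that checks where the code ends up in each case. It assumes a plain console program (no SynchronizationContext, default TaskScheduler), which is similar to ASP.NET Core in that respect; the method names ResumeViaYieldAsync and RunViaTaskRun are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// After Task.Yield, the rest of the method is a continuation that has to be
// queued somewhere; with no context, that is the thread pool.
async Task<bool> ResumeViaYieldAsync()
{
    await Task.Yield();
    return Thread.CurrentThread.IsThreadPoolThread;
}

// Task.Run always queues its delegate to the thread pool.
Task<bool> RunViaTaskRun() =>
    Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);

bool yieldOnPool = await ResumeViaYieldAsync();
bool runOnPool = await RunViaTaskRun();

Console.WriteLine($"After Task.Yield, on a thread pool thread: {yieldOnPool}");
Console.WriteLine($"Inside Task.Run, on a thread pool thread: {runOnPool}");
```

In a GUI application the first result would differ, because Task.Yield would post the continuation back to the UI context, which is the ambiguity the answer mentions.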

They are different, if you look at it from an educational point of view. Better explanations can be found here, here and... if you want to see how it compiles to IL, take a look here.
In real-world projects, the only purpose and benefit I have seen from Task.Yield is to force a method to run asynchronously (marking a method as async/await doesn't ensure that it will run asynchronously, especially with the new ValueTask).
So forcing an immediate return of execution to the caller means that you don't care about the final result (for example, you call an API but never care about the response).
In my opinion, an ASP.NET Core server-side process has a better approach for that: build a FireAndForget service and pass the task to it. Letting it run in a separate scope would be much safer.
That's because Task.Yield will capture the current context and continue with it. So an exception is very likely if you use a resource tied to the current execution scope. An HTTP request is a particular case, since it normally finishes in less than a second (by which time HttpContext has already been disposed).

How to use ConfigureAwait on async methods

I am looking into the correct uses of ConfigureAwait. I have identified a valid use case for ConfigureAwait (i.e. the code after the await does not require the calling thread's synchronization context), but I couldn't find any advice on using ConfigureAwait on async methods.
My code looks like the following
public async Task StartMyProgram()
{
    await RunBackgroundTask();
}
private async Task RunBackgroundTask()
{
    await Task.Delay(5000);
}
To use ConfigureAwait correctly, am I right in assuming that it should be used on both await calls, as in the code below:
public async Task StartMyProgram()
{
    await RunBackgroundTask().ConfigureAwait(false);
}
private async Task RunBackgroundTask()
{
    await Task.Delay(5000).ConfigureAwait(false);
}
or do I just need it on the private RunBackgroundTask method?

or do I just need it on the private RunBackgroundTask method?
Each method should make its ConfigureAwait(false) decision on its own. This is because each method captures its own context at await, regardless of what its caller/called methods do. ConfigureAwait configures a single await; it doesn't "flow" at all.
So, RunBackgroundTask needs to determine "do I need to resume on my context?" If no, then it should use ConfigureAwait(false).
And StartMyProgram needs to determine "do I need to resume on my context?" If no, then it should use ConfigureAwait(false).

This is a simplification, but you can assume that ConfigureAwait(false) is a subtle way to say "hey, the stuff you are about to call will not grab the current synchronization context".
The keyword here is current: the synchronization context is used to, well, synchronize with the asynchronous state machine. Your asynchronous methods are turned into tasks, and the whole sequence must return only when all of them complete as you requested.
To perform such synchronization, the inner task scheduler requires a synchronization context. When you are writing a library, you have no idea what the caller is doing, and in particular you are not aware of additional threads that may be running asynchronous methods (e.g. concurrent asynchronous methods in different threads, or message pumps).
For this reason, you play it safe by calling ConfigureAwait(false), telling the runtime not to capture the caller's synchronization context, so that the continuation can run without it.
Why would you do that? First, because borrowing something in a non-deterministic state is not nice. But more importantly, to avoid deadlocks: during the execution of your asynchronous method you are, by default, using the captured context of the caller. This means that you might end up with deadlocks and/or subtle issues, because the thread that is required to run the continuation can be blocked by your method.
By default, when you use async/await, the method resumes on the captured context, which in a GUI or classic ASP.NET application means the thread or request context that started it. However, if another long-running operation has taken over that thread, you will be stuck waiting for it to complete. To avoid this issue, you can call ConfigureAwait with a false parameter. This tells the Task that it can resume on any available thread instead of waiting for the context that originally started it. This can speed up responses and avoids many deadlocks.
With ConfigureAwait(false) (ConfigureAwait(true) being the default), when you resume on another thread the synchronization context is lost, and with it culture and/or language settings along with other things like HttpContext.Current (this happens in .NET Standard).
As a rule of thumb, you should always use ConfigureAwait(false) in library code, and also in your own code when it is multi-threaded. This is a case where the default behaviour may not be suitable.

When entering RunBackgroundTask, you have no clue of what the SynchronizationContext is. So you really don't need to capture it and should keep using .ConfigureAwait(false).

Do you have to await async methods?

Let's say you have a service API call. The callee is somewhat performance critical, so in order not to hold up the API call any longer than necessary, there's a SaveAsync() method that is used. I can't await it, however, because that would hold up the API call just as long as (or perhaps even longer than) the non-async version.
The reason I'm asking is this: If you don't await the call, is there a chance the Task object returned gets garbage collected? And if so, would that interrupt the running task?

The reason I'm asking is this: If you don't await the call, is there a chance the Task object returned gets garbage collected?
Generally, no, that shouldn't happen. The underlying TaskScheduler which queues the task, usually keeps a reference to it for the desired life-time until it completes. You can see that in the documentation of TaskScheduler.QueueTask:
A typical implementation would store the task in an internal data structure, which would be serviced by threads that would execute those tasks at some time in the future.
Your real problem is going to be with the ASP.NET SynchronizationContext, which keeps track of any on-going asynchronous operation at runtime. If your controller action finishes prior to the async operation, you're going to get an exception.
If you want to have "fire and forget" operations in ASP.NET, you should make sure to register them with the ASP.NET runtime, either via HostingEnvironment.QueueBackgroundWorkItem or BackgroundTaskManager

No, it won't interrupt the running task, but you won't observe the exceptions from the task either, which is not exactly good. You can (at least partially) avoid that by wrapping all running code in a try ... catch and log the exception.
Also, if you're inside asp.net, then your whole application could be stopped or recycled, and in this case your task will be interrupted. This is harder to avoid - you can register for AppPool shutdown notification, or use something like Hangfire.
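The try ... catch wrapping mentioned above could look roughly like the following. This is only a sketch under stated assumptions: ForgetSafelyAsync is a hypothetical helper, SaveAsync here is a stand-in that always fails, and the in-memory errorLog replaces whatever logging the application really uses. It does not address application shutdown or recycling.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Stand-in for a real logger (assumption for this sketch).
var errorLog = new ConcurrentQueue<string>();

// Hypothetical helper: runs the work and records any exception, so a
// failure in un-awaited work is not silently lost.
async Task ForgetSafelyAsync(Func<Task> work)
{
    try
    {
        await work().ConfigureAwait(false);
    }
    catch (Exception ex)
    {
        errorLog.Enqueue(ex.Message);
    }
}

// Stand-in for the question's SaveAsync(); it fails on purpose.
async Task SaveAsync()
{
    await Task.Delay(10);
    throw new InvalidOperationException("save failed");
}

// The caller starts the work and moves on without awaiting it.
Task background = ForgetSafelyAsync(SaveAsync);

// Awaited here only so the demo can inspect the log before exiting.
await background;
Console.WriteLine($"Logged errors: {errorLog.Count}");
```

Because the helper swallows and logs the exception, the returned task never faults, which is what makes it reasonable to leave it un-awaited in real code.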

Web API Sync Calls Best Practice

This question has probably been asked already, but I never found a definitive answer. Let's say that I have a Web API 2.0 application hosted on IIS. I think I understand that the best practice (to prevent deadlocks on the client) is to always use async methods from the GUI event down to the HttpClient calls. This is good and it works. But what is the best practice when the client application does not have a GUI (e.g. a Windows Service or a Console Application) and only has synchronous methods from which to make the call? In this case, I use the following logic:
void MySyncMethodOnMyWindowServiceApp()
{
    // Result is a property, not a method
    var list = GetDataAsync().Result.ToObject<List<MyClass>>();
}
async Task<JArray> GetDataAsync()
{
    var response = await Client.GetAsync(<...>).ConfigureAwait(false);
    return await response.Content.ReadAsAsync<JArray>().ConfigureAwait(false);
}
But unfortunately this can still cause deadlocks on the client that occur at random times on random machines.
The client app stops at this point and never returns:
var response = await Client.GetAsync(<...>).ConfigureAwait(false);

If it's something that can be run in the background and isn't forced to be synchronous, try wrapping the code (that calls the async method) in a Task.Run(). I'm not sure that'll solve a "deadlock" problem (if it's something out of sync, that's another issue), but if you want to benefit from async/await, if you don't have async all the way down, I'm not sure there's a benefit unless you run it in a background thread. I had a case where adding Task.Run() in a few places (in my case, from an MVC controller which I changed to be async) and calling async methods not only improved performance slightly, but it improved reliability (not sure that it was a "deadlock" but seemed like something similar) under heavier load.
You will find that using Task.Run() is regarded by some as a bad way to do it, but I really couldn't see a better way to do it in my situation, and it really did seem to be an improvement. Perhaps this is one of those things where there's the ideal way to do it vs. the way to make it work in the imperfect situation that you're in. :-)
[Updated due to requests for code]
So, as someone else posted, you should do "async all the way down". In my case, my data wasn't async, but my UI was. So, I went async down as far as I could, then I wrapped my data calls with Task.Run in such a way that it made sense. That's the trick, I think: figure out whether it makes sense for things to run in parallel, otherwise you're just being synchronous (if you use async and immediately resolve it, forcing it to wait for the answer). I had a number of reads that I could perform in parallel.
In the above example, I think you have to async up as far as makes sense, and then at some point determine where you can spin off a thread and perform the operation independently of the other code. Let's say you have an operation that saves data, but you don't really need to wait for a response: you're saving it and you're done. The only thing you might have to watch out for is not to close the program without waiting for that thread/task to finish. Where it makes sense in your code is up to you.
Syntax is pretty easy. I took existing code, changed the controller to an async returning a Task of my class that was formerly being returned.
var myTask = Task.Run(() =>
{
    // ...some code that can run independently.... In my case, loading data
});
// ...other code that can run at the same time as the above....
await Task.WhenAll(myTask, otherTask);
// ...or...
await myTask;
// At this point, the result is available from the task
myDataValue = myTask.Result;
See MSDN for probably better examples:
https://msdn.microsoft.com/en-us/library/hh195051(v=vs.110).aspx
[Update 2, more relevant for the original question]
Let's say that your data read is an async method.
private async Task<MyClass> Read()
You can call it, save the task, and await on it when ready:
var runTask = Read();
//... do other code that can run in parallel
await runTask;
So, for this purpose, calling async code, which is what the original poster is requesting, I don't think you need Task.Run(), although I don't think you can use "await" unless you're an async method -- you'll need an alternate syntax for Wait.
The trick is that without having some code to run in parallel, there's little point in it, so thinking about multi-threading is still the point.

Using Task<T>.Result is the equivalent of Wait which will perform a synchronous block on the thread. Having async methods on the WebApi and then having all the callers synchronously blocking them effectively makes the WebApi method synchronous. Under load you will deadlock if the number of simultaneous Waits exceeds the server/app thread pool.
So remember the rule of thumb "async all the way down". You want the long-running task (getting a List collection) to be async. If the calling method must be sync, you want to make that conversion from async to sync (using either Result or Wait) as close to the "ground" as possible. Keep the long-running process async and keep the sync portion as short as possible. That will greatly reduce the length of time that threads are blocked.
So for example you can do something like this.
void MySyncMethodOnMyWindowServiceApp()
{
    List<MyClass> myClasses = GetMyClassCollectionAsync().Result;
}
// async is required because the method body uses await
async Task<List<MyClass>> GetMyClassCollectionAsync()
{
    var data = await GetDataAsync(); // <- long running call to remote WebApi?
    return data.ToObject<List<MyClass>>();
}
The key part is the long running task remains async and not blocked because await is used.
Also, don't confuse responsiveness with scalability. Both are valid reasons for async. Yes, responsiveness is a reason for using async (to avoid blocking the UI thread). You are correct that this wouldn't apply to a back-end service; however, this isn't why async is used on a WebApi. The WebApi is also a non-GUI back-end process. If the only advantage of async code were responsiveness of the UI layer, then WebApi would be sync code from start to finish. The other reason for using async is scalability (avoiding deadlocks), and this is the reason why WebApi calls are plumbed async. Keeping the long-running processes async helps IIS make more efficient use of a limited number of threads. By default there are only 12 worker threads per core. This can be raised, but that isn't a magic bullet either, as threads are relatively expensive (about 1MB overhead per thread). await allows you to do more with less: more concurrent long-running processes on fewer threads before a deadlock occurs.

The problem you are having with deadlocks must stem from something else. Your use of ConfigureAwait(false) prevents deadlocks here. Solve the bug and you are fine.
See Should we switch to use async I/O by default? to which the answer is "no". You should decide on a case by case basis and choose async when the benefits outweigh the costs. It is important to understand that async IO has a productivity cost associated with it. In non-GUI scenarios only a few targeted scenarios derive any benefit at all from async IO. The benefits can be enormous, though, but only in those cases.
Here's another helpful post: https://stackoverflow.com/a/25087273/122718

Async methods don't require additional threads?

In MSDN, there is a paragraph like this:
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active. You can use Task.Run to move CPU-bound work to a
background thread, but a background thread doesn't help with a process
that's just waiting for results to become available.
But it looks like I need a little more help with the bold text, since I am not sure what exactly it means. So how does it become async without using threads?
Source: http://msdn.microsoft.com/en-us/library/hh191443.aspx

There are many asynchronous operations which don't require the use of multiple threads. Things like Asynchronous IO work by having interrupts which signal when data is available. This allows you to have an asynchronous call which isn't using extra threads - when the signal occurs, the operation completes.
Task.Run can be used to make your own CPU-based async methods, which will run on its own separate thread. The paragraph was intended to show that this isn't the only option, however.
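As a rough illustration of waiting without dedicating a thread to each wait, the sketch below starts many timer-based delays at once. Task.Delay is used here as a stand-in for an IO operation that completes on a signal; the counts and durations are arbitrary choices for this example.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

const int operationCount = 10_000;
const int delayMs = 200;

var stopwatch = Stopwatch.StartNew();

// Start every wait up front. Each pending delay is a timer registration,
// not a blocked thread, so thousands can be outstanding at once.
Task[] pending = Enumerable.Range(0, operationCount)
    .Select(_ => Task.Delay(delayMs))
    .ToArray();

await Task.WhenAll(pending);
stopwatch.Stop();

Console.WriteLine(
    $"{operationCount} waits of {delayMs} ms finished in {stopwatch.ElapsedMilliseconds} ms");
```

If each wait needed its own blocked thread, or if the waits ran one after another, the total would be far larger than a single delay; here the elapsed time should be close to one delay period, although the exact figure depends on the machine.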

async/await is not just about using more threads. It's about using the threads you have more effectively. When operations block, such as waiting on a download or file read, the async/await pattern allows you to use that existing thread for something else. The compiler handles all the magic plumbing underneath, making it much easier to develop with.
See http://msdn.microsoft.com/en-us/magazine/hh456401.aspx for the problem description and the whitepaper at http://www.microsoft.com/en-us/download/details.aspx?id=14058.

Not the code generated by the async and await keywords themselves, no. They create code that runs on your current thread, assuming it has a synchronization context. If it doesn't, then you actually do get threads, but that's using the pattern for no good reason. It is the await expression, i.e. what you write on the right side of the await keyword, that causes threads to run.
But that thread is often not observable; it may be a device driver thread, which reports that it is done via an I/O completion port. That is pretty common, and I/O is always a good reason to use await. If not already forced on you by WinRT, it is the real reason that async/await got added.
A note about "having a synchronization context": you have one on a thread if the SynchronizationContext.Current property is not null. This is almost only ever the case on the main thread of a GUI app, which is also the only place where you normally worry about delays freezing your user interface.

Essentially, this is what happens when you call an async method without awaiting it:
Start the method and do as much as possible synchronously.
When necessary, pause the method and put the rest of it into a continuation.
When the async part is completed (is no longer being waited on), schedule the continuation to run on the same thread.
Whatever you want can run on this thread as normal. You can even examine/manipulate the Task returned from the async method.
When the thread becomes available, it will run the rest of your method.
The 'async part' could be file IO, a web request, or pretty much anything, as long as calling code can wait on this task to complete. This includes, but is not limited to, a separate thread. As Reed Copsey pointed out, there are other ways of performing async operations, like interrupts.
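The sequence of steps above can be traced with a short sketch. The names WorkAsync and log are made up for this example, and Task.Delay stands in for the "async part". Note that in a console program the continuation resumes on a thread pool thread rather than the original thread, so this only demonstrates the ordering, not the thread affinity.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var log = new ConcurrentQueue<string>();

async Task WorkAsync()
{
    // Runs synchronously on the caller's thread, up to the first await.
    log.Enqueue("1: method starts synchronously");
    await Task.Delay(100); // the "async part"
    // The rest of the method is the continuation.
    log.Enqueue("3: continuation runs after the async part completes");
}

// Called without await: control returns here at the first incomplete await.
Task work = WorkAsync();
log.Enqueue("2: caller continues while the async part is pending");

// The returned Task can be examined or awaited later.
await work;

foreach (string entry in log)
{
    Console.WriteLine(entry);
}
```

The numbered prefixes in the log entries match the order in which the lines are recorded.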
