Debugging async tasks that don't return - C#

I'm working on a small WPF desktop utility and am using async/await to let several operations run concurrently.
However, I keep running into issues where an awaited async Task simply never returns. No exceptions are thrown, and if I pause the app in the debugger, the call stack shows it running "External code" called from the line where the async task is called. (Specifically, in my current example it hangs at: WindowsBase.dll!System.Windows.Threading.DispatcherSynchronizationContext.Wait(System.IntPtr[] waitHandles, bool waitAll, int millisecondsTimeout).)
I've seen this several times and worked around it, but each time I have to add debugging code to figure out which line things stopped executing on. (The same code might execute fine for several iterations and then hang.) The same issue has happened both when awaiting my own async methods and when awaiting framework async methods (e.g. Stream.CopyToAsync and Stream.ReadAsync).
Is there a way to look into executing Tasks in Visual Studio 2017? I tried opening the "Tasks" window, but I've never gotten anything but "No tasks to display" -- possibly I'm not using that window correctly?
FWIW, I'm doing a lot (hundreds) of concurrent background operations, but none overlap. Mostly web service calls, file system reads and MD5 checksum computations. Is async/await limited in what it can do concurrently without freezing up? Or is there a maximum nesting depth of awaits?

This looks like a classic case of deadlock due to the synchronization context. This happens when you wait synchronously on a task returned by a method that internally calls await. For instance:
public void Deadlock()
{
    DoSomething().Wait();
}
public async Task DoSomething()
{
    // Some stuff
    await DoSomethingElse();
    // More stuff
}
When inside of a synchronization context, the continuation of await is posted to that very same synchronization context. But the thread of the synchronization context is waiting on the DoSomething().Wait(), and therefore not available. We have a deadlock:
The continuation is waiting for the thread of the synchronization context to be available
The thread of the synchronization context is waiting for the task to be completed, which will happen when the continuation is executed
There are two possible fixes (both sketched below):
Stop waiting synchronously on tasks
Use .ConfigureAwait(false) when awaiting, so that the continuation isn't posted to the synchronization context
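A minimal sketch of both fixes, reusing the names from the snippet above (DoSomethingElse stands for any awaitable call):
// Fix 1: go async all the way up -- await the task instead of blocking on it.
public async Task NoDeadlockAsync()
{
    await DoSomething();
}
// Fix 2: in library-style code that doesn't need the captured context,
// use ConfigureAwait(false) so the continuation runs on the thread pool
// even if some caller blocks.
public async Task DoSomething()
{
    // Some stuff
    await DoSomethingElse().ConfigureAwait(false);
    // More stuff (now runs on a thread pool thread, not the blocked one)
}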

Related

Why is a blocking call inside BackgroundService.ExecuteAsync not blocking the main thread?

I'm implementing a Microsoft.Extensions.Hosting.BackgroundService in an ASP.NET Core Web API. It has a blocking call inside ExecuteAsync, but surprisingly (to me) it is actually not blocking my application, and I'm wondering why.
According to the different versions of the BackgroundService source code I could find, the ExecuteAsync method is called in a fire-and-forget fashion inside StartAsync. Source below.
public virtual Task StartAsync(CancellationToken cancellationToken)
{
    // Store the task we're executing
    _executingTask = ExecuteAsync(_stoppingCts.Token);

    // If the task is completed then return it, this will bubble cancellation and failure to the caller
    if (_executingTask.IsCompleted)
    {
        return _executingTask;
    }

    // Otherwise it's running
    return Task.CompletedTask;
}
From what I understand, the continuation of an await call (i.e. whatever is below it) will be executed in the same SynchronizationContext that called this asynchronous method after the task returns. If this is true, then why isn't this continuation code (that has a blocking call) blocking my application?
To illustrate:
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    await Task.Yield();
    Foo(); // Foo blocks its executing thread until an I/O operation completes.
}
As ExecuteAsync is never awaited, the continuation of the method (i.e. the Foo call) will be executed in the same sync context that fired the ExecuteAsync task in the first place (which will be running on the main thread, if I understood correctly).
I'm suspecting the Asp.Net runtime must have its own SynchronizationContext that actually executes async continuations in different threads or something like that.
Can anyone shed some light here?
Based on what I know, I suggest you read this first.
From that, I assume you'll see that ASP.NET Core itself has adopted a context-free approach.
Besides, when you build and run the host, usually via CreateHostBuilder(args).Build().Run();, what actually runs is the host, an instance of the IHost interface, not the IHostedService that the BackgroundService abstract class implements.
In short, anything that inherits from the BackgroundService base class can be considered a special object that is managed and executed by the host itself, not running at the same level as the thread used by the host.
You can put an operation that blocks its thread in your background service, but that blocks the thread handling your service, not the application thread itself.
UPDATE
As #underthevoid wants to look further into what I called "a different level from the thread used by the host itself", let me clarify some of my points.
First, CreateHostBuilder(args).Build().Run(); builds an instance of WebHost and executes it by calling .Run(), which behind the scenes calls the Start() method, as you can see here.
Then the magic happens on the line _hostedServiceExecutor = _applicationServices.GetRequiredService<HostedServiceExecutor>();, which is followed by a call to StartAsync(cancellationToken).
Now, look at HostedServiceExecutor: in its constructor it gets all the IHostedService instances you registered with your app, and then it executes them all, as you can briefly see in the code.
Each IHostedService is now an object executing its own Tasks, with nothing to do with the Task that handles the WebHost. They are all separate.
By "a different level from the thread used by the host itself", I was trying to make two points:
All the BackgroundService instances you use are invoked by the WebHost; without it, they can do nothing.
They run as separate tasks and therefore on separate threads (I don't think a single thread can handle multiple tasks at the same time, the way the code is implemented).
As for how the tasks are executed together so nicely, see Stephen Cleary's answer below and trust him on this.
By the way, there are also a couple of nice write-ups about BackgroundService that I learned quite a few things from.
Hope this helps!
Two points come into play: SynchronizationContext and TaskScheduler.
In ASP.NET Core, the default SynchronizationContext is null, allowing each task to run on whatever thread is available. Second, the TaskScheduler comes into play: if it runs multiple tasks on the same thread, one of them can block all the others.
But the tasks running as a background service are marked as LongRunning for the task scheduler, and the scheduler gives each long-running task its own thread. Thus your background service can't block the ASP.NET Core engine with a blocking call.
From what I understand, the continuation of an await call (ie. whatever is bellow it) will be executed in the same SynchronizationContext that called this asynchronous method after the task returns.
A bit of clarification: when await operates asynchronously, it captures the current "context" at that time (not what was the context at the beginning of the method call). This "context" is SynchronizationContext.Current, unless it is null, in which case the "context" is TaskScheduler.Current.
Note that TaskScheduler.Current is never null. If there isn't a custom TaskScheduler, then TaskScheduler.Current is the same as TaskScheduler.Default, which is a task scheduler that schedules work to the .NET thread pool. To put it more simply, if there's no custom SynchronizationContext and no custom TaskScheduler, then the continuations get queued to the thread pool.
... the same sync context ... (which will be running in the main thread if I understood correctly).
Synchronization contexts don't have a 1:1 relationship with threads. It's true that for GUI apps, a UI synchronization context will queue work to a specific UI thread. But in ASP.NET Classic (pre-Core), its synchronization context would queue work to the thread pool. Even in the UI world, it's possible to have multiple different synchronization contexts for the same UI thread (e.g., multi-windowed WPF apps used to do this, and still do AFAIK).
I'm suspecting the Asp.Net runtime must have its own SynchronizationContext that actually executes async continuations in different threads or something like that.
ASP.NET Core has no SynchronizationContext at all, so the default behavior applies: the continuations are queued to the thread pool.
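A quick way to see this for yourself (a sketch, assuming a standard Generic Host setup with an injected ILogger stored in _logger): log the current SynchronizationContext and thread id on both sides of an await inside ExecuteAsync.
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    _logger.LogInformation("Before await: SyncCtx={Ctx}, Thread={Id}",
        SynchronizationContext.Current?.ToString() ?? "null",
        Environment.CurrentManagedThreadId);

    await Task.Yield();

    // With no SynchronizationContext, this continuation is queued to the thread pool,
    // so a blocking call here ties up a pool thread rather than the host's startup path.
    _logger.LogInformation("After await: SyncCtx={Ctx}, Thread={Id}",
        SynchronizationContext.Current?.ToString() ?? "null",
        Environment.CurrentManagedThreadId);
}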
the continuation of an await call (ie. whatever is bellow it) will be executed in the same SynchronizationContext that called this asynchronous method
Only when there is a SynchronizationContext. And even then it depends on the specific context; not all SynchronizationContexts have thread affinity.
I think SynchronizationContext.Current is null in ASP.NET Core; I am certain it will not continue on the same thread after an await.

Confusion regarding threads and if asynchronous methods are truly asynchronous in C#

I was reading up on async/await and when Task.Yield might be useful and came across this post. I had a question regarding the below from that post:
When you use async/await, there is no guarantee that the method you call when you do await FooAsync() will actually run asynchronously. The internal implementation is free to return using a completely synchronous path.
This is a little unclear to me probably because the definition of asynchronous in my head is not lining up.
In my mind, since I do mainly UI dev, async code is code that does not run on the UI thread, but on some other thread. I guess in the text I quoted, a method is not truly async if it blocks on any thread (even if it's a thread pool thread for example).
Question:
If I have a long running task that is CPU bound (let's say it is doing a lot of hard math), then running that task asynchronously must be blocking some thread right? Something has to actually do the math. If I await it then some thread is getting blocked.
What is an example of a truly asynchronous method and how would they actually work? Are those limited to I/O operations which take advantage of some hardware capabilities so no thread is ever blocked?
This is a little unclear to me probably because the definition of asynchronous in my head is not lining up.
Good on you for asking for clarification.
In my mind, since I do mainly UI dev, async code is code that does not run on the UI thread, but on some other thread.
That belief is common but false. There is no requirement that asynchronous code run on any second thread.
Imagine that you are cooking breakfast. You put some toast in the toaster, and while you are waiting for the toast to pop, you go through your mail from yesterday, pay some bills, and hey, the toast popped up. You finish paying that bill and then go butter your toast.
Where in there did you hire a second worker to watch your toaster?
You didn't. Threads are workers. Asynchronous workflows can happen all on one thread. The point of the asynchronous workflow is to avoid hiring more workers if you can possibly avoid it.
If I have a long running task that is CPU bound (let's say it is doing a lot of hard math), then running that task asynchronously must be blocking some thread right? Something has to actually do the math.
Here, I'll give you a hard problem to solve. Here's a column of 100 numbers; please add them up by hand. So you add the first to the second and make a total. Then you add the running total to the third and get a total. Then, oh, hell, the second page of numbers is missing. Remember where you were, and go make some toast. Oh, while the toast was toasting, a letter arrived with the remaining numbers. When you're done buttering the toast, go keep on adding up those numbers, and remember to eat the toast the next time you have a free moment.
Where is the part where you hired another worker to add the numbers? Computationally expensive work need not be synchronous, and need not block a thread. The thing that makes computational work potentially asynchronous is the ability to stop it, remember where you were, go do something else, remember what to do after that, and resume where you left off.
Now it is certainly possible to hire a second worker who does nothing but add numbers, and then is fired. And you could ask that worker "are you done?" and if the answer is no, you could go make a sandwich until they are done. That way both you and the worker are busy. But there is not a requirement that asynchrony involve multiple workers.
If I await it then some thread is getting blocked.
NO NO NO. This is the most important part of your misunderstanding. await does not mean "go start this job asynchronously". await means "I have an asynchronously produced result here that might not be available. If it is not available, find some other work to do on this thread so that we are not blocking the thread." Await is the opposite of what you just said.
What is an example of a truly asynchronous method and how would they actually work? Are those limited to I/O operations which take advantage of some hardware capabilities so no thread is ever blocked?
Asynchronous work often involves custom hardware or multiple threads, but it need not.
Don't think about workers. Think about workflows. The essence of asynchrony is breaking up workflows into little parts such that you can determine the order in which those parts must happen, and then executing each part in turn, but allowing parts that do not have dependencies with each other to be interleaved.
In an asynchronous workflow you can easily detect places in the workflow where a dependency between parts is expressed. Such parts are marked with await. That's the meaning of await: the code which follows depends upon this portion of the workflow being completed, so if it is not completed, go find some other task to do, and come back here later when the task is completed. The whole point is to keep the worker working, even in a world where needed results are being produced in the future.
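To make the toast analogy concrete, here is a tiny sketch (the method names are hypothetical) of a single-threaded asynchronous workflow: nothing blocks while the "toast" operation is in flight, and the same thread does other work until the result is actually needed.
static async Task MakeBreakfastAsync()
{
    Task toast = ToastBreadAsync(); // start the toaster; no worker is blocked
    PayBills();                     // keep working on the same thread meanwhile
    await toast;                    // pick the toast back up once it pops
    ButterToast();
}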
I was reading up on async/await
May I recommend my async intro?
and when Task.Yield might be useful
Almost never. I find it occasionally useful when doing unit testing.
In my mind, since I do mainly UI dev, async code is code that does not run on the UI thread, but on some other thread.
Asynchronous code can be threadless.
I guess in the text I quoted, a method is not truly async if it blocks on any thread (even if it's a thread pool thread for example).
I would say that's correct. I use the term "truly async" for operations that do not block any threads (and that are not synchronous). I also use the term "fake async" for operations that appear asynchronous but only work that way because they run on or block a thread pool thread.
If I have a long running task that is CPU bound (let's say it is doing a lot of hard math), then running that task asynchronously must be blocking some thread right? Something has to actually do the math.
Yes; in this case, you would want to define that work with a synchronous API (since it is synchronous work), and then you can call it from your UI thread using Task.Run, e.g.:
var result = await Task.Run(() => MySynchronousCpuBoundCode());
If I await it then some thread is getting blocked.
No; the thread pool thread would be used to run the code (not actually blocked), and the UI thread is asynchronously waiting for that code to complete (also not blocked).
What is an example of a truly asynchronous method and how would they actually work?
NetworkStream.WriteAsync (indirectly) asks the network card to write out some bytes. There is no thread responsible for writing out the bytes one at a time and waiting for each byte to be written. The network card handles all of that. When the network card is done writing all the bytes, it (eventually) completes the task returned from WriteAsync.
Are those limited to I/O operations which take advantage of some hardware capabilities so no thread is ever blocked?
Not entirely, although I/O operations are the easy examples. Another fairly easy example is timers (e.g., Task.Delay). Though you can build a truly asynchronous API around any kind of "event".
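As a sketch of that last point (the MessageSource type and MessageReceived event are hypothetical): TaskCompletionSource lets you expose any callback- or event-based signal as an awaitable task, and no thread is blocked while waiting for it.
// Wrap a one-shot event in a task; the awaiter's thread is free until the event fires.
public static Task<string> WaitForMessageAsync(MessageSource source)
{
    var tcs = new TaskCompletionSource<string>();
    source.MessageReceived += message => tcs.TrySetResult(message);
    return tcs.Task;
}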
When you use async/await, there is no guarantee that the method you call when you do await FooAsync() will actually run asynchronously. The internal implementation is free to return using a completely synchronous path.
This is a little unclear to me probably because the definition of asynchronous in my head is not lining up.
This simply means there are two cases when calling an async method.
The first is that, upon returning the task to you, the operation is already completed -- this would be a synchronous path. The second is that the operation is still in progress -- this is the async path.
Consider this code, which should show both of these paths. If the key is in a cache, it is returned synchronously. Otherwise, an async op is started which calls out to a database:
Task<T> GetCachedDataAsync(string key)
{
    if (cache.TryGetValue(key, out T value))
    {
        return Task.FromResult(value); // synchronous: no awaits here.
    }

    // start a fully async op.
    return GetDataImpl();

    async Task<T> GetDataImpl()
    {
        value = await database.GetValueAsync(key);
        cache[key] = value;
        return value;
    }
}
So, understanding that, you can deduce that database.GetValueAsync() may in theory contain similar code and itself return synchronously: even your async path may end up running 100% synchronously. But your code doesn't need to care: async/await handles both cases seamlessly.
If I have a long running task that is CPU bound (let's say it is doing a lot of hard math), then running that task asynchronously must be blocking some thread right? Something has to actually do the math. If I await it then some thread is getting blocked.
Blocking is a well-defined term -- it means your thread has yielded its execution window while it waits for something (I/O, mutex, and so on). So your thread doing the math is not considered blocked: it is actually performing work.
What is an example of a truly asynchronous method and how would they actually work? Are those limited to I/O operations which take advantage of some hardware capabilities so no thread is ever blocked?
A "truly async method" would be one that simply never blocks. It typically ends up involving I/O, but it can also mean awaiting your heavy math code when you want to your current thread for something else (as in UI development) or when you're trying to introduce parallelism:
async Task<double> DoSomethingAsync()
{
    double x = await ReadXFromFile();
    Task<double> a = LongMathCodeA(x);
    Task<double> b = LongMathCodeB(x);
    await Task.WhenAll(a, b);
    return a.Result + b.Result;
}
This topic is fairly vast and can spark several discussions. Using async and await in C# is asynchronous programming; how the asynchrony actually works is a different discussion. Before .NET 4.5 there were no async and await keywords, and developers worked directly against the Task Parallel Library (TPL), with full control over when and how to create new tasks and even threads. The downside was that, unless you were really an expert on the topic, applications could suffer heavy performance problems and bugs due to race conditions between threads and so on.
Starting with .NET 4.5 the async and await keywords were introduced, with a new approach to asynchronous programming. The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.
The async-based approach to asynchronous programming is preferable to existing approaches in almost every case. In particular, this approach is better than BackgroundWorker for IO-bound operations because the code is simpler and you don't have to guard against race conditions. You can read more about this topic HERE.
I don't consider myself a C# black belt and some more experienced developers may raise some further discussions, but as a principle I hope that I managed to answer your question.
Asynchronous does not imply Parallel
Asynchronous only implies concurrency. In fact, even using explicit threads doesn't guarantee that they will execute simultaneously (for example, when the threads have affinity for the same single core, or, more commonly, when there is only one core in the machine to begin with).
Therefore, you should not expect an asynchronous operation to happen simultaneously to something else. Asynchronous only means that it will happen, eventually at another time (a(greek) = without, syn (greek) = together, khronos (greek) = time. => Asynchronous = not happening at the same time).
Note: The idea of asynchronicity is that on the invocation you do not care when the code will actually run. This allows the system to take advantage of parallelism, if possible, to execute the operation. It may even run immediately. It could even happen on the same thread... more on that later.
When you await the asynchronous operation, you are creating concurrency (com (latin) = together, currere (latin) = run. => "Concurrent" = to run together). That is because you are asking for the asynchronous operation to reach completion before moving on. We can say the execution converges. This is similar to the concept of joining threads.
When asynchronous cannot be Parallel
When you use async/await, there is no guarantee that the method you call when you do await FooAsync() will actually run asynchronously. The internal implementation is free to return using a completely synchronous path.
This can happen in three ways:
It is possible to use await on anything that returns Task. When you receive the Task it could have already been completed.
Yet, that alone does not imply it ran synchronously. In fact, it suggests it ran asynchronously and finished before you got the Task instance.
Keep in mind that you can await on an already completed task:
private static async Task CallFooAsync()
{
    await FooAsync();
}

private static Task FooAsync()
{
    return Task.CompletedTask;
}

private static void Main()
{
    CallFooAsync().Wait();
}
Also, if an async method has no await it will run synchronously.
Note: As you already know, a method that returns a Task may be waiting on the network, or on the file system, etc.; doing so does not imply starting a new Thread or enqueuing something on the ThreadPool.
Under a synchronization context that is handled by a single thread, the result will be to execute the Task synchronously, with some overhead. This is the case of the UI thread, I'll talk more about what happens below.
It is possible to write a custom TaskScheduler that always runs tasks synchronously, on the same thread that does the invocation.
Note: recently I wrote a custom SynchronizationContext that runs tasks on a single thread. You can find it at Creating a (System.Threading.Tasks.)Task scheduler. It would result in such a TaskScheduler with a call to TaskScheduler.FromCurrentSynchronizationContext.
The default TaskScheduler will enqueue the invocations to the ThreadPool. Yet when you wait on the operation, if it has not yet run on the ThreadPool, it may be pulled from the ThreadPool and run inline (on the same thread that is waiting... that thread is waiting anyway, so it is not busy).
Note: One notable exception is a Task marked with LongRunning. LongRunning Tasks will run on a separate thread.
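For example, a sketch of asking for a dedicated thread via that flag (the work method is hypothetical):
var longRunning = Task.Factory.StartNew(
    () => CrunchNumbersForHours(),      // hypothetical CPU-bound work
    CancellationToken.None,
    TaskCreationOptions.LongRunning,    // hint: give this task its own thread
    TaskScheduler.Default);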
Your question
If I have a long running task that is CPU bound (let's say it is doing a lot of hard math), then running that task asynchronously must be blocking some thread right? Something has to actually do the math. If I await it then some thread is getting blocked.
If you are doing computations, they must happen on some thread, that part is right.
Yet, the beauty of async and await is that the waiting thread does not have to be blocked (more on that later). Still, it is very easy to shoot yourself in the foot by having the awaited task scheduled to run on the same thread that is waiting, resulting in synchronous execution (an easy mistake to make on the UI thread).
One of the key characteristics of async and await is that they take the SynchronizationContext from the caller. For most threads that results in using the default TaskScheduler (which, as mentioned earlier, uses the ThreadPool). However, for the UI thread it means posting the tasks into the message queue, which means they will run on the UI thread. The advantage of this is that you don't have to use Invoke or BeginInvoke to access UI components.
Before I go into how to await a Task from the UI thread without blocking it, I want to note that it is possible to implement a TaskScheduler where, if you await on a Task, you don't block your thread or have it go idle; instead, you let your thread pick another Task that is waiting for execution. When I was backporting Tasks for .NET 2.0 I experimented with this.
What is an example of a truly asynchronous method and how would they actually work? Are those limited to I/O operations which take advantage of some hardware capabilities so no thread is ever blocked?
You seem to confuse asynchronous with not blocking a thread. If what you want is an example of asynchronous operations in .NET that do not require blocking a thread, a way to do it that you may find easy to grasp is to use continuations instead of await. And for the continuations that you need to run on the UI thread, you can use TaskScheduler.FromCurrentSynchronizationContext.
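A small sketch of that continuation style, called from the UI thread (the method and control names are hypothetical):
// Run the work on the thread pool, then marshal the continuation back to the
// UI thread via the scheduler captured from the current synchronization context.
var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
Task.Run(() => DownloadReport())
    .ContinueWith(t => statusLabel.Text = t.Result, uiScheduler);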
Do not implement fancy spin waiting. And by that I mean using a Timer, Application.Idle or anything like that.
When you use async you are telling the compiler to rewrite the code of the method in a way that allows breaking it. The result is similar to continuations, with a much more convenient syntax. When the thread reaches an await the Task will be scheduled, and the thread is free to continue after the current async invocation (out of the method). When the Task is done, the continuation (after the await) is scheduled.
For the UI thread this means that once it reaches await, it is free to continue to process messages. Once the awaited Task is done, the continuation (after the await) will be scheduled. As a result, reaching await doesn’t imply to block the thread.
Yet blindly adding async and await won’t fix all your problems.
I submit to you an experiment. Get a new Windows Forms application, drop in a Button and a TextBox, and add the following code:
private async void button1_Click(object sender, EventArgs e)
{
    await WorkAsync(5000);
    textBox1.Text = "DONE";
}

private async Task WorkAsync(int milliseconds)
{
    // Note: no await here, so this runs synchronously on the calling thread.
    Thread.Sleep(milliseconds);
}
It blocks the UI. What happens is that, as mentioned earlier, await automatically uses the SynchronizationContext of the caller thread. In this case, that is the UI thread. Therefore, WorkAsync will run on the UI thread.
This is what happens:
The UI thread gets the click message and calls the click event handler
In the click event handler, the UI thread reaches await WorkAsync(5000)
WorkAsync(5000) (and scheduling its continuation) is scheduled to run on the current synchronization context, which is the UI thread synchronization context… meaning that it posts a message to execute it
The UI thread is now free to process further messages
The UI thread picks the message to execute WorkAsync(5000) and schedule its continuation
The UI thread calls WorkAsync(5000) with continuation
In WorkAsync, the UI thread runs Thread.Sleep. The UI is now unresponsive for 5 seconds.
The continuation schedules the rest of the click event handler to run, this is done by posting another message for the UI thread
The UI thread is now free to process further messages
The UI thread picks the message to continue in the click event handler
The UI thread updates the textbox
The result is synchronous execution, with overhead.
Yes, you should use Task.Delay instead. That is not the point; consider Sleep a stand-in for some computation. The point is that just using async and await everywhere won't give you an application that is automatically parallel. It is much better to pick what you want to run on a background thread (e.g. on the ThreadPool) and what you want to run on the UI thread.
Now, try the following code:
private async void button1_Click(object sender, EventArgs e)
{
    await Task.Run(() => Work(5000));
    textBox1.Text = "DONE";
}

private void Work(int milliseconds)
{
    Thread.Sleep(milliseconds);
}
You will find that await does not block the UI. This is because in this case Thread.Sleep is now running on the ThreadPool thanks to Task.Run. And thanks to button1_Click being async, once the code reaches await the UI thread is free to continue working. After the Task is done, the code will resume after the await thanks to the compiler rewriting the method to allow precisely that.
This is what happens:
The UI thread gets the click message and calls the click event handler
In the click event handler, the UI thread reaches await Task.Run(() => Work(5000))
Task.Run(() => Work(5000)) (and scheduling its continuation) is scheduled to run on the current synchronization context, which is the UI thread synchronization context… meaning that it posts a message to execute it
The UI thread is now free to process further messages
The UI thread picks the message to execute Task.Run(() => Work(5000)) and schedule its continuation when done
The UI thread calls Task.Run(() => Work(5000)) with continuation, this will run on the ThreadPool
The UI thread is now free to process further messages
When the ThreadPool finishes, the continuation will schedule the rest of the click event handler to run; this is done by posting another message for the UI thread. When the UI thread picks the message to continue in the click event handler, it will update the textbox.
Here's asynchronous code which shows how async/await allows code to pause, release control to another flow, and then resume, without needing another thread.
public static async Task<string> Foo()
{
    Console.WriteLine("In Foo");
    await Task.Yield();
    Console.WriteLine("I'm Back");
    return "Foo";
}

static void Main(string[] args)
{
    var t = new Task(async () =>
    {
        Console.WriteLine("Start");
        var f = Foo();
        Console.WriteLine("After Foo");
        var r = await f;
        Console.WriteLine(r);
    });
    t.RunSynchronously();
    Console.ReadLine();
}
So it's that releasing of control, and re-synchronizing when you want results, that's key with async/await (which also works well with threading).
NOTE: No Threads were blocked in the making of this code :)
I think the confusion sometimes comes from "Task", which doesn't mean something running on its own thread. It just means a thing to do; async/await allows tasks to be broken up into stages and coordinates those various stages into a flow.
It's kind of like cooking, you follow the recipe. You need to do all the prep work before assembling the dish for cooking. So you turn on the oven, start cutting things, grating things, etc. Then you await the temp of oven and await the prep work. You could do it by yourself swapping between tasks in a way that seems logical (tasks / async / await), but you can get someone else to help grate cheese while you chop carrots (threads) to get things done faster.
Stephen's answer is already great, so I'm not going to repeat what he said; I've done my fair share of repeating the same arguments many times on Stack Overflow (and elsewhere).
Instead, let me focus on one important abstract thing about asynchronous code: it's not an absolute qualifier. There is no point in saying a piece of code is asynchronous - it's always asynchronous with respect to something else. This is quite important.
The purpose of await is to build synchronous workflows on top of asynchronous operations and some connecting synchronous code. Your code appears perfectly synchronous¹ to the code itself.
var a = await A();
await B(a);
The ordering of events is specified by the await invocations. B uses the return value of A, which means A must have run before B. The method containing this code has a synchronous workflow, and the two methods A and B are synchronous with respect to each other.
This is very useful, because synchronous workflows are usually easier to think about, and more importantly, a lot of workflows simply are synchronous. If B needs the result of A to run, it must run after A². If you need to make an HTTP request to get the URL for another HTTP request, you must wait for the first request to complete; it has nothing to do with thread/task scheduling. Perhaps we could call this "inherent synchronicity", apart from "accidental synchronicity" where you force order on things that do not need to be ordered.
You say:
In my mind, since I do mainly UI dev, async code is code that does not run on the UI thread, but on some other thread.
You're describing code that runs asynchronously with respect to the UI. That is certainly a very useful case for asynchrony (people don't like UI that stops responding). But it's just a specific case of a more general principle - allowing things to happen out of order with respect to one another. Again, it's not an absolute - you want some events to happen out of order (say, when the user drags the window or the progress bar changes, the window should still redraw), while others must not happen out of order (the Process button must not be clicked before the Load action finishes). await in this use case isn't that different from using Application.DoEvents in principle - it introduces many of the same problems and benefits.
This is also the part where the original quote gets interesting. The UI needs a thread to be updated. That thread invokes an event handler, which may be using await. Does it mean that the line where await is used will allow the UI to update itself in response to user input? No.
First, you need to understand that await uses its argument, just as if it were a method call. In my sample, A must have already been invoked before the code generated by await can do anything, including "releasing control back to the UI loop". The return value of A is Task<T> instead of just T, representing a "possible value in the future" - and await-generated code checks to see if the value is already there (in which case it just continues on the same thread) or not (which means we get to release the thread back to the UI loop). But in either case, the Task<T> value itself must have been returned from A.
Consider this implementation:
public async Task<int> A()
{
    Thread.Sleep(1000);
    return 42;
}
The caller needs A to return a value (a task of int); since there are no awaits in the method, that means reaching the return 42;. But that cannot happen before the sleep finishes, because the two operations are synchronous with respect to the thread. The caller thread will be blocked for a second, regardless of whether it uses await or not - the blocking is in A() itself, not in await theTaskResultOfA.
In contrast, consider this:
public async Task<int> A()
{
    await Task.Delay(1000);
    return 42;
}
As soon as the execution gets to the await, it sees that the task being awaited isn't finished yet and returns control back to its caller; and the await in the caller consequently returns control back to its caller. We've managed to make some of the code asynchronous with respect to the UI. The synchronicity between the UI thread and A was accidental, and we removed it.
The important part here is: there's no way to distinguish between the two implementations from the outside without inspecting the code. Only the return type is part of the method signature - it doesn't say the method will execute asynchronously, only that it may. This may be for any number of good reasons, so there's no point in fighting it - for example, there's no point in breaking the thread of execution when the result is already available:
var responseTask = GetAsync("http://www.google.com");
// Do some CPU intensive task
ComputeAllTheFuzz();
var response = await responseTask;
We need to do some work. Some events can run asynchronously with respect to others (in this case, ComputeAllTheFuzz is independent of the HTTP request) and are asynchronous. But at some point, we need to get back to a synchronous workflow (for example, something that requires both the result of ComputeAllTheFuzz and the HTTP request). That's the await point, which synchronizes the execution again (if you had multiple asynchronous workflows, you'd use something like Task.WhenAll). However, if the HTTP request managed to complete before the computation, there's no point in releasing control at the await point - we can simply continue on the same thread. There's been no waste of the CPU - no blocking of the thread; it does useful CPU work. But we didn't give any opportunity for the UI to update.
This is of course why this pattern is usually avoided in more general asynchronous methods. It is useful for some uses of asynchronous code (avoiding wasting threads and CPU time), but not others (keeping the UI responsive). If you expect such a method to keep the UI responsive, you're not going to be happy with the result. But if you use it as part of a web service, for example, it will work great - the focus there is on avoiding wasting threads, not keeping the UI responsive (that's already provided by asynchronously invoking the service endpoint - there's no benefit from doing the same thing again on the service side).
In short, await allows you to write code that is asynchronous with respect to its caller. It doesn't invoke a magical power of asynchronicity, it isn't asynchronous with respect to everything, it doesn't prevent you from using the CPU or blocking threads. It just gives you the tools to easily make a synchronous workflow out of asynchronous operations, and present part of the whole workflow as asynchronous with respect to its caller.
Let's consider a UI event handler. If the individual asynchronous operations happen to not need a thread to execute (e.g. asynchronous I/O), part of the asynchronous method may allow other code to execute on the original thread (and the UI stays responsive in those parts). When the operation needs the CPU/thread again, it may or may not require the original thread to continue the work. If it does, the UI will be blocked again for the duration of the CPU work; if it doesn't (the awaiter specifies this using ConfigureAwait(false)), the UI code will run in parallel. Assuming there's enough resources to handle both, of course. If you need the UI to stay responsive at all times, you cannot use the UI thread for any execution long enough to be noticeable - even if that means you have to wrap an unreliable "usually asynchronous, but sometimes blocks for a few seconds" async method in a Task.Run. There's costs and benefits to both approaches - it's a trade-off, as with all engineering :)
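As a sketch of that last trade-off (method and control names are hypothetical): if a "usually asynchronous" method sometimes blocks for seconds before its first await, wrapping the call in Task.Run keeps the UI thread free either way, at the cost of occupying a thread pool thread.
private async void RefreshButton_Click(object sender, EventArgs e)
{
    // Task.Run unwraps the inner Task<string>, so this awaits the final result.
    var data = await Task.Run(() => UsuallyAsyncButSometimesBlocksAsync());
    resultsTextBox.Text = data;
}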
¹ Of course, perfect as far as the abstraction holds - every abstraction leaks, and there's plenty of leaks in await and other approaches to asynchronous execution.
² A sufficiently smart optimizer might allow some part of B to run, up to the point where the return value of A is actually needed; this is what your CPU does with normal "synchronous" code (out-of-order execution). Such optimizations must preserve the appearance of synchronicity, though - if the CPU misjudges the ordering of operations, it must discard the results and present a correct ordering.

Is calling Task.Wait() immediately after an asynchronous operation equivalent to running the same operation synchronously?

In other words, is
var task = SomeLongRunningOperationAsync();
task.Wait();
functionally identical to
SomeLongRunningOperation();
Stated another way, is
var task = SomeOtherLongRunningOperationAsync();
var result = task.Result;
functionally identical to
var result = SomeOtherLongRunningOperation();
According to Task.Wait and Inlining, if the Task being Wait’d on has already started execution, Wait has to block. However, if it hasn’t started executing, Wait may be able to pull the target task out of the scheduler to which it was queued and execute it inline on the current thread.
Are those two cases merely a matter of deciding which thread the Task is going to run on, and if you're waiting on the result anyway, does it matter?
Is there any benefit to using the asynchronous form over the synchronous form, if nothing executes between the asynchronous call and the Wait()?
Here are some differences:
The computation might run on a different thread. It might run on the same thread if this task is CPU-based and can be inlined. This is non-deterministic.
If no inlining happens one more thread will be in use during the computation. This usually costs 1MB of stack memory.
Exceptions will be wrapped in AggregateException. The exception stack will be different.
The task version might deadlock if the computation posts to the current synchronization context.
If the thread-pool is maxed out this might deadlock if for the task to complete another task must be scheduled.
Thread-local state, such as HttpContext.Current (which is not actually thread-local but almost), might be different.
A thread abort of the main thread will not reach the task body (except in case of inlining). I'm not sure whether the wait itself will be aborted or not.
Creating a Task induces a memory barrier which can have a synchronizing effect.
Does this matter? Decide for yourself by this list.
Are there benefits to doing this? I can't think of any. If your computation uses async IO the wait will negate the benefits that the async IO brings. The one exception would be fan-out IO, e.g. issuing 10 HTTP requests in parallel and waiting for them. That way you have 10 operations at the cost of one thread.
Note that Wait and Result are equivalent in all these regards.
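A small illustration of the exception-wrapping difference from the list above (a sketch; Boom and DemoAsync are placeholder names):
static void Boom() => throw new InvalidOperationException("boom");

static async Task DemoAsync()
{
    var task = Task.Run(Boom);

    try { task.Wait(); }                 // blocking wait
    catch (AggregateException ex)
    {
        // Wait()/Result wrap the failure in an AggregateException.
        Console.WriteLine(ex.InnerException.Message);  // "boom"
    }

    try { await task; }                  // asynchronous wait
    catch (InvalidOperationException ex)
    {
        // await rethrows the original exception directly.
        Console.WriteLine(ex.Message);                 // "boom"
    }
}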

Deadlock when accessing StackExchange.Redis

I'm running into a deadlock situation when calling StackExchange.Redis.
I don't know exactly what is going on, which is very frustrating, and I would appreciate any input that could help resolve or workaround this problem.
In case you have this problem too and don't want to read all this:
I suggest that you try setting PreserveAsyncOrder to false.
ConnectionMultiplexer connection = ...;
connection.PreserveAsyncOrder = false;
Doing so will probably resolve the kind of deadlock that this Q&A is about and could also improve performance.
Our setup
The code is run as either a Console application or as an Azure Worker Role.
It exposes a REST api using HttpMessageHandler so the entry point is async.
Some parts of the code have thread affinity (they are owned by, and must be run by, a single thread).
Some parts of the code are async-only.
We are using the sync-over-async and async-over-sync anti-patterns (mixing await and Wait()/Result).
We're only using async methods when accessing Redis.
We're using StackExchange.Redis 1.0.450 for .NET 4.5.
Deadlock
When the application/service is started it runs normally for a while, then all of a sudden (almost) all incoming requests stop functioning; they never produce a response. All those requests are deadlocked waiting for a call to Redis to complete.
Interestingly, once the deadlock occurs, any call to Redis will hang, but only if the call is made from an incoming API request, which runs on the thread pool.
We are also making calls to Redis from low priority background threads, and these calls continue to function even after the deadlock occurred.
It seems as if a deadlock will only occur when calling into Redis on a thread pool thread. I no longer think this is due to the fact that those calls are made on a thread pool thread. Rather, it seems like any async Redis call without continuation, or with a sync safe continuation, will continue to work even after the deadlock situation has occurred. (See What I think happens below)
Related
StackExchange.Redis Deadlocking
Deadlock caused by mixing await and Task.Result (sync-over-async, like we do). But our code is run without synchronization context so that doesn't apply here, right?
How to safely mix sync and async code?
Yes, we shouldn't be doing that. But we do, and we'll have to continue doing so for a while. Lots of code that needs to be migrated into the async world.
Again, we don't have a synchronization context, so this should not be causing deadlocks, right?
Setting ConfigureAwait(false) before any await has no effect on this.
Timeout exception after async commands and Task.WhenAny awaits in StackExchange.Redis
This is the thread hijacking problem. What's the current situation on this? Could this be the problem here?
StackExchange.Redis async call hangs
From Marc's answer:
...mixing Wait and await is not a good idea. In addition to deadlocks, this is "sync over async" - an anti-pattern.
But he also says:
SE.Redis bypasses sync-context internally (normal for library code), so it shouldn't have the deadlock
So, from my understanding StackExchange.Redis should be agnostic to whether we're using the sync-over-async anti-pattern. It's just not recommended as it could be the cause of deadlocks in other code.
In this case, however, as far as I can tell, the deadlock is really inside StackExchange.Redis. Please correct me if I'm wrong.
Debug findings
I've found that the deadlock seems to have its source in ProcessAsyncCompletionQueue on line 124 of CompletionManager.cs.
Snippet of that code:
while (Interlocked.CompareExchange(ref activeAsyncWorkerThread, currentThread, 0) != 0)
{
    // if we don't win the lock, check whether there is still work; if there is we
    // need to retry to prevent a nasty race condition
    lock (asyncCompletionQueue)
    {
        if (asyncCompletionQueue.Count == 0) return; // another thread drained it; can exit
    }
    Thread.Sleep(1);
}
I've found that during the deadlock, activeAsyncWorkerThread is one of our threads that is waiting for a Redis call to complete (our thread = a thread pool thread running our code). So the loop above is doomed to continue forever.
Without knowing the details, this sure feels wrong; StackExchange.Redis is waiting for a thread that it thinks is the active async worker thread while it is in fact a thread that is quite the opposite of that.
I wonder if this is due to the thread hijacking problem (which I don't fully understand)?
What to do?
The two main questions I'm trying to figure out are:
Could mixing await and Wait()/Result be the cause of deadlocks even when running without synchronization context?
Are we running into a bug/limitation in StackExchange.Redis?
A possible fix?
From my debug findings it seems as the problem is that:
next.TryComplete(true);
...on line 162 in CompletionManager.cs could under some circumstances let the current thread (which is the active async worker thread) wander off and start processing other code, possibly causing a deadlock.
Without knowing the details, and just reasoning from this "fact", it would seem logical to temporarily release the active async worker thread during the TryComplete invocation.
I guess that something like this could work:
// release the "active thread lock" while invoking the completion action
Interlocked.CompareExchange(ref activeAsyncWorkerThread, 0, currentThread);
try
{
    next.TryComplete(true);
    Interlocked.Increment(ref completedAsync);
}
finally
{
    // try to re-take the "active thread lock" again
    if (Interlocked.CompareExchange(ref activeAsyncWorkerThread, currentThread, 0) != 0)
    {
        break; // someone else took over
    }
}
I guess my best hope is that Marc Gravell would read this and provide some feedback :-)
No synchronization context = The default synchronization context
I've written above that our code does not use a synchronization context. This is only partially true: The code is run as either a Console application or as an Azure Worker Role. In these environments SynchronizationContext.Current is null, which is why I wrote that we're running without synchronization context.
However, after reading It's All About the SynchronizationContext I've learned that this is not really the case:
By convention, if a thread’s current SynchronizationContext is null, then it implicitly has a default SynchronizationContext.
The default synchronization context should not be the cause of deadlocks the way a UI-based (WinForms, WPF) synchronization context can be, though, because it does not imply thread affinity.
What I think happens
When a message is completed its completion source is checked for whether it is considered sync safe. If it is, the completion action is executed inline and everything is fine.
If it is not, the idea is to execute the completion action on a newly allocated thread pool thread. This too works just fine when ConnectionMultiplexer.PreserveAsyncOrder is false.
However, when ConnectionMultiplexer.PreserveAsyncOrder is true (the default value), then those thread pool threads will serialize their work using a completion queue and by ensuring that at most one of them is the active async worker thread at any time.
When a thread becomes the active async worker thread, it will continue to be that until it has drained the completion queue.
The problem is that the completion action is not sync safe (from above), yet it is executed on a thread that must not be blocked, since that will prevent other non-sync-safe messages from being completed.
Notice that other messages that are being completed with a completion action that is sync safe will continue to work just fine, even though the active async worker thread is blocked.
My suggested "fix" (above) would not cause a deadlock in this way, it would however mess with the notion of preserving async completion order.
So maybe the conclusion to make here is that it is not safe to mix await with Result/Wait() when PreserveAsyncOrder is true, no matter whether we are running without synchronization context?
(At least until we can use .NET 4.6 and the new TaskCreationOptions.RunContinuationsAsynchronously, I suppose)
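For reference, a sketch of how that option is used on .NET 4.6+: a TaskCompletionSource created with it forces awaiting continuations onto the thread pool instead of running them inline on the thread that completes the task.
var tcs = new TaskCompletionSource<int>(
    TaskCreationOptions.RunContinuationsAsynchronously);

// The thread calling TrySetResult (e.g. a library's worker thread) is not
// hijacked to run the awaiter's continuation inline; the continuation is
// queued to the thread pool instead.
tcs.TrySetResult(42);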
These are the workarounds I've found to this deadlock problem:
Workaround #1
By default StackExchange.Redis will ensure that commands are completed in the same order that result messages are received. This could cause a deadlock as described in this question.
Disable that behavior by setting PreserveAsyncOrder to false.
ConnectionMultiplexer connection = ...;
connection.PreserveAsyncOrder = false;
This will avoid deadlocks and could also improve performance.
I encourage anyone who runs into these deadlock problems to try this workaround, since it's so clean and simple.
You'll lose the guarantee that async continuations are invoked in the same order as the underlying Redis operations complete. However, I don't really see why that is something you would rely on.
Workaround #2
The deadlock occurs when the active async worker thread in StackExchange.Redis completes a command and the completion task is executed inline.
One can prevent a task from being executed inline by using a custom TaskScheduler and ensure that TryExecuteTaskInline returns false.
public class MyScheduler : TaskScheduler
{
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false; // Never allow inlining.
    }

    // TODO: Rest of TaskScheduler implementation goes here...
}
Implementing a good task scheduler can be complex. There are, however, existing implementations in the ParallelExtensionsExtras library (NuGet package) that you can use or draw inspiration from. A minimal illustration follows.
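As an illustration only (a sketch, not production-ready and not specific to StackExchange.Redis), here is a scheduler that runs queued tasks on one dedicated thread and refuses inlining entirely:
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly Thread _thread;

    public DedicatedThreadTaskScheduler()
    {
        _thread = new Thread(() =>
        {
            // Drain queued tasks on our own dedicated thread.
            foreach (var task in _tasks.GetConsumingEnumerable())
                TryExecuteTask(task);
        }) { IsBackground = true };
        _thread.Start();
    }

    protected override void QueueTask(Task task) => _tasks.Add(task);

    // Never inline, so completions can't run on the caller's (thread pool) thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => false;

    protected override IEnumerable<Task> GetScheduledTasks() => _tasks.ToArray();

    public void Dispose() => _tasks.CompleteAdding();
}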
If your task scheduler would use its own threads (not from the thread pool), then it might be a good idea to allow inlining unless the current thread is from the thread pool. This will work because the active async worker thread in StackExchange.Redis is always a thread pool thread.
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
{
    // Don't allow inlining on a thread pool thread.
    return !Thread.CurrentThread.IsThreadPoolThread && this.TryExecuteTask(task);
}
Another idea would be to attach your scheduler to all of its threads, using thread-local storage.
private static ThreadLocal<TaskScheduler> __attachedScheduler =
    new ThreadLocal<TaskScheduler>();
Ensure that this field is assigned when the thread starts running and cleared as it completes:
private void ThreadProc()
{
    // Attach scheduler to thread
    __attachedScheduler.Value = this;
    try
    {
        // TODO: Actual thread proc goes here...
    }
    finally
    {
        // Detach scheduler from thread
        __attachedScheduler.Value = null;
    }
}
Then you can allow inlining of tasks as long as it's done on a thread that is "owned" by the custom scheduler:
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
{
    // Allow inlining on our own threads.
    return __attachedScheduler.Value == this && this.TryExecuteTask(task);
}
I am guessing a lot based on the detailed information above and without knowing the source code you have in place. It sounds like you may be hitting some internal, and configurable, limits in .NET. You shouldn't be hitting those, so my guess is that you are not disposing of objects, since they float between threads, which prevents you from using a using statement to cleanly handle their lifetimes.
This details the limits on concurrent HTTP requests. It's similar to the old WCF issue where, if you didn't dispose of the connection, all WCF connections would eventually fail.
Max number of concurrent HttpWebRequests
This is more of a debugging aid, since I doubt you really are using all the TCP ports, but it's good info on how to find out how many ports you have open and to where.
https://msdn.microsoft.com/en-us/library/aa560610(v=bts.20).aspx
