Is it ever safe to not wait for SaveChangesAsync()? - c#

Consider the following WebAPI method:
[HttpPost]
public async Task<IHttpResult> CreateWorkItem(Item wi)
{
var item = await dataContext.Item.AddAsync(wi);
await dataContext.SaveChangesAsync();
return Ok();
}
Is there a situation where it would be safe not to await the SaveChangesAsync() call?
In theory I would have a faster response time if I just sent a response while the flushing to database gets done in the background.
Is my assumption correct?

There's nothing magical about the await keyword. It quite literally means "wait on this task to complete before moving on". Tasks return hot, or already started, so whether you await them or not, the work is happening.
However, things can get dicey in situations where you don't wait for task completion. In particular, here, your context, which is required to actually do the save operation (remember that it owns the physical DB connection), is request-scoped. That means if you do not await the save and return from the action, you're now essentially in a race to see which completes first: the save operation or the end of the request. If the end of the request (i.e. the final response is flushed to the client) occurs first, the context is disposed, taking the active transaction and your SQL connection with it.
Another important reason to await is proper exception handling. If you do not await, any exceptions thrown during the save will essentially be swallowed because the code has already moved on. That means you have no real assurance that the save actually completed successfully; all you've got is a wish and a prayer.
With very rare exception, all async tasks should be awaited all the time. It doesn't always need to be in the same line (such as when using things like Task.WhenAll to await a series of tasks), but at some point or another the await keyword should be there. FWIW, those rare exceptions are mostly confined to desktop and mobile development where you often need to fire work off on new threads to prevent blocking the main, or UI, thread. That's not remotely a concern for web applications, so you can pretty much consider the "await all the things" rule universal in that context.
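The exception-handling point can be sketched with a small console program. This is only an illustration: FailingSaveAsync is a hypothetical stand-in for SaveChangesAsync() that always fails, and no real DbContext is involved.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    // Hypothetical stand-in for SaveChangesAsync(): fails after a short delay.
    public static async Task FailingSaveAsync()
    {
        await Task.Delay(50);
        throw new InvalidOperationException("save failed");
    }

    // Returns true if the calling code got to see the failure.
    public static async Task<bool> CallerSeesFailureAsync(bool awaitTheSave)
    {
        try
        {
            Task save = FailingSaveAsync();
            if (awaitTheSave)
            {
                await save; // the exception is rethrown here
            }
            // Without the await, control simply moves on; the failure stays
            // inside a Task object that nobody inspects.
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await CallerSeesFailureAsync(awaitTheSave: true));  // True
        Console.WriteLine(await CallerSeesFailureAsync(awaitTheSave: false)); // False
    }
}
```

In the second call the method has already returned by the time the save fails, which is the same position a controller action is in once it has sent its response.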

Related

Background started task does not finish/gets terminated after the first encountered await

In an ASP.NET application, I have an action which when hit, starts a new background task in the following way:
Task.Factory.StartNew(async () => await somethingWithCpuAndIo(input), CancellationToken.None, TaskCreationOptions.DenyChildAttach | TaskCreationOptions.LongRunning, TaskScheduler.FromCurrentSynchronizationContext());
I'm not awaiting it; I just want it to start and continue doing its work in the background.
Immediately after that, I return a response to the client.
For some reason though, after the initial thread executing the background work hits an await that awaits the completion of a method, when debugging, I successfully resolve the method, but upon returning, execution just stops and does not continue below that point.
Interestingly, if I fully await this task (using double await), everything works as expected.
Is this due to the SynchronizationContext? The moment I return a response, the synchronizationContext is disposed/removed? (The SynchronizationContext is being used inside the method)
If it is due to that, where exactly does the issue happen?
A) When the Scheduler attempts to assign the work on the given synchronizationContext, it will already be disposed, so nothing will be provided
B) Somewhere down the lines in the method executing, when I return a response to the client, the synchronizationContext is lost, regardless of anything else.
C) Something else entirely?
If it's A), I should be able to fix this by simply doing Thread.Sleep() between scheduling the work and returning a response. (Tried that, it didn't work.)
If it's B) I have no idea how I can resolve this. Help will be appreciated.
As Gabriel Luci has pointed out, it is due to the first awaited incomplete Task returning immediately, but there's a wider point to be made about Task.Factory.StartNew.
Task.Factory.StartNew should not be used with async code, and neither should TaskCreationOptions.LongRunning. TaskCreationOptions.LongRunning should be used for scheduling long-running CPU-bound work. An async method may be logically long running, but Task.Factory.StartNew is about starting synchronous work. The synchronous part of an async method is the bit before the first await, and that is usually very short.
Here is the guidance from David Fowler (Partner Software Architect at Microsoft on the ASP.NET team) on the matter:
https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/blob/86b502e88c752e42f68229afb9f1ac58b9d1fef7/AsyncGuidance.md#avoid-using-taskrun-for-long-running-work-that-blocks-the-thread
See the 3rd bullet:
Don't use TaskCreationOptions.LongRunning with async code as this will
create a new thread which will be destroyed after first await.
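A minimal sketch of why StartNew and async lambdas do not mix well, assuming a plain console program with no SynchronizationContext. The names StartNewDemo and CompareAsync are made up for illustration, and a TaskCompletionSource stands in for the pending I/O.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartNewDemo
{
    public static async Task<(bool outerDoneEarly, bool runDoneEarly)> CompareAsync()
    {
        // The gate plays the role of an I/O operation that has not finished yet.
        var gate = new TaskCompletionSource<bool>();
        Func<Task> work = async () => { await gate.Task; };

        // StartNew only knows about the synchronous part of the delegate, so it
        // returns Task<Task>. The outer task completes at the first incomplete await.
        Task<Task> outer = Task.Factory.StartNew(
            work,
            CancellationToken.None,
            TaskCreationOptions.DenyChildAttach,
            TaskScheduler.Default);

        // Task.Run understands Func<Task>: its task represents the whole async delegate.
        Task run = Task.Run(work);

        await outer; // completes even though the gate is still closed
        bool outerDoneEarly = outer.IsCompleted && !outer.Result.IsCompleted;
        bool runDoneEarly = run.IsCompleted;

        gate.SetResult(true);  // let the pending "I/O" finish
        await outer.Unwrap();  // Unwrap() yields the task you actually meant
        await run;
        return (outerDoneEarly, runDoneEarly);
    }

    public static async Task Main()
    {
        var result = await CompareAsync();
        Console.WriteLine(result.outerDoneEarly); // True
        Console.WriteLine(result.runDoneEarly);   // False
    }
}
```

Awaiting the outer task and then the inner one is the "double await" the question mentions; Task.Run or Unwrap() gives the same effect with less room for error.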
Your comments made your intentions a little clearer. What I think you want to do is:
Start the task and don't wait for it. Return a response to the client before the background task completes.
Make sure that the somethingWithCpuAndIo method has access to the request context.
But,
A different thread won't be in the same context, and
As soon as the first await is hit, a Task is returned, which also means that Task.Factory.StartNew returns and execution of the calling method continues. That means that the response is returned to the client. When the request completes, the context is disposed.
So you can't really do both things you want. There are a couple of ways to work around this:
First, you might be able to not start it on a different thread at all. This depends on when somethingWithCpuAndIo needs access to the context. If it only needs the context before the first await, then you can do something like this:
public IActionResult MyAction(SomeThing input) {
    somethingWithCpuAndIo(input); // no await: the returned Task is ignored
    return Ok();
}
private async Task somethingWithCpuAndIo(SomeThing input) {
    // You can read from the request context here
    await SomeIoRequest().ConfigureAwait(false);
    // Everything after here will run on a ThreadPool thread with no access
    // to the request context.
}
Every asynchronous method starts running synchronously. The magic happens when await is given an incomplete Task. So in this example, somethingWithCpuAndIo will start executing on the same thread, in the request context. When it hits the await, a Task is returned to MyAction, but it is not awaited, so MyAction completes executing and a response gets sent to the client before SomeIoRequest() has completed. But ConfigureAwait(false) tells it that we don't need to resume execution in the same context, so somethingWithCpuAndIo resumes execution on a ThreadPool thread.
But that will only help you if you don't need the context after the first await in somethingWithCpuAndIo.
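The "starts running synchronously" behaviour can be observed directly. The following is a rough sketch for a console program; SyncStartDemo, WorkAsync and the gate are illustrative names, and a TaskCompletionSource replaces SomeIoRequest() so the ordering is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SyncStartDemo
{
    public static async Task WorkAsync(List<string> log, Task gate)
    {
        log.Add("work: before first await"); // runs synchronously, on the caller's thread
        await gate.ConfigureAwait(false);    // incomplete task: control returns to the caller
        log.Add("work: after await");
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>();

        Task work = WorkAsync(log, gate.Task); // not awaited yet
        log.Add("caller: continues");          // reached while the work is still pending

        gate.SetResult(true);                  // the pending operation completes
        await work;
        return log;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync())
        {
            Console.WriteLine(line);
        }
        // work: before first await
        // caller: continues
        // work: after await
    }
}
```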
Your best option is to still execute on a different thread, but pass the values you need from the context into somethingWithCpuAndIo.
But also, use Task.Run instead of Task.Factory.StartNew for reasons described in detail here.
Update: This can very likely cause unpredictable results, but you can also try passing a reference to HttpContext.Current to the thread and setting HttpContext.Current in the new thread, like this:
var ctx = HttpContext.Current;
Task.Run(async () => {
HttpContext.Current = ctx;
await SomeIoRequest();
});
However, it all depends on how you are using the context. HttpContext itself doesn't implement IDisposable, so it, itself, can't be disposed. And the garbage collector won't get rid of it as long as you're holding a reference to it. But the context isn't designed to live longer than the request. So after the response is returned to the client, there may be many parts of the context that are disposed or otherwise unavailable. Test it out and see what explodes. But even if nothing explodes right now, you (or someone else) might come back to that code later, try to use something else in the context and get really confused when it blows up. It could make for some difficult-to-debug scenarios.

Why doesn't this call to .Result provoke a deadlock?

I'm still trying to wrap my head around when I can use .Result, and when I can't. I have been trying to avoid it entirely, by using async "all the way down".
While doing some work in our ASP.NET web app (NOT core), I came across this code. Now, the code works (and has been working for 2 years), so I know it works, I just don't know why it works.
This is fired from a sync controller method. It is a chain of method calls that goes through a few layers until it finally does some HTTP work. I have simplified the code here:
// Class1
public byte[] GetDocument1(string url)
{
return class2.GetDocument2(url).Result;
}
// Class2
public Task<byte[]> GetDocument2(string url)
{
return class3.GetDocument3(url);
}
// Class3
public async Task<byte[]> GetDocument3(string url)
{
var client = GetHttpClient(url);
var resp = client.GetAsync(url).Result;
if(resp.StatusCode == HttpStatusCode.OK)
{
using(var httpStream = await resp.Content.ReadAsStreamAsync())
{
// we are also using
await httpStream.ReadAsync(...);
}
}
}
So as far as I can tell, when this all starts, I'm on the "main ASP sync context" (I start in a controller and eventually get to this code). We aren't using any .ConfigureAwait(false), so I believe we are always returning to this context.
In GetDocument3, why doesn't client.GetAsync(url).Result deadlock?
GetDocument3 is mixing .Result with await. In general, is that a good idea? Is it OK here because .Result comes before the awaits?
In GetDocument1, why doesn't .Result deadlock?
client.GetAsync(url).Result is synchronously blocking. resp.Content.ReadAsStreamAsync() is not actually doing any awaiting: the Task is already completed, because GetAsync uses HttpCompletionOption.ResponseContentRead by default, which buffers the whole response body before GetAsync completes. So the entire block of code here is synchronous code pretending to be asynchronous.
In general, you should never use .Result or .Wait or Task.Wait* - if you absolutely have to, use GetAwaiter().GetResult(), which doesn't throw exceptions wrapped in AggregateExceptions, for which you may not have a catch, but even that should be avoided like the plague. You're right to use async all the way down.
The deadlock is only a problem in a context that wants to return back to the original thread (e.g. UI or ASP.NET), and that thread is blocked by a synchronous wait. If you always use ConfigureAwait(false) on every await, unless you know you actually need to preserve the context on the continuation (usually only at the top level UI event handler because you need to update something on the UI, or in the top level of the ASP.NET controller) then you should be safe.
Synchronous blocking inside asynchronous code also isn't good, but won't cause the deadlock. Using await with ConfigureAwait(true) (the default), inside a synchronous wait, in a context that wants to return back to a specific thread will cause the deadlock.
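To illustrate the remark about AggregateException, here is a small sketch comparing what a caller catches from .Result and from GetAwaiter().GetResult(). FailAsync and TypeSeenBy are hypothetical helpers; the sketch shows only the exception wrapping, not the deadlock.

```csharp
using System;
using System.Threading.Tasks;

public static class BlockingExceptionDemo
{
    // An already-faulted task, standing in for an async call that failed.
    public static Task<int> FailAsync() =>
        Task.FromException<int>(new InvalidOperationException("boom"));

    // Blocks on the task with the supplied strategy and reports the exception type caught.
    public static string TypeSeenBy(Func<Task<int>, int> blocker)
    {
        try
        {
            blocker(FailAsync());
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TypeSeenBy(t => t.Result));                   // AggregateException
        Console.WriteLine(TypeSeenBy(t => t.GetAwaiter().GetResult())); // InvalidOperationException
    }
}
```

Both calls block the thread in the same way; the only difference is which exception type a catch block has to expect.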

Manually capturing and applying SynchronizationContext when completing a Task

I was having a problem with a hanging await (described here). During research I found out that calling SetResult on my TaskCompletionSource actually invokes awaiting continuation in the context of the thread that called SetResult (this is also spelled out in this answer to a somewhat related question). In my case this is a different thread (a thread-pool worker thread) from the one that started the await (an ASP.NET request thread).
While I'm still not sure why this would cause a hang, I decided to try forcing the SetResult into the original context. I stored the value of SynchronizationContext.Current before entering await on the request thread and manually applied it in the worker thread via SynchronizationContext.SetSynchronizationContext just before calling SetResult. This solved the hang and I can now await all my async methods without having to specify ConfigureAwait(false).
My question is: is this a reasonable and correct approach to manually capturing and applying the SynchronizationContext? FWIW, I tried doing a simple Post() with the SetResult delegate first, but that still caused a hang. I'm obviously a bit out of my comfort zone here... Please help me understand what's going on!
SetResult is not guaranteed to call anything. Therefore, this is not reliable.
You need to switch the sync context at the point where it is captured. A common pain point here is WebClient which captures the context when starting a web request. So your code would look like this:
SetContext(newContext);
new WebClient().DownloadAsync(...);
SetContext(oldContext);
Restore the old context to not disturb anything.
In other words the problem is in the continuation code, not in the code calling SetResult.
To my embarrassment, I had completely overlooked that my HTTP handler was derived from a small base class, which implemented IHttpAsyncHandler in a very questionable way in order to add support for async handlers:
public IAsyncResult BeginProcessRequest(HttpContext context, AsyncCallback cb, object extraData)
{
...
var task = HandleRequestAsync(...);
Task.Run(async () => { await task; }).GetAwaiter().GetResult();
...
}
I can't even remember why I did this in the first place (it was over a year ago), but it definitely was THE stupid part I was looking for for the last couple of days!
Changing the handler base class to .NET 4.6's HttpTaskAsyncHandler got rid of the hangs. Sorry for wasting everyone's time! :(

await and LINQ within DBContext

We have a service layer in our application which is composed of three logical layers - web service, business model services (our name for the layer that executes business logic and orchestrates calls to various repositories), and the repository layer which connects to various DBs using EF6.
Many of our repository calls just get data straight from DB sets via ToListAsync, FirstOrDefaultAsync, like this:
public async Task<MyObject> GetSomeData()
{
using(var context = new myDBContext())
{
return await context.SomeDbSet.FirstOrDefaultAsync(o=>o.Something == true);
}
}
We're having a bit of an internal debate as to whether using await here is the right thing to do or not, because there is nothing executing in this method after the await. We understand that, the way the code is written, it is a necessity: otherwise the context would be disposed of as soon as the method exits, and that would result in an exception. But if we await here, we have to await all the way up (or down, depending on how you look at it) our call stack, and that would result in a number of expensive and somewhat unnecessary context switches.
The other option here is to make the repository methods synchronous, and do a Task.Run() in the method that calls the repository method, like:
Task.Run(() => MyRepository.GetSomeData());
we can then await this call if we want, or just return the task object again to the caller. The downside here is the call to the database then becomes synchronous and one thread from the pool is blocking for the entire length of the database call.
So this comes down to what's more expensive? Unnecessary context switches via await or having threads block? It seems that there is no right answer, but is there a best practice?
Any thoughts would be appreciated.
You should, of course, use the async version.
As you said, if you don't await you will dispose of the context before the operation completed, but that doesn't mean the calling methods need to use async-await as well. They can return the task just as you mention in the Task.Run option:
public Task<MyObject> FooAsync()
{
// do some stuff
return GetSomeDataAsync();
}
public async Task<MyObject> GetSomeDataAsync()
{
using(var context = new myDBContext())
{
return await context.SomeDbSet.FirstOrDefaultAsync(o=>o.Something == true);
}
}
You mentioned that the cost in this case is some expensive context switches. I'm not sure what you mean by that, but if you're referring to thread context switches then there's only a single one. The calling thread will be released while awaiting the asynchronous operation and a different thread will continue running when that operation completes.
Not only is this negligible compared to the time it takes to execute the actual operation, but if you use Task.Run you have the same context switch anyway, as a blocked thread is taken off the CPU.
Using Task.Run on a synchronous operation is redundant. It's just blocking a thread, and it potentially requires more context switches than the async equivalent.
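The difference between awaiting inside the using block and merely returning the task can be sketched without a database. FakeContext below is a made-up stand-in for a DbContext; its QueryAsync reports whether the context was still alive when the simulated query finished.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a DbContext.
public sealed class FakeContext : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() => Disposed = true;

    // Simulated query: true if the context had not been disposed by the time it finished.
    public async Task<bool> QueryAsync()
    {
        await Task.Delay(50);
        return !Disposed;
    }
}

public static class UsingAwaitDemo
{
    // Returns the task without awaiting: the using block disposes the
    // context while the query is still in flight.
    public static Task<bool> WithoutAwait()
    {
        using (var context = new FakeContext())
        {
            return context.QueryAsync();
        }
    }

    // Awaits inside the using block: disposal waits for the query.
    public static async Task<bool> WithAwait()
    {
        using (var context = new FakeContext())
        {
            return await context.QueryAsync();
        }
    }

    // A caller higher up can pass the task along without async/await of its own.
    public static Task<bool> Caller() => WithAwait();

    public static async Task Main()
    {
        Console.WriteLine(await WithoutAwait()); // False
        Console.WriteLine(await Caller());       // True
    }
}
```

Only the method that owns the using block needs the await; Caller shows a layer above it returning the task untouched.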
There are many kinds of “context” in the .NET Framework: LogicalCallContext, SynchronizationContext, HostExecutionContext, SecurityContext, ExecutionContext, etc. SynchronizationContext is captured when using async/await, but it is not the only context that gets captured. Along with SynchronizationContext, ExecutionContext also gets captured. ExecutionContext consists of SecurityContext, LogicalCallContext, etc.
Async code always runs against the captured ExecutionContext. When an await completes, if there was a current SynchronizationContext that got captured, the continuation representing the remainder of the asynchronous method is posted to that SynchronizationContext.
So when executing code under Task.Run, it's only the SynchronizationContext that will not get captured; ExecutionContext will still get captured in any case. You can get the same behaviour of not capturing the SynchronizationContext with async/await by using ConfigureAwait(false) when awaiting. When the await completes, the SynchronizationContext will be ignored and the framework will attempt to continue execution wherever the previous asynchronous operation completed, which is exactly what you want here.
So in your scenarios I think you should be using async/await with ConfigureAwait(false), as there would not be any overhead of SynchronizationContext in this case and at the same time there would be no overhead of blocking any thread.
The following post may be helpful to get more insight: https://msdn.microsoft.com/en-us/magazine/hh456402.aspx
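As a rough illustration of the distinction drawn above, the sketch below uses AsyncLocal<T> (which travels with the ExecutionContext) and checks SynchronizationContext.Current inside Task.Run. The class name, the "request-42" value and the plain SynchronizationContext installed on the calling thread are all assumptions made for the demo.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContextFlowDemo
{
    // AsyncLocal values are carried by the ExecutionContext.
    private static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    public static Task<(string ambient, bool hasSyncContext)> ObserveInTaskRun()
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        // Pretend the caller has a SynchronizationContext, as a UI or classic
        // ASP.NET request thread would.
        SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
        try
        {
            // Note: set in a non-async method, so the value stays set for the caller too.
            Ambient.Value = "request-42";
            return Task.Run<(string ambient, bool hasSyncContext)>(
                () => (Ambient.Value, SynchronizationContext.Current != null));
        }
        finally
        {
            // Restore the calling thread's original context.
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static async Task Main()
    {
        var seen = await ObserveInTaskRun();
        Console.WriteLine(seen.ambient);        // request-42
        Console.WriteLine(seen.hasSyncContext); // False
    }
}
```

The AsyncLocal value reaches the thread-pool thread, while the SynchronizationContext set on the calling thread does not.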

What happens while waiting on a Task's Result?

I'm using the HttpClient to post data to a remote service in a .NET 4.0 project. I'm not concerned with this operation blocking, so I figured I could skip ContinueWith or async/await and use Result.
While debugging, I ran into an issue where the remote server wasn't responsive. As I stepped through the code, it seemed like my code just stopped running on the third line... the current stack pointer line stopped being highlighted yellow, and didn't advance to the next line. It just disappeared. It took me a while to realize that I should wait for the request to timeout.
var client = new HttpClient();
var task = client.PostAsync("http://someservice/", someContent);
var response = task.Result;
My understanding was that calling Result on the Task caused the code to execute synchronously, to behave more like this (I know there is no Post method in the HttpClient):
var client = new HttpClient();
var response = client.Post("http://someservice/", someContent);
I'm not sure this is a bad thing, I'm just trying to get my head around it. Is it really true that by virtue of the fact that the HttpClient is returning Tasks instead of the results directly, my application is automatically taking advantage of asynchrony even when I think I'm avoiding it?
In Windows, all I/O is asynchronous. Synchronous APIs are just a convenient abstraction.
So, when you use HttpWebRequest.GetResponse, what actually happens is the I/O is started (asynchronously), and the calling thread (synchronously) blocks, waiting for it to complete.
Similarly, when you use HttpClient.PostAsync(..).Result, the I/O is started (asynchronously), and the calling thread (synchronously) blocks, waiting for it to complete.
I usually recommend people use await rather than Task.Result or Task.Wait for the following reasons:
If you block on a Task that is the result of an async method, you can easily get into a deadlock situation.
Task.Result and Task.Wait wrap any exceptions in an AggregateException (because those APIs are holdovers from the TPL). So error handling is more complex.
However, if you're aware of these limitations, there are some situations where blocking on a Task can be useful (e.g., in a Console application's Main).
Capturing the result of a task blocks the current thread. There is no point in using an async version of a method in this case. Post() and PostAsync().Result will both block.
If you want to make use of concurrency, you should write it as such:
async Task<HttpResponseMessage> PostContent()
{
    var client = new HttpClient();
    HttpResponseMessage response = await client.PostAsync("http://someservice/", someContent);
    // Code after this line runs once PostAsync has completed.
    return response;
}
Since PostContent() itself returns a Task, the method calling it should also await.
async void ProcessResult()
{
var result = await PostContent();
//Do work with the result when the result is ready
}
For instance, if you call ProcessResult() in a button click handler, you see that the UI is still responsive, other controls still function.
