How does using await make code thread safe?
Link: https://learn.microsoft.com/en-us/ef/core/dbcontext-configuration/#avoiding-dbcontext-threading-issues
Entity Framework Core does not support multiple parallel operations
being run on the same DbContext instance. This includes both parallel
execution of async queries and any explicit concurrent use from
multiple threads. Therefore, always await async calls immediately, or
use separate DbContext instances for operations that execute in
parallel.
Say one DB call is in progress and thread 1 is handling it. While that call is in flight:
Case 1:
Assume I am not using async. Another thread makes a parallel call, so there will be a clash and an error.
Case 2:
Assume I use async. Another thread makes a parallel call:
Case 2.a: since we use async, when the first thread makes the request and is waiting for the response, it releases the thread, so thread 2 won't clash when making its request. I get this point.
Case 2.b: what if both threads really make their requests at the same time?
Is case 2.b possible, and is using async of any help here?
"always await async calls immediately" means to avoid starting multiple async operations and waiting for them to finish later (as one would do for multiple HTTP requests).
Here is a typical example of what one should do for async HTTP calls but which is invalid for EF operations:
// start operations early so results are there after our CPU-intensive code completes
var task1 = dbContext.GetSomething();
var task2 = dbContext.GetMore();
// do something slow here to let both operations finish
var result1 = await task1;
// even more code
var result2 = await task2;
This code lets both operations run in parallel, potentially causing issues with dbContext.
The guidance recommends:
var result1 = await dbContext.GetSomething();
var result2 = await dbContext.GetMore();
// run slow code now, sequentially, to avoid parallel calls to dbContext.
Indeed, if you are careful you can interleave calls on the DbContext with operations on unrelated objects in a similar way, i.e. run an HTTP call and a DbContext call in parallel, or run a DbContext call in parallel with CPU-intensive code and await the EF call late.
Note that the guidance can't help much with sharing context between two threads - as with any non-thread-safe object it is your responsibility to prevent simultaneous access to such shared objects.
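To make the difference concrete, here is a minimal sketch that does not use EF Core at all. FakeContext is a hypothetical stand-in for a DbContext: it is not thread safe and, like EF Core's own concurrency detection, it throws an InvalidOperationException when a second operation starts while the first is still in flight. All names below are assumptions for illustration only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a DbContext: it allows one operation at a time
// and throws if a second one starts while the first is still in flight.
public sealed class FakeContext
{
    private int _busy; // 0 = idle, 1 = operation in flight

    public async Task<int> QueryAsync(int value)
    {
        if (Interlocked.Exchange(ref _busy, 1) == 1)
            throw new InvalidOperationException("A second operation started before the first completed.");
        try
        {
            await Task.Delay(50); // simulated database round trip
            return value;
        }
        finally
        {
            Volatile.Write(ref _busy, 0);
        }
    }
}

public static class ContextDemo
{
    // Each call is awaited before the next one starts, so the operations never overlap.
    public static async Task<int> SequentialAsync()
    {
        var context = new FakeContext();
        int a = await context.QueryAsync(1);
        int b = await context.QueryAsync(2);
        return a + b;
    }

    // Both operations are started before either is awaited, so they overlap
    // on the same context and the second one fails.
    public static async Task<bool> OverlappingThrowsAsync()
    {
        var context = new FakeContext();
        Task<int> first = context.QueryAsync(1);
        Task<int> second = context.QueryAsync(2);
        try
        {
            await Task.WhenAll(first, second);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Note that everything here runs from a single logical flow; no second thread is needed to get overlapping operations, which is why the guidance is phrased in terms of awaiting immediately.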
The thing you're missing is that in the common case of ASP.NET Core, the DbContext is scoped to the HTTP request, and the framework guarantees that there will be only one middleware or controller method running for a request at any one time. So awaiting the async calls is sufficient to prevent parallel DbContext access.
Related
We have an async/await method GetLoanDataAsync, which calls a stored procedure through Entity Framework and is itself called by a sync method GetData.
GetLoanData takes a good amount of time to execute, which is probably why it was written as async/await; it is used in multiple places.
I understand that we shouldn't mix async and sync calls, but let's say we have this scenario and we are using Task.Run() to call the async method GetLoanDataAsync from the sync method GetData. As I understand it, GetLoanDataAsync will then run on a background thread.
My question is: what if we had a sync version of GetLoanDataAsync and called it using Task.Run() from GetData? What difference would it make in this case?
Providing more details regarding the issue-
We have an ASP.NET REST Web API whose return type is not a Task. It is called from an Angular app. The API has a few methods like GetData(), where we wait for the result from GetLoanDataAsync. As I understand it, GetLoanDataAsync will be called on a background thread, GetUserDetails() will be able to execute in the meantime, and once that finishes, GetData will return the result from GetLoanDataAsync.
Code -
public List<int> GetData(int id)
{
// Calls GetLoanDataAsync
var result = Task.Run(()=> GetLoanDataAsync(id));
// calls couple other sync methods
GetUserDetails();
return result.GetAwaiter().GetResult();
}
GetLoanDataAsync().Result results in a deadlock, which was the earlier issue. To fix this we are temporarily using Task.Run until we make the overall API async.
If you compare a sync vs async version, then the most obvious difference is that the sync version is tying up a thread for the duration of the DB call, which seems unnecessary if all the work is at the database. Whether this is a problem depends on your throughput.
By using Task.Run, you're allowing that operation to run in parallel with other work. This is fine, as long as you don't have any touch-points that impact thread-safety, or which depend on a single logical execution flow.
If you want to use async effectively, it is best to consider it "sticky" - i.e. if X is async, then everything that calls X (directly or indirectly) needs to be async.
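As a rough sketch of what "async all the way" could look like for the GetData example, assuming the API method is allowed to return a Task: the method bodies below are hypothetical stand-ins (a delay instead of the stored procedure call), not the asker's real implementation.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LoanService
{
    // Hypothetical stand-in for the stored-procedure call made through Entity Framework.
    private static async Task<List<int>> GetLoanDataAsync(int id)
    {
        await Task.Delay(100); // simulated database latency
        return new List<int> { id, id + 1 };
    }

    // Hypothetical stand-in for the other synchronous work.
    private static void GetUserDetails()
    {
    }

    // Async version of GetData: no Task.Run and no blocking on the result.
    public static async Task<List<int>> GetDataAsync(int id)
    {
        Task<List<int>> loanTask = GetLoanDataAsync(id); // starts the IO, returns at the first await
        GetUserDetails();                                // runs while the IO is pending
        return await loanTask;                           // frees the thread until the data arrives
    }
}
```

The overlap between GetLoanDataAsync and GetUserDetails is preserved, but no thread is parked waiting for the database.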
I've been struggling for some days now with where to use await and where not to.
I have a Repository class which fetches the data from the database.
Using Entity Framework, the code would be something like this:
public async Task<List<Object>> GetAsync()
{
return await context.Set<Object>().ToListAsync();
}
and the consumer:
var data = await GetAsync();
and at the top level I'm awaiting this method too.
Should I use await on only one of these methods?
Is there a performance penalty in resource usage, and is a new thread created each time you await?
I have checked the questions listed in the comments; they do not refer to the performance issues and just say that you can do it. I want the best practice and the reason why to do so or not.
I'd like to add to that.
There are some async methods where there is no need to use async/await keywords. It is important to detect this kind of misuse because adding the async modifier comes at a price.
E.G. You don't need async/await keywords in your example.
public Task<List<Object>> GetAsync()
{
return context.Set<Object>().ToListAsync();
}
And then:
var data = await GetAsync();
Will be just fine. In this case, you are returning the Task<List<Object>> and then you are awaiting that in the place you directly work with objects.
I recommend installing async await helper
Let me get the essence of your question first: the confusion is about where to use async in the complete chain of calls and where not to, and how to assess the performance impact, since it may seem to lead to the creation of more threads. If the question goes beyond this, add details in the comments and I will try to answer them too.
Let's divide and tackle each of them one by one.
Where to use async in the chain of calls, and where not?
Since you are using Entity Framework to access a database, I can safely assume you are doing IO-based asynchronous processing, which is the most prominent use case for async across languages and frameworks. Use cases for CPU-based asynchronous processing are relatively limited (I will explain them too).
Async is a scalability feature, especially for IO processing, rather than a performance feature. In simple words, with async processing a hosted server can cater to many times more IO calls, because the calls do not block: they just hand the request over the network, and the processing thread goes back to the pool, ready to serve another request. The hand-over takes a few milliseconds.
When the processing is complete, a software thread just needs to receive the result and pass it back to the client. That again takes a few milliseconds, often under 1 ms if it is a pure pass-through IO call with no logic.
What are Benefits
Imagine instead making a synchronous IO call to a database, where each thread involved just waits for the result to arrive, which may take a few seconds. The impact will be highly negative: in general, depending on thread pool size, you may serve 25 - 50 requests at most, and even that relies on the number of cores available. The threads hold resources while they sit idle waiting for the response.
With synchronous calls there is no way to serve 1000+ requests in the same setup, and I am being extremely conservative. Async can have a huge scalability impact: for lightweight calls, a single hosted process may serve millions of requests with ease.
After the background, where to use the Async in complete chain
Everywhere feasible, from beginning to end: from the entry point down to the actual exit point that makes the IO call, since that is the call that actually relieves the pool thread as it dispatches the request over the network.
Do remember, though, that await at a given point doesn't let the rest of the same method run until the awaited task completes, even though it relieves the thread. So if there are multiple independent calls, it is better to aggregate them using Task.WhenAll and await the representative task, which completes when all of them have finished, whether in success or error.
If the async chain is broken in between by something like Task.Wait or Task.Result, it is no longer a pure async call and will block the calling thread pool thread.
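A minimal sketch of aggregating independent calls with Task.WhenAll. FetchAsync is a hypothetical helper in which a delay stands in for an independent IO call; this pattern applies to calls that do not share a non-thread-safe object such as a single DbContext.

```csharp
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical independent IO call; the delay stands in for network latency.
    private static async Task<int> FetchAsync(int value)
    {
        await Task.Delay(100);
        return value;
    }

    // Starts all three calls first, then awaits the single aggregate task.
    public static async Task<int[]> FetchAllAsync()
    {
        Task<int> a = FetchAsync(1);
        Task<int> b = FetchAsync(2);
        Task<int> c = FetchAsync(3);

        // Completes when all three have finished; results keep the argument order.
        return await Task.WhenAll(a, b, c);
    }
}
```

Because the three calls are in flight at the same time, the total wait is roughly that of the slowest call rather than the sum of all three.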
How can Async be further enhanced ?
In pure library calls, where async is initiated on the thread pool, the dispatching thread can differ from the receiving one, and the call doesn't need to re-enter the same context, you should use ConfigureAwait(false). It means the continuation will not wait to re-enter the original context, which is a performance boost.
Like await, it makes sense to use ConfigureAwait(false) across the chain, from entry to end. This is valid only for libraries which rely extensively on thread pools.
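A short sketch of what that can look like in library code. LibraryCode and its methods are hypothetical, and the delays stand in for real IO. ConfigureAwait(false) tells each await not to capture the caller's context for the continuation.

```csharp
using System.Threading.Tasks;

public static class LibraryCode
{
    // Hypothetical public library method.
    public static async Task<int> LoadValueAsync()
    {
        // Every await in the chain opts out of resuming on the captured context.
        await Task.Delay(50).ConfigureAwait(false);
        int value = await ComputeAsync().ConfigureAwait(false);
        return value * 2;
    }

    // Hypothetical inner call, configured the same way.
    private static async Task<int> ComputeAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
        return 21;
    }
}
```

Application-level code that must get back to a specific context after the await (for example, to touch UI controls) should not use this.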
Is there a Thread created
Best read this, Stephen Cleary - There's no thread
A genuine IO async call uses hardware-based concurrency to do its processing and does not block the software threads.
Variations
CPU-based asynchronous processing, where you move work into the background because the current thread needs to stay responsive, mostly in the case of UI frameworks like WPF.
Use cases
All kinds of systems, especially non-MS frameworks like Node.js, have async processing as an underlying principle, and the database server cluster on the receiving end is tuned to receive millions of calls and process them.
B2C calls, where each request is expected to be lightweight with a limited payload.
Edit 1:
Just in this specific case, ToListAsync is already asynchronous, so you can skip async/await here, as noted in various comments. Do review Stephen Cleary's article, though: in general that may not be a very good strategy, since the gains are minimal and the negative impact of incorrect usage can be high.
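One commonly cited way that skipping async/await goes wrong is combining it with a using block. The sketch below is an illustration under assumptions: Resource is a hypothetical disposable type, and the delay stands in for real IO.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical disposable resource with an async read.
public sealed class Resource : IDisposable
{
    public bool Disposed { get; private set; }

    public async Task<bool> ReadAsync()
    {
        await Task.Delay(50);
        return !Disposed; // true only if the resource was still alive when the read finished
    }

    public void Dispose()
    {
        Disposed = true;
    }
}

public static class ElidingDemo
{
    // async/await skipped: the using block ends as soon as the task is returned,
    // so the resource is disposed while the read is still in progress.
    public static Task<bool> ElidedAsync()
    {
        using (var resource = new Resource())
        {
            return resource.ReadAsync();
        }
    }

    // async/await kept: the resource is disposed only after the read completes.
    public static async Task<bool> AwaitedAsync()
    {
        using (var resource = new Resource())
        {
            return await resource.ReadAsync();
        }
    }
}
```

Here ElidedAsync yields false and AwaitedAsync yields true. Returning the task directly is safe only when the method does nothing after the call, as in the GetAsync example above.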
I know this question might be a bit trivial, but all the answers I find on the internet leave me confused.
I'm aware of the basic principles of how async/await works (how await asynchronously waits for the task to complete without blocking the main thread),
but I don't understand its real benefit, because it seems to me that everything you do with async/await you can do using the Task Parallel Library.
Please consider this example, to better understand what I mean:
Let's say I have a SuperComplexMethod that returns some value and I would like to execute it in parallel, meanwhile doing some other things. Normally I would do it this way:
internal class Program
{
private static void Main()
{
//I will start a task first that will run asynchronously
var task = Task.Run(() => SuperComplexMethod());
//Then I will be doing some other work, and then get the result when I need it
Console.WriteLine("Doing some other work...");
var result = task.Result;
}
static string SuperComplexMethod()
{
Console.WriteLine("Doing very complex calculations...");
Thread.Sleep(3000);
return "Some result";
}
}
Here how I would have to do it using async/await:
internal class Program
{
private static void Main()
{
var task = SuperComplexMethodAsync();
Console.WriteLine("Doing some other work...");
var result = task.Result;
}
//I have to create this async wrapper that can wait for the task to complete
async static Task<string> SuperComplexMethodAsync()
{
return await Task.Run(() => SuperComplexMethod());
}
static string SuperComplexMethod()
{
Console.WriteLine("Doing very complex calculations...");
Thread.Sleep(3000);
return "Some result";
}
}
As you can see in the second example in order to use async/await approach, I have to create a wrapper method that starts a task and asynchronously waits for it to complete. Obviously it seems redundant to me, because I can achieve the very same behavior without using this wrapper marked async/await.
Can you please explain to me what is so special about async/await, and what actual benefits it provides over using the tools of the Task Parallel Library alone?
Arguably the main reason to use async/await is thread sparing. Imagine the following scenario (I'll simplify to make the point): a) you have a web application that has 10 threads available to process incoming requests; b) all requests involve I/O (e.g. connecting to a remote database, connecting to upstream network services via HTTP/SOAP) to process/complete; c) each request takes 2 seconds to process.
Now imagine 20 requests arrive at about the same time. Without async/await, your web app would start to process the first 10 requests. While this is happening the other 10 would just sit in the queue for 2 seconds, with your web app out of threads and hence unable to process them. Only when the first 10 complete would the second 10 begin to be processed.
Under async/await, the first 10 requests would instead begin tasks, and, while awaiting those tasks, the threads that were processing them would be returned to the web app to process other requests. So your web app would begin processing the second 10 almost straight away, rather than waiting. As each of the awaited tasks from the first 10 completes, the web app would continue processing the rest of their methods, either on a thread-pool thread or one of the web app's threads (which it is depends on how you configure things). We can usually expect in an I/O scenario that the I/O is by far the bulk of the duration of the call, so we can make a reasonable assumption that in the above scenario, the network/database call might take 1.9s and the rest of the code (adapting DTOs, some business logic, etc.) might take 0.1s. If we assume the continuation (after the await) is processed by a web app thread, that thread is now only tied up for 0.1 of the 2 seconds, instead of the full 2 seconds in the non async/await scenario.
You might naturally think: well I've just pushed the threads out of one pool of threads and into another, and that will eventually fill up too. To understand why this isn't really true in practise in truly async scenarios, you need to read There Is No Thread.
The upshot is that you are now able to concurrently process many more requests than you have threads available to process them.
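A rough way to see this for yourself, with simulated requests: HandleRequestAsync is a hypothetical handler in which a 200 ms Task.Delay stands in for the database or HTTP call. Task.Delay is timer-based, so no thread is held while it is pending.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ThreadSparingDemo
{
    // Hypothetical request handler; the delay stands in for remote IO.
    private static async Task<int> HandleRequestAsync(int id)
    {
        await Task.Delay(200); // no thread is blocked while this is pending
        return id;
    }

    // Starts all requests, awaits them together, and returns the elapsed wall-clock time.
    public static async Task<TimeSpan> HandleManyAsync(int count)
    {
        var stopwatch = Stopwatch.StartNew();
        Task<int>[] requests = Enumerable.Range(0, count)
            .Select(id => HandleRequestAsync(id))
            .ToArray();
        await Task.WhenAll(requests);
        return stopwatch.Elapsed;
    }
}
```

Calling HandleManyAsync(1000) should take far less than the 200 seconds that handling the requests one after another would need, even though the process has nowhere near 1000 threads.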
You'll notice the above is focused on I/O, and that's really where async/await shines. If your web app instead processed requests by performing complex mathematical calculations using the CPU, you would not see the above benefit, hence why async/await is not really suited for nor intended for use with CPU-bound activities.
Before others jump in with all the exceptions to the rules (and there are some), I'm only presenting a vanilla simplified scenario to show the value of async/await in I/O-bound scenarios. Covering everything about async/await would create a very long answer (and this one is long enough already!)
I should also add that there are other ways to process web requests asynchronously, ways that pre-date async/await, but async/await very significantly simplifies the implementation.
--
Moving briefly to say a WinForms or similar app, the scenario is very similar, except now you really only have one thread available to process UI requests, and any time you hold onto that thread, the UI will be unresponsive, so you can use a similar approach to move long-running operations off the UI thread. In the UI scenario, it becomes more reasonable to perform CPU-bound operations off the UI thread as well. When doing this, a thread pool thread will instead perform that CPU work, freeing up the UI thread to keep the UI responsive. Now there is a thread, but at least it's not the UI one. This is generally called "offloading", which is one of the other primary uses for async/await.
--
Your example is a console app - there's often not a lot to be gained in that context, except for the ability to fairly easily (arguably more easily than creating your own threads) execute several requests concurrently on the thread pool.
When using async and await the compiler generates a state machine in the background
public async Task MyMethodAsync()
{
Task<int> longRunningTask = LongRunningOperationAsync();
// independent work which doesn't need the result of LongRunningOperationAsync can be done here
//and now we call await on the task
int result = await longRunningTask;
//use the result
Console.WriteLine(result);
}
public async Task<int> LongRunningOperationAsync() // assume we return an int from this long running operation
{
await Task.Delay(1000); // 1 second delay
return 1;
}
So what happens here:
Task<int> longRunningTask = LongRunningOperationAsync(); starts executing LongRunningOperationAsync.
Independent work is done on, let's assume, the main thread (thread ID = 1); then await longRunningTask is reached.
Now, if the longRunningTask hasn't finished and it is still running, MyMethodAsync() will return to its calling method, thus the main thread doesn't get blocked. When the longRunningTask is done then a thread from the ThreadPool (can be any thread) will return to MyMethodAsync() in its previous context and continue execution (in this case printing the result to the console).
A second case would be that the longRunningTask has already finished its execution and the result is available. When reaching the await longRunningTask we already have the result so the code will continue executing on the very same thread. (in this case printing result to console). Of course this is not the case for the above example, where there's a Task.Delay(1000) involved.
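Both cases can be observed with a small sketch. The class and method names are hypothetical; LongRunningOperationAsync mirrors the one above.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitBehaviorDemo
{
    private static async Task<int> LongRunningOperationAsync()
    {
        await Task.Delay(1000); // 1 second delay
        return 1;
    }

    // First case: the task is still running when the call returns,
    // so the caller gets control back before the operation has completed.
    public static async Task<bool> CallerGetsControlBackAsync()
    {
        Task<int> pending = LongRunningOperationAsync();
        bool returnedBeforeCompletion = !pending.IsCompleted;
        int result = await pending;
        return returnedBeforeCompletion && result == 1;
    }

    // Second case: the task has already finished, so await does not suspend
    // and execution continues synchronously on the very same thread.
    public static async Task<bool> CompletedTaskStaysOnThreadAsync()
    {
        int before = Environment.CurrentManagedThreadId;
        int result = await Task.FromResult(1);
        int after = Environment.CurrentManagedThreadId;
        return before == after && result == 1;
    }
}
```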
For more, refer to:
Async/Await - Best Practices
Simplifying Asynchronous
This question has probably been asked already, but I never found a definitive answer. Let's say that I have a Web API 2.0 application hosted on IIS. I think I understand that the best practice (to prevent deadlocks on the client) is to always use async methods from the GUI event down to the HttpClient calls. This is good and it works. But what is the best practice when the client application does not have a GUI (e.g. Windows Service, Console Application) and only has synchronous methods from which to make the call? In this case, I use the following logic:
void MySyncMethodOnMyWindowServiceApp()
{
    // Result is a property, not a method; this blocks until the task completes
    list = GetDataAsync().Result.ToObject<List<MyClass>>();
}
async Task<JArray> GetDataAsync()
{
    var response = await Client.GetAsync(<...>).ConfigureAwait(false);
    return await response.Content.ReadAsAsync<JArray>().ConfigureAwait(false);
}
But unfortunately this can still cause deadlocks on the client that occur at random times on random machines.
The client app stops at this point and never returns:
var response = await Client.GetAsync(<...>).ConfigureAwait(false);
If it's something that can be run in the background and isn't forced to be synchronous, try wrapping the code (that calls the async method) in a Task.Run(). I'm not sure that'll solve a "deadlock" problem (if it's something out of sync, that's another issue), but if you want to benefit from async/await, if you don't have async all the way down, I'm not sure there's a benefit unless you run it in a background thread. I had a case where adding Task.Run() in a few places (in my case, from an MVC controller which I changed to be async) and calling async methods not only improved performance slightly, but it improved reliability (not sure that it was a "deadlock" but seemed like something similar) under heavier load.
You will find that using Task.Run() is regarded by some as a bad way to do it, but I really couldn't see a better way to do it in my situation, and it really did seem to be an improvement. Perhaps this is one of those things where there's the ideal way to do it vs. the way to make it work in the imperfect situation that you're in. :-)
[Updated due to requests for code]
So, as someone else posted, you should do "async all the way down". In my case, my data wasn't async, but my UI was. So, I went async down as far as I could, then I wrapped my data calls with Task.Run in such a way that it made sense. That's the trick, I think: figuring out whether it makes sense for things to run in parallel. Otherwise, you're just being synchronous (if you use async and immediately resolve it, forcing it to wait for the answer). I had a number of reads that I could perform in parallel.
In the above example, I think you have to go async up as far as makes sense, and then at some point determine where you can spin off a thread and perform the operation independently of the other code. Let's say you have an operation that saves data, but you don't really need to wait for a response -- you're saving it and you're done. The only thing you might have to watch out for is not to close the program without waiting for that thread/task to finish. Where it makes sense in your code is up to you.
Syntax is pretty easy. I took existing code, changed the controller to an async returning a Task of my class that was formerly being returned.
var myTask = Task.Run(() =>
{
//...some code that can run independently.... In my case, loading data
});
// ...other code that can run at the same time as the above....
await Task.WhenAll(myTask, otherTask);
//..or...
await myTask;
//At this point, the result is available from the task
myDataValue = myTask.Result;
See MSDN for probably better examples:
https://msdn.microsoft.com/en-us/library/hh195051(v=vs.110).aspx
[Update 2, more relevant for the original question]
Let's say that your data read is an async method.
private async Task<MyClass> Read()
You can call it, save the task, and await on it when ready:
var runTask = Read();
//... do other code that can run in parallel
await runTask;
So, for this purpose, calling async code, which is what the original poster is requesting, I don't think you need Task.Run(), although I don't think you can use "await" unless you're an async method -- you'll need an alternate syntax for Wait.
The trick is that without having some code to run in parallel, there's little point in it, so thinking about multi-threading is still the point.
Using Task<T>.Result is the equivalent of Wait which will perform a synchronous block on the thread. Having async methods on the WebApi and then having all the callers synchronously blocking them effectively makes the WebApi method synchronous. Under load you will deadlock if the number of simultaneous Waits exceeds the server/app thread pool.
So remember the rule of thumb "async all the way down". You want the long running task (getting the collection) to be async. If the calling method must be sync, you want to make that conversion from async to sync (using either Result or Wait) as close to the "ground" as possible. Keep the long running process async and keep the sync portion as short as possible. That will greatly reduce the length of time that threads are blocked.
So for example you can do something like this.
void MySyncMethodOnMyWindowServiceApp()
{
    List<MyClass> myClasses = GetMyClassCollectionAsync().Result;
}
async Task<List<MyClass>> GetMyClassCollectionAsync()
{
    var data = await GetDataAsync(); // <- long running call to remote WebApi?
    return data.ToObject<List<MyClass>>();
}
The key part is the long running task remains async and not blocked because await is used.
Also don't confuse responsiveness with scalability. Both are valid reasons for async. Yes, responsiveness is a reason for using async (to avoid blocking the UI thread). You are correct that this wouldn't apply to a back end service; however, this isn't why async is used on a WebApi. The WebApi is also a non-GUI back end process. If the only advantage of async code was responsiveness of the UI layer, then WebApi would be sync code from start to finish. The other reason for using async is scalability (avoiding deadlocks), and this is the reason why WebApi calls are plumbed async. Keeping the long running processes async helps IIS make more efficient use of a limited number of threads. By default there are only 12 worker threads per core. This can be raised, but that isn't a magic bullet either, as threads are relatively expensive (about 1MB overhead per thread). await allows you to do more with less: more concurrent long running processes on fewer threads before a deadlock occurs.
The problem you are having with deadlocks must stem from something else. Your use of ConfigureAwait(false) prevents deadlocks here. Solve the bug and you are fine.
See Should we switch to use async I/O by default? to which the answer is "no". You should decide on a case by case basis and choose async when the benefits outweigh the costs. It is important to understand that async IO has a productivity cost associated with it. In non-GUI scenarios only a few targeted scenarios derive any benefit at all from async IO. The benefits can be enormous, though, but only in those cases.
Here's another helpful post: https://stackoverflow.com/a/25087273/122718
How do asynchronous tasks (async/await) work in .NET 4.5?
Some sample code:
private async Task<bool> TestFunction()
{
var x = await DoesSomethingExists();
var y = await DoesSomethingElseExists();
return y;
}
Does the second await statement get executed right away or after the first await returns?
await pauses the method until the operation completes. So the second await would get executed after the first await returns.
For more information, see my async / await intro or the official FAQ.
It executes after the first await returns. If this thing confuses you, try to play around with breakpoints - they are fully supported by the new async pattern.
Imagine it would look like this:
var x = await GetSomeObjectInstance();
var y = await GetSomeObjectInstance2(x);
If the second call did not wait, a NullReferenceException would probably occur somewhere, so the first await has to return first. Otherwise, x would be null/undefined or whatever.
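If you would rather verify the ordering in code than with breakpoints, a small sketch like the following records what happens. StepAsync is a hypothetical helper; the delay stands in for real asynchronous work.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAwaitDemo
{
    // Hypothetical async step that logs when it starts and when it finishes.
    private static async Task StepAsync(string name, List<string> log)
    {
        log.Add(name + " started");
        await Task.Delay(50);
        log.Add(name + " finished");
    }

    // The second step is not even started until the first await has completed.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        await StepAsync("first", log);
        await StepAsync("second", log);
        return log;
    }
}
```

The returned log reads "first started", "first finished", "second started", "second finished", in that order.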
The methods calls will still occur sequentially just like "regular", non awaited method calls. The purpose of await is that it will return the current thread to the thread pool while the awaited operation runs off and does whatever.
This is particularly useful in high performance environments, say a web server, where a given request is processed on a given thread from the overall thread pool. If we don't await, then the thread processing the request (and all its resources) remains "in use" while the db / service call completes. This might take a couple of seconds or more, especially for external service calls.
Now in low traffic websites this is not much of an issue but in high traffic sites the cost of all these request threads just sitting around, doing nothing, in an "in-use" state, waiting for other processes like those db /service calls to return can be a resource burden.
We are better off releasing the thread back to the worker pool to allow it do other useful work for some other request.
Once the db / service call completes, we can then interrupt the thread pool and ask for a thread to carry on processing that request from where it left off. At that point the state of the request is reloaded and the method call continues.
So on a per-request basis, when using await, the request will still take the same amount of time from the user's perspective... plus a tiny smidge more for the switching overhead.
But in the aggregate, across all requests for all users, things can seem more performant to all users as the web server (in this case) runs more efficiently with better resource utilization. i.e. it either doesn't have to queue up requests waiting for free threads to process requests because await is returning them or alternatively we don't have to buy more hardware because we are using the same amount of hardware, more efficiently, to obtain higher throughputs.
There is a switching cost to this though so despite what you see in the default templates and in many docs you shouldn't just blindly use await for every single call. It's just a tool and like all tools it has its place. If the switching cost is not less than the cost of just completing your calls synchronously then you shouldn't use await.