I want to perform an asynchronous DB query in C# that calls a stored procedure for a backup. Since we use Azure, this takes about 2 minutes, and we don't want the user to wait that long.
So the idea is to make it asynchronous, so that the task continues to run after the request has finished.
[HttpPost]
public ActionResult Create(Snapshot snapshot)
{
    db.Database.CommandTimeout = 7200;
    Task.Run(() => db.Database.ExecuteSqlCommandAsync("EXEC PerformSnapshot @User = '" + CurrentUser.AccountName + "', @Comment = '" + snapshot.Comment + "';"));
    this.ShowUserMessage("Your snapshot has been created.");
    return this.RedirectToActionImpl("Index", "Snapshots", new System.Web.Routing.RouteValueDictionary());
}
I'm afraid that I haven't understood the concept of asynchronous tasks. The query will not be executed (or will be aborted?) if I don't use the wait statement. But actually "waiting" is the one thing I especially don't want to do here.
So... why am I forced to use wait here?
Or will the method be started, but killed when the request is finished?
We don't want the user to wait that long.
async-await won't help you with that. Odd as it may sound, the basic async-await pattern is about implementing synchronous behavior in a non-blocking fashion. It doesn't re-arrange your logical flow; in fact, it goes to great lengths to preserve it. The only thing you've changed by going async here is that you're no longer tying up a thread during that 2-minute database operation, which is a huge win for your app's scalability if you have lots of concurrent users, but doesn't speed up an individual request one bit.
I think what you really want is to run the operation as a background job so you can respond to the user immediately. But be careful - there are bad ways to do that in ASP.NET (e.g. Task.Run) and there are good ways.
Dave, you're not forced to use await here. And you're right - from the user's perspective it will still take 2 minutes. The only difference is that the thread which processes your request can now process other requests while the database does its job. And when the database finishes, a thread will continue processing your request.
Say you have a limited number of threads capable of processing HTTP requests. This async code will help you process more requests per time period, but it won't help the user get the job done faster.
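To make the scalability point concrete, here is a minimal sketch (not from the original answer). Task.Delay stands in for the slow database call, and SimulatedQueryAsync, RunBatchAsync and the numbers are hypothetical. Each simulated request still waits the full delay, yet many of them can be in flight at once without one blocked thread per request.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ScalabilityDemo
{
    // Hypothetical stand-in for a slow, I/O-bound database call.
    // Task.Delay is timer based, so no thread is blocked while it waits.
    public static async Task<int> SimulatedQueryAsync(int requestId, int delayMs)
    {
        await Task.Delay(delayMs);
        return requestId;
    }

    // Starts all "requests" at once and returns how long the whole batch took.
    public static async Task<TimeSpan> RunBatchAsync(int requestCount, int delayMs)
    {
        var stopwatch = Stopwatch.StartNew();
        Task<int>[] requests = Enumerable.Range(0, requestCount)
            .Select(id => SimulatedQueryAsync(id, delayMs))
            .ToArray();
        await Task.WhenAll(requests);
        return stopwatch.Elapsed;
    }
}
```

Calling `await ScalabilityDemo.RunBatchAsync(1000, 200)` should take roughly one delay period rather than 1000 times 200 ms, while no individual request finishes any sooner than its own delay.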
This seems to be down to a misunderstanding as to what async and await do.
async does not mean "run this on a new thread". In essence it acts as a signal to the compiler to build a state machine, so a method like this:
async Task<int> GetMeAnInt()
{
    return await myWebService.GetMeAnInt();
}
sort of (cannot stress this enough), gets turned into this:
Task<int> GetMeAnInt()
{
var awaiter = myWebService.GetMeAnInt().GetAwaiter();
awaiter.OnCompletion(() => goto done);
return Task.InProgress;
done:
return awaiter.Result;
}
MSDN has way more information about this, and there's even some code out there explaining how to build your own awaiters.
async and await at their very core just enable you to write code that uses callbacks under the hood, but in a nice way that tells the compiler to do the heavy lifting for you.
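As a rough illustration of the "callbacks under the hood" idea, the sketch below puts an async method next to a simplified hand-written version built from an awaiter callback and a TaskCompletionSource. FetchNumberAsync and both wrapper names are hypothetical, and the real compiler-generated state machine is more involved than this.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class CallbackDemo
{
    // Hypothetical async source of an int (stands in for the web service call).
    public static async Task<int> FetchNumberAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    // Version 1: what you write with async/await.
    public static async Task<int> WithAwait()
    {
        int value = await FetchNumberAsync();
        return value + 1;
    }

    // Version 2: a simplified hand-written equivalent using a callback.
    public static Task<int> WithCallback()
    {
        var completion = new TaskCompletionSource<int>();
        TaskAwaiter<int> awaiter = FetchNumberAsync().GetAwaiter();
        awaiter.OnCompleted(() =>
        {
            try
            {
                // GetResult returns the value or rethrows the failure.
                completion.SetResult(awaiter.GetResult() + 1);
            }
            catch (Exception ex)
            {
                completion.SetException(ex);
            }
        });
        return completion.Task; // handed back before the work has finished
    }
}
```

Both versions produce a Task<int> that completes with 43; neither one starts a thread of its own.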
If you really want to run something in the background, then you need to use Task:
Task<int> GetMeAnInt()
{
return Task.Run(() => myWebService.GetMeAnInt());
}
OR
Task<int> GetMeAnInt()
{
return Task.Run(async () => await myWebService.GetMeAnInt());
}
The second example uses async and await in the lambda because in this scenario GetMeAnInt on the web service also happens to return Task<int>.
To recap:
async and await just instruct the compiler to do some jiggerypokery
This uses labels and callbacks with goto
Fun fact, this is valid IL but the C# compiler doesn't allow it for your own code, hence why the compiler can get away with the magic but you can't.
async does not mean "run on a background thread"
Task.Run() can be used to queue an arbitrary function to run on a thread pool thread
Task.Factory.StartNew() with TaskCreationOptions.LongRunning can be used to request a dedicated thread to run an arbitrary function
await instructs the compiler that this is the point at which the result of the awaiter for the awaitable (e.g. Task) being awaited is required - this is how it knows how to structure the state machine.
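The recap can be checked with a small sketch. The names below are hypothetical; the idea is only to record which thread runs the code before the first await and to compare that with what Task.Run does.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadDemo
{
    // Thread that ran the part of the async method before its first await.
    public static int ThreadBeforeFirstAwait;

    public static async Task MarkedAsync()
    {
        // Still on the caller's thread: 'async' alone starts no new thread.
        ThreadBeforeFirstAwait = Environment.CurrentManagedThreadId;
        await Task.Delay(10);
    }

    // Returns true when the code before the first await ran on the calling thread.
    public static bool AsyncStartsOnCallerThread()
    {
        int caller = Environment.CurrentManagedThreadId;
        Task pending = MarkedAsync();
        bool sameThread = ThreadBeforeFirstAwait == caller;
        pending.Wait(); // demo harness only; avoid blocking like this in real async code
        return sameThread;
    }

    // Task.Run, by contrast, queues the delegate to the thread pool.
    public static Task<bool> TaskRunUsesPoolThread()
    {
        return Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);
    }
}
```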
As I describe in my MSDN article on async ASP.NET, async is not a silver bullet; it doesn't change the HTTP protocol:
When some developers learn about async and await, they believe it’s a way for the server code to “yield” to the client (for example, the browser). However, async and await on ASP.NET only “yield” to the ASP.NET runtime; the HTTP protocol remains unchanged, and you still have only one response per request.
In your case, you're trying to use a web request to kick off a backend operation and then return to the browser. ASP.NET was not designed to execute backend operations like this; it is only a web tier framework. Having ASP.NET execute work is dangerous because ASP.NET is only aware of work coming in from its requests.
I have an overview of various solutions on my blog. Note that using a plain Task.Run, Task.Factory.StartNew, or ThreadPool.QueueUserWorkItem is extremely dangerous because ASP.NET doesn't know anything about that work. At the very least you should use HostingEnvironment.QueueBackgroundWorkItem so ASP.NET at least knows about the work. But that doesn't guarantee that the work will actually ever complete.
A proper solution is to place the work in a persistent queue and have an independent background worker process that queue. See the Asynchronous Messaging Primer (specifically, your scenario is "Decoupling workloads").
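The shape of that queue-plus-worker design can be sketched as follows. This is only an illustration under loud assumptions: an in-memory System.Threading.Channels channel stands in for the persistent queue (so, unlike a real durable queue, queued work is lost if the process dies), and SnapshotRequest and SnapshotQueue are hypothetical names. On .NET Framework the channels API comes from a NuGet package.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical message describing the work to be done.
public sealed class SnapshotRequest
{
    public string User { get; set; }
    public string Comment { get; set; }
}

public sealed class SnapshotQueue
{
    // ASSUMPTION: an in-memory channel stands in for a durable queue.
    private readonly Channel<SnapshotRequest> _queue =
        Channel.CreateUnbounded<SnapshotRequest>();

    // "Web tier": enqueue the request and return to the user immediately.
    public bool Enqueue(SnapshotRequest request)
    {
        return _queue.Writer.TryWrite(request);
    }

    // Signals that no more work will arrive, letting the worker finish.
    public void Complete()
    {
        _queue.Writer.Complete();
    }

    // "Worker": runs independently of any HTTP request and drains the queue.
    public async Task<List<string>> RunWorkerAsync()
    {
        var processed = new List<string>();
        while (await _queue.Reader.WaitToReadAsync())
        {
            while (_queue.Reader.TryRead(out SnapshotRequest request))
            {
                await Task.Delay(10); // placeholder for the slow stored procedure call
                processed.Add(request.User + ": " + request.Comment);
            }
        }
        return processed;
    }
}
```

In a real deployment the worker would live in a separate process (for example a WebJob or a Windows service) and read from the durable queue, so that recycling the web application cannot lose the work.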
I am trying to understand the difference between these lines of code. I have a MemoryStream object called arContents, and the upload works fine when it is small; I can see the uploaded file.
blockBlob.UploadFromStreamAsync(arContents)
But when the MemoryStream is large, the above code runs without errors but uploads no file. However, when I added a .Wait() call, it worked.
blockBlob.UploadFromStreamAsync(arContents).Wait();
First of all, I would like to understand what the Wait call is doing and why this is an async call. I guess I also want to know the difference between asynchronous and synchronous calls.
Also, I have seen await used as well, in code like this:
await blockBlob.UploadFromStreamAsync(arContents)
What is the difference?
Thanks
I see you tagged multithreading on your question, so I think it is important to note that asynchronous and multi-threading are two completely different things. Read through Microsoft's article called The Task asynchronous programming model in C# for a good explanation of how they are different. The example about cooking really helps make this point clear.
When you call an asynchronous method, the task (whatever that is) will start. But you usually also want to either:
Know that it finished successfully, or
Use the returned result from the task (like, if it was retrieving data)
How and when you do that is when things get interesting. The point of asynchronous programming is to allow the thread to go and do something else while waiting for an operation to complete (a network request, data from the hard drive, etc.) rather than the thread just sitting idle until the task completes.
You have probably experienced programs that completely lock up and don't let you do anything while they are busy. That is what asynchronous programming allows you to avoid.
In your first example, you aren't waiting for the result at all. That's often called "fire and forget". You start the task, then code execution immediately moves on to the next line and you will never know if the task completed successfully.
Using .Wait() is not asynchronous at all. It will lock up the thread until the task finishes. Worse than that, when you try to force an asynchronous method to be synchronous, it can sometimes cause a deadlock that your application cannot recover from. Search for c# async wait deadlock and you'll find many examples and explanations.
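The three variants can be compared in a short sketch. FailingUploadAsync is a hypothetical stand-in for UploadFromStreamAsync that always fails, so you can see where (and whether) the error shows up in each case.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitVersusAwaitDemo
{
    // Hypothetical upload that always fails after a short delay.
    public static async Task FailingUploadAsync()
    {
        await Task.Delay(20);
        throw new InvalidOperationException("upload failed");
    }

    // Fire and forget: the call returns at once, before the failure has happened,
    // and nothing in this method will ever observe it.
    public static bool FireAndForgetLooksFine()
    {
        Task ignored = FailingUploadAsync();
        return !ignored.IsFaulted;
    }

    // .Wait() blocks the thread and wraps the failure in an AggregateException.
    public static string ObserveWithWait()
    {
        try
        {
            FailingUploadAsync().Wait();
            return "no error";
        }
        catch (AggregateException ex)
        {
            return "Wait: " + ex.InnerException.GetType().Name;
        }
    }

    // await frees the thread while waiting and rethrows the original exception.
    public static async Task<string> ObserveWithAwaitAsync()
    {
        try
        {
            await FailingUploadAsync();
            return "no error";
        }
        catch (InvalidOperationException ex)
        {
            return "await: " + ex.Message;
        }
    }
}
```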
Ideally, you will want to use await. This is what happens when you use it:
The method (UploadFromStreamAsync) is called.
When an asynchronous operation starts, UploadFromStreamAsync returns a Task.
The await keyword will look at the Task, and if it is not complete, it will put the rest of the current method on the "to do" list for when the Task does finish and return a new Task to the calling method.
As long as you have used async all the way up the call stack, then at that point, the thread can go and do something else it has on its "to do" list. In ASP.NET it could be working on a new request that came in. In a desktop app it could be responding to user input. That would happen on the same thread.
Whenever that task finishes, then await extracts the returned value from the Task. If the method returned just a Task, then that is similar to void (no return value). If it is Task<T>, then it will return the object of type T. Then your code resumes execution after the await line.
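The steps above can be traced in a small sketch. It is not the blob storage API: a TaskCompletionSource (a hypothetical stand-in for the upload) lets the example decide exactly when the "upload" completes, so the order of the log entries is deterministic in a console-style program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitFlowDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        // Hypothetical stand-in for the upload: completes only when we say so.
        var upload = new TaskCompletionSource<int>();

        async Task UploadAndLogAsync()
        {
            log.Add("1: upload started");
            int bytes = await upload.Task;   // not complete yet, so control returns to the caller
            log.Add("3: upload finished, bytes = " + bytes);
        }

        Task pending = UploadAndLogAsync();  // runs up to the first await, then returns a Task
        log.Add("2: caller keeps going");
        upload.SetResult(1024);              // the awaited operation completes
        await pending;                       // the rest of UploadAndLogAsync has run by now
        return log;
    }
}
```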
That all sounds complicated, but you don't really need to understand completely what it's doing, and that's by design. This feature of C# lets you use asynchronous programming in a way that looks very similar to normal synchronous programming. For example:
public async Task Upload() {
    ...
    await blockBlob.UploadFromStreamAsync(arContents);
}
Will do exactly the same as this:
public void Upload() {
    ...
    blockBlob.UploadFromStream(arContents);
}
They look very similar, except that using async/await will give you the benefits I talked about above and the second will not.
I've been struggling for some days now to figure out where to use await and where not to.
I have a Repository class which fetches the data from the database.
Using Entity Framework, the code would be something like this:
public async Task<List<Object>> GetAsync()
{
return await context.Set<Object>().ToListAsync();
}
and the consumer:
var data = await GetAsync();
and at the top level I'm awaiting this method too.
Should I use await in only one of these methods?
Is there a performance penalty in terms of resources, and is a new thread created each time you use await?
I have checked the questions listed in the comments; they do not refer to the performance issues and just say that you can do it. I want to know the best practice and the reasons why to do so or not.
I'd like to add to that.
There are some async methods where there is no need to use async/await keywords. It is important to detect this kind of misuse because adding the async modifier comes at a price.
E.g. you don't need the async/await keywords in your example.
public Task<List<Object>> GetAsync()
{
return context.Set<Object>().ToListAsync();
}
And then:
var data = await GetAsync();
Will be just fine. In this case, you are returning the Task<List<Object>> and then you are awaiting that in the place you directly work with objects.
I recommend installing async await helper
Let me restate the essence of your question first: the confusion is about where to use async in the complete chain of calls and where not to, and how to assess the performance impact, since you suspect it may create more threads. If there is more to it than this, add details in the comments and I will try to answer them too.
Let's tackle these one by one.
Where to use async in the chain of calls, and where not?
Since you are using Entity Framework to access a database, I can safely assume you are doing I/O-based asynchronous processing, which is the most prominent use case for async across languages and frameworks. Use cases for CPU-based asynchronous processing are relatively limited (I will explain them too).
Async is a scalability feature, especially for I/O processing, rather than a performance feature. In simple words, with async processing a hosted server can handle many times more I/O calls, since the calls are not blocking: they just hand the request over to the network while the thread goes back to the pool, ready to serve another request. The hand-over takes only a few milliseconds.
When the processing is complete, a software thread just needs to receive the result and pass it back to the client, which again takes a few milliseconds, and often less than 1 ms for a pure pass-through call with no logic.
What are the benefits?
Imagine making a synchronous I/O call to a database instead, where each thread involved just waits for the result to arrive, which may take a few seconds. The impact will be highly negative: depending on the thread pool size you may serve 25 - 50 requests at most, and that also relies on the number of cores available. The threads waste resources while they sit idle waiting for the response.
With synchronous calls there is no way to serve 1000+ requests in the same setup, and I am being extremely conservative. Async can have a huge scalability impact; for lightweight calls, a single hosted process may serve millions of requests with ease.
After the background, where to use async in the complete chain?
Everywhere feasible, from beginning to end: from the entry point to the actual exit point that makes the I/O call, since that is the call that relieves the pool thread as it dispatches the request over the network.
Do remember, though, that await at a given point doesn't let the following code in the same method run until the awaited call finishes, even though it relieves the thread. So if there are multiple independent calls, it is better to aggregate them using Task.WhenAll and await the resulting task; it completes when all of them have finished, whether with success or error.
If the async chain is broken somewhere by using something like Task.Wait or Task.Result, it no longer remains a pure async call and will block the calling thread pool thread.
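The Task.WhenAll advice can be sketched like this. LoadAsync is a hypothetical independent I/O call, with Task.Delay standing in for the network wait; the point is only the difference between awaiting one call at a time and starting all of them before awaiting.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical independent I/O call; Task.Delay stands in for the network wait.
    public static async Task<int> LoadAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // Awaiting one call at a time: total time is roughly the sum of the delays.
    public static async Task<int> SequentialAsync(int delayMs)
    {
        int a = await LoadAsync(1, delayMs);
        int b = await LoadAsync(2, delayMs);
        int c = await LoadAsync(3, delayMs);
        return a + b + c;
    }

    // Starting all calls first, then awaiting them together: roughly one delay in total.
    public static async Task<int> ConcurrentAsync(int delayMs)
    {
        Task<int> a = LoadAsync(1, delayMs);
        Task<int> b = LoadAsync(2, delayMs);
        Task<int> c = LoadAsync(3, delayMs);
        int[] results = await Task.WhenAll(a, b, c);
        return results[0] + results[1] + results[2];
    }
}
```

Both methods return the same sum; only the elapsed time differs, and only when the calls really are independent of each other.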
How can async be further enhanced?
In pure library calls, where async is initiated by the thread pool, the dispatching thread can be different from the receiving one, and the call doesn't need to re-enter the same context, you should use ConfigureAwait(false). It means the continuation will not wait to re-enter the original context, which is a performance boost.
Like await, it makes sense to use ConfigureAwait(false) across the chain, from entry to end. This is valid only for libraries which rely extensively on thread pools.
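What ConfigureAwait(false) changes can be made visible with a counting SynchronizationContext. This is a sketch with hypothetical names (CountingContext, ConfigureAwaitDemo); a real UI or classic ASP.NET context behaves similarly in that continuations are sent back to it unless the await opts out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Test double: counts how often a continuation is posted back to the context.
public sealed class CountingContext : SynchronizationContext
{
    private int _postCount;
    public int PostCount => Volatile.Read(ref _postCount);

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref _postCount);
        base.Post(d, state); // the base class queues the callback to the thread pool
    }
}

public static class ConfigureAwaitDemo
{
    // Application-style code: resumes on the captured context after the await.
    private static async Task ResumeOnContextAsync()
    {
        await Task.Delay(50);
    }

    // Library-style code: does not need the original context to resume.
    private static async Task ResumeAnywhereAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
    }

    // Returns how many continuations were posted to the context.
    public static int CountPosts(bool useConfigureAwaitFalse)
    {
        var context = new CountingContext();
        SynchronizationContext previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(context);
        Task pending;
        try
        {
            pending = useConfigureAwaitFalse ? ResumeAnywhereAsync() : ResumeOnContextAsync();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
        pending.Wait(); // demo harness only; safe here because Post never needs this thread
        return context.PostCount;
    }
}
```

With the plain await the continuation is posted to the context once; with ConfigureAwait(false) it is not posted at all.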
Is there a thread created?
Best read this: Stephen Cleary - There's no thread
A genuine I/O async call uses hardware-based concurrency to process the request and will not block the software threads.
Variations
CPU-based asynchronous processing, where you move work to the background because the current thread needs to stay responsive, mostly in the case of a UI like WPF.
Use cases
All kinds of systems, especially non-MS frameworks like Node.js, have async processing as their underlying principle, and the database server cluster on the receiving end is tuned to receive millions of calls and process them.
For B2C calls, it is expected that each request is lightweight with a limited payload.
Edit 1:
In this specific case, ToListAsync is already asynchronous, so you can skip async/await as noted in various comments. Do review Stephen Cleary's article, though: in general that may not be a very good strategy, since the gains are minimal and the negative impact of incorrect usage can be high.
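One concrete behavioral difference when async/await is skipped is where exceptions surface. The sketch below uses hypothetical names (QueryAsync, GetElided, GetAwaited) rather than Entity Framework, and only illustrates that one pitfall.

```csharp
using System;
using System.Threading.Tasks;

public static class ElidingDemo
{
    // Hypothetical repository call that validates its argument before doing I/O.
    private static Task<int> QueryAsync(string table)
    {
        if (table == null) throw new ArgumentNullException(nameof(table));
        return Task.FromResult(table.Length);
    }

    // Without async/await: the task from QueryAsync is passed straight through,
    // so an exception thrown before the task exists escapes at the call site.
    public static Task<int> GetElided(string table)
    {
        return QueryAsync(table);
    }

    // With async/await: the exception is captured and stored in the returned task.
    public static async Task<int> GetAwaited(string table)
    {
        return await QueryAsync(table);
    }

    // Reports where the failure surfaces for each variant.
    public static string WhereDoesItThrow(Func<string, Task<int>> method)
    {
        Task<int> task;
        try
        {
            task = method(null);
        }
        catch (ArgumentNullException)
        {
            return "thrown when called";
        }
        return task.IsFaulted ? "stored in the task" : "no error";
    }
}
```

Callers that start the task first and await it later only see the error at the await in the second variant, which is usually what they expect from a method named ...Async.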
I have a Web API's action where I need to run some task and forget about this task.
This is how my method is organized now:
public async Task<SomeType> DoSth()
{
await Task.Run(...);
.....
//Do some other work
}
The thing is that obviously it stops at the await line waiting when it's done and only then continues the work.
And I need to "fire and forget"
Should I just call Task.Run() without any async-await?
And I need to "fire and forget"
I have a blog post that goes into details of several different approaches for fire-and-forget on ASP.NET.
In summary: first, try not to do fire-and-forget at all. It's almost always a bad idea. Do you really want to "forget"? As in, not care whether it completes successfully or not? Ignore any errors? Accept occasional "lost work" without any log notifications? Almost always, the answer is no, fire-and-forget is not the appropriate approach.
A reliable solution is to build a proper distributed architecture. That is, construct a message that represents the work to be done and queue that message to a reliable queue (e.g., Azure Queue, MSMQ, etc). Then have an independent backend that processes that queue (e.g., Azure WebJob, Win32 service, etc).
Should I just call Task.Run() without any async-await?
No. This is the worst possible solution. If you must do fire-and-forget, and you're not willing to build a distributed architecture, then consider Hangfire. If that doesn't work for you, then at the very least you should register your cowboy background work with the ASP.NET runtime via HostingEnvironment.QueueBackgroundWorkItem or my ASP.NET Background Tasks library. Note that QBWI and AspNetBackgroundTasks are both unreliable solutions; they just minimize the chance that you'll lose work, not prevent it.
For fire and forget, use this
Task.Factory.StartNew(async () =>
{
using (HttpClient client = new HttpClient())
{
await client.PostAsync("http://localhost/api/action", new StringContent(""));
}
});
True fire-and-forget tasks can be difficult in ASP.NET, as they can often die along with the request that they were created as part of.
If you are using .NET Framework 4.5.2+ then you can use QueueBackgroundWorkItem to run the task. When tasks are registered via this method, the AppDomain will try to delay shutting down until they have all completed, but there can still be instances when they will be killed before they are completed. This is probably the simplest thing to do, but it is worth reading up on exactly which situations can cause jobs to be cancelled.
HostingEnvironment.QueueBackgroundWorkItem(async cancellationToken =>
{
await Task.Run(...);
});
There is a tool called Hangfire that uses a persistent store to ensure that a task has completed, and it has built-in retry and error recording functionality. This is more for "background tasks" but does suit fire and forget. It is relatively easy to set up and offers a variety of backing stores; I can't recall the exact details, but some require a license and some don't (like MSSQL).
I use HangFire.
This is best for me.
An easy way to perform background processing in .NET and .NET Core applications. No Windows Service or separate process required.
Backed by persistent storage. Open and free for commercial use.
I agree with others that you should not just forget about your call. However, to answer your question, if you remove await from the Task.Run() line, the call will not be blocking as shown here
public async Task<SomeType> DoSth()
{
Task.Run(...);
.....
//Do some other work while Task.Run() continues in parallel.
}
For invoking a fire-and-forget Web API method, I used the following code to ensure that it returns an OK response. In my case, the bearer authorization token created at login is stored in a cookie:
...
FireAndForget().Wait();
...
private async Task FireAndForget()
{
using (var httpClient = new HttpClient())
{
HttpCookie cookie = this.Request.Cookies["AuthCookieName"];
var authToken = cookie["AuthTokenPropertyName"] as string;
httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", authToken);
using (var response = await httpClient.GetAsync("http://localhost/api/FireAndForgetApiMethodName"))
{
//will throw an exception if not successful
response.EnsureSuccessStatusCode();
}
}
}
Never fire and forget, because then you won't get to see any errors, which makes for some very awkward troubleshooting if something goes wrong (having the task method do its own exception handling isn't guaranteed to work, because the task may not successfully start in the first place). Unless you really don't mind if the task does anything or not, but that's quite unusual (since, if you truly didn't care, why run the task in the first place)? At the very least, create your task with a continuation:
Task.Run(...)
    .ContinueWith(t =>
        logException(t.Exception.GetBaseException()),
        TaskContinuationOptions.OnlyOnFaulted
    );
You can make this more sophisticated as your needs dictate.
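A self-contained variation of that continuation could look like the sketch below. ForgetDemo, its ErrorLog queue and FailingJobAsync are hypothetical; a real application would write to its own logging framework instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ForgetDemo
{
    // Hypothetical log sink; a real application would use its logging framework.
    public static readonly ConcurrentQueue<string> ErrorLog = new ConcurrentQueue<string>();

    // Attaches a continuation that runs only if the task fails and records the error.
    // The continuation task is returned so a caller (or a test) can wait for the logging.
    // Note: if the original task succeeds, the returned continuation task ends up canceled.
    public static Task LogFailures(Task task)
    {
        return task.ContinueWith(
            t => ErrorLog.Enqueue(t.Exception.GetBaseException().Message),
            TaskContinuationOptions.OnlyOnFaulted);
    }

    // Hypothetical background job that fails after a short delay.
    public static async Task FailingJobAsync()
    {
        await Task.Delay(10);
        throw new InvalidOperationException("job failed");
    }
}
```

Usage would be along the lines of `ForgetDemo.LogFailures(ForgetDemo.FailingJobAsync());`, after which the failure ends up in ErrorLog instead of vanishing silently.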
In the specific case of a web API, you may actually want to wait for your background tasks to finish before you complete your request. If you don't, you're leaving stuff running in the background that may misrepresent how much load your service can really take, or even stop working altogether if clients fire too many requests and you don't do anything to throttle them. You can gather tasks up and issue an await Task.WhenAll(...) at the end to achieve that; this way, you can continue to do useful work while your background tasks plod away, but you don't return until everything's done.
This question has probably been asked already, but I never found a definitive answer. Let's say that I have a Web API 2.0 application hosted on IIS. I think I understand that the best practice (to prevent deadlocks on the client) is to always use async methods from the GUI event down to the HttpClient calls. This is good and it works. But what is the best practice when the client application does not have a GUI (e.g. a Windows Service or console application) and has only synchronous methods from which to make the call? In this case, I use the following logic:
void MySyncMethodOnMyWindowServiceApp()
{
    list = GetDataAsync().Result.ToObject<List<MyClass>>();
}
async Task<JArray> GetDataAsync()
{
    var response = await Client.GetAsync(<...>).ConfigureAwait(false);
    return await response.Content.ReadAsAsync<JArray>().ConfigureAwait(false);
}
But unfortunately this can still cause deadlocks on the client that occur at random times on random machines.
The client app stops at this point and never returns:
var response = await Client.GetAsync(<...>).ConfigureAwait(false);
If it's something that can be run in the background and isn't forced to be synchronous, try wrapping the code (that calls the async method) in a Task.Run(). I'm not sure that'll solve a "deadlock" problem (if it's something out of sync, that's another issue), but if you want to benefit from async/await, if you don't have async all the way down, I'm not sure there's a benefit unless you run it in a background thread. I had a case where adding Task.Run() in a few places (in my case, from an MVC controller which I changed to be async) and calling async methods not only improved performance slightly, but it improved reliability (not sure that it was a "deadlock" but seemed like something similar) under heavier load.
You will find that using Task.Run() is regarded by some as a bad way to do it, but I really couldn't see a better way to do it in my situation, and it really did seem to be an improvement. Perhaps this is one of those things where there's the ideal way to do it vs. the way to make it work in the imperfect situation that you're in. :-)
[Updated due to requests for code]
So, as someone else posted, you should do "async all the way down". In my case, my data wasn't async, but my UI was. So I went async down as far as I could, then I wrapped my data calls with Task.Run in such a way that it made sense. That's the trick, I think: to figure out whether it makes sense for things to run in parallel. Otherwise, you're just being synchronous (if you use async and immediately resolve it, forcing it to wait for the answer). I had a number of reads that I could perform in parallel.
In the above example, I think you have to go async up as far as makes sense, and then at some point determine where you can spin off a thread and perform the operation independently of the other code. Let's say you have an operation that saves data, but you don't really need to wait for a response -- you're saving it and you're done. The only thing you might have to watch out for is not to close the program without waiting for that thread/task to finish. Where it makes sense in your code is up to you.
Syntax is pretty easy. I took existing code, changed the controller to an async returning a Task of my class that was formerly being returned.
var myTask = Task.Run(() =>
{
//...some code that can run independently.... In my case, loading data
});
// ...other code that can run at the same time as the above....
await Task.WhenAll(myTask, otherTask);
//..or...
await myTask;
//At this point, the result is available from the task
myDataValue = myTask.Result;
See MSDN for probably better examples:
https://msdn.microsoft.com/en-us/library/hh195051(v=vs.110).aspx
[Update 2, more relevant for the original question]
Let's say that your data read is an async method.
private async Task<MyClass> Read()
You can call it, save the task, and await on it when ready:
var runTask = Read();
//... do other code that can run in parallel
await runTask;
So, for this purpose, calling async code, which is what the original poster is requesting, I don't think you need Task.Run(), although I don't think you can use "await" unless you're an async method -- you'll need an alternate syntax for Wait.
The trick is that without having some code to run in parallel, there's little point in it, so thinking about multi-threading is still the point.
Using Task<T>.Result is the equivalent of Wait which will perform a synchronous block on the thread. Having async methods on the WebApi and then having all the callers synchronously blocking them effectively makes the WebApi method synchronous. Under load you will deadlock if the number of simultaneous Waits exceeds the server/app thread pool.
So remember the rule of thumb "async all the way down". You want the long-running task (getting the List collection) to be async. If the calling method must be sync, you want to make that conversion from async to sync (using either Result or Wait) as close to the "ground" as possible. Keep the long-running process async and keep the sync portion as short as possible. That will greatly reduce the length of time that threads are blocked.
So for example you can do something like this.
void MySyncMethodOnMyWindowServiceApp()
{
    List<MyClass> myClasses = GetMyClassCollectionAsync().Result;
}
async Task<List<MyClass>> GetMyClassCollectionAsync()
{
    var data = await GetDataAsync(); // <- long running call to remote WebApi?
    return data.ToObject<List<MyClass>>();
}
The key part is the long running task remains async and not blocked because await is used.
Also, don't confuse responsiveness with scalability. Both are valid reasons for async. Yes, responsiveness is a reason for using async (to avoid blocking the UI thread). You are correct that this wouldn't apply to a back-end service; however, this isn't why async is used on a WebApi. The WebApi is also a non-GUI back-end process. If the only advantage of async code were responsiveness of the UI layer, then WebApi would be sync code from start to finish. The other reason for using async is scalability (avoiding deadlocks), and this is the reason why WebApi calls are plumbed async. Keeping the long-running processes async helps IIS make more efficient use of a limited number of threads. By default there are only 12 worker threads per core. This can be raised, but that isn't a magic bullet either, as threads are relatively expensive (about 1MB overhead per thread). await allows you to do more with less: more concurrent long-running processes on fewer threads before a deadlock occurs.
The problem you are having with deadlocks must stem from something else. Your use of ConfigureAwait(false) prevents deadlocks here. Solve the bug and you are fine.
See Should we switch to use async I/O by default? to which the answer is "no". You should decide on a case by case basis and choose async when the benefits outweigh the costs. It is important to understand that async IO has a productivity cost associated with it. In non-GUI scenarios only a few targeted scenarios derive any benefit at all from async IO. The benefits can be enormous, though, but only in those cases.
Here's another helpful post: https://stackoverflow.com/a/25087273/122718
How do asynchronous tasks (async/await) work in .NET 4.5?
Some sample code:
private async Task<bool> TestFunction()
{
var x = await DoesSomethingExists();
var y = await DoesSomethingElseExists();
return y;
}
Does the second await statement get executed right away or after the first await returns?
await pauses the method until the operation completes. So the second await would get executed after the first await returns.
For more information, see my async / await intro or the official FAQ.
It executes after the first await returns. If this thing confuses you, try to play around with breakpoints - they are fully supported by the new async pattern.
Imagine it would look like this:
var x = await GetSomeObjectInstance();
var y = await GetSomeObjectInstance2(x);
A NullReferenceException would probably occur somewhere if the second call started early, so the first await has to return first. Otherwise, x would be null/undefined or whatever.
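If you would rather see the ordering than step through it with breakpoints, a minimal sketch is below. CheckAsync is a hypothetical stand-in for the two methods in the question; it logs when it starts and finishes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAwaitDemo
{
    // Hypothetical check; logs when it starts and when it finishes.
    private static async Task<bool> CheckAsync(string name, int delayMs, List<string> log)
    {
        log.Add(name + " started");
        await Task.Delay(delayMs);
        log.Add(name + " finished");
        return true;
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        // The first check is slower, yet the second does not start until it is done.
        bool x = await CheckAsync("first", 100, log);
        bool y = await CheckAsync("second", 10, log);
        log.Add("result: " + (x && y));
        return log;
    }
}
```

The log always reads first started, first finished, second started, second finished, because the second call is not even made until the first await has completed.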
The method calls will still occur sequentially, just like "regular", non-awaited method calls. The purpose of await is that it will return the current thread to the thread pool while the awaited operation runs off and does whatever.
This is particularly useful in high performance environments, say a web server, where a given request is processed on a given thread from the overall thread pool. If we don't await, then the given thread processing the request (and all its resources) remains "in use" while the db / service call completes. This might take a couple of seconds or more, especially for external service calls.
Now in low traffic websites this is not much of an issue but in high traffic sites the cost of all these request threads just sitting around, doing nothing, in an "in-use" state, waiting for other processes like those db /service calls to return can be a resource burden.
We are better off releasing the thread back to the worker pool to allow it do other useful work for some other request.
Once the db / service call completes, we can then interrupt the thread pool and ask for a thread to carry on processing that request from where it left off. At that point the state of the request is reloaded and the method call continues.
So on a per-request basis when using await, the request will still take the same amount of time from the user's perspective... plus a tiny smidge more for the switching overhead.
But in the aggregate, across all requests for all users, things can seem more performant to all users as the web server (in this case) runs more efficiently with better resource utilization. i.e. it either doesn't have to queue up requests waiting for free threads to process requests because await is returning them or alternatively we don't have to buy more hardware because we are using the same amount of hardware, more efficiently, to obtain higher throughputs.
There is a switching cost to this though so despite what you see in the default templates and in many docs you shouldn't just blindly use await for every single call. It's just a tool and like all tools it has its place. If the switching cost is not less than the cost of just completing your calls synchronously then you shouldn't use await.