This question already has answers here:
When should I use Async Controllers in ASP.NET MVC?
(8 answers)
Closed 6 years ago.
I am working with a pre-existing C# ASP.NET MVC webapplication and I'm adding some functionality to it that I can't decide whether or not to make async.
Right now, the application home page just processes a page and a user logs in. Nothing more, and nothing asynchronous going on at all.
I am adding functionality that will, when the homepage is visited, generate a call to a Web API that subsequently calls a database that grabs an identifier and returns it to an HTML tag on the home page. This identifier will not be visible on the screen, only on the source/HTML view (this is being added for various internal tracking purposes).
The Web API/database call is simple, just grab an identifier and return it to the controller. Regardless, I'm wondering whether the app should make this call asynchronously? The website traffic isn't immense, but I'm still wondering about concurrency, performance and future scalability.
The one catch is that I'd have to make the entire ActionMethod async and I'm not sure what the affects of that would be. The basic pattern, currently synchronous, is below:
public ActionResult Index()
{
var result = GetID();
ViewBag.result = result.Data;
return View();
}
public JsonResult GetID()
{
var result = "";
var url = "http://APIURL/GetID";
using (WebClient client = new WebClient())
{
result = client.DownloadString(url);
}
return Json(result, JsonRequestBehavior.AllowGet);
}
Any thoughts?
First and foremost, realize the purpose of async, in the context of a web application. A web server has what's called a thread pool. Generally speaking, 1 thread == 1 request, so if you have a 1000 threads in the pool (typical), your website can roughly serve 1000 simultaneous requests. Also keep in mind that, it often takes many requests to render a single resource. The HTML document itself is one request, but each image, JS file, CSS file, etc. is also a request. Then, there's any AJAX requests the page may issue. In other words, it's not uncommon for a request for a single resource to generate 20+ requests to the web server.
Given that, when your server hits its max requests (all threads are being utilized), any further requests are queued and processed in order as threads are made available. What async does is buy you some additional head room. If there's threads that are in a wait-state (waiting for the results of a database query, the response from a web service, a file to be read from the filesystem, etc.), then async allows these threads to be returned to the pool, where they are then able to field some of those waiting requests. When whatever the thread was waiting on completes, a new thread is requested to finish servicing the request.
What is important to note here is that a new thread is requested to finish servicing the request. Once the thread has been released to the pool, you have to wait for a thread again, just like a brand new request. This means running async can sometimes take longer than running sync, depending on the availability of threads in the pool. Also, async caries with it a non-insignificant amount of overhead that also adds to the overall load time.
Async != faster. It can many times be slower, but it allows your web server to more efficiently utilize resources, which could mean the difference between falling down and gracefully bearing load. Because of this, there's no one universal answer to a question like "Should I just make everything async?" Async is a trade-off between raw performance and efficiency. In some situations it may not make sense to use async at all, while in others you might want to use it for everything that's applicable. What you need to do is first identity the stress points of your application. For example, if your database instance resides on the same server as your web server (not a good idea, BTW), using async on your database queries would be fairly pointless. The chief culprit of waiting is network latency, whereas filesystem access is typically relatively quick. On the other hand, if your database server is in a remote datacenter and has to not only travel the pipes between there and your web server but also do things like traverse firewalls, well, then your network latency is much more significant, and async is probably a very good idea.
Long and short, you need to evaluate your setup, your environment and the needs of your application. Then, and only then, can you make smart decisions about this. That said, even given the overhead of async, if there's network latency involved at all, it's a pretty safe bet async should be used. It's perfectly acceptable to err on the site of caution and just use async everywhere it's applicable, and many do just that. If you're looking to optimize for performance though (perhaps you're starting the next Facebook?), then you'd want to be much more judicious.
Here, the reason to use async IO is to not have many threads running at the same time. Threads consume OS resources and memory. The thread pool also cal be a little slow to adjust to sudden load. If your thread count in a web app is below 100 and load is not extremely spikey you have nothing to worry about.
Generally, the slower a web service and the more often it is called the more beneficial async IO can be. You will need on average (latency * frequency) threads running. So 100ms call time and 10 calls per second is about 1 thread on average.
Run the numbers and see if you need to change anything or not.
Any thoughts?
Yes, lot's of thoughts...but that alone doesn't count as an answer. ;)
There is no real good answer here since there isn't much context provided. But let's address what we know.
Since we are a web application, each request/response cycle has a direct impact on performance and can be a bottleneck.
Since we are internally invoking another API call from ours, we shouldn't assume that it is hosted on the same server - as such this should be treated just like all I/O bound operations.
With the two known factors above, we should make our calls async. Consider the following:
public async Task<ActionResult> Index()
{
var result = await GetIdAsync();
ViewBag.result = result.Data;
return View();
}
public async Task<JsonResult> GetIdAsync()
{
var result = "";
var url = "http://APIURL/GetID";
using (WebClient client = new WebClient())
{
// Note the API change here?
result = await client.DownloadStringAsync(url);
}
return Json(result, JsonRequestBehavior.AllowGet);
}
Now, we are correctly using async and await keywords with our Task<T> returning operations. This will help to ensure ideal performance. Notice the API change on the client.DownloadStringAsync too, this is very important.
Related
I have the following WebAPI 2 method:
public HttpResponseMessage ProcessData([FromBody]ProcessDataRequestModel model)
{
var response = new JsonResponse();
if (model != null)
{
// checks if there are old records to process
var records = _utilityRepo.GetOldProcesses(model.ProcessUid);
if (records.Count > 0)
{
// there is an active process
// insert the new process
_utilityRepo.InsertNewProcess(records[0].ProcessUid);
response.message = "Process added to ProcessUid: " + records[0].ProcessUid.ToString();
}
else
{
// if this is a new process then do adjustments rules
var settings = _utilityRepo.GetSettings(model.Uid);
// create a new process
var newUid = Guid.NewGuid();
// if its a new adjustment
if (records.AdjustmentUid == null)
{
records.AdjustmentUid = Guid.NewGuid();
// create new Adjustment information
_utilityRepo.CreateNewAdjustment(records.AdjustmentUid.Value);
}
// if adjustment created
if (_utilityRepo.CreateNewProcess(newUid))
{
// insert the new body
_utilityRepo.InsertNewBody(newUid, model.Body, true);
}
// start AWS lambda function timer
_utilityRepo.AWSStartTimer();
response.message = "Process created";
}
response.success = true;
response.data = null;
}
return Request.CreateResponse(response);
}
The above method sometimes can take from 3-4 seconds to process (some db calls and other calculations) and I don't want the user to wait until all the executions are done.
I would like the user hit the web api method and almost inmediatly get a success response, meanwhile the server is finishing all the executions.
Any clue on how to implement Async / Await to achieve this?
If you don't need to return a meaningful response it's a piece of cake. Wrap your method body in a lambda you pass to Task.Run (which returns a Task). No need to use await or async. You just don't await the Task and the endpoint will return immediately.
However if you need to return a response that depends on the outcome of the operation, you'll need some kind of reporting mechanism in place, SignalR for example.
Edit: Based on the comments to the original post, my recommendation would be to wrap the code in await Task.Run(()=>...), i.e., indeed await it before returning. That will allow the long-ish process to run on a different thread asynchronously, but the response will still await the outcome rather than leaving the user in the dark about whether it finished (since you have no control over the UI). You'd have to test it though to see if there's really any performance benefit from doing this. I'm skeptical it'll make much difference.
2020-02-14 Edit:
Hooray, my answer's votes are no longer in the negative! I figured having had the benefit of two more years of experience I would share some new observations on this topic.
There's no question that asynchronous background operations running in a web server is a complex topic. But as with most things, there's a naive way of doing it, a "good enough for 99% of cases" way of doing it, and a "people will die (or worse, get sued) if we do it wrong" way of doing it. Things need to be put in perspective.
My original answer may have been a little naive, but to be fair the OP was talking about an API that was only taking a few seconds to finish, and all he wanted to do was save the user from having to wait for it to return. I also noted that the user would not get any report of progress or completion if it is done this way. If it were me, I'd say the user should suck it up for that short of a time. Alternatively, there's nothing that says the client has to wait for the API response before returning control to the user.
But regardless, if you really want to get that 200 right away JUST to acknowledge that the task was initiated successfully, then I still maintain that a simple Task.Run(()=>...) without the await is probably fine in this case. Unless there are truly severe consequences to the user not knowing the API failed, on the off chance that the app pool was recycled or the server restarted during those exact 4 seconds between the API return and its true completion, the user will just be ignorant of the failure and will presumably find out next time they go into the application. Just make sure that your DB operations are transactional so you don't end up in a partial success situation.
Then there's the "good enough for 99% of cases" way, which is what I do in my application. I have a "Job" system which is asynchronous, but not reentrant. When a job is initiated, we do a Task.Run and begin to execute it. The code in the task always holds onto a Job data structure whose ID is returned immediately by the API. The code in the task periodically updates the Job data with status, which is also saved to a database, and checks to see if the Job was cancelled by the user, in which case it wraps up immediately and the DB transaction is rolled back. The user cancels by calling another API which updates said Job object in the database to indicate it should be cancelled. A separate infinite loop periodically polls the job database server side and updates the in-memory Job objects used by the actual running code with any cancellation requests. Fundamentally it's just like any CancellationToken in .NET but it just works via a database and API calls. The front end can periodically poll the server for job status using the ID, or better yet, if they have WebSockets the server pushes job updates using SignalR.
So, what happens if the app domain is lost during the job? Well, first off, every job runs in a single DB transaction, so if it doesn't complete the DB rolls back. Second, when the ASP.NET app restarts, one of the first things it does is check for any jobs that are still marked as running in the DB. These are the zombies that died upon app pool restart but the DB still thinks they're alive. So we mark them as KIA, and send the user an email indicating their job failed and needs to be rerun. Sometimes it causes inconvenience and a puzzled user from time to time, but it works fine 99% of the time. Theoretically, we could even automatically restart the job on server startup if we wanted to, but we feel it's better to make that a manual process for a number of case-specific reasons.
Finally, there's the "people will die (or worse, get sued) if we get it wrong" way. This is what some of the other comments are more directed to. This is where have to break down all jobs into small atomic transactions that are tracked in a database at every step, and which can be picked up by any server (the same or maybe another server in a farm) at any time. If it's really top notch, multiple servers can even work on the same job concurrently, depending on what it is. It requires carefully coding every background operation with this in mind, constantly updating a database with your progress, dealing with concurrent changes to the database (because now the entire operation is no longer a single atomic transaction), etc. Needless to say, it's a LOT of effort. Yeah, it would be great if it worked this way. It would be great if every app did everything to this level of perfection. I also want a toilet made out of solid gold, but it's just not in the cards now is it?
So my $0.02 is, again, let's have some perspective. Do the cost benefit analysis and unless you're doing something where lives or lots of money is at stake, aim for what works perfectly well 99%+ of the time and only causes minor inconvenience when it doesn't work perfectly.
I know you should only use async for stuff which is not "CPU-intensive", e.g. file writes, web calls etc. and therefore I also know it doesn't makes sense to wrap every method into Task.Run or something similar.
However what should I do when I know a method does a web call, but it doesn't offer an async interface. Is it in this case worth to wrap it?
Concrete example:
I'm using CSOM (Client SharePoint Object Model) in my WebApi application (server) and want to get a SharePoint list.
This is normally done like this:
[HttpGet]
[Route("foo/{webUrl}")]
public int GetNumberOfLists(string webUrl)
{
using (ClientContext context = new ClientContext(webUrl))
{
Web web = context.Web;
context.Load(web.Lists);
context.ExecuteQuery();
return web.Lists.Count;
}
}
And I thought about changing it to something like this:
[HttpGet]
[Route("foo/{webUrl}")]
public async Task<int> GetNumberOfLists(string webUrl)
{
using (ClientContext context = new ClientContext(webUrl))
{
Web web = context.Web;
context.Load(web.Lists);
await Task.Run(() => clientContext.ExecuteQuery());
return web.Lists.Count;
}
}
Does it make sense and does it help? As I understand it, I just create / need a new thread for executing the query ("overhead") but at least the request thread will be free / ready for another request (that would be good).
But is it worth it and should it be done like this?
If so:
Isn't it strange that Microsoft doesn't offer the "async" method out of the box or did they just not care about it?
edit:
updated to use Task.Run as suggested in comment.
However what should I do when I know a method does a web call, but it doesn't offer an async interface.
Unfortunately still somewhat common. As different libraries update their APIs, they will eventually catch up.
Is it in this case worth to wrap it?
Yes, if you're dealing with a UI thread. Otherwise, no.
Concrete example... in my WebApi application (server)
Then, no, you don't want to wrap in Task.Run. As noted in my article on async ASP.NET:
You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics... As a general rule, don’t queue work to the thread pool on ASP.NET.
Wrapping with Task.Run on ASP.NET:
Interferes with the ASP.NET thread pool heuristics twice (by taking a thread now and then releasing it later).
Adds overhead (code has to switch threads).
Does not free up a thread (the total number of threads used for this request is almost equal to just calling the synchronous version).
As I understand it, I just create / need a new thread for executing the query ("overhead") but at least the request thread will be free / ready for another request (that would be good).
Yes, but all you're doing is jumping threads, for no benefit. The thread used to block on the query result is one less thread ASP.NET has to use to handle requests, so freeing up one thread by consuming another isn't a good tradeoff.
Isn't it strange that Microsoft doesn't offer the "async" method out of the box or did they just not care about it?
Some of the "older" MS APIs just haven't gotten around to adding async versions yet. They certainly should, but developer time is a finite resource.
This is my personal view of your problem and for me the above way is not required. When we host your API in IIS, the server assigns one thread from thread pool it has in the server. The IIS also has a setting of maxConcurrentRequestsPerCPU maxConcurrentThreadsPerCPU. You can setup these values to serve the request instead of handling the request all by yourself.
I came a across a nice little tool that has been added to ASP.NET in v4.5.2
I am wandering how safe it is and how one can effectively utilize it in an ASP.NET MVC or Web API scenario.
I know I am always wanting to do a quick and simple fire and forget task in my web applications. For example:
Sending an emails/s
Sending push notifications
Logging analytics or errors to the db
Now typically I just create a method called
public async Task SendEmailAsync(string to, string body)
{
//TODO: send email
}
and I would use it like so:
public async Task<ActionResult> Index()
{
...
await SendEmailAsync(User.Identity.Username, "Hello");
return View();
}
now my concern with this is that, I am delaying the user in order to send my email to them. This doesn't make much sense to me.
So I first considered just doing:
Task.Run(()=> SendEmailAsync(User.Identity.Username, "Hello"));
however when reading up about this. It is apparently not the best thing to do in IIS environment. (i'm not 100% sure on the specifics).
So this is where I came across HostingEnvironment.QueueBackgroundWorkItem(x=> SendEmailAsync(User.Identity.Username, "Hello"));
This is a very quick and easy way to offload the send email task to a background worker and serve up the users View() much quicker.
Now I am aware this is not for tasks running longer than 90 seconds and is not 100% guaranteed executution.
But my question is:
Is HostingEnvironment.QueueBackgroundWorkItem() sufficient for: sending emails, push notifications, db queries etc in a standard ASP.NET web site.
It depends.
The main benefit of QueueBackgroundWorkItem is the following, emphasis mine (source):
Differs from a normal ThreadPool work item in that ASP.NET can keep track of how many work items registered through this API are currently running, and the ASP.NET runtime will try to delay AppDomain shutdown until these work items have finished executing.
Essentially, QueueBackgroundWorkItem helps you run tasks that might take a couple of seconds by attempting not to shutdown your application while there's still a task running.
Running a normal database query or sending out a push notification should be a matter of a couple hundred milliseconds (or a few seconds); neither should take a very long time and should thus be fine to run within QueueBackgroundWorkItem.
However, there's no guarantee for the task to finish — as you said, the task is not awaited. It all depends on the importance of the task to execute. If the task must complete, it's not a good candidate for QueueBackgroundWorkItem.
I recently read an article about c#-5 and new & nice asynchronous programming features . I see it works greate in windows application. The question came to me is if this feature can increase ASP.Net performance?
consider this two psudo code:
public T GetData()
{
var d = GetSomeData();
return d;
}
and
public async T GetData2()
{
var d = await GetSomeData();
return d;
}
Has in an ASP.Net appication that two codes difference?
thanks
Well for a start your second piece of code would return Task<T> rather than T. The ultimate answer is "it depends".
If your page needs to access multiple data sources, it may make it simpler to parallelize access to those sources, using the result of each access only where necessary. So for example, you may want to start making a long-running data fetch as the first part of your page handling, then only need the result at the end. It's obviously possible to do this without using async/await, but it's a lot simpler when the language is helping you out.
Additionally, asynchrony can be used to handle a huge number of long-running requests on a small number of threads, if most of those requests will be idle for a lot of the time - for example in a long-polling scenario. I can see the async abilities being used as an alternative to SignalR in some scenarios.
The benefits of async on the server side are harder to pin down than on the client side because there are different ways in which it helps - whereas the "avoid doing work on the UI thread" side is so obvious, it's easy to see the benefit.
Don't forget that there can be a lot more to server-side coding than just the front-end. In my experience, async is most likely to be useful when implementing RPC services, particularly those which talk to multiple other RPC services.
As Pasi says, it's just syntactic sugar - but I believe that it's sufficiently sweet sugar that it may well make the difference between proper asynchronous handling being feasible and it being simply too much effort and complexity.
Define 'performance'.
Ultimately the application is going to be doing the same amount of work as it would have done synchronously, it's just that the calling thread in the asynchronous version will wait for the operation to complete on another, whereas in the synchronous model it's the same thread performing the task.
Ultimately in both cases the client will wait the same amount of time before seeing a response from the web server and therefore won't notice any difference in performance.
If the web request is being handled via an asynchronous handler, then, again, the response will still take the same amount of time to return - however, you can decrease the pressure on the thread pool, making the webserver itself more responsive in accepting requests - see this other SO for more details on that.
Since the code is executed on the server and the user will still have to wait for the response, the question is like - does async call go faster than sync call.
Well, that depends mostly on the server implementation. For IIS and multiple number of users, too many threads would be spawned per user (even without async) and async would be inefficient. But in case of small amount of users it should be faster.
One way to see is to try out.
No. These are purely syntactic sugar and make your job as a programmer easier by allowing writing simple code and read code simply when you fix the bugs.
It does allow simpler approach to loading data without blocking UI, so in a way yes, but not really.
It would only increase performance if you needed to do multiple things, that can all be done without the need for any other information. Otherwise you may as well just do them in sequence.
In terms of your example, the answer is no. The page needs to wait for each one regardless.
I'm writing an ASP.NET web service using C# that has a DoLookup() function. For each call to the DoLookup() function I need my code to execute two separate queries: one to another web service at a remote site and one to a local database. Both queries have to complete before I can compile the results and return them as the response to the DoLookup method. The problem I'm dealing with is that I want to make this as efficient as possible, both in terms of response time and resource usage on the web server. We are expecting up to several thousand queries per hour. Here's a rough C#-like overview of what I have so far:
public class SomeService : System.Web.Services.WebService
{
public SomeResponse DoLookup()
{
// Do the lookup at the remote web service and get the response
WebResponse wr = RemoteProvider.DoRemoteLookup();
// Do the lookup at the local database and get the response
DBResponse dbr = DoDatabaseLookup();
SomeResponse resp = new SomeResponse( wr, dbr);
return resp;
}
}
The above code does everything sequentially and works great but now I want to make it more scalable. I know that I can call the DoRemoteLookup() function asynchronously ( RemoteProvider has BeginRemoteLookup / EndRemoteLookup methods) and that I can also do the database lookup asynchronously using the BeginExecuteNonQuery / EndExecuteNonQuery methods.
My question (finally) is this: how do I fire both the remote web service lookup AND the database lookup simultaneously on separate threads and ensure that they have both completed before returning the response?
The reason I want to execute both requests on separate threads is that they both potentially have long response times (1 or 2 seconds) and I'd like to free up the resources of the web server to handle other requests while it is waiting for responses. One additional note - I do have the remote web service lookup running asynchronously currently, I just didn't want to make the sample above too confusing. What I'm struggling with is getting both the remote service lookup AND the database lookup started at the same time and figuring out when they have BOTH completed.
Thanks for any suggestions.
You can use a pair of AutoResetEvents, one for each thread. At the end of thread execution, you call AutoResetEvents.Set() to trigger the event.
After spawning the threads, you use WaitAll() with the two AutoResetEvents. This will cause the thread to block until both events are set.
The caveat to this approach is that you must ensure the Set() is guarantee to be called, otherwise you will block forever. Additionally ensure that with threads you exercise proper exception handling, or you will inadvertently cause more performance issues when unhanded exceptions cause your web application to restart.
MSDN Has sample code regarding AutoResetEvent usage.
See Asynchronous XML Web Service Methods, How to: Create Asynchronous Web Service Methods and How to: Chain Asynchronous Calls with a Web Service Method.
But note the first paragraph of those articles:
This topic is specific to a legacy technology. XML Web services and XML Web service clients should now be created using Windows Communication Foundation (WCF).
BTW, doing things the way these articles say is important because it frees up the ASP.NET worker thread while the long-running task runs. Otherwise, you might be blocking the worker thread, preventing it from servicing further requests, and impacting scalability.
Assuming you can have a callback function for both the web request and the database lookup then something along these lines may work
bool webLookupDone = false;
bool databaseLookupDone = false;
private void FinishedDBLookupCallBack()
{
databaseLookupDone = true;
if(webLookupDone)
{
FinishMethod();
}
}
private void FinishedWebLookupCallBack()
{
webLookupDone = true;
if(databaseLookupDone)
{
FinishMethod();
}
}
I guess I don't have enough rep to upvote nor to comment. So this is a comment on John Saunders answer and Alan's comment on it.
You definitely want to go with John's answer if you are concerned about scalability and resource consumption.
There are two considerations here: Speeding up an individual request, and making your system handle many concurrent requests efficiently. The former both Alan's and John's answer achieve by performing the external calls in parallel.
The latter, and it sounds like that was your main concern, is achieved by not having threads blocked anywhere, i.e. John's answer.
Don't spawn your own threads. Threads are expensive, and there are already plenty of threads in the IO Threadpool that will handle your external calls for you if you use the asynch methods provided by the .net framework.
Your service's webmethod needs to be asynch as well. Otherwise a worker thread will be blocked until your external calls are done (it's still 1-2 seconds even if they run in parallel). And you only have 12 threads per CPU handling incoming requests (if your machine.config is set according to recommendation.) I.e. you would at most be able to handle 12 concurrent requests (times the # of CPUs). On the other hand if your web method is asynch the Begin will return pretty much instantenously and the thread returned to the worker thread pool ready to handle another incoming request, while your external calls are being waited on by the IO completion port, where they will be handled by threads from the IO thread pool, once they return.