HostingEnvironment.QueueBackgroundWorkItem() in ASP.NET for small background tasks - c#

I came across a nice little tool that was added to ASP.NET in v4.5.2.
I am wondering how safe it is and how one can effectively utilize it in an ASP.NET MVC or Web API scenario.
I often want to run a quick and simple fire-and-forget task in my web applications. For example:
Sending an email/s
Sending push notifications
Logging analytics or errors to the db
Now typically I just create a method called
public async Task SendEmailAsync(string to, string body)
{
    //TODO: send email
}
and I would use it like so:
public async Task<ActionResult> Index()
{
    ...
    await SendEmailAsync(User.Identity.Name, "Hello");
    return View();
}
My concern with this is that I am delaying the user in order to send my email to them. This doesn't make much sense to me.
So I first considered just doing:
Task.Run(() => SendEmailAsync(User.Identity.Name, "Hello"));
However, when reading up about this, it is apparently not the best thing to do in an IIS environment (I'm not 100% sure on the specifics).
So this is where I came across HostingEnvironment.QueueBackgroundWorkItem(x => SendEmailAsync(User.Identity.Name, "Hello"));
This is a very quick and easy way to offload the send-email task to a background worker and serve up the user's View() much quicker.
Now I am aware this is not for tasks running longer than 90 seconds and that execution is not 100% guaranteed.
But my question is:
Is HostingEnvironment.QueueBackgroundWorkItem() sufficient for sending emails, push notifications, db queries etc. in a standard ASP.NET web site?

It depends.
The main benefit of QueueBackgroundWorkItem is the following, emphasis mine (source):
Differs from a normal ThreadPool work item in that ASP.NET can keep track of how many work items registered through this API are currently running, and the ASP.NET runtime will try to delay AppDomain shutdown until these work items have finished executing.
Essentially, QueueBackgroundWorkItem helps you run tasks that might take a couple of seconds by attempting not to shut down your application while there's still a task running.
Running a normal database query or sending out a push notification should be a matter of a couple hundred milliseconds (or a few seconds); neither should take a very long time and should thus be fine to run within QueueBackgroundWorkItem.
However, there's no guarantee for the task to finish — as you said, the task is not awaited. It all depends on the importance of the task to execute. If the task must complete, it's not a good candidate for QueueBackgroundWorkItem.
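The quoted behaviour (count the registered work items, then try to hold off shutdown until they finish) can be illustrated with a small console stand-in. This is only a sketch of the idea, not ASP.NET's implementation: WorkItemTracker, Queue and ShutdownAsync are hypothetical names, and in a real application you would call HostingEnvironment.QueueBackgroundWorkItem rather than write this yourself.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the idea behind QueueBackgroundWorkItem: count running
// work items and give them a grace period at shutdown. Not ASP.NET's code.
public sealed class WorkItemTracker
{
    private readonly CancellationTokenSource _shutdown = new CancellationTokenSource();
    private readonly object _gate = new object();
    private Task _all = Task.CompletedTask;
    private int _running;

    // Number of work items that have been queued and not yet finished.
    public int Running => Volatile.Read(ref _running);

    public void Queue(Func<CancellationToken, Task> workItem)
    {
        Interlocked.Increment(ref _running);
        Task task = Task.Run(async () =>
        {
            try { await workItem(_shutdown.Token); }
            catch (Exception ex) { Console.Error.WriteLine(ex); } // a failed item must not crash the host
            finally { Interlocked.Decrement(ref _running); }
        });
        lock (_gate) { _all = Task.WhenAll(_all, task); }
    }

    // Signal cancellation, then wait up to 'grace' for registered work.
    // Returns true if everything finished in time, false if the wait gave up.
    public async Task<bool> ShutdownAsync(TimeSpan grace)
    {
        _shutdown.Cancel();
        Task all;
        lock (_gate) { all = _all; }
        Task finished = await Task.WhenAny(all, Task.Delay(grace));
        return finished == all;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var tracker = new WorkItemTracker();
        // This item ignores the token, so it is simply allowed to finish.
        tracker.Queue(async ct => { await Task.Delay(200); Console.WriteLine("email sent"); });
        bool drained = await tracker.ShutdownAsync(TimeSpan.FromSeconds(5));
        Console.WriteLine(drained ? "all work finished" : "grace period expired");
    }
}
```

The false branch is the important one: when the grace period runs out, the host goes down anyway and the work is lost, which is why this approach only suits tasks you can afford to miss.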

Related

Will non-awaited async functions definitely attempt finish in ASP.NET Core Web API?

It's my understanding that controllers get destroyed after an HTTP request is made. Are there any assurances that the .NET Core runtime will wait until all threads initiated in an async action have terminated/ended before destroying the controller instance?
I have code below with an async controller action that calls an async function. I don't need to know if the async function actually succeeds or not (e.g. sending the email), I just want to make sure that it attempts to. My fear is that the .NET Core runtime will possibly kill the thread in the middle of execution.
Spoiler alert: I ran the code below in my development environment and it does send the email every time (I put in a real email address). But I don't know if the behavior would change in a production environment.
Any thoughts/guidance?
[HttpGet]
public async Task SendEmail()
{
    // If I would prefix this with 'await' the controller
    // action doesn't terminate until the async function returns
    this.InternalSendEmail();
}
private async Task InternalSendEmail()
{
    try
    {
        await this.Email.Send("to@example.com", "Interesting subject", "Captivating content");
    }
    catch (Exception exc)
    {
        Log(exc);
    }
}
What happens to the controller instance - nothing you can't manage
First, when we talk about destroying the controller instance let's be more precise. The instance won't get GCd as long as there's still a control flow that has access to this. It can't. So your controller instance will be fine in that regard at least until your private method finishes.
What will happen is your controller method will return and control flow will go to the next stage in the middleware chain, meaning your API consumer will likely get the http response before the email is sent. You will lose your HttpContext and all that goes with it when this happens. Thus if there's anything in your Log method or anything else in InternalSendEmail that relies on the HttpContext you need to make sure that information is extracted and provided to the background method before the controller method returns.
What happens to the thread - almost certainly nothing
As far as the thread goes, most likely the email will be sent on a different thread in the thread pool from that of the original controller method, but either way, no the .NET runtime isn't going to care about your controller method returning before every task it fired off has completed, let alone kill the thread. That's well above its paygrade. Moreover it's very rare for threads to be killed in any instance these days because it's not just your control flow that's affected but completely unrelated async contexts could be dependent on that thread too.
IIS Application Pool recycling and other things that COULD potentially kill your background task
The only reasonably likely thing that would cause your background task not to complete would be if the process terminated. This happens for example during an IIS Application Pool reset (or equivalent if you're using dotnet hosting), obviously a server restart, etc. It can also happen if there's a catastrophic event like running out of memory, or nasty things like memory faults unique to unsafe or native code. But these things would kill all pending HTTP requests too.
I have seen anecdotal assertions that if there are no pending HTTP requests it makes it more likely that IIS will recycle the application pool on its own even if you have other active code running. After many years of maintaining an application that uses a very similar pattern for many non-critical long-running tasks, I have not seen this happen in practice (and we log every application start to a local TXT file so we would know if this were happening). So I am personally skeptical of this, though I welcome someone providing an authoritative source proving me wrong.
That said, we do set the application pool to reset every day at 4 AM, so to the extent that IIS would be inclined to involuntarily reset our app pools (as it does need to happen every now and then) I suspect this helps mitigate that, and would recommend it regardless. We also allow only one CPU process per application, rather than allowing IIS to fire off processes whenever it feels like it; I suspect this also makes it less likely IIS would kill the process involuntarily.
In sum - this is perfectly fine for non-critical tasks
I would not use this for critical tasks where unambiguous acknowledgement of success or failure is needed, such as in life critical applications. But for 99+% of real world applications what you're doing is perfectly fine as long as you account for the things discussed above and have some reasonable level of fault tolerance and failsafes in place, which the fact that you're logging the exception shows you clearly do.
PS - If you're interested in having robust progress reporting and you aren't familiar with it, I would look into SignalR, which would allow you to notify the client of a successful email send (or anything else) even after the API call returns, and is shockingly easy to use. Plus an open websocket connection would almost certainly prevent IIS from mistaking a returned API method for an opportunity to kill the process.
Are there any assurances that the .NET Core runtime will wait until all threads initiated in an async action have terminated/ended before destroying the controller instance?
No, this is absolutely not guaranteed.
I don't need to know if the async function actually succeeds or not (e.g. sending the email), I just want to make sure that it attempts to. My fear is that the .NET Core runtime will possibly kill the thread in the middle of execution.
You cannot be sure that it will attempt to do so. The thread (and entire process) may be terminated at any time after the HTTP response is sent. In general, request-extrinsic code is not guaranteed to complete.
Some people are fine with losing some work (e.g., in this case, missing some emails). I'm not, so my systems are all built on a basic distributed architecture, as described on my blog.
It's important to note that work can be lost during any shutdown, and shutdowns are normal:
Any rolling upgrade triggers shutdowns (i.e., application updates).
IIS/ASP.NET recycles every 29 hours by default.
Runtime and OS patches require shutdowns.
Cloud hosting causes shutdowns (both at the VM level and higher levels).
Bottom line: shutdowns are normal, and shutdowns cause any currently-running request-extrinsic work to be lost. If you're fine with occasionally losing work, then OK; but if you require an assurance that the work is done, then you'll need a basic distributed architecture (a durable queue with a background processor).
There are more basic control flow issues with what you are trying to do. Your biggest problem is not whether the task is guaranteed to finish.
The example you present is very simple, but in a real-life example you will need some context in InternalSendEmail when it is executed. Because the request has already been completely served by the time it runs, there will be no HttpContext, with all the consequences: for example, you cannot even log the IP address of the request, not to mention the more advanced context-bound things like the user (or any other security principal).
Of course you can pass anything as a parameter (for example the IP address), but your logging infrastructure (or your custom log enricher) will probably not work with that. The same is true for any other pipeline component that depends on the context.

Is multithreading an API application a thing?

I'm learning C#/.NET; two of the main reasons are its speed compared with Node.js and its OO syntax.
Now the tutorial I am following all of a sudden introduced async, and that's cool, but I could have done that with Node.js as well, so I feel a little disappointed.
My thought was that maybe we could take this to the next level with multithreading, but that raised a lot of questions, such as inconsistencies in the database (for example, thread one expects data that thread two updates, but thread two has not run before thread one reads, so thread one works with outdated data).
Searching for this returns very little information; mostly it's people confusing multithreading with asynchronous programming.
So I'm guessing you would not want to mix API with multithreading?
Yes, it's a thing, and you're already doing it with async tasks.
.NET has a Task Scheduler that assigns your tasks to available threads from the Thread Pool. By default the pool keeps a minimum number of worker threads equal to the number of available CPUs and adds more when work backs up.
Clarification: this doesn't mean 1 task : 1 thread. There's a large collection of work to be done by a number of workers. Scheduler hands a worker a job, worker works until it's done or an 'await' is reached.
From the perspective of a regular async method, it can be hard to see where the 'multi-threading' comes into play. There isn't an obvious difference between Get() and await GetAsync() when your code has to sit and wait either way.
But it's not always about your code. This example might make it more clear.
List<Task> work = new();
foreach (var uri in uriList)
{
    work.Add(http.GetAsync(uri));
}
await Task.WhenAll(work);
This code will execute all those GetAsyncs at the same time.
The framework making your API work is doing something similar. It would be pretty silly if the whole server was tied up because a single user requested a big file over dialup.
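To see the overlap from the Task.WhenAll example without a network, here is a minimal console sketch. FakeRequestAsync is a hypothetical stand-in for an I/O call such as http.GetAsync, simulated with Task.Delay; the class and method names are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class FanOut
{
    // Hypothetical stand-in for an I/O-bound call; no thread is blocked
    // while the delay is pending.
    public static async Task<int> FakeRequestAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs);
        return id;
    }

    // Starts every request first, then awaits them all together.
    public static Task<int[]> FetchAllAsync(int count, int delayMs)
    {
        Task<int>[] work = Enumerable.Range(1, count)
            .Select(id => FakeRequestAsync(id, delayMs))
            .ToArray(); // ToArray forces each call to start now
        return Task.WhenAll(work);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        int[] results = await FanOut.FetchAllAsync(count: 5, delayMs: 500);
        stopwatch.Stop();
        // The five waits overlap, so the total should be near 500 ms
        // rather than the 2500 ms a sequential loop of awaits would take.
        Console.WriteLine($"{results.Length} results in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Task.WhenAll returns the results in the same order as the input tasks, regardless of which one completes first.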
Async/await is used for multi-threading, but it is not used only for multi-threading.
I have not personally used or seen multi-threading in APIs, only in console jobs. Using TPL in console jobs has improved the efficiency more than 100% for me.
Async/await is powerful and should be used for asynchronous processing in APIs too.
Please go through Shiv's videos https://www.youtube.com/watch?v=iMcycFie-nk

Make Controller methods asynchronous or leave synchronous? [duplicate]

I am working with a pre-existing C# ASP.NET MVC web application and I'm adding some functionality to it that I can't decide whether or not to make async.
Right now, the application home page just processes a page and a user logs in. Nothing more, and nothing asynchronous going on at all.
I am adding functionality that will, when the homepage is visited, generate a call to a Web API that subsequently calls a database that grabs an identifier and returns it to an HTML tag on the home page. This identifier will not be visible on the screen, only on the source/HTML view (this is being added for various internal tracking purposes).
The Web API/database call is simple, just grab an identifier and return it to the controller. Regardless, I'm wondering whether the app should make this call asynchronously? The website traffic isn't immense, but I'm still wondering about concurrency, performance and future scalability.
The one catch is that I'd have to make the entire action method async and I'm not sure what the effects of that would be. The basic pattern, currently synchronous, is below:
public ActionResult Index()
{
    var result = GetID();
    ViewBag.result = result.Data;
    return View();
}
public JsonResult GetID()
{
    var result = "";
    var url = "http://APIURL/GetID";
    using (WebClient client = new WebClient())
    {
        result = client.DownloadString(url);
    }
    return Json(result, JsonRequestBehavior.AllowGet);
}
Any thoughts?
First and foremost, realize the purpose of async, in the context of a web application. A web server has what's called a thread pool. Generally speaking, 1 thread == 1 request, so if you have a 1000 threads in the pool (typical), your website can roughly serve 1000 simultaneous requests. Also keep in mind that, it often takes many requests to render a single resource. The HTML document itself is one request, but each image, JS file, CSS file, etc. is also a request. Then, there's any AJAX requests the page may issue. In other words, it's not uncommon for a request for a single resource to generate 20+ requests to the web server.
Given that, when your server hits its max requests (all threads are being utilized), any further requests are queued and processed in order as threads are made available. What async does is buy you some additional head room. If there's threads that are in a wait-state (waiting for the results of a database query, the response from a web service, a file to be read from the filesystem, etc.), then async allows these threads to be returned to the pool, where they are then able to field some of those waiting requests. When whatever the thread was waiting on completes, a new thread is requested to finish servicing the request.
What is important to note here is that a new thread is requested to finish servicing the request. Once the thread has been released to the pool, you have to wait for a thread again, just like a brand new request. This means running async can sometimes take longer than running sync, depending on the availability of threads in the pool. Also, async carries with it a non-insignificant amount of overhead that also adds to the overall load time.
Async != faster. It can many times be slower, but it allows your web server to more efficiently utilize resources, which could mean the difference between falling down and gracefully bearing load. Because of this, there's no one universal answer to a question like "Should I just make everything async?" Async is a trade-off between raw performance and efficiency. In some situations it may not make sense to use async at all, while in others you might want to use it for everything that's applicable. What you need to do is first identify the stress points of your application. For example, if your database instance resides on the same server as your web server (not a good idea, BTW), using async on your database queries would be fairly pointless. The chief culprit of waiting is network latency, whereas filesystem access is typically relatively quick. On the other hand, if your database server is in a remote datacenter and has to not only travel the pipes between there and your web server but also do things like traverse firewalls, well, then your network latency is much more significant, and async is probably a very good idea.
Long and short, you need to evaluate your setup, your environment and the needs of your application. Then, and only then, can you make smart decisions about this. That said, even given the overhead of async, if there's network latency involved at all, it's a pretty safe bet async should be used. It's perfectly acceptable to err on the side of caution and just use async everywhere it's applicable, and many do just that. If you're looking to optimize for performance though (perhaps you're starting the next Facebook?), then you'd want to be much more judicious.
Here, the reason to use async IO is to not have many threads running at the same time. Threads consume OS resources and memory. The thread pool can also be a little slow to adjust to sudden load. If your thread count in a web app is below 100 and load is not extremely spiky you have nothing to worry about.
Generally, the slower a web service and the more often it is called the more beneficial async IO can be. You will need on average (latency * frequency) threads running. So 100ms call time and 10 calls per second is about 1 thread on average.
Run the numbers and see if you need to change anything or not.
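Running the numbers is a single multiplication. The sketch below wraps the (latency * frequency) rule of thumb from above in a helper; the class name, method name and the second set of inputs are made up for illustration.

```csharp
using System;

public static class ThreadEstimate
{
    // Average number of threads held by synchronous calls:
    // time each call holds a thread, multiplied by how often calls arrive.
    public static double AverageBusyThreads(double latencySeconds, double callsPerSecond)
    {
        return latencySeconds * callsPerSecond;
    }
}

public static class Program
{
    public static void Main()
    {
        // The example from the text: 100 ms per call, 10 calls per second.
        Console.WriteLine(ThreadEstimate.AverageBusyThreads(0.1, 10));

        // A slower, busier service (hypothetical): 2 s per call, 200 calls per second.
        Console.WriteLine(ThreadEstimate.AverageBusyThreads(2.0, 200));
    }
}
```

The first case comes to about 1 thread, so async IO buys little there. The second comes to about 400 threads tied up on average, which is where releasing threads during the wait starts to matter.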
Any thoughts?
Yes, lots of thoughts...but that alone doesn't count as an answer. ;)
There is no real good answer here since there isn't much context provided. But let's address what we know.
Since we are a web application, each request/response cycle has a direct impact on performance and can be a bottleneck.
Since we are internally invoking another API call from ours, we shouldn't assume that it is hosted on the same server - as such this should be treated just like all I/O bound operations.
With the two known factors above, we should make our calls async. Consider the following:
public async Task<ActionResult> Index()
{
    var result = await GetIdAsync();
    ViewBag.result = result.Data;
    return View();
}
public async Task<JsonResult> GetIdAsync()
{
    var result = "";
    var url = "http://APIURL/GetID";
    using (WebClient client = new WebClient())
    {
        // Note the API change here: DownloadStringTaskAsync is the awaitable
        // variant; DownloadStringAsync is event-based and returns void.
        result = await client.DownloadStringTaskAsync(url);
    }
    return Json(result, JsonRequestBehavior.AllowGet);
}
Now, we are correctly using the async and await keywords with our Task<T>-returning operations. This lets the request thread go back to the pool while the call is in flight. Notice the API change to client.DownloadStringTaskAsync too; this is very important.

Asynchronous DB-Query to trigger Stored Procedure

I want to perform an asynchronous DB query in C# that calls a stored procedure for a backup. Since we use Azure this takes about 2 minutes, and we don't want the user to wait that long.
So the idea is to make it asynchronous, so that the task continues to run after the request.
[HttpPost]
public ActionResult Create(Snapshot snapshot)
{
    db.Database.CommandTimeout = 7200;
    Task.Run(() => db.Database.ExecuteSqlCommandAsync("EXEC PerformSnapshot @User = '" + CurrentUser.AccountName + "', @Comment = '" + snapshot.Comment + "';"));
    this.ShowUserMessage("Your snapshot has been created.");
    return this.RedirectToActionImpl("Index", "Snapshots", new System.Web.Routing.RouteValueDictionary());
}
I'm afraid that I haven't understood the concept of asynchronous tasks. The query will not be executed (or is it aborted?) if I don't use the await statement. But actually "waiting" is the one thing I especially don't want to do here.
So... why am I forced to use await here?
Or will the method be started, but killed once the request is finished?
We don't want the user to wait that long.
async-await won't help you with that. Odd as it may sound, the basic async-await pattern is about implementing synchronous behavior in a non-blocking fashion. It doesn't re-arrange your logical flow; in fact, it goes to great lengths to preserve it. The only thing you've changed by going async here is that you're no longer tying up a thread during that 2-minute database operation, which is a huge win for your app's scalability if you have lots of concurrent users, but doesn't speed up an individual request one bit.
I think what you really want is to run the operation as a background job so you can respond to the user immediately. But be careful - there are bad ways to do that in ASP.NET (i.e. Task.Run) and there are good ways.
Dave, you're not forced to use await here. And you're right - from the user's perspective it will still take 2 minutes. The only difference is that the thread that was processing your request can process other requests while the database does its job. When the database finishes, a thread will pick your request back up and continue processing it.
Say you have a limited number of threads capable of processing HTTP requests. This async code will help you process more requests per time period, but it won't help the user get the job done any faster.
This seems to be down to a misunderstanding as to what async and await do.
async does not mean run this on a new thread, in essence it acts as a signal to the compiler to build a state machine, so a method like this:
async Task<int> GetMeAnInt()
{
    return await myWebService.GetMeAnInt();
}
sort of (cannot stress this enough), gets turned into this:
Task<int> GetMeAnInt()
{
    // Pseudocode only: this is not valid C#
    var awaiter = myWebService.GetMeAnInt().GetAwaiter();
    awaiter.OnCompleted(() => goto done);
    return Task.InProgress;
    done:
    return awaiter.Result;
}
MSDN has way more information about this, and there's even some code out there explaining how to build your own awaiters.
async and await at their very core just enable you to write code that uses callbacks under the hood, but in a nice way that tells the compiler to do the heavy lifting for you.
If you really want to run something in the background, then you need to use Task:
Task<int> GetMeAnInt()
{
    return Task.Run(() => myWebService.GetMeAnInt());
}
OR
Task<int> GetMeAnInt()
{
    return Task.Run(async () => await myWebService.GetMeAnInt());
}
The second example uses async and await in the lambda because in this scenario GetMeAnInt on the web service also happens to return Task<int>.
To recap:
async and await just instruct the compiler to do some jiggerypokery
This uses labels and callbacks with goto
Fun fact, this is valid IL but the C# compiler doesn't allow it for your own code, hence why the compiler can get away with the magic but you can't.
async does not mean "run on a background thread"
Task.Run() can be used to queue an arbitrary function to run on a thread pool thread
Task.Factory.StartNew() with TaskCreationOptions.LongRunning can be used to hint that an arbitrary function should get a dedicated thread rather than a pool thread
await instructs the compiler that this is the point at which the result of the awaiter for the awaitable (e.g. Task) being awaited is required - this is how it knows how to structure the state machine.
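The recap point that async does not mean "run on a background thread" can be checked in a console application. This is a minimal sketch with hypothetical names (ThreadDemo, ThreadBeforeFirstAwaitAsync, RunsOnPoolThreadAsync): it records which thread executes the start of an async method and compares that with Task.Run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadDemo
{
    // Returns the id of the thread that ran the code before the first await.
    public static async Task<int> ThreadBeforeFirstAwaitAsync()
    {
        int id = Environment.CurrentManagedThreadId; // still the caller's thread
        await Task.Yield();                          // first real suspension point
        return id;
    }

    // Task.Run queues the delegate to the thread pool.
    public static Task<bool> RunsOnPoolThreadAsync()
    {
        return Task.Run(() => Thread.CurrentThread.IsThreadPoolThread);
    }
}

public static class Program
{
    public static async Task Main()
    {
        int caller = Environment.CurrentManagedThreadId;
        int insideAsync = await ThreadDemo.ThreadBeforeFirstAwaitAsync();
        bool onPool = await ThreadDemo.RunsOnPoolThreadAsync();

        Console.WriteLine(insideAsync == caller); // True: no thread switch just for being async
        Console.WriteLine(onPool);                // True: Task.Run moved the work to the pool
    }
}
```

Calling an async method runs it synchronously on the caller's thread up to its first incomplete await; only Task.Run (or similar) actually moves work elsewhere.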
As I describe in my MSDN article on async ASP.NET, async is not a silver bullet; it doesn't change the HTTP protocol:
When some developers learn about async and await, they believe it’s a way for the server code to “yield” to the client (for example, the browser). However, async and await on ASP.NET only “yield” to the ASP.NET runtime; the HTTP protocol remains unchanged, and you still have only one response per request.
In your case, you're trying to use a web request to kick off a backend operation and then return to the browser. ASP.NET was not designed to execute backend operations like this; it is only a web tier framework. Having ASP.NET execute work is dangerous because ASP.NET is only aware of work coming in from its requests.
I have an overview of various solutions on my blog. Note that using a plain Task.Run, Task.Factory.StartNew, or ThreadPool.QueueUserWorkItem is extremely dangerous because ASP.NET doesn't know anything about that work. At the very least you should use HostingEnvironment.QueueBackgroundWorkItem so ASP.NET at least knows about the work. But that doesn't guarantee that the work will actually ever complete.
A proper solution is to place the work in a persistent queue and have an independent background worker process that queue. See the Asynchronous Messaging Primer (specifically, your scenario is "Decoupling workloads").
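To make the shape of "persistent queue plus independent worker" concrete, here is a deliberately tiny sketch that uses a folder as the queue, one file per message. FolderQueue and its methods are hypothetical and only stand in for a real durable queue such as an Azure queue or MSMQ; the point is that the message is on disk before the request returns and is removed only after it has been handled.

```csharp
using System;
using System.IO;

// Hypothetical directory-backed queue: one file per message. A stand-in for
// a real durable queue, meant to show the shape rather than to be used as-is.
public sealed class FolderQueue
{
    private readonly string _dir;

    public FolderQueue(string dir)
    {
        _dir = dir;
        Directory.CreateDirectory(dir);
    }

    // Called from the web request: persist the message, then return at once.
    public void Enqueue(string message)
    {
        string name = DateTime.UtcNow.Ticks + "-" + Guid.NewGuid().ToString("N");
        string temp = Path.Combine(_dir, name + ".tmp");
        File.WriteAllText(temp, message);
        // Rename last so the worker never picks up a half-written file.
        File.Move(temp, Path.Combine(_dir, name + ".msg"));
    }

    // Called from the independent worker process. A file is deleted only
    // after its handler returns, so a crash part-way leaves the message
    // queued (at-least-once delivery: handlers should tolerate duplicates).
    public int ProcessPending(Action<string> handler)
    {
        int handled = 0;
        string[] files = Directory.GetFiles(_dir, "*.msg");
        Array.Sort(files, StringComparer.Ordinal); // roughly oldest first
        foreach (string file in files)
        {
            handler(File.ReadAllText(file));
            File.Delete(file);
            handled++;
        }
        return handled;
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new FolderQueue(Path.Combine(Path.GetTempPath(), "snapshot-queue-demo"));
        queue.Enqueue("PerformSnapshot|some comment");
        int handled = queue.ProcessPending(m => Console.WriteLine("processing: " + m));
        Console.WriteLine(handled + " message(s) handled");
    }
}
```

In practice Enqueue would live in the web application and ProcessPending in a separate service or job, so a recycle of the web process cannot take pending work down with it.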

Using Async controller action to call existing synchronous method

I've not dealt much with Async/threads/Tasks other than some web services.
I'm using MVC4. I have existing code which takes some time to run. It uses an existing method in the service layer, which in turn uses various other areas in further layers.
Essentially I was hoping to be able to make an async call from the AsyncController to that method. However, it appears that I would need to change or create another method to implement all the Task and await keywords, which is quite a hefty job altering code all the way down the chain.
Is it possible to call/'fire' a synchronous method in this manner?
I want the long process (creating some documents in the background) to continue running even if the user closes their browser. However if the user still has the browser open then I would like to return a notification to them.
Is there a better way to fire a background task to execute from the MVC Application?
I think you're trying to use async for something it cannot do. As I describe on my blog, async does not change the HTTP protocol.
Is it possible to call/'fire' a synchronous method in this manner?
Sort of. You can use Task.Run if you have CPU-bound work that you want to move off the UI thread in a desktop/mobile application. But there is no point in doing that in an ASP.NET MVC application.
I want the long process (creating some documents in the background) to continue running even if the user closes their browser. However if the user still has the browser open then I would like to return a notification to them.
The problem with this is that you'd be returning early from an ASP.NET request, and (as I describe on my blog), that's quite dangerous.
A proper solution would be to queue the work in a reliable queue (e.g., Azure queue or MSMQ), have an independent backend for processing (e.g., Azure worker role / web job or Win32 service), and use something like SignalR for notification.
As soon as you attempt to do work in an ASP.NET process without a request context, then you run into the danger that your process may exit without completing the work. If you are OK with this, then you can use the BackgroundTaskManager type from my blog above to minimize the chance of that happening (but keep in mind: it can still happen).
