How do asynchronous tasks (async/await) work in .NET 4.5?
Some sample code:
private async Task<bool> TestFunction()
{
var x = await DoesSomethingExists();
var y = await DoesSomethingElseExists();
return y;
}
Does the second await statement get executed right away or after the first await returns?
await pauses the method until the operation completes. So the second await would get executed after the first await returns.
For more information, see my async / await intro or the official FAQ.
It executes after the first await returns. If this thing confuses you, try to play around with breakpoints - they are fully supported by the new async pattern.
Imagine it would look like this:
var x = await GetSomeObjectInstance();
var y = await GetSomeObjectInstance2(x);
If the second call started before the first had returned, x would still be null or unassigned and a NullReferenceException would probably occur somewhere. So the first await has to return first.
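The ordering can be checked with a small console sketch. StepAsync is a hypothetical stand-in for DoesSomethingExists and DoesSomethingElseExists, with Task.Delay simulating the I/O; it assumes a compiler that accepts an async Main (C# 7.1 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAwaitDemo
{
    // Hypothetical stand-in for an I/O call; records when it starts and ends.
    private static async Task<bool> StepAsync(string name, List<string> log)
    {
        log.Add(name + " started");
        await Task.Delay(100); // simulated I/O
        log.Add(name + " finished");
        return true;
    }

    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        await StepAsync("first", log);  // the method pauses here
        await StepAsync("second", log); // not started until "first" has finished
        return log;
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

The log always reads first started, first finished, second started, second finished: the second call is not even invoked until the first await has completed.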
The method calls still occur sequentially, just like "regular", non-awaited method calls. The purpose of await is that it returns the current thread to the thread pool while the awaited operation runs.
This is particularly useful in high-performance environments, say a web server, where a given request is processed on a given thread from the overall thread pool. If we don't await, then the thread processing the request (and all its resources) remains "in use" while the db / service call completes. This might take a couple of seconds or more, especially for external service calls.
Now in low traffic websites this is not much of an issue but in high traffic sites the cost of all these request threads just sitting around, doing nothing, in an "in-use" state, waiting for other processes like those db /service calls to return can be a resource burden.
We are better off releasing the thread back to the worker pool so that it can do useful work for some other request.
Once the db / service call completes, we can then interrupt the thread pool and ask for a thread to carry on processing that request from where it left off. At that point the state of the request is reloaded and the method call continues.
So on a per-request basis when using await, the request will still take the same amount of time from the user's perspective... plus a tiny smidge more for the switching overhead.
But in the aggregate, across all requests for all users, things can seem more performant to all users because the web server (in this case) runs more efficiently with better resource utilization. That is, either it doesn't have to queue up requests waiting for free threads, because await is returning threads to the pool, or we don't have to buy more hardware, because we are using the same hardware more efficiently to obtain higher throughput.
There is a switching cost to this though so despite what you see in the default templates and in many docs you shouldn't just blindly use await for every single call. It's just a tool and like all tools it has its place. If the switching cost is not less than the cost of just completing your calls synchronously then you shouldn't use await.
Related
I've been struggling for some days now to work out where to use await and where not to.
I have a Repository class which fetches the data from database.
Using Entity Framework, the code would be something like this:
public async Task<List<Object>> GetAsync()
{
return await context.Set<Object>().ToListAsync();
}
and the consumer:
var data = await GetAsync();
and at the top level I'm awaiting this method too.
Should I use await on only one of these methods?
Is there a performance penalty in terms of resources, and does each await create a new thread?
I have checked the questions listed in the comments; they do not refer to the performance issues and just say that you can do it. I wanted the best practice and the reason why to / not to do so.
I'd like to add to that.
There are some async methods where there is no need to use async/await keywords. It is important to detect this kind of misuse because adding the async modifier comes at a price.
E.g. you don't need the async/await keywords in your example.
public Task<List<Object>> GetAsync()
{
return context.Set<Object>().ToListAsync();
}
And then:
var data = await GetAsync();
Will be just fine. In this case, you are returning the Task<List<Object>> and then you are awaiting that in the place you directly work with objects.
I recommend installing async await helper
Let me first capture the essence of your question: the confusion is about where to use async in the complete chain of calls and where not to, and how to assess the performance impact of using it, since it may lead to the creation of more threads. If the question goes beyond this, add details in the comments and I will try to answer them too.
Let's divide and tackle each of them one by one.
Where to use async in the chain of calls, and where not?
Since you are using Entity Framework to access a database, I can safely assume you are doing IO-based asynchronous processing, which is the most prominent use case for async processing across languages and frameworks. Use cases for CPU-based asynchronous processing are relatively limited (I will explain them too).
Async is a scalability feature, especially for IO processing, rather than a performance feature. In simple words, with async processing a hosted server can cater to many times more IO calls, because the calls are not blocking: they just hand the processing request over the network while the thread goes back to the pool, ready to serve another request. The hand-over completes in a few milliseconds.
When the processing is complete, a software thread just needs to receive the result and pass it back to the client, which again takes a few milliseconds, and mostly less than 1 ms if it is a pure pass-through IO call with no logic.
What are the benefits?
Imagine instead making a synchronous IO call to a database, where each thread involved just waits for the result to arrive, which may take a few seconds. The impact will be highly negative: depending on the thread pool size you may serve 25 - 50 requests at most, and those too will rely on the number of cores available, with threads tying up resources while they sit idle waiting for the response.
With synchronous calls there is no way to serve 1000+ requests in the same setup, and I am being extremely conservative. Async can have a huge scalability impact: for lightweight calls a single hosted process may serve millions of requests with ease.
After that background: where to use async in the complete chain?
Everywhere feasible, from beginning to end, from the entry point to the actual exit point that makes the IO call, since that is the call that actually relieves the pool thread as it dispatches the call over the network.
Do remember, though, that await at a given point doesn't allow further code in the same method to run, even though it relieves the thread. So if there are multiple independent calls, it is better to aggregate them using Task.WhenAll and await the representative task; it returns when all of them have finished, whatever their state (success or error).
If the async chain is broken in between by something like Task.Wait or Task.Result, it is no longer a pure async call and will block the calling thread pool thread.
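As a minimal sketch of aggregating independent calls with Task.WhenAll: FetchAsync is a hypothetical I/O-bound lookup simulated with Task.Delay, not part of the question's repository, and an async Main is assumed (C# 7.1 or later).

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical I/O-bound lookup, simulated with Task.Delay.
    private static async Task<int> FetchAsync(int id)
    {
        await Task.Delay(100);
        return id * 10;
    }

    public static async Task<int[]> FetchAllAsync()
    {
        // Start all three calls first; none of them is awaited yet.
        Task<int> a = FetchAsync(1);
        Task<int> b = FetchAsync(2);
        Task<int> c = FetchAsync(3);

        // Await the single aggregate task. Results come back in argument order.
        return await Task.WhenAll(a, b, c);
    }

    public static async Task Main()
    {
        int[] results = await FetchAllAsync();
        Console.WriteLine(string.Join(", ", results)); // 10, 20, 30
    }
}
```

If any of the aggregated tasks fails, awaiting the WhenAll task rethrows one of the exceptions once all of them have finished, so the caller still has a single point at which to handle errors.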
How can async be further enhanced?
In pure library calls, where the async work is initiated from the thread pool, the dispatching thread can differ from the receiving one, and the call doesn't need to re-enter the same context, you should use ConfigureAwait(false). It means the continuation will not wait to re-enter the original context, which is a performance boost.
Like await, it makes sense to use ConfigureAwait(false) across the chain, from entry to end. This is valid only for libraries which rely extensively on thread pools.
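A library-style helper using ConfigureAwait(false) might look like the sketch below. TextLibrary and CountLinesAsync are hypothetical names chosen for illustration; the only assumption about the caller is that it hands in a TextReader.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TextLibrary
{
    // Library-style helper: it touches no UI or request state, so it does
    // not need to resume on the caller's original context.
    public static async Task<int> CountLinesAsync(TextReader reader)
    {
        int count = 0;
        while (await reader.ReadLineAsync().ConfigureAwait(false) != null)
        {
            count++;
        }
        return count;
    }

    public static async Task Main()
    {
        using (var reader = new StringReader("a\nb\nc"))
        {
            Console.WriteLine(await CountLinesAsync(reader)); // 3
        }
    }
}
```

The caller can still await CountLinesAsync normally; only the continuations inside the library method skip the context capture.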
Is there a thread created?
Best read this: Stephen Cleary - There's no thread
A genuine IO async call uses hardware-based concurrency to do the work and does not block the software threads.
Variations
CPU-based asynchronous processing, where you move work to the background because the current thread needs to stay responsive, mostly in the case of a UI such as WPF
Use cases
All kinds of systems, especially non-MS frameworks like Node.js, have async processing as an underlying principle, and the database server cluster on the receiving end is tuned to receive millions of calls and process them
B2C calls, where it's expected that each request is lightweight with a limited payload
Edit 1:
In this specific case, ToListAsync is asynchronous by default, so you can skip async/await there, as noted in various comments. Do review Stephen Cleary's article, though: in general that may not be a very good strategy, since the gains are minimal and the negative impact of incorrect usage can be high.
I'm executing a web request to get a message, then await the processing of that message, then repeat the whole process again.
The processing of the message will be long running and the thread may be in a waiting state that may allow it to be used elsewhere. What I'd like is to continue the while loop, get more messages and process them when threads become free.
Current synchronous code:
while (!cancellationToken.IsCancellationRequested) {
var message = await GetMessage();
await ProcessMessage(message); // I'll need it to continue from here if thread is released.
}
The scenario this is used in is a message queue Consumer service.
Given the use of async / await, your current code isn't necessarily synchronous (in thread terms - the continuations can be invoked on different threads), although the dependency between getting a message and processing it obviously must be upheld.
Re: the thread may be in a waiting state that may allow it to be used elsewhere
Awaiting on well coded I/O-bound work doesn't need to consume a thread at all - see Stephen Cleary's There is no thread. Assuming the two awaited tasks are IO-bound, your code will likely consume no threads at all while it is awaiting IO bound work, i.e. the rest of your application will have the use of the Threadpool. So if your only concern was wasting threads, then nothing more is needed.
If however your concern is about performance and additional throughput, if there is downstream capacity to do concurrent calls to ProcessMessage (e.g. multiple downstream web servers or additional database capacity), then you could look at parallelizing the IO bound work (again, without requiring more Threadpool threads)
For instance, if you are able to re-write the GetMessages call to retrieve a batch at a time, you could try this:
var messages = await GetMessages(10);
var processTasks = messages
.Select(message => ProcessMessage(message));
await Task.WhenAll(processTasks);
(and if you can't touch code, you could just loop GetMessages to retrieve 10 individual messages before the Task.WhenAll)
However, if you do NOT have any further capacity to do concurrent ProcessMessage calls, then you should instead look at addressing the bottleneck - e.g. adding more servers, optimizing code, or parallelizing the work done in ProcessMessage work, etc.
The rationale is that, as you say, GetMessages retrieves data off a queue. If you have no capacity to process the messages you've retrieved, all you could do is queue messages somewhere else, which seems rather pointless - rather leave the messages on the Queue until you are ready to process them. The queue depth will also create visibility of the backlog of work building up, which you can monitor.
Edit, Re : Occasionally one ProcessMessage() call takes much longer than others
As per the comments, OP has additional information that an occasional ProcessMessage call takes much longer than others, and would like to continue processing other messages in the interim.
One approach could be to apply a timeout to the Parallel tasks using this clever pattern here, which, if reached, will leave any long running ProcessTasks running, and will continue with the next batch of messages.
The below is potentially dangerous, in that it will require careful balancing of the timeout (1000ms below) against the observed frequency of the misbehaving ProcessMessage calls - if the timeout is too low vs the frequency of 'slow' ProcessMessages, the downstream resources can become overwhelmed.
A safer (yet more complicated) addition would be to track the concurrent number of incomplete ProcessMessage tasks via Task.IsCompleted, and if this hits a threshold, then to await completion of enough of these tasks to bring the backlog to a safe level.
while (!cancellationToken.IsCancellationRequested)
{
// Ideally, the async operations should all accept cancellationTokens too
var messages = await GetMessages(10, cancellationToken);
var processTasks = messages
.Select(message => ProcessMessage(message, cancellationToken));
await Task.WhenAny(Task.WhenAll(processTasks),
Task.Delay(1000, cancellationToken));
}
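The safer variant described above, which tracks incomplete tasks via Task.IsCompleted and waits once a threshold is reached, could be sketched roughly as follows. ProcessAllAsync, maxInFlight and the int message type are hypothetical names and simplifications, not part of the original consumer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BacklogThrottle
{
    // Starts one task per message but never lets more than maxInFlight
    // incomplete tasks accumulate. Returns the highest backlog observed.
    public static async Task<int> ProcessAllAsync(
        IEnumerable<int> messages,
        Func<int, Task> processMessage,
        int maxInFlight)
    {
        var inFlight = new List<Task>();
        int peak = 0;

        foreach (int message in messages)
        {
            // Drop tasks that have already finished. A real consumer would
            // also inspect them for exceptions before discarding them.
            inFlight.RemoveAll(t => t.IsCompleted);

            // At the threshold: wait until at least one task has finished.
            while (inFlight.Count >= maxInFlight)
            {
                await Task.WhenAny(inFlight);
                inFlight.RemoveAll(t => t.IsCompleted);
            }

            inFlight.Add(processMessage(message));
            peak = Math.Max(peak, inFlight.Count);
        }

        await Task.WhenAll(inFlight); // drain the remainder
        return peak;
    }

    public static async Task Main()
    {
        // Hypothetical workload: every fifth message is slow.
        Func<int, Task> process = m => Task.Delay(m % 5 == 0 ? 300 : 20);
        int peak = await ProcessAllAsync(Enumerable.Range(1, 30), process, 4);
        Console.WriteLine("Peak in-flight tasks: " + peak);
    }
}
```

Unlike the fixed timeout, this bounds the downstream load directly: a slow message only occupies one of the maxInFlight slots while the others keep turning over.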
Re : Throttling for safe levels of downstream load - TPL DataFlow more than likely would be of use here.
Take a look at https://msdn.microsoft.com/library/hh191443(vs.110).aspx; it should get you going. Also, it seems ProcessMessage ought to end with 'Async' according to the C#/.NET style guide.
You'll want to set up a Task<ReturnTypeOfProcessMessage> procMessageTask = ProcessMessageAsync(message);
then you can do your business while it's running,
SomeBusiness(...)
then
await procMessageTask;
Seems like you may also want some type of await-with-timeout functionality so that you can poll, here's a question related to that:
Asynchronously wait for Task<T> to complete with timeout
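One common shape for such an await-with-timeout helper races the task against Task.Delay. This is a sketch under that assumption; CompletesWithinAsync is a hypothetical name, and the helper does not cancel the underlying work.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Returns true if the task finished within the timeout, false otherwise.
    // The task itself is not cancelled; it keeps running in either case.
    public static async Task<bool> CompletesWithinAsync(Task task, TimeSpan timeout)
    {
        Task winner = await Task.WhenAny(task, Task.Delay(timeout));
        if (winner != task)
            return false;

        await task; // surface any exception from the completed task
        return true;
    }

    public static async Task Main()
    {
        Task slow = Task.Delay(2000);
        Console.WriteLine(await CompletesWithinAsync(slow, TimeSpan.FromMilliseconds(100))); // False

        Task fast = Task.Delay(10);
        Console.WriteLine(await CompletesWithinAsync(fast, TimeSpan.FromSeconds(2)));        // True
    }
}
```

Calling the helper repeatedly on the same task gives the polling behaviour: each call waits at most the timeout and reports whether the work is done yet.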
HTH
I'm trying to benchmark (using Apache bench) a couple of ASP.NET Web API 2.0 endpoints. One of which is synchronous and one async.
[Route("user/{userId}/feeds")]
[HttpGet]
public IEnumerable<NewsFeedItem> GetNewsFeedItemsForUser(string userId)
{
return _newsFeedService.GetNewsFeedItemsForUser(userId);
}
[Route("user/{userId}/feeds/async")]
[HttpGet]
public async Task<IEnumerable<NewsFeedItem>> GetNewsFeedItemsForUserAsync(string userId)
{
return await Task.Run(() => _newsFeedService.GetNewsFeedItemsForUser(userId));
}
After watching Steve Sanderson's presentation I issued the following command ab -n 100 -c 10 http://localhost.... to each endpoint.
I was surprised as the benchmarks for each endpoint seemed to be approximately the same.
Going off what Steve explained I was expecting that the async endpoint would be more performant because it would release thread pool threads back to the thread pool immediately, thus making them available for other requests and improving throughput. But the numbers seem exactly the same.
What am I misunderstanding here?
Using await Task.Run to create "async" WebApi is a bad idea - you will still use a thread, and even from the same thread pool used for requests.
It will lead to some unpleasant moments described in good detail here:
Extra (unnecessary) thread switching to the Task.Run thread pool thread. Similarly, when that thread finishes the request, it has to enter the request context (which is not an actual thread switch but does have overhead).
Extra (unnecessary) garbage is created. Asynchronous programming is a tradeoff: you get increased responsiveness at the expense of higher memory usage. In this case, you end up creating more garbage for the asynchronous operations that is totally unnecessary.
The ASP.NET thread pool heuristics are thrown off by Task.Run “unexpectedly” borrowing a thread pool thread. I don’t have a lot of experience here, but my gut instinct tells me that the heuristics should recover well if the unexpected task is really short and would not handle it as elegantly if the unexpected task lasts more than two seconds.
ASP.NET is not able to terminate the request early, i.e., if the client disconnects or the request times out. In the synchronous case, ASP.NET knew the request thread and could abort it. In the asynchronous case, ASP.NET is not aware that the secondary thread pool thread is “for” that request. It is possible to fix this by using cancellation tokens, but that’s outside the scope of this blog post.
Basically, you are not giving ASP.NET any asynchrony - you are just hiding the CPU-bound synchronous code behind an async facade. Async on its own is ideal for I/O-bound code, because it allows the CPU (threads) to be used at top efficiency (no blocking for I/O), but when you have compute-bound code, you will still have to use the CPU to the same extent.
And taking into account the additional overhead from Task and context switching, you will get even worse results than with simple sync controller methods.
HOW TO MAKE IT TRULY ASYNC:
The GetNewsFeedItemsForUser method must be made async.
[Route("user/{userId}/feeds/async")]
[HttpGet]
public async Task<IEnumerable<NewsFeedItem>> GetNewsFeedItemsForUserAsync(string userId)
{
return await _newsFeedService.GetNewsFeedItemsForUser(userId);
}
To do it:
If it is a library method, look for its async variant (if there is none - bad luck, you'll have to search for a competing alternative).
If it is your own method using the file system or a database, leverage their async facilities to create an async API for the method.
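For the file system case, a minimal sketch of building an async API on the platform's async facilities is shown below. FeedStore and ReadFeedAsync are hypothetical names; the real news feed service is not shown in the question, so this only illustrates the pattern.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class FeedStore
{
    // Hypothetical async counterpart of a synchronous "read the feed file" method.
    public static async Task<string> ReadFeedAsync(string path)
    {
        // useAsync: true asks the OS for asynchronous (overlapped) file I/O.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return await reader.ReadToEndAsync();
        }
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "item1;item2");
            Console.WriteLine(await ReadFeedAsync(path)); // item1;item2
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

No Task.Run is involved: the request thread is released while the read is pending, which is the property the benchmark in the question was trying to measure.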
I want to perform an asynchronous DB query in C# that calls a stored procedure for a backup. Since we use Azure this takes about 2 minutes and we don't want the user to wait that long.
So the idea is to make it asynchronous, so that the task continues to run after the request.
[HttpPost]
public ActionResult Create(Snapshot snapshot)
{
db.Database.CommandTimeout = 7200;
Task.Run(() => db.Database.ExecuteSqlCommandAsync("EXEC PerformSnapshot @User = '" + CurrentUser.AccountName + "', @Comment = '" + snapshot.Comment + "';"));
this.ShowUserMessage("Your snapshot has been created.");
return this.RedirectToActionImpl("Index", "Snapshots", new System.Web.Routing.RouteValueDictionary());
}
I'm afraid that I haven't understood the concept of asynchronous tasks. The query will not be executed (or is aborted?) if I don't use the wait statement. But actually "waiting" is the one thing I especially don't want to do here.
So... why am I forced to use wait here?
Or will the method be started, but killed when the request is finished?
We don't want the user to wait that long.
async-await won't help you with that. Odd as it may sound, the basic async-await pattern is about implementing synchronous behavior in a non-blocking fashion. It doesn't re-arrange your logical flow; in fact, it goes to great lengths to preserve it. The only thing you've changed by going async here is that you're no longer tying up a thread during that 2-minute database operation, which is a huge win for your app's scalability if you have lots of concurrent users, but doesn't speed up an individual request one bit.
I think what you really want is to run the operation as a background job so you can respond to the user immediately. But be careful - there are bad ways to do that in ASP.NET (i.e. Task.Run) and there are good ways.
Dave, you're not forced to use await here. And you're right - from the user's perspective it will still take 2 minutes. The only difference is that the thread which processes your request can now process other requests while the database does its job. And when the database finishes, a thread will continue processing your request.
Say you have a limited number of threads capable of processing HTTP requests. This async code will help you process more requests per time period, but it won't help the user get the job done faster.
This seems to be down to a misunderstanding as to what async and await do.
async does not mean "run this on a new thread". In essence it acts as a signal to the compiler to build a state machine, so a method like this:
async Task<int> GetMeAnInt()
{
return await myWebService.GetMeAnInt();
}
sort of (cannot stress this enough), gets turned into this:
Task<int> GetMeAnInt()
{
var awaiter = myWebService.GetMeAnInt().GetAwaiter();
awaiter.OnCompletion(() => goto done);
return Task.InProgress;
done:
return awaiter.Result;
}
MSDN has way more information about this, and there's even some code out there explaining how to build your own awaiters.
async and await at their very core just enable you to write code that uses callbacks under the hood, but in a nice way that tells the compiler to do the heavy lifting for you.
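The "callbacks under the hood" idea can be made concrete with real APIs. The sketch below writes the same flow twice, once with await and once by hand with an awaiter callback and a TaskCompletionSource. GetValueAsync is a hypothetical stand-in for the web service call, and the hand-written version is a simplification: the compiler-generated state machine also checks IsCompleted first and flows context.

```csharp
using System;
using System.Threading.Tasks;

public static class CallbackDemo
{
    // Hypothetical async source standing in for the web service call.
    private static async Task<int> GetValueAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    // Version 1: the compiler generates the continuation for us.
    public static async Task<int> WithAwaitAsync()
    {
        int value = await GetValueAsync();
        return value + 1;
    }

    // Version 2: roughly the same flow written by hand with a callback.
    public static Task<int> WithCallbackAsync()
    {
        var tcs = new TaskCompletionSource<int>();
        var awaiter = GetValueAsync().GetAwaiter();
        awaiter.OnCompleted(() =>
        {
            try { tcs.SetResult(awaiter.GetResult() + 1); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task; // returned immediately, completed later by the callback
    }

    public static async Task Main()
    {
        Console.WriteLine(await WithAwaitAsync());    // 43
        Console.WriteLine(await WithCallbackAsync()); // 43
    }
}
```

Neither version starts a thread of its own; both simply register what should happen once the awaited operation completes.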
If you really want to run something in the background, then you need to use Task:
Task<int> GetMeAnInt()
{
return Task.Run(() => myWebService.GetMeAnInt());
}
OR
Task<int> GetMeAnInt()
{
return Task.Run(async () => await myWebService.GetMeAnInt());
}
The second example uses async and await in the lambda because in this scenario GetMeAnInt on the web service also happens to return Task<int>.
To recap:
async and await just instruct the compiler to do some jiggerypokery
This uses labels and callbacks with goto
Fun fact, this is valid IL but the C# compiler doesn't allow it for your own code, hence why the compiler can get away with the magic but you can't.
async does not mean "run on a background thread"
Task.Run() can be used to queue an arbitrary function to run on a thread pool thread
Task.Factory.StartNew() with TaskCreationOptions.LongRunning can be used to hint that an arbitrary function should run on a dedicated thread instead of a thread pool thread
await instructs the compiler that this is the point at which the result of the awaiter for the awaitable (e.g. Task) being awaited is required - this is how it knows how to structure the state machine.
As I describe in my MSDN article on async ASP.NET, async is not a silver bullet; it doesn't change the HTTP protocol:
When some developers learn about async and await, they believe it’s a way for the server code to “yield” to the client (for example, the browser). However, async and await on ASP.NET only “yield” to the ASP.NET runtime; the HTTP protocol remains unchanged, and you still have only one response per request.
In your case, you're trying to use a web request to kick off a backend operation and then return to the browser. ASP.NET was not designed to execute backend operations like this; it is only a web tier framework. Having ASP.NET execute work is dangerous because ASP.NET is only aware of work coming in from its requests.
I have an overview of various solutions on my blog. Note that using a plain Task.Run, Task.Factory.StartNew, or ThreadPool.QueueUserWorkItem is extremely dangerous because ASP.NET doesn't know anything about that work. At the very least you should use HostingEnvironment.QueueBackgroundWorkItem so ASP.NET at least knows about the work. But that doesn't guarantee that the work will actually ever complete.
A proper solution is to place the work in a persistent queue and have an independent background worker process that queue. See the Asynchronous Messaging Primer (specifically, your scenario is "Decoupling workloads").
After this question, I feel comfortable using async operations in ASP.NET MVC. So, I wrote two blog posts on that:
My Take on Task-based Asynchronous Programming in C# 5.0 and ASP.NET MVC Web Applications
Asynchronous Database Calls With Task-based Asynchronous Programming Model (TAP) in ASP.NET MVC 4
I have many misunderstandings about asynchronous operations in ASP.NET MVC.
I always hear this sentence: Application can scale better if operations run asynchronously
And I have heard this kind of sentence a lot as well: if you have a huge volume of traffic, you may be better off not performing your queries asynchronously - consuming 2 extra threads to service one request takes resources away from other incoming requests.
I think those two sentences are inconsistent.
I do not have much information about how the thread pool works in ASP.NET, but I know that the thread pool has a limited number of threads. So, the second sentence has to be related to this issue.
And I would like to know whether asynchronous operations in ASP.NET MVC use a thread from the ThreadPool on .NET 4?
For example, when we implement an AsyncController, how is the app structured? If I get huge traffic, is it a good idea to implement AsyncController?
Is there anybody out there who can take this black curtain away from in front of my eyes and explain the deal about asynchrony in ASP.NET MVC 3 (.NET 4)?
Edit:
I have read the document below nearly hundreds of times and I understand the main deal, but I am still confused because there are too many inconsistent comments out there.
Using an Asynchronous Controller in ASP.NET MVC
Edit:
Let's assume I have controller action like below (not an implementation of AsyncController though):
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Do an advanced logging here which takes a while
});
return View();
}
As you see here, I fire an operation and forget about it. Then, I return immediately without waiting for it to complete.
In this case, does this have to use a thread from the thread pool? If so, what happens to that thread after it completes? Does the GC come in and clean up just after it completes?
Edit:
For @Darin's answer, here is a sample of async code which talks to the database:
public class FooController : AsyncController {
//EF 4.2 DbContext instance
MyContext _context = new MyContext();
public void IndexAsync() {
AsyncManager.OutstandingOperations.Increment(3);
Task<IEnumerable<Foo>>.Factory.StartNew(() => {
return
_context.Foos;
}).ContinueWith(t => {
AsyncManager.Parameters["foos"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
Task<IEnumerable<Bar>>.Factory.StartNew(() => {
return
_context.Bars;
}).ContinueWith(t => {
AsyncManager.Parameters["bars"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
Task<IEnumerable<FooBar>>.Factory.StartNew(() => {
return
_context.FooBars;
}).ContinueWith(t => {
AsyncManager.Parameters["foobars"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
}
public ViewResult IndexCompleted(
IEnumerable<Foo> foos,
IEnumerable<Bar> bars,
IEnumerable<FooBar> foobars) {
//Do the regular stuff and return
}
}
Here's an excellent article I would recommend reading to better understand asynchronous processing in ASP.NET (which is what asynchronous controllers basically represent).
Let's first consider a standard synchronous action:
public ActionResult Index()
{
// some processing
return View();
}
When a request is made to this action a thread is drawn from the thread pool and the body of this action is executed on this thread. So if the processing inside this action is slow you are blocking this thread for the entire processing, so this thread cannot be reused to process other requests. At the end of the request execution, the thread is returned to the thread pool.
Now let's take an example of the asynchronous pattern:
public void IndexAsync()
{
// perform some processing
}
public ActionResult IndexCompleted(object result)
{
return View();
}
When a request is sent to the Index action, a thread is drawn from the thread pool and the body of the IndexAsync method is executed. Once the body of this method finishes executing, the thread is returned to the thread pool. Then, using the standard AsyncManager.OutstandingOperations, once you signal the completion of the async operation, another thread is drawn from the thread pool and the body of the IndexCompleted action is executed on it and the result rendered to the client.
So what we can see in this pattern is that a single client HTTP request could be executed by two different threads.
Now the interesting part happens inside the IndexAsync method. If you have a blocking operation inside it, you are totally wasting the whole purpose of the asynchronous controllers because you are blocking the worker thread (remember that the body of this action is executed on a thread drawn from the thread pool).
So when can we take real advantage of asynchronous controllers you might ask?
IMHO we can gain most when we have I/O intensive operations (such as database and network calls to remote services). If you have a CPU intensive operation, asynchronous actions won't bring you much benefit.
So why can we gain benefit from I/O intensive operations? Because we could use I/O Completion Ports. IOCP are extremely powerful because you do not consume any threads or resources on the server during the execution of the entire operation.
How do they work?
Suppose that we want to download the contents of a remote web page using the WebClient.DownloadStringAsync method. You call this method which will register an IOCP within the operating system and return immediately. During the processing of the entire request, no threads are consumed on your server. Everything happens on the remote server. This could take lots of time but you don't care as you are not jeopardizing your worker threads. Once a response is received the IOCP is signaled, a thread is drawn from the thread pool and the callback is executed on this thread. But as you can see, during the entire process, we have not monopolized any threads.
The same stands true with methods such as FileStream.BeginRead, SqlCommand.BeginExecute, ...
What about parallelizing multiple database calls? Suppose that you had a synchronous controller action in which you performed 4 blocking database calls in sequence. It's easy to calculate that if each database call takes 200ms, your controller action will take roughly 800ms to execute.
If you don't need to run those calls sequentially, would parallelizing them improve performance?
That's the big question, which is not easy to answer. Maybe yes, maybe no. It will entirely depend on how you implement those database calls. If you use async controllers and I/O Completion Ports as discussed previously you will boost the performance of this controller action and of other actions as well, as you won't be monopolizing worker threads.
On the other hand if you implement them poorly (with a blocking database call performed on a thread from the thread pool), you will basically lower the total time of execution of this action to roughly 200ms but you would have consumed 4 worker threads so you might have degraded the performance of other requests which might become starving because of missing threads in the pool to process them.
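The 800 ms versus 200 ms arithmetic can be reproduced without a database. In this sketch QueryAsync is a hypothetical non-blocking call simulated with Task.Delay, so no worker thread is held while the four "queries" are pending; exact timings will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class QueryTimingDemo
{
    // Hypothetical non-blocking database call that takes about 200 ms.
    private static Task QueryAsync() => Task.Delay(200);

    public static async Task<long> SequentialAsync()
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 4; i++)
            await QueryAsync(); // each call starts after the previous one ends
        return sw.ElapsedMilliseconds;
    }

    public static async Task<long> ConcurrentAsync()
    {
        var sw = Stopwatch.StartNew();
        // ToList forces all four calls to start before any is awaited.
        var queries = Enumerable.Range(0, 4).Select(_ => QueryAsync()).ToList();
        await Task.WhenAll(queries);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine("Sequential: " + await SequentialAsync() + " ms"); // roughly 800 ms
        Console.WriteLine("Concurrent: " + await ConcurrentAsync() + " ms"); // roughly 200 ms
    }
}
```

Replacing Task.Delay with a blocking call wrapped in Task.Run would give a similar elapsed time for the concurrent case, but at the cost of four occupied pool threads, which is the poor implementation the paragraph above warns about.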
So it is very difficult and if you don't feel ready to perform extensive tests on your application, do not implement asynchronous controllers, as chances are that you will do more damage than benefit. Implement them only if you have a reason to do so: for example you have identified that standard synchronous controller actions are a bottleneck to your application (after performing extensive load tests and measurements of course).
Now let's consider your example:
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Do an advanced logging here which takes a while
});
return View();
}
When a request is received for the Index action a thread is drawn from the thread pool to execute its body, but its body only schedules a new task using TPL. So the action execution ends and the thread is returned to the thread pool. Except that, TPL uses threads from the thread pool to perform their processing. So even if the original thread was returned to the thread pool, you have drawn another thread from this pool to execute the body of the task. So you have jeopardized 2 threads from your precious pool.
Now let's consider the following:
public ViewResult Index() {
new Thread(() => {
//Do an advanced logging here which takes a while
}).Start();
return View();
}
In this case we are manually spawning a thread, so the execution of the body of the Index action might take slightly longer (because spawning a new thread is more expensive than drawing one from an existing pool). But the advanced logging operation will run on a thread which is not part of the pool. So we are not jeopardizing threads from the pool, which remain free for serving other requests.
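Where each piece of work ends up running can be checked with Thread.CurrentThread.IsThreadPoolThread. This is a console sketch rather than a controller, and it assumes the default task scheduler; the method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadOriginDemo
{
    // StartNew with default options schedules the delegate on the thread pool
    // (assuming the default task scheduler is in effect).
    public static Task<bool> ViaStartNewAsync()
    {
        return Task.Factory.StartNew(() => Thread.CurrentThread.IsThreadPoolThread);
    }

    // A manually created thread is not taken from the pool.
    public static Task<bool> ViaNewThreadAsync()
    {
        var tcs = new TaskCompletionSource<bool>();
        var thread = new Thread(() => tcs.SetResult(Thread.CurrentThread.IsThreadPoolThread));
        thread.Start();
        return tcs.Task;
    }

    // Awaiting (rather than blocking on) the StartNew task avoids the
    // scheduler running it inline on the waiting thread.
    public static async Task<bool[]> RunAsync()
    {
        bool startNewOnPool = await ViaStartNewAsync();
        bool newThreadOnPool = await ViaNewThreadAsync();
        return new[] { startNewOnPool, newThreadOnPool };
    }

    public static async Task Main()
    {
        bool[] results = await RunAsync();
        Console.WriteLine("StartNew on pool thread:   " + results[0]); // True
        Console.WriteLine("new Thread on pool thread: " + results[1]); // False
    }
}
```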
Yes - all threads come from the thread pool. Your MVC app is already multi-threaded: when a request comes in, a thread is taken from the pool and used to service the request. That thread will be 'locked' (from other requests) until the request is fully serviced and completed. If there is no thread available in the pool the request will have to wait until one is available.
If you have async controllers they still get a thread from the pool but while servicing the request they can give up the thread, while waiting for something to happen (and that thread can be given to another request) and when the original request needs a thread again it gets one from the pool.
The difference is that if you have a lot of long-running requests (where the thread is waiting for a response from something) you might run out of threads from the pool to service even basic requests. If you have async controllers, you don't have any more threads, but the threads that would otherwise sit waiting are returned to the pool and can service other requests.
A nearly real life example...
Think of it like getting on a bus. There are five people waiting to get on. The first gets on, pays and sits down (the driver serviced their request). You get on (the driver is servicing your request) but you can't find your money. As you fumble in your pockets the driver gives up on you and gets the next two people on (servicing their requests). When you find your money the driver starts dealing with you again (completing your request). The fifth person has to wait until you are done, but the third and fourth people got served while you were half way through getting served. Here the driver is the one and only thread from the pool and the passengers are the requests. It was too complicated to write how it would work if there were two drivers, but you can imagine...
Without an async controller, the passengers behind you would have to wait ages while you looked for your money, meanwhile the bus driver would be doing no work.
So the conclusion is, if lots of people don't know where their money is (i.e. require a long time to respond to something the driver has asked), async controllers could well help throughput of requests, speeding up the process for some. Without an async controller everyone waits until the person in front has been completely dealt with. BUT don't forget that in MVC you have a lot of bus drivers on a single bus, so async is not an automatic choice.
There are two concepts at play here. First of all we can make our code run in parallel to execute faster or schedule code on another thread to avoid making the user wait. The example you had
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Do some advanced logging here which takes a while
});
return View();
}
belongs to the second category. The user will get a faster response but the total workload on the server is higher because it has to do the same work + handle the threading.
Another example of this would be:
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Make a web request to twitter with WebClient.DownloadString() (blocks this task's thread)
});
Task.Factory.StartNew(() => {
//Make a web request to facebook with WebClient.DownloadString() (blocks this task's thread)
});
//wait for both to be ready and merge the results
return View();
}
Because the requests run in parallel, the user won't have to wait as long as if they were done in serial. But you should realize that we use up more resources here than if we ran in serial, because we run the code on several threads while we have one thread waiting too.
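The "wait for both to be ready" step could look roughly like the sketch below. It is a console stand-in, not the original controller: Thread.Sleep replaces the blocking WebClient.DownloadString() calls so it runs without a network, and all names are invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelWaitDemo
{
    // Stand-in for a blocking call such as WebClient.DownloadString().
    private static string BlockingFetch(string source)
    {
        Thread.Sleep(200); // the task's thread is held for the whole wait
        return source + " data";
    }

    public static string[] FetchBoth()
    {
        Task<string> twitter = Task.Factory.StartNew(() => BlockingFetch("twitter"));
        Task<string> facebook = Task.Factory.StartNew(() => BlockingFetch("facebook"));

        // The calling thread blocks here too, so up to three threads
        // are occupied while the two fetches are in flight.
        Task.WaitAll(twitter, facebook);

        return new[] { twitter.Result, facebook.Result };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FetchBoth()));
    }
}
```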
This is perfectly fine in a client scenario. And it is quite common there to wrap synchronous long-running code in a new task (run it on another thread) to keep the UI responsive, or to parallelize it to make it faster. A thread is still used for the whole duration though. On a server with high load this could backfire because you actually use more resources. This is what people have warned you about.
Async controllers in MVC have another goal though. The point here is to avoid having threads sitting around doing nothing (which can hurt scalability). It really only matters if the APIs you are calling have async methods, like WebClient.DownloadStringAsync().
The point is that you can let your thread be returned to handle new requests until the web request is finished, at which point it will call your callback, which gets the same or a new thread and finishes the request.
I hope you understand the difference between asynchronous and parallel. Think of parallel code as code where your thread sits around and waits for the result, while asynchronous code is code where you are notified when the work is done and can get back to working on it; in the meantime the thread can do other work.
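For contrast, here is a rough sketch of the asynchronous style using async/await from .NET 4.5. It is a console stand-in, not a real controller: Task.Delay plays the role of a truly asynchronous I/O call, and FakeDownloadAsync and the other names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Stand-in for a real asynchronous I/O call. No thread is blocked
    // while the delay is pending.
    private static async Task<string> FakeDownloadAsync(string source)
    {
        await Task.Delay(200);
        return source + " data";
    }

    public static async Task<string[]> FetchBothAsync()
    {
        // Start both operations, then wait for them without blocking.
        Task<string> twitter = FakeDownloadAsync("twitter");
        Task<string> facebook = FakeDownloadAsync("facebook");

        // The method is suspended here; the thread goes back to its
        // caller and the rest runs as a continuation when both finish.
        return await Task.WhenAll(twitter, facebook);
    }

    public static void Main()
    {
        // Blocking on .Result is only acceptable here because this is
        // a console entry point; a controller would await instead.
        string[] results = FetchBothAsync().Result;
        Console.WriteLine(string.Join(", ", results));
    }
}
```

The shape of the code is almost the same as the parallel version, but nothing holds a thread during the 200 ms wait, which is the part that matters for a busy server.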
Applications can scale better if operations run asynchronously, but only if there are resources available to service the additional operations.
Asynchronous operations ensure that you're never blocking an action because an existing one is in progress. ASP.NET has an asynchronous model that allows multiple requests to execute side-by-side. It would be possible to queue the requests up and process them FIFO, but this would not scale well when you have hundreds of requests queued up and each request takes 100ms to process.
If you have a huge volume of traffic, you may be better off not performing your queries asynchronously, as there may be no additional resources to service the requests. If there are no spare resources, your requests are forced to queue up, take exponentially longer or outright fail, in which case the asynchronous overhead (mutexes and context-switching operations) isn't giving you anything.
As far as ASP.NET goes, you don't have a choice - it uses an asynchronous model, because that's what makes sense for the server-client model. If you were to write your own code internally that uses an async pattern to attempt to scale better, then unless you're trying to manage a resource that's shared between all requests, you won't actually see any improvements, because your requests are already wrapped in an asynchronous process that doesn't block anything else.
It's all subjective until you actually look at what's causing a bottleneck in your system. Sometimes it's obvious where an asynchronous pattern will help (by preventing a queued resource from blocking). Ultimately, only measuring and analysing a system can indicate where you can gain efficiencies.
Edit:
In your example, the Task.Factory.StartNew call will queue up an operation on the .NET thread-pool. The nature of Thread Pool threads is to be re-used (to avoid the cost of creating/destroying lots of threads). Once the operation completes, the thread is released back to the pool to be re-used by another request (the Garbage Collector doesn't actually get involved unless you created some objects in your operations, in which case they're collected as per normal scoping).
As far as ASP.NET is concerned, there is no special operation here. The ASP.NET request completes without respect to the asynchronous task. The only concern might be if your thread pool is saturated (i.e. there are no threads available to service the request right now and the pool's settings don't allow more threads to be created), in which case the request is blocked waiting to start the task until a pool thread becomes available.
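If you suspect saturation, one rough way to look at it is to compare the pool's limits with what is currently free. This is only a sketch with made-up names; the numbers depend on your machine and configuration, and a snapshot like this is no substitute for proper load testing.

```csharp
using System;
using System.Threading;

public static class PoolStatsDemo
{
    // Returns how many worker threads are currently in use,
    // computed as (maximum allowed) minus (currently available).
    public static int BusyWorkerThreads()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }

    public static void Main()
    {
        Console.WriteLine("Busy worker threads: " + BusyWorkerThreads());
    }
}
```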
Yes, they use a thread from the thread pool. There is actually a pretty excellent guide from MSDN that will tackle all of your questions and more. I have found it to be quite useful in the past. Check it out!
http://msdn.microsoft.com/en-us/library/ee728598.aspx
Meanwhile, the comments + suggestions that you hear about asynchronous code should be taken with a grain of salt. For starters, just making something async doesn't necessarily make it scale better, and in some cases can make your application scale worse. The other comment you posted about "a huge volume of traffic..." is also only correct in certain contexts. It really depends on what your operations are doing, and how they interact with other parts of the system.
In short, lots of people have lots of opinions about async, but they may not be correct out of context. I'd say focus on your exact problems, and do basic performance testing to see what async controllers, etc. actually do for your application.
First of all, it's not MVC but IIS that maintains the thread pool. So any request which comes to an MVC or ASP.NET application is served by threads which are maintained in that thread pool. Only when you make the app asynchronous does it invoke the action on a different thread and release the request thread immediately so that other requests can be taken.
I have explained the same in a detailed video (http://www.youtube.com/watch?v=wvg13n5V0V0/ "MVC Asynch controllers and thread starvation") which shows how thread starvation happens in MVC and how it's minimized by using MVC asynch controllers. I have also measured the request queues using perfmon so that you can see how the request queues decrease with MVC asynch and how they are at their worst for synch operations.