Optimizing an ASMX web service with Multiple Long-Running Operations - c#

I'm writing an ASP.NET web service using C# that has a DoLookup() function. For each call to the DoLookup() function I need my code to execute two separate queries: one to another web service at a remote site and one to a local database. Both queries have to complete before I can compile the results and return them as the response to the DoLookup method. The problem I'm dealing with is that I want to make this as efficient as possible, both in terms of response time and resource usage on the web server. We are expecting up to several thousand queries per hour. Here's a rough C#-like overview of what I have so far:
public class SomeService : System.Web.Services.WebService
{
public SomeResponse DoLookup()
{
// Do the lookup at the remote web service and get the response
WebResponse wr = RemoteProvider.DoRemoteLookup();
// Do the lookup at the local database and get the response
DBResponse dbr = DoDatabaseLookup();
SomeResponse resp = new SomeResponse( wr, dbr);
return resp;
}
}
The above code does everything sequentially and works great but now I want to make it more scalable. I know that I can call the DoRemoteLookup() function asynchronously ( RemoteProvider has BeginRemoteLookup / EndRemoteLookup methods) and that I can also do the database lookup asynchronously using the BeginExecuteNonQuery / EndExecuteNonQuery methods.
My question (finally) is this: how do I fire both the remote web service lookup AND the database lookup simultaneously on separate threads and ensure that they have both completed before returning the response?
The reason I want to execute both requests on separate threads is that they both potentially have long response times (1 or 2 seconds) and I'd like to free up the resources of the web server to handle other requests while it is waiting for responses. One additional note - I do have the remote web service lookup running asynchronously currently, I just didn't want to make the sample above too confusing. What I'm struggling with is getting both the remote service lookup AND the database lookup started at the same time and figuring out when they have BOTH completed.
Thanks for any suggestions.

You can use a pair of AutoResetEvents, one for each thread. At the end of each thread's work, call AutoResetEvent.Set() to signal the event.
After spawning the threads, call WaitHandle.WaitAll() with the two AutoResetEvents. This blocks the calling thread until both events are set.
The caveat to this approach is that you must ensure Set() is guaranteed to be called (for example, from a finally block); otherwise you will block forever. Also make sure the threads have proper exception handling, or unhandled exceptions will cause your web application to restart and create further performance problems.
MSDN has sample code for AutoResetEvent usage.
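As a rough illustration of the AutoResetEvent approach, here is a minimal console sketch. The names WaitAllSketch and RunBoth are made up for this example, and Thread.Sleep stands in for the real remote and database lookups. Note that the calling thread is blocked inside WaitAll, so this shortens a single request but does not free the worker thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class WaitAllSketch
{
    // Runs both stand-in lookups on pool threads and blocks the caller until
    // both have signaled. Exceptions are collected and rethrown afterwards.
    public static void RunBoth(Action remoteLookup, Action databaseLookup)
    {
        var errors = new ConcurrentQueue<Exception>();
        using (var remoteDone = new AutoResetEvent(false))
        using (var databaseDone = new AutoResetEvent(false))
        {
            Queue(remoteLookup, remoteDone, errors);
            Queue(databaseLookup, databaseDone, errors);

            // Blocks this thread until BOTH events have been set.
            WaitHandle.WaitAll(new WaitHandle[] { remoteDone, databaseDone });
        }
        if (!errors.IsEmpty) throw new AggregateException(errors);
    }

    private static void Queue(Action work, AutoResetEvent done, ConcurrentQueue<Exception> errors)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try { work(); }
            catch (Exception ex) { errors.Enqueue(ex); }
            finally { done.Set(); }   // always signal, or WaitAll never returns
        });
    }
}

public static class WaitAllDemo
{
    public static void Main()
    {
        string remote = null, database = null;
        WaitAllSketch.RunBoth(
            () => { Thread.Sleep(200); remote = "remote result"; },      // hypothetical remote lookup
            () => { Thread.Sleep(100); database = "database result"; }); // hypothetical database lookup
        Console.WriteLine(remote + " / " + database);
    }
}
```

Putting Set() in the finally block is what addresses the "block forever" caveat: the event is signaled whether the lookup succeeds or throws.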

See Asynchronous XML Web Service Methods, How to: Create Asynchronous Web Service Methods and How to: Chain Asynchronous Calls with a Web Service Method.
But note the first paragraph of those articles:
This topic is specific to a legacy technology. XML Web services and XML Web service clients should now be created using Windows Communication Foundation (WCF).
BTW, doing things the way these articles say is important because it frees up the ASP.NET worker thread while the long-running task runs. Otherwise, you might be blocking the worker thread, preventing it from servicing further requests, and impacting scalability.

Assuming you can have a callback function for both the web request and the database lookup then something along these lines may work
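// Number of lookups still outstanding. It is decremented atomically because
// the two callbacks may run concurrently on different threads; with two plain
// bool flags, both callbacks could see the other flag set and call
// FinishMethod() twice.
private int pendingLookups = 2;
private void FinishedDBLookupCallBack()
{
    LookupFinished();
}
private void FinishedWebLookupCallBack()
{
    LookupFinished();
}
private void LookupFinished()
{
    // Only the callback that brings the count to zero calls FinishMethod().
    if (System.Threading.Interlocked.Decrement(ref pendingLookups) == 0)
    {
        FinishMethod();
    }
}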

I guess I don't have enough rep to upvote or to comment, so this is a comment on John Saunders' answer and Alan's comment on it.
You definitely want to go with John's answer if you are concerned about scalability and resource consumption.
There are two considerations here: speeding up an individual request, and making your system handle many concurrent requests efficiently. Both Alan's and John's answers achieve the former by performing the external calls in parallel.
The latter, which sounds like your main concern, is achieved by not having threads blocked anywhere, i.e. John's answer.
Don't spawn your own threads. Threads are expensive, and there are already plenty of threads in the I/O thread pool that will handle your external calls for you if you use the async methods provided by the .NET Framework.
Your service's web method needs to be async as well. Otherwise a worker thread will be blocked until your external calls are done (that's still 1-2 seconds even if they run in parallel). And you only have 12 threads per CPU handling incoming requests (if your machine.config is set according to recommendation), i.e. you would at most be able to handle 12 concurrent requests (times the number of CPUs). On the other hand, if your web method is async, the Begin method returns almost instantaneously and the thread goes back to the worker thread pool, ready to handle another incoming request. Meanwhile your external calls are waited on by the I/O completion port, and threads from the I/O thread pool handle them once they return.
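For readers on .NET 4.5 or later (an assumption; this thread predates async/await), the same "start both, block nobody" idea can be sketched with tasks. Everything named here (LookupSketch, RemoteLookupAsync, DatabaseLookupAsync) is hypothetical, and Task.Delay stands in for the real I/O:

```csharp
using System;
using System.Threading.Tasks;

public static class LookupSketch
{
    // Stand-in for the remote web service call (hypothetical).
    public static async Task<string> RemoteLookupAsync()
    {
        await Task.Delay(200);   // simulates latency without holding a thread
        return "remote";
    }

    // Stand-in for the database query (hypothetical).
    public static async Task<string> DatabaseLookupAsync()
    {
        await Task.Delay(100);
        return "database";
    }

    // Starts both operations first, then awaits both.
    // No thread is blocked while the two operations are in flight.
    public static async Task<string> DoLookupAsync()
    {
        Task<string> remote = RemoteLookupAsync();
        Task<string> database = DatabaseLookupAsync();
        string[] results = await Task.WhenAll(remote, database);
        return results[0] + "+" + results[1];
    }
}

public static class WhenAllDemo
{
    public static void Main()
    {
        // Blocking on the result is acceptable only because this is a console demo.
        Console.WriteLine(LookupSketch.DoLookupAsync().GetAwaiter().GetResult());
    }
}
```

The important detail is that both tasks are started before either one is awaited; awaiting the first call before starting the second would make them sequential again.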

Related

Call synchronous webservice methods asynchronously

I have to call a web service method about a hundred times in a loop, with different parameters every time.
The web service has only sync methods. Currently I am testing this in a console application, and it takes over ten minutes to get the data when doing it synchronously!
What I want:
Run 10 requests in parallel. When they have finished, execute the next ten calls.
This should of course be async.
The functionality will be hosted in an IIS hosted wcf service.
Overview:
Client calls wcf service with params once. The wcf service method should call another webservice a hundred times asynchronously and save the final data to Excel.
I read, that Task.Run isn't a good idea when used in web application.
So, how to call sync web service methods asynchronously in a web context?
I am using the CRM Microsoft.Xrm.Sdk, Version=7.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
var orders = _serviceProxy.RetrieveMultiple( new FetchExpression(strXmlFetch));
RetrieveMultiple gets an xml fragment (strXmlFetch) where the query is defined and executes the request.
The method resides here:
Namespace Microsoft.Xrm.Sdk.Client
Class OrganizationServiceProxy
public EntityCollection RetrieveMultiple(QueryBase query);
Under the hood, the client SDK does the following when calling RetrieveMultiple:
using (OrganizationServiceContextInitializer organizationServiceContextInitializer = new OrganizationServiceContextInitializer(this))
{
entityCollection = base.ServiceChannel.Channel.RetrieveMultiple(query);
return entityCollection;
}
I haven't implemented anything yet, as I need a starting point for the async execution of the requests.
This should of course be async.
Well, that would be ideal. However, OrganizationServiceProxy does not have asynchronous methods (neither does IOrganizationService, so using the ServiceChannel.Channel directly won't help, either).
So, the first thing to do is ask Microsoft to add asynchronous APIs to that client library.
I read, that Task.Run isn't a good idea when used in web application.
Normally that's true; you'd want to call asynchronous APIs. Which in this case aren't available.
So, how to call sync web service methods asynchronously in a web context?
This isn't possible. Your options are:
Just keep it synchronous, one at a time. Yeah, it'll take longer than it should.
Make it parallel.
The problem with parallel processing on the server side is that you're now using N threads for a single request, which can really quickly bring your web server to its knees. I do not recommend this approach in production code, especially if it's exposed over the Internet.
However, if you're sure that there won't be too many calls to your WCF service, you can implement it in parallel.
Since you need to collect results from the parallel query, I'd recommend using Parallel LINQ (AsParallel) specifying WithDegreeOfParallelism(10), something like:
IEnumerable<EntityCollection> GetAll(OrganizationServiceProxy proxy, IEnumerable<QueryBase> queries)
{
return queries.AsParallel().WithDegreeOfParallelism(10)
.Select(query => proxy.RetrieveMultiple(query));
}
The webservice has only sync methods
Generated web service proxies by default include asynchronous BeginXX/EndXX method pairs, which belong to the Asynchronous Programming Model (APM). These cannot be awaited directly with async/await, since they don't return a Task.
What i want: Run 10 requests in parallel. When they have finished, execute the next ten calls. This should of course be async.
With truly async calls no threads are held while the calls are in flight, because they rely on I/O completion ports, so you can start many more than 10 calls at once. If you do want exactly 10 async requests at a time, the scheduling logic has to be custom; there is no out-of-the-box mechanism for it. Even with the ThreadPool there is no guarantee about how many requests run in parallel, although you can set MaxDegreeOfParallelism as an upper limit for the Parallel APIs.
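If a genuinely asynchronous version of the call were available (it is not for OrganizationServiceProxy, so this is purely illustrative), one way to write that custom scheduling logic is a SemaphoreSlim used as a gate. ThrottleSketch and RunThrottledAsync are made-up names, and Task.Delay stands in for the web service call. This is a sliding window of at most 10 calls in flight rather than strict batches of ten:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs `operation` for every input with at most `maxConcurrent` calls in flight.
    // Results come back in the same order as the inputs.
    public static async Task<TResult[]> RunThrottledAsync<TInput, TResult>(
        IEnumerable<TInput> inputs,
        Func<TInput, Task<TResult>> operation,
        int maxConcurrent)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync();          // wait for a free slot without blocking a thread
                try { return await operation(input); }
                finally { gate.Release(); }      // free the slot even if the call fails
            }).ToList();                         // ToList starts every task; the gate limits progress

            return await Task.WhenAll(tasks);
        }
    }
}

public static class ThrottleDemo
{
    public static void Main()
    {
        // 100 simulated calls, at most 10 at a time.
        int[] doubled = ThrottleSketch.RunThrottledAsync(
            Enumerable.Range(1, 100),
            async n => { await Task.Delay(10); return n * 2; },
            10).GetAwaiter().GetResult();
        Console.WriteLine(doubled.Length);
    }
}
```

A sliding window usually finishes sooner than fixed batches, because a slow call only holds up its own slot instead of the whole batch.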
Overview: Client calls wcf service with params once. The wcf service method should call another webservice a hundred times asynchronously and save the final data to Excel.
You would prefer all the calls to be async, so that the UI thread is not blocked, whether the call is to the WCF service or to the web service.
I read that Task.Run isn't a good idea when used in web application.
True, since it takes a ThreadPool thread that then does nothing but wait after dispatching the I/O call to the web service.
So, how to call sync web service methods asynchronously in a web context?
This needs an async wrapper over the sync method, which is an anti-pattern, but there's no other option. It would need a ThreadPool thread, something like:
public class AsyncWrapper
{
public async Task CallWcfAsync(<Parameters>)
{
SynchronizationContext _synchronizationContext =
SynchronizationContext.Current;
try
{
SynchronizationContext.SetSynchronizationContext(null);
await Task.Run(() => CallWcfMethod(<Parameters>));
}
finally
{
SynchronizationContext.SetSynchronizationContext
(_synchronizationContext);
}
}
}
Important Points:
CallWcfAsync is the Async wrapper method which you need.
Notice that I set the synchronization context to null before execution and then restore it. This is similar in behavior to ConfigureAwait(false); otherwise, in web applications, it could lead to a deadlock when the synchronization context being waited upon is blocked.

Make Controller methods asynchronous or leave synchronous? [duplicate]

I am working with a pre-existing C# ASP.NET MVC web application and I'm adding some functionality to it that I can't decide whether or not to make async.
Right now, the application home page just processes a page and a user logs in. Nothing more, and nothing asynchronous going on at all.
I am adding functionality that will, when the homepage is visited, generate a call to a Web API that subsequently calls a database that grabs an identifier and returns it to an HTML tag on the home page. This identifier will not be visible on the screen, only on the source/HTML view (this is being added for various internal tracking purposes).
The Web API/database call is simple, just grab an identifier and return it to the controller. Regardless, I'm wondering whether the app should make this call asynchronously? The website traffic isn't immense, but I'm still wondering about concurrency, performance and future scalability.
The one catch is that I'd have to make the entire action method async, and I'm not sure what the effects of that would be. The basic pattern, currently synchronous, is below:
public ActionResult Index()
{
var result = GetID();
ViewBag.result = result.Data;
return View();
}
public JsonResult GetID()
{
var result = "";
var url = "http://APIURL/GetID";
using (WebClient client = new WebClient())
{
result = client.DownloadString(url);
}
return Json(result, JsonRequestBehavior.AllowGet);
}
Any thoughts?
First and foremost, realize the purpose of async in the context of a web application. A web server has what's called a thread pool. Generally speaking, 1 thread == 1 request, so if you have 1,000 threads in the pool (typical), your website can serve roughly 1,000 simultaneous requests. Also keep in mind that it often takes many requests to render a single resource. The HTML document itself is one request, but each image, JS file, CSS file, etc. is also a request. Then there are any AJAX requests the page may issue. In other words, it's not uncommon for a request for a single resource to generate 20+ requests to the web server.
Given that, when your server hits its max requests (all threads are being utilized), any further requests are queued and processed in order as threads are made available. What async does is buy you some additional head room. If there's threads that are in a wait-state (waiting for the results of a database query, the response from a web service, a file to be read from the filesystem, etc.), then async allows these threads to be returned to the pool, where they are then able to field some of those waiting requests. When whatever the thread was waiting on completes, a new thread is requested to finish servicing the request.
What is important to note here is that a new thread is requested to finish servicing the request. Once the thread has been released to the pool, you have to wait for a thread again, just like a brand new request. This means running async can sometimes take longer than running sync, depending on the availability of threads in the pool. Also, async carries with it a not-insignificant amount of overhead that also adds to the overall load time.
Async != faster. It can many times be slower, but it allows your web server to utilize resources more efficiently, which could mean the difference between falling down and gracefully bearing load. Because of this, there's no one universal answer to a question like "Should I just make everything async?" Async is a trade-off between raw performance and efficiency. In some situations it may not make sense to use async at all, while in others you might want to use it for everything that's applicable. What you need to do is first identify the stress points of your application. For example, if your database instance resides on the same server as your web server (not a good idea, BTW), using async on your database queries would be fairly pointless. The chief culprit of waiting is network latency, whereas filesystem access is typically relatively quick. On the other hand, if your database server is in a remote datacenter and each query has to not only travel the pipes between there and your web server but also do things like traverse firewalls, well, then your network latency is much more significant, and async is probably a very good idea.
Long and short, you need to evaluate your setup, your environment and the needs of your application. Then, and only then, can you make smart decisions about this. That said, even given the overhead of async, if there's network latency involved at all, it's a pretty safe bet async should be used. It's perfectly acceptable to err on the side of caution and just use async everywhere it's applicable, and many do just that. If you're looking to optimize for performance though (perhaps you're starting the next Facebook?), then you'd want to be much more judicious.
Here, the reason to use async IO is to not have many threads running at the same time. Threads consume OS resources and memory. The thread pool can also be a little slow to adjust to sudden load. If your thread count in a web app is below 100 and the load is not extremely spiky, you have nothing to worry about.
Generally, the slower a web service and the more often it is called the more beneficial async IO can be. You will need on average (latency * frequency) threads running. So 100ms call time and 10 calls per second is about 1 thread on average.
Run the numbers and see if you need to change anything or not.
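To make "run the numbers" concrete, here is the estimate above as a tiny calculation. ThreadEstimate and AverageBusyThreads are names invented for this sketch, and the second set of inputs is just an illustrative example, not a measurement:

```csharp
using System;

public static class ThreadEstimate
{
    // Average number of threads busy waiting = latency (seconds) * calls per second.
    public static double AverageBusyThreads(double latencySeconds, double callsPerSecond)
    {
        return latencySeconds * callsPerSecond;
    }
}

public static class ThreadEstimateDemo
{
    public static void Main()
    {
        // The example from the text: 100 ms per call, 10 calls per second.
        Console.WriteLine(ThreadEstimate.AverageBusyThreads(0.1, 10));

        // A slower, busier service (illustrative numbers): 1.5 s per call, 50 calls per second.
        Console.WriteLine(ThreadEstimate.AverageBusyThreads(1.5, 50));
    }
}
```

If the estimate stays well below the thread count you are comfortable with, synchronous calls are probably fine; if it approaches that count, async IO starts to pay off.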
Any thoughts?
Yes, lots of thoughts...but that alone doesn't count as an answer. ;)
There is no real good answer here since there isn't much context provided. But let's address what we know.
Since we are a web application, each request/response cycle has a direct impact on performance and can be a bottleneck.
Since we are internally invoking another API call from ours, we shouldn't assume that it is hosted on the same server - as such this should be treated just like all I/O bound operations.
With the two known factors above, we should make our calls async. Consider the following:
public async Task<ActionResult> Index()
{
var result = await GetIdAsync();
ViewBag.result = result.Data;
return View();
}
public async Task<JsonResult> GetIdAsync()
{
var result = "";
var url = "http://APIURL/GetID";
using (WebClient client = new WebClient())
{
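// Note the API change: DownloadStringTaskAsync returns an awaitable Task<string>.
// (DownloadStringAsync returns void and reports completion through an event.)
result = await client.DownloadStringTaskAsync(url);
}
return Json(result, JsonRequestBehavior.AllowGet);
}
Now we are correctly using the async and await keywords with our Task<T>-returning operations. This frees the request thread while the download is in flight. Notice the API change to client.DownloadStringTaskAsync too; this is very important, because await needs a method that returns a Task.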

Web API and Async/Await Benefits on a Single Core Machine

I had asked a question in a different thread about issues with GDI+ in TPL (async/await) and the discussion turned to the question of whether or not there even were any benefits to using TPL for this.
So I'm trying to understand the answer to that here.
The scenario is roughly this:
A Web API controller/method receives an image upload
A method which resizes the image and uploads it to azure is called multiple times for various sizes (<10)
The method returns a Uri for each resized-and-uploaded image
A response is returned to the Web API client
Note that this will likely run on a single core machine so there is no benefit to be gained by running all the resizes in parallel (to, say, shorten the overall length of the request).
But I'm under the impression that wrapping all the various resizes into a method and running that asynchronously will at least return the Web API thread to the pool, temporarily, to process another request (while a regular thread runs the resizing tasks), and that that is a good thing. The code would look like this:
public Dictionary<ProfilePhotoSize, Uri> ProcessImages(Stream photoStream)
{
var imgUris = new Dictionary<ProfilePhotoSize, Uri>()
{
{ ProfilePhotoSize.FiveHundredFixedWidth, ResizeAndUpload(ProfilePhotoSize.FiveHundredFixedWidth, photoStream) },
{ ProfilePhotoSize.Square220, ResizeAndUpload(ProfilePhotoSize.Square220, photoStream) },
{ ProfilePhotoSize.Square140, ResizeAndUpload(ProfilePhotoSize.Square140, photoStream) },
{ ProfilePhotoSize.Square80, ResizeAndUpload(ProfilePhotoSize.Square80, photoStream) },
{ ProfilePhotoSize.Square50, ResizeAndUpload(ProfilePhotoSize.Square50, photoStream) }
};
return imgUris;
}
and...
var photoUris = await Task.Run(() => _photoService.ProcessImages(photoStream));
So the question is - am I off base? Maybe the theory is sound, but it's not implemented quite right (perhaps I need to use ConfigureAwait)?
What's the reality here?
But I'm under the impression that wrapping all the various resizes into a method and running that asynchronously will at least return the Web API thread to the pool, temporarily, to process another request (while a regular thread runs the resizing tasks), and that that is a good thing.
No, not really. If you had true asynchronous work to do, then yes, you'd get a scalability benefit from using async and await. However, your work is CPU-bound, so code like this:
var photoUris = await Task.Run(() => _photoService.ProcessImages(photoStream));
just ends up using another thread pool thread (Task.Run) so that the request thread can return to the thread pool: one thread is swapped for another. So it's actually adding overhead and doesn't give you any scalability benefit.
On ASP.NET, if you have CPU-bound work to do, just call that method directly. Don't wrap it in Task.Run.
You could see performance and responsiveness improvements for your ASP.NET application if it is using too many threads from the thread pool, which can happen if you have a lot of long-running requests. If the request queue becomes full, the web server rejects requests with an HTTP 503 status (Server Too Busy). According to Microsoft, there are cases where the performance benefit of async code can be significant:
A web application using synchronous methods to service high latency calls where the thread pool grows to the .NET 4.5 default maximum of 5,000 threads would consume approximately 5 GB more memory than an application able the service the same requests using asynchronous methods and only 50 threads. When you’re doing asynchronous work, you’re not always using a thread. For example, when you make an asynchronous web service request, ASP.NET will not be using any threads between the async method call and the await. Using the thread pool to service requests with high latency can lead to a large memory footprint and poor utilization of the server hardware
This however, is not the case for CPU-bound operations, only network-bound or I/O-bound.
For more information, have a look at Using Asynchronous Methods in ASP.NET MVC 4, which applies to web hosted Web API applications as well.

Do asynchronous operations in ASP.NET MVC use a thread from ThreadPool on .NET 4

After this question, I feel comfortable using async operations in ASP.NET MVC, so I wrote two blog posts on that:
My Take on Task-based Asynchronous Programming in C# 5.0 and ASP.NET MVC Web Applications
Asynchronous Database Calls With Task-based Asynchronous Programming Model (TAP) in ASP.NET MVC 4
I have too many misunderstandings in my mind about asynchronous operations in ASP.NET MVC.
I always hear this sentence: Application can scale better if operations run asynchronously
And I have heard this kind of sentence a lot as well: if you have a huge volume of traffic, you may be better off not performing your queries asynchronously - consuming 2 extra threads to service one request takes resources away from other incoming requests.
I think those two sentences are inconsistent.
I do not have much information about how the thread pool works in ASP.NET, but I know that the thread pool has a limited number of threads. So, the second sentence has to be related to this issue.
And I would like to know whether asynchronous operations in ASP.NET MVC use a thread from the ThreadPool on .NET 4.
For example, when we implement an AsyncController, how is the app structured? If I get huge traffic, is it a good idea to implement AsyncController?
Is there anybody out there who can lift this black curtain from in front of my eyes and explain the deal with asynchrony in ASP.NET MVC 3 (.NET 4)?
Edit:
I have read the document below nearly a hundred times and I understand the main idea, but I am still confused because there are too many inconsistent comments out there.
Using an Asynchronous Controller in ASP.NET MVC
Edit:
Let's assume I have controller action like below (not an implementation of AsyncController though):
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Do an advanced logging here which takes a while
});
return View();
}
As you see here, I fire an operation and forget about it. Then I return immediately without waiting for it to complete.
In this case, does this have to use a thread from the thread pool? If so, what happens to that thread after it completes? Does the GC come in and clean up just after it completes?
Edit:
For @Darin's answer, here is a sample of async code which talks to the database:
public class FooController : AsyncController {
//EF 4.2 DbContext instance
MyContext _context = new MyContext();
public void IndexAsync() {
AsyncManager.OutstandingOperations.Increment(3);
Task<IEnumerable<Foo>>.Factory.StartNew(() => {
return
_context.Foos;
}).ContinueWith(t => {
AsyncManager.Parameters["foos"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
Task<IEnumerable<Bar>>.Factory.StartNew(() => {
return
_context.Bars;
}).ContinueWith(t => {
AsyncManager.Parameters["bars"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
Task<IEnumerable<FooBar>>.Factory.StartNew(() => {
return
_context.FooBars;
}).ContinueWith(t => {
AsyncManager.Parameters["foobars"] = t.Result;
AsyncManager.OutstandingOperations.Decrement();
});
}
public ViewResult IndexCompleted(
IEnumerable<Foo> foos,
IEnumerable<Bar> bars,
IEnumerable<FooBar> foobars) {
//Do the regular stuff and return
}
}
Here's an excellent article I would recommend reading to better understand asynchronous processing in ASP.NET (which is what asynchronous controllers basically represent).
Let's first consider a standard synchronous action:
public ActionResult Index()
{
// some processing
return View();
}
When a request is made to this action a thread is drawn from the thread pool and the body of this action is executed on this thread. So if the processing inside this action is slow you are blocking this thread for the entire processing, so this thread cannot be reused to process other requests. At the end of the request execution, the thread is returned to the thread pool.
Now let's take an example of the asynchronous pattern:
public void IndexAsync()
{
// perform some processing
}
public ActionResult IndexCompleted(object result)
{
return View();
}
When a request is sent to the Index action, a thread is drawn from the thread pool and the body of the IndexAsync method is executed. Once the body of this method finishes executing, the thread is returned to the thread pool. Then, using the standard AsyncManager.OutstandingOperations, once you signal the completion of the async operation, another thread is drawn from the thread pool and the body of the IndexCompleted action is executed on it and the result rendered to the client.
So what we can see in this pattern is that a single client HTTP request could be executed by two different threads.
Now the interesting part happens inside the IndexAsync method. If you have a blocking operation inside it, you are totally wasting the whole purpose of the asynchronous controllers because you are blocking the worker thread (remember that the body of this action is executed on a thread drawn from the thread pool).
So when can we take real advantage of asynchronous controllers you might ask?
IMHO we can gain most when we have I/O intensive operations (such as database and network calls to remote services). If you have a CPU intensive operation, asynchronous actions won't bring you much benefit.
So why can we gain benefit from I/O intensive operations? Because we could use I/O Completion Ports. IOCP are extremely powerful because you do not consume any threads or resources on the server during the execution of the entire operation.
How do they work?
Suppose that we want to download the contents of a remote web page using the WebClient.DownloadStringAsync method. You call this method which will register an IOCP within the operating system and return immediately. During the processing of the entire request, no threads are consumed on your server. Everything happens on the remote server. This could take lots of time but you don't care as you are not jeopardizing your worker threads. Once a response is received the IOCP is signaled, a thread is drawn from the thread pool and the callback is executed on this thread. But as you can see, during the entire process, we have not monopolized any threads.
The same holds true for methods such as FileStream.BeginRead, SqlCommand.BeginExecuteReader, and so on.
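As a small, self-contained illustration of such a Begin/End pair, the sketch below wraps FileStream.BeginRead/EndRead in a Task with TaskFactory.FromAsync, so the read can be awaited. ApmToTaskSketch and ReadStartAsync are made-up names, and a temporary file stands in for the slow resource; whether the read truly goes through an I/O completion port depends on the platform and on opening the stream with useAsync: true.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class ApmToTaskSketch
{
    // Reads up to maxBytes from the start of the file using the APM pair
    // BeginRead/EndRead, exposed as a Task so the caller can await it.
    public static async Task<string> ReadStartAsync(string path, int maxBytes)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        {
            var buffer = new byte[maxBytes];
            // The calling thread is released at the await; the continuation
            // runs on a pool thread once the read completes.
            int read = await Task<int>.Factory.FromAsync(
                stream.BeginRead, stream.EndRead, buffer, 0, maxBytes, null);
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}

public static class ApmToTaskDemo
{
    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello");
            // Blocking on the result is acceptable only because this is a console demo.
            Console.WriteLine(ApmToTaskSketch.ReadStartAsync(path, 16).GetAwaiter().GetResult());
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The same FromAsync wrapping applies to other Begin/End pairs, which is one way to combine older APM methods with Task-based code.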
What about parallelizing multiple database calls? Suppose that you had a synchronous controller action in which you performed 4 blocking database calls in sequence. It's easy to calculate that if each database call takes 200ms, your controller action will take roughly 800ms to execute.
If you don't need to run those calls sequentially, would parallelizing them improve performance?
That's the big question, which is not easy to answer. Maybe yes, maybe no. It will entirely depend on how you implement those database calls. If you use async controllers and I/O Completion Ports as discussed previously you will boost the performance of this controller action and of other actions as well, as you won't be monopolizing worker threads.
On the other hand if you implement them poorly (with a blocking database call performed on a thread from the thread pool), you will basically lower the total time of execution of this action to roughly 200ms but you would have consumed 4 worker threads so you might have degraded the performance of other requests which might become starving because of missing threads in the pool to process them.
So it is very difficult and if you don't feel ready to perform extensive tests on your application, do not implement asynchronous controllers, as chances are that you will do more damage than benefit. Implement them only if you have a reason to do so: for example you have identified that standard synchronous controller actions are a bottleneck to your application (after performing extensive load tests and measurements of course).
Now let's consider your example:
public ViewResult Index() {
Task.Factory.StartNew(() => {
//Do an advanced logging here which takes a while
});
return View();
}
When a request is received for the Index action a thread is drawn from the thread pool to execute its body, but its body only schedules a new task using TPL. So the action execution ends and the thread is returned to the thread pool. Except that, TPL uses threads from the thread pool to perform their processing. So even if the original thread was returned to the thread pool, you have drawn another thread from this pool to execute the body of the task. So you have jeopardized 2 threads from your precious pool.
Now let's consider the following:
public ViewResult Index() {
new Thread(() => {
//Do an advanced logging here which takes a while
}).Start();
return View();
}
In this case we are manually spawning a thread. The execution of the body of the Index action might take slightly longer (because spawning a new thread is more expensive than drawing one from an existing pool). But the advanced logging operation will run on a thread which is not part of the pool. So we are not jeopardizing threads from the pool, which remain free to serve other requests.
Yes - all threads come from the thread-pool. Your MVC app is already multi-threaded, when a request comes in a new thread will be taken from the pool and used to service the request. That thread will be 'locked' (from other requests) until the request is fully serviced and completed. If there is no thread available in the pool the request will have to wait until one is available.
If you have async controllers they still get a thread from the pool but while servicing the request they can give up the thread, while waiting for something to happen (and that thread can be given to another request) and when the original request needs a thread again it gets one from the pool.
The difference is that if you have a lot of long-running requests (where the thread is waiting for a response from something) you might run out of threads from the pool to service even basic requests. If you have async controllers, you don't have any more threads, but those threads that are waiting are returned to the pool and can service other requests.
A nearly real life example...
Think of it like getting on a bus. There are five people waiting to get on. The first gets on, pays and sits down (the driver serviced their request). You get on (the driver is servicing your request) but you can't find your money. As you fumble in your pockets, the driver gives up on you and lets the next two people on (servicing their requests). When you find your money, the driver starts dealing with you again (completing your request). The fifth person has to wait until you are done, but the third and fourth people got served while you were halfway through being served. Here the driver is the one and only thread in the pool and the passengers are the requests. It was too complicated to write out how it would work with two drivers, but you can imagine...
Without an async controller, the passengers behind you would have to wait ages while you looked for your money, meanwhile the bus driver would be doing no work.
So the conclusion is: if lots of people don't know where their money is (i.e. take a long time to respond to something the driver has asked), async controllers could well help the throughput of requests, speeding up the process for some. Without an async controller, everyone waits until the person in front has been completely dealt with. BUT don't forget that in MVC you have a lot of bus drivers on a single bus, so async is not an automatic choice.
There are two concepts at play here. First of all we can make our code run in parallel to execute faster or schedule code on another thread to avoid making the user wait. The example you had
public ViewResult Index() {
Task.Factory.StartNew(() => {
// Do some advanced logging here, which takes a while
});
return View();
}
belongs to the second category. The user will get a faster response, but the total workload on the server is higher because it has to do the same work plus handle the threading.
Another example of this would be:
public ViewResult Index() {
Task.Factory.StartNew(() => {
// Make a web request to Twitter with WebClient.DownloadString() (blocks this task's thread)
});
Task.Factory.StartNew(() => {
// Make a web request to Facebook with WebClient.DownloadString() (blocks this task's thread)
});
//wait for both to be ready and merge the results
return View();
}
Because the requests run in parallel, the user won't have to wait as long as if they were done in series. But you should realize that we use more resources here than if we ran them in series, because the code runs on several threads while one more thread waits for them.
This is perfectly fine in a client scenario. It is quite common there to wrap synchronous, long-running code in a new task (running it on another thread) to keep the UI responsive, or to parallelize it to make it faster. A thread is still used for the whole duration, though. On a server under high load this could backfire because you actually use more resources. This is what people have warned you about.
Async controllers in MVC have another goal, though. The point here is to avoid having threads sitting around doing nothing (which can hurt scalability). It really only matters if the APIs you are calling have async methods, like WebClient.DownloadStringAsync().
The point is that your thread can be returned to the pool to handle new requests until the web request is finished, at which point it calls your callback, which gets the same or a new thread and finishes the request.
I hope you understand the difference between asynchronous and parallel. Think of parallel code as code where your thread sits around and waits for the result, while asynchronous code is code where you are notified when the operation is done and can get back to working on it; in the meantime the thread can do other work.
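The asynchronous style described above can be sketched with async/await, assuming C# 5 / .NET 4.5 or later, which is newer than the callback style (DownloadStringAsync) mentioned in this answer. The two lookup methods are hypothetical stand-ins that use Task.Delay to simulate slow I/O, so no thread is held while they are pending; Task.WhenAll resumes the caller only once both have completed.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical names throughout; Task.Delay stands in for real I/O.
public static class CombinedLookup
{
    // Simulated remote call: completes after delayMs without blocking a thread.
    public static async Task<string> RemoteLookupAsync(int delayMs)
    {
        await Task.Delay(delayMs);
        return "remote";
    }

    // Simulated database call: completes after delayMs without blocking a thread.
    public static async Task<string> DatabaseLookupAsync(int delayMs)
    {
        await Task.Delay(delayMs);
        return "db";
    }

    // Starts both operations first, then waits asynchronously for both.
    public static async Task<string> DoLookupAsync(int delayMs)
    {
        Task<string> remote = RemoteLookupAsync(delayMs);
        Task<string> database = DatabaseLookupAsync(delayMs);

        // Results come back in the same order as the tasks were passed in.
        string[] results = await Task.WhenAll(remote, database);
        return results[0] + "+" + results[1];
    }
}
```

In a controller or service method you would await DoLookupAsync(...) rather than block on .Result, since blocking would tie up the request thread again.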
Applications can scale better if operations run asynchronously, but only if there are resources available to service the additional operations.
Asynchronous operations ensure that you're never blocking an action because an existing one is in progress. ASP.NET has an asynchronous model that allows multiple requests to execute side by side. It would be possible to queue the requests up and process them FIFO, but this would not scale well when you have hundreds of requests queued up and each request takes 100ms to process.
If you have a huge volume of traffic, you may be better off not performing your queries asynchronously, as there may be no additional resources to service the requests. If there are no spare resources, your requests are forced to queue up, take exponentially longer or outright fail, in which case the asynchronous overhead (mutexes and context-switching operations) isn't giving you anything.
As far as ASP.NET goes, you don't have a choice - it uses an asynchronous model, because that's what makes sense for the server-client model. If you were to write your own code internally that uses an async pattern to attempt to scale better, then unless you're trying to manage a resource that's shared between all requests, you won't actually see any improvement, because the requests are already wrapped in an asynchronous process that doesn't block anything else.
Ultimately, it's all subjective until you actually look at what's causing a bottleneck in your system. Sometimes it's obvious where an asynchronous pattern will help (by preventing a queued resource from blocking). Only measuring and analysing a system can indicate where you can gain efficiencies.
Edit:
In your example, the Task.Factory.StartNew call will queue up an operation on the .NET thread-pool. The nature of Thread Pool threads is to be re-used (to avoid the cost of creating/destroying lots of threads). Once the operation completes, the thread is released back to the pool to be re-used by another request (the Garbage Collector doesn't actually get involved unless you created some objects in your operations, in which case they're collected as per normal scoping).
As far as ASP.NET is concerned, there is no special operation here. The ASP.NET request completes without respect to the asynchronous task. The only concern might be if your thread pool is saturated (i.e. there are no threads available to service the request right now and the pool's settings don't allow more threads to be created), in which case the request is blocked waiting to start the task until a pool thread becomes available.
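One rough way to see whether the pool is close to saturation is to compare its available and maximum worker threads. The sketch below uses ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads; the PoolHeadroom class and its method names are assumptions for illustration, and the numbers are only a snapshot at the moment of the call.

```csharp
using System;
using System.Threading;

// Hypothetical helper for inspecting thread pool headroom.
public static class PoolHeadroom
{
    // Upper limit on worker threads the pool may create.
    public static int MaxWorkers()
    {
        int workers, ioThreads;
        ThreadPool.GetMaxThreads(out workers, out ioThreads);
        return workers;
    }

    // Worker threads that could still be handed out right now.
    public static int AvailableWorkers()
    {
        int workers, ioThreads;
        ThreadPool.GetAvailableThreads(out workers, out ioThreads);
        return workers;
    }

    // Worker threads currently in use (maximum minus available).
    public static int BusyWorkers()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }
}
```

Logging BusyWorkers() under load, alongside the request queue counters in perfmon, gives a first indication of whether background tasks are competing with requests for pool threads.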
Yes, they use a thread from the thread pool. There is actually a pretty excellent guide from MSDN that will tackle all of your questions and more. I have found it to be quite useful in the past. Check it out!
http://msdn.microsoft.com/en-us/library/ee728598.aspx
Meanwhile, the comments and suggestions that you hear about asynchronous code should be taken with a grain of salt. For starters, just making something async doesn't necessarily make it scale better, and in some cases it can make your application scale worse. The other comment you posted about "a huge volume of traffic..." is also only correct in certain contexts. It really depends on what your operations are doing, and how they interact with other parts of the system.
In short, lots of people have lots of opinions about async, but they may not be correct out of context. I'd say focus on your exact problems, and do basic performance testing to see how async controllers, etc. actually behave in your application.
First, it is not MVC but IIS that maintains the thread pool. So any request that comes to an MVC or ASP.NET application is served by threads maintained in that thread pool. Only when the app is made asynchronous is the action invoked on a different thread, with the request thread released immediately so that other requests can be taken.
I have explained the same in a detailed video (http://www.youtube.com/watch?v=wvg13n5V0V0/ "MVC Asynch controllers and thread starvation"), which shows how thread starvation happens in MVC and how it is minimized by using MVC async controllers. I have also measured the request queues using perfmon, so you can see how the request queues shrink with MVC async and how much worse they are for synchronous operations.

ASP.NET Threading: should I use the pool for DB and Emails actions?

I’m looking for the best way of using threads considering scalability and performance.
In my site I have two scenarios that need threading:
UI trigger: for example, the user clicks a button and the server should read data from the DB and send some emails. Those actions take time and I don’t want the user's request to be delayed. This scenario happens very frequently.
Background service: when the app starts, it triggers a thread that runs every 10 minutes, reads from the DB and sends emails.
The solutions I found:
A. Use thread pool - BeginInvoke:
This is what I use today for both scenarios.
It works fine, but it uses the same threads that serve the pages, so I think I may run into scalability issues, can this become a problem?
B. No use of the pool – ThreadStart:
I know starting a new thread takes more resources than using a thread pool.
Can this approach work better for my scenarios?
What is the best way to reuse the opened threads?
C. Custom thread pool:
Because my scenarios occur frequently, maybe the best way is to start a new thread pool?
Thanks.
I would personally put this into a different service. Make your UI action write to the database, and have a separate service which either polls the database or reacts to a trigger, and sends the emails at that point.
By separating it into a different service, you don't need to worry about AppDomain recycling etc - and you can put it on an entirely different server if and when you want to. I think it'll give you a more flexible solution.
I do this kind of thing by calling a webservice, which then calls a method using a delegate asynchronously. The original webservice call returns a Guid to allow tracking of the processing.
For the first scenario, use ASP.NET Asynchronous Pages. Async Pages are a very good choice when it comes to scalability, because during async execution the HTTP request thread is released and can be re-used.
I agree with Jon Skeet that for the second scenario you should use a separate service - a Windows service is a good choice here.
Out of your three solutions, don't use BeginInvoke. As you said, it will have a negative impact on scalability.
Between the other two, if the tasks are truly background and the user isn't waiting for a response, then a single, permanent thread should do the job. A thread pool makes more sense when you have multiple tasks that should be executing in parallel.
However, keep in mind that web servers sometimes crash, AppPools recycle, etc. So if any of the queued work needs to be reliably executed, then moving it out of process is probably a better idea (such as into a Windows Service). One way of doing that, which preserves the order of requests and maintains persistence, is to use Service Broker. You write the request to a Service Broker queue from your web tier (with an async request), and then read those messages from a service running on the same machine or a different one. You can also scale nicely that way by simply adding more instances of the service (or more threads in it).
In case it helps, I walk through using both a background thread and Service Broker in detail in my book, including code examples: Ultra-Fast ASP.NET.
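The "single, permanent thread" option mentioned above could look roughly like the sketch below. This is an in-process illustration only, not the Service Broker approach and not the code from the book; BackgroundWorkQueue is a hypothetical name, and it assumes .NET 4 or later for BlockingCollection<T>. Work queued this way is lost if the AppPool recycles, which is the reliability concern raised above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical in-process queue drained by one dedicated (non-pool) thread.
public sealed class BackgroundWorkQueue
{
    private readonly BlockingCollection<Action> _queue =
        new BlockingCollection<Action>();
    private readonly Thread _worker;

    public BackgroundWorkQueue()
    {
        // A dedicated thread, so queued work never occupies a pool thread.
        _worker = new Thread(ProcessQueue) { IsBackground = true };
        _worker.Start();
    }

    // Called from request threads; returns as soon as the item is queued.
    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    // Stops accepting new work and waits for already queued items to finish.
    public void Shutdown()
    {
        _queue.CompleteAdding();
        _worker.Join();
    }

    private void ProcessQueue()
    {
        // Blocks while the queue is empty; ends after CompleteAdding and drain.
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // A failing item must not kill the only worker thread.
                Console.Error.WriteLine(ex);
            }
        }
    }
}
```

A request handler would call Enqueue(() => SendEmails(...)) (SendEmails being whatever your own code does) and return immediately; items are processed one at a time in the order they were added.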
