Does asynchronous model really give benefits in throughput against properly configured synchronous?

Does asynchronous model really give benefits in throughput against properly configured synchronous? - c#

Everybody knows that asynchrony gives you "better throughput", "scalability", and more efficient in terms of resources consumption. I also thought this (simplistic) way before doing an experiment below. It basically tells that if we take into account all the overhead for asynchronous code and compare it against properly configured synchronous code it yields little to no performance/throughput/resource consumption advantages.
The question: Does asynchronous code actually perform so much better comparing to the synchronous code with correctly configured thread pool? May be my performance tests are flawed in some dramatic way?
Test setup: Two ASP.NET Web API methods with JMeter trying to call them with 200 threads thread group (30 seconds rump up time).
[HttpGet]
[Route("async")]
public async Task<string> AsyncTest()
{
await Task.Delay(_delayMs);
return "ok";
}
[HttpGet]
[Route("sync")]
public string SyncTest()
{
Thread.Sleep(_delayMs);
return "ok";
}
Here is response time (log scale). Notice how synchronous code becomes faster when Thread Pool injected enough threads. If we were to set up Thread Pool beforehand (via SetMinThreads) it would outperform async right from the start.
What about resources consumption you would ask. "Thread has big cost in terms of CPU time scheduling, context switching and RAM footprint". Not so fast. Threads scheduling and context switching is efficient. As far as the stack usage goes thread does not instantly consume the RAM, but rather just reserve virtual address space and commit only a tiny fraction which is actually needed.
Let's look at what the data says. Even with bigger amount threads sync version has smaller memory footprint (working set which maps into the physical memory).
UPDATE. I want to post the results of follow-up experiment which should be more representational since avoids some biases of the first one.
First of all, the results of the first experiment are taken using IIS Express, which is basically dev time server, so I needed to move away from that. Also, considering the feedback I've isolated load generation machine from the server (two Azure VMs in the same network). I've also discovered that some IIS threading limits are from hard to impossible to violate and ended up switching to ASP.NET WebAPI self-hosting to eliminate IIS from the variables as well. Note that memory footprints/CPU times are radically different with this test, please do not compare numbers across the different test runs as setups are totally different (hosting, hardware, machines setup). Additionally, when I moved to another machines and another hosting solution the Thread Pool strategy changed (it is dynamic) and injection rate increased.
Settings: Delay 100ms, 200 JMeter "users", 30 sec ramp-up time.
I want to conclude these experiments with the following: Yes, under some specific
(more laboratory like) circumstances it's possible to get comparable results for sync vs. async, but in real world cases where workload can not be 100% predictable and workload is uneven we inevitably will hit some kind of threading limits: either server side limits, or Thread Pool grow limits (and bear in mind that thread pool management is automatic mechanism with not always easily predictable properties). Additionally, sync version does have a bigger memory footprint (both working set, and way bigger virtual memory size). As far as CPU consumption is concerned async also wins (CPU time per request metric).
On IIS with default settings the situation is even more dramatic: synchronous version is order(s) of magnitude slower (and smaller throughput) due to quite tight limit on threads count - 20 per CPU.
PS. Do use asynchronous pipelines for IO! [... sigh of relief...]

Everybody knows that asynchrony gives you "better throughput", "scalability", and more efficient in terms of resources consumption.
Scalability, yes. Throughput: it depends. Each asynchronous request is slower than the equivalent synchronous request, so you would only see a throughput benefit when scalability comes into play (i.e., there are more requests than threads available).
Does asynchronous code actually perform so much better comparing to the synchronous code with correctly configured thread pool?
Well, the catch there is "correctly configured thread pool". What you're assuming is that you can 1) predict your load, and 2) have a server big enough to handle it using one thread per request. For many (most?) real-world production scenarios, either or both of these are not true.
From my article on async ASP.NET:
Why not just increase the size of the thread pool [instead of using async]? The answer is twofold: Asynchronous code scales both further and faster than blocking thread pool threads.
First, asynchronous code scales further than synchronous code. With more realistic example code, the total scalability of ASP.NET servers (stress tested) showed a multiplicative increase. In other words, an asynchronous server could handle several times the number of continuous requests as a synchronous server (with both thread pools turned up to the maximum for that hardware). However, these experiments (not done by me) were done on a expected "realistic baseline" for average ASP.NET apps. I don't how the same results would carry over to a noop string return.
Second, asynchronous code scales faster than synchronous code. This one is pretty obvious; synchronous code scales fine up to the number of thread pool threads, but then can't scale faster than the thread injection rate. So you get that really slow response to a sudden heavy load, as shown in the beginning of your response time graph.
I think the work you've done is interesting; I am particularly surprised at the memory usage differences (or rather, lack of difference). I'd love to see you work this into a blog post. Recommendations:
Use ASP.NET Core for your tests. The old ASP.NET had only a partially-asynchronous pipeline; ASP.NET Core would be necessary for a more "pure" comparison of sync vs async.
Don't test locally; there are a lot of caveats when doing that. I'd recommend choosing a VM size (or single-instance Docker container or whatever) and testing in the cloud for repeatability.
Also try stress testing in addition to load testing. Keep increasing load until the server is totally overwhelmed, and see how both the async and sync servers respond.
As a final reminder (also from my article):
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.

Trully asynchronous code (I/O) is more scalable because it releases thread pool threads for other work instead of blocking them. So, for the same number of threads being, it can handle more requests.
But it does that at the cost of more control data structures and more work. So, (other than saving thread pool threads) it consumes more resources (memory, CPU).
It's all about availability, not performance.

Related

Conditions to use async-methods in c# .net-core web-apis

I'm implementing several small services, each of which uses entity-framework to store certain (but little) data. They also have a fair bit of business-logic so it makes sense to separate them from one another.
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
I'm uncertain wether to use the async-methods of entity-framework logic (e.g. SaveChangesAsync or FirstOrDefaultAsync) because I can't find metrics that say "now you do it, and now you don't" besides from "Is it I/O or CPU-Intensive or not?".
What I've found when researching this topic (not limited to this but these are showing the problem):
not using it can lead to your application stopping to respond because the threads (not the ones of the cpu, but virtual threads of the os) can run out because of the in that case blocking i/o calls to the database.
using it bloats your code and decreases performance because of the context-switches at every method. Especially when I apply those to entity-framework calls it means that I have at least three context switches for one call from controller to business-logic to the repository to the database.
What I don't know, and that's what I would like to know from you:
How many virtual os threads are there? Or to be more precise: If I expect my application and server to be able to handle 100 requests to this service within five seconds (and I don't expect them to be more, 100 is already exagerated), should I back away from using async/await there?
What are the precise metrics that I could look at to answer this question for any of my services?
Or should I rather always use async-methods for I/O calls because they are already there and it could always happen that the load-situation on my server changes and there's so much going on that the async-methods would help me a great deal with that?

I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
Sort of. The primary benefit of asynchronous code is that it frees up threads. UI apps (i.e., desktop/mobile) manifest this benefit in more responsive user interfaces. Services such as the ones you're writing manifest this benefit in better scalability - the performance benefits are only visible when under load. Also, services only receive this benefit from I/O operations; CPU-bound operations require a thread no matter what, so using await Task.Run on service applications doesn't help at all.
not using it can lead to your application stopping to respond because the threads (not the ones of the cpu, but virtual threads of the os) can run out because of the in that case blocking i/o calls to the database.
Yes. More specifically, the thread pool has a limited injection rate, so it can only grow so far so quickly. Asynchrony (freeing up threads) helps your service handle bursty traffic and heavy load. Quote:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Next question:
using it bloats your code and decreases performance because of the context-switches at every method.
The main performance drawback to async is usually memory related. There's additional structures that need to be allocated to keep track of ongoing asynchronous work. In the synchronous world, the thread stack itself has this information.
What I don't know, and that's what I would like to know from you: [when should I use async?]
Generally speaking, you should use async for any new code doing I/O-based operations (including all EF operations). The metrics-based arguments are more about cost/benefit analysis of converting to async - i.e., given an existing old synchronous codebase, at what point is it worth investing the time to convert it to async.

TLDR: Should I use async? YES!
You seem to have fallen for the most common mistake when trying to understand async/await. Async is orthogonal to multi-threading.
To answer your question, when should you the async method?
If currentContext.IsAsync && method.HasAsyncVersion
return UseAsync.Yes;
Else
return UseAsync.No;
That above is the short version.
Async/Await actually solves a few problems
Unblock UI thread
M:N threading
Multithreaded scheduling and synchronization
Interupt/Event based asynchronous scheduling
Given the large number of different use cases for async/await, the "assumptions" you state only apply to certain cases.
For example, context switching, only happens with Multi-Threading. Single-Threaded Interupt based Async actually reduces context switching by reducing blocking times and keeping the OS thread well fed with work.
Finally, your question on OS threads, is fundimentally wrong.
Firstly, OS threads each require creation of a stack (4MB of continous RAM, 100 threads means 400MB of RAM before any work is even done).
Secondly, unless you have 100 physical cores on your PC, your CPUs will have to context switch between each OS thread, resulting in the CPU stalling, whilst it loads that thread. By using M:N threading, you can keep the CPU running, by reducing the number of OS threads and instead using Green Threads (Task in dotnet).
Thirdly, not all "await" results in "async" behavior. Tasks are able to synchronously return, short-circuiting all of the "bloat".
In short, without digging really deep, it is hard to find optimization opportunities by switching from async to sync methods.

ThreadPool and methods with while(true) loops?

ThreadPool utilizes recycling of threads for optimal performance over using multiple of the Thread class. However, how does this apply to processing methods with while loops inside of the ThreadPool?
As an example, if we were to apply a thread in the ThreadPool to a client that has connected to a TCP server, that client would need a while loop to keep checking for incoming data. The loop can be exited to disconnect the client, but only if the server closes or if the client demands a disconnection.
If that is the case, then how would having a ThreadPool help when masses of clients connect? Either way the same amount of memory is used if the clients stay connected. If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
On the other hand it was suggested to me to use the Network.BeginReceive and NetworkStream.EndReceive asynchronous methods to avoid threads all together to save RAM usage and CPU usage. Is this true or not?

Either way the same amount of memory is used if the clients stay
connected.
So far this is true. It's up to your app to decide how much state it needs to keep per client.
If they stay connected, then the threads cannot be recycled. If so,
then ThreadPool would not help much until a client disconnects and
opens up a thread to recycle.
This is untrue, because it assumes that all interesting operations performed by these threads are synchronous. This is a naive mode of operation, and in fact real world code is asynchronous: a thread makes a call to request an action and is then free to do other things. When a result is made available as a result of that action, some thread looking for other things to do will run the code that acts on the result.
On the other hand it was suggested to me to use the
Network.BeginReceive and NetworkStream.EndReceive asynchronous methods
to avoid threads all together to save RAM usage and CPU usage. Is this
true or not?
As explained above, async methods like these will allow you to service a potentially very large number of clients with only a small number of worker threads -- but by itself it will do nothing to either help or hurt the memory situation.

You are correct. Slow blocking codes can cause poor performances both on the client-side as well as server-side. You can run slow work on a separate thread and that might work well enough on the client-side but may not help on the server-side. Having blocking methods in the server can diminish the overall performance of the server because it can lead to a situation where your server has a large no of threads running and all blocked. So, even simple request might end up taking a long time. It is better to use asynchronous APIs if they are available for slow running tasks just like the situation you are in. (Note: even if the asynchronous operations are not available, you can implement one by implementing a custom awaiter class) This is better for the clients as well as servers. The main point of asynchronous code is to reduce the no of threads. Because servers can have larger no of requests in progress simultaneously because reducing no of threads to handle a particular no of clients can improve scalability.
If you dont need to have more control over the threads or the thread-pool you can go with asynchronous approach.
Also, each thread takes 1 MB space on the heap. So, asynchronous methods will definitely help reduce memory usage. However, I think the nature of the work you have described here is going to take pretty much the same amount of time in multi-threaded as well as asynchronous approach.

Launching multiple tasks from a WCF service

I need to optimize a WCF service... it's quite a complex thing. My problem this time has to do with tasks (Task Parallel Library, .NET 4.0). What happens is that I launch several tasks when the service is invoked (using Task.Factory.StartNew) and then wait for them to finish:
Task.WaitAll(task1, task2, task3, task4, task5, task6);
Ok... what I see, and don't like, is that on the first call (sometimes the first 2-3 calls, if made quickly one after another), the final task starts much later than the others (I am looking at a case where it started 0.5 seconds after the others). I tried calling
ThreadPool.SetMinThreads(12*Environment.ProcessorCount, 20);
at the beginning of my service, but it doesn't seem to help.
The tasks are all database-related: I'm reading from multiple databases and it has to take as little time as possible.
Any idea why the last task is taking so long? Is there something I can do about it?
Alternatively, should I use the thread pool directly? As it happens, in one case I'm looking at, one task had already ended before the last one started - I would had saved 0.2 seconds if I had reused that thread instead of waiting for a new one to be created. However, I can not be sure that that task will always end so quickly, so I can't put both requests in the same task.
[Edit] The OS is Windows Server 2003, so there should be no connection limit. Also, it is hosted in IIS - I don't know if I should create regular threads or using the thread pool - which is the preferred version?
[Edit] I've also tried using Task.Factory.StartNew(action, TaskCreationOptions.LongRunning); - it doesn't help, the last task still starts much later (around half a second later) than the rest.
[Edit] MSDN1 says:
The thread pool has a built-in delay
(half a second in the .NET Framework
version 2.0) before starting new idle
threads. If your application
periodically starts many tasks in a
short time, a small increase in the
number of idle threads can produce a
significant increase in throughput.
Setting the number of idle threads too
high consumes system resources
needlessly.
However, as I said, I'm already calling SetMinThreads and it doesn't help.

I have had problems myself with delays in thread startup when using the (.Net 4.0) Task-object. So for time-critical stuff I now use dedicated threads (... again, as that is what I was doing before .Net 4.0.)
The purpose of a thread pool is to avoid the operative system cost of starting and stopping threads. The threads are simply being reused. This is a common model found in for example internet servers. The advantage is that they can respond quicker.
I've written many applications where I implement my own threadpool by having dedicated threads picking up tasks from a task queue. Note however that this most often required locking that can cause delays/bottlenecks. This depends on your design; are the tasks small then there would be a lot of locking and it might be faster to trade some CPU in for less locking: http://www.boyet.com/Articles/LockfreeStack.html
SmartThreadPool is a replacement/extension of the .Net thread pool. As you can see in this link it has a nice GUI to do some testing: http://www.codeproject.com/KB/threads/smartthreadpool.aspx
In the end it depends on what you need, but for high performance I recommend implementing your own thread pool. If you experience a lot of thread idling then it could be beneficial to increase the number of threads (beyond the recommended cpucount*2). This is actually how HyperThreading works inside the CPU - using "idle" time while doing operations to do other operations.
Note that .Net has a built-in limit of 25 threads per process (ie. for all WCF-calls you receive simultaneously). This limit is independent and overrides the ThreadPool setting. It can be increased, but it requires some magic: http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=201

Following from my prior question (yep, should have been a Q against original message - apologies):
Why do you feel that creating 12 threads for each processor core in your machine will in some way speed-up your server's ability to create worker threads? All you're doing is slowing your server down!
As per MSDN do
As per the MSDN docs: "You can use the SetMinThreads method to increase the minimum number of threads. However, unnecessarily increasing these values can cause performance problems. If too many tasks start at the same time, all of them might appear to be slow. In most cases, the thread pool will perform better with its own algorith for allocating threads. Reducing the minimum to less than the number of processors can also hurt performance.".
Issues like this are usually caused by bumping into limits or contention on a shared resource.
In your case, I am guessing that your last task(s) is/are blocking while they wait for a connection to the DB server to come available or for the DB to respond. Remember - if your invocation kicks off 5-6 other tasks then your machine is going to have to create and open numerous DB connections and is going to kick the DB with, potentially, a lot of work. If your WCF server and/or your DB server are cold, then your first few invocations are going to be slower until the machine's caches etc., are populated.
Have you tried adding a little tracing/logging using the stopwatch to time how long it takes for your tasks to connect to the DB server and then execute their operations?
You may find that reducing the number of concurrent tasks you kick off actually speeds things up. Try spawning 3 tasks at a time, waiting for them to complete and then spawn the next 3.

When you call Task.Factory.StartNew, it uses a TaskScheduler to map those tasks into actual work items.
In your case, it sounds like one of your Tasks is delaying occasionally while the OS spins up a new Thread for the work item. You could, potentially, build a custom TaskScheduler which already contained six threads in a wait state, and explicitly used them for these six tasks. This would allow you to have complete control over how those initial tasks were created and started.
That being said, I suspect there is something else at play here... You mentioned that using TaskCreationOptions.LongRunning demonstrates the same behavior. This suggests that there is some other factor at play causing this half second delay. The reason I suspect this is due to the nature of TaskCreationOptions.LongRunning - when using the default TaskScheduler (LongRunning is a hint used by the TaskScheduler class), starting a task with TaskCreationOptions.LongRunning actually creates an entirely new (non-ThreadPool) thread for that Task. If creating 6 tasks, all with TaskCreationOptions.LongRunning, demonstrates the same behavior, you've pretty much guaranteed that the problem is NOT the default TaskScheduler, since this is going to always spin up 6 threads manually.
I'd recommend running your code through a performance profiler, and potentially the Concurrency Visualizer in VS 2010. This should help you determine exactly what is causing the half second delay.

What is the OS? If you are not running the server versions of windows, there is a connection limit. Your many threads are probably being serialized because of the connection limit.
Also, I have not used the task parallel library yet, but my limited experience is that new threads are cheap to make in the context of networking.

These articles might explain the problem you're having:
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-are-wcf-responses-slow-and-setminthreads-does-not-work.aspx
http://blogs.msdn.com/b/wenlong/archive/2010/02/11/why-does-wcf-become-slow-after-being-idle-for-15-seconds.aspx
seeing as you're using .Net 4, the first article probably doesn't apply, but as the second article points out the ThreadPool terminates idle threads after 15 seconds which might explain the problem you're having and offers a simple (though a little hacky) solution to get around it.
Whether or not you should be using the ThreadPool directly wouldn't make any difference as I suspect the task library is using it for you underneath anyway.
One third-party library we have been using for a while might help you here - Smart Thread Pool. You still get the same benefits of using the task libraries, in that you can have the return values from the threads and get any exception information from them too.
Also, you can instantiate threadpools so that when you have multiple places each needing a threadpool (so that a low priority process doesn't start eating into the quota of some high priority process) and oh yeah you can set the priority of the threads in the pool too which you can't do with the standard ThreadPool where all the threads are background threads.
You can find plenty of info on the codeplex page, I've also got a post which highlights some of the key differences:
http://theburningmonk.com/2010/03/threading-introducing-smartthreadpool/
Just on a side note, for tasks like the one you've mentioned, which might take some time to return, you probably shouldn't be using the threadpool anyway. It's recommended that we should avoid using the threadpool for any blocking tasks like that because it hogs up the threadpool which is used by all sorts of things by the framework classes, like handling timer events, etc. etc. (not to mention handling incoming WCF requests!). I feel like I'm spamming here but here's some of the info I've gathered around the use of the threadpool and some useful links at the bottom:
http://theburningmonk.com/2010/03/threading-using-the-threadpool-vs-creating-your-own-threads/
well, hope this helps!

What resources do blocked threads take-up

One of the main purposes of writing code in the asynchronous programming model (more specifically - using callbacks instead of blocking the thread) is to minimize the number of blocking threads in the system.
For running threads , this goal is obvious, because of context switches and synchronization costs.
But what about blocked threads? why is it so important to reduce their number?
For example, when waiting for a response from a web server a thread is blocked and doesn't take-up any CPU time and does not participate in any context switch.
So my question is:
other than RAM (about 1MB per thread ?) What other resources do blocked threads take-up?
And another more subjective question:
In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method to lots of beginXXX and EndXXX methods, and moving parameters and local variables to be class fields).
UPDATE - additional reasons I didn't mention or didn't give enough weight to:
More threads means more locking on communal resources
More threads means more creation and disposing of threads which is expensive
The system can definitely run-out of threads/RAM and then stop servicing clients (in a web server scenario this can actually bring down the service)

So my question is: other than RAM (about 1MB per thread ?) What other resources do blocked threads take-up?
This is one of the largest ones. That being said, there's a reason that the ThreadPool in .NET allows so many threads per core - in 3.5 the default was 250 worker threads per core in the system. (In .NET 4, it depends on system information such as virtual address size, platform, etc. - there isn't a fixed default now.) Threads, especially blocked threads, really aren't that expensive...
However, I would say, from a code management standpoint, it's worth reducing the number of blocked threads. Every blocked thread is an operation that should, at some point, return and become unblocked. Having many of these means you have quite a complicated set of code to manage. Keeping this number reduced will help keep the code base simpler - and more maintainable.
And another more subjective question: In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method to lots of beginXXX and EndXXX methods, and moving parameters and local variables to be class fields).
Right now, it's often a pain. It depends a lot on the scenario. The Task<T> class in .NET 4 dratically improves this for many scenarios, however. Using the TPL, it's much less painful than it was previously using the APM (BeginXXX/EndXXX) or even the EAP.
This is why the language designers are putting so much effort into improving this situation in the future. Their goals are to make async code much simpler to write, in order to allow it to be used more frequently.

Besides from any resources the blocked thread might hold a lock on, thread pool size is also of consideration. If you have reached the maximum thread pool size (if I recall correctly for .NET 4 is max thread count is 100 per CPU) you simply won't be able to get anything else to run on the thread pool until at least one thread gets freed up.

I would like to point out that the 1MB figure for stack memory (or 256KB, or whatever it's set to) is a reserve; while it does take away from available address space, the actual memory is only committed as it's needed.
On the other hand, having a very large number of threads is bound to bog down the task scheduler somewhat as it has to keep track of them (which have become runnable since the last tick, and so on).

multi threading a web application

I know there are many cases which are good cases to use multi-thread in an application, but when is it the best to multi-thread a .net web application?

A web application is almost certainly already multi threaded by the hosting environment (IIS etc). If your page is CPU-bound (and want to use multiple cores), then arguably multiple threads is a bad idea, as when your system is under load you are already using them.
The time it might help is when you are IO bound; for example, you have a web-page that needs to talk to 3 external web-services, talk to a database, and write a file (all unrelated). You can do those in parallel on different threads (ideally using the inbuilt async operations, to maximise completion-port usage) to reduce the overall processing time - all without impacting local CPU overly much (here the real delay is on the network).
Of course, in such cases you might also do better by simply queuing the work in the web application, and having a separate service dequeue and process them - but then you can't provide an immediate response to the caller (they'd need to check back later to verify completion etc).

IMHO you should avoid the use of multithread in a web based application.
maybe a multithreaded application could increase the performance in a standard app (with the right design), but in a web application you may want to keep a high throughput instead of speed.
but if you have a few concurrent connection maybe you can use parallel thread without a global performance degradation

Multithreading is a technique to provide a single process with more processing time to allow it to run faster. It has more threads thus it eats more CPU cycles. (From multiple CPU's, if you have any.) For a desktop application, this makes a lot of sense. But granting more CPU cycles to a web user would take away the same cycles from the 99 other users who are doing requests at the same time! So technically, it's a bad thing.
However, a web application might use other services and processes that are using multiple threads. Databases, for example, won't create a separate thread for every user that connects to them. They limit the number of threads to just a few, adding connections to a connection pool for faster usage. As long as there are connections available or pooled, the user will have database access. When the database runs out of connections, the user will have to wait.
So, basically, the use of multiple threads could be used for web applications to reduce the number of active users at a specific moment! It allows the system to share resources with multiple users without overloading the resource. Instead, users will just have to stand in line before it's their turn.
This would not be multi-threading in the web application itself, but multi-threading in a service that is consumed by the web application. In this case, it's used as a limitation by only allowing a small amount of threads to be active.

In order to benefit from multithreading your application has to do a significant amount of work that can be run in parallel. If this is not the case, the overhead of multithreading may very well top the benefits.
In my experience most web applications consist of a number of short running methods, so apart from the parallelism already offered by the hosting environment, I would say that it is rare to benefit from multithreading within the individual parts of a web application. There are probably examples where it will offer a benefit, but my guess is that it isn't very common.

ASP.NET is already capable of spawning several threads for processing several requests in parallel, so for simple request processing there is rarely a case when you would need to manually spawn another thread. However, there are a few uncommon scenarios that I have come across which warranted the creation of another thread:
If there is some operation that might take a while and can run in parallel with the rest of the page processing, you might spawn a secondary thread there. For example, if there was a webservice that you had to poll as a result of the request, you might spawn another thread in Page_Init, and check for results in Page_PreRender (waiting if necessary). Though it's still a question if this would be a performance benefit or not - spawning a thread isn't cheap and the time between a typical Page_Init and Page_Prerender is measured in milliseconds anyway. Keeping a thread pool for this might be a little bit more efficient, and ASP.NET also has something called "asynchronous pages" that might be even better suited for this need.
If there is a pool of resources that you wish to clean up periodically. For example, imagine that you are using some weird DBMS that comes with limited .NET bindings, but there is no pooling support (this was my case). In that case you might want to implement the DB connection pool yourself, and this would necessitate a "cleaner thread" which would wake up, say, once a minute and check if there are connections that have not been used for a long while (and thus can be closed off).
Another thing to keep in mind when implementing your own threads in ASP.NET - ASP.NET likes to kill off its processes if they have been inactive for a while. Thus you should not rely on your thread staying alive forever. It might get terminated at any moment and you better be ready for it.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.