Trouble understanding the mechanics+benefits of async in .NET - c#

On reading this article it gives the impression that using async can mean that a webserver can concurrently serve more requests than it has threads.
I don't understand how this works though. I do understand (at least I think I understand) that using async one can spin off multiple IO requests without blocking inside a request and without starting a new threads. I believe there's some magic going on but deep within the covers there's probably a call to select().
However, even then, I still think you need one thread per request. The thread presumably has the current stack, and other information about where execution in that thread is up to (like an instruction pointer). I can't see how one can just trash that information whilst you're waiting for a file descriptor in a select() call, lest you forget where you're at and also risk all your active data being garbage collected. But the article first referenced suggested this somehow happens, and I quote:
Asynchronous requests allow a smaller number of threads to handle a larger number of requests.
I don't really understand the mechanics of how this happens, and to me it seems impossible. Sure, async will reduce the number of additional threads to deal with IO requests, but I still can't see how you can avoid at least one thread per request. Unless you do something weird like trashing the thread but saving it's state to the heap and then restoring it later, but I don't really see how that achieves much (you could just apply the thread pool to "running threads" and achieve basically the same thing).

Related

Conditions to use async-methods in c# .net-core web-apis

I'm implementing several small services, each of which uses entity-framework to store certain (but little) data. They also have a fair bit of business-logic so it makes sense to separate them from one another.
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
I'm uncertain wether to use the async-methods of entity-framework logic (e.g. SaveChangesAsync or FirstOrDefaultAsync) because I can't find metrics that say "now you do it, and now you don't" besides from "Is it I/O or CPU-Intensive or not?".
What I've found when researching this topic (not limited to this but these are showing the problem):
not using it can lead to your application stopping to respond because the threads (not the ones of the cpu, but virtual threads of the os) can run out because of the in that case blocking i/o calls to the database.
using it bloats your code and decreases performance because of the context-switches at every method. Especially when I apply those to entity-framework calls it means that I have at least three context switches for one call from controller to business-logic to the repository to the database.
What I don't know, and that's what I would like to know from you:
How many virtual os threads are there? Or to be more precise: If I expect my application and server to be able to handle 100 requests to this service within five seconds (and I don't expect them to be more, 100 is already exagerated), should I back away from using async/await there?
What are the precise metrics that I could look at to answer this question for any of my services?
Or should I rather always use async-methods for I/O calls because they are already there and it could always happen that the load-situation on my server changes and there's so much going on that the async-methods would help me a great deal with that?
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
Sort of. The primary benefit of asynchronous code is that it frees up threads. UI apps (i.e., desktop/mobile) manifest this benefit in more responsive user interfaces. Services such as the ones you're writing manifest this benefit in better scalability - the performance benefits are only visible when under load. Also, services only receive this benefit from I/O operations; CPU-bound operations require a thread no matter what, so using await Task.Run on service applications doesn't help at all.
not using it can lead to your application stopping to respond because the threads (not the ones of the cpu, but virtual threads of the os) can run out because of the in that case blocking i/o calls to the database.
Yes. More specifically, the thread pool has a limited injection rate, so it can only grow so far so quickly. Asynchrony (freeing up threads) helps your service handle bursty traffic and heavy load. Quote:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Next question:
using it bloats your code and decreases performance because of the context-switches at every method.
The main performance drawback to async is usually memory related. There's additional structures that need to be allocated to keep track of ongoing asynchronous work. In the synchronous world, the thread stack itself has this information.
What I don't know, and that's what I would like to know from you: [when should I use async?]
Generally speaking, you should use async for any new code doing I/O-based operations (including all EF operations). The metrics-based arguments are more about cost/benefit analysis of converting to async - i.e., given an existing old synchronous codebase, at what point is it worth investing the time to convert it to async.
TLDR: Should I use async? YES!
You seem to have fallen for the most common mistake when trying to understand async/await. Async is orthogonal to multi-threading.
To answer your question, when should you the async method?
If currentContext.IsAsync && method.HasAsyncVersion
return UseAsync.Yes;
Else
return UseAsync.No;
That above is the short version.
Async/Await actually solves a few problems
Unblock UI thread
M:N threading
Multithreaded scheduling and synchronization
Interupt/Event based asynchronous scheduling
Given the large number of different use cases for async/await, the "assumptions" you state only apply to certain cases.
For example, context switching, only happens with Multi-Threading. Single-Threaded Interupt based Async actually reduces context switching by reducing blocking times and keeping the OS thread well fed with work.
Finally, your question on OS threads, is fundimentally wrong.
Firstly, OS threads each require creation of a stack (4MB of continous RAM, 100 threads means 400MB of RAM before any work is even done).
Secondly, unless you have 100 physical cores on your PC, your CPUs will have to context switch between each OS thread, resulting in the CPU stalling, whilst it loads that thread. By using M:N threading, you can keep the CPU running, by reducing the number of OS threads and instead using Green Threads (Task in dotnet).
Thirdly, not all "await" results in "async" behavior. Tasks are able to synchronously return, short-circuiting all of the "bloat".
In short, without digging really deep, it is hard to find optimization opportunities by switching from async to sync methods.

C# async/await in backend service / webservice usefull?

Is async/await useful in a backend / webservice scenario?
Given the case there is only one thread for all requests / work. If this thread awaits a task it is not blocked but it also has no other work to do so it just idles. (It can't accept another request because the current execution is waiting for the task to resolve).
Given the case there is one thread per request / "work item". The Thread still idles because the other request is handled by another thread.
The only case I can imagine is doing two async operations at a the same time is like reading a file and sending an http request. But this sounds like a rare case. Is should read the file first and then post the content and not post something I didn't even read.
Given the case there is one thread per request / "work item". The Thread still idles because the other request is handled by another thread.
That's closer to reality but the server doesn't just keep adding threads ad infinitum - at some point it'll let requests queue if there's not a thread free to handle the request. And that's where freeing up a thread that's got no other work to usefully do at the moment starts winning.
It's hard to read your question without feeling that you misunderstand how webservers work and how async/await & threads work. To make it simple, just think of it like this: async/await is almost always good to use when you query an external resource (e.g. database, web service/API, system file, etc). If you follow this simple rule, you don't need to think too deeply about each situation.
However, when you read & learn more on these subjects and gain good experience, deep thinking becomes essential in each case because there are always exceptions to any rule, so there are scenarios where the overhead of using async/await & threads may transcends their benefits. For example, Microsoft decided not to use it for the logger in ASP.Net Core and there is even a comment about it in the source code.
In your case, the webserver uses much more threads that you seem to think and for much more reasons than you seem to think. Also when a thread is idling waiting for something, it cannot do anything else. What async/await do is that they untie the thread from the current awaited task so the thread can go back to the pool and do something else. When the awaited task is finished, a thread (can be a different thread) is pulled out of the pool to continue the job. You seem to understand this to some degree, but perhaps you just don't know what other things a thread in a webserver can do. Believe me, there is a lot to do.
Finally, remember that threads are generic workers, they can do anything. Webservers may have specialized threads for different tasks, but they fall into two or three categories. Threads can still do anything within their category. Webservers can even move threads to different categories when required. All of that is done for you so you don't need to think about it in most cases and you can just focus on freeing the threads so the webserver can do its job.
Given the case there is only one thread for all requests / work.
I challenge you to say that this is a very abstruse case. Even before multi core servers because standard, asp.net used 50+ threads per core.
If this thread awaits a task it is not blocked but it also has no other work to do so it
just idles.
No, it goes back into the pool handling other requests. MOST web services will love handling as many requests as possible with as few resources as possible. Servers only handling one client are a rare edge case. Extremely rare. Most web services will handle as many requests as the plenthora of clients throw at them.

What are the scalability benefits of async (non-blocking) code?

Blocking threads is considered a bad practice for 2 main reasons:
Threads cost memory.
Threads cost processing time via context switches.
Here are my difficulties with those reasons:
Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing he async call (the context is saved, after all). And if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads (saving only the callstack's context and nothing else)? Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architecture our programs in an asynchronous fashion (which is significantly more complex, harder to understand and maintain)?
When a thread gets blocked, it is put into a waiting state by the OS. The OS won't context-switch to the sleeping thread. Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work. So performance-wise, I can't see a whole lot of benefit to a non-blocking approach either.
What am I missing here or why are those arguments flawed?
Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing he async call (the context is saved, after all).
The entire call stack is not saved when an await occurs. Why do you believe that the entire call stack needs to be saved? The call stack is the reification of continuation and the continuation of the awaited task is not the continuation of the await. The continuation of the await is on the stack.
Now, it may well be the case that when every asynchronous method in a given call stack has awaited, information equivalent to the call stack has been stored in the continuations of each task. But the memory burden of those continuations is garbage collected heap memory, not a block of a million bytes of committed stack memory. The continuation state size is order n in the size of the number of tasks; the burden of a thread is a million bytes whether you use it or not.
if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads
The OS does. It offers fibers. Of course, fibers still have a stack, so that's maybe not better. You could have a thread with a small stack I suppose.
Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architecture our programs in an asynchronous fashion
Suppose we made threads -- or for that matter, processes -- much cheaper. That still doesn't solve the problem of synchronizing access to shared memory.
For what it's worth, I think it would be great if processes were lighter weight. They're not.
Moreover, the question somewhat contradicts itself. You're doing work with threads, so you are already willing to take on the burden of managing asynchronous operations. A given thread must be able to tell another thread when it has produced the result that the first thread asked for. Threading already implies asynchrony, but asynchrony does not imply threading. Having an async architecture built in to the language, runtime and type system only benefits people who have the misfortune to have to write code that manages threads.
Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work.
Why would you hire a worker (thread) and pay their salary to sit by the mailbox (sleeping the thread) waiting for the mail to arrive (handling an IO message)? IO interrupts don't need a thread in the first place. IO interrupts exist in a world below the level of threads.
Don't hire a thread to wait on IO; let the operating system handle asynchronous IO operations. Hire threads to do insanely huge amounts of high latency CPU processing, and then assign one thread to each CPU you own.
Now we come to your question:
What are the benefits of async (non-blocking) code?
Not blocking the UI thread
Making it easier to write programs that live in a world with high latency
Making more efficient use of limited CPU resources
But let me rephrase the question using an analogy. You're running a delivery company. There are many orders coming in, many deliveries going out, and you cannot tell a customer that you will not take their delivery until every delivery before theirs is completed. Which is better:
hire fifty guys to take calls, pick up packages, schedule deliveries, and deliver packages, and then require that 46 of them be idle at all times or
hire four guys and make each of them really good at first, doing a little bit of work at a time, so that they are always responsive to customer requests, and second, really good at keeping a to-do list of jobs they need to do in the future
The latter seems like a better deal to me.
You are messing multithreading and async concepts here.
Both your "difficulties" come from the assumption that each async method gets assigned a specialized thread on which it does the work. However, the state of affairs is quite opposite: each time an async operation needs to be executed, the CLR picks an idle (thus already created) thread from the threadpool and executes that method on the selected thread.
The core concept here is that async doesn't mean always creating new threads, it means scheduling the execution on existing threads so that no thread is sitting idle.

ThreadPool and methods with while(true) loops?

ThreadPool utilizes recycling of threads for optimal performance over using multiple of the Thread class. However, how does this apply to processing methods with while loops inside of the ThreadPool?
As an example, if we were to apply a thread in the ThreadPool to a client that has connected to a TCP server, that client would need a while loop to keep checking for incoming data. The loop can be exited to disconnect the client, but only if the server closes or if the client demands a disconnection.
If that is the case, then how would having a ThreadPool help when masses of clients connect? Either way the same amount of memory is used if the clients stay connected. If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
On the other hand it was suggested to me to use the Network.BeginReceive and NetworkStream.EndReceive asynchronous methods to avoid threads all together to save RAM usage and CPU usage. Is this true or not?
Either way the same amount of memory is used if the clients stay
connected.
So far this is true. It's up to your app to decide how much state it needs to keep per client.
If they stay connected, then the threads cannot be recycled. If so,
then ThreadPool would not help much until a client disconnects and
opens up a thread to recycle.
This is untrue, because it assumes that all interesting operations performed by these threads are synchronous. This is a naive mode of operation, and in fact real world code is asynchronous: a thread makes a call to request an action and is then free to do other things. When a result is made available as a result of that action, some thread looking for other things to do will run the code that acts on the result.
On the other hand it was suggested to me to use the
Network.BeginReceive and NetworkStream.EndReceive asynchronous methods
to avoid threads all together to save RAM usage and CPU usage. Is this
true or not?
As explained above, async methods like these will allow you to service a potentially very large number of clients with only a small number of worker threads -- but by itself it will do nothing to either help or hurt the memory situation.
You are correct. Slow blocking codes can cause poor performances both on the client-side as well as server-side. You can run slow work on a separate thread and that might work well enough on the client-side but may not help on the server-side. Having blocking methods in the server can diminish the overall performance of the server because it can lead to a situation where your server has a large no of threads running and all blocked. So, even simple request might end up taking a long time. It is better to use asynchronous APIs if they are available for slow running tasks just like the situation you are in. (Note: even if the asynchronous operations are not available, you can implement one by implementing a custom awaiter class) This is better for the clients as well as servers. The main point of asynchronous code is to reduce the no of threads. Because servers can have larger no of requests in progress simultaneously because reducing no of threads to handle a particular no of clients can improve scalability.
If you dont need to have more control over the threads or the thread-pool you can go with asynchronous approach.
Also, each thread takes 1 MB space on the heap. So, asynchronous methods will definitely help reduce memory usage. However, I think the nature of the work you have described here is going to take pretty much the same amount of time in multi-threaded as well as asynchronous approach.

What resources do blocked threads take-up

One of the main purposes of writing code in the asynchronous programming model (more specifically - using callbacks instead of blocking the thread) is to minimize the number of blocking threads in the system.
For running threads , this goal is obvious, because of context switches and synchronization costs.
But what about blocked threads? why is it so important to reduce their number?
For example, when waiting for a response from a web server a thread is blocked and doesn't take-up any CPU time and does not participate in any context switch.
So my question is:
other than RAM (about 1MB per thread ?) What other resources do blocked threads take-up?
And another more subjective question:
In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method to lots of beginXXX and EndXXX methods, and moving parameters and local variables to be class fields).
UPDATE - additional reasons I didn't mention or didn't give enough weight to:
More threads means more locking on communal resources
More threads means more creation and disposing of threads which is expensive
The system can definitely run-out of threads/RAM and then stop servicing clients (in a web server scenario this can actually bring down the service)
So my question is: other than RAM (about 1MB per thread ?) What other resources do blocked threads take-up?
This is one of the largest ones. That being said, there's a reason that the ThreadPool in .NET allows so many threads per core - in 3.5 the default was 250 worker threads per core in the system. (In .NET 4, it depends on system information such as virtual address size, platform, etc. - there isn't a fixed default now.) Threads, especially blocked threads, really aren't that expensive...
However, I would say, from a code management standpoint, it's worth reducing the number of blocked threads. Every blocked thread is an operation that should, at some point, return and become unblocked. Having many of these means you have quite a complicated set of code to manage. Keeping this number reduced will help keep the code base simpler - and more maintainable.
And another more subjective question: In what cases will this cost really justify the hassle of writing asynchronous code (the price could be, for example, splitting your nice coherent method to lots of beginXXX and EndXXX methods, and moving parameters and local variables to be class fields).
Right now, it's often a pain. It depends a lot on the scenario. The Task<T> class in .NET 4 dratically improves this for many scenarios, however. Using the TPL, it's much less painful than it was previously using the APM (BeginXXX/EndXXX) or even the EAP.
This is why the language designers are putting so much effort into improving this situation in the future. Their goals are to make async code much simpler to write, in order to allow it to be used more frequently.
Besides from any resources the blocked thread might hold a lock on, thread pool size is also of consideration. If you have reached the maximum thread pool size (if I recall correctly for .NET 4 is max thread count is 100 per CPU) you simply won't be able to get anything else to run on the thread pool until at least one thread gets freed up.
I would like to point out that the 1MB figure for stack memory (or 256KB, or whatever it's set to) is a reserve; while it does take away from available address space, the actual memory is only committed as it's needed.
On the other hand, having a very large number of threads is bound to bog down the task scheduler somewhat as it has to keep track of them (which have become runnable since the last tick, and so on).

Categories