Is Blocking code really expensive on modern systems? - c#

I'm trying to grasp a bit better the concepts of async programming (mostly for C#) and blocking/non blocking code.
In C#, if I call .Wait() on a Task, is it always considered "blocking"?
I understand that the current thread will be blocked. However, the thread is put in a "waiting" state (AFAIK), and AFAIK it will never be scheduled by the OS until it is woken up when the Task completes (I assume the thread is woken up by kernel magic).
In that case, the CPU time taken by this blocking operation should be negligible during the waiting period. Is that indeed the case?
So where do the advantages of async programming come from? Is it because it allows going beyond the limit of 1000 or so threads that the OS would otherwise impose? Is it because the memory overhead per async task is lower than the overhead of a thread?
Keep in mind that the "event loop" that manages all the tasks in an async context also has work to do to manage the scheduling of all async tasks, bookkeeping, etc. Is it really less work than what the kernel has to do in the blocking case to manage threads?

Wait() will block your thread the same as calling a non-async I/O.
Blocking is not inherently inefficient. In fact, it can be more performant if your process has very few threads. Windows' scheduler actually has some interesting special designs for I/O-blocked threads, which you can read about in the Windows Internals books, such as boosting a thread to the front of the line if it has been waiting on an I/O for a long time.
However, it doesn't scale. Every thread you create has overhead: memory for stack and register space, thread-local storage used by your app and inside of .NET, cache thrashing caused by all the extra memory needed, context switching, and so on. It's generally not going to be an efficient use of resources especially when each thread will spend a majority of its time blocked.
Async takes advantage of the fact that conceptually we don't really need everything a thread has to offer -- we only want concurrency, so we can make more domain-relevant optimizations in how we use our resources.
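To make the scaling point concrete, here is a minimal sketch rather than a benchmark. It assumes that Task.Delay is an acceptable stand-in for a real I/O call; the names AsyncScalingDemo, HandleRequestAsync and RunAsync are made up for illustration. It starts 1000 simulated requests at once and counts how many distinct threads ended up running their continuations.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncScalingDemo
{
    // Simulates one I/O-bound request. Task.Delay is an assumption standing in
    // for a real network or disk call; no thread is held while it is pending.
    private static async Task<int> HandleRequestAsync(int id, ConcurrentDictionary<int, bool> threadsSeen)
    {
        await Task.Delay(500);
        // Record which thread resumed this request after the wait.
        threadsSeen.TryAdd(Environment.CurrentManagedThreadId, true);
        return id;
    }

    // Runs `count` simulated requests concurrently and reports how many
    // distinct threads executed the continuations.
    public static async Task<(int completed, int distinctThreads, TimeSpan elapsed)> RunAsync(int count)
    {
        var threadsSeen = new ConcurrentDictionary<int, bool>();
        var sw = Stopwatch.StartNew();
        int[] results = await Task.WhenAll(
            Enumerable.Range(0, count).Select(i => HandleRequestAsync(i, threadsSeen)));
        sw.Stop();
        return (results.Length, threadsSeen.Count, sw.Elapsed);
    }

    public static async Task Main()
    {
        var (completed, threads, elapsed) = await RunAsync(1000);
        Console.WriteLine($"{completed} requests, {threads} threads, {elapsed.TotalMilliseconds:F0} ms");
    }
}
```

On a typical machine the thread count printed should be far below 1000 and the elapsed time close to a single delay, but the exact numbers depend on the runtime and hardware. Replacing the await with Thread.Sleep on dedicated threads would need one thread per request to get the same concurrency.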
It rarely hurts a project to be async by default. If your app doesn't need to be hyper-optimized for scalability, it won't hurt or help you. If your app does, then it'll be a huge help. Things like async/await can just help you model your concurrency better, so regardless of your perf goals it can be useful.
Async I/O is moving towards an even cooler place: I/O APIs like Windows RIO and Linux's io_uring allow you to do I/O without even context switching. Currently .NET does not take advantage of these things, but PipeWriter and PipeReader were built with it in mind for the future.

Related

Are there advantages to asynchronous code on dedicated backend servers with no UI thread?

I had a developer challenge the use of asynchronous server side code the other day. He asked why asynchronous code is superior to synchronous code on a server with no UI thread to block? I gave him the typical thread exhaustion answer but after thinking about it for a while I was no longer sure my answer was correct. After doing a little research I found that the upper limit to threads in an OS is governed by memory not an arbitrary number. And servers like Kestrel support unlimited threads. So in "theory" the number of requests (threads) a server can block on in parallel is governed by memory. Which is no different than async code in .NET; it lifts stack variables to the heap but it's still memory bound.
I've always assumed that smarter people than me had thought this through and async code was the right way to handle IO bound code. But what are the measurable advantages of async .NET code when running in a dedicated server farm with no UI thread? Does a move to the cloud (AWS) change the answer?
The purpose of server-side asynchronous code is completely different from that of asynchronous UI code.
Asynchronous UI code makes the UI more responsive (especially when multiple CPU cores are available); it allows multiple UI tasks to run in parallel, which improves the user experience.
The purpose of server-side asynchronous code, on the other hand, is to minimise the resources needed to serve multiple clients simultaneously. In fact, it is beneficial even if there is only one CPU core or a single-threaded event loop, as in Node.js. And it all boils down to one simple concept: asynchronous IO.
The difference between synchronous and asynchronous IO is that with the former, the thread that initiates an IO operation is paused until the operation completes (e.g. until a DB request is executed or a file is read from disk). The same thread is then un-paused once the IO operation completes, so that it can process the result. Note: even though the paused thread is most likely not using any CPU (it is probably put to sleep by the thread scheduler), its resources are still tied to this particular IO operation and are pretty much wasted while the hardware performs the IO. Effectively, synchronous IO needs at least one thread per client request in flight, even though most of those threads are probably asleep waiting for their IO operations to complete. In .NET each thread typically gets 1 MB of stack by default, so if the server is currently processing, say, 1000 requests, almost 1 GB of memory is set aside simply for thread stacks. On top of that comes an additional burden on the thread scheduler and more CPU time spent on context switches: the more threads there are, the slower the overall performance of the system. More allocated memory also means less efficient use of memory and CPU caches.
Asynchronous IO is more efficient because a worker thread only initiates an IO operation and, instead of waiting for it to complete, is immediately switched to another useful task (e.g. continuing to process another client's request). When the hardware completes the IO operation, processing of the result is resumed on any available worker thread. As a result, depending on the ratio between the time spent waiting for hardware to complete IO and the time spent on CPU tasks (e.g. serialising the result of an IO operation into JSON), this approach can use fewer threads to serve the same number of simultaneous client requests: if, say, 90% of the time is spent in IO, we can potentially use only 100 threads to serve the same 1000 simultaneous requests. The more your server-side code is IO-bound rather than CPU-bound, the more simultaneous client requests it can process with a given amount of resources: CPU and memory.
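The figures quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 1 MB stack size and the 90% IO share are the assumptions stated in the text, and ThreadBudget is a hypothetical helper name.

```csharp
using System;

public static class ThreadBudget
{
    // Stack memory set aside when every in-flight request holds its own thread.
    // The 1 MB default is the assumption used in the text above.
    public static long StackBytes(int threads, long stackBytesPerThread = 1024 * 1024)
        => threads * stackBytesPerThread;

    // Rough number of threads kept busy when a thread is only occupied during
    // the CPU portion of each request (integer ceiling of requests * cpuShare).
    public static int ThreadsNeeded(int concurrentRequests, int ioPercent)
        => (concurrentRequests * (100 - ioPercent) + 99) / 100;

    public static void Main()
    {
        // 1000 blocked threads at 1 MB each: 1,048,576,000 bytes, just under 1 GiB.
        Console.WriteLine($"{StackBytes(1000):N0} bytes of stack for 1000 threads");
        // 90% of request time spent in IO leaves about 100 threads' worth of CPU work.
        Console.WriteLine($"{ThreadsNeeded(1000, 90)} threads for 1000 requests at 90% IO");
    }
}
```

This ignores scheduling overhead and bursts, so treat it as an estimate of the order of magnitude rather than a sizing rule.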
What is the drawback of asynchronous code? Mainly that it is generally harder to write than synchronous code. Asynchronous code uses callbacks to resume an operation, so instead of writing simple linear code a programmer needs to pass a delegate (a continuation) to the IO method, which the system later calls when the IO operation completes (potentially on a different thread). However, modern C# with its async/await facilities makes this task less complicated and even makes asynchronous code look almost like synchronous code. The only thing to remember: asynchronous code only works when it is asynchronous "all the way down". Even a single Task.Wait or Task.Result somewhere in the call stack, between the initial HTTP request processing and the DB request call, makes that whole path synchronous: it forces the current worker thread to wait for the call to finish, which defeats the purpose. Note: await in C# code does not actually wait for the result of the call; the compiler converts it into something like a ContinueWith, i.e. a continuation callback. In practice it is a bit more complicated than that, but luckily the complexity is hidden from the programmer, so nowadays writing efficient asynchronous code is a relatively straightforward task.
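The three styles mentioned above can be put side by side. The following is a simplified sketch, not what the compiler literally generates: FetchAsync is a hypothetical IO call simulated with Task.Delay, and the class and method names are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Hypothetical IO call; Task.Delay stands in for a DB or network request.
    private static async Task<string> FetchAsync()
    {
        await Task.Delay(100);
        return "payload";
    }

    // Callback style: the continuation is passed explicitly as a delegate.
    public static Task<int> LengthWithCallback()
        => FetchAsync().ContinueWith(t => t.Result.Length);

    // await style: reads like linear code, the rest of the method is the continuation.
    public static async Task<int> LengthWithAwait()
    {
        string s = await FetchAsync();
        return s.Length;
    }

    // Blocking style: .Result holds the calling thread until the IO finishes,
    // which is what "not async all the way down" looks like.
    public static int LengthBlocking()
        => FetchAsync().Result.Length;

    public static async Task Main()
    {
        Console.WriteLine(await LengthWithCallback());
        Console.WriteLine(await LengthWithAwait());
        Console.WriteLine(LengthBlocking());
    }
}
```

All three return the same value; the difference is what the calling thread does in the meantime. In a console app the blocking variant merely wastes a thread, while in environments with a SynchronizationContext it can also deadlock.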

Does the .Net Threadpool provide any mechanisms to avoid I/O performance degradation when a lot of CPU intensive work is scheduled?

I have a pretty specific question about the .NET threadpool.
I would say I have a pretty fair understanding of the threadpool, but one thing still boggles my mind.
Let's assume I run a web application which serves requests, but also performs a lot of heavy duty CPU-bound work by rendering / editing uploaded media.
Common advice when it comes to separating I/O and CPU bound tasks in an application would be to dispatch the CPU bound work to the .NET ThreadPool. Concretely, that would mean dispatching the call with Task.Run(...) - so far so good.
However, I do wonder what would happen if this is done for a lot of requests. Let's say several hundreds / thousands, enough to really put an enormous strain on a machine, even up to the point where the ThreadPool just can't handle it anymore. Adding more threads would obviously only go so far when your CPU can't handle more. I would say at this point the ThreadPool's threads are also at the mercy of the CPU itself and the scheduling algorithm.
What implications would this have on I/O bound async operations?
Would this cause I/O bound async operations to struggle with executing their continuation? Given we are in a runtime environment which executes async/await continuations on the Threadpool and discards the SynchronizationContext, what would ensure that these would still execute properly?
Does the Threadpool make any sophisticated assumption as to which Thread receives scheduling priority, to ensure throughput even when it's absolutely polluted with work?
It would be especially interesting to know how ASP.Net Core deals with this, since the request handlers are supposedly Threadpool Threads themselves.
Let's assume I run a web application which serves requests, but also performs a lot of heavy duty CPU-bound work by rendering / editing uploaded media.
Common advice when it comes to separating I/O and CPU bound tasks in an application would be to dispatch the CPU bound work to the .NET ThreadPool. Concretely, that would mean dispatching the call with Task.Run(...) - so far so good.
No, that's bad advice.
ASP.NET is already handling the request on a thread pool thread, so switching to another thread pool thread via Task.Run isn't going to help anything - in fact, it'll make things worse.
Task.Run is fine to offload CPU work to the thread pool when the calling method is a GUI thread. However, it's not a good idea to use Task.Run on ASP.NET, generally speaking.
However, I do wonder what would happen if this is done for a lot of requests. Let's say several hundreds / thousands, enough to really put an enormous strain on a machine, even up to the point where the ThreadPool just can't handle it anymore. Adding more threads would obviously only go so far when your CPU can't handle more.
The thread pool will inject threads whenever the thread pool is over-full. However, the injection rate is limited, so the thread pool grows slowly.
What implications would this have on I/O bound async operations? Would this cause I/O bound async operations to struggle with executing their continuation? ... what would ensure that these would still execute properly?
First off, the I/O requests themselves (and their lowest-level, BCL-internal continuations) are not affected. That's because "the" thread pool is actually two thread pools: there's worker threads (that execute queued work) and there's I/O threads (that enlist in the I/O completion port and handle low-level I/O completion).
However, at some point most continuations do transition to the worker thread pool, so by the time your code continues, it needs a regular thread pool thread to do so. And yes, that means that if the (worker) thread pool is saturated, then that can starve await continuations.
Having ASP.NET handlers do heavy CPU work is unusual. The thread pool does have a lot of knobs to tweak if you do need to support it. And there's always the option of splitting the CPU-bound APIs internally into a separate API, which would give you two different ASP.NET apps: one I/O-bound and the other CPU-bound, which would let you tune the thread pool appropriately for each.
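As a small illustration of the "knobs" mentioned above, the sketch below reads the worker and I/O thread minimums and raises the worker minimum so that bursts do not have to wait for the slow injection rate. ThreadPool.GetMinThreads, GetMaxThreads and SetMinThreads are real APIs; the class name, helper names and the value 64 are arbitrary assumptions, not a recommendation.

```csharp
using System;
using System.Threading;

public static class ThreadPoolKnobs
{
    // Reads the current minimums for the two kinds of pool threads
    // (worker threads and I/O completion threads).
    public static (int worker, int io) ReadMinimums()
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return (worker, io);
    }

    // Raises the worker-thread minimum, leaving the I/O minimum untouched.
    // Returns false if the runtime rejects the value (for example, above the maximum).
    public static bool RaiseWorkerMinimum(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        if (desiredWorkers <= worker) return true; // already high enough
        return ThreadPool.SetMinThreads(desiredWorkers, io);
    }

    public static void Main()
    {
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        Console.WriteLine($"min = {ReadMinimums()}, max = ({maxWorker}, {maxIo})");
        Console.WriteLine($"raised: {RaiseWorkerMinimum(64)}, min now = {ReadMinimums()}");
    }
}
```

Raising the minimum only changes how quickly threads are made available; it does not add CPU capacity, so a saturated machine will still starve continuations.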

Conditions to use async-methods in c# .net-core web-apis

I'm implementing several small services, each of which uses entity-framework to store certain (but little) data. They also have a fair bit of business-logic so it makes sense to separate them from one another.
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
I'm uncertain whether to use the async methods of entity-framework (e.g. SaveChangesAsync or FirstOrDefaultAsync), because I can't find metrics that say "now you do it, and now you don't" apart from "Is it I/O or CPU-intensive or not?".
What I've found when researching this topic (not limited to this but these are showing the problem):
Not using it can cause your application to stop responding, because the threads (not the CPU's, but the OS's virtual threads) can run out due to the I/O calls to the database, which are blocking in that case.
Using it bloats your code and decreases performance because of the context switches at every method. Especially when I apply it to entity-framework calls, it means at least three context switches for one call: from controller to business logic to repository to database.
What I don't know, and that's what I would like to know from you:
How many virtual OS threads are there? Or to be more precise: if I expect my application and server to be able to handle 100 requests to this service within five seconds (and I don't expect there to be more, 100 is already exaggerated), should I back away from using async/await there?
What are the precise metrics that I could look at to answer this question for any of my services?
Or should I rather always use async-methods for I/O calls because they are already there and it could always happen that the load-situation on my server changes and there's so much going on that the async-methods would help me a great deal with that?
I'm certainly aware that async-methods and the async-await pattern itself can solve many problems in regards to performance especially when it comes to any I/O or cpu-intensive operations.
Sort of. The primary benefit of asynchronous code is that it frees up threads. UI apps (i.e., desktop/mobile) manifest this benefit in more responsive user interfaces. Services such as the ones you're writing manifest this benefit in better scalability - the performance benefits are only visible when under load. Also, services only receive this benefit from I/O operations; CPU-bound operations require a thread no matter what, so using await Task.Run on service applications doesn't help at all.
Not using it can cause your application to stop responding, because the threads (not the CPU's, but the OS's virtual threads) can run out due to the I/O calls to the database, which are blocking in that case.
Yes. More specifically, the thread pool has a limited injection rate, so it can only grow so far so quickly. Asynchrony (freeing up threads) helps your service handle bursty traffic and heavy load. Quote:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Next question:
using it bloats your code and decreases performance because of the context-switches at every method.
The main performance drawback to async is usually memory related. There's additional structures that need to be allocated to keep track of ongoing asynchronous work. In the synchronous world, the thread stack itself has this information.
What I don't know, and that's what I would like to know from you: [when should I use async?]
Generally speaking, you should use async for any new code doing I/O-based operations (including all EF operations). The metrics-based arguments are more about cost/benefit analysis of converting to async - i.e., given an existing old synchronous codebase, at what point is it worth investing the time to convert it to async.
TLDR: Should I use async? YES!
You seem to have fallen for the most common mistake when trying to understand async/await. Async is orthogonal to multi-threading.
To answer your question: when should you use the async method?
if (currentContext.IsAsync && method.HasAsyncVersion)
    return UseAsync.Yes;
else
    return UseAsync.No;
That above is the short version.
Async/Await actually solves a few problems
Unblock UI thread
M:N threading
Multithreaded scheduling and synchronization
Interrupt/event-based asynchronous scheduling
Given the large number of different use cases for async/await, the "assumptions" you state only apply to certain cases.
For example, context switching only happens with multi-threading. Single-threaded, interrupt-based async actually reduces context switching by reducing blocking times and keeping the OS thread well fed with work.
Finally, the premise of your question about OS threads is fundamentally wrong.
Firstly, each OS thread requires its own stack (typically 1 MB of contiguous address space reserved by default for a .NET thread on Windows, so 100 threads means about 100 MB reserved before any work is even done).
Secondly, unless you have 100 physical cores on your PC, your CPUs will have to context switch between each OS thread, resulting in the CPU stalling, whilst it loads that thread. By using M:N threading, you can keep the CPU running, by reducing the number of OS threads and instead using Green Threads (Task in dotnet).
Thirdly, not all "await" results in "async" behavior. Tasks are able to synchronously return, short-circuiting all of the "bloat".
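The synchronous fast path is easy to observe. In this minimal sketch the cache, the 50 ms delay and the names FastPathDemo, GetAsync and LoadAsync are all hypothetical. A cache hit returns an already-completed task, so awaiting it continues immediately without suspending the caller.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class FastPathDemo
{
    private static readonly ConcurrentDictionary<string, string> Cache = new();

    // Cache hit: returns an already-completed task, so an await on it
    // continues synchronously. Cache miss: falls back to the slow path.
    public static Task<string> GetAsync(string key)
        => Cache.TryGetValue(key, out var value) ? Task.FromResult(value) : LoadAsync(key);

    // Slow path: Task.Delay is an assumption standing in for real I/O.
    private static async Task<string> LoadAsync(string key)
    {
        await Task.Delay(50);
        string value = key.ToUpperInvariant();
        Cache[key] = value;
        return value;
    }

    public static async Task Main()
    {
        Task<string> first = GetAsync("a");
        Console.WriteLine($"first call completed immediately: {first.IsCompleted}");
        await first;

        Task<string> second = GetAsync("a");
        Console.WriteLine($"second call completed immediately: {second.IsCompleted}");
        Console.WriteLine(await second);
    }
}
```

The second call is guaranteed to be complete on return because it comes from Task.FromResult; the first one normally is not, since it has to wait for the simulated I/O.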
In short, without digging really deep, it is hard to find optimization opportunities by switching from async to sync methods.

What are the scalability benefits of async (non-blocking) code?

Blocking threads is considered a bad practice for 2 main reasons:
Threads cost memory.
Threads cost processing time via context switches.
Here are my difficulties with those reasons:
Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing the async call (the context is saved, after all). And if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads (saving only the callstack's context and nothing else)? Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architect our programs in an asynchronous fashion (which is significantly more complex, harder to understand and maintain)?
When a thread gets blocked, it is put into a waiting state by the OS. The OS won't context-switch to the sleeping thread. Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work. So performance-wise, I can't see a whole lot of benefit to a non-blocking approach either.
What am I missing here or why are those arguments flawed?
Non-blocking, async code should also cost pretty much the same amount of memory, because the callstack should be saved somewhere right before executing the async call (the context is saved, after all).
The entire call stack is not saved when an await occurs. Why do you believe that the entire call stack needs to be saved? The call stack is the reification of continuation and the continuation of the awaited task is not the continuation of the await. The continuation of the await is on the stack.
Now, it may well be the case that when every asynchronous method in a given call stack has awaited, information equivalent to the call stack has been stored in the continuations of each task. But the memory burden of those continuations is garbage collected heap memory, not a block of a million bytes of committed stack memory. The continuation state size is order n in the size of the number of tasks; the burden of a thread is a million bytes whether you use it or not.
if threads are significantly inefficient (memory-wise), why doesn't the OS/CLR offer a more light-weight version of threads
The OS does. It offers fibers. Of course, fibers still have a stack, so that's maybe not better. You could have a thread with a small stack I suppose.
Wouldn't it be a much cleaner solution to the memory problem, instead of forcing us to re-architect our programs in an asynchronous fashion
Suppose we made threads -- or for that matter, processes -- much cheaper. That still doesn't solve the problem of synchronizing access to shared memory.
For what it's worth, I think it would be great if processes were lighter weight. They're not.
Moreover, the question somewhat contradicts itself. You're doing work with threads, so you are already willing to take on the burden of managing asynchronous operations. A given thread must be able to tell another thread when it has produced the result that the first thread asked for. Threading already implies asynchrony, but asynchrony does not imply threading. Having an async architecture built in to the language, runtime and type system only benefits people who have the misfortune to have to write code that manages threads.
Since way over 95% of the thread's life cycle is spent on sleeping (assuming IO-bound apps here), the performance hit should be negligible, since the processing sections of the thread would probably not be pre-empted by the OS because they should run very fast, doing very little work.
Why would you hire a worker (thread) and pay their salary to sit by the mailbox (sleeping the thread) waiting for the mail to arrive (handling an IO message)? IO interrupts don't need a thread in the first place. IO interrupts exist in a world below the level of threads.
Don't hire a thread to wait on IO; let the operating system handle asynchronous IO operations. Hire threads to do insanely huge amounts of high latency CPU processing, and then assign one thread to each CPU you own.
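For the CPU-bound half of that advice, one way to cap the work at one worker per logical processor is sketched below, using Parallel.For with MaxDegreeOfParallelism. The workload (a sum of squares) and the class name are arbitrary assumptions chosen only because the result is easy to verify.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class OneThreadPerCore
{
    // Sums the squares of 1..n using at most one worker per logical processor.
    public static long SumOfSquares(int n)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
        long total = 0;
        Parallel.For(1, n + 1, options,
            () => 0L,                                           // per-worker running subtotal
            (i, _, subtotal) => subtotal + (long)i * i,         // pure CPU work, never blocks
            subtotal => Interlocked.Add(ref total, subtotal));  // merge once per worker
        return total;
    }

    public static void Main()
    {
        Console.WriteLine($"cores: {Environment.ProcessorCount}");
        Console.WriteLine(SumOfSquares(10)); // 1 + 4 + ... + 100 = 385
    }
}
```

More workers than cores would not make this faster, because the threads never wait on anything; they would only add context switches.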
Now we come to your question:
What are the benefits of async (non-blocking) code?
Not blocking the UI thread
Making it easier to write programs that live in a world with high latency
Making more efficient use of limited CPU resources
But let me rephrase the question using an analogy. You're running a delivery company. There are many orders coming in, many deliveries going out, and you cannot tell a customer that you will not take their delivery until every delivery before theirs is completed. Which is better:
hire fifty guys to take calls, pick up packages, schedule deliveries, and deliver packages, and then require that 46 of them be idle at all times or
hire four guys and make each of them really good at first, doing a little bit of work at a time, so that they are always responsive to customer requests, and second, really good at keeping a to-do list of jobs they need to do in the future
The latter seems like a better deal to me.
You are mixing up multithreading and async concepts here.
Both your "difficulties" come from the assumption that each async method gets assigned a specialized thread on which it does the work. However, the state of affairs is quite opposite: each time an async operation needs to be executed, the CLR picks an idle (thus already created) thread from the threadpool and executes that method on the selected thread.
The core concept here is that async doesn't mean always creating new threads, it means scheduling the execution on existing threads so that no thread is sitting idle.

Why Use Async/Await Over Normal Threading or Tasks?

I've been reading a lot about async and await, and at first I didn't get it because I didn't properly understand threading or tasks. But after getting to grips with both I wonder: why use async/await if you're comfortable with threads?
The asynchrony of async/await can also be achieved with thread signaling, Thread.Join(), etc. Is it merely about saving coding time and "less" hassle?
Yes, it is syntactic sugar that makes dealing with threads much easier. It also makes the code easier to maintain, because the thread management is done by the runtime. await releases the thread immediately and allows that thread or another one to pick up where it left off, even on the main thread.
Like other abstractions, if you want complete control over the mechanisms under the covers, then you are still free to implement similar logic using thread signaling, etc.
If you are interested in seeing what async/await produces then you can use Reflector or ILSpy to decompile the generated code.
Read What does async & await generate? for a description of what C# 5.0 is doing on your behalf.
If await was just calling Task.Wait we wouldn't need special syntax and new APIs for that. The major difference is that async/await releases the current thread completely while waiting for completion. During an async IO there is no thread involved at all. The IO is just a small data structure inside of the kernel.
async/await uses callback-based waiting under the hood and makes all its nastiness (think of JavaScript callbacks...) go away.
Note, that async does not just move the work to a background thread (in general). It releases all threads involved.
Comparing async and await with threads is like comparing apples and pipe wrenches. From 10,000 feet they may look similar, but they are very different solutions to very different problems.
async and await are all about asynchronous programming; specifically, allowing a method to pause itself while it's waiting for some operation. When the method pauses, it returns to its caller (usually returning a task, which is completed when the method completes).
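The "pause and return to the caller" behaviour can be seen by logging the order of events. This is a minimal sketch in which the 100 ms delay and the names PauseDemo, WorkAsync and RunAsync are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PauseDemo
{
    public static readonly List<string> Log = new();

    private static async Task WorkAsync()
    {
        Log.Add("1: WorkAsync started");   // runs synchronously on the caller's thread
        await Task.Delay(100);             // method pauses here and returns a Task
        Log.Add("3: WorkAsync resumed");   // runs later, as a continuation
    }

    public static async Task RunAsync()
    {
        Log.Clear();
        Task pending = WorkAsync();        // returns at the first incomplete await
        Log.Add("2: caller continues while the task is pending");
        await pending;
        Log.Add("4: caller observed completion");
    }

    public static async Task Main()
    {
        await RunAsync();
        foreach (string line in Log) Console.WriteLine(line);
    }
}
```

The entries come out in numeric order: the caller gets control back (step 2) before the paused method resumes (step 3), and no thread is dedicated to WorkAsync in between.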
I assume you're familiar with threading, which is about managing threads. The closest parallel to a thread in the async world is Task.Run, which starts executing some code on a background thread and returns a task which is completed when that code completes.
async and await were carefully designed to be thread-agnostic. So they work quite well in the UI thread of WPF/Win8/WinForms/Silverlight/WP apps, keeping the UI thread responsive without tying up thread pool resources. They also work quite well in multithreaded scenarios such as ASP.NET.
If you're looking for a good intro to async/await, I wrote up one on my blog which has links to additional recommended reading.
There is a difference between the Threads and async/await feature.
Think about a situation where you are making a network call to fetch some data. Here the thread that calls the network driver (probably running in some svchost process) keeps itself blocked and consumes resources.
With async/await, if the call is network bound, the entire call is wrapped into a callback using a SynchronizationContext, which is capable of receiving the callback from an external process. This frees the thread, which becomes available for other work.
Asynchrony and concurrency are two different things: the former is just calling something in async mode, while the latter is really CPU bound. Threads are generally better when you need concurrency.
I wrote a blog post describing these features a long time ago:
C# 5.0 vNext - New Asynchronous Pattern
async/await does not use threads; that's one of the big advantages. It keeps your application responsive without the added complexity and overhead inherent in threads.
The purpose is to make it easy to keep an application responsive when dealing with long-running, I/O intensive operations. For example, it's great if you have to download a bunch of data from a web site, or read files from disk. Spinning up a new thread (or threads) is overkill in those cases.
The general rule is to use threads via Task.Run when dealing with CPU-bound operations, and async/await when dealing with I/O bound operations.
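A small sketch of that rule follows, under the assumption of a client-style app (console or UI) rather than a server. The prime-counting workload, the temporary file and the names WorkKinds, CountPrimesInBackground and CountCharsAsync are hypothetical; Task.Run and File.ReadAllTextAsync are the real APIs being illustrated.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class WorkKinds
{
    // CPU-bound: naive prime count by trial division. It needs a thread the whole time.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n <= limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; (long)d * d <= n; d++)
                if (n % d == 0) { isPrime = false; break; }
            if (isPrime) count++;
        }
        return count;
    }

    // CPU-bound work moved off the calling thread with Task.Run.
    public static Task<int> CountPrimesInBackground(int limit)
        => Task.Run(() => CountPrimes(limit));

    // I/O-bound work: awaited directly, no extra thread requested.
    public static async Task<int> CountCharsAsync(string path)
    {
        string text = await File.ReadAllTextAsync(path);
        return text.Length;
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        await File.WriteAllTextAsync(path, "hello");
        Console.WriteLine(await CountPrimesInBackground(100));
        Console.WriteLine(await CountCharsAsync(path));
        File.Delete(path);
    }
}
```

Note that on a server such as ASP.NET the request is already running on a thread pool thread, so wrapping CPU work in Task.Run there generally buys nothing.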
Stephen Toub has a great blog post on async/await that I recommend you read.
