Scheduling of I/O-bound operations in .NET - c#

If I'm on a thread which doesn't need to be responsive, and for which continued execution relies on the result of an I/O bound call (HttpClient request), is there any value in implementing the call asynchronously in .NET?
Will Windows know that I'm waiting on an I/O operation and refrain from scheduling the thread until data arrives?
I recall reading somewhere that it does, but I'm afraid I still have difficulty understanding how this works and when I can rely on it.

No, there is no value in using async there. As you suspect, Windows will know that the thread is waiting for IO and won't schedule the thread until the data arrives.
However, the point of async is that you don't need to dedicate a thread to the wait at all. (I'm cutting a few corners here; there is better documentation available on the Internet.) What you are doing manually, parking a thread until the result arrives, async does for you without creating or blocking a thread: the method is suspended and resumes when the data arrives.
If this needs to be high performance, I would not advise doing it the way you're implementing it now. Async would be much better for this. In your case, when you're doing 1000 requests, you would have 1000 blocked threads, each holding its own stack, which is not a good idea. Async handles this much more efficiently and will give you better performance.
The basic advantage of using async (besides performance) is that it's like you're actually programming only on the UI thread. Previously, that would have locked up your application, but with async your application stays responsive. That's really the primary advantage of async.
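To make the 1000-requests point concrete, here is a minimal sketch rather than a benchmark. It assumes Task.Delay as a stand-in for the real HttpClient call, and the names ManyRequestsSketch and FakeRequestAsync are made up for illustration. It starts many simulated requests at once and counts how many distinct threads were needed to resume them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ManyRequestsSketch
{
    // Hypothetical stand-in for an I/O-bound call such as HttpClient.GetAsync.
    // Task.Delay completes via a timer, so no thread is blocked while it is pending.
    private static async Task<int> FakeRequestAsync(int id, ConcurrentDictionary<int, bool> threadsSeen)
    {
        await Task.Delay(200);
        // Record which thread resumed the method after the wait.
        threadsSeen.TryAdd(Environment.CurrentManagedThreadId, true);
        return id;
    }

    // Starts `count` simulated requests at once and reports how many completed
    // and how many distinct threads ran the code after the await.
    public static async Task<(int Completed, int ThreadsUsed)> RunAsync(int count)
    {
        var threadsSeen = new ConcurrentDictionary<int, bool>();
        Task<int>[] pending = Enumerable.Range(0, count)
            .Select(i => FakeRequestAsync(i, threadsSeen))
            .ToArray();
        int[] results = await Task.WhenAll(pending);
        return (results.Length, threadsSeen.Count);
    }
}
```

Calling `await ManyRequestsSketch.RunAsync(1000)` should report 1000 completed requests, while the thread count is normally a small number tied to the size of the thread pool rather than to the number of requests. The exact count varies from run to run.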

Related

Is multithreading an API application a thing?

I'm learning C#/.NET, mainly because of its speed advantage over Node.js and its OO syntax.
Now the tutorial I am following all of a sudden introduced async, and that's cool, but I could have done that with Node.js as well, so I feel a little disappointed.
My thought was that maybe we could take this to the next level with multithreading, but a lot of questions came up, such as inconsistencies in the database (for example, thread one expects data that thread two updates, but thread two has not run by the time thread one reads, so thread one works with outdated data).
Searching for this returns very little information; mostly it's people confusing multithreading with asynchronous programming.
So I'm guessing you would not want to mix API with multithreading?
Yes, it's a thing, and you're already doing it with async tasks.
.NET has a task scheduler that assigns your tasks to available threads from the thread pool. By default, the pool starts with about one worker thread per available CPU core and adds more when work backs up.
Clarification: this doesn't mean 1 task : 1 thread. There's a large collection of work to be done by a number of workers. Scheduler hands a worker a job, worker works until it's done or an 'await' is reached.
From the perspective of a regular async method, it can be hard to see where the 'multi-threading' comes into play. There isn't an obvious difference between Get() and await GetAsync() when your code has to sit and wait either way.
But it's not always about your code. This example might make it more clear.
    List<Task> work = new();
    foreach (var uri in uriList)
    {
        work.Add(http.GetAsync(uri));
    }
    await Task.WhenAll(work);
This code will execute all those GetAsyncs at the same time.
The framework making your API work is doing something similar. It would be pretty silly if the whole server was tied up because a single user requested a big file over dialup.
Async await is used for multi-threading but it is not used only for multi-threading.
I have not personally used or seen multi-threading in an API, only in console jobs. Using the TPL in console jobs has improved efficiency by more than 100% for me.
Async/await is powerful and should be used for asynchronous processing in APIs too.
Please go through Shiv's videos https://www.youtube.com/watch?v=iMcycFie-nk

Task.Run() vs Async/Await

I have a question regarding some code I am writing. I have 3 calls made synchronously to some endpoints that have large payloads. I don’t want to wait for these payloads and instead continue running through the method until I need the values from those 3 endpoints.
I have approached a solution like this. I converted the method that calls the 3 service endpoints into an async method. I start the call for the data using
var serviceCallOneTask = Task.Run(()=> serviceCallOne());
Note serviceCallOne() is not asynchronous
and finally when I need the data I do something like
var serviceCallOneValue = await serviceCallOneTask;
My questions are
Is this solution considered bad practice?
Should I be worried about deadlocks?
From what I have read, when using await we are not blocking a thread but when using task.run we are using a CPU-bound thread and we are blocking the thread pool; is that correct?
Is it better for me to convert everything in this httpGet method from beginning to end into async methods?
Is it ok for me to approach the problem this way for now and later on convert those task.run() services into asynchronous methods?
Is this solution considered bad practice?
It depends what you're doing.
Task.Run moves execution to another thread. That is helpful in a desktop application because you don't want long-running CPU-bound operations running on the UI thread and locking up your UI.
ASP.NET is different since there is no UI thread, so there is no need to move operations to another thread, unless you want to do something in parallel (run two CPU-bound operations at the same time).
If you're doing something else between calling Task.Run and await serviceCallOneTask, then that's certainly a reason to do what you're doing. But whether it's "better" depends on what serviceCallOne() is doing. You have to think about two things to determine if the benefit outweighs the cost:
Does the benefit of running it in a separate thread outweigh the cost of moving it to a separate thread? (Is it actually faster than running it in the same thread?)
Remember that ASP.NET has a limited number of threads (by default, 20 per processor), and now you're using 2 threads instead of 1. Depending on the expected load of your application, that may or may not matter.
Should I be worried about deadlocks?
Not with the small bit of code that you've shown. As long as you don't wait synchronously on an async method (for example, by calling .Result or .Wait() on its task), you will not have to worry about deadlocks.
From what I have read, when using await we are not blocking a thread but when using task.run we are using a CPU-bound thread and we are blocking the thread pool; is that correct?
When using await, you don't block the current thread. As discussed above, you might just be blocking another thread, depending on your code.
Is it better for me to convert everything in this httpGet method from beginning to end into async methods?
Considering the limited number of threads that ASP.NET has, and that async/await helps you free up threads, then yes. It's always better to use async wherever you can.
Is it ok for me to approach the problem this way for now and later on convert those task.run() services into asynchronous methods?
If it works, then it's "ok". But you have to change something, right? May as well do it right. :)
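As a rough sketch of the two stages discussed above (Task.Run around synchronous calls now, fully asynchronous calls later), something like the following could work. ServiceCallSync and ServiceCallAsync are hypothetical stand-ins for the real service calls, with Thread.Sleep and Task.Delay simulating the wait for the payload.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartEarlyAwaitLate
{
    // Hypothetical stand-in for a synchronous service call with a large payload.
    private static string ServiceCallSync(string name)
    {
        Thread.Sleep(300);            // blocks the calling thread for the whole wait
        return name + ":done";
    }

    // Hypothetical asynchronous equivalent; Task.Delay stands in for real async I/O.
    private static async Task<string> ServiceCallAsync(string name)
    {
        await Task.Delay(300);        // no thread is blocked while waiting
        return name + ":done";
    }

    // Interim approach from the question: wrap each synchronous call in Task.Run.
    // Each call occupies a thread-pool thread until it returns.
    public static async Task<string[]> WithTaskRunAsync()
    {
        Task<string> one = Task.Run(() => ServiceCallSync("one"));
        Task<string> two = Task.Run(() => ServiceCallSync("two"));
        Task<string> three = Task.Run(() => ServiceCallSync("three"));
        // ... work that does not need the results would go here ...
        return await Task.WhenAll(one, two, three);
    }

    // Fully asynchronous version: same shape, but no thread is held during the waits.
    public static async Task<string[]> FullyAsync()
    {
        Task<string> one = ServiceCallAsync("one");
        Task<string> two = ServiceCallAsync("two");
        Task<string> three = ServiceCallAsync("three");
        // ... work that does not need the results would go here ...
        return await Task.WhenAll(one, two, three);
    }
}
```

Both methods return the three results in the same order, so the calling code would not need to change when the Task.Run wrappers are eventually replaced.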

Async-Await vs ThreadPool vs MultiThreading on High-Performance Sockets (C10k Solutions?)

I'm really confused about async-awaits, pools and threads. The main problem starts with this question: "What can I do when I have to handle 10k socket I/O?" (aka The C10k Problem).
First, I tried to make a custom pooling architecture with threads that uses one main Queue and multiple Threads to process all incoming data. It was a great experience for understanding thread-safety and multi-threading, but hand-rolled threads are overkill now that async-await exists.
Later, I implemented a simple architecture with async-await, but I can't understand why "The async and await keywords don't cause additional threads to be created." (from MSDN). I think there must be some threads doing the jobs, like BackgroundWorker.
Finally, I implemented another architecture with ThreadPool, and it looks like my first custom pooling.
Now, I think there must be others who are as confused as I am about handling C10k. My project is a dedicated (central) server for my game project: a hub/lobby server like MCSG's lobbies or COD's matchmaking servers. I'll do the login operations, game server command executions/queries and information serving (like version, patch).
The last part might be more specific to my project, but I really need some good suggestions for real-world solutions to handling multiple (heavy) data loads.
(Also yes, 1k-10k-100k connection handling depending on server hardware but this is a general question)
The key point: Choosing Between the Task Parallel Library and the ThreadPool (MSDN Blog)
[ADDITIONAL] Good (basic) things to read for anyone who wants to understand what we are talking about:
Threads
Async, Await
ThreadPool
BackgroundWorker
async/await is roughly analogous to the "Serve many clients with each thread, and use asynchronous I/O and completion notification" approach in your referenced article.
While async and await by themselves do not cause any additional threads, they will make use of thread pool threads if an async method resumes on a thread pool context. Note that the async interaction with ThreadPool is highly optimized; it is very doubtful that you can use Thread or ThreadPool to get the same performance (with a reasonable time for development).
If you can, I'd recommend using an existing protocol - e.g., SignalR. This will greatly simplify your code, since there are many (many) pitfalls to writing your own TCP/IP protocol. SignalR can be self-hosted or hosted on ASP.NET.
No. If we use the asynchronous programming pattern that .NET introduced in 4.5, in most cases we do not need to create threads manually. The compiler does the difficult work that the developer used to do. Creating a new thread is costly; it takes time. Unless we need to control a thread directly, the "Task-based Asynchronous Pattern (TAP)" and the "Task Parallel Library (TPL)" are good enough for asynchronous and parallel programming. TAP and TPL use Task. In general, a Task uses a thread from the ThreadPool (a collection of threads already created and maintained by the .NET Framework). If we use Task, in most cases we do not need to use the thread pool directly. A thread can do many more useful things. You can read more about Thread Pooling.
You can avoid performance bottlenecks and enhance the overall responsiveness of your application by using asynchronous programming. Asynchrony is essential for activities that are potentially blocking, such as when your application accesses the web. Access to a web resource sometimes is slow or delayed. If such an activity is blocked within a synchronous process, the entire application must wait. In an asynchronous process, the application can continue with other work that doesn't depend on the web resource until the potentially blocking task finishes.
Await is specifically designed to deal with something taking time, most typically an I/O request, which traditionally was handled with a callback when the I/O request completed. Writing code that relies on these callbacks is quite difficult; await greatly simplifies it. Await just takes care of dealing with the delay; it doesn't otherwise do anything that a thread does. The await expression, what's at the right of the await keyword, is what gets the job done. You can use await with any method that returns a Task. The XxxxAsync() methods are just precooked ones in the .NET Framework for common operations that take time, like downloading data from a web server.
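A small sketch of the point that the awaited expression does the work and await only deals with the delay: the method below is not marked async at all, it just returns a Task, and it can still be awaited. The names AwaitAnyTask, ComputeLater and CallerAsync are invented for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitAnyTask
{
    // Not marked async: it simply returns a Task<int> produced elsewhere.
    public static Task<int> ComputeLater(int x)
    {
        return Task.Delay(50).ContinueWith(_ => x * 2);
    }

    public static async Task<int> CallerAsync()
    {
        int a = await ComputeLater(21);      // awaits the returned Task<int>
        int b = await Task.FromResult(1);    // an already-completed task works too
        return a + b;
    }
}
```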
I would recommend you to read Asynchronous Programming with Async and Await

Why Use Async/Await Over Normal Threading or Tasks?

I've been reading a lot about async and await, and at first I didn't get it because I didn't properly understand threading or tasks. But after getting to grips with both I wonder: why use async/await if you're comfortable with threads?
The asynchrony of async/await can be achieved with thread signaling, Thread.Join() etc. Is it merely to save coding time and hassle?
Yes, it is syntactic sugar that makes dealing with threads much easier. It also makes the code easier to maintain, because the thread management is done by the runtime. await releases the thread immediately and allows that thread or another one to pick up where it left off, even if done on the main thread.
Like other abstractions, if you want complete control over the mechanisms under the covers, then you are still free to implement similar logic using thread signaling, etc.
If you are interested in seeing what async/await produces then you can use Reflector or ILSpy to decompile the generated code.
Read What does async & await generate? for a description of what C# 5.0 is doing on your behalf.
If await was just calling Task.Wait we wouldn't need special syntax and new APIs for that. The major difference is that async/await releases the current thread completely while waiting for completion. During an async IO there is no thread involved at all. The IO is just a small data structure inside of the kernel.
async/await uses callback-based waiting under the hood and makes all its nastiness (think of JavaScript callbacks...) go away.
Note that async does not just move the work to a background thread (in general). It releases all threads involved.
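One way to picture the callback-based waiting described above is a sketch like this, where a callback-style operation is surfaced as a Task through TaskCompletionSource and then consumed with await. StartOperation is a hypothetical callback API. Here it is simulated with a timer-backed delay, whereas real asynchronous I/O is completed by the OS.

```csharp
using System;
using System.Threading.Tasks;

public static class CallbackVersusAwait
{
    // Hypothetical callback-style API: calls onDone some time after it was started.
    // Simulated with Task.Delay for the sketch; no thread waits during the delay.
    private static void StartOperation(int delayMs, Action<string> onDone)
    {
        Task.Delay(delayMs).ContinueWith(_ => onDone("payload"));
    }

    // Wraps the callback API in a Task. The task has no thread of its own;
    // it is just an object that the callback marks as completed.
    public static Task<string> OperationAsync(int delayMs)
    {
        var tcs = new TaskCompletionSource<string>();
        StartOperation(delayMs, result => tcs.SetResult(result));
        return tcs.Task;
    }

    // The caller reads top to bottom; the code after the await becomes the callback.
    public static async Task<int> PayloadLengthAsync()
    {
        string payload = await OperationAsync(100);
        return payload.Length;
    }
}
```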
Comparing async and await with threads is like comparing apples and pipe wrenches. From 10,000 feet they may look similar, but they are very different solutions to very different problems.
async and await are all about asynchronous programming; specifically, allowing a method to pause itself while it's waiting for some operation. When the method pauses, it returns to its caller (usually returning a task, which is completed when the method completes).
I assume you're familiar with threading, which is about managing threads. The closest parallel to a thread in the async world is Task.Run, which starts executing some code on a background thread and returns a task which is completed when that code completes.
async and await were carefully designed to be thread-agnostic. So they work quite well in the UI thread of WPF/Win8/WinForms/Silverlight/WP apps, keeping the UI thread responsive without tying up thread pool resources. They also work quite well in multithreaded scenarios such as ASP.NET.
If you're looking for a good intro to async/await, I wrote up one on my blog which has links to additional recommended reading.
There is a difference between threads and the async/await feature.
Think about a situation where you are calling over the network to get some data. Here the thread that calls into the network driver (probably running in some svchost process) stays blocked and consumes resources.
In the case of async/await, if the call is network bound, the rest of the method is wrapped into a callback (resumed through the SynchronizationContext when one is present), which can be invoked when the external operation completes. This frees the thread, which becomes available for other work.
Asynchrony and concurrency are two different things: the former is just calling something in async mode, while the latter is really CPU bound. Threads are generally better when you need concurrency.
I wrote a blog post long ago describing these features.
C# 5.0 vNext - New Asynchronous Pattern
async/await does not use threads; that's one of the big advantages. It keeps your application responsive without the added complexity and overhead inherent in threads.
The purpose is to make it easy to keep an application responsive when dealing with long-running, I/O intensive operations. For example, it's great if you have to download a bunch of data from a web site, or read files from disk. Spinning up a new thread (or threads) is overkill in those cases.
The general rule is to use threads via Task.Run when dealing with CPU-bound operations, and async/await when dealing with I/O bound operations.
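A minimal sketch of that rule, with made-up names (CpuVersusIo, SumSquaresAsync, CountCharsAsync): the CPU-bound loop goes through Task.Run, while the file read awaits the asynchronous stream API directly. The FileStream is opened with useAsync set to true on the assumption that the platform supports asynchronous file I/O.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class CpuVersusIo
{
    // CPU-bound: the work itself needs a core, so run it on a thread-pool thread.
    public static Task<long> SumSquaresAsync(int n)
    {
        return Task.Run(() =>
        {
            long total = 0;
            for (int i = 1; i <= n; i++)
            {
                total += (long)i * i;
            }
            return total;
        });
    }

    // I/O-bound: await the asynchronous API directly; no Task.Run is needed.
    public static async Task<int> CountCharsAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            string text = await reader.ReadToEndAsync();
            return text.Length;
        }
    }
}
```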
Stephen Toub has a great blog post on async/await that I recommend you read.

Asynchronous methods(!) clarification in .net?

I've been reading a lot about this topic lately, and I still need to clarify something.
The whole idea of asynchronous methods is thread economy:
Allow many tasks to run on a few threads. This is done by letting the hardware driver do the job while releasing the thread back to the thread pool so it can serve other jobs.
Please note:
I'm not talking about asynchronous delegates, which tie up another thread (executing a task in parallel with the caller).
However I've seen 2 main types of asynchronous methods examples :
Code samples (from books) that only use existing asynchronous I/O operations such as BeginXXX / EndXXX, e.g. Stream.BeginRead.
(I couldn't find any asynchronous method samples that don't use existing .NET I/O operations such as Stream.BeginRead.)
Code samples like this (and this), which don't actually invoke an asynchronous operation (the author thinks he does, but he actually causes a thread to block!).
Question :
Are asynchronous methods used only with existing .NET I/O methods like BeginXXX, EndXXX?
I mean, if I want to create my own asynchronous methods like BeginMyDelay(int ms,...){..}, EndMyDelay(...), I couldn't do it without tying up a blocked thread... correct?
Thank you very much.
P.S. Please note this question is tagged as .NET 4 and not .NET 4.5.
You're talking about APM.
APM makes wide use of an OS concept known as I/O completion ports. That's why I/O operations are the best candidates for APM.
You could write your own APM methods.
But, in fact, these methods will either be built on top of existing APM methods, or they will be I/O-bound and use some native OS mechanism (like FileStream, which uses overlapped file I/O).
For compute-bound asynchronous operations, APM will only increase complexity, IMO.
A bit more clarification.
Working with hardware is asynchronous by nature. Hardware needs time to perform a request: a network card must send or receive data, an HDD must read or write, etc. If the I/O is synchronous, the thread that issued the I/O request waits for the response. And here APM helps: "you don't have to wait, just execute something else, and when the I/O completes, I'll call you," says APM.
The main point: the operation is performed outside of the CPU.
When you're writing a compute-bound operation, which uses the CPU for its execution without any I/O, there's nothing to wait for. So APM can't help: if you need the CPU, you need a thread, so you need the thread pool.
I think, but I'm not sure, that you can create your own asynchronous methods, for example by creating a new thread and waiting for it to finish some work (db query, ...).
In terms of overall system performance it is probably not useful since, as you say, you just create another thread. But if you work on IIS, for example, the original request thread can be used for other requests while you are waiting for the 'background' operation.
I think that IIS has a fixed number of threads (thread pool), so in this case it can be useful.
I mean, if I want to create my own asynchronous methods like BeginMyDelay(int ms,...){..}, EndMyDelay(...), I couldn't do it without tying up a blocked thread... correct?
While I've not dug into the implementation of async, I can't see any reason why one couldn't do this.
The simplest way would be to use existing libraries that help [e.g. timers] or some sort of event system IIRC.
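For the timer route, a rough sketch (not tested against every runtime version) could look like the following. It uses only types available in .NET 4, System.Threading.Timer and TaskCompletionSource, and the name MyDelay.DelayAsync is made up for illustration. No thread sits blocked while the delay is pending; a thread-pool thread is borrowed briefly when the timer fires.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class MyDelay
{
    // Returns a task that completes after roughly `ms` milliseconds.
    public static Task<bool> DelayAsync(int ms)
    {
        var tcs = new TaskCompletionSource<bool>();
        Timer timer = null;
        timer = new Timer(_ =>
        {
            // The callback references the timer, which is intended to keep it
            // reachable until it fires; dispose it once it has done its job.
            timer.Dispose();
            tcs.TrySetResult(true);
        }, null, Timeout.Infinite, Timeout.Infinite);

        // Arm the timer only after the variable is assigned, so the callback
        // can never observe a null `timer`.
        timer.Change(ms, Timeout.Infinite);
        return tcs.Task;
    }
}
```

On .NET 4 the caller would attach a continuation with `MyDelay.DelayAsync(500).ContinueWith(...)`; on 4.5 and later the same task can simply be awaited.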
However even if you don't want to use any library helpers then you're stuck with a problem... the 'blocked thread'.
Sure, the code looks something like this:
    while (true)
    {
        foreach (var item in WaitingTasks)
        {
            if (item.Ready())
            {
                /*fire item, and remove it from tasks*/;
            }
        }
        /*Some blocking action*/
    }
Thing is - 'Some blocking action' doesn't have to be 'blocking'. You could yield/sleep the thread, or use it to process some data. For example, the Unity Game Engine does a similar thing with Coroutines - where the same thread that processes all the code also checks to see if various coroutines [that have been delayed due to time] need to be updated. Replace /*Some blocking action*/ with ProcessGameLoop().
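A runnable take on that loop might look like the sketch below. PollingScheduler, Schedule and Run are invented names, and this is only an illustration of the polling idea, not how Unity or the .NET runtime actually implement it. The doOtherWork delegate plays the role of ProcessGameLoop().

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public sealed class PollingScheduler
{
    private sealed class Pending
    {
        public long DueMs;
        public Action Callback;
    }

    private readonly List<Pending> _waiting = new List<Pending>();
    private readonly Stopwatch _clock = Stopwatch.StartNew();

    // Registers a callback to fire once `delayMs` milliseconds have elapsed.
    public void Schedule(int delayMs, Action callback)
    {
        _waiting.Add(new Pending
        {
            DueMs = _clock.ElapsedMilliseconds + delayMs,
            Callback = callback
        });
    }

    // Runs on the calling thread until nothing is waiting.
    public void Run(Action doOtherWork)
    {
        while (_waiting.Count > 0)
        {
            long now = _clock.ElapsedMilliseconds;
            // Walk backwards so items can be removed while iterating.
            for (int i = _waiting.Count - 1; i >= 0; i--)
            {
                if (_waiting[i].DueMs <= now)
                {
                    Action ready = _waiting[i].Callback;
                    _waiting.RemoveAt(i);   // remove before firing
                    ready();
                }
            }
            doOtherWork();                  // useful work instead of blocking
        }
    }
}
```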
Hope that helps; feel free to ask questions, post corrections, etc.
