How are threads managed for Begin/Async calls (like socket IO)?


The .Net Socket async API manages threads automatically when using the BeginXXX methods. For example, if I have 100 active connections sending and receiving TCP messages, only around 3 threads are used. That makes me curious.
How does the API do this thread management?
How is the flow of all the connections divided among the threads for processing?
How the manager prioritizes which connections/readings/writings must be processed first?
My questions may not make sense because I don't know how it works or what to ask specifically, so sorry. Basically, I need to know how this whole process works at a low level.

The .Net Socket async API manages threads automatically when using the
BeginXXX methods.
This is not quite correct. The APM Begin/End-style socket APIs do not manage threads at all. Rather, the completion AsyncCallback is called on an arbitrary thread, namely the thread on which the asynchronous socket I/O operation completed. Most likely, this is going to be an IOCP pool thread (I/O completion port thread), different from the thread on which you called the BeginXXX method. For more details, check Stephen Cleary's "There Is No Thread".
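As a rough illustration of the point above, here is a minimal sketch that starts a BeginReceive on a loopback connection and records which thread the callback runs on. The class name ApmCallbackDemo and its Run method are made up for this example; the loopback setup is only an assumption to keep the sketch self-contained.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

// Hypothetical demo class, not part of any library.
public static class ApmCallbackDemo
{
    // Returns the id of the thread that called BeginReceive, the id of the
    // thread that ran the callback, whether that thread belongs to the
    // thread pool, and the text that was received.
    public static (int CallerId, int CallbackId, bool OnPoolThread, string Text) Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick one
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        using Socket server = listener.AcceptSocket();
        listener.Stop();

        var buffer = new byte[64];
        int callbackId = -1;
        bool onPoolThread = false;
        int received = 0;
        using var done = new ManualResetEventSlim(false);

        // No data has been sent yet, so the receive stays pending and
        // no thread is blocked waiting for it.
        server.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, ar =>
        {
            callbackId = Environment.CurrentManagedThreadId;
            onPoolThread = Thread.CurrentThread.IsThreadPoolThread;
            received = server.EndReceive(ar);
            done.Set();
        }, null);

        // Sending data completes the pending receive and triggers the callback.
        byte[] payload = Encoding.ASCII.GetBytes("ping");
        client.GetStream().Write(payload, 0, payload.Length);

        done.Wait(); // only the demo waits here, so it can return the results
        return (Environment.CurrentManagedThreadId, callbackId, onPoolThread,
                Encoding.ASCII.GetString(buffer, 0, received));
    }
}
```

Calling ApmCallbackDemo.Run() should report a callback thread id different from the caller's, since the callback is dispatched when the I/O completes rather than by the thread that started it.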
How the manager prioritizes which connections/readings/writings must
be processed first?
The case when there are no IOCP threads available to handle the completion of the async I/O operation is called ThreadPool starvation. It happens when all pool threads are busy executing some code (e.g., processing the received socket messages), or are blocked in a blocking call like WaitHandle.WaitOne(). In this case, the I/O completion routine is queued to the ThreadPool to be executed when a thread becomes available, on a FIFO basis.
You have the option to increase the size of the ThreadPool with the SetMinThreads/SetMaxThreads APIs, but doing so isn't always a good idea. The number of threads that actually run concurrently is limited by the number of CPU cores anyway, so you'd rather want to finish any CPU-bound processing work as soon as possible and release the thread back to the pool.
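For reference, a small sketch of reading and adjusting those limits. ThreadPoolInfo and the extra parameter are names invented for this example; the ThreadPool calls themselves are the standard ones.

```csharp
using System;
using System.Threading;

// Hypothetical helper class, not part of any library.
public static class ThreadPoolInfo
{
    // Reads the current minimum and maximum sizes of the worker pool
    // and of the I/O completion (IOCP) pool.
    public static (int MinWorker, int MinIo, int MaxWorker, int MaxIo) Read()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        return (minWorker, minIo, maxWorker, maxIo);
    }

    // Raises the worker minimum by `extra` (a made-up parameter) and leaves
    // the IOCP minimum unchanged. SetMinThreads returns false if the request
    // is rejected, for example when it exceeds the current maximum.
    public static bool RaiseMinWorkers(int extra)
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        return ThreadPool.SetMinThreads(minWorker + extra, minIo);
    }
}
```

Raising the minimum only makes the pool inject threads sooner under load; it does not add CPU capacity, which is why keeping callbacks short is usually the better fix.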

Related

On Windows, C#, Does Socket.BeginReceive() take a background thread before or after the data arrives?

If it takes a background thread before the data arrives, then with many connections waiting for data there will be too many threads, causing performance degradation. Is there an approach to wait for data without taking a thread?

Socket.BeginReceive(), and other asynchronous I/O methods in .NET, make use of the IOCP thread pool. The short version is that this is a very efficient way to manage I/O. There is practically no cost to wait for the I/O to complete, and even once it completes, your completion callback is called from a thread pool thread, tying up that thread only for as long as it takes for your callback to complete.
"IOCP" stands for "IO Completion Ports", a feature in the native Windows API. The basic idea is that you can have a single thread, or some small collection of threads, all ready to service the completion of a large number of I/O operations. This allows I/O operations to scale well into the hundreds of thousands, if not millions, of concurrent operations, while still only requiring a relatively small number of threads to deal with them all.
So, go right ahead and use those asynchronous I/O APIs. They are the best way to write scalable I/O code.
(Aside: the Socket class in particular has a number of async options. Ironically, the methods ending in ...Async do not comply with the new(er) async/await paradigm in C#, but they are in fact the most scalable way to do I/O with a Socket, because not only do they use the IOCP thread pool, they also allow you to reuse your I/O state objects, so you can have a pool of those and minimize GC load.)
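To make the aside a bit more concrete, here is a minimal sketch of reusing one SocketAsyncEventArgs instance for several receives on a loopback connection. ReusedEventArgsDemo and its Run method are invented for this example, and the waiting on an event is only there to keep the demo sequential; a real server would continue from the Completed handler instead.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

// Hypothetical demo class, not part of any library.
public static class ReusedEventArgsDemo
{
    // Sends `messages` copies of "ping" over loopback and receives them all
    // through a single, reused SocketAsyncEventArgs. Returns the received text.
    public static string Run(int messages)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick one
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        using Socket server = listener.AcceptSocket();
        listener.Stop();

        byte[] payload = Encoding.ASCII.GetBytes("ping");
        for (int i = 0; i < messages; i++)
            client.GetStream().Write(payload, 0, payload.Length);

        using var completed = new AutoResetEvent(false);
        using var args = new SocketAsyncEventArgs();
        args.SetBuffer(new byte[payload.Length], 0, payload.Length); // one buffer, allocated once
        args.Completed += (_, __) => completed.Set();

        var text = new StringBuilder();
        int expected = messages * payload.Length;
        while (text.Length < expected)
        {
            // ReceiveAsync returns true if the operation is pending (Completed
            // will be raised later) and false if it finished synchronously.
            if (server.ReceiveAsync(args))
                completed.WaitOne();

            if (args.SocketError != SocketError.Success || args.BytesTransferred == 0)
                break; // error or connection closed

            text.Append(Encoding.ASCII.GetString(args.Buffer, args.Offset, args.BytesTransferred));
        }
        return text.ToString();
    }
}
```

The same args object and buffer serve every receive, so no per-operation state object is allocated; a pool of such objects extends the idea to many sockets.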

C# Asynchronous Socket Read Without Using Runtime's Threadpool

I'm trying to create a socket server which can handle a relatively large number of clients. There are different approaches to such a task: the first is to use a separate thread for each incoming connection, and the second is to use the async/await pattern.
The first approach is bad because, past a relatively small number of threads, the system's resources will be lost to context switching.
At first sight, the second approach looks good, as we can have our own thread pool with a limited number of worker threads: a dispatcher receives incoming connections, adds them to a queue, and calls async socket read methods, which in turn receive data from the socket and add the data/errors to a queue for further processing (error handling, client responses, DB-related work).
There is not much information on the internal implementation of async/await that I could find, but as I understand it, in a non-UI application all continuations run through TaskScheduler.Current, which uses the runtime's thread pool, so its resources are limited. A greater number of incoming connections will result either in no free threads in the runtime's thread pool, or in so many threads that the system stops responding.
In this matter async/await will result in same problem as with 1-client/1-thread concern, although with a small advantage, as the runtime thread pool's threads may not occupy as much address space as a default System.Threading.Thread (I believe 1MB of stack plus ~1/2MB of control data).
Is there any way I can make one thread wait for some kernel interrupt on, say, 10 sockets, so that the application only uses my explicitly sized thread pool? (I mean that if further data arrives on any one of the 10 sockets, one thread wakes up and handles it.)

In this matter async/await will result in same problem as with 1-client/1-thread concern
When a thread reaches code that runs asynchronously, control is returned to the caller. That means the thread is returned to the thread pool and can handle another request, so this is superior to 1-client/1-thread because the thread isn't blocked.
There is an interesting blog post about async/await:
1

How are asynchronous I/O methods processed

After reading a lot about async-await, I can only find the benefits of using it in GUI thread (WPF/WinForms).
In what scenarios does it reduce the creation of threads in WCF services?
Must a programmer use async-await on every method in the service once they choose to implement async-await in a web service? Does having some non-async-await methods in a service full of async-await reduce the efficiency of my service? How?
Last question - some say that using 'await Task.Run(()=>...)' is not a "real async-await". What do they mean by saying that?
Thanks in advance,
Stav.
EDIT:
Both answers are excellent, but for an even deeper explanation of how async-await works, I suggest reading Stephen Cleary's answer here:
https://stackoverflow.com/a/7663734/806963
The following topics are required to understand his answer:
SynchronizationContext, SynchronizationContext.Current, TaskScheduler, TaskScheduler.Current, ThreadPool.

The real benefit of async/await in server applications (like WCF) is asynchronous I/O.
When you call a synchronous I/O method, the calling thread will be blocked waiting for the I/O to complete. The thread cannot be used by other requests, it just waits for the result. When more requests arrive, the thread pool will create more threads to handle them, wasting a lot of resources - memory, context switching when the waiting threads get unblocked...
If you use async IO, the thread is not blocked. After starting the asynchronous IO operation, it is again available to be used by the thread pool. When the async operation is finished, the thread pool assigns a thread to continue processing the request. No resources wasted.
From MSDN (it's about file I/O, but it applies to other kinds of I/O too):
In synchronous file I/O, a thread starts an I/O operation and immediately enters a wait state until the I/O request has completed. A thread performing asynchronous file I/O sends an I/O request to the kernel by calling an appropriate function. If the request is accepted by the kernel, the calling thread continues processing another job until the kernel signals to the thread that the I/O operation is complete. It then interrupts its current job and processes the data from the I/O operation as necessary.
Now you probably can see why await Task.Run() will not give any benefit if the IO in the task is done synchronously. A thread will get blocked anyway, just not the one that called the Task.Run().
You don't need to implement every method asynchronously to see improvement in performance (although it should become a habit to always perform I/O asynchronously).
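The difference between wrapping blocking work in Task.Run and doing truly asynchronous work can be sketched roughly like this. Thread.Sleep and Task.Delay are only stand-ins for synchronous and asynchronous I/O, and the class and method names are invented for this example.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of any library.
public static class TaskRunVsAsync
{
    // "Fake" async: the blocking wait is merely moved to another thread.
    // That thread stays blocked for the whole duration. The result tells
    // whether the blocked thread was a thread pool thread.
    public static Task<bool> BlockingOnPoolThreadAsync(int milliseconds)
    {
        return Task.Run(() =>
        {
            Thread.Sleep(milliseconds); // stand-in for a synchronous I/O call
            return Thread.CurrentThread.IsThreadPoolThread;
        });
    }

    // Truly asynchronous: no thread is held while waiting. A thread is
    // only needed again to run the code after the await.
    public static async Task<int> NonBlockingAsync(int milliseconds)
    {
        await Task.Delay(milliseconds); // stand-in for an asynchronous I/O call
        return milliseconds;
    }
}
```

Both methods can be awaited in the same way by the caller, which is why the first one looks asynchronous from the outside even though it occupies a pool thread for the whole wait.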

In what scenarios does it reduce the creation of threads in WCF services?
If you have an action that will wait on an IO operation (reading from the database, calling an external web service, ...), using async/await frees up the managed thread that your WCF request is being processed on. That makes the thread available for other requests, pending completion of your IO. It makes for more efficient use of the thread pool.
After reading a lot about async-await, I can only find the benefits of using it in GUI thread
For client applications that is the key benefit that I'm aware of, since you are far less likely to run out of managed threads than you are in a server application.
some say that using 'await Task.Run(()=>...)' is not a "real async-await".
Task.Run queues your work to a thread pool thread, so a managed thread is still occupied for the duration of the work and you are not saving any managed threads.

How do WebSocket Clients work in .Net?

My question is more related to how WebSockets (on the client) work/behave with threads in .Net and what I am looking for as an answer would be more of a low level explanation on how the OS interacts with the .Net thread when it receives data from the server on its socket.
Suppose I have a client that opens 1000 sockets to a server asynchronously. It then sits there waiting for updates/events to come through. These events can arrive at different times and frequencies.
Assuming that every time data comes in via a socket, a thread needs to pick it up and do some work on it, am I correct to assume that IF all the 1000 sockets receive data at the same time I will then have 1000 threads (1 thread per socket) coming from the Thread Pool to pick-up the data from the socket? What if I wanted to have 3000 sockets open?
Any clarification on this is very much appreciated.

Assuming you are using the .NET Framework library WebSocket, the received data will be returned on a thread from the ThreadPool (probably the IO Completion Thread Pool).
Thread Pool
When the thread pool is used, you don't know how many different threads will be active at the same time. The data is put on a queue and the thread pool works through it as fast as it can. You can control the min/max number of threads that it will use, but the way the pool creates/destroys its threads is unspecified.
The above holds true for most asynchronous operations in .NET.
Exceptions
If you awaited the asynchronous receive operation in a synchronization context (for instance a UI thread) the operation will resume in the same context (UI thread), unless you suppress the sync context. In this case only one thread will be used and the receive operations will be queued and processed in sequence.

.NET Async IO associated with calling Sleep on response handler

I have a piece of code (on a server) that uses async method to receive data on sockets like this:
asyncRes = connectionSocket.BeginReceive(receiveBuffer, 0, RECEIVING_BUFFER_SIZE,
SocketFlags.None, out error, new AsyncCallback(ReceiveDataDone), null);
In the socket's handler (ReceiveDataDone) there are cases where Thread.Sleep(X) is used in order to wait for other things. I know this is a questionable design, but I wonder whether this kind of code could explain an explosion of threads created in my application, because of the other pending sockets in the server that have their ReceiveDataDone called (when many connections are handled by the server, the number of threads created figuratively explodes). I wonder how the BeginReceive method on .NET sockets works, and whether that could explain the huge number of threads I see.

You absolutely should not perform any kind of blocking action in APM callbacks. These are run in the ThreadPool. The ThreadPool is designed for the invocation of short-lived tasks. If you block (or take a long time to execute) you are tying up (a finite number of) threads and causing ThreadPool starvation. Because the ThreadPool does not spin up extra threads easily (in fact, it's quite slow to start extra threads), you're bottlenecking on the timing that controls how quickly the ThreadPool is allowed to spin up new threads.
Despite answering a different question, this answer I provided a while back explains the same issue:
https://stackoverflow.com/a/1733226/14357

You should not use Thread.Sleep to wait on ThreadPool threads. It blocks the thread, which will not accept any further work items for as long as it is blocked.
You can use a System.Threading.Timer (with a TimerCallback) for such a use case. It lets the ThreadPool schedule other work on the thread in the meantime.
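A minimal sketch of that idea, assuming the delayed work can be expressed as a delegate. DelayedWork and RunAfter are names invented for this example; only Timer, TimerCallback and TaskCompletionSource come from the framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, not part of any library.
public static class DelayedWork
{
    // Runs `work` once after `delay` on a thread pool thread, without
    // blocking any thread in the meantime. The returned task completes
    // when the work has run (or faults if the work throws).
    public static Task RunAfter(TimeSpan delay, Action work)
    {
        var tcs = new TaskCompletionSource<bool>();

        // This constructor overload passes the timer itself as the state
        // object and leaves the timer unscheduled until Change is called.
        TimerCallback callback = state =>
        {
            try
            {
                work();
                tcs.TrySetResult(true);
            }
            catch (Exception ex)
            {
                tcs.TrySetException(ex);
            }
            finally
            {
                ((Timer)state).Dispose(); // one-shot: release the timer
            }
        };
        var timer = new Timer(callback);

        // Fire once after `delay`, never repeat.
        timer.Change(delay, Timeout.InfiniteTimeSpan);
        return tcs.Task;
    }
}
```

Inside a receive callback, something like DelayedWork.RunAfter(TimeSpan.FromSeconds(1), ContinueProcessing) would replace Thread.Sleep(1000), where ContinueProcessing is a placeholder for whatever the handler wanted to do after the wait.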
