I have a piece of code (on a server) that uses an async method to receive data on sockets, like this:
asyncRes = connectionSocket.BeginReceive(receiveBuffer, 0, RECEIVING_BUFFER_SIZE,
SocketFlags.None, out error, new AsyncCallback(ReceiveDataDone), null);
In the socket's handler (ReceiveDataDone) there are cases where Thread.Sleep(X) is used in order to wait for other things (a questionable implementation, indeed). I know this is questionable design, but I wonder whether code like this could explain the explosion of threads created in my application, because of the other pending sockets on the server that have their ReceiveDataDone called. (When the server handles many connections, the number of threads created explodes.) I would like to understand how the BeginReceive method on .NET sockets works, as that could explain the huge number of threads I see.
You absolutely should not perform any kind of blocking action in APM callbacks. These run on the ThreadPool, which is designed for the invocation of short-lived tasks. If you block (or take a long time to execute), you are tying up a finite number of threads and causing ThreadPool starvation. The ThreadPool does not spin up extra threads easily (in fact, it is quite slow to start them), so you end up bottlenecked on the timing that controls how quickly the ThreadPool is allowed to create new threads.
Despite answering a different question, this answer I provided a while back explains the same issue:
https://stackoverflow.com/a/1733226/14357
You should not use Thread.Sleep to wait on ThreadPool threads. It blocks the thread, which then cannot accept any further work items for as long as it is blocked.
You can use a System.Threading.Timer (with a TimerCallback) for such a use case. It lets the ThreadPool schedule other work on the waiting thread in the meantime.
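As a minimal sketch of that suggestion (the `ContinueProcessing` method and the 500 ms delay are hypothetical stand-ins for whatever `ReceiveDataDone` was sleeping for): instead of blocking the callback thread, hand the delay to a one-shot timer, so the thread returns to the pool immediately and the follow-up work runs later on whichever pool thread is free.

```csharp
using System;
using System.Threading;

class DeferredWorkSketch
{
    // Hypothetical follow-up that previously ran after Thread.Sleep(500).
    static void ContinueProcessing(object state)
    {
        Console.WriteLine("Deferred work running on a pool thread: " + state);
    }

    static void Main()
    {
        // Instead of Thread.Sleep(500) inside the receive callback,
        // schedule a one-shot timer. The current thread is freed at once;
        // ContinueProcessing fires on some pool thread 500 ms later.
        var timer = new Timer(ContinueProcessing, "message #1",
                              500, Timeout.Infinite);

        // Keep the demo process alive long enough for the timer to fire.
        Thread.Sleep(1000);
        timer.Dispose();
    }
}
```

The same idea applies to any fixed wait inside an APM callback: convert "block, then continue" into "return, and let a timer re-enter the pool".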
I'm trying to create a socket server which can handle a relatively large number of clients. There are different approaches used for such a task: the first is to use a separate thread for each incoming connection, and the second is to use the async/await pattern.
The first approach is bad, as the system can only sustain a relatively small number of threads before its resources are lost to context switching.
The second approach looks good at first sight: we can have our own thread pool with a limited number of worker threads. A dispatcher receives incoming connections, adds them to some queue, and calls the async socket read methods, which in turn receive data from the socket and add this data/errors to a queue for further processing (error handling, client responses, DB-related work).
There is not much info on the internal implementation of async/await that I could find, but as I understand it, in a non-UI application all continuations go through TaskScheduler.Current, which uses the runtime's thread pool, so its resources are limited. A greater number of incoming connections will either leave no free threads in the runtime's thread pool, or grow the thread count so large that the system stops responding.
In this respect, async/await will result in the same problem as the 1-client/1-thread approach, though with the small advantage that the runtime thread pool's threads may not occupy as much address space as a default System.Threading.Thread (I believe 1 MB of stack plus roughly 0.5 MB of control data).
Is there any way I can make one thread wait for a kernel notification on, say, 10 sockets, so the application only uses my explicitly sized thread pool? (I mean that when further data arrives on one of the 10 sockets, one thread wakes up and handles it.)
In this respect, async/await will result in the same problem as the 1-client/1-thread approach
When a thread reaches code that runs asynchronously, control is returned to the caller. That means the thread goes back to the thread pool and can handle another request, so this is superior to 1-client/1-thread: the thread is never blocked while waiting for I/O.
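To illustrate the point, here is a minimal loopback sketch (all names are made up) that wraps BeginReceive/EndReceive into a Task via Task.Factory.FromAsync and awaits it. At the `await`, the calling thread is released; the continuation resumes on some pool thread only when the OS signals that data arrived.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

class AsyncReceiveSketch
{
    // Await one receive without tying up a thread while data is pending.
    static async Task<string> ReceiveOneAsync(Socket socket)
    {
        var buffer = new byte[4096];
        // Control returns to the caller here; when the I/O completes,
        // the continuation resumes on a (possibly different) pool thread.
        int read = await Task.Factory.FromAsync<int>(
            socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, null, null),
            socket.EndReceive);
        return Encoding.ASCII.GetString(buffer, 0, read);
    }

    static void Main()
    {
        // Loopback demo: one listener, one client, one message.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            using (var server = listener.AcceptTcpClient())
            {
                Task<string> pending = ReceiveOneAsync(server.Client);
                client.GetStream().Write(Encoding.ASCII.GetBytes("hello"), 0, 5);
                Console.WriteLine(pending.Result);
            }
        }
        listener.Stop();
    }
}
```

The single small message on a loopback connection arrives in one read here; production code would loop until a full message is assembled.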
There are some interesting blog posts about async/await:
1
The .NET Socket async API manages threads automatically when using the BeginXXX methods. For example, if I have 100 active connections sending and receiving TCP messages, only around 3 threads will be used. And that makes me curious.
How does the API do this thread management?
How is the flow of connections divided among the threads to be processed?
How does the manager prioritize which connections/reads/writes must be processed first?
My questions may not make sense because I don't know how it works or what to ask specifically, so sorry. Basically, I need to know how this whole process works at a low level.
The .NET Socket async API manages threads automatically when using the BeginXXX methods.
This is not quite correct. The APM Begin/End-style socket API does not manage threads at all. Rather, the completion AsyncCallback is called on a random thread, which is the thread where the asynchronous socket I/O operation has completed. Most likely, this is going to be an IOCP pool thread (I/O completion port thread), different from the thread on which you called the BeginXXX method. For more details, check Stephen Cleary's "There Is No Thread".
How does the manager prioritize which connections/reads/writes must be processed first?
The case when there are no IOCP threads available to handle the completion of an async I/O operation is called ThreadPool starvation. It happens when all pool threads are busy executing some code (e.g., processing the received socket messages), or are blocked with a blocking call like WaitHandle.WaitOne(). In this case, the I/O completion routine is queued to the ThreadPool to be executed when a thread becomes available, on a FIFO basis.
You have the option to increase the size of the ThreadPool with the SetMinThreads/SetMaxThreads APIs, but doing so isn't always a good idea. The number of threads actually running concurrently is limited by the number of CPUs/cores anyway, so you'd rather finish any CPU-bound processing as soon as possible and release the thread back to the pool.
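For reference, a small sketch of the knobs mentioned above (the doubling is an arbitrary illustration, not a recommendation): raising the minimum lets the pool inject threads without its usual ramp-up delay, at the cost of keeping more threads around.

```csharp
using System;
using System.Threading;

class PoolTuningSketch
{
    static void Main()
    {
        int worker, iocp;
        ThreadPool.GetMinThreads(out worker, out iocp);
        Console.WriteLine("Min worker: {0}, min IOCP: {1}", worker, iocp);

        // Raising the minimum lets the pool add threads without the usual
        // ramp-up delay; illustrative only, not a tuning recommendation.
        ThreadPool.SetMinThreads(worker * 2, iocp * 2);

        ThreadPool.GetMaxThreads(out worker, out iocp);
        Console.WriteLine("Max worker: {0}, max IOCP: {1}", worker, iocp);
    }
}
```

Note that SetMinThreads/SetMaxThreads affect the whole process, so tune them once at startup rather than per component.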
I'm writing a TCP server in C# and I'm using the BeginXXX and EndXXX methods for async communication. If I understand correctly, when I use BeginXXX the request will be handled on the thread pool (when the request is ready) while the main thread keeps accepting new connections.
The question is what happens if I perform a blocking action in one of these AsyncCallbacks? Will it be better to run a blocking operation as a task? Tasks use the threadpool as well, don't they?
The use case is the following:
The main thread sets up a listening socket which accepts connections using BeginAccept and starts listening on those connections using BeginReceive. When a full message has been received, a function is called depending on what that message was; in 80% of all cases, those functions will start a database query/insertion/update.
I suggest you use SocketAsyncEventArgs, which was introduced in .NET 3.5.
Here's some reading material you can start with
click me
The question is what happens if I perform a blocking action in one of these AsyncCallbacks? Will it be better to run a blocking operation as a task?
If you do that too often or for too long, the ThreadPool will grow, possibly to the point where it crashes your app.
So try to avoid blocking as much as possible, though a little bit of it should be acceptable. Keep in mind that the ThreadPool grows by 1 new thread per 500 ms, so verify that it levels out at some reasonable number of threads.
A blunt instrument could be to cap the MaxThreads of the pool.
Tasks use the threadpool as well, don't they?
Yes, so your options are limited.
I'm writing a TCP server, and at the very heart of it is a fairly standard bind-listen-accept piece of code nicely encapsulated by TcpListener. The code I'm running in development now works, but I'm looking for some discussion of the thread model I chose:
// Set up the socket listener
// *THIS* is running on a System.Threading.Thread, of course.
tpcListener = new TcpListener(IPAddress.Any, myPort);
tpcListener.Start();
while (true)
{
    Socket so = tpcListener.AcceptSocket();
    try
    {
        MyWorkUnit work = new MyWorkUnit(so);
        BackgroundWorker bw = new BackgroundWorker();
        bw.DoWork += new DoWorkEventHandler(DispatchWork);
        bw.RunWorkerCompleted +=
            new RunWorkerCompletedEventHandler(SendReply);
        bw.RunWorkerAsync(work);
    }
    catch (System.Exception ex)
    {
        EventLogging.WindowsLog("Error caught: " +
            ex.Message, EventLogging.EventType.Error);
    }
}
I've seen good descriptions of which thread model to pick (BackgroundWorker, a plain Thread, or the ThreadPool), but none of them for this kind of situation. A nice summary of the pros and cons of each is in backgroundworker-vs-background-thread
(second answer). In the sample code above, I picked BackgroundWorker because it was easy. It's time to figure out whether this is the right way to do it.
These are the particulars of this application, and they're probably pretty standard for most transaction-like TCP servers:
Not a Windows Forms app. In fact, it's run as a Windows Service.
(I'm not sure whether the spawned work needs to be a foreground thread or not. I'm running them as background now and things are okay.)
Whatever priority assigned to the thread is fine as long as the Accept() loop gets cycles.
I don't need a fixed ID for the threads for later Abort() or whatever.
Tasks run in the threads are short -- seconds at most.
Potentially lots of tasks could hit this loop very quickly.
A "graceful" way of refusing (or queuing) new work would be nice if I'm out of threads.
So which is the right way to go on this?
For this main thread, use a dedicated Thread object. It is long-running, which makes it less suitable for the ThreadPool (and a BackgroundWorker uses the ThreadPool).
The cost of creating it doesn't matter here, and you (may) want full control over the properties of the Thread.
Edit
And for the incoming requests, you can use the ThreadPool (directly or through a BackgroundWorker), but note that this may affect your throughput. When all threads are busy, there is a delay (0.5 s) before an extra thread is created. This ThreadPool behaviour might be useful, or not. You can tweak MinThreads to control it somewhat.
It's crude, but if you were to create your own threads for the spawned tasks you might have to come up with your own throttling mechanism.
It all depends on how many requests you expect, and how big they are.
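One possible throttle of the kind hinted at above is a semaphore in front of the pool, which also gives the "graceful refusal/queuing" the question asked for. This is a sketch; the concurrency limit of 4 and all names are hypothetical, and the Thread.Sleep stands in for real request work.

```csharp
using System;
using System.Threading;

class ThrottledDispatchSketch
{
    // Allow at most 4 requests to run concurrently; further QueueRequest
    // calls block in the accept loop until a slot frees up, which is one
    // crude way to queue new work when the server is saturated.
    static readonly Semaphore slots = new Semaphore(4, 4);
    static int processed = 0;

    static void QueueRequest(int requestId)
    {
        slots.WaitOne();                     // throttle: wait for a free slot
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                Thread.Sleep(50);            // stand-in for real request work
                Interlocked.Increment(ref processed);
            }
            finally
            {
                slots.Release();             // free the slot for the next one
            }
        });
    }

    static void Main()
    {
        for (int i = 0; i < 20; i++) QueueRequest(i);
        while (Thread.VolatileRead(ref processed) < 20) Thread.Sleep(10);
        Console.WriteLine("Processed {0} requests", processed);
    }
}
```

With a timeout on WaitOne instead of an unbounded wait, the same structure can refuse work outright rather than queue it.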
BackgroundWorker seems to be a decent choice here for your workers. My only caveat would be to make sure you aren't blocking for network traffic on those threads themselves. Use the async methods for sending/receiving there, as appropriate, so that ThreadPool threads are not blocked on network traffic.
It's fine (appropriate, really) for those threads to be Background threads, too. The only real difference is that under normal circumstances, a Foreground thread will keep a process alive.
Also, I don't think you mentioned the main thread which accepts these connections; that one is appropriate for a regular System.Threading.Thread instance.
Well, personally none of those would be my first choice. I tend to prefer asynchronous IO operations by taking advantage of Socket.BeginReceive and Socket.BeginSend and let the underlying IO completion ports do all of the threading for you. But, if you would prefer to use synchronous IO operations then shuttling them off to the ThreadPool or a Task (if using .NET 4.0) would be the next best option.
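For reference, a minimal sketch of the Begin/End receive pattern this answer prefers (class name and buffer size made up): the callback drains whatever arrived, hands it off, and immediately re-arms the receive, so no thread is dedicated to the connection between messages. The Main method is just a loopback demo driving the loop.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class ReceiveLoopSketch
{
    private readonly Socket socket;
    private readonly byte[] buffer = new byte[4096];

    public ReceiveLoopSketch(Socket connected)
    {
        socket = connected;
        // Arm the first receive; no thread waits on this connection.
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                            OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        int read = socket.EndReceive(ar);
        if (read == 0) { socket.Close(); return; }   // peer closed

        // Hand off to real processing here; keep the callback short so
        // the IOCP thread goes back to the pool quickly.
        Console.WriteLine(Encoding.ASCII.GetString(buffer, 0, read));

        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                            OnReceive, null);        // re-arm
    }

    // Loopback demo: push one message through the receive loop.
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        Socket serverSide = listener.AcceptSocket();

        new ReceiveLoopSketch(serverSide);
        client.GetStream().Write(Encoding.ASCII.GetBytes("ping"), 0, 4);

        Thread.Sleep(500);                           // let the callback run
        client.Close();
        listener.Stop();
    }
}
```

Note that a real server would also frame messages (TCP is a byte stream, so one receive is not necessarily one message).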
I know .NET has a good asynchronous model for network I/O, which uses completion ports under the hood. But all the callbacks happen on I/O threads from the thread pool.
Is there any .NET counterpart to Java's Selector, which deals with multiple streams in a single thread? Or do the thread-pool callbacks scale better than this single-threaded approach?
Thanks,
Dodd
For async operations, an I/O handle is associated with the thread pool. When the async operation completes, the callbacks (for each stream) may or may not execute on the same thread; any available thread-pool thread could process the callback, and it's quite possible that the same thread processes multiple callbacks, or just one, depending on runtime conditions.
Hope this helps
EDIT: Adding reply to Dodd's comment
I'm not intimately familiar with the Selector API, but from looking at an example posted here, it seems that Selector waits until all events occur. Is that true? If so, the caller would have to wait for all events to occur even when one event occurs sooner than another.
But if the Selector works by processing an event as soon as it occurs, you could run into a situation where the selector is processing the callback for one event while another event arrives (I would imagine the incoming event gets queued somewhere, or else you would be dropping events), and it would still reduce throughput when the events are orthogonal and should be processed as soon as they occur.
The async model in .NET is centered around the thread pool to reduce the overhead of creating new threads (an expensive operation). If you observe that the thread pool is maxing out, you can increase the number of threads in the pool, as documented here. Bear in mind, though, that at the end of the day you are limited by the number of processors: on a dual-core box only 2 threads can actually be running at any instant, with all others blocked, so that might be something to take into account.
Hope this helps.
Thanks Abhijeet,
Well, my concern is that in some extremely busy scenarios, many async callbacks happen simultaneously and we run out of threads: then context switching will be a nightmare. In this particular case, is the async callback the right choice? Or should we use Select()?