Thousands of concurrent packets, ThreadPool vs BeginRead? - c#

I'm running an application that has 5000 instances of the UdpClient class since I need to transmit packets concurrently on different ports.
I'm currently using a System.Timers.Timer, which runs on the ThreadPool. It has a very short Interval with many Elapsed events firing.
Would it be better to modify the application to use the asynchronous BeginReceive/EndReceive methods of the UdpClient?

When you work with I/O operations (network, file system, etc.), you should always use asynchronous operations: BeginXXX/EndXXX, the event-based XXXAsync/XXXCompleted pattern, or XXXAsync methods in the context of async/await. These operations use so-called I/O completion ports. In short, no CPU resources are consumed while the data is in transit. As soon as the requested data arrives, a thread is taken from the ThreadPool and the handler is queued onto it. In your case, you are wasting CPU resources: instead of doing useful work, the threads just wait until the data is transmitted. Also, the ThreadPool keeps only a limited number of threads running (roughly proportional to the number of CPU cores), so on a dual-core machine your design effectively sends/receives data to/from only two clients at a time.
The asynchronous methods might seem very complicated (especially BeginXXX/EndXXX), but there are many wrappers that significantly simplify their usage. For example, you can use Rx's FromAsyncPattern extension method, or you can use the newer async/await asynchronous model.
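To make that concrete, here is a minimal sketch (the class name and the loopback round-trip are illustrative, not taken from the asker's code) of receiving with a UdpClient via async/await instead of timer callbacks. While the `await` is pending, no thread is blocked:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

class UdpAsyncSketch
{
    // One awaiting receive per UdpClient; no timer polling, no blocked thread.
    public static async Task<string> ReceiveOneAsync(UdpClient client)
    {
        UdpReceiveResult result = await client.ReceiveAsync();
        return Encoding.UTF8.GetString(result.Buffer);
    }

    public static async Task Main()
    {
        using (var receiver = new UdpClient(0)) // 0 = let the OS pick a free port
        using (var sender = new UdpClient())
        {
            int port = ((IPEndPoint)receiver.Client.LocalEndPoint).Port;

            // Start awaiting a datagram; this consumes no thread while idle.
            Task<string> pending = ReceiveOneAsync(receiver);

            byte[] payload = Encoding.UTF8.GetBytes("ping");
            await sender.SendAsync(payload, payload.Length,
                                   new IPEndPoint(IPAddress.Loopback, port));

            Console.WriteLine(await pending); // prints "ping"
        }
    }
}
```

With 5000 clients, you would simply start one such receive loop per UdpClient; the thread pool only runs the continuations as datagrams actually arrive.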

Related

On Windows, C#, Does Socket.BeginReceive() take a background thread before or after the data arrives?

If it takes a background thread before the data arrives, then with many connections waiting for data there will be too many threads, causing performance degradation. Is there an approach to wait for data without tying up a thread?
Socket.BeginReceive(), and other asynchronous I/O methods in .NET, make use of the IOCP thread pool. The short version is that this is a very efficient way to manage I/O. There is practically no cost to wait for the I/O to complete, and even once it completes, your completion callback is called from a thread pool thread, tying up that thread only for as long as it takes for your callback to complete.
"IOCP" stands for "IO Completion Ports", a feature in the native Windows API. The basic idea is that you can have a single thread, or some small collection of threads, all ready to service the completion of a large number of I/O operations. This allows I/O operations to scale well into the hundreds of thousands, if not millions, of concurrent operations, while still only requiring a relatively small number of threads to deal with them all.
So, go right ahead and use those asynchronous I/O APIs. They are the best way to write scalable I/O code.
(Aside: the Socket class in particular has a number of async options. Ironically, the methods ending in ...Async do not comply with the new(er) async/await paradigm in C#, but they are in fact the most scalable way to do I/O with a Socket, because not only do they use the IOCP thread pool, they also allow you to reuse your I/O state objects, so you can have a pool of those and minimize GC load.)
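As a rough illustration of the reuse idea in that aside, a pool of `SocketAsyncEventArgs` might look like the sketch below. The pool class, its name, and the buffer size are assumptions for illustration, not a framework API:

```csharp
using System.Collections.Concurrent;
using System.Net.Sockets;

// Hypothetical pool: receive loops rent a pre-buffered SocketAsyncEventArgs,
// use it for Socket.ReceiveAsync(args), and return it afterwards, so a busy
// server allocates (almost) nothing per I/O operation.
class SocketArgsPool
{
    private readonly ConcurrentBag<SocketAsyncEventArgs> _pool =
        new ConcurrentBag<SocketAsyncEventArgs>();
    private readonly int _bufferSize;

    public SocketArgsPool(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    public SocketAsyncEventArgs Rent()
    {
        SocketAsyncEventArgs args;
        if (!_pool.TryTake(out args))
        {
            // Pool empty: create a fresh instance with its own buffer.
            args = new SocketAsyncEventArgs();
            args.SetBuffer(new byte[_bufferSize], 0, _bufferSize);
        }
        return args;
    }

    public void Return(SocketAsyncEventArgs args)
    {
        _pool.Add(args); // buffer stays attached for the next renter
    }
}
```

The point is exactly the one made above: the I/O state objects (and their buffers) live as long as the server, so the GC sees a steady state instead of a garbage churn proportional to the request rate.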

C# Asynchronous Socket Read Without Using Runtime's Threadpool

I'm trying to create a socket server that can handle a relatively large number of clients. There are different approaches used for such a task: the first is to use a separate thread for each incoming connection, and the second is to use the async/await pattern.
The first approach is bad because the system can sustain only a relatively small number of threads, and its resources are wasted on context switching.
At first sight the second approach is good: we can have our own thread pool with a limited number of worker threads, so a dispatcher receives incoming connections, adds them to some queue, and calls async socket-read methods, which in turn receive data from the socket and add this data/errors to a queue for further processing (error handling, client responses, DB-related work).
There is not much info I could find on the internal implementation of async/await, but as I understand it, in a non-UI application every continuation goes through TaskScheduler.Current, which uses the runtime's thread pool, so its resources are limited. A greater number of incoming connections will result in no free threads in the runtime's thread pool, or the thread count will grow so large that the system stops responding.
In this regard async/await results in the same problem as the 1-client/1-thread approach, with the small advantage that the runtime thread pool's threads may not occupy as much address space as a default System.Threading.Thread (I believe 1 MB stack size + ~1/2 MB of control data).
Is there any way I can make one thread wait on a kernel notification for, say, 10 sockets, so the application uses only my explicitly sized thread pool? (I mean that when further data arrives on one of the 10 sockets, one thread wakes up and handles it.)
In this regard async/await results in the same problem as the 1-client/1-thread approach
When a thread reaches code that runs asynchronously, control is returned to the caller; that means the thread is returned to the thread pool and can handle another request. That is exactly why it is superior to 1-client/1-thread: the thread is never blocked.
There are some interesting blog posts about async/await worth reading.
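A minimal sketch of the idea (class and method names are illustrative): one async accept loop plus one async handler per connection. Each `await` hands the thread back to the pool, so a handful of pool threads can serve thousands of mostly idle clients:

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical echo server: the listener must already be started by the
// caller. There is no thread per client -- only an async state machine.
class AsyncEchoServer
{
    public static async Task AcceptLoopAsync(TcpListener listener)
    {
        while (true)
        {
            TcpClient client = await listener.AcceptTcpClientAsync();
            _ = HandleClientAsync(client); // fire-and-forget, one per client
        }
    }

    static async Task HandleClientAsync(TcpClient client)
    {
        using (client)
        {
            NetworkStream stream = client.GetStream();
            var buffer = new byte[4096];
            int read;
            // While this await is pending, no thread is blocked waiting.
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                await stream.WriteAsync(buffer, 0, read); // echo back
        }
    }
}
```

The fire-and-forget (`_ = HandleClientAsync(...)`) keeps the accept loop responsive; in production code you would also catch and log exceptions inside the handler.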

ThreadPool and methods with while(true) loops?

The ThreadPool recycles threads for better performance than creating many instances of the Thread class. However, how does this apply to methods with while loops running on the ThreadPool?
As an example, if we were to assign a ThreadPool thread to a client that has connected to a TCP server, that client would need a while loop to keep checking for incoming data. The loop can be exited to disconnect the client, but only if the server closes or if the client requests a disconnection.
If that is the case, then how would having a ThreadPool help when masses of clients connect? Either way the same amount of memory is used if the clients stay connected. If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
On the other hand it was suggested to me to use the Network.BeginReceive and NetworkStream.EndReceive asynchronous methods to avoid threads altogether to save RAM usage and CPU usage. Is this true or not?
Either way the same amount of memory is used if the clients stay connected.
So far this is true. It's up to your app to decide how much state it needs to keep per client.
If they stay connected, then the threads cannot be recycled. If so, then ThreadPool would not help much until a client disconnects and opens up a thread to recycle.
This is untrue, because it assumes that all interesting operations performed by these threads are synchronous. This is a naive mode of operation, and in fact real world code is asynchronous: a thread makes a call to request an action and is then free to do other things. When a result is made available as a result of that action, some thread looking for other things to do will run the code that acts on the result.
On the other hand it was suggested to me to use the Network.BeginReceive and NetworkStream.EndReceive asynchronous methods to avoid threads altogether to save RAM usage and CPU usage. Is this true or not?
As explained above, async methods like these will allow you to service a potentially very large number of clients with only a small number of worker threads -- but by itself it will do nothing to either help or hurt the memory situation.
You are correct: slow blocking code can cause poor performance on both the client side and the server side. You can run slow work on a separate thread, and that might work well enough on the client side, but it may not help on the server side. Blocking methods in the server can diminish its overall performance, because they can lead to a situation where the server has a large number of threads running, all of them blocked; even a simple request might then end up taking a long time. It is better to use asynchronous APIs when they are available for slow-running tasks, as in your situation. (Note: even if an asynchronous operation is not available, you can implement one by writing a custom awaiter class.) This is better for clients as well as servers. The main point of asynchronous code is to reduce the number of threads: reducing the number of threads needed to handle a given number of clients lets the server keep more requests in progress simultaneously, which improves scalability.
If you don't need fine-grained control over the threads or the thread pool, you can go with the asynchronous approach.
Also, each thread reserves about 1 MB of address space for its stack (not heap), so asynchronous methods will definitely help reduce memory usage. However, I think the nature of the work you have described here is going to take pretty much the same amount of time with either the multi-threaded or the asynchronous approach.
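To make the answers above concrete, here is a sketch of the question's while loop rewritten with `ReadAsync` (the class and method names are illustrative). The loop shape survives unchanged; the difference is that no thread is held while waiting for data:

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical per-client pump: same while loop as the blocking version,
// but each await releases the thread, so idle clients cost no thread.
class ClientPump
{
    public static async Task<long> PumpAsync(NetworkStream stream)
    {
        var buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            total += read; // process the received bytes here instead
        return total;      // ReadAsync returning 0 means the client disconnected
    }
}
```

The exit condition the question worries about falls out naturally: when the client disconnects, `ReadAsync` completes with 0 bytes and the loop ends, with no thread ever having been parked on the socket.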

Using delegates in C# .Net, what happens when I run out of threads in the .Net threadpool?

I'm making a multi-threaded application that uses delegates to handle the processing of requests in a WCF service. I want clients to be able to send a request, disconnect, and then wait for a callback announcing that the work is done (which will most likely involve searching through a database). I don't know how many requests may come in at once; it could be one every once in a while, or it could spike to dozens.
As far as I know, .Net's threadpool has 25 threads available to use. What happens when I spawn 25 delegates or more? Does it throw an error, does it wait, does it pause an existing operation and start working on the new delegate, or some other behavior?
Beyond that, what happens if I want to spawn up to or more than 25 delegates while other operations (such as incoming/outgoing connections) want to start, and/or when another operation is working and I want to spawn another delegate?
I want to make sure this is scalable without being too complex.
Thanks
All operations are queued (I am assuming that you are using the threadpool directly or indirectly). It is the job of the threadpool to munch through the queue and dispatch operations onto threads. Eventually all threads may become busy, which will just mean that the queue will grow until threads are free to start processing queued work items.
You're confusing delegates with threads, and both with the number of concurrent connections.
With WCF 2-way bindings, the connection remains open while waiting for the callback.
IIS 7 or above, on modern hardware should have no difficulty maintaining a few thousand concurrent connections if they're sitting idle.
Delegates are just method pointers - you can have as many as you wish. That doesn't mean they're being invoked concurrently.
If you are using ThreadPool.QueueUserWorkItem then it just queues the extra items until a thread is available.
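A small sketch demonstrating that queueing behaviour (the class name is illustrative): queue far more work items than the pool has threads, and observe that nothing errors out; surplus items simply wait in the queue until a thread frees up, and every item eventually runs:

```csharp
using System.Threading;

// Hypothetical demo of ThreadPool queueing under load.
class PoolQueueDemo
{
    public static int RunItems(int count)
    {
        int completed = 0;
        using (var done = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                // Far more items than pool threads: the extras are queued.
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed);
                    done.Signal();
                });
            }
            done.Wait(); // blocks until all 'count' items have run
        }
        return completed;
    }
}
```

Running `PoolQueueDemo.RunItems(1000)` completes all 1000 items even though the pool keeps only a small number of threads busy at any moment; no exception is thrown and no work is dropped.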
The ThreadPool's default maximum number of threads is 250 per processor (since .NET 2.0 SP1), not 25. You can also set a higher limit with ThreadPool.SetMaxThreads if you need to.
If your ThreadPool runs out of threads, extra operations are simply queued until a thread becomes available; as running work items finish, their threads return to the pool and pick up the queued items, providing you with new resources.
However, you can also create Threads without using the ThreadPool.

Async threaded tcp server

I want to create a high-performance server in C# that could take about ~10k clients. I started writing a TcpServer in C#, and for each client connection I open a new thread. I also use one thread to accept the connections. So far so good; it works fine.
The server has to deserialize incoming AMF objects, do some logic (like saving the position of a player) and send some objects back (serializing objects). I'm not worried about the serializing/deserializing part at the moment.
My main concern is that I will have a lot of threads with 10k clients, and I've read somewhere that an OS can only hold a few hundred threads.
Are there any sources/articles available on writing a decent async threaded server? Are there other possibilities, or will 10k threads work fine? I've looked on Google, but I couldn't find much info about design patterns or approaches that explain it clearly.
You're going to run into a number of problems.
You can't spin up 10,000 threads, for a couple of reasons: it'll thrash the kernel scheduler, and if you're running a 32-bit process, the default stack address space of 1 MB means that 10k threads will reserve about 10 GB of address space. That'll fail.
You can't use a simple select system either. At its heart, select is O(N) in the number of sockets; with 10k sockets, that's bad.
You can use I/O completion ports. This is the scenario they're designed for. To my knowledge there is no stable, managed I/O completion port library, so you'll have to write your own using P/Invoke or Managed C++. Have fun.
The way to write an efficient multithreaded server is to use I/O completion ports (using a thread per request is quite inefficient, as @Marcelo mentions).
If you use the asynchronous version of the .NET socket class, you get this for free. See this question which has pointers to documentation.
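For illustration, here is a minimal sketch of the classic Begin/End (APM) receive pattern those answers refer to. The wrapper class and its `onData` callback are assumptions for the sketch, not a library API; the underlying `Socket.BeginReceive`/`EndReceive` calls are what ride on the IOCP:

```csharp
using System;
using System.Net.Sockets;

// Hypothetical receive pump: each BeginReceive registers an overlapped
// operation with the IOCP; OnReceive runs on an I/O pool thread only
// once data has actually arrived, then immediately posts the next receive.
class ApmReceiver
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[4096];
    private readonly Action<byte[], int> _onData;

    public ApmReceiver(Socket socket, Action<byte[], int> onData)
    {
        _socket = socket;
        _onData = onData;
    }

    public void Start() =>
        _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None,
                             OnReceive, null);

    private void OnReceive(IAsyncResult ar)
    {
        int read = _socket.EndReceive(ar);
        if (read > 0)
        {
            _onData(_buffer, read); // hand the bytes to the application
            Start();                // post the next receive
        }
        // read == 0 means the peer closed the connection; stop pumping.
    }
}
```

Between `Start()` and the callback, no thread belongs to this connection at all, which is why this pattern scales to thousands of sockets.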
You want to look into using IO completion ports. You basically have a threadpool and a queue of IO operations.
I/O completion ports provide an efficient threading model for processing multiple asynchronous I/O requests on a multiprocessor system. When a process creates an I/O completion port, the system creates an associated queue object for requests whose sole purpose is to service these requests. Processes that handle many concurrent asynchronous I/O requests can do so more quickly and efficiently by using I/O completion ports in conjunction with a pre-allocated thread pool than by creating threads at the time they receive an I/O request.
You definitely don't want a thread per request. Even if you have fewer clients, the overhead of creating and destroying threads will cripple the server, and there's no way you'll get to 10,000 threads; the OS scheduler will die a horrible death long before then.
There are numerous articles online about asynchronous server programming in C# (e.g., here). Just google around a bit.