In my project I have a cloud-hosted virtual machine running a C# application that needs to:
accept TCP connections from several external clients (approximately 500)
receive data asynchronously from the connected clients (not high frequency, approximately 1 message per minute)
do some processing on received data
forward received data to other actors
reply back to connected clients and possibly do some asynchronous sending (based on internal time-checks)
The design seems to me quite straightforward: I provide a listener that accepts incoming TCP connections; when a new connection is established, a new thread is spawned. That thread runs in a loop (performing steps 2 to 5) and checks that its socket is still alive (if the socket is dead, the thread exits the loop and eventually terminates; later, a new connection will be attempted by the external client the socket belonged to).
So now the issue: with a limited number of external clients (say 200-300) everything runs smoothly, but as that number grows (or when the clients send data at a higher frequency) the communication becomes very slow and congested.
I was thinking about a better design, for example:
using Tasks instead of Threads
using the ThreadPool
replacing the 1-thread-1-socket model with something like 1 thread per 10 sockets
or even some scaling strategies:
open two different TCP listeners (on different ports) within the same application (reconfiguring clients so that half of them target each listener)
run two identical applications, each with its own TCP listener (on different ports), on the same virtual machine
set up two different virtual machines running the same application (reconfiguring clients so that half of them target each virtual machine's address)
Finally, the questions: is the current design poor or naive? Do you see any major problems in the way I handle the communication? Do you have a more robust and efficient option (among those mentioned above, or any other)?
Thanks
The number of listeners is unlikely to be a limiting factor. Here at Stack Overflow we handle ~60k sockets per instance, and the only reason we need multiple listeners is so we can split the traffic over multiple ports to avoid ephemeral port exhaustion at the load balancer. Likewise, I should note that those 60k per-instance socket servers run at basically zero CPU, so: it is premature to think about multiple exes, VMs, etc. That is not the problem. The problem is the code, and distributing a poor socket infrastructure over multiple processes just hides the problem.
Writing high performance socket servers is hard, but the good news is: you can avoid most of this. Kestrel (the ASP.NET Core http server) can act as a perfectly good TCP server, dealing with most of the horrible bits of async, sockets, buffer management, etc for you, so all you have to worry about is the actual data processing. The "pipelines" API even deals with back-buffers for you, so you don't need to worry about over-read.
An extensive walkthrough of this is in my 3-and-a-bit part blog series starting here - it is simply way too much information to try and post here. But it links through to a demo server - a dummy redis server hosted via Kestrel. It can also be hosted without Kestrel, using Pipelines.Sockets.Unofficial, but... frankly I'd use Kestrel. The server shown there is broadly similar (in terms of broad initialization - not the actual things it does) to our 60k-per-instance web-socket tier.
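To give a flavour of that shape, here is a minimal sketch (class name and port are mine, not from the blog series) of a Kestrel ConnectionHandler that reads and writes through the pipelines API; Kestrel accepts the TCP connections and calls OnConnectedAsync for each one:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Connections;

public class EchoConnectionHandler : ConnectionHandler
{
    public override async Task OnConnectedAsync(ConnectionContext connection)
    {
        while (true)
        {
            // Kestrel owns the socket and the buffers; we just read the pipe.
            var result = await connection.Transport.Input.ReadAsync();
            var buffer = result.Buffer;

            foreach (var segment in buffer)
            {
                // Placeholder processing: echo the bytes straight back.
                await connection.Transport.Output.WriteAsync(segment);
            }

            // Report what we consumed so the pipe can recycle its buffers.
            connection.Transport.Input.AdvanceTo(buffer.End);

            if (result.IsCompleted) break; // client closed the connection
        }
    }
}

The handler is attached to a raw TCP endpoint when configuring Kestrel, along the lines of options.ListenAnyIP(8007, o => o.UseConnectionHandler<EchoConnectionHandler>()).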
Related
Intro:
I have developed a server (TCP listener) program in C# which runs at a specific (constant) IP address on a specific (constant) port. This server receives data packets sent by client programs, processes them, and sends a data packet back on the same socket. The number of client programs for this server may be in the hundreds:
Code Sample
while (true)
{
    socket = Host.AcceptSocket();   // blocks until the next client connects
    int len = socket.Receive(msgA); // read the request
    performActivities(msgA);        // may take up to 5 seconds
    socket.Send(msgB);              // reply to the same client
    socket.Close();
}
Problem:
Due to some critical business requirements, processing may take up to 5 seconds, so any other requests arriving during that time are queued. I need to avoid this, so that every request is served within 5 seconds.
Query:
I can make it multi-threaded, but (pardon me if you find me a novice):
how will one socket receive another packet from a different client if it is still held open by a previous thread?
when serving multiple requests at once, how can I make sure each response is sent back to the correct client?
Building an efficient, multi-threaded socket server requires strong knowledge and skills in that area. My proposal: instead of trying to build your own TCP server from scratch, use one of the existing libraries that have already solved this problem. A few that come to mind are:
DotNetty, used on Azure IoT services.
System.IO.Pipelines, which is experimental but already quite fast.
Akka.Streams TCP stream.
Each one of those libs covers things like:
Management of a TCP connection lifecycle.
Efficient management of byte buffers. Allocating a new byte[] for every packet is highly inefficient and causes a lot of GC pressure.
Safe access to socket API (sockets are not thread-safe by default).
Buffering of incoming packets.
Abstractions in the form of handlers/pipes/stages that let you compose and manipulate the binary payload for further processing. This is particularly useful when you want to operate on the incoming data in terms of messages - by default TCP is a binary stream, and it doesn't know where one message ends and the next one starts (see the framing sketch below).
Writing a production-ready TCP server is a tremendous amount of work. Unless you're an expert in network programming with very specific requirements, you should never write one from scratch.
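To make the framing point concrete, here is a minimal sketch of length-prefixed framing on top of System.IO.Pipelines; it assumes each message is a 4-byte big-endian length followed by the payload, and ProcessMessage is a placeholder for your own handling:

using System;
using System.Buffers;
using System.Buffers.Binary;
using System.IO.Pipelines;
using System.Threading.Tasks;

static async Task ReadMessagesAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;

        // Peel off as many complete messages as the buffer contains.
        while (TryReadMessage(ref buffer, out ReadOnlySequence<byte> message))
            ProcessMessage(message);

        // Consumed what we parsed, examined everything we saw.
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted) break;
    }
    reader.Complete();
}

static bool TryReadMessage(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> message)
{
    if (buffer.Length >= 4)
    {
        Span<byte> prefix = stackalloc byte[4];
        buffer.Slice(0, 4).CopyTo(prefix);
        int length = BinaryPrimitives.ReadInt32BigEndian(prefix);

        if (buffer.Length >= 4 + length)
        {
            message = buffer.Slice(4, length);
            buffer = buffer.Slice(4 + length); // advance past this message
            return true;
        }
    }
    message = default;
    return false;
}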
I have a C# application that is a TCP server listening on a port. GPS devices connect to this port. The application accepts each TCP client and creates a new thread for it. The client ID is maintained in a hash table that is updated when a client connects. This was all working fine until around 400 units. Once the number of units grew beyond that, the server was unable to handle all the connections: connections are continuously dropped, and once in a while this eats up the server's CPU and memory and brings it down. The workaround was to open another instance of the TCP server listening on a different port and divert some units to it. Currently some 1800 units are somehow running across 8 different ports. The server is extremely unstable, units are still unable to stay connected, and we face too many issues on a daily basis. We also use remoting to send settings via the remoting port - this works only sometimes.
Please help by suggesting a TCP socket/threading/thread-pool design that is both scalable and robust and can run on a single port.
This TCP server is running in Windows server 2008 R2 Enterprise with IIS7 and SQL server 2008.
Processor: Intel Xeon CPU E3-1270 V2 @ 3.50GHz
RAM: 32GB
System: 64-bit operating system
Thanks
Jonathan
Basically, don't use a thread per socket; use one of the async APIs (BeginReceive / ReceiveAsync), or some kind of socket polling (Socket.Select, for example, although note that this is implemented in a very awkward way; when I use it, I actually use P/Invoke to get at the raw underlying API). Right at this moment, I have > 30k sockets per process talking to our web-sockets server (which is implemented via Socket). Note that for OS reasons we do split that over a few different ports - mainly due to limitations of our load balancer.
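As a rough sketch of that shape (port, buffer size, and method names are illustrative, not from any particular codebase): each connection becomes an awaitable state machine instead of a dedicated thread, so idle sockets cost almost nothing.

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

static async Task RunServerAsync()
{
    var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    listener.Bind(new IPEndPoint(IPAddress.Any, 8007));
    listener.Listen(512);

    while (true)
    {
        Socket client = await listener.AcceptAsync();
        _ = HandleClientAsync(client); // fire-and-forget; log faults in real code
    }
}

static async Task HandleClientAsync(Socket client)
{
    var buffer = new byte[4096];
    try
    {
        while (true)
        {
            int read = await client.ReceiveAsync(new ArraySegment<byte>(buffer), SocketFlags.None);
            if (read == 0) break; // orderly disconnect
            // ... parse buffer[0..read] and reply as needed ...
        }
    }
    finally
    {
        client.Close();
    }
}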
One thread per connection is not a really good idea, especially when you have to handle hundreds of clients concurrently. Asynchronous is the way to go, combined with some buffer pooling/management. If you are looking for somewhere to start with asynchronous sockets, have a look at this basic implementation; if you are looking for something complete, take a look at this (explanation: here).
If you are willing, check this out too.
In C# you can go with the classical BeginXXX/EndXXX methods. Microsoft also has a High Performance Socket API, which can be leveraged using the XXXAsync methods. A few articles that explain the High Performance Socket API: here and here.
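A bare-bones sketch of the SocketAsyncEventArgs pattern those articles describe (buffer size is arbitrary; production code pools the event args and their buffers rather than allocating per connection):

using System.Net.Sockets;

static void StartReceiving(Socket socket)
{
    var args = new SocketAsyncEventArgs();
    args.SetBuffer(new byte[4096], 0, 4096);
    args.Completed += (_, e) => OnReceive(socket, e);

    // ReceiveAsync returns false when it completed synchronously; in that
    // case the Completed event is not raised, so handle the result inline.
    if (!socket.ReceiveAsync(args))
        OnReceive(socket, args);
}

static void OnReceive(Socket socket, SocketAsyncEventArgs e)
{
    if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
    {
        socket.Close();
        return;
    }

    // ... process e.Buffer[e.Offset .. e.Offset + e.BytesTransferred] ...

    if (!socket.ReceiveAsync(e)) // re-issue the receive
        OnReceive(socket, e);
}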
When I was experimenting with C# and WCF, one of the things I kept reading was how unscalable it is to have clients hold a constant connection to the server. And although WCF allows that, it seems the recommended best practice is to use 'per call' rather than 'per session' instance management if you want any kind of decent scalability. (Please correct me if I'm wrong.)
However, from what I understand, IRC uses constant client connections to the server, and IRC servers (well, networks of servers) service hundreds of thousands of clients at any given time. So in that case, is there actually anything 'bad' about keeping constant client connections to the server?
As long as you don't follow the one-thread-per-connection architecture, a server can support quite a large number of concurrent TCP connections.
IRC doesn't require much per connection state, beyond the TCP send and receive windows.
If you need real-time duplex communication (IRC is a chat protocol), then keeping a TCP connection alive is a relevant option. However, a TCP connection brings network overhead, and operating systems have practical limits on the number of concurrently open TCP connections. WCF is commonly used in SOAP/HTTP/RPC contexts where duplex communication is not required, but it certainly offers suitable bindings and channels for that as well. To answer your question: there is nothing bad about keeping the connection open if you have real-time, duplex requirements for your communication.
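For completeness, a hedged sketch of what that looks like in WCF - a duplex contract (typically hosted over NetTcpBinding) where the server can call back into the connected client; the interface and operation names here are illustrative:

using System.ServiceModel;

[ServiceContract(CallbackContract = typeof(IClientCallback))]
public interface IRealTimeService
{
    [OperationContract(IsOneWay = true)]
    void Subscribe();
}

public interface IClientCallback
{
    // The server invokes this on the client over the same open connection.
    [OperationContract(IsOneWay = true)]
    void OnServerEvent(string payload);
}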
Yes, such an architecture is feasible, but... the "ping? pong!" thing was invented for a reason: to let both parties know that the other party is still there. Otherwise you cannot tell whether a client is idle because it simply has nothing to say, or because it has actually disconnected and you are waiting for a TCP timeout.
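If you'd rather lean on the transport than implement an application-level ping/pong, TCP keep-alive at least detects dead peers sooner; a minimal sketch, assuming socket is an already-connected Socket (the two tuning options require .NET Core 3.0 or later):

using System.Net.Sockets;

// Enable OS-level keep-alive probes on an otherwise idle connection.
socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);

// .NET Core 3.0+: start probing after 60s of idle time, then every 5s.
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveTime, 60);
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.TcpKeepAliveInterval, 5);

Note that an application-level ping/pong is still more reliable: it proves the peer process is responsive, not merely that its TCP stack is.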
Update: "hundreds of thousands of clients" is possible on IRCnet only because of server networks. For a single machine, the C10K problem is still an issue.
I am required to create a high-performance application that will receive 500 socket messages from my socket clients simultaneously. Based on my logs, I can see that my dual-core system is processing 80 messages at a time.
I am using async sockets (BeginReceive) and I have set NoDelay to true.
From the logs of my clients and my server, I can see that a message written by a client is read by the server only after 3-4 seconds.
The service time of my application should be much lower.
First, you should post your current code so any potential bugs can be identified.
Second, if you're on .NET 3.5, you might want to look at the SocketAsyncEventArgs enhancements.
Start by looking at your resource usage:
CPU usage - both on the overall system, as well as your specific process.
Memory usage - same as above.
Networking statistics.
Once you identify where the bottleneck is, both the community and you will have an easier time deciding what to focus on next to improve performance.
A review of your code may also be necessary - but this may be more appropriate for https://codereview.stackexchange.com/.
When you call Socket.Listen, what is your backlog set to? I can't speak to .NET 4.0, but with 2.0 I have seen a problem where, once your backlog fills up (too many connection attempts too fast), some sockets get a TCP accept followed by a TCP reset. The client then may or may not attempt to reconnect later. This causes a connection bottleneck rather than a data-throughput or processing bottleneck.
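For reference, the backlog is just the argument to Socket.Listen (or the TcpListener.Start overload); a small sketch with arbitrary values:

using System.Net;
using System.Net.Sockets;

var listener = new TcpListener(IPAddress.Any, 9000);

// Allow up to 500 pending connection attempts to queue before the OS
// starts refusing or resetting new ones.
listener.Start(backlog: 500);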
We're in the process of moving our .NET platform from MSMQ to ActiveMQ. We pump 30+ million persistent messages through it a day, so throughput and capacity are critical to us. Our MSMQ-dependent applications are configured to write to local/private queues first; a local service then routes those messages to their respective remote queues for processing. This ensures the initial enqueue/write is fast (yes, we can also use async enqueueing), and messages aren't lost if the remote servers are unavailable.
We were going to use the same paradigm for ActiveMQ, but now we've decided to move to VMs with NAS storage for most of our application servers. This greatly reduces the write performance of each message, since it's going to NAS, and I feel I need to rethink our approach to queueing. I'd like to know what is considered best practice for using ActiveMQ with persistent, high-throughput needs. Should I consider using dedicated queue servers (that aren't VMs)? That would mean all writes from the application go directly over the network. And how do I deal with high-availability requirements?
Any suggestions are appreciated.
You can deploy ActiveMQ instances in a network of brokers, and the topology can include local instances as well as remote instances. I have deployed topologies containing a local instance of ActiveMQ so that messages are persisted as close to the sender as possible, and then forwarded to remote ActiveMQ instances based on demand. With this style of topology, I recommend configuring the network connector(s) to restrict which destinations may be forwarded: instead of openly allowing the forwarding of messages for all destinations, you can narrow the set of forwarded messages using the excludedDestinations property.
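As an illustrative (not production) snippet, the local broker's XML configuration might exclude local-only destinations from forwarding like this - the broker URI and destination names are placeholders:

<networkConnectors>
  <networkConnector name="to-remote" uri="static:(tcp://remote-broker:61616)">
    <excludedDestinations>
      <queue physicalName="local.>"/>
    </excludedDestinations>
  </networkConnector>
</networkConnectors>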
As far as high availability with ActiveMQ, the master/slave configuration is designed for exactly this. It comes in three flavors depending on your needs.
Hope that helps.
Bruce