I'm developing a system for exchanging data between client and server application using sockets via TCP.
The server was written long ago in C++ (by a third party); the client is the application I'm still developing, in C#. The basics are already done, and both applications communicate properly. The problem is that data is exchanged (in my opinion) too slowly.
I emphasize "in my opinion" because I have no idea what you can actually expect from sockets over TCP. Currently, eyeballing it after several tests, the communication speed averages 200-250 KB per second, both receiving and sending.
Are there applications that let you measure the speed of data exchange between two endpoints on a specific port? What speed should I expect?
More specifically, the client and server are built for file sharing. Currently, the client can receive and send only one file at a time. The protocol would allow otherwise (receiving/sending multiple files at once), but for safety reasons I chose not to: if you're receiving more than one file and you lose the connection, you lose every file in flight.
Could changing this feature significantly affect the speed of data exchange? In what way?
First, you have to know the bandwidth of the link your host provides in order to know what to expect. If you are hosting it yourself, your upload speed may be far slower than your download speed. You can log into your server and run a speed test.
I had a situation similar to yours (developed a server and a client to send binary data to it) and by that time (2-3 years ago) I used a version of this tool to measure the speed of the data exchange. You would install it on your server, set up the monitor (ports you want to watch, etc), configure the charts and let it run for some time. It worked really well. You can look for other similar tools searching for "Bandwidth Traffic Monitor".
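If you'd rather measure from your own code, a rough throughput probe is easy to write. This is a minimal C# sketch (the host, port, and test duration are placeholders, not values from your setup):

    // Read from a socket for ten seconds and report average KB/s.
    using System;
    using System.Diagnostics;
    using System.Net.Sockets;

    class ThroughputProbe
    {
        static void Main()
        {
            using (var client = new TcpClient("192.168.1.10", 9000)) // placeholder endpoint
            using (var stream = client.GetStream())
            {
                var buffer = new byte[64 * 1024];
                long total = 0;
                var sw = Stopwatch.StartNew();
                while (sw.Elapsed < TimeSpan.FromSeconds(10))
                {
                    int read = stream.Read(buffer, 0, buffer.Length);
                    if (read == 0) break; // remote end closed the connection
                    total += read;
                }
                Console.WriteLine("{0:F1} KB/s", total / 1024.0 / sw.Elapsed.TotalSeconds);
            }
        }
    }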
If you allow your application to exchange multiple files at the same time, keep in mind that your link speed will be divided among the connections and that, depending on how many clients are using your application, the many connections and file writes your server is doing may be limited by its processing capacity in addition to the bandwidth.
What speed should I expect?
The full speed the slowest part of the network between the two nodes can transfer, minus the bandwidth of the existing traffic over that slowest part. On a 100 Mbit LAN (100 Mbit/s / 8 = 12.5 MB/s raw, less framing and protocol overhead), I would expect a one-way transfer speed of about 10 MB/s.
Could changing this feature significantly affect the speed of data exchange? In what way?
That totally depends on the protocol being used and the client and server implementation. You'll have to benchmark where the hard work is being done, where your bottleneck is.
Related
We're seeing poor performance when iterating across a remote private MSMQ queue. We tried both API methods - MessageQueue.GetAllMessages() and MessageQueue.GetEnumerator2() - and see the same results.
It seems that the problem is in the Message Queuing Service, because it never uses more than about 15% of CPU (a single core). For example, if we iterate across a local queue, we use 100% of CPU and can load 1 million messages in 2 seconds, but for remote queues it takes 30 seconds to load only 10K! The network connection is 100 Mbps.
Is there a way to increase MSMQ performance for remote queues and force it to use 100% of CPU or Network?
MSMQ is optimised to go as fast as it can - it's not going slow just to irritate you.
Performance will be poor on remote queues. This is not the best way to use MSMQ. High performance is obtained through the "send remote, read local" model.
Remote access uses RPC which will be slow over a LAN. If you looked at a network trace, you would see all the back-and-forth communication. Binding to the remote RPC service and querying to find where MSMQ is listening; binding to the remote MSMQ RPC listener; requesting messages from the listener; etc etc.
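To illustrate the "send remote, read local" split (the machine and queue names here are hypothetical):

    using System;
    using System.Messaging;

    class SendRemoteReadLocal
    {
        static void Main()
        {
            // Sender box: write to the REMOTE queue; MSMQ store-and-forward
            // handles delivery asynchronously, so this is fast.
            using (var remote = new MessageQueue(@"FormatName:DIRECT=OS:serverbox\private$\work"))
                remote.Send("payload", "label");

            // Reader box: read only from its LOCAL queue; no RPC round trips.
            using (var local = new MessageQueue(@".\private$\work"))
            {
                Message m = local.Receive(TimeSpan.FromSeconds(30));
            }
        }
    }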
This may or may not be relevant in your scenario, but it's a way to improve overall performance for MSMQ.
If you're sending messages that wrap a consistent type - a serialized class, for example - buffer them before sending and send one message containing an array or collection of items.
I was working with a serialized class and sending a large volume of messages. I tested and found that if I sent them in batches of 50 instead of individually, the size of the queue was reduced by 75%. I didn't spend much time optimizing beyond that; it depends on the size of your messages. But this gets rid of much of the overhead incurred in sending individual messages.
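A sketch of that batching approach (the Item class and the batch size of 50 are illustrative):

    using System.Collections.Generic;
    using System.Messaging;

    // "Item" stands in for whatever serialized class you are sending.
    public class Item { public int Id; public string Payload; }

    class BatchSender
    {
        static void Main()
        {
            var queue = new MessageQueue(@".\private$\batched");
            var batch = new List<Item>();
            for (int i = 0; i < 500; i++) // pretend items arrive one by one
            {
                batch.Add(new Item { Id = i, Payload = "data" });
                if (batch.Count == 50)
                {
                    queue.Send(batch.ToArray()); // one MSMQ message, 50 items
                    batch.Clear();
                }
            }
            if (batch.Count > 0) queue.Send(batch.ToArray()); // flush the tail
        }
    }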
Try using the TCP connection syntax with an explicit numeric IP address, e.g. 123.123.123.123, and see if this affects your performance. If it does, then think security.
You use the term GetMessage but also talk about loading, so I am confused about whether you want performance on single-message receives ("get one") or on operations that load the whole queue.
For core production code I always operate on the messages one at a time, so I never use GetAllMessages or EnumerateAllMessages except in specific management functions.
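In case it's useful, a one-at-a-time receive loop looks roughly like this (the queue path and the processing step are placeholders):

    using System;
    using System.Messaging;

    class OneAtATime
    {
        static void Main()
        {
            var queue = new MessageQueue(@".\private$\jobs");
            while (true)
            {
                try
                {
                    // Pull exactly one message; never enumerate the whole queue.
                    Message m = queue.Receive(TimeSpan.FromSeconds(5));
                    Console.WriteLine(m.Label); // stand-in for real processing
                }
                catch (MessageQueueException ex)
                {
                    if (ex.MessageQueueErrorCode != MessageQueueErrorCode.IOTimeout)
                        throw;
                    // Queue was empty for 5 seconds; loop and wait again.
                }
            }
        }
    }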
I have a .NET 3.5 server application that usually has about 8 clients. I'm using System.Net.Sockets for all the networking.
I've been told that if a client is running on the same box, it should use localhost:<port> or 127.0.0.1:<port> instead of the machine's IP address or name, for better performance. Several people at work have said that this skips some layers of the TCP stack.
But I'm not able to see any performance difference at all in testing (timing how long it takes to get a ping packet from server to client using System.Diagnostics.Stopwatch).
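For reference, the measurement is essentially this (a simplified sketch; the one-byte ping/echo format is just for illustration):

    using System;
    using System.Diagnostics;
    using System.Net.Sockets;

    class PingTimer
    {
        // Time one tiny request/response round trip on an open connection.
        static double PingMilliseconds(NetworkStream stream)
        {
            var ping = new byte[] { 1 }; // assumed one-byte ping opcode
            var pong = new byte[1];
            var sw = Stopwatch.StartNew();
            stream.Write(ping, 0, 1);
            stream.Read(pong, 0, 1);     // server echoes one byte straight back
            sw.Stop();
            return sw.Elapsed.TotalMilliseconds;
        }

        static void Main()
        {
            using (var client = new TcpClient("127.0.0.1", 9000)) // or the machine's IP
            using (var stream = client.GetStream())
                Console.WriteLine("{0:F3} ms", PingMilliseconds(stream));
        }
    }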
Should there really be better performance in theory?
No, performance is the same in both cases. Even when you use your local device's IP address, the operating system kernel never hands the packets to the network device; the data is never actually sent anywhere, so you don't pay for any of the ISO-layer work on the wire (encapsulation, decapsulation, etc.).
I believe the OS will see that this is a local address and treat it as if it were 127.0.0.1, so in fact both will have the same effect.
I suppose it's possible that there will be an extremely tiny performance boost in using 127.0.0.1 (though I doubt it), but with 8 clients you'll never notice it. That performance difference would have to aggregate over a lot of traffic to become at all noticeable.
The larger concern would be which value is better from a maintenance perspective. If the application is always looking at localhost for external dependencies, it won't do well if run on another host. But if it's looking for a more universally understood address for those dependencies, it'll find them from anywhere on the network.
I'm building a tool that transfers very large streaming data sets (possibly on the order of terabytes in a single stream; routinely in the tens of gigabytes) from one server to another. The client portion of the tool will read blocks from the source disk, and send them over the network. The server side will read these blocks off the network and write them to a file on the server disk.
Right now I'm trying to decide which transport to use. Options are raw TCP, and HTTP.
I really, REALLY want to be able to use HTTP. The HttpListener (or WCF, if I want to go that route) makes it easy to plug into the HTTP Server API (http.sys), and I get things like authentication and SSL for free. The problem right now is performance.
I wrote a simple test harness that sends 128K blocks of NULL bytes using the BeginWrite/EndWrite async I/O idiom, with async BeginRead/EndRead on the server side. I've modified this test harness so I can do this with either HTTP PUT operations via HttpWebRequest/HttpListener, or plain old socket writes using TcpClient/TcpListener. To rule out issues with network cards or network pathways, both the client and server are on one machine and communicate over localhost.
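The TCP flavour of the harness boils down to something like this (a trimmed sketch, not the exact code; the server side is the mirror image with BeginRead/EndRead):

    using System;
    using System.Net.Sockets;

    class TcpPusher
    {
        static readonly byte[] Block = new byte[128 * 1024]; // 128K of NULL bytes

        static void Main()
        {
            var client = new TcpClient("localhost", 9000);
            NetworkStream stream = client.GetStream();
            SendNext(stream);
            Console.ReadLine(); // keep pumping blocks until Enter is pressed
        }

        static void SendNext(NetworkStream stream)
        {
            // BeginWrite/EndWrite idiom: queue the next block as soon as the
            // previous one completes (the real harness also guards against
            // synchronous completions to avoid deep recursion).
            stream.BeginWrite(Block, 0, Block.Length, ar =>
            {
                stream.EndWrite(ar);
                SendNext(stream);
            }, null);
        }
    }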
On my 12-core Windows 2008 R2 test server, the TCP version of this test harness can push bytes at 450MB/s, with minimal CPU usage. On the same box, the HTTP version of the test harness runs between 130MB/s and 200MB/s depending upon how I tweak it.
In both cases CPU usage is low, and the vast majority of what CPU usage there is is kernel time, so I'm pretty sure my usage of C# and the .NET runtime is not the bottleneck. The box has two 6-core Xeon X5650 processors, 24GB of single-ranked DDR3 RAM, and is used exclusively by me for my own performance testing.
I already know about HTTP client tweaks like ServicePointManager.MaxServicePointIdleTime, ServicePointManager.DefaultConnectionLimit, ServicePointManager.Expect100Continue, and HttpWebRequest.AllowWriteStreamBuffering.
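In code, those tweaks amount to something like this (the values shown are just the ones I've been experimenting with, and the URL is a placeholder):

    using System.Net;

    class HttpTweaks
    {
        static HttpWebRequest CreateUploadRequest()
        {
            ServicePointManager.DefaultConnectionLimit = 32;     // allow parallel PUTs
            ServicePointManager.Expect100Continue = false;       // skip the 100-Continue round trip
            ServicePointManager.MaxServicePointIdleTime = 10000; // ms before idle connections close

            var request = (HttpWebRequest)WebRequest.Create("http://localhost:8080/upload");
            request.Method = "PUT";
            request.AllowWriteStreamBuffering = false; // stream the body instead of buffering it
            request.SendChunked = true;                // so no Content-Length is required up front
            return request;
        }
    }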
Does anyone have any ideas for how I can get HTTP.sys performance beyond 200MB/s? Has anyone seen it perform this well on any environment?
UPDATE:
Here's a bit more detail on the performance I'm seeing with TcpListener vs HttpListener:
First, I wrote a TcpClient/TcpListener test. On my test box that was able to push 450MB/s.
Then, using Reflector, I figured out how to get at the raw Socket object underlying HttpWebRequest, and modified my HTTP client test to use that. Still no joy; barely 200MB/s.
My current theory is that http.sys is optimized for the typical IIS use case, which is lots of concurrent small requests, and lots of concurrent and possibly large responses. I hypothesize that in order to achieve this optimization, MSFT had to do so at the expense of what I'm trying to accomplish, which is very high throughput on a single very large request, with a very small response.
For what it's worth, I also tried up to 32 concurrent HTTP PUT operations to see if it could scale out, but there was still no joy; about 200MB/s.
Interestingly, on my development workstation, which is a quad-core Xeon Precision T7400 running 64-bit Windows 7, my TcpClient implementation is about 200MB/s, and the HTTP version is also about 200MB/s. Once I take it to a higher-end server-class machine running Server 2008 R2, the TcpClient code gets up to 450MB/s, while the HTTP.sys code stays around 200.
At this point I've sadly concluded that HTTP.sys is not the right tool for the job I need done, and will have to continue to use the hand-rolled socket protocol we've been using all along.
I can't see too much of interest except for this Tech Note. It might be worth having a fiddle with MaxBytesPerSend.
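If memory serves, it's one of the http.sys registry parameters, along these lines (hedged - verify the key and units against the Tech Note before relying on it):

    reg add HKLM\SYSTEM\CurrentControlSet\Services\HTTP\Parameters /v MaxBytesPerSend /t REG_DWORD /d 131072

As far as I recall, the HTTP service has to be restarted for the change to take effect.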
If you're going to send files over the LAN, then UDP is the way to go, because TCP's overhead is a waste in that case. TCP provides rate limiting to avoid excessive packet loss, whereas with UDP the application has to sort that out by itself. NFS would do the job, were it not that you're stuck with Windows; but I'm sure there must be ready-made UDP tools. Also, use the tool "iperf" (available on Linux, probably also Windows) to benchmark the network link irrespective of the protocol. Some network cards are plain crap and rely on the CPU too much, which will limit your speed to around 200 Mbit; you want a proper network card that does its own protocol offloading.
I'm thinking of the methods games like Counter-Strike, WoW, etc. use. In CS you often have a ping of around 50 ms; is there any way to send information to an online MySQL database at that speed?
Currently I'm using an online PHP script that my program requests, but this is really slow, because the program first has to send headers and POST data to it, and then retrieve the result as an ordinary web page.
There really has to be an easier, faster way of doing this? I've heard about TCP/IP; is this what I should use here? Is it possible to connect to the database in a faster way than indirectly via the PHP script?
The TCP/IP suite is made up of many protocols; the three relevant here are:
TCP
UDP
ICMP
ICMP is what you are using when you ping another computer on a network.
Games like Counter-Strike don't care about what you previously did, so there's no requirement for completeness, no need to be able to reconstruct what you did (which is why competitors have to tape what they are doing). This is what UDP is used for - there's no guarantee that data is delivered or received. Which is why lag can be such a problem: you're already dead, you just didn't know it.
TCP guarantees that data is sent and received; it's slower than UDP.
There are numerous things to be aware of to get a fast connection - fewer hops, etc.
Client-to-server for latency-critical stuff? Use non-blocking UDP.
For reliable stuff that can be a little slower, if you use TCP make sure you do so in a non-blocking fashion (select(), non-blocking send, etc.).
The big reason to use UDP is if you have time-sensitive data - if the position of a critter gets dropped, you're better off ignoring it and sending the next position packet rather than re-sending the last one.
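A minimal sketch of that fire-and-forget idea (the endpoint and packet format are invented for the example):

    using System.Net.Sockets;
    using System.Text;

    class PositionSender
    {
        static void Main()
        {
            // Fire-and-forget: a lost position update is simply superseded
            // by the next one, so nothing is ever retransmitted.
            var udp = new UdpClient();
            udp.Connect("game.example.com", 27015); // hypothetical game server
            byte[] packet = Encoding.ASCII.GetBytes("POS 12.5 3.0 88.2");
            udp.Send(packet, packet.Length);        // no ack, no retry
        }
    }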
And I don't think any high-performance game resolves each and every call to a database call. It's more common (if a database is used at all) to persist data occasionally, or at important events.
You're not going to implement Counter-Strike or anything similar on top of HTTP.
Most games like the ones you cite use UDP for this (one of the TCP/IP suite of protocols). UDP is chosen over TCP for this application since it's lighter weight, allowing better performance, and TCP's reliability features aren't necessary.
Keep in mind, though, that those games have standalone clients and servers, usually written in C or C++. If your application is browser-based and you're trying to do this over HTTP, then use a long-lived connection and strip back the headers as much as possible, including cookies. The Tornado framework may be of interest to you there. You may also want to look into HTML5 WebSockets, though widespread support is still a fair way off.
If you are targeting a browser-based plugin like Flash, Java, or Silverlight, then you may be able to use UDP, but I don't know enough about those platforms to confirm.
Edit:
Also worth mentioning: once your networking code and protocol is sufficiently optimized there are still things you can do to improve the experience for players with high pings.
I'm developing a server in C#. This server will act as the data server for a backup service: a client will send data - a lot of data, continuously - specifically, chunks of files, up to five at a time, over the same TCP channel. The client sends data to the server slowly, because I don't want to eat up the customer's bandwidth, so I don't need maximum send throughput, and for this reason I can use a single TCP channel for everything.
That said, the server currently uses the BeginReceive method to acquire data from the client, which on Windows means IOCP. My question is: how will BeginReceive perform on Linux/FreeBSD through Mono? On Windows, from everything I've read, it performs very well, but the server part of this software will run on Linux or FreeBSD through Mono, and I don't know how these methods are implemented there!
Moreover, to reduce the constant allocation of async state objects for (Begin|End)Receive, I maintain one per TCP connection, and in the BeginReceive callback I copy the data out before reusing the buffer (naturally I don't clear the incoming buffer, because I know how much was read from the EndReceive return value). The buffer is 8 KB, so at most I copy out 8 KB of data at a time, which shouldn't kill resources.
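To make that concrete, the receive pattern I'm describing is roughly this (simplified; HandleChunk stands in for the real hand-off to the file pipeline):

    using System;
    using System.Net;
    using System.Net.Sockets;

    class Receiver
    {
        class ConnectionState
        {
            public Socket Socket;
            public byte[] Buffer = new byte[8 * 1024]; // one reusable buffer per connection
        }

        static void Main()
        {
            var listener = new TcpListener(IPAddress.Any, 9000);
            listener.Start();
            Socket s = listener.AcceptSocket(); // one connection, for the sketch
            StartReceive(new ConnectionState { Socket = s });
            Console.ReadLine();
        }

        static void StartReceive(ConnectionState state)
        {
            state.Socket.BeginReceive(state.Buffer, 0, state.Buffer.Length,
                SocketFlags.None, OnReceive, state);
        }

        static void OnReceive(IAsyncResult ar)
        {
            var state = (ConnectionState)ar.AsyncState;
            int read = state.Socket.EndReceive(ar);
            if (read == 0) return; // peer closed the connection

            // Copy out only the bytes actually read, then reuse the buffer.
            var chunk = new byte[read];
            Buffer.BlockCopy(state.Buffer, 0, chunk, 0, read);
            HandleChunk(chunk);

            StartReceive(state); // post the next receive on the same buffer
        }

        static void HandleChunk(byte[] chunk) { /* app-specific */ }
    }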
My target is up to 400/500 connections at most. That isn't many, but in the meantime the server (machine) will be handling files through its own filesystem (developed using FUSE, first in C# and later in C) on LVM + Linux software RAID mirror, plus antivirus checks using ClamAV, so the software must be as light as it can be!
EDIT: I forgot to say that the machine will (probably) be an Intel Core 2 Duo 2.66+ GHz (3 MB L2, 1066 MHz FSB) with 2 GB of RAM, running a 64-bit OS.
Does Mono use epoll (libevent), or kqueue on FreeBSD? Should I do anything specific to try to maximize performance? Can I do anything more to avoid burning resources while receiving data packets?
I know it's a little late, but I just found this question...
Mono is able to handle the number of connections that you need and many more. I regularly test xsp2 (the Mono ASP.NET standalone server) with over 1k simultaneous connections. If this is going to be a high-load situation, you should play a bit with the MONO_THREADS_PER_CPU setting until you find the right number of threads for the ThreadPool.
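For example, from the shell (the value is just a starting point to tune, and YourServer.exe is a placeholder):

    MONO_THREADS_PER_CPU=50 mono YourServer.exe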
On Linux, Mono uses epoll when available (which is always, these days).
I can't speak specifically about the performance of that one function on Mono, but in general Mono performs very well these days. 400-500 connections is, as you say, not very many, so I doubt you'd have any issues.
That said, it shouldn't be very hard to set up a test for this kind of thing. I think that's probably the only way you'll get a conclusive answer for your situation.