I am looking for some fast / low-latency stream implementations for C#/.NET, and would be interested in what is out there. These streams will be reporting live market data, so I am interested in low latency as well as moderately high compression, and the data will be pushed out on a TCP stream.
What options are available to compress a TCP stream?
You could serialize your data with protobuf-net, which reduces the size by being densely encoded.
As Marc says, it's probably better to use a raw socket.
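For the protobuf-net route, here is a minimal sketch of writing length-prefixed messages straight onto the network stream. The Tick type and its fields are purely illustrative, not from the question:

    using System.IO;
    using ProtoBuf;

    [ProtoContract]
    public class Tick
    {
        [ProtoMember(1)] public long TimestampTicks { get; set; }
        [ProtoMember(2)] public double Price { get; set; }
        [ProtoMember(3)] public int Size { get; set; }
    }

    public static class TickWire
    {
        // Writes one densely encoded, length-prefixed message to the TCP stream.
        public static void Send(Stream networkStream, Tick tick) =>
            Serializer.SerializeWithLengthPrefix(networkStream, tick, PrefixStyle.Base128);

        // Reads the next message back out on the receiving side.
        public static Tick Receive(Stream networkStream) =>
            Serializer.DeserializeWithLengthPrefix<Tick>(networkStream, PrefixStyle.Base128);
    }

The length prefix also gives you message framing on the stream for free, which you will need anyway once you drop down to a raw socket.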
My problem can be described with the following statements:
I would like my program to be able to compress and decompress selected files
I have very large files (20 GB+). It is safe to assume that they will never fit into memory
Even after compression, the compressed file might still not fit into memory
I would like to use System.IO.Compression.GzipStream from .NET Framework
I would like my application to be parallel
As I am a newbie to compression/decompression, I had the following idea on how to do it:
I could split the files into chunks and compress each of them separately, then merge them back into a whole compressed file.
Question 1 about this approach - Is compressing multiple chunks and then merging them back together going to give me the proper result, i.e. if I were to reverse the process (starting from the compressed file back to decompressed), will I receive the same original input?
Question 2 about this approach - Does this approach make sense to you? Perhaps you could direct me towards some good reading on the topic? Unfortunately I could not find anything myself.
You do not need to chunk the compression just to limit memory usage. gzip is designed to be a streaming format, and requires on the order of 256KB of RAM to compress. The size of the data does not matter. The input could be one byte, 20 GB, or 100 PB -- the compression will still only need 256KB of RAM. You just read uncompressed data in, and write compressed data out until done.
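A minimal sketch of that streaming approach (the file names are placeholders); memory use stays constant no matter how large the input is:

    using System.IO;
    using System.IO.Compression;

    using (var input = File.OpenRead("huge-input.dat"))        // 20 GB+ is fine
    using (var output = File.Create("huge-input.dat.gz"))
    using (var gzip = new GZipStream(output, CompressionMode.Compress))
    {
        input.CopyTo(gzip);  // streams through a small buffer; nothing is held in memory
    }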
The only reason to chunk the input as you diagram is to make use of multiple cores for compression. Which is a perfectly good reason for your amount of data. Then you can do exactly what you describe. So long as you combine the output in the correct order, the decompression will then reproduce the original input. You can always concatenate valid gzip streams to make a valid gzip stream. I would recommend that you make the chunks relatively large, e.g. megabytes, so that the compression is not noticeably impacted by the chunking.
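A rough sketch of that chunk-and-concatenate idea (the names and chunk size are just assumptions): compress a bounded batch of chunks in parallel with PLINQ, then write the results in the original order, so the concatenation is itself a valid gzip stream.

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Linq;

    static class ParallelGzip
    {
        const int ChunkSize = 8 * 1024 * 1024;  // large chunks so the ratio barely suffers

        public static void Compress(string inputPath, string outputPath)
        {
            int workers = Environment.ProcessorCount;
            using var input = File.OpenRead(inputPath);
            using var output = File.Create(outputPath);

            while (true)
            {
                // Read a batch of chunks (bounded, so memory stays around workers * ChunkSize).
                var batch = Enumerable.Range(0, workers)
                                      .Select(_ => ReadChunk(input))
                                      .TakeWhile(chunk => chunk != null)
                                      .ToList();
                if (batch.Count == 0) break;

                // Compress the batch in parallel, then write in the original order.
                foreach (var compressed in batch.AsParallel().AsOrdered().Select(CompressChunk))
                    output.Write(compressed, 0, compressed.Length);
            }
        }

        static byte[] ReadChunk(Stream input)
        {
            var buffer = new byte[ChunkSize];
            int total = 0, read;
            while (total < buffer.Length &&
                   (read = input.Read(buffer, total, buffer.Length - total)) > 0)
                total += read;
            if (total == 0) return null;
            if (total < buffer.Length) Array.Resize(ref buffer, total);
            return buffer;
        }

        static byte[] CompressChunk(byte[] chunk)
        {
            using var ms = new MemoryStream();
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
                gz.Write(chunk, 0, chunk.Length);
            return ms.ToArray();  // one complete gzip member
        }
    }

Decompressing the output with a single GZipStream (or gunzip) reproduces the original file, since each chunk is a complete gzip member written in order.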
Decompression cannot be chunked in this way, but it is much faster, so there would be little to no benefit even if you could. Decompression is usually I/O bound.
I'm building a file-sharing program, and I would like to know whether, when using sockets, it's better to receive and send byte by byte or a fixed amount. I'm sending messages (login, actual file size list, etc.) of 512 bytes, and 65536-byte blocks when sending and receiving files.
It depends on your usage and goal:
for high performance in a non-faulty (reliable) environment:
choose 1500 bytes
for a bad, faulty environment:
choose smaller sizes, but not byte by byte
It's always better to use reasonably sized blocks for efficiency reasons. Typical network packets are around 1500 bytes in size (Ethernet) and every packet carries a bunch of necessary overhead (such as protocol, destination address and port etc.).
Sending single bytes is the worst (in terms of efficiency) that you can do.
Handling 1500 or so bytes at a time will be much more efficient than one byte at a time. That is about the size of a typical Ethernet frame.
Keep in mind that you are using a stream of bytes: any concept of message or record is up to you to implement.
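A hedged sketch of one common way to put message boundaries back on top of the byte stream, using a 4-byte length prefix; the helper names are mine, not from the answer:

    using System;
    using System.IO;

    static class Framing
    {
        // Sender: 4-byte length (BitConverter, so native byte order; a real
        // protocol would pin the endianness), then the payload.
        public static void SendMessage(Stream stream, byte[] payload)
        {
            stream.Write(BitConverter.GetBytes(payload.Length), 0, 4);
            stream.Write(payload, 0, payload.Length);
        }

        // Receiver: read the length, then read exactly that many bytes.
        public static byte[] ReceiveMessage(Stream stream)
        {
            int length = BitConverter.ToInt32(ReadExactly(stream, 4), 0);
            return ReadExactly(stream, length);
        }

        // Stream.Read may return fewer bytes than asked for, so loop until filled.
        static byte[] ReadExactly(Stream stream, int count)
        {
            var buffer = new byte[count];
            int total = 0;
            while (total < count)
            {
                int read = stream.Read(buffer, total, count - total);
                if (read == 0) throw new EndOfStreamException("Connection closed mid-message.");
                total += read;
            }
            return buffer;
        }
    }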
I'm encrypting data on the fly and writing it to a network stream.
Should I write to the stream as soon as each 16-byte encrypted block data becomes available or should I buffer it? Is there a performance penalty to sending bunches of 16 byte writes rather than a single 20 kilobyte or 1 megabyte write?
Feed it as much as you have; it will let you know if it can't take any more. TCP will handle the buffering for you.
Also, the more you feed, the better: it will likely result in less traffic, as packets will not be fragmented as much.
By default, Socket uses the Nagle algorithm, which is designed to reduce network traffic by causing the socket to buffer small packets and then combine and send them in one packet under certain circumstances. A TCP packet consists of 40 bytes of header plus the data being sent. When small packets of data are sent with TCP, the overhead resulting from the TCP header can become a significant part of the network traffic. On heavily loaded networks, the congestion resulting from this overhead can result in lost datagrams and retransmissions, as well as excessive propagation time caused by congestion. The Nagle algorithm inhibits the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged.
You can turn off the Nagle algorithm, but this will likely result in more fragmentation and traffic.
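If the added latency from that buffering matters more than the extra packets, Nagle can be switched off per connection; the host and port below are just placeholders:

    using System.Net.Sockets;

    var client = new TcpClient("example.host", 9000);
    client.NoDelay = true;   // disable the Nagle algorithm on this connection

    // or, equivalently, on the underlying socket:
    client.Client.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);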
Hi
I have a TCP/IP client-server application. I want to send a large serialized object, around 1 MB, through sockets.
Is it possible to get better performance by splitting the byte array into, for example, 10 chunks, opening a socket for each, and sending them asynchronously, compared to opening one socket and sending all the data through it?
Thanks
Splitting the data into pieces smaller than the MTU will introduce more overhead, as there will be more packets - this will actually slow things down. What you are proposing is already being done as part of the protocol, i.e. splitting and re-assembling. I would experiment with sending less data, e.g. by compressing it.
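A rough sketch of that "send less data" suggestion, gzipping the serialized bytes on their way onto the single socket; payload stands in for the ~1 MB serialized object from the question:

    using System.IO.Compression;
    using System.Net.Sockets;

    static class Sending
    {
        public static void SendCompressed(TcpClient client, byte[] payload)
        {
            NetworkStream network = client.GetStream();
            using (var gzip = new GZipStream(network, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(payload, 0, payload.Length);
            } // disposing the GZipStream flushes the final block but leaves the socket open
        }
    }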
No, this doesn't speed up the transfer under normal conditions, it only adds overhead. It would only help if you have a slow network segment which is quite busy otherwise and the traffic is shaped per TCP connection.
Make sure that your sockets code is efficient, because wrong buffer (and therefore packet) sizes, synchronous operation, and other things may slow the transfer down.
I am using the Win32 waveform APIs in a C# app to make a VoIP system. All is going well; however, I need some way of compressing the audio data on the fly.
So basically the audio data comes into a 'record' buffer of size 150 bytes, and then this buffer is sent over UDP, and at the remote end, the 150 bytes are received and put into a 'play' buffer.
So I need some way of compressing/decompressing the data just before the udp->send and just after the udp->recv. Normal compression algorithms don't work with audio, including the .NET GZip class.
Does anyone know of a library that I can use that will help me do this?
Thanks in advance...
150 bytes is an unbelievably small buffer for audio data--less than 5 milliseconds for e.g. 16 KHz mono. I'm no expert but I think regardless of the compression scheme you choose, your compression ratio will suffer greatly for using such a small buffer. Besides that there is significant overhead for each packet you send.
That said, if you are sending speech data, take a look at Speex for lossy compression (I have found it very effective at compressing speech, but the sound quality is terrible for music.)
I would think you'd want to batch up those 150-byte chunks to get better compression.
Although, even at small buffer sizes like that, you can still get some compression.
If the built-in GZipStream isn't working you could try the GZipStream that is included in DotNetZip. There is also a ZlibCodec class available in DotNetZip that implements the Codec pattern - this may facilitate compressing in 150-byte blocks.
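A hedged sketch of the batching idea: accumulate a handful of 150-byte capture buffers and compress the batch in one go before the UDP send. The GZipStream below is the BCL one; DotNetZip's Ionic.Zlib.GZipStream follows the same stream-based pattern.

    using System.Collections.Generic;
    using System.IO;
    using System.IO.Compression;

    static class AudioBatch
    {
        // e.g. 10 x 150-byte record buffers -> one compressed UDP payload
        public static byte[] CompressBatch(IReadOnlyList<byte[]> captureBuffers)
        {
            using var compressed = new MemoryStream();
            using (var gzip = new GZipStream(compressed, CompressionMode.Compress))
            {
                foreach (var buffer in captureBuffers)
                    gzip.Write(buffer, 0, buffer.Length);
            }
            return compressed.ToArray();
        }
    }

Even batched, a general-purpose compressor will not do much with raw voice samples; as the other answers note, a speech codec such as Speex is the better fit.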
The component you're looking for is more well-known as a coder/decoder, or codec, and there are many options when it comes to picking one.
As suggested above, I'd look into Speex. It's well supported, and now the de facto standard for Flash Player.
I assume that by the size you are setting your buffers, latency is an issue (the bigger the buffer, the bigger the latency), so don't go for a codec that has a high decompressed frame size, because it introduces high latency. This more or less rules out MP3... for voice at a 5 kHz output sample rate (it wouldn't serve much purpose going higher), the minimum decompressed frame size is 576 samples, or ~100 ms of data that must be encoded prior to send. This means a two-way latency of over 200 ms before you've even considered the network part of the problem.