TcpClient wait for CRLF - c#

I'm writing a class library to communicate with a PLC over TCP. The communication is based on sending a data string terminated by a CRLF and then waiting for an acknowledge string (also terminated by a CRLF) to confirm the data was received (yes, I know TCP/IP already provides acknowledgements, but that is another discussion).
Currently I'm facing two major problems:
I'm setting the TcpClient.SendTimeout property, but it looks like when the data is sent (by TcpClient.Client.Send), the sender does not wait for the receiver to read the data. Why?
Because the sender is not waiting, an acknowledge string and the next data string can be sent back to back, so the receiver gets both at once. Is there a way to read the buffer only up to the first CRLF (the acknowledge) and leave the next data string in the buffer for the next TcpClient.Client.Read call?
Thanks in advance,
Mark

TCP is a streaming protocol. There are no packets that you can program against. The receiver must be able to decode the data no matter in what chunks it arrives. Assume one byte chunks, for example.
Here, it seems the receiver can just read until it finds a CRLF. StreamReader can do that.
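A minimal sketch of that read side, assuming CRLF-terminated ASCII messages on a connected TcpClient named client. ReadLine() blocks until a full line has arrived, no matter how the bytes were chunked on the wire, and anything after the first CRLF stays buffered for the next call:
using System.IO;
using System.Text;

var reader = new StreamReader(client.GetStream(), Encoding.ASCII);
string ack = reader.ReadLine();    // reads up to the first CRLF only
string next = reader.ReadLine();   // the following message, still buffered
Keep reading from the same StreamReader instance; it buffers internally, so mixing it with raw Socket reads would lose data.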
the sender does not wait for the receiver to read the data
TCP is asynchronous. When your Send completes the receiver hasn't necessarily processed the data. This is impossible to ensure at the TCP stack level. The receiving app might have called Receive and gotten the data but it might not have processed it. The TCP stack can't know.
You must design your protocol so that this information is not needed.
I just read one byte at a time till the CRLF
That can work, but it is very CPU-intensive and inefficient.

Related

C# Socket.Send(): does it send all data or not?

I was reading about sockets from a book called "C# Network Programming" by Richard Blum. The following excerpt states that the Send() method is not guaranteed to send all the data passed to it.
byte[] data = new byte[1024];
int sent = socket.Send(data);
On the basis of this code, you might be tempted to presume that the entire 1024-byte data buffer was sent to the remote device... but this might be a bad assumption. Depending on the size of the internal TCP buffer and how much data is being transferred, it is possible that not all the data supplied to the Send() method was actually sent.
However, when I went and looked at the Microsoft documentation https://msdn.microsoft.com/en-us/library/w93yy28a(v=vs.110).aspx it says:
If you are using a connection-oriented protocol, Send will block until all of the bytes in the buffer are sent, unless a time-out was set
So which is it? The book was published in 2004, so has it changed since then?
I'm planning to use asynchronous sockets, so my next question is, would BeginSend() send all data?
All you had to do was read the rest of the exact same paragraph you quoted. There's even an exception to your quote given in the very same sentence.
If you are using a connection-oriented protocol, Send will block until all of the bytes in the buffer are sent, unless a time-out was set by using Socket.SendTimeout. If the time-out value was exceeded, the Send call will throw a SocketException. In nonblocking mode, Send may complete successfully even if it sends less than the number of bytes in the buffer. It is your application's responsibility to keep track of the number of bytes sent and to retry the operation until the application sends the bytes in the buffer.
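In nonblocking mode you therefore have to do the bookkeeping yourself. A minimal sketch of such a send-all loop, assuming a connected Socket:
using System.Net.Sockets;

// Retry Send until the whole buffer has been accepted by the stack.
static void SendAll(Socket socket, byte[] data)
{
    int sent = 0;
    while (sent < data.Length)
        sent += socket.Send(data, sent, data.Length - sent, SocketFlags.None);
}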
For BeginSend, the behavior is also described:
Your callback method should invoke the EndSend method. When your application calls BeginSend, the system will use a separate thread to execute the specified callback method, and will block on EndSend until the Socket sends the number of bytes requested or throws an exception.
That's not a very nice design and defeats the whole point of a callback! Consider using SendAsync instead (and then you still need to check the BytesTransferred property).
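For illustration, a minimal sketch using the awaitable SendAsync overload available on modern .NET; it returns the number of bytes actually transferred, so loop until the buffer is empty (the SocketAsyncEventArgs pattern exposes the same count via BytesTransferred):
using System.Net.Sockets;
using System.Threading.Tasks;

static async Task SendAllAsync(Socket socket, ReadOnlyMemory<byte> data)
{
    while (!data.IsEmpty)
    {
        int sent = await socket.SendAsync(data, SocketFlags.None);
        data = data.Slice(sent);   // keep only the part not yet sent
    }
}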
Both of the resources you quoted are correct. I think the wording could have been better though.
In the MSDN docs it is also written that
There is also no guarantee that the data you send will appear on the network immediately. To increase network efficiency, the underlying system may delay transmission until a significant amount of outgoing data is collected.
So the Send method only blocks until the underlying system has room to buffer your data for a network send.
A successful completion of the Send method means that the underlying system has had room to buffer your data for a network send.

When does TcpClient's NetworkStream finish one read operation?

I am working on a project that involves client server communication via TCP and Google Protocol Buffer. On the client side, I am basically using NetworkStream.Read() to do blocking read from server via a byte array buffer.
According to MSDN documentation,
This method reads data into the buffer parameter and returns the number of bytes successfully read. If no data is available for reading, the Read method returns 0. The Read operation reads as much data as is available, up to the number of bytes specified by the size parameter. If the remote host shuts down the connection, and all available data has been received, the Read method completes immediately and returns zero bytes.
It is the same case with async read (NetworkStream.BeginRead and EndRead). My question is that when does Read()/EndRead() return? It seems like it will return after all the bytes in the buffer have been filled. But in my own testing, that is not the case. The bytes read in one operation vary a lot. I think it makes sense because if there is a pause on the server side when sending messages, the client should not wait until the read buffer has been filled. Does the Read()/EndRead() inherently have some timeout mechanism?
I was trying to find out how Mono implements Read() in NetworkStream and traced it until an extern method Receive_internal() is called.
It returns once it has read all the data currently available on the NetworkStream, or once the buffer is full, whichever comes first. You have already noticed this behaviour.
So you will need to process all the bytes and see whether the message is complete. You do this by framing a message. See .NET question about asynchronous socket operations and message framing on how you can do this.
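For illustration, a minimal sketch of length-prefixed framing on the read side, assuming each message is preceded by a 4-byte big-endian length (BinaryPrimitives requires .NET Core 2.1 or later):
using System.Buffers.Binary;
using System.IO;
using System.Net.Sockets;

// Read one length-prefixed message: 4-byte length, then the payload.
static byte[] ReadMessage(NetworkStream stream)
{
    byte[] lengthBytes = new byte[4];
    ReadExactly(stream, lengthBytes);
    int length = BinaryPrimitives.ReadInt32BigEndian(lengthBytes);

    byte[] payload = new byte[length];
    ReadExactly(stream, payload);
    return payload;
}

// Loop until the buffer is full; a single Read may return fewer bytes.
static void ReadExactly(Stream stream, byte[] buffer)
{
    int offset = 0;
    while (offset < buffer.Length)
    {
        int read = stream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0) throw new EndOfStreamException("Connection closed mid-message.");
        offset += read;
    }
}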
As for the timeout question: assuming you are asking whether BeginRead has a timeout, I would say no, because it just waits for data to arrive on the stream and puts it into a buffer, after which you can process the incoming bytes.
The number of bytes available on the read action depends on things like your network (e.g. latency, proxy throttling) and the client sending the data.
BeginRead behaviour summary:
1. Call BeginRead(); -> waiting for bytes to arrive on the stream...
2. One or more bytes arrive on the stream.
3. The byte(s) from step 2 are copied into the buffer that was given.
4. Call EndRead(); -> the byte(s) within the buffer can be processed.
Most common practice is to repeat all these steps again.
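A minimal sketch of that loop, assuming a connected TcpClient named client and a hypothetical Process method that does the message framing:
byte[] buffer = new byte[4096];
NetworkStream stream = client.GetStream();

void StartRead()
{
    // Step 1: ask the stream to fill the buffer when bytes arrive.
    stream.BeginRead(buffer, 0, buffer.Length, ar =>
    {
        // Step 4: EndRead reports how many bytes actually arrived.
        int bytesRead = stream.EndRead(ar);
        if (bytesRead == 0) return;      // remote side closed the connection
        Process(buffer, bytesRead);      // hypothetical framing/processing step
        StartRead();                     // repeat for the next chunk
    }, null);
}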
If Read was waiting for a full buffer of data, you could easily deadlock if the remote party expects your response but you are waiting for a full buffer which will never come.
According to this logic it must return without ever blocking if data is available. Even if it is just a single byte that is available.
Assume the server sends one message (100 bytes) every 50 ms; how many bytes does one NetworkStream.Read() call return on the client side?
Each call will return between one byte and the number of bytes available without blocking. Nothing, nothing, nothing else is guaranteed. In practice you will get one or multiple network packets at once. It doesn't make sense for the stack to withhold available bytes.

Transmitting strings between a C# client and a Node server

I have a node TCP server working and waiting for data and for every socket I have
socket.on("data", function () {
});
Now, as far as I understand, this will get invoked whenever there's any data received. That means that if I send a large string, it will get segmented into multiple packets and each of those will invoke the event separately. Therefore I could concatenate the data until the "end" event is invoked. According to the Node documentation this happens when the FIN packet is sent.
I have to admit I don't know much about networking, but about this FIN packet: do I have to send it manually when sending data from my C# app, or will this code
var stream = client.GetStream();
using (var writer = new StreamWriter(stream)) writer.Write(request);
send it automatically when it manages to send the whole request string?
Secondly, how does it work from the other end? How do I send a "batch" of data from Node to my C# client so that it knows that the whole "batch" should be considered one thing, despite it being in multiple packets?
Also, is there an equivalent of the "end" event in .NET? Currently, I'm blocking until the stream's DataAvailable is true, but that will trigger on the first packet, right? It won't wait for the whole thing.
I'd appreciate if someone could shed some light on this for me.
The TCP FIN packet will be sent when you call writer.Close() in C#, which will trigger the end event in Node as you said.
Without seeing how your C# reading code looks I can't give specifics, but C# will not fire an event when Node closes the connection. stream.CanRead will no longer be true, and if a stream.Read call was blocking at that moment, it will throw an exception.
TCP provides a stream of bytes, and nothing more. If you are planning to send several messages back and forth from Node and C#, it is up to you to send your messages in such a way that they can be separated. For instance, you could prefix each message with the length, so that you read one byte, and then read that many bytes after it for the message. If your messages are always text, you could encode it as JSON and separate messages with newlines.
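For example, a minimal sketch of the newline-delimited approach on the C# side, assuming a connected TcpClient named client and a hypothetical json string holding one serialized message. AutoFlush pushes each message out without closing the stream, so no FIN is sent:
var writer = new StreamWriter(client.GetStream()) { AutoFlush = true };
writer.Write(json + "\n");   // one complete message per line
On the Node side, append each "data" chunk to a buffer and split on "\n" to recover the message boundaries.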

How to send large data using C# UdpClient?

I'm trying to send a large amount of data (more than 50 MB) using C# UdpClient.
So I first split the data into 65507-byte blocks and send them in a loop:
for (int i = 0; i < packetCount; i++)
    myUdpClient.Send(blocks[i], blocks[i].Length, remoteEndPoint);
My problem is that only the first packets can be received.
While sending the first packet, the network load rapidly increases to 100%, and then the other packets cannot be received.
I want to get as much data throughput as possible.
I'm sorry for my English!
Thanks for your help in advance.
All those people saying to use TCP are wrong here. Although TCP is reliable, and with the window maintained by the kernel it's a fairly "set and forget" protocol, it will not do when you want 100% of your throughput: it throttles too hard, and the wait for an ACK alone trashes at least 50% because of the RTT.
To the original question: you are sending UDP packets nonstop in that for loop, so the send buffer fills up and any new data is dropped immediately without ever going out on the line. You are also making your blocks too large.

I would recommend building your own throttling mechanism that starts off at around 2k segments per second and slowly ramps up. Each segment contains a SEQ (sequence identifier, for acknowledgements) and an OFF (offset inside the file for this data set). As the data is tagged, let the sender keep track of these tags. When the other side receives a segment, it stores the SEQ number in an ACK list; any missing SEQ numbers are placed on a NACK timer list, and when the timer runs out (if they still haven't been received) they move to a NACK list. Every couple of seconds or so, the receiver sends a single transmission with around 5 ACKs from the ACK list and up to 5 NACKs. If the sender receives any NACKs, it should immediately throttle down and resend the missing fragments before continuing. Data that has been ACKed can be freed from memory.
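For illustration, a hypothetical segment layout for such a scheme: a 4-byte SEQ and an 8-byte OFF prefixed to each payload (BinaryPrimitives requires .NET Core 2.1 or later):
using System;
using System.Buffers.Binary;

// Build one segment: [SEQ: 4 bytes][OFF: 8 bytes][payload].
static byte[] BuildSegment(uint seq, long offset, ReadOnlySpan<byte> payload)
{
    byte[] segment = new byte[4 + 8 + payload.Length];
    BinaryPrimitives.WriteUInt32BigEndian(segment, seq);
    BinaryPrimitives.WriteInt64BigEndian(segment.AsSpan(4), offset);
    payload.CopyTo(segment.AsSpan(12));
    return segment;
}
The receiver reads SEQ for its ACK/NACK lists and uses OFF to place the payload at the right position in the file.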
Good luck!
I don't know about the .NET implementation specifically, it might be buffering your data, but a UDP datagram is normally limited by the link MTU, which is 1500 on normal Ethernet (subtract 20 bytes for the IP header and 8 bytes for the UDP header).
UDP is explicitly allowed to drop and reorder the datagrams, and there's no flow control as in TCP.
Exceeding the socket send buffer on the sender side will make the network stack ignore subsequent send attempts until buffer space becomes available again (you need to check the return value of send() for that).
Edit:
I would strongly recommend going with TCP for large file transfers. TCP gives you sequencing (you don't have to keep track of dropped and re-ordered packets.) It has advanced flow control (so fast sender does not overwhelm a slow receiver.) It also does Path MTU discovery (i.e. finds out optimal data packetization and avoids IP fragmentation.) Otherwise you would have to re-implement most of these features yourself.
I hate to say it but you need to sleep the thread. You are overloading your throughput. UDP is not very good for lossless data transfer. UDP is for when you don't mind dropping some packets.
Reliably - no, you won't do it with UDP.
As far as I understand, this makes sense for sending to multiple computers at a time (broadcasting).
In this case,
establish a TCP connection with each of them,
split the data into blocks,
give each block an ID,
send list of IDs to each computer with TCP connection,
broadcast data with UDP,
inform clients (via TCP) that data transmission is over,
then clients should ask for the dropped packets to be resent

How is .NET's NetworkStream delimiting multiple messages in the same packet?

So I've been tasked with creating a tool for our QA department that can read packets off the wire and reassemble the messages correctly (they don't trust our logs... long story).
The application whose communication I'm attempting to listen in on is using .NET's TcpListener and TcpClient classes to communicate. Intercepting the packets isn't a problem (I'm using SharpPcap). However, correctly reassembling the packets into application level messages is proving slightly difficult.
Some packets have the end of one message and the beginning of the next message in them and I can't figure out how the NetworkStream object in .NET is able to tell where one application level message ends and the other begins.
I have been able to figure out that any packet that contains the end of an application level message will have the TCP header "PSH" (Push) flag turned on. But I can't figure out how .NET knows where exactly the end of the message is inside that packet.
The data of one packet might look like:
/></Message><Message><Header fromSystem=http://blah
How does the stream know to only send up to the end of </Message> to the application and store the rest until the rest of the message is complete?
There are no IP level flags set for fragmentation, and the .NET sockets have no knowledge of the application level protocol. So I find this incredibly vexing. Any insight would be appreciated.
The stream doesn't know the end of anything - it should be part of the application protocol.
NetworkStream doesn't have anything built into it to convert the data into an object. What makes you think it does? What does the code which manages to use the NetworkStream look like? Are you perhaps doing some form of XML deserialization, and the reading code automatically stops when it reaches the closing tag?
Basically:
If your protocol doesn't have any message delimiter or length prefix, it probably should have one
NetworkStream itself is highly unlikely to be doing anything clever - but if you could tell us what you're observing, we can maybe work out what's going on.
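For illustration, a minimal sketch of delimiter-based reassembly, assuming messages end with "</Message>" as in the capture above and a hypothetical Handle callback:
using System;
using System.Text;

var pending = new StringBuilder();

// Called with each decoded chunk as it comes off the wire.
void OnChunk(string chunk)
{
    pending.Append(chunk);
    int end;
    while ((end = pending.ToString().IndexOf("</Message>", StringComparison.Ordinal)) >= 0)
    {
        int cut = end + "</Message>".Length;
        Handle(pending.ToString(0, cut));   // one complete message
        pending.Remove(0, cut);             // keep any partial remainder
    }
}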
