C# socket blocking behavior

C# socket blocking behavior - c#

My situation is this : I have a C# tcp socket through which I receive structured messages consisting of a 3 byte header and a variable size payload. The tcp data is routed through a network of tunnels and is occasionally susceptible to fragmentation. The solution to this is to perform a blocking read of 3 bytes for the header and a blocking read of N bytes for the variable size payload (the value of N is in the header). The problem I'm experiencing is that occasionally, the blocking receive operation returns a partial packet. That is, it reads a volume of bytes less than the number I explicitly set in the receive call. After some debugging, it appears that the number of bytes it returns is equal to the number of bytes in the Available property of the socket before the receive op.
This behavior is contrary to my expectation. If the socket is blocking and I explicitly set the number of bytes to receive, shouldn't the socket block until it recv's those bytes?, any help, pointers, etc would be much appreciated.

The behaviour depends on the type of socket you're using. TCP is a Connection-Oriented Socket, which means:
If you are using a connection-oriented Socket, the Receive method will read as much data as is available, up to the number of bytes specified by the size parameter. If the remote host shuts down the Socket connection with the Shutdown method, and all available data has been received, the Receive method will complete immediately and return zero bytes.
When using TCP sockets, you have to be prepared for this possibility; check the return value of the Receive method, and if it was less than what you expected, Receive again until either the socket is closed or you've actually received as much data as you need.

Related

How does a TCP packet arrive when using the Socket api in C#

I have been reading about TCP packet and how they can be split up any number of times during their voyage. I took this to assume I would have to implement some kind of buffer on top of the buffer used for the actual network traffic in order to store each ReceiveAsync() until enough data is available to parse a message. BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.
Then I read that the lower layers (ethernet?, IP?) will actually re-assemble packets transparently.
My question is, in C#, am I guaranteed to receive a full "message" over TCP? In other words, if I send 32 bytes, will I necessarily receive those 32 bytes in "one-go" (one call to ReceiveAsync())? Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix?
Also, could I receive more than one message in a single call to ReceiveAsync()? Say one "protobuf message" is 32 bytes. I send 2 of them. Could I potentially receive 48 bytes in "one go" and then 16 in another?
I know this question shows up easily on google, but I can never tell if it's in the correct context (talking about the actual TCP protocol, or how C# will expose network traffic to the programmer).
Thanks.

TCP is a stream protocol - it transmits a stream of bytes. That's all. Absolutely no message framing / grouping is implied. In fact, you should forget that Ethernet packets or IP datagrams even exist when writing code using a TCP socket.
You may find yourself with 1 byte available, or 10,000 bytes available to read. The beauty of the (synchronous) Berkeley sockets API is that you, as an application programmer don't need to worry about this. Since you're using a length-prefixed message format (good job!) simply recv() as many bytes as you're expecting. If there are more bytes available than the application requests, the kernel will keep the rest buffered until the next call. If there are fewer bytes available than required, the thread will either block or the call will indicate that fewer bytes were received. In this case, you can simply sleep again until data is available.
The problem with async APIs is that it requires the application to track a lot more state itself. Even this Microsoft example of Asynchronous Client Sockets is far more complicated than it needs to be. With async APIs, you still control the amount of data you're requesting from the kernel, but when your async callback is fired, you then need to know the next amount of data to request.
Note that the C# async/await in 4.5 make asynchronous processing easier, as you can do so in a synchronous way. Have a look at this answer where the author comments:
Socket.ReceiveAsync is a strange one. It has nothing to do with async/await features in .net4.5. It was designed as an alternative socket API that wouldn't thrash memory as hard as BeginReceive/EndReceive, and only needs to be used in the most hardcore of server apps.

TCP is a stream-based octet protocol. So, from the application's perspective, you can only read or write bytes to the stream.
I have been reading about TCP packet and how they can be split up any number of times during their voyage.
TCP packets are a network implementation detail. They're used for efficiency (it would be very inefficient to send one byte at a time). Packet fragmentation is done at the device driver / hardware level, and is never exposed to applications. An application never knows what a "packet" is or where its boundaries are.
I took this to assume I would have to implement some kind of buffer on top of the buffer used for the actual network traffic in order to store each ReceiveAsync() until enough data is available to parse a message.
Yes. Because "message" is not a TCP concept. It's purely an application concept. Most application protocols do define a kind of "message" because it's easier to reason about.
Some application protocols, however, do not define the concept of a "message"; they treat the TCP stream as an actual stream, not a sequence of messages.
In order to support both kinds of application protocols, TCP/IP APIs have to be stream-based.
BTW, I am sending length-prefixed, protobuf-serialized messages over TCP.
That's good. Length prefixing is much easier to deal with than the alternatives, IMO.
My question is, in C#, am I guaranteed to receive a full "message" over TCP?
No.
Or do I have to "store" each receive until the number of bytes received is equal to the length-prefix? Also, could I receive more than one message in a single call to ReceiveAsync()?
Yes, and yes.
Even more fun:
You can get only part of your length prefix (assuming a multi-byte length prefix).
You can get any number of messages at once.
Your buffer can contain part of a message, or part of a message's length prefix.
The next read may not finish the current message, or even the current message's length prefix.
For more information on the details, see my TCP/IP .NET FAQ, particularly the sections on message framing and some example code for length-prefixed messages.
I strongly recommend using only asynchronous APIs in production; the synchronous alternative of having two threads per connection negatively impacts scalability.
Oh, and I also always recommend using SignalR if possible. Raw TCP/IP socket programming is always complex.

My question is, in C#, am I guaranteed to receive a full "message" over TCP?
No. You will not receive a full message. A single send does not result in a single receive. You must keep reading on the receiving side until you have received everything you need.
See the example here, it keeps the read data in a buffer and keeps checking to see if there is more data to be read:
private static void ReceiveCallback(IAsyncResult ar)
{
try
{
// Retrieve the state object and the client socket
// from the asynchronous state object.
StateObject state = (StateObject)ar.AsyncState;
Socket client = state.workSocket;
// Read data from the remote device.
int bytesRead = client.EndReceive(ar);
if (bytesRead > 0)
{
// There might be more data, so store the data received so far.
state.sb.Append(Encoding.ASCII.GetString(state.buffer, 0, bytesRead));
// Get the rest of the data.
client.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0,
new AsyncCallback(ReceiveCallback), state);
}
else
{
// All the data has arrived; put it in response.
if (state.sb.Length > 1)
{
response = state.sb.ToString();
}
// Signal that all bytes have been received.
receiveDone.Set();
}
}
catch (Exception e)
{
Console.WriteLine(e.ToString());
}
}
See this MSDN article and this article for more details. The 2nd link goes into more details and it also has sample code.

TCP messages arrival on the same socket

I've got a 2 services that communicate using a TCP socket (the one that initiates the connection is a C++ Windows service and the receiver is a C# TCP stream) and (let's say that they might not use the same TCP connection all the time). Some of the time I'm getting half messages (where the number of bytes is miscalculated somehow) on a great network load.
I have several questions in order to resolve the issue:
Can I be sure that messages (not packets...) must follow one another, for example, if I send message 1 (that was received as half), then message 2, can I be sure that the next half of message 1 won't be after message 2?
Please have separate answer for the same TCP connection between all messages and not having the same TCP connection between them.
Is there any difference whether the sender and the receiver are on the same station?

TCP is a stream protocol that delivers a stream of bytes. It does not know (or care) about your message boundaries.
To send the data, TCP breaks the stream up into arbitrarily sized packets. [it's not really arbitrary and it is really complicated.] and send these reliably to the other end.
When you read data from a TCP socket, you get whatever data 1) has arrived, and 2) will fit in the buffer you have provided.
On the receive side you need to use code to reassemble complete messages from the TCP stream. You may have to write this yourself or you may find that it is already written for you by whatever library (if any) you are using to interact with the socket.

TCP is a streaming protocol, it doesn't have "messages" or "packets" or other boundaries. Sometimes when you receive data you might not get all that was sent, other times you could get more than one "message" in the received stream.
You have to model your protocol to handle these things, for example by including special message terminators or including a fixed-sized header including the data size.
If you don't get all of a message at once, then you have to receive again, maybe even multiple times, to get all the data that was sent.
And to answer your question, you can be sure that the stream is received in the order it was sent, or the TCP layer will give you an error. If byte A was sent before byte B then it's guaranteed that they will arrive in that order. There is also no theoretical difference if the sender and receiver is on the same system, or on different continents.

How do you know if a Socket.Send() has worked?

Quite a simple question - I've been having some TCP packets go missing (they aren't received by the remote socket), so I'm considering resending any packets that don't get received.
However I need to know when send doesn't work in order to do this! I know Send() returns an integer containing the 'The number of bytes sent to the Socket', but even when the other computer receives nothing, this is always the length of the entire buffer, indicating that everything was (theoretically) sent. I also know there's a Socket.Connected property, but that is false even when that data is received, and sometimes true even when it isn't, so that doesn't help either.
So how do I know if Send() has worked?

Send() simply places the data in a buffer for the network adapter to process. When Send() returns, you have no guarantee that a single byte has "left" your computer, even when the socket is in blocking mode.
TCP ensures though that within a connection all data is received in the order it was sent. It never "forgets" a packet in the middle of a conversation, and automatically retransmits data when needed.
To determine whether retransmission is required, the protocol sends acknowledgement messages, but:
you can't access them from the Socket class
hosts may postpone sending this ACK message
The easiest way to ensure your message has arrived is to let the other party respond to them yourself. If you don't receive a response within a reasonable amount of time, you could treat that connection as broken.
About the whole discussion between Sriram Sakthivel and "the others" - I've had similar problems of receiving duplicate messages and missing others, but in my personal case this was caused by:
using BeginReceive() (the async receive method),
reusing the same buffer on each BeginReceive() call, and
calling BeginReceive() before processing that buffer, causing the buffer to be filled with new data before having read the old message.

The ideal way of doing this is to check if the bufferedAmount is zero after you've sent a message. The trick is, you'll have to let your application know you are in fact sending a message.

C# socket image transfer: file sometimes partially transferred

Problem
I have PHP client which sends a image file to a C# socket server. My problem is about 30% of the time the file is partially transferred and stops.
PHP END ->
$file = file_get_contents('a.bmp');
socket_write($socket,$file);
C# END ->
int l= Socket.Receive(buffer, buffer.Length, SocketFlags.None);
//create the file using a file stream
How can I always transfer the full file without intermediate states? And why does it happen?

From the documentation for Socket.Receive:
If you are using a connection-oriented Socket, the Receive method will read as much data as is available, up to the number of bytes specified by the size parameter. If the remote host shuts down the Socket connection with the Shutdown method, and all available data has been received, the Receive method will complete immediately and return zero bytes.
This means you may get less than the total amount. This is just the way sockets work.
So if if you get a partial read, you should call Socket.Receive again. You can use the overload of Socket.Receive to continue reading into the same buffer.
Here is an article that shows how "keep reading" until you get what you want:
Socket Send and Receive
If you don't know how big the data is, you must keep reading until Socket.Receive returns zero.

.Net SendAsync always sends all data?

Will Socket.SendAsync always send all data in the byte[] buffer that the SocketAsyncEventArgs has been assigned with? I've tested some code but only on a local network and there it seems to be that way..
Edit:
Ok but does it always send all data before running the completed event?
the only socket.BeginSend did not if I remember right..

It will attempt to send all data, however, from the docs on MSDN:
"For message-oriented sockets, do not exceed the maximum message size of the underlying Windows sockets service provider. If the data is too long to pass atomically through the underlying service provider, no data is transmitted and the SendAsync method throws a SocketException with the SocketAsyncEventArgs.SocketError set to the native Winsock WSAEMSGSIZE error code (10040)."
There are times when a buffer that is too large should be split up. It depends on the underlying socket implementation.

No it will not. There are a lot of factors to consider here including buffering, timeouts, etc ...
The simplest to consider though is the limit on packets at the IPV4 level. IPV4 packets have a strict limit that cannot be exceeded (65,535 bytes). It's therefore not possible for SendAsync to push data which is larger than an IPV4 packet size into a single packet.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.