Beginner Question again: Kind of a follow up to a question I asked not long ago.
I am trying to understand this synchronous socket tutorial http://msdn.microsoft.com/en-us/library/6y0e13d3.aspx, particularly one single line in the code below.
QUESTION:
I want to make sure I am understanding the program flow right . When does handler.Receive(bytes) return? Does it return and store the number of bytes received in int bytesRec**when it "overflows" and has received more than 1024bytes? **And if that is so, and this might sound silly, what happens if MORE bytes arrive as it is storing the 1024 bytes in the *data*variable and not listening for more bytes that might be arriving at that time? Or should I not worry about it and let .net take care of that?
Socket handler = listener.Accept();
data = null;
// An incoming connection needs to be processed.
while (true) {
bytes = new byte[1024];
int bytesRec = handler.Receive(bytes);
// My question is WHEN does the following line
// get to be executed
data += Encoding.ASCII.GetString(bytes,0,bytesRec);
if (data.IndexOf("<EOF>") > -1) {
break;
}
}
When does handler.Receive(bytes) return?
Documentation:
If no data is available for reading, the Receive method will block
until data is available, unless a time-out value was set by using
Socket.ReceiveTimeout. If the time-out value was exceeded, the Receive
call will throw a SocketException. If you are in non-blocking mode,
and there is no data available in the in the protocol stack buffer,
the Receive method will complete immediately and throw a
SocketException. You can use the Available property to determine if
data is available for reading. When Available is non-zero, retry the
receive operation.
Does it return and store the number of bytes received in int
bytesRec when it "overflows" and has received more than 1024 bytes?
No, it always returns the number of bytes that have been read. If it didn't, how could you know what part of bytes contains meaningful data and what part of it has remained unused?
It is very important to understand how sockets typically work: bytes may be arriving in packets, but as far as the receiver is concerned each byte should be considered independently. This means that there is no guarantee that you will get the bytes in the chunks the sender sent them, and of course there is no guarantee that there is enough data to fill up your buffer.
If you only want to process incoming data in 1024-byte chunks, it is your own responsibility to keep calling Receive until it has released 1024 bytes in total to you.
And if that is so, and this might sound silly, what happens if MORE
bytes arrive as it is storing the 1024 bytes in the variable and
not listening for more bytes that might be arriving at that time?
Let's reiterate that Receive will not store 1024 bytes in the buffer because that's the buffer's size. It will store up to 1024 bytes.
If there is more data buffered internally by the network stack than your buffer can hold then 1024 bytes will be given back to you and the rest will stay in the network stack's buffers until you call Receive again. If Receive has begun copying data to your buffer and at that moment more data is received from the network then most likely what's going to happen is that these will have to wait for the next Receive call.
After all, at no point did anyone provide a guarantee that Receive would give you all the data it can (although of course that is desirable and it is what happens most of the time).
Related
If i send 1000 bytes in TCP, does it guarantee that the receiver will get the entire 1000 bytes "togther"? or perhaps he will first only get 500 bytes, and later he'll receive the other bytes?
EDIT: the question comes from the application's point of view. If the 1000 bytes are reassembles into a single buffer before they reach the application .. then i don't care if it was fragmented in the way..
See Transmission Control Protocol:
TCP provides reliable, ordered delivery of a stream of bytes from a program on one computer to another program on another computer.
A "stream" means that there is no message boundary from the receiver's point of view. You could get one 1000 byte message or one thousand 1 byte messages depending on what's underneath and how often you call read/select.
Edit: Let me clarify from the application's point of view. No, TCP will not guarantee that the single read would give you all of the 1000 bytes (or 1MB or 1GB) packet the sender may have sent. Thus, a protocol above the TCP usually contains fixed length header with the total content length in it. For example you could always send 1 byte that indicates the total length of the content in bytes, which would support up to 255 bytes.
As other answers indicated, TCP is a stream protocol -- every byte sent will be received (once and in the same order), but there are no intrinsic "message boundaries" -- whether all bytes are sent in a single .send call, or multiple ones, they might still be received in one or multiple .receive calls.
So, if you need "message boundaries", you need to impose them on top of the TCP stream, IOW, essentially, at application level. For example, if you know the bytes you're sending will never contain a \0, null-terminated strings work fine; various methods of "escaping" let you send strings of bytes which obey no such limitations. (There are existing protocols for this but none is really widespread or widely accepted).
Basically as far as TCP goes it only guarantees that the data sent from one end to the other end will be sent in the same order.
Now usually what you'll have to do is have an internal buffer that keeps looping until it has received your 1000 byte "packet".
Because the recv command as mentioned returns how much has actually been received.
So usually you'll have to then implement a protocol on top of TCP to make sure you send data at an appropriate speed. Because if you send() all the data in one run through it will overload the under lying networking stack, and which will cause complications.
So usually in the protocol there is a tiny acknowledgement packet sent back to confirm that the packet of 1000 bytes are sent.
You decide, in your message that how many bytes your message shall contain. For instance in your case its 1000. Following is up and running C# code to achieve the same. The method returns with 1000 bytes. The abort code is 0 bytes; you can tailor that according to your needs.
Usage:
strMsg = ReadData(thisTcpClient.Client, 1000, out bDisconnected);
Following is the method:
string ReadData(Socket sckClient, int nBytesToRead, out bool bShouldDisconnect)
{
bShouldDisconnect = false;
byte[] byteBuffer = new byte[nBytesToRead];
Array.Clear(byteBuffer, 0, byteBuffer.Length);
int nDataRead = 0;
int nStartIndex = 0;
while (nDataRead < nBytesToRead)
{
int nBytesRead = sckClient.Receive(byteBuffer, nStartIndex, nBytesToRead - nStartIndex, SocketFlags.None);
if (0 == nBytesRead)
{
bShouldDisconnect = true;
//0 bytes received; assuming disconnect signal
break;
}
nDataRead += nBytesRead;
nStartIndex += nBytesRead;
}
return Encoding.Default.GetString(byteBuffer, 0, nDataRead);
}
Let us know this didn't help you (0: Good luck.
Yes, there is a chance for receiving packets part by part. Hope this msdn article and following example (taken from the article in msdn for quick review) would be helpful to you if you are using windows sockets.
void CChatSocket::OnReceive(int nErrorCode)
{
CSocket::OnReceive(nErrorCode);
DWORD dwReceived;
if (IOCtl(FIONREAD, &dwReceived))
{
if (dwReceived >= dwExpected) // Process only if you have enough data
m_pDoc->ProcessPendingRead();
}
else
{
// Error handling here
}
}
TCP guarantees that they will recieve all 1000 bytes, but not necessarily in order (though, it will appear so to the recieving application) and not necessarily all at once (unless you craft the packet yourself and make it so.).
That said, for a packet as small as 1000 bytes, there is a good chance it'll send in one packet as long as you do it in one call to send, though for larger transmissions it may not.
The only thing that the TCP layer guarantees is that the receiver will receive:
all the bytes transmitted by the sender
in the same order
There are no guarantees at all about how the bytes might be split up into "packets". All the stuff you might read about MTU, packet fragmentation, maximum segment size, or whatever else is all below the layer of TCP sockets, and is irrelevant. TCP provides a stream service only.
With reference to your question, this means that the receiver may receive the first 500 bytes, then the next 500 bytes later. Or, the receiver might receive the data one byte at a time, if that's what it asks for. This is the reason that the recv() function takes a parameter that tells it how much data to return, instead of it telling you how big a packet is.
The transmission control protocol guarantees successful delivery of all packets by requiring acknowledgment of the successful delivery of each packet to the sender by the receiver. By this definition the receiver will always receive the payload in chunks when the size of the payload exceeds the MTU (maximum transmission unit).
For more information please read Transmission Control Protocol.
The IP packets may get fragmented during retransmission.
So the destination machine may receive multiple packets - which will be reassembled back by TCP/IP stack. Depending on the network API you are using - the data will be given to you either reassembled or in RAW packets.
It depends of the stablished MTU (Maximum transfer unit). If your stablished connection (once handshaked) refers to a MTU of 512 bytes you will need two or more TCP packets to send 1000 bytes.
I just read an article that says TCPClient.Read() may not get all the sent bytes in one read. How do you account for this?
For example, the server can write a string to the tcp stream. The client reads half of the string's bytes, and then reads the other half in another read call.
how do you know when you need to combine the byte arrays received in both calls?
how do you know when you need to combine the byte arrays received in both calls?
You need to decide this at the protocol level. There are four common models:
Close-on-finish: each side can only send a single "message" per connection. After sending the message, they close the sending side of the socket. The receiving side keeps reading until it reaches the end of the stream.
Length-prefixing: Before each message, include the number of bytes in the message. This could be in a fixed-length format (e.g. always 4 bytes) or some compressed format (e.g. 7 bits of size data per byte, top bit set for the final byte of size data). Then there's the message itself. The receiving code will read the size, then read that many bytes.
Chunking: Like length-prefixing, but in smaller chunks. Each chunk is length-prefixed, with a final chunk indicating "end of message"
End-of-message signal: Keep reading until you see the terminator for the message. This can be a pain if the message has to be able to include arbitrary data, as you'd need to include an escaping mechanism in order to represent the terminator data within the message.
Additionally, less commonly, there are protocols where each message is always a particular size - in which case you just need to keep going until you've read that much data.
In all of these cases, you basically need to loop, reading data into some sort of buffer until you've got enough of it, however you determine that. You should always use the return value of Read to note how many bytes you actually read, and always check whether it's 0, in which case you've reached the end of the stream.
Also note that this doesn't just affect network streams - for anything other than a local MemoryStream (which will always read as much data as you ask for in one go, if it's in the stream at all), you should assume that data may only become available over the course of multiple calls.
You should call read() in a loop. The condition of that loop would check if there is still any data available to be read.
That is kinda hard to answer, because you can never know when data will arrive, and thats why I usually use a thread for receiving data in my chat program. But you should be able to use something similar to this:
do{
numberOfBytesRead = myNetworkStream.Read(myReadBuffer,
0,
myReadBuffer.Length);
myCompleteMessage.AppendFormat("{0}",
Encoding.ASCII.GetString(myReadBuffer, 0, numberOfBytesRead));
}
while(myNetworkStream.DataAvailable);
Look at this source!
I am using the server-client model for communicating with a hardware board using socket programing.
I receive data from board using "read()" method of "NetworkStream" class which reads a buffer with specified maximum size and returns the length of valid data in buffer. I have considered the maximum size of buffer with a enough big number.
The board sends a set of messages every 100ms. Each message consists a 2-byte constant header and a variable number of bytes as its data after the header bytes.
The problem is that I do not receive the messages one by one! Instead, I receive a buffer may contains 2 or 3 messages or one message is scattered between two buffer.
Currently, I am using a DFA which gather the content of messages using the constant header bytes (We do not know the length of messages, we just know the header bytes) but the problem is that the data bytes may contains the header bytes randomly !!
Is there any efficient way to gather the bytes of each message from buffers using any specific stream or class? How can I overcome to this problem?!
You need to add an additional buffer component between your consumer DFA and the socket client.
Whenever data is avaliable from the NetworkStream the buffer component will read it and append it to its own private buffer, incrementing an "available bytes" counter. The buffer component needs to expose at least the following functionality to its users:
a BytesAvailable property -- this returns the value of the counter
a PeekBytes(int count) method -- this returns the first count bytes of the buffer, if that much is available at least, and does not modify the counter or the buffer
a ReadBytes(int count) method -- as above, but it decrements the counter by count and removes the bytes read from the buffer so that subsequent PeekBytes calls will never read them again
Keep in mind that you don't need to be able to service an arbitrarily high count parameter; it is enough if you can service a count as long as the longest message it would be possible to receive at all times.
Obviously the buffer component needs to keep some kind of data structure that allows "wrapping around" of some kind; you might want to look into a circular (ring) buffer implementation, or you can just use two fixed buffers of size N where N is the length of the longest message and switch from one to the other as they become full. You should be careful so that you stop pulling in data from the NetworkStream if your buffers become full and only continue pulling after the DFA has called ReadBytes to free up some buffer space.
Whenever your DFA needs to read data, it will first ask your buffer stage how much data it has accumulated and then proceed accordingly. It would look something like this:
if BytesAvailable < 2
return; // no header to read, cannot do anything
// peek at the header -- do not remove it from the buffer!
header = PeekBytes(2);
// calculate the full message length based on the header
// if this is not possible from just the header, you might want to do this
// iteratively, or you might want to change the header so that it is possible
length = X;
if BytesAvailable < X
return; // no full message to read, cannot continue
header = ReadBytes(2); // to remove it from the buffer
message = ReadBytes(X); // remove the whole message as well
This way your DFA will only ever deal with whole messages.
This code works:
TcpClient tcpClient = (TcpClient)client;
NetworkStream clientStream = tcpClient.GetStream();
byte[] message = new byte[5242880];
int bytesRead;
bytesRead = clientStream.Read(message, 0, 909699);
But this returns the wrong number of bytes:
bytesRead = clientStream.Read(message, 0, 5242880);
Why? How can I fix it?
(the real data size is 1475186; the code returns the 11043 as the number of bytes)
If this is a TCP based stream, then the answer is that the rest of the data simply didn't arrive yet.
TCP is stream oriented. That means there is no relation between the number of Send/Write calls, and the number of receive events. Multiple writes can be combined together, and single writes can be split.
If you want to work with messages on TCP, you need to implement your own packeting algorithm on top of it. Typical strategies to achieve this are:
Prefix each packed by its length, usual with binary data
Use a separation sequence such as a line-break. Usual with text data.
If you want to read all data in a blocking way you can use loop until DataAvailable is true but a subsequent call to Read returns 0. (Hope I remembered that part correctly, haven't done any network programming in a while)
From MSDN:
The Read operation reads as much data as is available, up to the
number of bytes specified by the size parameter.
I.e. you have to call the Read() method in a loop until you received all data. Have a look at the sample code in MSDN.
You need to loop reading bytes from the message until the Available property on the TCP client or the DataAvailable property of the NetworkStream are 0 (= no more bytes left)
Read the Documentation:
This method reads data into the buffer parameter and returns the
number of bytes successfully read. If no data is available for
reading, the Read method returns 0. The Read operation reads as much
data as is available, up to the number of bytes specified by the size
parameter. If the remote host shuts down the connection, and all
available data has been received, the Read method completes
immediately and return zero bytes.
So it could be because of connection failure that you get each time different number, anyway you can check the result to know if its the reason.
I think the answers already here respond to your specific question quite well, but possibly more generally: If you are trying to send data over a networkStream object for the purposes of network communication check out the open source library, networkComms.net.
If i send 1000 bytes in TCP, does it guarantee that the receiver will get the entire 1000 bytes "togther"? or perhaps he will first only get 500 bytes, and later he'll receive the other bytes?
EDIT: the question comes from the application's point of view. If the 1000 bytes are reassembles into a single buffer before they reach the application .. then i don't care if it was fragmented in the way..
See Transmission Control Protocol:
TCP provides reliable, ordered delivery of a stream of bytes from a program on one computer to another program on another computer.
A "stream" means that there is no message boundary from the receiver's point of view. You could get one 1000 byte message or one thousand 1 byte messages depending on what's underneath and how often you call read/select.
Edit: Let me clarify from the application's point of view. No, TCP will not guarantee that the single read would give you all of the 1000 bytes (or 1MB or 1GB) packet the sender may have sent. Thus, a protocol above the TCP usually contains fixed length header with the total content length in it. For example you could always send 1 byte that indicates the total length of the content in bytes, which would support up to 255 bytes.
As other answers indicated, TCP is a stream protocol -- every byte sent will be received (once and in the same order), but there are no intrinsic "message boundaries" -- whether all bytes are sent in a single .send call, or multiple ones, they might still be received in one or multiple .receive calls.
So, if you need "message boundaries", you need to impose them on top of the TCP stream, IOW, essentially, at application level. For example, if you know the bytes you're sending will never contain a \0, null-terminated strings work fine; various methods of "escaping" let you send strings of bytes which obey no such limitations. (There are existing protocols for this but none is really widespread or widely accepted).
Basically as far as TCP goes it only guarantees that the data sent from one end to the other end will be sent in the same order.
Now usually what you'll have to do is have an internal buffer that keeps looping until it has received your 1000 byte "packet".
Because the recv command as mentioned returns how much has actually been received.
So usually you'll have to then implement a protocol on top of TCP to make sure you send data at an appropriate speed. Because if you send() all the data in one run through it will overload the under lying networking stack, and which will cause complications.
So usually in the protocol there is a tiny acknowledgement packet sent back to confirm that the packet of 1000 bytes are sent.
You decide, in your message that how many bytes your message shall contain. For instance in your case its 1000. Following is up and running C# code to achieve the same. The method returns with 1000 bytes. The abort code is 0 bytes; you can tailor that according to your needs.
Usage:
strMsg = ReadData(thisTcpClient.Client, 1000, out bDisconnected);
Following is the method:
string ReadData(Socket sckClient, int nBytesToRead, out bool bShouldDisconnect)
{
bShouldDisconnect = false;
byte[] byteBuffer = new byte[nBytesToRead];
Array.Clear(byteBuffer, 0, byteBuffer.Length);
int nDataRead = 0;
int nStartIndex = 0;
while (nDataRead < nBytesToRead)
{
int nBytesRead = sckClient.Receive(byteBuffer, nStartIndex, nBytesToRead - nStartIndex, SocketFlags.None);
if (0 == nBytesRead)
{
bShouldDisconnect = true;
//0 bytes received; assuming disconnect signal
break;
}
nDataRead += nBytesRead;
nStartIndex += nBytesRead;
}
return Encoding.Default.GetString(byteBuffer, 0, nDataRead);
}
Let us know this didn't help you (0: Good luck.
Yes, there is a chance for receiving packets part by part. Hope this msdn article and following example (taken from the article in msdn for quick review) would be helpful to you if you are using windows sockets.
void CChatSocket::OnReceive(int nErrorCode)
{
CSocket::OnReceive(nErrorCode);
DWORD dwReceived;
if (IOCtl(FIONREAD, &dwReceived))
{
if (dwReceived >= dwExpected) // Process only if you have enough data
m_pDoc->ProcessPendingRead();
}
else
{
// Error handling here
}
}
TCP guarantees that they will recieve all 1000 bytes, but not necessarily in order (though, it will appear so to the recieving application) and not necessarily all at once (unless you craft the packet yourself and make it so.).
That said, for a packet as small as 1000 bytes, there is a good chance it'll send in one packet as long as you do it in one call to send, though for larger transmissions it may not.
The only thing that the TCP layer guarantees is that the receiver will receive:
all the bytes transmitted by the sender
in the same order
There are no guarantees at all about how the bytes might be split up into "packets". All the stuff you might read about MTU, packet fragmentation, maximum segment size, or whatever else is all below the layer of TCP sockets, and is irrelevant. TCP provides a stream service only.
With reference to your question, this means that the receiver may receive the first 500 bytes, then the next 500 bytes later. Or, the receiver might receive the data one byte at a time, if that's what it asks for. This is the reason that the recv() function takes a parameter that tells it how much data to return, instead of it telling you how big a packet is.
The transmission control protocol guarantees successful delivery of all packets by requiring acknowledgment of the successful delivery of each packet to the sender by the receiver. By this definition the receiver will always receive the payload in chunks when the size of the payload exceeds the MTU (maximum transmission unit).
For more information please read Transmission Control Protocol.
The IP packets may get fragmented during retransmission.
So the destination machine may receive multiple packets - which will be reassembled back by TCP/IP stack. Depending on the network API you are using - the data will be given to you either reassembled or in RAW packets.
It depends of the stablished MTU (Maximum transfer unit). If your stablished connection (once handshaked) refers to a MTU of 512 bytes you will need two or more TCP packets to send 1000 bytes.