Sorry for such a vague title, I really dont know what to title this issue.
Basically when I get a stream thats chunked as told by Transfer-Encoding, I then do the following code:
private IEnumerable<byte[]> ReceiveMessageBodyChunked() {
readChunk:
#region Read a line from the Stream which should be a Block Length (Chunk Body Length)
string blockLength = _receiverHelper.ReadLine();
#endregion
#region If the end of the block is reached, re-read from the stream
if (blockLength == Http.NewLine) {
goto readChunk;
}
#endregion
#region Trim it so it should end up with JUST the number
blockLength = blockLength.Trim(' ', '\r', '\n');
#endregion
#region If the end of the message body is reached
if (blockLength == string.Empty) {
yield break;
}
#endregion
int blockLengthInt = 0;
#region Convert the Block Length String to an Int32 base16 (hex)
try {
blockLengthInt = Convert.ToInt32(blockLength, 16);
} catch (Exception ex) {
if (ex is FormatException || ex is OverflowException) {
throw new Exception(string.Format(ExceptionValues.HttpException_WrongChunkedBlockLength, blockLength), ex);
}
throw;
}
#endregion
// If the end of the message body is reached.
if (blockLengthInt == 0) {
yield break;
}
byte[] buffer = new byte[blockLengthInt];
int totalBytesRead = 0;
while (totalBytesRead != blockLengthInt) {
int length = blockLengthInt - totalBytesRead;
int bytesRead = _receiverHelper.HasData ? _receiverHelper.Read(buffer, 0, length) : _request.ClientStream.Read(buffer, 0, length);
if (bytesRead == 0) {
WaitData();
continue;
}
totalBytesRead += bytesRead;
System.Windows.Forms.MessageBox.Show("Chunk Length: " + blockLengthInt + "\nBytes Read/Total:" + bytesRead + "/" + totalBytesRead + "\n\n" + Encoding.ASCII.GetString(buffer));
yield return buffer;
}
goto readChunk;
}
What this is doing is reading 1 line of data from the stream which should be the Chunk's Length, does some checks here and there but eventually converts that to a Int32 Radix16 integer.
From there it essentially creates a byte buffer of that int32 as its length size.
It then just keeps reading from the stream until its read the same amount as the Int32 we converted.
This works splendid, however, for whatever reason, its responding incorrectly on the last read.
It will read the exact amount of bytes as the chunk length perfectly fine, and all data I expect is read. BUT it's ALSO reading again another small chunk of data that was ALREADY read at the very end, resulting in lets say all data from <!DOCTYPE html> down to </html> ASWELL as some data from inside somewhere like <form> e.t.c
Here's an example of what occured:
As you can see, the highlighted red text should NOT have been returned from the read! It should have ended at </html>.
Why is the Chunk's Length lying to me and how can I find the proper size to read at?
I'm not familiar with C# but if I understand your code and the semantics of Read in C# correctly (which seem to be similar to read in C) then the problem is that you are using the same buffer again and again without resetting it first:
byte[] buffer = new byte[blockLengthInt];
int totalBytesRead = 0;
while (totalBytesRead != blockLengthInt) {
int length = blockLengthInt - totalBytesRead;
int bytesRead = _receiverHelper.HasData ? _receiverHelper.Read(buffer, 0, length) : _request.ClientStream.Read(buffer, 0, length);
...
totalBytesRead += bytesRead;
...
yield return buffer;
}
To make some example of what goes wrong here: assume that the chunk size is 10, the content you read is 0123456789 and the first read will return 6 bytes and the second read the remaining 4 bytes. In this case your buffer will be 012345 after the first read and 567845 after the second read. These 45 at the end of the buffer remain from the previous read since you only replaced the first 4 bytes in the buffer but kept the rest.
Whats odd AF is that if I hand the request to another TCPStream proxied (127.0.0.1:8888 as a proxy which is fiddler) it works perfectly fine...
Fiddler is a proxy and might change how the response gets transferred. For example it might use Content-length instead of chunked encoding or it might use smaller chunks so that you always get the full chunk with the first read.
Related
So what i'm doing is:
int blockLength;
try {
blockLength = Convert.ToInt32(line, 16);
} catch (Exception ex) {
if (ex is FormatException || ex is OverflowException) {
throw new Exception("WrongChunkedBlockLength", ex);
}
throw;
}
if (blockLength == 0) {
yield break;
}
(where line = a line read from the stream that is expected to be the chunk length)
This works fine but for whatever reason im getting returned a value thats lower than what the actual chunk size is. What I mean by this, is im sending a request to Hotels.com and when I get the "chunk size" and then read from the stream with the chunk size as the length/count it reads it all, but theres actually still further data to read for that chunk.
Since the code now thinks it read the full chunk, it loops to continue the next chunk only to get hit with first chunks data.
This is how im reading from the stream:
int totalBytesRead = 0;
byte[] buffer = new byte[blockLength];
while (totalBytesRead < blockLength) {
int read = stream.Read(buffer, totalBytesRead, blockLength);
if (read == 0) {
WaitData();
continue;
}
totalBytesRead += read;
}
and then right after that this is what I do to further test the issue:
buffer = new byte[90];
while (true) {
stream.Read(buffer, 0, 10);
MessageBox.Show(Encoding.ASCII.GetString(buffer));
}
TCP is stream-based protocol. To convert that stream into my messages, I send the size of each message with the message itself. At server side, I first read the first two bytes of message, which have the size. Then I create a byte array, of size equal to the size which was just read. Then I read the bytes into that array. But for some reason, more bytes are being read than specified. How can I read exactly the same number of bytes as I specify?
Here is my code:
while (true)
{
data = null;
length = null;
size = new byte[2];
handler.Receive(size);
length += Encoding.ASCII.GetString(size, 0, 2);
System.Console.WriteLine("Size: " + Int32.Parse(length));
bufferSize = Int32.Parse(length) + 2;
bytes = new byte[bufferSize];
handler.Receive(bytes);
data += Encoding.ASCII.GetString(bytes, 0, bufferSize);
System.Console.WriteLine("Data: " + data);
}
This is my server running in Windows PC, written in C#. My client is running in android phone, written in Java.
It's unclear why you're adding two to the size that's been transmitted - you've already accounted for the two additional bytes for storing the length during your previous receive. So I'd get rid of the +2.
You also need to respect the fact already stated in your question - TCP is a sequence of bytes, not messages. As such, you're never guaranteed whether a call to Receive is going to retrieve an entire "message" or just part of one (or, possible, parts of multiple messages). As such, you need to make sure that you respect the return value from Receive.
We can probably re-write your code as:
while (true)
{
data = null;
length = null;
size = ReceiveExactly(handler,2);
length = Encoding.ASCII.GetString(size, 0, 2); //Why +=?
bufferSize = Int32.Parse(length); //Why + 2?
System.Console.WriteLine("Size: " + bufferSize);
bytes = ReceiveExactly(handler,bufferSize);
data += Encoding.ASCII.GetString(bytes, 0, bufferSize);
System.Console.WriteLine("Data: " + data);
}
Where ReceiveExactly is defined something like this:
private byte[] ReceiveExactly(Socket handler, int length)
{
var buffer = new byte[length];
var receivedLength = 0;
while(receivedLength < length)
{
var nextLength = handler.Receive(buffer,receivedLength,length-receivedLength);
if(nextLength==0)
{
//Throw an exception? Something else?
//The socket's never going to receive more data
}
receivedLength += nextLength;
}
return buffer;
}
to receive a specific amount of bytes use the method
Socket.Receive(Byte[], Int32, Int32, SocketFlags)
rather than Socket.Receive(Byte[]). see spec here
I suspect you want something like
int len = Socket.Receive(bytes, 0, bufferSize, SocketFlags.None);
data += Encoding.ASCII.GetString(bytes, 0, len);
System.Console.WriteLine("Data: " + data);
I am currently working on a networking project where I worked out a binary protocol. My packets look like this:
[1 byte TYPE][2 bytes INDEX][2 bytes LENGTH][LENGTH bytes DATA]
And here's the code where I am receiving the packets:
NetworkStream clientStream= Client.GetStream();
while (Client.Connected)
{
Thread.Sleep(10);
try
{
if (clientStream.DataAvailable)
{
byte[] infobuffer = new byte[5];
int inforead = clientStream.Read(infobuffer, 0, 5);
if (inforead < 5) { continue; }
byte[] rawclient = new byte[2];
Array.Copy(infobuffer, 1, rawclient, 0, 2);
PacketType type = (PacketType)Convert.ToSByte(infobuffer[0]);
int clientIndex = BitConverter.ToInt16(rawclient, 0);
int readLength = BitConverter.ToInt16(infobuffer, 3);
byte[] readbuffer = new byte[readLength];
int count_read = clientStream.Read(readbuffer, 0, readLength);
byte[] read_data = new byte[count_read];
Array.Copy(readbuffer, read_data, count_read);
HandleData(read_data, type, clientIndex);
}
}
catch (Exception ex)
{
Console.ForegroundColor = ConsoleColor.Red;
Console.WriteLine("[E] " + ex.GetType().ToString());
Console.ResetColor();
break;
}
}
Well, and everything works fine... as long as I run it on 127.0.0.1. As soon as I try testing it over long distance, packets somehow get lost, and I am getting an overflow-exception on the line where I convert the first byte to PacketType. Also, if I try to convert the other values to int16, I get very strange values.
I assume the stream somehow looses some bytes on its way to the server, but can this be? Or is it just a little mistake of mine somewhere in the code?
edit:
I now edited the code, now it reads till it gets its 5 bytes. But I still get the same exception over long distance...
NetworkStream clientStream = Client.GetStream();
while (Client.Connected)
{
Thread.Sleep(10);
try
{
if (clientStream.DataAvailable)
{
int totalread = 0;
byte[] infobuffer = new byte[5];
while (totalread < 5)
{
int inforead = clientStream.Read(infobuffer, totalread, 5 - totalread);
if (inforead == 0)
{ break; }
totalread += inforead;
}
byte[] rawclient = new byte[2];
Array.Copy(infobuffer, 1, rawclient, 0, 2);
PacketType type = (PacketType)Convert.ToSByte(infobuffer[0]);
int clientIndex = BitConverter.ToInt16(rawclient, 0);
int readLength = BitConverter.ToInt16(infobuffer, 3);
byte[] readbuffer = new byte[readLength];
int count_read = clientStream.Read(readbuffer, 0, readLength);
byte[] read_data = new byte[count_read];
Array.Copy(readbuffer, read_data, count_read);
HandleData(read_data, type, clientIndex);
}
}
catch (Exception ex)
{
Console.ForegroundColor = ConsoleColor.Red;
Console.WriteLine("[E] " + ex.GetType().ToString());
Console.ResetColor();
break;
}
}
PacketType is an enum:
public enum PacketType
{
AddressSocks5 = 0,
Status = 1,
Data = 2,
Disconnect = 3,
AddressSocks4 = 4
}
So many things you're doing wrong here... so many bugs... where to even start...
First Network polling? Really? That's just a naïve way of doing network activity in this day and age.. but I won't harp on that.
Second, with this type of protocol, it's pretty easy to get "out of sync" and once you do, you have no way to get back in sync. This is typically accomplished with some kind of "framing protocol" which provides a unique sequence of bytes that you can use to indicate the start and end of a frame, so that if you ever find yourself out of sync you can read data until you get back in sync. Yes, you will lose data, but you've already lost it if you're out of sync.
Third, you're not really doing anything huge here, so I shamelessly stole the "ReadWholeArray" code from here, it's not the most efficient, but it works and there is other code there that might help:
http://www.yoda.arachsys.com/csharp/readbinary.html
Note: you don't mention how you are serializing the length, type and index values on the other side. So using the BitConverter may be the wrong thing depending on how that was done.
if (clientStream.DataAvailable)
{
byte[] data = new byte[5];
// if it can't read all 5 bytes, it throws an exception
ReadWholeArray(clientStream, data);
PacketType type = (PacketType)Convert.ToSByte(data[0]);
int clientIndex = BitConverter.ToInt16(data, 1);
int readLength = BitConverter.ToInt16(data, 3);
byte[] rawdata = new byte[readLength];
ReadWholeArray(clientStream, rawdata);
HandleData(rawdata, type, clientIndex);
}
/// <summary>
/// Reads data into a complete array, throwing an EndOfStreamException
/// if the stream runs out of data first, or if an IOException
/// naturally occurs.
/// </summary>
/// <param name="stream">The stream to read data from</param>
/// <param name="data">The array to read bytes into. The array
/// will be completely filled from the stream, so an appropriate
/// size must be given.</param>
public static void ReadWholeArray (Stream stream, byte[] data)
{
int offset=0;
int remaining = data.Length;
while (remaining > 0)
{
int read = stream.Read(data, offset, remaining);
if (read <= 0)
throw new EndOfStreamException
(String.Format("End of stream reached with {0} bytes left to read", remaining));
remaining -= read;
offset += read;
}
}
I think the problem is in these lines
int inforead = clientStream.Read(infobuffer, 0, 5);
if (inforead < 5) { continue; }
what happen to your previously read data if the length is under 5 byte? you should save the bytes you have read so far and append next bytes so you can have the header completely
You Read 5 - totalRead.
let totalRead equal 5 or more. When that happens you read nothing, and in cases of 1 - 4 you read that many arbitrary bytes. Not 5. You also then discard any result of less then 5.
You also copy at a offset 1 or another offset without really knowing the offset.
BitConverter.ToInt16(infobuffer, 3);
Is an example of this, what is at off 2?
So if it's not that (decoding error) and and not the structure of your data then unless you change the structure of your loop its you who's losing the bytes not the NetworkStream.
Calculate totalRead by increments of justRead when you recieve so you can handle any size of data as well as receiving it at the correct offset.
This is C# related. We have a case where we need to copy the entire source stream into a destination stream except for the last 16 bytes.
EDIT: The streams can range upto 40GB, so can't do some static byte[] allocation (eg: .ToArray())
Looking at the MSDN documentation, it seems that we can reliably determine the end of stream only when the return value is 0. Return values between 0 and the requested size can imply bytes are "not currently available" (what does that really mean?)
Currently it copies every single byte as follows. inStream and outStream are generic - can be memory, disk or network streams (actually some more too).
public static void StreamCopy(Stream inStream, Stream outStream)
{
var buffer = new byte[8*1024];
var last16Bytes = new byte[16];
int bytesRead;
while ((bytesRead = inStream.Read(buffer, 0, buffer.Length)) > 0)
{
outStream.Write(buffer, 0, bytesRead);
}
// Issues:
// 1. We already wrote the last 16 bytes into
// outStream (possibly over the n/w)
// 2. last16Bytes = ? (inStream may not necessarily support rewinding)
}
What is a reliable way to ensure all but the last 16 are copied? I can think of using Position and Length on the inStream but there is a gotcha on MSDN that says
If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException. .
Read between 1 and n bytes from the input stream.1
Append the bytes to a circular buffer.2
Write the first max(0, b - 16) bytes from the circular buffer to the output stream, where b is the number of bytes in the circular buffer.
Remove the bytes that you just have written from the circular buffer.
Go to step 1.
1This is what the Read method does – if you call int n = Read(buffer, 0, 500); it will read between 1 and 500 bytes into buffer and return the number of bytes read. If Read returns 0, you have reached the end of the stream.
2For maximum performance, you can read the bytes directly from the input stream into the circular buffer. This is a bit tricky, because you have to deal with the wraparound within the array underlying the buffer.
The following solution is fast and tested. Hope it's useful. It uses the double buffering idea you already had in mind. EDIT: simplified loop removing the conditional that separated the first iteration from the rest.
public static void StreamCopy(Stream inStream, Stream outStream) {
// Define the size of the chunk to copy during each iteration (1 KiB)
const int blockSize = 1024;
const int bytesToOmit = 16;
const int buffSize = blockSize + bytesToOmit;
// Generate working buffers
byte[] buffer1 = new byte[buffSize];
byte[] buffer2 = new byte[buffSize];
// Initialize first iteration
byte[] curBuffer = buffer1;
byte[] prevBuffer = null;
int bytesRead;
// Attempt to fully fill the buffer
bytesRead = inStream.Read(curBuffer, 0, buffSize);
if( bytesRead == buffSize ) {
// We succesfully retrieved a whole buffer, we will output
// only [blockSize] bytes, to avoid writing to the last
// bytes in the buffer in case the remaining 16 bytes happen to
// be the last ones
outStream.Write(curBuffer, 0, blockSize);
} else {
// We couldn't retrieve the whole buffer
int bytesToWrite = bytesRead - bytesToOmit;
if( bytesToWrite > 0 ) {
outStream.Write(curBuffer, 0, bytesToWrite);
}
// There's no more data to process
return;
}
curBuffer = buffer2;
prevBuffer = buffer1;
while( true ) {
// Attempt again to fully fill the buffer
bytesRead = inStream.Read(curBuffer, 0, buffSize);
if( bytesRead == buffSize ) {
// We retrieved the whole buffer, output first the last 16
// bytes of the previous buffer, and output just [blockSize]
// bytes from the current buffer
outStream.Write(prevBuffer, blockSize, bytesToOmit);
outStream.Write(curBuffer, 0, blockSize);
} else {
// We could not retrieve a complete buffer
if( bytesRead <= bytesToOmit ) {
// The bytes to output come solely from the previous buffer
outStream.Write(prevBuffer, blockSize, bytesRead);
} else {
// The bytes to output come from the previous buffer and
// the current buffer
outStream.Write(prevBuffer, blockSize, bytesToOmit);
outStream.Write(curBuffer, 0, bytesRead - bytesToOmit);
}
break;
}
// swap buffers for next iteration
byte[] swap = prevBuffer;
prevBuffer = curBuffer;
curBuffer = swap;
}
}
static void Assert(Stream inStream, Stream outStream) {
// Routine that tests the copy worked as expected
inStream.Seek(0, SeekOrigin.Begin);
outStream.Seek(0, SeekOrigin.Begin);
Debug.Assert(outStream.Length == Math.Max(inStream.Length - bytesToOmit, 0));
for( int i = 0; i < outStream.Length; i++ ) {
int byte1 = inStream.ReadByte();
int byte2 = outStream.ReadByte();
Debug.Assert(byte1 == byte2);
}
}
A much easier solution to code, yet slower since it would work at a byte level, would be to use an intermediate queue between the input stream and the output stream. The process would first read and enqueue 16 bytes from the input stream. Then it would iterate over the remaining input bytes, reading a single byte from the input stream, enqueuing it and then dequeuing a byte. The dequeued byte would be written to the output stream, until all bytes from the input stream are processed. The unwanted 16 bytes should linger in the intermediate queue.
Hope this helps!
=)
Use a circular buffer sounds great but there is no circular buffer class in .NET which means additional code anyways. I ended up with the following algorithm, a sort of map and copy - I think it's simple. The variable names are longer than usual for the sake of being self descriptive here.
This flows thru the buffers as
[outStream] <== [tailBuf] <== [mainBuf] <== [inStream]
public byte[] CopyStreamExtractLastBytes(Stream inStream, Stream outStream,
int extractByteCount)
{
//var mainBuf = new byte[1024*4]; // 4K buffer ok for network too
var mainBuf = new byte[4651]; // nearby prime for testing
int mainBufValidCount;
var tailBuf = new byte[extractByteCount];
int tailBufValidCount = 0;
while ((mainBufValidCount = inStream.Read(mainBuf, 0, mainBuf.Length)) > 0)
{
// Map: how much of what (passthru/tail) lives where (MainBuf/tailBuf)
// more than tail is passthru
int totalPassthruCount = Math.Max(0, tailBufValidCount +
mainBufValidCount - extractByteCount);
int tailBufPassthruCount = Math.Min(tailBufValidCount, totalPassthruCount);
int tailBufTailCount = tailBufValidCount - tailBufPassthruCount;
int mainBufPassthruCount = totalPassthruCount - tailBufPassthruCount;
int mainBufResidualCount = mainBufValidCount - mainBufPassthruCount;
// Copy: Passthru must be flushed per FIFO order (tailBuf then mainBuf)
outStream.Write(tailBuf, 0, tailBufPassthruCount);
outStream.Write(mainBuf, 0, mainBufPassthruCount);
// Copy: Now reassemble/compact tail into tailBuf
var tempResidualBuf = new byte[extractByteCount];
Array.Copy(tailBuf, tailBufPassthruCount, tempResidualBuf, 0,
tailBufTailCount);
Array.Copy(mainBuf, mainBufPassthruCount, tempResidualBuf,
tailBufTailCount, mainBufResidualCount);
tailBufValidCount = tailBufTailCount + mainBufResidualCount;
tailBuf = tempResidualBuf;
}
return tailBuf;
}
I'm trying to read the response stream from an HttpWebResponse object. I know the length of the stream (_response.ContentLength) however I keep getting the following exception:
Specified argument was out of the range of valid values.
Parameter name: size
While debugging, I noticed that at the time of the error, the values were as such:
length = 15032 //the length of the stream as defined by _response.ContentLength
bytesToRead = 7680 //the number of bytes in the stream that still need to be read
bytesRead = 7680 //the number of bytes that have been read (offset)
body.length = 15032 //the size of the byte[] the stream is being copied to
The peculiar thing is that the bytesToRead and bytesRead variables are ALWAYS 7680, regardless of the size of the stream (contained in the length variable). Any ideas?
Code:
int length = (int)_response.ContentLength;
byte[] body = null;
if (length > 0)
{
int bytesToRead = length;
int bytesRead = 0;
try
{
body = new byte[length];
using (Stream stream = _response.GetResponseStream())
{
while (bytesToRead > 0)
{
// Read may return anything from 0 to length.
int n = stream.Read(body, bytesRead, length);
// The end of the file is reached.
if (n == 0)
break;
bytesRead += n;
bytesToRead -= n;
}
stream.Close();
}
}
catch (Exception exception)
{
throw;
}
}
else
{
body = new byte[0];
}
_responseBody = body;
You want this line:
int n = stream.Read(body, bytesRead, length);
to be this:
int n = stream.Read(body, bytesRead, bytesToRead);
You are saying the maximum number of bytes to read is the stream's length, but it isn't since it is actually only the remaining information in the stream after the offset has been applied.
You also shouldn't need this part:
if (n == 0)
break;
The while should end the reading correctly, and it is possible that you won't read any bytes before you have finished the whole thing (if the stream is filling slower than you are taking the data out of it)