I tried to understand the MSDN example for NetworkStream.EndRead(). There are some parts that i do not understand.
So here is the example (copied from MSDN):
// Example of EndRead, DataAvailable and BeginRead.
public static void myReadCallBack(IAsyncResult ar ){
NetworkStream myNetworkStream = (NetworkStream)ar.AsyncState;
byte[] myReadBuffer = new byte[1024];
String myCompleteMessage = "";
int numberOfBytesRead;
numberOfBytesRead = myNetworkStream.EndRead(ar);
myCompleteMessage =
String.Concat(myCompleteMessage, Encoding.ASCII.GetString(myReadBuffer, 0, numberOfBytesRead));
// message received may be larger than buffer size so loop through until you have it all.
while(myNetworkStream.DataAvailable){
myNetworkStream.BeginRead(myReadBuffer, 0, myReadBuffer.Length,
new AsyncCallback(NetworkStream_ASync_Send_Receive.myReadCallBack),
myNetworkStream);
}
// Print out the received message to the console.
Console.WriteLine("You received the following message : " +
myCompleteMessage);
}
It uses BeginRead() and EndRead() to read asynchronously from the network stream.
The whole thing is invoked by calling
myNetworkStream.BeginRead(someBuffer, 0, someBuffer.Length, new AsyncCallback(NetworkStream_ASync_Send_Receive.myReadCallBack), myNetworkStream);
somewhere else (not displayed in the example).
What I think it should do is print the whole message received from the NetworkStream in a single WriteLine (the one at the end of the example). Notice that the string is called myCompleteMessage.
Now when I look at the implementation some problems arise for my understanding.
First of all: The example allocates a new method-local buffer myReadBuffer. Then EndStream() is called which writes the received message into the buffer that BeginRead() was supplied. This is NOT the myReadBuffer that was just allocated. How should the network stream know of it? So in the next line numberOfBytesRead-bytes from the empty buffer are appended to myCompleteMessage. Which has the current value "". In the last line this message consisting of a lot of '\0's is printed with Console.WriteLine.
This doesn't make any sense to me.
The second thing I do not understand is the while-loop.
BeginRead is an asynchronous call. So no data is immediately read. So as I understand it, the while loop should run quite a while until some asynchronous call is actually executed and reads from the stream so that there is no data available any more. The documentation doesn't say that BeginRead immediately marks some part of the available data as being read, so I do not expect it to do so.
This example does not improve my understanding of those methods. Is this example wrong or is my understanding wrong (I expect the latter)? How does this example work?
I think the while loop around the BeginRead shouldn't be there. You don't want to execute the BeginRead more than ones before the EndRead is done. Also the buffer needs to be specified outside the BeginRead, because you may use more than one reads per packet/buffer.
There are some things you need to think about, like how long are my messages/blocks (fixed size). Shall I prefix it with a length. (variable size) <datalength><data><datalength><data>
Don't forget it is a Streaming connection, so multiple/partial messages/packets can be read in one read.
Pseudo example:
int bytesNeeded;
int bytesRead;
public void Start()
{
bytesNeeded = 40; // u need to know how much bytes you're needing
bytesRead = 0;
BeginReading();
}
public void BeginReading()
{
myNetworkStream.BeginRead(
someBuffer, bytesRead, bytesNeeded - bytesRead,
new AsyncCallback(EndReading),
myNetworkStream);
}
public void EndReading(IAsyncResult ar)
{
numberOfBytesRead = myNetworkStream.EndRead(ar);
if(numberOfBytesRead == 0)
{
// disconnected
return;
}
bytesRead += numberOfBytesRead;
if(bytesRead == bytesNeeded)
{
// Handle buffer
Start();
}
else
BeginReading();
}
Related
I've been working with windows app store programming in c# recently, and I've come across a problem with sockets.
I need to be able to read data with an unknown length from a DataReader().
It sounds simple enough, but I've not been able to find a solution after a few days of searching.
Here's my current receiving code (A little sloppy, need to clean it up after I find a solution to this problem. And yes, a bit of this is from the Microsoft example)
DataReader reader = new DataReader(args.Socket.InputStream);
try
{
while (true)
{
// Read first 4 bytes (length of the subsequent string).
uint sizeFieldCount = await reader.LoadAsync(sizeof(uint));
if (sizeFieldCount != sizeof(uint))
{
// The underlying socket was closed before we were able to read the whole data.
return;
}
reader.InputStreamOptions
// Read the string.
uint stringLength = reader.ReadUInt32();
uint actualStringLength = await reader.LoadAsync(stringLength);
if (stringLength != actualStringLength)
{
// The underlying socket was closed before we were able to read the whole data.
return;
}
// Display the string on the screen. The event is invoked on a non-UI thread, so we need to marshal
// the text back to the UI thread.
//MessageBox.Show("Received data: " + reader.ReadString(actualStringLength));
MessageBox.updateList(reader.ReadString(actualStringLength));
}
}
catch (Exception exception)
{
// If this is an unknown status it means that the error is fatal and retry will likely fail.
if (SocketError.GetStatus(exception.HResult) == SocketErrorStatus.Unknown)
{
throw;
}
MessageBox.Show("Read stream failed with error: " + exception.Message);
}
You are going down the right lines - read the first INT to find out how many bytes are to be sent.
Franky Boyle is correct - without a signalling mechanism it is impossible to ever know the length of a stream. Thats why it is called a stream!
NO socket implementation (including the WinSock) will ever be clever enough to know when a client has finished sending data. The client could be having a cup of tea half way through sending the data!
Your server and its sockets will never know! What are they going to do? Wait forever? I suppose they could wait until the client had 'closed' the connection? But your client could have had a blue screen and the server will never get that TCP close packet, it will just be sitting there thinking it is getting more data one day?
I have never used a DataReader - i have never even heard of that class! Use NetworkStream instead.
From my memory I have written code like this in the past. I am just typing, no checking of syntax.
using(MemoryStream recievedData = new MemoryStream())
{
using(NetworkStream networkStream = new NetworkStream(connectedSocket))
{
int totalBytesToRead = networkStream.ReadByte();
// This is your mechanism to find out how many bytes
// the client wants to send.
byte[] readBuffer = new byte[1024]; // Up to you the length!
int totalBytesRead = 0;
int bytesReadInThisTcpWindow = 0;
// The length of the TCP window of the client is usually
// the number of bytes that will be pushed through
// to your server in one SOCKET.READ method call.
// For example, if the clients TCP window was 777 bytes, a:
// int bytesRead =
// networkStream.Read(readBuffer, 0, int.Max);
// bytesRead would be 777.
// If they were sending a large file, you would have to make
// it up from the many 777s.
// If it were a small file under 777 bytes, your bytesRead
// would be the total small length of say 500.
while
(
(
bytesReadInThisTcpWindow =
networkStream.Read(readBuffer, 0, readBuffer.Length)
) > 0
)
// If the bytesReadInThisTcpWindow = 0 then the client
// has disconnected or failed to send the promised number
// of bytes in your Windows server internals dictated timeout
// (important to kill it here to stop lots of waiting
// threads killing your server)
{
recievedData.Write(readBuffer, 0, bytesReadInThisTcpWindow);
totalBytesToRead = totalBytesToRead + bytesReadInThisTcpWindow;
}
if(totalBytesToRead == totalBytesToRead)
{
// We have our data!
}
}
}
I've implemented an Asynchronous Client Socket extremely similar to this example. Is there any reason why I cannot dramatically increase this buffer size? In this example the buffer size is 256 bytes. In many cases, my application ends up receiving data that is 5,000++ bytes of data. Should I increase the buffer size? Are there any reasons why I should NOT increase the buffer size?
Every once in a long while I'll get some issue where the data comes in out of order or a chunk is missing (yet to be confirmed exactly which it is). For example, one time I received some corrupt data that looks like this
Slice Id="0" layotartX='100'
The attribute called layotartX does not exist in my data, it was supposed to say layout=... but instead the layout got cut off and other data was appended to it later. I counted the bytes and noticed that it was cut off at exactly 256 bytes which just so happens to be my buffer size. It's very possible that increasing my buffer size would prevent this problem from happening (data coming in out of order??). Anyways, as stated in the 1st paragraph, I'm just asking if there is any reason I should NOT increase the buffer size to be say like 5,000 bytes or even 10,000 bytes.
Adding some code. Below is my modified ReceiveCallback function (see the linked example code above for the rest of the classes. When the ReceiveCallback receives data, it calls the "ReceiveSomeData" function which I've also posted below. For some reason every once in a while I get data out of order or pieces missing. The "ReceiveSomeData" function is in a class called "MyChitterChatter" and the "ReceiveCallback" function is in a class called "AsyncClient". So when you see the ReceiveSomeData function locking "this", it's locking the MyChitterChatter class. Is there were my problem could by lying?
private static void ReceiveCallback( IAsyncResult ar )
{
AppDelegate appDel = (AppDelegate)UIApplication.SharedApplication.Delegate;
try {
// Retrieve the state object and the client socket
// from the asynchronous state object.
StateObject state = (StateObject) ar.AsyncState;
Socket client = state.workSocket;
// Read data from the remote device.
int bytesRead = client.EndReceive(ar);
if (bytesRead > 0) {
// There might be more data, so store the data received so far.
string stuffWeReceived = Encoding.ASCII.GetString(state.buffer,0,bytesRead);
string debugString = "~~~~~ReceiveCallback~~~~~~ " + stuffWeReceived + " len = " + stuffWeReceived.Length + " bytesRead = " + bytesRead;
Console.WriteLine(debugString);
// Send this data to be received
appDel.wallInteractionScreen.ChitterChatter.ReceiveSomeData(stuffWeReceived);
// Get the rest of the data.
client.BeginReceive(state.buffer,0,StateObject.BufferSize,0,
new AsyncCallback(ReceiveCallback), state);
} else {
// Signal that all bytes have been received.
receiveDone.Set();
}
}
catch (Exception e) {
Console.WriteLine("Error in AsyncClient ReceiveCallback: ");
Console.WriteLine(e.Message);
Console.WriteLine(e.StackTrace);
}
}
public void ReceiveSomeData ( string data )
{
lock(this)
{
DataList_New.Add(data);
// Update the keepalive when we receive ANY data at all
IsConnected = true;
LastDateTime_KeepAliveReceived = DateTime.Now;
}
}
Yes, you absolutely should increase the buffer size to something much closer to what you expect to get in a single read. 32k or 64k would be fine choice for most uses.
Having said that, data never comes in "out of order" or "missing a chunk" if you're using a TCP/IP socket; if you see something like that, it's a bug in your code, not a bug in the socket. Share your code if you want help.
I am writing a telnet server using the async Begin/End methods. The issue that I am having is determining what within my buffer is actual data and what is not. Network coding is a bit new to me, but I've tried to research this and have not been able to find a answer.
public bool Start(IGame game)
{
// Get our server address information.
IPHostEntry serverHost = Dns.GetHostEntry(Dns.GetHostName());
IPEndPoint serverEndPoint = new IPEndPoint(IPAddress.Any, this.Port);
// Instance the server socket, bind it to a port and begin listening for connections.
this._ServerSocket = new Socket(serverEndPoint.Address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
this._ServerSocket.Bind(serverEndPoint);
this._ServerSocket.Listen(this.MaxQueuedConnections);
this._ServerSocket.BeginAccept(new AsyncCallback(Connect), this._ServerSocket);
return true;
}
private void Connect(IAsyncResult result)
{
var player = new BasePlayer();
try
{
player.Game = this.Game;
player.Connection = this._ServerSocket.EndAccept(result);
lock (this.Connections)
{
this.Connections.Add(player);
}
// Pass all of the data handling for the player to itself.
player.Connection.BeginReceive(player.Buffer, 0, player.BufferSize, SocketFlags.None, new AsyncCallback(player.ReceiveData), player);
// Fetch the next incoming connection.
this._ServerSocket.BeginAccept(new AsyncCallback(Connect), this._ServerSocket);
}
catch (Exception)
{
}
}
and then the player.ReceiveData..
public void ReceiveData(IAsyncResult result)
{
int bytesRead = this.Connection.EndReceive(result);
if (bytesRead > 0)
{
// TODO: Parse data received by the user.
//Queue the next receive data.
this.Connection.BeginReceive(this.Buffer, 0, this.BufferSize, SocketFlags.None, new AsyncCallback(ReceiveData), this);
var str = System.Text.Encoding.Default.GetString(Buffer);
}
else
{
this.Disconnect(result);
}
}
So when I call BeginReceive, I need to provide a buffer of a predetermined size. In doing that, I end up with unused bytes in my buffer array. They all have the value of 0, so I am assuming that I can loop through the array and build a new one starting at index 0 and working until I hit a value of 0.
I imagine there is a better way to do this? Can someone please point me in the right direction as to how I should determine what the data is within my buffer or perhaps a way that I can do this without having to use a predetermined buffer size.
So when call BeginReceive, I need to provide a buffer of a predetermined size. In doing that, I end up with unused bytes in my buffer array. They all have the value of 0, so I am assuming that I can loop through the array and build a new one starting at index 0 and working until I hit a value of 0.
No, that's not what you should do. Instead, in your callback (ReceiveData) you're already calling EndReceive - and the result of that is the number of bytes you read. That's how much of the buffer you should use.
However, you should copy the data you've read out of the buffer before you call BeginReceive again, otherwise you may end up with the next bit of data overwriting the just-read data before you get to use it.
So something like:
string text = Encoding.ASCII.GetString(Buffer, 0, bytesRead);
Connection.BeginReceive(this.Buffer, 0, this.BufferSize, SocketFlags.None,
new AsyncCallback(ReceiveData), this);
I would not suggest that you use Encoding.Default to convert the bytes to text - instead, you should decide which encoding you're using, and stick to that. If you use an encoding which isn't always one-byte-per-character, you'll end up in a slightly trickier situation, as then you might end up receiving a buffer with part of a character. At that point you need to keep a Decoder which can maintain state about partial characters read.
Suppose I am writing a tcp proxy code.
I am reading from the incoming stream and writing to the output stream.
I know that Stream.Copy uses a buffer, but my question is:
Does the Stream.Copy method writes to the output stream while fetching the next chunk from the input stream or it a loop like "read chunk from input, write chunk to ouput, read chunk from input, etc" ?
Here's the implementation of CopyTo in .NET 4.5:
private void InternalCopyTo(Stream destination, int bufferSize)
{
int num;
byte[] buffer = new byte[bufferSize];
while ((num = this.Read(buffer, 0, buffer.Length)) != 0)
{
destination.Write(buffer, 0, num);
}
}
So as you can see, it reads from the source, then writes to the destination. This could probably be improved ;)
EDIT: here's a possible implementation of a piped version:
public static void CopyToPiped(this Stream source, Stream destination, int bufferSize = 0x14000)
{
byte[] readBuffer = new byte[bufferSize];
byte[] writeBuffer = new byte[bufferSize];
int bytesRead = source.Read(readBuffer, 0, bufferSize);
while (bytesRead > 0)
{
Swap(ref readBuffer, ref writeBuffer);
var iar = destination.BeginWrite(writeBuffer, 0, bytesRead, null, null);
bytesRead = source.Read(readBuffer, 0, bufferSize);
destination.EndWrite(iar);
}
}
static void Swap<T>(ref T x, ref T y)
{
T tmp = x;
x = y;
y = tmp;
}
Basically, it reads a chunk synchronously, starts to copy it to the destination asynchronously, then read the next chunk and waits for the write to complete.
I ran a few performance tests:
using MemoryStreams, I didn't expect a significant improvement, since it doesn't use IO completion ports (AFAIK); and indeed, the performance is almost identical
using files on different drives, I expected the piped version to perform better, but it doesn't... it's actually slightly slower (by 5 to 10%)
So it apparently doesn't bring any benefit, which is probably the reason why it isn't implemented this way...
According to Reflector it does not. Such behavior better be documented because it would introduce concurrency. This is never safe to do in general. So the API design to not "pipe" is sound.
So this is not just a question of Stream.Copy being more or less smart. Copying in a concurrent way is not an implementation detail.
Stream.Copy is synchronous operation. I don't think it is reasonable to expect it to use asynchronous read/write to make simultaneous read and write.
I would expect asynchrounous version (like RandomAccessStream.CopyAsync) to use simultaneous read and write.
Note: using multiple threads during copy would be unwelcome behavior, but using asynchronous read and write to run them at the same time is ok.
Writing to the output stream is impossible (when using one buffer) while fetching next chunk because fetching the next chunk can overwrite the buffer while its being used for output.
You can say use double buffering but its pretty much the same as using a double sized buffer.
I recently wrote a quick-and-dirty proof-of-concept proxy server in C# as part of an effort to get a Java web application to communicate with a legacy VB6 application residing on another server. It's ridiculously simple:
The proxy server and clients both use the same message format; in the code I use a ProxyMessage class to represent both requests from clients and responses generated by the server:
public class ProxyMessage
{
int Length; // message length (not including the length bytes themselves)
string Body; // an XML string containing a request/response
// writes this message instance in the proper network format to stream
// (helper for response messages)
WriteToStream(Stream stream) { ... }
}
The messages are as simple as could be: the length of the body + the message body.
I have a separate ProxyClient class that represents a connection to a client. It handles all the interaction between the proxy and a single client.
What I'm wondering is are they are design patterns or best practices for simplifying the boilerplate code associated with asynchronous socket programming? For example, you need to take some care to manage the read buffer so that you don't accidentally lose bytes, and you need to keep track of how far along you are in the processing of the current message. In my current code, I do all of this work in my callback function for TcpClient.BeginRead, and manage the state of the buffer and the current message processing state with the help of a few instance variables.
The code for my callback function that I'm passing to BeginRead is below, along with the relevant instance variables for context. The code seems to work fine "as-is", but I'm wondering if it can be refactored a bit to make it clearer (or maybe it already is?).
private enum BufferStates
{
GetMessageLength,
GetMessageBody
}
// The read buffer. Initially 4 bytes because we are initially
// waiting to receive the message length (a 32-bit int) from the client
// on first connecting. By constraining the buffer length to exactly 4 bytes,
// we make the buffer management a bit simpler, because
// we don't have to worry about cases where the buffer might contain
// the message length plus a few bytes of the message body.
// Additional bytes will simply be buffered by the OS until we request them.
byte[] _buffer = new byte[4];
// A count of how many bytes read so far in a particular BufferState.
int _totalBytesRead = 0;
// The state of the our buffer processing. Initially, we want
// to read in the message length, as it's the first thing
// a client will send
BufferStates _bufferState = BufferStates.GetMessageLength;
// ...ADDITIONAL CODE OMITTED FOR BREVITY...
// This is called every time we receive data from
// the client.
private void ReadCallback(IAsyncResult ar)
{
try
{
int bytesRead = _tcpClient.GetStream().EndRead(ar);
if (bytesRead == 0)
{
// No more data/socket was closed.
this.Dispose();
return;
}
// The state passed to BeginRead is used to hold a ProxyMessage
// instance that we use to build to up the message
// as it arrives.
ProxyMessage message = (ProxyMessage)ar.AsyncState;
if(message == null)
message = new ProxyMessage();
switch (_bufferState)
{
case BufferStates.GetMessageLength:
_totalBytesRead += bytesRead;
// if we have the message length (a 32-bit int)
// read it in from the buffer, grow the buffer
// to fit the incoming message, and change
// state so that the next read will start appending
// bytes to the message body
if (_totalBytesRead == 4)
{
int length = BitConverter.ToInt32(_buffer, 0);
message.Length = length;
_totalBytesRead = 0;
_buffer = new byte[message.Length];
_bufferState = BufferStates.GetMessageBody;
}
break;
case BufferStates.GetMessageBody:
string bodySegment = Encoding.ASCII.GetString(_buffer, _totalBytesRead, bytesRead);
_totalBytesRead += bytesRead;
message.Body += bodySegment;
if (_totalBytesRead >= message.Length)
{
// Got a complete message.
// Notify anyone interested.
// Pass a response ProxyMessage object to
// with the event so that receivers of OnReceiveMessage
// can send a response back to the client after processing
// the request.
ProxyMessage response = new ProxyMessage();
OnReceiveMessage(this, new ProxyMessageEventArgs(message, response));
// Send the response to the client
response.WriteToStream(_tcpClient.GetStream());
// Re-initialize our state so that we're
// ready to receive additional requests...
message = new ProxyMessage();
_totalBytesRead = 0;
_buffer = new byte[4]; //message length is 32-bit int (4 bytes)
_bufferState = BufferStates.GetMessageLength;
}
break;
}
// Wait for more data...
_tcpClient.GetStream().BeginRead(_buffer, 0, _buffer.Length, this.ReadCallback, message);
}
catch
{
// do nothing
}
}
So far, my only real thought is to extract the buffer-related stuff into a separate MessageBuffer class and simply have my read callback append new bytes to it as they arrive. The MessageBuffer would then worry about things like the current BufferState and fire an event when it received a complete message, which the ProxyClient could then propagate further up to the main proxy server code, where the request can be processed.
I've had to overcome similar problems. Here's my solution (modified to fit your own example).
We create a wrapper around Stream (a superclass of NetworkStream, which is a superclass of TcpClient or whatever). It monitors reads. When some data is read, it is buffered. When we receive a length indicator (4 bytes) we check if we have a full message (4 bytes + message body length). When we do, we raise a MessageReceived event with the message body, and remove the message from the buffer. This technique automatically handles fragmented messages and multiple-messages-per-packet situations.
public class MessageStream : IMessageStream, IDisposable
{
public MessageStream(Stream stream)
{
if(stream == null)
throw new ArgumentNullException("stream", "Stream must not be null");
if(!stream.CanWrite || !stream.CanRead)
throw new ArgumentException("Stream must be readable and writable", "stream");
this.Stream = stream;
this.readBuffer = new byte[512];
messageBuffer = new List<byte>();
stream.BeginRead(readBuffer, 0, readBuffer.Length, new AsyncCallback(ReadCallback), null);
}
// These belong to the ReadCallback thread only.
private byte[] readBuffer;
private List<byte> messageBuffer;
private void ReadCallback(IAsyncResult result)
{
int bytesRead = Stream.EndRead(result);
messageBuffer.AddRange(readBuffer.Take(bytesRead));
if(messageBuffer.Count >= 4)
{
int length = BitConverter.ToInt32(messageBuffer.Take(4).ToArray(), 0); // 4 bytes per int32
// Keep buffering until we get a full message.
if(messageBuffer.Count >= length + 4)
{
messageBuffer.Skip(4);
OnMessageReceived(new MessageEventArgs(messageBuffer.Take(length)));
messageBuffer.Skip(length);
}
}
// FIXME below is kinda hacky (I don't know the proper way of doing things...)
// Don't bother reading again. We don't have stream access.
if(disposed)
return;
try
{
Stream.BeginRead(readBuffer, 0, readBuffer.Length, new AsyncCallback(ReadCallback), null);
}
catch(ObjectDisposedException)
{
// DO NOTHING
// Ends read loop.
}
}
public Stream Stream
{
get;
private set;
}
public event EventHandler<MessageEventArgs> MessageReceived;
protected virtual void OnMessageReceived(MessageEventArgs e)
{
var messageReceived = MessageReceived;
if(messageReceived != null)
messageReceived(this, e);
}
public virtual void SendMessage(Message message)
{
// Have fun ...
}
// Dispose stuff here
}
I think the design you've used is fine that's roughly how I would and have done the same sort of thing. I don't think you'd gain much by refactoring into additional classes/structs and from what I've seen you'd actually make the solution more complex by doing so.
The only comment I'd have is as to whether the two reads where the first is always the messgae length and the second always being the body is robust enough. I'm always wary of approaches like that as if they somehow get out of sync due to an unforseen circumstance (such as the other end sending the wrong length) it's very difficult to recover. Instead I'd do a single read with a big buffer so that I always get all the available data from the network and then inspect the buffer to extract out complete messages. That way if things do go wrong the current buffer can just be thrown away to get things back to a clean state and only the current messages are lost rather than stopping the whole service.
Actually at the moment you would have a problem if you message body was big and arrived in two seperate receives and the next message in line sent it's length at the same time as the second half of the previous body. If that happened your message length would end up appended to the body of the previous message and you'd been in the situation as desecribed in the previous paragraph.
You can use yield return to automate the generation of a state machine for asynchronous callbacks. Jeffrey Richter promotes this technique through his AsyncEnumerator class, and I've played around with the idea here.
There's nothing wrong with the way you've done it. For me, though, I like to separate the receiving of the data from the processing of it, which is what you seem to be thinking with your proposed MessageBuffer class. I have discussed that in detail here.