Suppose I am writing a TCP proxy.
I am reading from the incoming stream and writing to the output stream.
I know that Stream.CopyTo uses a buffer, but my question is:
does Stream.CopyTo write to the output stream while fetching the next chunk from the input stream, or is it a loop like "read chunk from input, write chunk to output, read chunk from input, etc."?
Here's the implementation of CopyTo in .NET 4.5:
private void InternalCopyTo(Stream destination, int bufferSize)
{
    int num;
    byte[] buffer = new byte[bufferSize];
    while ((num = this.Read(buffer, 0, buffer.Length)) != 0)
    {
        destination.Write(buffer, 0, num);
    }
}
So as you can see, it reads from the source, then writes to the destination. This could probably be improved ;)
EDIT: here's a possible implementation of a piped version:
public static void CopyToPiped(this Stream source, Stream destination, int bufferSize = 0x14000)
{
    byte[] readBuffer = new byte[bufferSize];
    byte[] writeBuffer = new byte[bufferSize];
    int bytesRead = source.Read(readBuffer, 0, bufferSize);
    while (bytesRead > 0)
    {
        Swap(ref readBuffer, ref writeBuffer);
        var iar = destination.BeginWrite(writeBuffer, 0, bytesRead, null, null);
        bytesRead = source.Read(readBuffer, 0, bufferSize);
        destination.EndWrite(iar);
    }
}

static void Swap<T>(ref T x, ref T y)
{
    T tmp = x;
    x = y;
    y = tmp;
}
Basically, it reads a chunk synchronously, starts copying it to the destination asynchronously, then reads the next chunk and waits for the write to complete.
I ran a few performance tests:
- using MemoryStreams, I didn't expect a significant improvement, since it doesn't use IO completion ports (AFAIK); and indeed, the performance is almost identical
- using files on different drives, I expected the piped version to perform better, but it doesn't... it's actually slightly slower (by 5 to 10%)
So it apparently doesn't bring any benefit, which is probably the reason why it isn't implemented this way...
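For reference, here is a rough sketch of the same double-buffering idea using the Task-based ReadAsync/WriteAsync methods from .NET 4.5 instead of BeginWrite/EndWrite; it is untested and shares the same caveats as CopyToPiped above:

public static async Task CopyToPipedAsync(this Stream source, Stream destination, int bufferSize = 0x14000)
{
    byte[] readBuffer = new byte[bufferSize];
    byte[] writeBuffer = new byte[bufferSize];

    int bytesRead = await source.ReadAsync(readBuffer, 0, bufferSize);
    while (bytesRead > 0)
    {
        // Swap the buffers so the chunk we just read becomes the write buffer.
        byte[] tmp = readBuffer; readBuffer = writeBuffer; writeBuffer = tmp;

        // Start writing the previous chunk and read the next one while the write is in flight.
        Task writeTask = destination.WriteAsync(writeBuffer, 0, bytesRead);
        bytesRead = await source.ReadAsync(readBuffer, 0, bufferSize);
        await writeTask;
    }
}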
According to Reflector, it does not. Such behavior would have to be documented, because it would introduce concurrency, which is never safe to do in general. So the API design decision not to "pipe" is sound.
So this is not just a question of Stream.CopyTo being more or less smart. Copying in a concurrent way is not an implementation detail.
Stream.CopyTo is a synchronous operation. I don't think it is reasonable to expect it to use asynchronous reads and writes to read and write simultaneously.
I would expect an asynchronous version (like RandomAccessStream.CopyAsync) to use simultaneous reads and writes.
Note: using multiple threads during the copy would be unwelcome behavior, but using asynchronous reads and writes to run them at the same time is fine.
Writing to the output stream while fetching the next chunk is impossible (when using one buffer), because fetching the next chunk can overwrite the buffer while it's being used for output.
You could use double buffering, but it's pretty much the same as using a double-sized buffer.
Related
I recently implemented a small program which reads data coming from a sensor and plots it as a diagram.
The data comes in as chunks of 5 bytes, roughly every 500 µs (baudrate: 500000). Around 3000 chunks make up a complete line. So the total transmission time is around 1.5 s.
As I was looking at the live diagram, I noticed a severe lag between what is shown and what is currently being measured. Investigating, it all boiled down to:
SerialPort.ReadLine();
It takes around 0.5 s longer than the line takes to be transmitted, so each line read takes around 2 s. Interestingly, no data is lost; it just lags behind even more with each new line read. This is very irritating for the user, so I couldn't leave it like that.
I've implemented my own variant and it shows a consistent time of around 1.5 s, with no lag. I'm not really proud of my implementation (more or less polling the BaseStream), and I'm wondering if there is a way to speed up the ReadLine function of the SerialPort class. With my implementation I'm also getting some corrupted lines, and I haven't found the exact issue yet.
I've tried changing the ReadTimeout to 1600, but that just produced a TimeoutException, even though the data arrived.
Any explanation as to why ReadLine is slow, or a way to fix it, is appreciated.
As a side note: I've tried this in a console application with only SerialPort.ReadLine() as well, and the result is the same, so I'm ruling out my own application affecting the SerialPort.
I'm not sure this is relevant, but my implementation looks like this:
LineSplitter lineSplitter = new LineSplitter();

async Task<string> SerialReadLineAsync(SerialPort serialPort)
{
    byte[] buffer = new byte[5];
    string ret = string.Empty;
    while (true)
    {
        try
        {
            int bytesRead = await serialPort.BaseStream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
            byte[] line = lineSplitter.OnIncomingBinaryBlock(this, buffer, bytesRead);
            if (null != line)
            {
                return Encoding.ASCII.GetString(line).TrimEnd('\r', '\n');
            }
        }
        catch
        {
            return string.Empty;
        }
    }
}
With LineSplitter being the following:
class LineSplitter
{
    // based on: http://www.sparxeng.com/blog/software/reading-lines-serial-port
    public byte Delimiter = (byte)'\n';
    byte[] leftover;

    public byte[] OnIncomingBinaryBlock(object sender, byte[] buffer, int bytesInBuffer)
    {
        leftover = ConcatArray(leftover, buffer, 0, bytesInBuffer);
        int newLineIndex = Array.IndexOf(leftover, Delimiter);
        if (newLineIndex >= 0)
        {
            byte[] result = new byte[newLineIndex + 1];
            Array.Copy(leftover, result, result.Length);
            byte[] newLeftover = new byte[leftover.Length - result.Length];
            Array.Copy(leftover, newLineIndex + 1, newLeftover, 0, newLeftover.Length);
            leftover = newLeftover;
            return result;
        }
        return null;
    }

    static byte[] ConcatArray(byte[] head, byte[] tail, int tailOffset, int tailCount)
    {
        byte[] result;
        if (head == null)
        {
            result = new byte[tailCount];
            Array.Copy(tail, tailOffset, result, 0, tailCount);
        }
        else
        {
            result = new byte[head.Length + tailCount];
            head.CopyTo(result, 0);
            Array.Copy(tail, tailOffset, result, head.Length, tailCount);
        }
        return result;
    }
}
I ran into this issue in 2008 talking to GPS modules. Essentially the blocking functions are flaky and the solution is to use APM.
Here are the gory details in another Stack Overflow answer: How to do robust SerialPort programming with .NET / C#?
You may also find this of interest: How to kill off a pending APM operation
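If it helps, here is a minimal sketch of the APM approach on SerialPort.BaseStream (this is my own illustration, not the code from the linked answer; handleChunk stands in for whatever consumes the bytes, e.g. the LineSplitter above):

void KickoffRead(SerialPort port, byte[] buffer, Action<byte[]> handleChunk)
{
    port.BaseStream.BeginRead(buffer, 0, buffer.Length, ar =>
    {
        int actualLength;
        try
        {
            actualLength = port.BaseStream.EndRead(ar);
        }
        catch (IOException)
        {
            return; // port closed or read failed; stop issuing reads
        }

        byte[] received = new byte[actualLength];
        Buffer.BlockCopy(buffer, 0, received, 0, actualLength);
        handleChunk(received);

        KickoffRead(port, buffer, handleChunk); // queue the next read
    }, null);
}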
I tried to understand the MSDN example for NetworkStream.EndRead(). There are some parts that I do not understand.
So here is the example (copied from MSDN):
// Example of EndRead, DataAvailable and BeginRead.
public static void myReadCallBack(IAsyncResult ar)
{
    NetworkStream myNetworkStream = (NetworkStream)ar.AsyncState;
    byte[] myReadBuffer = new byte[1024];
    String myCompleteMessage = "";
    int numberOfBytesRead;

    numberOfBytesRead = myNetworkStream.EndRead(ar);
    myCompleteMessage =
        String.Concat(myCompleteMessage, Encoding.ASCII.GetString(myReadBuffer, 0, numberOfBytesRead));

    // message received may be larger than buffer size so loop through until you have it all.
    while (myNetworkStream.DataAvailable)
    {
        myNetworkStream.BeginRead(myReadBuffer, 0, myReadBuffer.Length,
            new AsyncCallback(NetworkStream_ASync_Send_Receive.myReadCallBack),
            myNetworkStream);
    }

    // Print out the received message to the console.
    Console.WriteLine("You received the following message : " + myCompleteMessage);
}
It uses BeginRead() and EndRead() to read asynchronously from the network stream.
The whole thing is invoked by calling
myNetworkStream.BeginRead(someBuffer, 0, someBuffer.Length, new AsyncCallback(NetworkStream_ASync_Send_Receive.myReadCallBack), myNetworkStream);
somewhere else (not displayed in the example).
What I think it should do is print the whole message received from the NetworkStream in a single WriteLine (the one at the end of the example). Notice that the string is called myCompleteMessage.
Now when I look at the implementation, some problems arise for my understanding.
First of all: the example allocates a new method-local buffer, myReadBuffer. Then EndRead() is called, which writes the received message into the buffer that was supplied to BeginRead(). This is NOT the myReadBuffer that was just allocated, so how should the network stream know of it? So in the next line, numberOfBytesRead bytes from the empty buffer are appended to myCompleteMessage, which has the current value "". In the last line this message, consisting of a lot of '\0's, is printed with Console.WriteLine.
This doesn't make any sense to me.
The second thing I do not understand is the while-loop.
BeginRead is an asynchronous call, so no data is read immediately. As I understand it, the while loop should therefore run for quite a while, until some asynchronous call is actually executed and reads from the stream so that no more data is available. The documentation doesn't say that BeginRead immediately marks some part of the available data as being read, so I do not expect it to do so.
This example does not improve my understanding of those methods. Is the example wrong, or is my understanding wrong (I expect the latter)? How does this example work?
I think the while loop around the BeginRead shouldn't be there. You don't want to execute BeginRead more than once before the EndRead is done. Also, the buffer needs to be specified outside the BeginRead, because you may need more than one read per packet/buffer.
There are some things you need to think about, such as how long your messages/blocks are (fixed size), or whether to prefix them with a length (variable size): <datalength><data><datalength><data> (see the length-prefixed sketch after the example below).
Don't forget it is a streaming connection, so multiple/partial messages/packets can be read in one read.
Pseudo example:
int bytesNeeded;
int bytesRead;

public void Start()
{
    bytesNeeded = 40; // you need to know how many bytes you're expecting
    bytesRead = 0;
    BeginReading();
}

public void BeginReading()
{
    myNetworkStream.BeginRead(
        someBuffer, bytesRead, bytesNeeded - bytesRead,
        new AsyncCallback(EndReading),
        myNetworkStream);
}

public void EndReading(IAsyncResult ar)
{
    numberOfBytesRead = myNetworkStream.EndRead(ar);

    if (numberOfBytesRead == 0)
    {
        // disconnected
        return;
    }

    bytesRead += numberOfBytesRead;

    if (bytesRead == bytesNeeded)
    {
        // Handle buffer
        Start();
    }
    else
        BeginReading();
}
I have the following code:
public static async Task<string> ReadLineAsync(this Stream stream, Encoding encoding)
{
    byte[] byteArray = null;
    using (MemoryStream ms = new MemoryStream())
    {
        int bytesRead = 0;
        do
        {
            byte[] buf = new byte[1024];
            try
            {
                bytesRead = await stream.ReadAsync(buf, 0, 1024);
                await ms.WriteAsync(buf, 0, bytesRead);
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message + e.StackTrace);
            }
        } while (stream.CanRead && bytesRead > 0);
        byteArray = ms.ToArray();
        return encoding.GetString(ms.ToArray());
    }
}
I am trying to read from the Stream and write into a MemoryStream asynchronously, but the do...while loop never breaks; it's an infinite loop. How can I solve this?
First, in an exceptional situation, your loop would continue indefinitely. You shouldn't catch and ignore exceptions.
Secondly, if the stream doesn't actually end, then bytesRead would never be zero. I suspect this is the case because the name of the method (ReadLineAsync) doesn't imply to me that it will read until the end of the stream.
P.S. CanRead does not ever change for a specific stream. It's whether it makes semantic sense for a stream to do a read operation, not whether it can read right now.
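To illustrate the first point, here is a minimal sketch that reads to the end of the stream without swallowing exceptions (renamed, since it reads the whole stream rather than a single line):

public static async Task<string> ReadToEndAsync(this Stream stream, Encoding encoding)
{
    using (MemoryStream ms = new MemoryStream())
    {
        byte[] buf = new byte[1024];
        int bytesRead;
        // No try/catch: if a read fails, the exception propagates to the caller
        // instead of leaving bytesRead stuck at its previous non-zero value.
        while ((bytesRead = await stream.ReadAsync(buf, 0, buf.Length)) > 0)
        {
            ms.Write(buf, 0, bytesRead);
        }
        return encoding.GetString(ms.ToArray());
    }
}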
You have your loop condition set to run as long as CanRead is true and bytesRead is greater than 0. CanRead will always be true if your stream is readable, so once you start reading, the loop only ends when bytesRead reaches zero. You need a maximum number of bytes to read, or some other condition, to break out of the loop.
So, you are taking a stream from IMAP, and this method is for converting that stream into text?
Why not construct a StreamReader around the stream and call either its ReadToEndAsync or just ReadToEnd? I doubt the need for making this an async operation; if the stream is something like an e-mail, then it is unlikely to be so big that a user will notice the UI blocking while it reads.
If, as one of your comments suggests, this isn't a UI app at all then it is probably even less of an issue.
If my assumptions are wrong then could I ask you to update your question with some more information about how this function is being used. The more information you can tell us, the better our answers can be.
EDIT:
I just noticed that your method is called ReadLineAsync, although I can't see anywhere in the code that you are looking for a line ending. If your intention is to read a line of text, then StreamReader also provides ReadLine and ReadLineAsync.
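For example, something along these lines (just a sketch; note the StreamReader may buffer bytes beyond the line ending, so don't mix it with raw reads on the same stream afterwards):

public static async Task<string> ReadOneLineAsync(Stream stream, Encoding encoding)
{
    // leaveOpen: true so disposing the reader doesn't close the caller's stream.
    using (var reader = new StreamReader(stream, encoding,
        detectEncodingFromByteOrderMarks: false, bufferSize: 1024, leaveOpen: true))
    {
        return await reader.ReadLineAsync();
    }
}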
I took your method and modified it just a tad by shortening the read buffer size and adding some debug statements
public static async Task<string> ReadLineAsync(this Stream stream, Encoding encoding)
{
    const int count = 2;
    byte[] byteArray = Enumerable.Empty<byte>().ToArray();
    using (MemoryStream ms = new MemoryStream())
    {
        int bytesRead = 0;
        do
        {
            byte[] buf = new byte[count];
            try
            {
                bytesRead = await stream.ReadAsync(buf, 0, count);
                await ms.WriteAsync(buf, 0, bytesRead);
                Console.WriteLine("{0:ffffff}:{1}:{2}", DateTime.Now, stream.CanRead, bytesRead);
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message + e.StackTrace);
            }
        } while (stream.CanRead && bytesRead > 0);
        byteArray = ms.ToArray();
        return encoding.GetString(byteArray);
    }
}
but basically it worked as expected with the following call:
private static void Main(string[] args)
{
    FileStream stream = File.OpenRead(@"C:\in.txt");
    Encoding encoding = Encoding.GetEncoding(1252);
    Task<string> result = stream.ReadLineAsync(encoding);
    result.ContinueWith(o =>
    {
        Console.Write(o.Result);
        stream.Dispose();
    });
    Console.WriteLine("Press ENTER to continue...");
    Console.ReadLine();
}
so I'm wondering, could it be something with your input file? Mine was (encoded in Windows-1252 in Notepad++):
one
two
three
and my output was
Press ENTER to continue...
869993:True:2
875993:True:2
875993:True:2
875993:True:2
875993:True:2
875993:True:2
875993:True:2
875993:True:1
875993:True:0
one
two
three
note how the "Press ENTER to continue..." was printed first as expected because the main method was invoked asynchronously, and CanRead is always true because it means the file is readable. Its the state of how the file was opened, not the state meaning that the cursor is at the EOF.
From my POV, it looks like your code is trying to do the following:
- read an entire stream as a sequence of 1024-octet chunks,
- concatenate all those chunks into a MemoryStream (which uses a byte array as its backing store),
- convert the MemoryStream to a string using the specified encoding,
- return that string to the caller.
This seems... complicated to me. Maybe I'm missing something, but to use async and await, you've got to be using VS2012 and .NET 4.5, or VS2010, .NET 4.0 and the Async CTP, right? If so, why wouldn't you simply use a StreamReader and its StreamReader.ReadToEndAsync() method?
public static async Task<string> MyReadLineAsync(this Stream stream, Encoding encoding)
{
    using (StreamReader reader = new StreamReader(stream, encoding))
    {
        return await reader.ReadToEndAsync();
    }
}
The overlapping I/O idea is nice, but the time required to write to a memory stream is, to say the least, not enough to make one whit of difference with respect to the time required to perform actual I/O (presumably your input stream is doing disk or network I/O).
I have a program where I send data over TCP link. I am using Asynchronous Reads and writes to both the disk and network. If I put a DeflateStream in the middle (so I compress before I write to the network link and I decompress when I receive the data and write it out to the disk) I am CPU bound on the compressing side. This causes my max transfer rate to be about 300 KB/s. However if I remove the compression step I am now I/O bound to the disk and I get transfer rates of 40,000 KB/s.
Under strictly LAN conditions my upper I/O limit will always be more than 300 KB/s; however, if my program is run over the internet I may very well have a network I/O limit below 300 KB/s.
I would like to detect whether I am I/O bound (the network/disk link is the limiting factor) or CPU bound (the compression is what is slowing me down most). How can I detect at runtime whether my program is being limited by the CPU or by I/O, so I could switch protocols and get the best possible transfer rate?
private static void SendFile(string filename, NetworkStream stream, int sendBufferSize)
{
    using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
    using (var ds = new DeflateStream(stream, CompressionMode.Compress))
    {
        StreamUtilities.CopyAsync(fs, ds, sendBufferSize);
    }
}
public static void CopyAsync(Stream sourceStream, Stream destStream, int bufferSize = 4096)
{
    Byte[] bufferA = new Byte[bufferSize];
    Byte[] bufferB = new Byte[bufferSize];
    IAsyncResult writeResult = null;
    IAsyncResult readResult = null;
    bool readBufferA = false;
    int read;

    readResult = sourceStream.BeginRead(bufferA, 0, bufferA.Length, null, null);

    //Complete last read
    while ((read = sourceStream.EndRead(readResult)) > 0)
    {
        if (readBufferA)
        {
            PerformOperations(sourceStream, destStream, bufferA, bufferB, ref readResult, ref writeResult, read);
        }
        else
        {
            PerformOperations(sourceStream, destStream, bufferB, bufferA, ref readResult, ref writeResult, read);
        }
        //Flip the bit on the next buffer
        readBufferA = !readBufferA;
    }

    if (writeResult != null)
        destStream.EndWrite(writeResult);
}
private static void PerformOperations(Stream sourceStream, Stream destStream, Byte[] readBuffer, Byte[] writeBuffer, ref IAsyncResult readResult, ref IAsyncResult writeResult, int bytesToWrite)
{
    //Start next read
    readResult = sourceStream.BeginRead(readBuffer, 0, readBuffer.Length, null, null);

    //End previous write
    if (writeResult != null)
        destStream.EndWrite(writeResult);

    writeResult = destStream.BeginWrite(writeBuffer, 0, bytesToWrite, null, null);
}
One option is to separate the two aspects out into a producer/consumer queue: your compressor writes blocks into a queue, which is then consumed by a thread that just performs IO.
That way:
- You can compress while the IO is occurring, without going into asynchronous IO
- You can detect whether you're CPU bound (the queue is normally empty, or briefly has 1 block on it) or IO bound (the queue gradually gets bigger as you compress faster than it can be sent), as the sketch below shows
With a bit of work, you could multi-thread the compression; you'd need to keep track of block order, but that should be feasible.
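A rough sketch of that idea using BlockingCollection (fileStream, networkStream, CompressBlock and the block size are placeholders, not taken from the code above):

// Producer: compress blocks and enqueue them. Consumer: write them to the network.
var queue = new BlockingCollection<byte[]>(boundedCapacity: 16);

var producer = Task.Run(() =>
{
    byte[] buffer = new byte[64 * 1024];
    int read;
    while ((read = fileStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        byte[] compressed = CompressBlock(buffer, read); // placeholder for the CPU-heavy compression step
        queue.Add(compressed); // blocks when the queue is full, i.e. when you are IO bound
    }
    queue.CompleteAdding();
});

var consumer = Task.Run(() =>
{
    foreach (byte[] block in queue.GetConsumingEnumerable())
        networkStream.Write(block, 0, block.Length); // the IO-heavy step
});

Task.WaitAll(producer, consumer);

// Rough heuristic while running:
//   queue.Count stays near 0          -> compression (CPU) is the bottleneck
//   queue.Count sits at the capacity  -> network/disk IO is the bottleneck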
Is there any limit on the size of data that can be received by a TCP client?
With TCP socket communication, the server is sending more data, but the client is only getting 4K and stopping.
I'm guessing that you're doing exactly 1 Send and exactly 1 Receive.
You need to do multiple reads; there is no guarantee that a single read from the socket will contain everything.
The Receive method will read as much data as is available, up to the size of the buffer, but it returns as soon as it has some data so your program can use it.
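For example, a minimal receive loop that keeps reading until the sender closes the connection (just a sketch; socket stands in for your connected Socket):

var buffer = new byte[4096];
using (var ms = new MemoryStream())
{
    int received;
    // Receive returns 0 once the peer has shut down its side of the connection.
    while ((received = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None)) > 0)
    {
        // One Receive may return only part of what the server sent, so accumulate.
        ms.Write(buffer, 0, received);
    }
    byte[] allData = ms.ToArray();
}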
You may consider splitting your reads/writes over multiple calls. I've definitely had some problems with TcpClient in the past. To fix that, we use a wrapped stream class with the following read/write methods:

public override int Read(byte[] buffer, int offset, int count)
{
    int totalBytesRead = 0;
    int chunkBytesRead = 0;
    do
    {
        chunkBytesRead = _stream.Read(buffer, offset + totalBytesRead, Math.Min(__frameSize, count - totalBytesRead));
        totalBytesRead += chunkBytesRead;
    } while (totalBytesRead < count && chunkBytesRead > 0);
    return totalBytesRead;
}

public override void Write(byte[] buffer, int offset, int count)
{
    int bytesSent = 0;
    do
    {
        int chunkSize = Math.Min(__frameSize, count - bytesSent);
        _stream.Write(buffer, offset + bytesSent, chunkSize);
        bytesSent += chunkSize;
    } while (bytesSent < count);
}

// _stream is the wrapped stream
// __frameSize is a constant; we use 4096 since it's easy to allocate.
No, it should be fine. I suspect that your code to read from the client is flawed, but it's hard to say without you actually showing it.
No limit, TCP socket is a stream.
There's no limit on data with TCP in theory, BUT since we're limited by physical resources (i.e. memory), implementations such as Microsoft's Winsock use something called the "TCP window size".
That means that when you send something with Winsock's send() function, for example (and didn't set any options on the socket handle), the data is first copied to the socket's temporary buffer. Only when the receiving side has acknowledged that it got the data will Winsock reuse this memory.
So, you might flood this buffer by sending faster than it frees up, and then: error!