i want read a file in 1MB chunks with a FileStream and write it back with another FileStream.
The problem i have, is that the file with ~2.9MB get up to ~3.9MB because the last buffer is to big for the data (so it gets filled with \0 i think).
Is there a way to cut the overflow of the last buffer?
public static void ReadAndWriteFileStreamTest() {
string outputFile = "output.dat";
string inputFile = "input.dat";
using (FileStream fsOut = File.OpenWrite(outputFile))
{
using (FileStream fsIn = File.OpenRead(inputFile))
{
//read in ~1 MB chunks
int bufferLen = 1000000;
byte[] buffer = new byte[bufferLen];
long bytesRead;
do
{
bytesRead = fsIn.Read(buffer, 0, bufferLen);
fsOut.Write(buffer, 0, buffer.Length);
} while (bytesRead != 0);
}
}
}
Any help would be great! :)
PROBLEM:
fsOut.Write(buffer, 0, buffer.Length);
writes all bytes from the buffer, while you read only bytesRead amount.
SOLUTION:
You should use bytesRead as third parameter of FileStream.Write - count - amount of actually read bytes to avoid writing of bytes that aren't actually read.
do
{
bytesRead = fsIn.Read(buffer, 0, buffer.Length);
fsOut.Write(buffer, 0, bytesRead );
} while (bytesRead != 0);
It says here msdn.microsoft.com/en-us/library/system.io.stream.read.aspx that the Stream.Read and Stream.Write methods both advance the position/offset in the stream automatically so why is the examples here http://msdn.microsoft.com/en-us/library/system.io.stream.read.aspx and http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx manually changing the offset?
Do you only set the offset in a loop if you know the size of the stream and set it to 0 if you don't know the size and using a buffer?
// Now read s into a byte buffer.
byte[] bytes = new byte[s.Length];
int numBytesToRead = (int) s.Length;
int numBytesRead = 0;
while (numBytesToRead > 0)
{
// Read may return anything from 0 to 10.
int n = s.Read(bytes, numBytesRead, 10);
// The end of the file is reached.
if (n == 0)
{
break;
}
numBytesRead += n;
numBytesToRead -= n;
}
and
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
The offset is actually the offset of the buffer, not the stream. Streams are advanced automatically as they are read.
Edit (to the edited question):
In none of the code snippets you pasted into the question I see any stream offset being set.
I think you are mistaking the calculation of bytes to read vs. bytes received. This protocol may seem funny (why would you receive fewer bytes than requested?) but it makes sense when you consider that you might be reading from a high-latency packet oriented source (think: network sockets).
You might be receiving 6 characters in one burst (from a TCP packet) and only receive the remaining 4 characters in your next read (when the next packet has arrived).
Edit In response to your linked example from the comment:
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
// ... snip
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
It appears that the coders use prior knowledge about the underlying stream implementation, that stream.Read will always return 0 OR the size requested. That seems like a risky bet, to me. But if the docs for GZipStream do state that, it could be alright. However, since the MSDN samples use a generic Stream variable, it is (way) more correct to check the exact number of bytes read.
The first linked example uses a MemoryStream in both Write and Read fashion. The position is reset in between, so the data that was written first will be read:
Stream s = new MemoryStream();
for (int i = 0; i < 100; i++)
{
s.WriteByte((byte)i);
}
s.Position = 0;
The second example linked does not set the stream position. You'd typically have seen a call to Seek if it did. You maybe confusing the offsets into the data buffer with the stream position?
Hey, I'm having problems compressing files with Ionic.zlib, I'm very new to C# so the problem may be easily solvable. If I compress a large file, let's say 500kb in size then once the compressed file has reached 65536 bytes it will stop, if I then decompress the file theres a lot of data missing :/. I can fix this by setting the buffer to like 4,000,000, but I heard that it's best to have it set to 0x4000.
ZlibStream compressor = new ZlibStream(gsc_stream, CompressionMode.Compress, CompressionLevel.BestCompression, true);
byte[] buffer = new byte[0x4000];
Int32 n;
int previous = Convert.ToInt32(zone.Position);
while ((n = compressor.Read(buffer, 0, buffer.Length)) != 0)
{
zone.Write(buffer, 0, n);
}
zone.Flush();
compressor.Flush();
Looks like you have it the other way around.
If you're trying to compress the file in the stream gsc_stream and write the result into the stream zone then correct code would be:
using (ZlibStream compressor = new ZlibStream(zone, CompressionMode.Compress, CompressionLevel.BestCompression, true))
{
byte[] buffer = new byte[0x4000];
int n;
while ((n = gsc_stream.Read(buffer, 0, buffer.Length)) != 0)
{
compressor.Write(buffer, 0, n);
}
zone.Flush();
compressor.Flush();
}
I have an input stream wrapped in a System.IO.StreamReader... I wish to write the content of the stream to a file (i.e. StreamWriter).
The length of the inputstream is unknown. Could be a few bytes and to gigabytes in length.
How is this done the easiest that does not take up too much memory?
Something like this:
public static void CopyText(TextReader input, TextWriter output)
{
char[] buffer = new char[8192];
int length;
while ((length = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, length);
}
}
Note that this is very similar to what you'd write to copy the contents of one stream to another - this just happens to be text data instead of binary data.
This question already has answers here:
Creating a byte array from a stream
(18 answers)
Closed 6 years ago.
Is there a simple way or method to convert a Stream into a byte[] in C#?
The shortest solution I know:
using(var memoryStream = new MemoryStream())
{
sourceStream.CopyTo(memoryStream);
return memoryStream.ToArray();
}
Call next function like
byte[] m_Bytes = StreamHelper.ReadToEnd (mystream);
Function:
public static byte[] ReadToEnd(System.IO.Stream stream)
{
long originalPosition = 0;
if(stream.CanSeek)
{
originalPosition = stream.Position;
stream.Position = 0;
}
try
{
byte[] readBuffer = new byte[4096];
int totalBytesRead = 0;
int bytesRead;
while ((bytesRead = stream.Read(readBuffer, totalBytesRead, readBuffer.Length - totalBytesRead)) > 0)
{
totalBytesRead += bytesRead;
if (totalBytesRead == readBuffer.Length)
{
int nextByte = stream.ReadByte();
if (nextByte != -1)
{
byte[] temp = new byte[readBuffer.Length * 2];
Buffer.BlockCopy(readBuffer, 0, temp, 0, readBuffer.Length);
Buffer.SetByte(temp, totalBytesRead, (byte)nextByte);
readBuffer = temp;
totalBytesRead++;
}
}
}
byte[] buffer = readBuffer;
if (readBuffer.Length != totalBytesRead)
{
buffer = new byte[totalBytesRead];
Buffer.BlockCopy(readBuffer, 0, buffer, 0, totalBytesRead);
}
return buffer;
}
finally
{
if(stream.CanSeek)
{
stream.Position = originalPosition;
}
}
}
I use this extension class:
public static class StreamExtensions
{
public static byte[] ReadAllBytes(this Stream instream)
{
if (instream is MemoryStream)
return ((MemoryStream) instream).ToArray();
using (var memoryStream = new MemoryStream())
{
instream.CopyTo(memoryStream);
return memoryStream.ToArray();
}
}
}
Just copy the class to your solution and you can use it on every stream:
byte[] bytes = myStream.ReadAllBytes()
Works great for all my streams and saves a lot of code!
Of course you can modify this method to use some of the other approaches here to improve performance if needed, but I like to keep it simple.
In .NET Framework 4 and later, the Stream class has a built-in CopyTo method that you can use.
For earlier versions of the framework, the handy helper function to have is:
public static void CopyStream(Stream input, Stream output)
{
byte[] b = new byte[32768];
int r;
while ((r = input.Read(b, 0, b.Length)) > 0)
output.Write(b, 0, r);
}
Then use one of the above methods to copy to a MemoryStream and call GetBuffer on it:
var file = new FileStream("c:\\foo.txt", FileMode.Open);
var mem = new MemoryStream();
// If using .NET 4 or later:
file.CopyTo(mem);
// Otherwise:
CopyStream(file, mem);
// getting the internal buffer (no additional copying)
byte[] buffer = mem.GetBuffer();
long length = mem.Length; // the actual length of the data
// (the array may be longer)
// if you need the array to be exactly as long as the data
byte[] truncated = mem.ToArray(); // makes another copy
Edit: originally I suggested using Jason's answer for a Stream that supports the Length property. But it had a flaw because it assumed that the Stream would return all its contents in a single Read, which is not necessarily true (not for a Socket, for example.) I don't know if there is an example of a Stream implementation in the BCL that does support Length but might return the data in shorter chunks than you request, but as anyone can inherit Stream this could easily be the case.
It's probably simpler for most cases to use the above general solution, but supposing you did want to read directly into an array that is bigEnough:
byte[] b = new byte[bigEnough];
int r, offset;
while ((r = input.Read(b, offset, b.Length - offset)) > 0)
offset += r;
That is, repeatedly call Read and move the position you will be storing the data at.
Byte[] Content = new BinaryReader(file.InputStream).ReadBytes(file.ContentLength);
byte[] buf; // byte array
Stream stream=Page.Request.InputStream; //initialise new stream
buf = new byte[stream.Length]; //declare arraysize
stream.Read(buf, 0, buf.Length); // read from stream to byte array
Ok, maybe I'm missing something here, but this is the way I do it:
public static Byte[] ToByteArray(this Stream stream) {
Int32 length = stream.Length > Int32.MaxValue ? Int32.MaxValue : Convert.ToInt32(stream.Length);
Byte[] buffer = new Byte[length];
stream.Read(buffer, 0, length);
return buffer;
}
if you post a file from mobile device or other
byte[] fileData = null;
using (var binaryReader = new BinaryReader(Request.Files[0].InputStream))
{
fileData = binaryReader.ReadBytes(Request.Files[0].ContentLength);
}
Stream s;
int len = (int)s.Length;
byte[] b = new byte[len];
int pos = 0;
while((r = s.Read(b, pos, len - pos)) > 0) {
pos += r;
}
A slightly more complicated solution is necesary is s.Length exceeds Int32.MaxValue. But if you need to read a stream that large into memory, you might want to think about a different approach to your problem.
Edit: If your stream does not support the Length property, modify using Earwicker's workaround.
public static class StreamExtensions {
// Credit to Earwicker
public static void CopyStream(this Stream input, Stream output) {
byte[] b = new byte[32768];
int r;
while ((r = input.Read(b, 0, b.Length)) > 0) {
output.Write(b, 0, r);
}
}
}
[...]
Stream s;
MemoryStream ms = new MemoryStream();
s.CopyStream(ms);
byte[] b = ms.GetBuffer();
"bigEnough" array is a bit of a stretch. Sure, buffer needs to be "big ebough" but proper design of an application should include transactions and delimiters. In this configuration each transaction would have a preset length thus your array would anticipate certain number of bytes and insert it into correctly sized buffer. Delimiters would ensure transaction integrity and would be supplied within each transaction. To make your application even better, you could use 2 channels (2 sockets). One would communicate fixed length control message transactions that would include information about size and sequence number of data transaction to be transferred using data channel. Receiver would acknowledge buffer creation and only then data would be sent.
If you have no control over stream sender than you need multidimensional array as a buffer. Component arrays would be small enough to be manageable and big enough to be practical based on your estimate of expected data. Process logic would seek known start delimiters and then ending delimiter in subsequent element arrays. Once ending delimiter is found, new buffer would be created to store relevant data between delimiters and initial buffer would have to be restructured to allow data disposal.
As far as a code to convert stream into byte array is one below.
Stream s = yourStream;
int streamEnd = Convert.ToInt32(s.Length);
byte[] buffer = new byte[streamEnd];
s.Read(buffer, 0, streamEnd);
Quick and dirty technique:
static byte[] StreamToByteArray(Stream inputStream)
{
if (!inputStream.CanRead)
{
throw new ArgumentException();
}
// This is optional
if (inputStream.CanSeek)
{
inputStream.Seek(0, SeekOrigin.Begin);
}
byte[] output = new byte[inputStream.Length];
int bytesRead = inputStream.Read(output, 0, output.Length);
Debug.Assert(bytesRead == output.Length, "Bytes read from stream matches stream length");
return output;
}
Test:
static void Main(string[] args)
{
byte[] data;
string path = #"C:\Windows\System32\notepad.exe";
using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read))
{
data = StreamToByteArray(fs);
}
Debug.Assert(data.Length > 0);
Debug.Assert(new FileInfo(path).Length == data.Length);
}
I would ask, why do you want to read a stream into a byte[], if you are wishing to copy the contents of a stream, may I suggest using MemoryStream and writing your input stream into a memory stream.
You could also try just reading in parts at a time and expanding the byte array being returned:
public byte[] StreamToByteArray(string fileName)
{
byte[] total_stream = new byte[0];
using (Stream input = File.Open(fileName, FileMode.Open, FileAccess.Read))
{
byte[] stream_array = new byte[0];
// Setup whatever read size you want (small here for testing)
byte[] buffer = new byte[32];// * 1024];
int read = 0;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
stream_array = new byte[total_stream.Length + read];
total_stream.CopyTo(stream_array, 0);
Array.Copy(buffer, 0, stream_array, total_stream.Length, read);
total_stream = stream_array;
}
}
return total_stream;
}