Uncompress data file with DeflateStream - c#

I'm having trouble reading a compressed (deflated) data file using C# .NET DeflateStream(..., CompressionMode.Decompress). The file was written earlier using DeflateStream(..., CompressionMode.Compress), and it seems to be just fine (I can even decompress it using a Java program).
However, the first Read() call on the input stream to decompress/inflate the compressed data returns a length of zero (end of file).
Here's the main driver, which is used for both compression and decompression:
public void Main(...)
{
    Stream inp;
    Stream outp;
    bool compr;
    ...
    inp = new FileStream(inName, FileMode.Open, FileAccess.Read);
    outp = new FileStream(outName, FileMode.Create, FileAccess.Write);
    if (compr)
        Compress(inp, outp);
    else
        Decompress(inp, outp);
    inp.Close();
    outp.Close();
}
Here's the basic code for decompression, which is what is failing:
public long Decompress(Stream inp, Stream outp)
{
    byte[] buf = new byte[BUF_SIZE];
    long nBytes = 0;
    // Decompress the contents of the input file
    inp = new DeflateStream(inp, CompressionMode.Decompress);
    for (;;)
    {
        int len;
        // Read a data block from the input stream
        len = inp.Read(buf, 0, buf.Length); //<<FAILS
        if (len <= 0)
            break;
        // Write the data block to the decompressed output stream
        outp.Write(buf, 0, len);
        nBytes += len;
    }
    // Done
    outp.Flush();
    return nBytes;
}
The call marked FAILS always returns zero. Why? I know it's got to be something simple, but I'm just not seeing it.
Here's the basic code for compression, which works just fine, and is almost exactly the same as the decompression method with the names swapped:
public long Compress(Stream inp, Stream outp)
{
    byte[] buf = new byte[BUF_SIZE];
    long nBytes = 0;
    // Compress the contents of the input file
    outp = new DeflateStream(outp, CompressionMode.Compress);
    for (;;)
    {
        int len;
        // Read a data block from the input stream
        len = inp.Read(buf, 0, buf.Length);
        if (len <= 0)
            break;
        // Write the data block to the compressed output stream
        outp.Write(buf, 0, len);
        nBytes += len;
    }
    // Done
    outp.Flush();
    return nBytes;
}
Solved
After seeing the correct solution, the constructor statement should be changed to:
inp = new DeflateStream(inp, CompressionMode.Decompress, true);
which keeps the underlying input stream open, and the following line needs to be added after the outp.Flush() call:
inp.Close();
The Close() call forces the deflate stream to flush its internal buffers. The true flag prevents it from closing the underlying stream, which is closed later in Main(). The same changes should also be made to the Compress() method.
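For reference, here is the corrected Decompress() with both changes applied (a sketch; BUF_SIZE as in the original):
public long Decompress(Stream inp, Stream outp)
{
    byte[] buf = new byte[BUF_SIZE];
    long nBytes = 0;
    // Pass true so the underlying file stream stays open; Main() closes it later.
    inp = new DeflateStream(inp, CompressionMode.Decompress, true);
    int len;
    while ((len = inp.Read(buf, 0, buf.Length)) > 0)
    {
        outp.Write(buf, 0, len);
        nBytes += len;
    }
    outp.Flush();
    inp.Close(); // forces the deflate stream to flush its internal buffers
    return nBytes;
}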

In your decompress method, you are reassigning inp to a new Stream (a deflate stream). You never close that deflate stream, but you do close the underlying file stream in Main(). A similar thing is going on in the compress method.
I think the problem is that the underlying file stream is being closed before the deflate stream's finalizer has had a chance to flush and close it.
I added one line of code to your Decompress and Compress methods:
inp.Close();  // at the end of the Decompress method
outp.Close(); // at the end of the Compress method
A better practice would be to enclose the streams in a using clause.
Here's an alternative way to write your Decompress method (I tested, and it works)
public static long Decompress(Stream inp, Stream outp)
{
    byte[] buf = new byte[BUF_SIZE];
    long nBytes = 0;
    // Decompress the contents of the input file
    using (inp = new DeflateStream(inp, CompressionMode.Decompress))
    {
        int len;
        while ((len = inp.Read(buf, 0, buf.Length)) > 0)
        {
            // Write the data block to the decompressed output stream
            outp.Write(buf, 0, len);
            nBytes += len;
        }
    }
    // Done
    return nBytes;
}

I had the same problem with GZipStream. Since we had the original length stored, I had to rewrite the code to read only the number of bytes expected in the original file.
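For what it's worth, a minimal sketch of that approach, assuming the uncompressed length was stored alongside the data (originalLength and the stream variables here are hypothetical):
byte[] result = new byte[originalLength];
using (var gzip = new GZipStream(compressedStream, CompressionMode.Decompress))
{
    int offset = 0;
    while (offset < result.Length)
    {
        int read = gzip.Read(result, offset, result.Length - offset);
        if (read == 0) break; // stream ended earlier than expected
        offset += read;
    }
}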
Hopefully I'm about to learn that there was a better answer (fingers crossed).

Download a file from System.Io.Stream class C# asp.net [duplicate]

I have a StreamReader object that I initialized with a stream, now I want to save this stream to disk (the stream may be a .gif or .jpg or .pdf).
Existing Code:
StreamReader sr = new StreamReader(myOtherObject.InputStream);
I need to save this to disk (I have the filename).
In the future I may want to store this to SQL Server.
I have the encoding type also, which I will need if I store it to SQL Server, correct?
As highlighted by Tilendor in Jon Skeet's answer, streams have a CopyTo method since .NET 4.
var fileStream = File.Create("C:\\Path\\To\\File");
myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
myOtherObject.InputStream.CopyTo(fileStream);
fileStream.Close();
Or with the using syntax:
using (var fileStream = File.Create("C:\\Path\\To\\File"))
{
    myOtherObject.InputStream.Seek(0, SeekOrigin.Begin);
    myOtherObject.InputStream.CopyTo(fileStream);
}
You have to call Seek if you're not already at the beginning or you won't copy the entire stream.
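For instance, a defensive version of that rewind (input is a hypothetical stream variable) might be:
// Only rewind when the stream supports seeking and isn't already at the start.
if (input.CanSeek && input.Position != 0)
{
    input.Seek(0, SeekOrigin.Begin);
}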
You must not use StreamReader for binary files (like gifs or jpgs). StreamReader is for text data. You will almost certainly lose data if you use it for arbitrary binary data. (If you use Encoding.GetEncoding(28591) you will probably be okay, but what's the point?)
Why do you need to use a StreamReader at all? Why not just keep the binary data as binary data and write it back to disk (or SQL) as binary data?
EDIT: As this seems to be something people want to see... if you do just want to copy one stream to another (e.g. to a file) use something like this:
/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ((len = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, len);
    }
}
To use it to dump a stream to a file, for example:
using (Stream file = File.Create(filename))
{
    CopyStream(input, file);
}
Note that Stream.CopyTo was introduced in .NET 4, serving basically the same purpose.
public void CopyStream(Stream stream, string destPath)
{
    using (var fileStream = new FileStream(destPath, FileMode.Create, FileAccess.Write))
    {
        stream.CopyTo(fileStream);
    }
}
private void SaveFileStream(String path, Stream stream)
{
    var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write);
    stream.CopyTo(fileStream);
    fileStream.Dispose();
}
I don't get all of the answers using CopyTo; the systems running the app might not have been upgraded to .NET 4.0+. I know some would like to force people to upgrade, but compatibility is nice, too.
Another thing, I don't get using a stream to copy from another stream in the first place. Why not just do:
byte[] bytes = myOtherObject.InputStream.ToArray(); // note: ToArray() is a MemoryStream member; a general Stream does not have it
Once you have the bytes, you can easily write them to a file:
public static void WriteFile(string fileName, byte[] bytes)
{
    string path = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
    if (!path.EndsWith(@"\")) path += @"\";
    if (File.Exists(Path.Combine(path, fileName)))
        File.Delete(Path.Combine(path, fileName));
    using (FileStream fs = new FileStream(Path.Combine(path, fileName), FileMode.CreateNew, FileAccess.Write))
    {
        fs.Write(bytes, 0, bytes.Length);
        //fs.Close();
    }
}
This code works as I've tested it with a .jpg file, though I admit I have only used it with small files (less than 1 MB). One stream, no copying between streams, no encoding needed, just write the bytes! No need to over-complicate things with StreamReader if you already have a stream you can convert to bytes directly with .ToArray()!
The only potential downside I can see is with large files: keeping the data as a stream and using .CopyTo() (or equivalent) lets FileStream stream it instead of materializing everything in memory, so doing it with a byte[] means you must have enough memory to hold the whole stream as a byte[] object. In my situation, reading an OracleBlob, I had to go to a byte[] anyway; it was small enough, and no streaming was available to me, so I just sent my bytes to my function, above.
Another option, using a stream, would be to use it with Jon Skeet's CopyStream function that was in another post; this just uses FileStream to take the input stream and create the file from it directly. It does not use File.Create, like he did (which initially seemed to be problematic for me, but I later found it was likely just a VS bug...).
/// <summary>
/// Copies the contents of input to output. Doesn't close either stream.
/// </summary>
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ((len = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, len);
    }
}
public static void WriteFile(string fileName, Stream inputStream)
{
    string path = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
    if (!path.EndsWith(@"\")) path += @"\";
    if (File.Exists(Path.Combine(path, fileName)))
        File.Delete(Path.Combine(path, fileName));
    using (FileStream fs = new FileStream(Path.Combine(path, fileName), FileMode.CreateNew, FileAccess.Write))
    {
        CopyStream(inputStream, fs);
    }
    inputStream.Close(); // closing also flushes; flushing after Close would throw
}
Here's an example that uses proper usings and an implementation of IDisposable:
static void WriteToFile(string sourceFile, string destinationfile, bool append = true, int bufferSize = 4096)
{
    // note: the append and bufferSize parameters are unused in this byte-by-byte copy
    using (var sourceFileStream = new FileStream(sourceFile, FileMode.OpenOrCreate))
    {
        using (var destinationFileStream = new FileStream(destinationfile, FileMode.OpenOrCreate))
        {
            while (sourceFileStream.Position < sourceFileStream.Length)
            {
                destinationFileStream.WriteByte((byte)sourceFileStream.ReadByte());
            }
        }
    }
}
...and there's also this
public static void WriteToFile(Stream stream, string destinationFile, int bufferSize = 4096, FileMode mode = FileMode.OpenOrCreate, FileAccess access = FileAccess.ReadWrite, FileShare share = FileShare.ReadWrite)
{
    // note: the bufferSize parameter is unused in this byte-by-byte copy
    using (var destinationFileStream = new FileStream(destinationFile, mode, access, share))
    {
        while (stream.Position < stream.Length)
        {
            destinationFileStream.WriteByte((byte)stream.ReadByte());
        }
    }
}
The key is understanding the proper use of using (which should be applied at the instantiation of the object that implements IDisposable, as shown above) and having a good idea of how the properties work for streams. Position is literally the index within the stream (starting at 0), advanced as each byte is read with the ReadByte method. In this case, I am essentially using it in place of a for-loop variable and simply letting it run all the way up to Length, which is literally the end of the entire stream in bytes. The result is something simple and elegant that resolves everything cleanly.
Keep in mind, too, that the ReadByte method returns the byte as an int, which can simply be cast back.
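A small illustration of that round trip (the stream variables here are hypothetical); the int return value lets ReadByte signal end-of-stream with -1, while real data stays in the 0-255 range:
int next;
while ((next = sourceStream.ReadByte()) != -1)
{
    destinationStream.WriteByte((byte)next); // casting back to byte is lossless for 0-255
}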
I'm gonna add another implementation I recently wrote, creating a dynamic buffer of sorts to ensure sequential data writes and prevent memory overload.
private void StreamBuffer(Stream stream, int buffer)
{
    using (var memoryStream = new MemoryStream())
    {
        stream.CopyTo(memoryStream);
        var memoryBuffer = memoryStream.GetBuffer();
        for (int i = 0; i < memoryBuffer.Length;)
        {
            var networkBuffer = new byte[buffer];
            for (int j = 0; j < networkBuffer.Length && i < memoryBuffer.Length; j++)
            {
                networkBuffer[j] = memoryBuffer[i];
                i++;
            }
            // Assuming a destination file stream; note the final networkBuffer
            // may be only partly filled (zero-padded at the end).
            destinationFileStream.Write(networkBuffer, 0, networkBuffer.Length);
        }
    }
}
The explanation is fairly simple: we keep in mind the entire set of data we wish to write, but we only want to write it in certain amounts. The outer loop leaves its last clause empty (the same as a while). The inner loop fills a byte array sized to what was passed in, comparing j against the buffer size and i against the size of the original array; once we pass the end of the original data, we end the run. A variant that avoids zero-padding the final chunk is sketched below.
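As a hedge on the loop above: the final networkBuffer can be partly empty, so writing its full length appends zero padding. A sketch of the same idea that writes only the bytes actually copied (still assuming a destinationFileStream exists, as in the original):
var data = memoryStream.GetBuffer();
int length = (int)memoryStream.Length; // GetBuffer() can be larger than the real data
for (int i = 0; i < length; i += buffer)
{
    int chunkSize = Math.Min(buffer, length - i);
    destinationFileStream.Write(data, i, chunkSize);
}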
Why not use a FileStream object?
public void SaveStreamToFile(string fileFullPath, Stream stream)
{
    if (stream.Length == 0) return;
    // Create a FileStream object to write a stream to a file
    using (FileStream fileStream = System.IO.File.Create(fileFullPath, (int)stream.Length))
    {
        // Fill the bytes[] array with the stream data
        // (note: a single Read call is not guaranteed to fill the buffer;
        // a read loop is safer in general)
        byte[] bytesInStream = new byte[stream.Length];
        stream.Read(bytesInStream, 0, bytesInStream.Length);
        // Use FileStream object to write to the specified file
        fileStream.Write(bytesInStream, 0, bytesInStream.Length);
    }
}
// If you don't have .NET 4.0 :)
public void SaveStreamToFile(Stream stream, string filename)
{
    using (Stream destination = File.Create(filename))
        Write(stream, destination);
}

// Typically I implement this Write method as a Stream extension method.
// The framework handles buffering.
public void Write(Stream from, Stream to)
{
    for (int a = from.ReadByte(); a != -1; a = from.ReadByte())
        to.WriteByte((byte)a);
}

/*
Note, StreamReader is an IEnumerable<Char> while Stream is an IEnumerable<byte>.
The distinction is significant, such as with multiple-byte character encodings
like Unicode used in .NET, where Char is one or more bytes (byte[n]). Also, the
resulting translation from IEnumerable<byte> to IEnumerable<Char> can lose bytes
or insert them (for example, "\n" vs. "\r\n") depending on the StreamReader
instance's CurrentEncoding.
*/
Another option is to get the stream to a byte[] and use File.WriteAllBytes. This should do:
using (var stream = new MemoryStream())
{
    input.CopyTo(stream);
    File.WriteAllBytes(file, stream.ToArray());
}
Wrapping it in an extension method gives it better naming:
// Extension methods must be static, inside a static class.
public static void WriteTo(this Stream input, string file)
{
    // your fav write method:
    using (var stream = File.Create(file))
    {
        input.CopyTo(stream);
    }

    // or
    using (var stream = new MemoryStream())
    {
        input.CopyTo(stream);
        File.WriteAllBytes(file, stream.ToArray());
    }

    // whatever fits.
}
public void testdownload(Stream input)
{
    byte[] buffer = new byte[16345];
    using (FileStream fs = new FileStream(this.FullLocalFilePath,
        FileMode.Create, FileAccess.Write, FileShare.None))
    {
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            fs.Write(buffer, 0, read);
        }
    }
}

Convert a VERY LARGE binary file into a Base64String incrementally

I need help converting a VERY LARGE binary file (ZIP file) to a Base64String and back again. The files are too large to be loaded into memory all at once (they throw OutOfMemoryExceptions) otherwise this would be a simple task. I do not want to process the contents of the ZIP file individually, I want to process the entire ZIP file.
The problem:
I can convert the entire ZIP file (test sizes vary from 1 MB to 800 MB at present) to Base64String, but when I convert it back, it is corrupted. The new ZIP file is the correct size, it is recognized as a ZIP file by Windows and WinRAR/7-Zip, etc., and I can even look inside the ZIP file and see the contents with the correct sizes/properties, but when I attempt to extract from the ZIP file, I get: "Error: 0x80004005" which is a general error code.
I am not sure where or why the corruption is happening. I have done some investigating, and I have noticed the following:
If you have a large text file, you can convert it to Base64String incrementally without issue. If calling Convert.ToBase64String on the entire file yielded: "abcdefghijklmnopqrstuvwx", then calling it on the file in two pieces would yield: "abcdefghijkl" and "mnopqrstuvwx".
Unfortunately, if the file is a binary then the result is different. While the entire file might yield: "abcdefghijklmnopqrstuvwx", trying to process this in two pieces would yield something like: "oiweh87yakgb" and "kyckshfguywp".
Is there a way to incrementally base 64 encode a binary file while avoiding this corruption?
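For context, the arithmetic behind the "multiple of three" buffer in the code below: Base64 maps every 3 input bytes to 4 output characters, so chunks whose length is a multiple of 3 produce no '=' padding and their encodings concatenate cleanly. A tiny illustration:
byte[] data = { 1, 2, 3, 4, 5, 6 };
string whole = Convert.ToBase64String(data);          // "AQIDBAUG"
string parts = Convert.ToBase64String(data, 0, 3)
             + Convert.ToBase64String(data, 3, 3);    // "AQIDBAUG" as well
Console.WriteLine(whole == parts);                    // True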
My code:
private void ConvertLargeFile()
{
    FileStream inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read);
    byte[] buffer = new byte[MultipleOfThree];
    int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
    while (bytesRead > 0)
    {
        byte[] secondaryBuffer = new byte[buffer.Length];
        int secondaryBufferBytesRead = bytesRead;
        Array.Copy(buffer, secondaryBuffer, buffer.Length);
        bool isFinalChunk = false;
        Array.Clear(buffer, 0, buffer.Length);
        bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        if (bytesRead == 0)
        {
            isFinalChunk = true;
            buffer = new byte[secondaryBufferBytesRead];
            Array.Copy(secondaryBuffer, buffer, buffer.Length);
        }
        String base64String = Convert.ToBase64String(isFinalChunk ? buffer : secondaryBuffer);
        File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
    }
    inputStream.Dispose();
}
The decoding is more of the same. I use the size of the base64String variable above (which varies depending on the original buffer size that I test with), as the buffer size for decoding. Then, instead of Convert.ToBase64String(), I call Convert.FromBase64String() and write to a different file name/path.
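As a sketch of that decode side (the output file name here is hypothetical): since AppendAllText wrote no line breaks, reading the Base64 text in chunks whose length is a multiple of 4 characters lets each chunk decode independently:
const int CharsPerChunk = 4096; // any multiple of 4
char[] chunk = new char[CharsPerChunk];
using (var reader = new StreamReader("C:\\Users\\test\\Desktop\\Base64Zip"))
using (var output = new FileStream("C:\\Users\\test\\Desktop\\my_restored.zip", FileMode.Create, FileAccess.Write))
{
    int charsRead;
    // ReadBlock keeps reading until the chunk is full or the file ends,
    // preserving the multiple-of-4 alignment on all but the last chunk.
    while ((charsRead = reader.ReadBlock(chunk, 0, chunk.Length)) > 0)
    {
        byte[] decoded = Convert.FromBase64CharArray(chunk, 0, charsRead);
        output.Write(decoded, 0, decoded.Length);
    }
}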
EDIT:
In my haste to reduce the code (I refactored it into a new project, separate from other processing to eliminate code that isn't central to the issue) I introduced a bug. The base 64 conversion should be performed on the secondaryBuffer for all iterations save the last (Identified by isFinalChunk), when buffer should be used. I have corrected the code above.
EDIT #2:
Thank you all for your comments/feedback. After correcting the bug (see the above edit), I re-tested my code, and it is actually working now. I intend to test and implement @rene's solution as it appears to be the best, but I thought that I should let everyone know of my discovery as well.
Based on the code shown in the blog from Wiktor Zychla, the following code works. The same solution is indicated in the remarks section of Convert.ToBase64String, as pointed out by Ivan Stoev.
// using System.Security.Cryptography
private void ConvertLargeFile()
{
    // encode
    var filein = @"C:\Users\test\Desktop\my.zip";
    var fileout = @"C:\Users\test\Desktop\Base64Zip";
    using (FileStream fs = File.Open(fileout, FileMode.Create))
    using (var cs = new CryptoStream(fs, new ToBase64Transform(), CryptoStreamMode.Write))
    using (var fi = File.Open(filein, FileMode.Open))
    {
        fi.CopyTo(cs);
    }
    // the zip file is now stored in Base64Zip

    // and decode
    using (FileStream f64 = File.Open(fileout, FileMode.Open))
    using (var cs = new CryptoStream(f64, new FromBase64Transform(), CryptoStreamMode.Read))
    using (var fo = File.Open(filein + ".orig", FileMode.Create))
    {
        cs.CopyTo(fo);
    }
    // the original file is in my.zip.orig
    // use the command-line tool
    //   fc my.zip my.zip.orig
    // to verify that the start file and the encoded-then-decoded file
    // are the same
}
The code uses standard classes from the System.Security.Cryptography namespace: a CryptoStream together with ToBase64Transform and its counterpart FromBase64Transform.
You can avoid using a secondary buffer by passing offset and length to Convert.ToBase64String, like this:
private void ConvertLargeFile()
{
    using (var inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read))
    {
        byte[] buffer = new byte[MultipleOfThree];
        int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        while (bytesRead > 0)
        {
            String base64String = Convert.ToBase64String(buffer, 0, bytesRead);
            File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
            bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        }
    }
}
The above should work, but I think Rene's answer is actually the better solution.
Use this code:
public void ConvertLargeFile(string source, string destination)
{
    using (FileStream inputStream = new FileStream(source, FileMode.Open, FileAccess.Read))
    {
        int buffer_size = 30000; // or any multiple of 3
        byte[] buffer = new byte[buffer_size];
        int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        while (bytesRead > 0)
        {
            byte[] buffer2 = buffer;
            if (bytesRead < buffer_size)
            {
                buffer2 = new byte[bytesRead];
                Buffer.BlockCopy(buffer, 0, buffer2, 0, bytesRead);
            }
            string base64String = System.Convert.ToBase64String(buffer2);
            File.AppendAllText(destination, base64String);
            bytesRead = inputStream.Read(buffer, 0, buffer.Length);
        }
    }
}

FileResult buffered to memory

I'm trying to return large files via a controller ActionResult and have implemented a custom FileResult class like the following.
public class StreamedFileResult : FileResult
{
    private string _FilePath;

    public StreamedFileResult(string filePath, string contentType)
        : base(contentType)
    {
        _FilePath = filePath;
    }

    protected override void WriteFile(System.Web.HttpResponseBase response)
    {
        using (FileStream fs = new FileStream(_FilePath, FileMode.Open, FileAccess.Read))
        {
            int bufferLength = 65536;
            byte[] buffer = new byte[bufferLength];
            int bytesRead = 0;
            while (true)
            {
                bytesRead = fs.Read(buffer, 0, bufferLength);
                if (bytesRead == 0)
                {
                    break;
                }
                response.OutputStream.Write(buffer, 0, bytesRead);
            }
        }
    }
}
However, the problem I am having is that the entire file appears to be buffered into memory. What would I need to do to prevent this?
You need to flush the response in order to prevent buffering. However, if you keep buffering without setting Content-Length, the user will not see any progress. So, in order for users to see proper progress, IIS buffers the entire content, calculates the Content-Length, applies compression, and then sends the response. We have adopted the following procedure to deliver files to the client with high performance.
FileInfo path = new FileInfo(filePath);
// user will not see progress if content-length is not specified
response.AddHeader("Content-Length", path.Length.ToString());
response.Flush(); // do not add any more headers after this...
byte[] buffer = new byte[4 * 1024]; // 4 KB is good for a network chunk
using (FileStream fs = path.OpenRead())
{
    int count = 0;
    while ((count = fs.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (!response.IsClientConnected)
        {
            // network connection broke for some reason...
            break;
        }
        response.OutputStream.Write(buffer, 0, count);
        response.Flush(); // this will prevent buffering...
    }
}
You can change the buffer size, but 4 KB is ideal, as the lower-level file system also reads in chunks of 4 KB.
Akash Kava is partly right and partly wrong. You DO NOT need to add the Content-Length header or flush afterward. But you DO need to periodically flush response.OutputStream and then response. ASP.NET MVC (at least version 5) will automatically convert this into a "Transfer-Encoding: chunked" response.
byte[] buffer = new byte[4 * 1024]; // 4 KB is good for a network chunk
using (FileStream fs = path.OpenRead())
{
    int count = 0;
    while ((count = fs.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (!response.IsClientConnected)
        {
            // network connection broke for some reason...
            break;
        }
        response.OutputStream.Write(buffer, 0, count);
        response.OutputStream.Flush();
        response.Flush(); // this will prevent buffering...
    }
}
I tested it and it works.

how to buffer an input stream until it is complete

I'm implementing a WCF service that accepts image streams. However, I'm currently getting an exception when I run it, as it tries to get the length of the stream before the stream is complete. So what I'd like to do is buffer the stream until it's complete. However, I can't find any examples of how to do this...
Can anyone help?
My code so far:
public String uploadUserImage(Stream stream)
{
    Stream fs = stream;
    BinaryReader br = new BinaryReader(fs);
    Byte[] bytes = br.ReadBytes((Int32)fs.Length); // this causes the exception
    File.WriteAllBytes(filepath, bytes);
    return filepath; // filepath defined elsewhere
}
Rather than try to fetch the length, you should read from the stream until it returns that it's "done". In .NET 4, this is really easy:
// Assuming we *really* want to read it into memory first...
MemoryStream memoryStream = new MemoryStream();
stream.CopyTo(memoryStream);
memoryStream.Position = 0;
File.WriteAllBytes(filepath, memoryStream.ToArray());
In .NET 3.5 there's no CopyTo method, but you can write something similar yourself:
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[8192];
    int bytesRead;
    while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, bytesRead);
    }
}
However, now we've got something to copy a stream, why bother reading it all into memory first? Let's just write it straight to a file:
using (FileStream output = File.OpenWrite(filepath))
{
    CopyStream(stream, output); // Or stream.CopyTo(output);
}
I'm not sure what you are returning (or not returning), but something like this might work for you:
public String uploadUserImage(Stream stream)
{
    const int KB = 1024;
    Byte[] bytes = new Byte[KB];
    StringBuilder sb = new StringBuilder();
    using (BinaryReader br = new BinaryReader(stream))
    {
        int len;
        do
        {
            len = br.Read(bytes, 0, KB);
            // decode only the bytes actually read on this pass
            string readData = Encoding.UTF8.GetString(bytes, 0, len);
            sb.Append(readData);
        } while (len == KB);
    }
    //File.WriteAllBytes(filepath, bytes);
    return sb.ToString();
}
A string can hold up to 2 GB, I believe.
Try this:
using (StreamWriter sw = File.CreateText(filepath))
{
    // CopyTo needs a Stream, not a StreamWriter; use the writer's BaseStream.
    // Only appropriate for text content.
    stream.CopyTo(sw.BaseStream);
}
Jon Skeet's answer for .NET 3.5 and below using a buffered read is actually done incorrectly.
The buffer isn't cleared between reads, which can result in issues on any read that returns less than 8192; for example, if the 2nd read read 192 bytes, the last 8000 bytes from the first read would STILL be in the buffer, which would then be returned to the stream.
My code below: you supply it a Stream and it will return an IEnumerable of byte arrays.
Using this you can foreach it and write to a MemoryStream, then use .GetBuffer() to end up with a compiled, merged byte[] (see the usage sketch after the code).
private IEnumerable<byte[]> ReadFullStream(Stream stream)
{
    while (true)
    {
        byte[] buffer = new byte[8192]; // created fresh every loop, so its contents start cleared
        int bytesRead = stream.Read(buffer, 0, buffer.Length); // read up to 8192 bytes into the buffer
        if (bytesRead == 0) // if we read nothing, the stream is finished
        {
            break;
        }
        if (bytesRead < buffer.Length) // if we read LESS than 8192 bytes, trim the buffer so there are no trailing 0x00 bytes
        {
            Array.Resize(ref buffer, bytesRead);
        }
        yield return buffer; // yield return the buffer data
    } // loop here until a read returns 0 (end of stream)
}
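A hypothetical usage sketch (sourceStream is assumed), gathering the yielded chunks into a single byte[]:
using (var ms = new MemoryStream())
{
    foreach (byte[] chunk in ReadFullStream(sourceStream))
    {
        ms.Write(chunk, 0, chunk.Length);
    }
    byte[] allBytes = ms.ToArray(); // ToArray() trims to the data length, unlike GetBuffer()
}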

compressing and decompressing source data gives result different than source data

In my app I need to decompress data written by DataContractSerializer to a compressed DeflateStream in another app, edit the decompressed data, and compress it again.
Decompression works fine, but not for data compressed by me.
The problem is that when I do this:
byte[] result = Compressor.Compress(Compressor.Decompress(sourceData));
the length of the result byte array is different than sourceData array.
For example:
string source = "test value";
byte[] oryg = Encoding.Default.GetBytes(source);
byte[] comp = Compressor.Compress(oryg);
byte[] result1 = Compressor.Decompress(comp);
string result2 = Encoding.Default.GetString(result1);
and here result1.Length is 0 and result2 is "" of course
Here is the code of my Compressor class.
public static class Compressor
{
    public static byte[] Decompress(byte[] data)
    {
        byte[] result;
        using (MemoryStream baseStream = new MemoryStream(data))
        {
            using (DeflateStream stream = new DeflateStream(baseStream, CompressionMode.Decompress))
            {
                result = ReadFully(stream, -1);
            }
        }
        return result;
    }

    public static byte[] Compress(byte[] data)
    {
        byte[] result;
        using (MemoryStream baseStream = new MemoryStream())
        {
            using (DeflateStream stream = new DeflateStream(baseStream, CompressionMode.Compress, true))
            {
                stream.Write(data, 0, data.Length);
                result = baseStream.ToArray();
            }
        }
        return result;
    }

    /// <summary>
    /// Reads data from a stream until the end is reached. The
    /// data is returned as a byte array. An IOException is
    /// thrown if any of the underlying IO calls fail.
    /// </summary>
    /// <param name="stream">The stream to read data from</param>
    /// <param name="initialLength">The initial buffer length</param>
    private static byte[] ReadFully(Stream stream, int initialLength)
    {
        // If we've been passed an unhelpful initial length, just
        // use 32K.
        if (initialLength < 1)
        {
            initialLength = 65768 / 2;
        }
        byte[] buffer = new byte[initialLength];
        int read = 0;
        int chunk;
        while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0)
        {
            read += chunk;
            // If we've reached the end of our buffer, check to see if there's
            // any more information
            if (read == buffer.Length)
            {
                int nextByte = stream.ReadByte();
                // End of stream? If so, we're done
                if (nextByte == -1)
                {
                    return buffer;
                }
                // Nope. Resize the buffer, put in the byte we've just
                // read, and continue
                byte[] newBuffer = new byte[buffer.Length * 2];
                Array.Copy(buffer, newBuffer, buffer.Length);
                newBuffer[read] = (byte)nextByte;
                buffer = newBuffer;
                read++;
            }
        }
        // Buffer is now too big. Shrink it.
        byte[] ret = new byte[read];
        Array.Copy(buffer, ret, read);
        return ret;
    }
}
Please help me with this case if you can.
Best regards,
Adam
(Edited: switched from using Flush, which still might not flush out all bytes, to ensuring the deflate stream is disposed first, as per Phil's answer here: zip and unzip string with Deflate.)
Before attempting to read from the backing store, you have to ensure the deflate stream has fully flushed itself when compressing, allowing it to finish compressing and write the final bytes. Closing the deflate stream, or disposing of it, will achieve this.
public static byte[] Compress(byte[] data)
{
    byte[] result;
    using (MemoryStream baseStream = new MemoryStream())
    {
        using (DeflateStream stream = new DeflateStream(baseStream, CompressionMode.Compress, true))
        {
            stream.Write(data, 0, data.Length);
        }
        result = baseStream.ToArray(); // only safe to read after deflate is closed
    }
    return result;
}
Also, your ReadFully routine looks incredibly complicated and likely to have bugs.
One being:
while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0)
When reading the 2nd chunk, read will be greater than the length of the buffer, meaning it'll always pass a negative value to stream.Read for the number of bytes to read. My guess is that it'll never read the 2nd chunk, returning zero, and fall out of the while loop.
I recommend Jon's version of ReadFully for this purpose: Creating a byte array from a stream
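That version boils down to roughly this (a sketch in the spirit of the linked answer):
public static byte[] ReadFully(Stream input)
{
    using (var ms = new MemoryStream())
    {
        byte[] buffer = new byte[16 * 1024];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            ms.Write(buffer, 0, read);
        }
        return ms.ToArray();
    }
}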
