GZipStream does not decompress data correctly - c#

I'm trying to compress data with GZipStream. The code is quite straightforward:
// Serialize
var ms = new MemoryStream();
ProtoBuf.Serializer.Serialize(ms, result);
ms.Seek(0, SeekOrigin.Begin);
// Compress
var ms2 = new MemoryStream();
GZipStream zipStream = new GZipStream(ms2, CompressionMode.Compress);
ms.CopyTo(zipStream);
zipStream.Flush();
// Test
ms2.Seek(0, SeekOrigin.Begin);
var ms3 = new MemoryStream();
var unzipStream = new GZipStream(ms2, CompressionMode.Decompress);
unzipStream.CopyTo(ms3);
System.Diagnostics.Debug.WriteLine($"{ms.Length} =? {ms3.Length}");
Results should be equal, but I'm getting:
244480 =? 191481
Is the GZipStream unable to decompress stream compressed by itself? Or am I doing something wrong?

From the docs of GZipStream.Flush:
The current implementation of this method does not flush the internal buffer. The internal buffer is flushed when the object is disposed.
That fits in with not enough data being written to ms2. Try wrapping zipStream in a using block instead:
var ms2 = new MemoryStream();
using (GZipStream zipStream = new GZipStream(ms2, CompressionMode.Compress))
{
ms.CopyTo(zipStream);
}

Related

Is it safe to return a Stream in WebAPI?

My WebAPI needs to return a Stream.
using (var stream = new MemoryStream())
{
//Fill the Stream
stream.Seek(0, SeekOrigin.Begin);
return new OkObjectResult(stream);
}
The above would not work as stream is disposed before the Client can get it. So I simply changed it to:
var stream = new MemoryStream();
//Fill the Stream
stream.Seek(0, SeekOrigin.Begin);
return new OkObjectResult(stream);
I wonder if it is safe to return a stream from a WebAPI. My doubt is that the garbage collector could dispose stream before the client has finished to get/download the content.

The stream is invalid or no corresponding signature was found

I am trying to use SevenZipSharp to compress and decompress a memory stream. Compression is working fine but decompression is not. I think SevenZipSharp is not able to figure the archive type from the stream.
SevenZipCompressor compress = new SevenZip.SevenZipCompressor();
compress.CompressionLevel = CompressionLevel.Normal;
compress.CompressionMethod = CompressionMethod.Lzma
using (MemoryStream memStream = new MemoryStream())
{
compress.CompressFiles(memStream, #"d:\Temp1\MyFile.bmp");
using (FileStream file = new FileStream(#"d:\arch.7z", FileMode.Create, System.IO.FileAccess.Write))
{
memStream.CopyTo(file);
}
}
//works till here, file is created
Console.Read();
using (FileStream file = new FileStream(#"d:\arch.7z", FileMode.Open, System.IO.FileAccess.Read))
{
using (MemoryStream memStream = new MemoryStream())
{
file.CopyTo(memStream);
//throws exception here on this line
using (var extractor = new SevenZipExtractor(memStream))
{
extractor.ExtractFiles(#"d:\x", 0);
}
}
}
Try to see if your output file can be loaded using the 7Zip client. I'm guessing that it will fail.
The problem lies in the writing to the memorystream. Say, you write 100 bytes to the stream, it will be on position 100. When you use CopyTo, the stream will be copied from the current position, not the start of the stream.
So you'll have to reset the position to 0 after reading/writing to allow the next reader to read all the data. For instance when creating the 7Zip file:
using (MemoryStream memStream = new MemoryStream())
{
// Position starts at 0
compress.CompressFiles(memStream, #"d:\Temp1\MyFile.bmp");
// Position is now N
memStream.Position = 0; // <-- Reset the position to 0.
using (FileStream file = new FileStream(#"d:\arch.7z", FileMode.Create, System.IO.FileAccess.Write))
{
// Will copy all data in the stream from current position till the end of the stream.
memStream.CopyTo(file);
}
}

Writing a memory stream to a file

I have tried retrieving data in the json format as a string and writing it to a file and it worked great. Now I am trying to use MemoryStream to do the same thing but nothing gets written to a file - merely [{},{},{},{},{}] without any actual data.
My question is - how can I check if data indeed goes to memory stream correctly or if the problem occurs somewhere else. I do know that myList does contain data.
Here is my code:
MemoryStream ms = new MemoryStream();
DataContractJsonSerializer dcjs = new DataContractJsonSerializer(typeof(List<myClass>));
dcjs.WriteObject(ms, myList);
using (FileStream fs = new FileStream(Path.Combine(Application.StartupPath,"MyFile.json"), FileMode.OpenOrCreate))
{
ms.Position = 0;
ms.Read(ms.ToArray(), 0, (int)ms.Length);
fs.Write(ms.ToArray(), 0, ms.ToArray().Length);
ms.Close();
fs.Flush();
fs.Close();
}
There is a very handy method, Stream.CopyTo(Stream).
using (MemoryStream ms = new MemoryStream())
{
StreamWriter writer = new StreamWriter(ms);
writer.WriteLine("asdasdasasdfasdasd");
writer.Flush();
//You have to rewind the MemoryStream before copying
ms.Seek(0, SeekOrigin.Begin);
using (FileStream fs = new FileStream("output.txt", FileMode.OpenOrCreate))
{
ms.CopyTo(fs);
fs.Flush();
}
}
Also, you don't have to close fs since it's in a using statement and will be disposed at the end.
using (var memoryStream = new MemoryStream())
{
...
var fileName = $"FileName.xlsx";
string tempFilePath = Path.Combine(Path.GetTempPath() + fileName );
using (var fs = new FileStream(tempFilePath, FileMode.Create, FileAccess.Write))
{
memoryStream.WriteTo(fs);
}
}
//reset the position of the stream
ms.Position = 0;
//Then copy to filestream
ms.CopyTo(fileStream);
The issue is nothing to do with your file stream/ memory stream. The problem is that DataContractJsonSerializer is an OPT IN Serializer. You need to add [DataMemberAttribute] to all the properties that you need to serialize on myClass.
[DataContract]
public class myClass
{
[DataMember]
public string Foo { get; set; }
}
This line looks problematic:
ms.Read(ms.ToArray(), 0, (int)ms.Length);
You shouldn't need to read anything into the memory stream at this point, particularly when you're code is written to read ms into ms.
I'm pretty confident that simply removing this line will fix your problem.

SharpZipLib - zero bytes

I am trying to implement SharpZipLib
So I copied the below snippet from their samples:
// Compresses the supplied memory stream, naming it as zipEntryName, into a zip,
// which is returned as a memory stream or a byte array.
//
public MemoryStream CreateToMemoryStream(MemoryStream memStreamIn, string zipEntryName)
{
MemoryStream outputMemStream = new MemoryStream();
ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);
zipStream.SetLevel(3); //0-9, 9 being the highest level of compression
ZipEntry newEntry = new ZipEntry(zipEntryName);
newEntry.DateTime = DateTime.Now;
zipStream.PutNextEntry(newEntry);
StreamUtils.Copy(memStreamIn, zipStream, new byte[4096]);
zipStream.CloseEntry();
zipStream.IsStreamOwner = false; // False stops the Close also Closing the underlying stream.
zipStream.Close(); // Must finish the ZipOutputStream before using outputMemStream.
outputMemStream.Position = 0;
return outputMemStream;
// Alternative outputs:
// ToArray is the cleaner and easiest to use correctly with the penalty of duplicating allocated memory.
byte[] byteArrayOut = outputMemStream.ToArray();
// GetBuffer returns a raw buffer raw and so you need to account for the true length yourself.
byte[] byteArrayOut = outputMemStream.GetBuffer();
long len = outputMemStream.Length;
}
I copy pasted that function and called it this way:
using (MemoryStream ms = new MemoryStream())
using (FileStream file = new FileStream(#"c:\file.jpg", FileMode.Open, FileAccess.Read))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms.Write(bytes, 0, (int)file.Length);
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
result.WriteTo(new FileStream(#"c:\myzip.zip", FileMode.Create, System.IO.FileAccess.Write));
}
The myzip.zip is sucessfully created, but the file.jpg inside has zero bytes.
Any ideas?
Thanks a lot
It is necessary to "rewind" the input MemoryStream after writing the input file data:
using (MemoryStream ms = new MemoryStream())
using (FileStream file = File.OpenRead(#"input file path"))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms.Write(bytes, 0, (int)file.Length);
ms.Position = 0; // "Rewind" the stream to the beginning.
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
using (var outputStream = File.Create(#"output file path"))
{
result.WriteTo(outputStream);
}
}
Alternative (slightly simplified) version of the implementation:
var bytes = File.ReadAllBytes(#"input file path");
using (MemoryStream ms = new MemoryStream(bytes))
{
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
using (var outputStream = File.Create(#"output file path"))
{
result.WriteTo(outputStream);
}
}

DeflateStream doesnt work on MemoryStream?

I have the following piece of code:
MemoryStream resultStream = new MemoryStream();
string users = ""//Really long string goes here
BinaryFormatter bFormatter = new BinaryFormatter();
using (MemoryStream assignedUsersStream = new MemoryStream())
{
bFormatter.Serialize(assignedUsersStream, users);
assignedUsersStream.Position = 0;
using (var compressionStream =
new DeflateStream(resultStream, CompressionLevel.Optimal))
{
assignedUsersStream.CopyTo(compressionStream);
Console.WriteLine("Compressed from {0} to {1} bytes.",
assignedUsersStream.Length.ToString(),
resultStream.Length.ToString());
}
}
the thing is that resultStream is always empty!
What am I doing wrong here?
Put your verification WriteLine outside of the using. The buffers haven't been flushed yet.
using (DeflateStream compressionStream = new DeflateStream(resultStream, CompressionLevel.Optimal))
{
assignedUsersStream.CopyTo(compressionStream);
//Console.WriteLine("Compressed from {0} to {1} bytes.",
// assignedUsersStream.Length.ToString(), resultStream.Length.ToString());
}
Console.WriteLine("Compressed from {0} to {1} bytes.",
assignedUsersStream.Length, resultStream.ToArray().Length);
And aside, you don't need all those ToString()s in a writeline.
PS: All a BinaryFormatter does with a string is write the bytes with length prefix. If you don't need the prefix (my guess), it could become:
string users = "";//Really long string goes here
byte[] result;
using (MemoryStream resultStream = new MemoryStream())
{
using (DeflateStream compressionStream = new DeflateStream(resultStream,
CompressionLevel.Optimal))
{
byte[] inBuffer = Encoding.UTF8.GetBytes(users);
compressionStream.Write(inBuffer, 0, inBuffer.Length);
}
result = resultStream.ToArray();
}
The reverse is just as easy but you'll need an estimate of the maximum length to create the read-buffer:
string users2 = null;
using (MemoryStream resultStream = new MemoryStream(result))
{
using (DeflateStream compressionStream = new DeflateStream(resultStream,
CompressionMode.Decompress))
{
byte[] outBuffer = new byte[2048]; // need an estimate here
int length = compressionStream.Read(outBuffer, 0, outBuffer.Length);
users2 = Encoding.UTF8.GetString(outBuffer, 0, length);
}
}
That is because the DeflateStream doesn't flush the data to the underlying stream until it is closed. After it is closed, resultStream will contain the compressed data. Note that by default, DeflateStream closes the underlying stream when it's closed, but you don't want that, so you need to pass true for the leaveOpen parameter. Also, you don't need 2 memory streams, you can just serialize directly to the compressionStream:
string users = ""; //Really long string goes here
BinaryFormatter bFormatter = new BinaryFormatter();
using (MemoryStream resultStream = new MemoryStream())
{
using (DeflateStream compressionStream = new DeflateStream(resultStream, CompressionLevel.Optimal, true))
{
bFormatter.Serialize(compressionStream, users);
Console.WriteLine(resultStream.Length); // 0 at this point
}
Console.WriteLine(resultStream.Length); // now contains the actual length
}
From the original answer (I don't have enough credits to vote down)
Put your control WriteLine outside of the using
This is incomplete and IMO therefore misleading. DeflateStream's Dispose(bool) implementation Closes the underlying resultStream when the DeflateStream is being Finalized after it's been Garbage Collected. When this happens, resultStream.Length will throw:
Unhandled Exception: System.ObjectDisposedException: Cannot access a closed Stream.
In other words, Thomas Levesque's note is critical: also set leaveOpen to true.
An interesting question with some good points raised by HH and TL.
I've come in late to this as I ran into this same problem and reading the conflicting answers. The initial response works! as does this test (over simplified to highlight the answer):
var inStream = new MemoryStream(data);
var outStream = new MemoryStream();
using (var compressor = new DeflateStream(outStream, CompressionLevel.Optimal))
{
inStream.CopyTo(compressor);
}
return outStream;
where the using block needs to complete, triggering the compressor's Dispose, which internally Flush()es so that outStream will be guaranteed to contain the complete compressed data.

Categories