Failed to use GZipStream to decompress a file (without using CopyToAsync)

Failed to use GZipStream to decompress a file (without using CopyToAsync) - c#

I'm trying to to decompress a MemoryStream using ReadAsync/WriteAsync but it's not working.
int bufferSize = 8192;
using (var memoryStream = new MemoryStream())
using (var fileStream = new FileStream(destinationFilename, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
{
// ... populate the MemoryStream ...
memoryStream.Position = 0;
using (var gzipStream = new GZipStream(memoryStream, CompressionMode.Decompress, true))
{
////await gzipStream.CopyToAsync(fileStream);
byte[] buffer = new byte[bufferSize];
while (await gzipStream.ReadAsync(buffer, 0, bufferSize) > 0)
{
await fileStream.WriteAsync(buffer, 0, bufferSize);
}
}
await fileStream.FlushAsync();
}
The gzipStream.CopyToAsync works but not the other way. Why?
Thanks.

ReadAsync is returning number of bytes read - you're ignoring that number. You can only WriteAsync the exact count of bytes you've read first.

Related

Zip file is getting corrupted after downloading from server in C#

request = MakeConnection(uri, WebRequestMethods.Ftp.DownloadFile, username, password);
response = (FtpWebResponse)request.GetResponse();
Stream responseStream = response.GetResponseStream();
//This part of the code is used to write the read content from the server
using (StreamReader responseReader = new StreamReader(responseStream))
{
using (var destinationStream = new FileStream(toFilenameToWrite, FileMode.Create))
{
byte[] fileContents = Encoding.UTF8.GetBytes(responseReader.ReadToEnd());
destinationStream.Write(fileContents, 0, fileContents.Length);
}
}
//This part of the code is used to write the read content from the server
using (var destinationStream = new FileStream(toFilenameToWrite, FileMode.Create))
{
long length = response.ContentLength;
int bufferSize = 2048;
int readCount;
byte[] buffer = new byte[2048];
readCount = responseStream.Read(buffer, 0, bufferSize);
while (readCount > 0)
{
destinationStream.Write(buffer, 0, readCount);
readCount = responseStream.Read(buffer, 0, bufferSize);
}
}
The former ones writes the content to the file but when I try to open the file it says it is corrupted. But the later one does the job perfectly when downloading zip files. Is there any specific reason why the former code doesn't work for zip files as it works perfectly for text files?

byte[] fileContents = Encoding.UTF8.GetBytes(responseReader.ReadToEnd());
You try to interpret a binary PDF file as an UTF-8 text. That just cannot work.
For a correct code, see Upload and download a binary file to/from FTP server in C#/.NET.

Use BinaryWriter and pass it FileStream.
//This part of the code is used to write the read content from the server
using (var destinationStream = new BinaryWriter(new FileStream(toFilenameToWrite, FileMode.Create)))
{
long length = response.ContentLength;
int bufferSize = 2048;
int readCount;
byte[] buffer = new byte[2048];
readCount = responseStream.Read(buffer, 0, bufferSize);
while (readCount > 0)
{
destinationStream.Write(buffer, 0, readCount);
readCount = responseStream.Read(buffer, 0, bufferSize);
}
}

here is my solution that worked for me
C#
public IActionResult GetZip([FromBody] List<DocumentAndSourceDto> documents)
{
List<Document> listOfDocuments = new List<Document>();
foreach (DocumentAndSourceDto doc in documents)
listOfDocuments.Add(_documentService.GetDocumentWithServerPath(doc.Id));
using (var ms = new MemoryStream())
{
using (var zipArchive = new ZipArchive(ms, ZipArchiveMode.Create, true))
{
foreach (var attachment in listOfDocuments)
{
var entry = zipArchive.CreateEntry(attachment.FileName);
using (var fileStream = new FileStream(attachment.FilePath, FileMode.Open))
using (var entryStream = entry.Open())
{
fileStream.CopyTo(entryStream);
}
}
}
ms.Position = 0;
return File(ms.ToArray(), "application/zip");
}
throw new ErrorException("Can't zip files");
}
don't miss the ms.Position = 0; here

uncompressed file is bigger than original file in GZIP

i'm using the following function to compress(thanks to http://www.dotnetperls.com/):
public static void CompressStringToFile(string fileName, string value)
{
// A.
// Write string to temporary file.
string temp = Path.GetTempFileName();
File.WriteAllText(temp, value);
// B.
// Read file into byte array buffer.
byte[] b;
using (FileStream f = new FileStream(temp, FileMode.Open))
{
b = new byte[f.Length];
f.Read(b, 0, (int)f.Length);
}
// C.
// Use GZipStream to write compressed bytes to target file.
using (FileStream f2 = new FileStream(fileName, FileMode.Create))
using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
{
gz.Write(b, 0, b.Length);
}
}
and for decompress:
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
so my goal is actually compress log files and than to decompress them in memory and compare the uncompressed file to the original file in order to check that the compression succeeded and i'm able to open the compressed file successfuly.
the problem is that the uncompressed file is most of the time bigger than the original file and my compare check is failing altough the compression probably succeeded.
any idea why ?
btw here how i compare the uncompressed file to the original file:
static bool FileEquals(byte[] file1, byte[] file2)
{
if (file1.Length == file2.Length)
{
for (int i = 0; i < file1.Length; i++)
{
if (file1[i] != file2[i])
{
return false;
}
}
return true;
}
return false;
}

Try this method to compress a file:
public static byte[] Compress(byte[] raw)
{
using (MemoryStream memory = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(memory,
CompressionMode.Compress, true))
{
gzip.Write(raw, 0, raw.Length);
}
return memory.ToArray();
}
}
}
And this to decompress :
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
}
Tell me if it worked.
Goodluck.

Think you'd be better off with the simplest API call, try Stream.CopyTo(). I can't find the error in your code. If I was working on it, I'd probably make sure everything is getting flushed properly.. can't recall if GZipStream is going to flush its output to FileStream when the using block closes.. but then you are also saying that the final file is larger, not smaller.
Anyhow, best policy in my experience.. don't rewrite gotcha prone code when you don't need to. At least you tested it ;)

SharpZipLib - zero bytes

I am trying to implement SharpZipLib
So I copied the below snippet from their samples:
// Compresses the supplied memory stream, naming it as zipEntryName, into a zip,
// which is returned as a memory stream or a byte array.
//
public MemoryStream CreateToMemoryStream(MemoryStream memStreamIn, string zipEntryName)
{
MemoryStream outputMemStream = new MemoryStream();
ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);
zipStream.SetLevel(3); //0-9, 9 being the highest level of compression
ZipEntry newEntry = new ZipEntry(zipEntryName);
newEntry.DateTime = DateTime.Now;
zipStream.PutNextEntry(newEntry);
StreamUtils.Copy(memStreamIn, zipStream, new byte[4096]);
zipStream.CloseEntry();
zipStream.IsStreamOwner = false; // False stops the Close also Closing the underlying stream.
zipStream.Close(); // Must finish the ZipOutputStream before using outputMemStream.
outputMemStream.Position = 0;
return outputMemStream;
// Alternative outputs:
// ToArray is the cleaner and easiest to use correctly with the penalty of duplicating allocated memory.
byte[] byteArrayOut = outputMemStream.ToArray();
// GetBuffer returns a raw buffer raw and so you need to account for the true length yourself.
byte[] byteArrayOut = outputMemStream.GetBuffer();
long len = outputMemStream.Length;
}
I copy pasted that function and called it this way:
using (MemoryStream ms = new MemoryStream())
using (FileStream file = new FileStream(#"c:\file.jpg", FileMode.Open, FileAccess.Read))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms.Write(bytes, 0, (int)file.Length);
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
result.WriteTo(new FileStream(#"c:\myzip.zip", FileMode.Create, System.IO.FileAccess.Write));
}
The myzip.zip is sucessfully created, but the file.jpg inside has zero bytes.
Any ideas?
Thanks a lot

It is necessary to "rewind" the input MemoryStream after writing the input file data:
using (MemoryStream ms = new MemoryStream())
using (FileStream file = File.OpenRead(#"input file path"))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms.Write(bytes, 0, (int)file.Length);
ms.Position = 0; // "Rewind" the stream to the beginning.
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
using (var outputStream = File.Create(#"output file path"))
{
result.WriteTo(outputStream);
}
}
Alternative (slightly simplified) version of the implementation:
var bytes = File.ReadAllBytes(#"input file path");
using (MemoryStream ms = new MemoryStream(bytes))
{
var result = SharpZip.CreateToMemoryStream(ms, "file.jpg");
using (var outputStream = File.Create(#"output file path"))
{
result.WriteTo(outputStream);
}
}

C# decode (decompress) Deflate data of PDF File

I would like to decompress in C# some DeflateCoded data (PDF extracted).
Unfortunately I got every time the exception "Found invalid data while decoding.".
But the data are valid.
private void Decompress()
{
FileStream fs = new FileStream(#"S:\Temp\myFile.bin", FileMode.Open);
//First two bytes are irrelevant
fs.ReadByte();
fs.ReadByte();
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Decompress);
StreamToFile(d_Stream, #"S:\Temp\myFile1.txt", FileMode.OpenOrCreate);
d_Stream.Close();
fs.Close();
}
private static void StreamToFile(Stream inputStream, string outputFile, FileMode fileMode)
{
if (inputStream == null)
throw new ArgumentNullException("inputStream");
if (String.IsNullOrEmpty(outputFile))
throw new ArgumentException("Argument null or empty.", "outputFile");
using (FileStream outputStream = new FileStream(outputFile, fileMode, FileAccess.Write))
{
int cnt = 0;
const int LEN = 4096;
byte[] buffer = new byte[LEN];
while ((cnt = inputStream.Read(buffer, 0, LEN)) != 0)
outputStream.Write(buffer, 0, cnt);
}
}
Does anyone has some ideas?
Thanks.

I added this for test data:-
private static void Compress()
{
FileStream fs = new FileStream(#"C:\Temp\myFile.bin", FileMode.Create);
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Compress);
for (byte n = 0; n < 255; n++)
d_Stream.WriteByte(n);
d_Stream.Close();
fs.Close();
}
Modified Decompress like this:-
private static void Decompress()
{
FileStream fs = new FileStream(#"C:\Temp\myFile.bin", FileMode.Open);
//First two bytes are irrelevant
// fs.ReadByte();
// fs.ReadByte();
DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Decompress);
StreamToFile(d_Stream, #"C:\Temp\myFile1.txt", FileMode.OpenOrCreate);
d_Stream.Close();
fs.Close();
}
Ran it like this:-
static void Main(string[] args)
{
Compress();
Decompress();
}
And got no errors.
I conclude that either the first two bytes are relevant (Obviously they are with my particular test data.) or
that your data has a problem.
Can we have some of your test data to play with?
(Obviously don't if it's sensitive)

private static string decompress(byte[] input)
{
byte[] cutinput = new byte[input.Length - 2];
Array.Copy(input, 2, cutinput, 0, cutinput.Length);
var stream = new MemoryStream();
using (var compressStream = new MemoryStream(cutinput))
using (var decompressor = new DeflateStream(compressStream, CompressionMode.Decompress))
decompressor.CopyTo(stream);
return Encoding.Default.GetString(stream.ToArray());
}
Thank you user159335 and user1011394 for bringing me on the right track! Just pass all bytes of the stream to input of above function. Make sure the bytecount is the same as the length specified.

All you need to do is use GZip instead of Deflate. Below is the code I use for the content of the stream… endstream section in a PDF document:
using System.IO.Compression;
public void DecompressStreamData(byte[] data)
{
int start = 0;
while ((this.data[start] == 0x0a) | (this.data[start] == 0x0d)) start++; // skip trailling cr, lf
byte[] tempdata = new byte[this.data.Length - start];
Array.Copy(data, start, tempdata, 0, data.Length - start);
MemoryStream msInput = new MemoryStream(tempdata);
MemoryStream msOutput = new MemoryStream();
try
{
GZipStream decomp = new GZipStream(msInput, CompressionMode.Decompress);
decomp.CopyTo(msOutput);
}
catch (Exception e)
{
MessageBox.Show(e.Message);
}
}

None of the solutions worked for me on Deflate attachments in a PDF/A-3 document. Some research showed that .NET DeflateStream does not support compressed streams with a header and trailer as per RFC1950.
Error message for reference: The archive entry was compressed using an unsupported compression method.
The solution is to use an alternative library SharpZipLib
Here is a simple method that successfully decoded a Deflate attachment from a PDF/A-3 file for me:
public static string SZLDecompress(byte[] data) {
var outputStream = new MemoryStream();
using var compressedStream = new MemoryStream(data);
using var inputStream = new InflaterInputStream(compressedStream);
inputStream.CopyTo(outputStream);
outputStream.Position = 0;
return Encoding.Default.GetString(outputStream.ToArray());
}

compressing and decomressing in .net endsup with an decompressed array of zero's

I'm trying to compress and decompress a memory stream to send it over an tcp connection.
In the following code snap I do do the decompressing right after compressing to get it working first.
What ever I do I end up with a devompressed buffer wit all zero's and in the line
int read = Decompress.Read(buffie, 0, buffie.Length);
it seems that 0 bytes are read.
Does anyone has a clue what is wrong?
bytesRead = ms.Read(buf, 0, i);
MemoryStream partialMs = new MemoryStream();
GZipStream gZip = new GZipStream(partialMs, CompressionMode.Compress);
gZip.Write(buf, 0, buf.Length);
partialMs.Position = 0;
byte[] compressedBuf = new byte[partialMs.Length];
partialMs.Read(compressedBuf, 0, (int)partialMs.Length);
partialMs.Close();
byte[] gzBuffer = new byte[compressedBuf.Length + 4];
System.Buffer.BlockCopy(compressedBuf, 0, gzBuffer, 4, compressedBuf.Length);
System.Buffer.BlockCopy(BitConverter.GetBytes(buf.Length), 0, gzBuffer, 0, 4);
using (MemoryStream mems = new MemoryStream())
{
int msgLength = BitConverter.ToInt32(gzBuffer, 0);
byte[] buffie = new byte[msgLength];
mems.Write(gzBuffer, 4, gzBuffer.Length - 4);
mems.Flush();
mems.Position = 0;
using (GZipStream Decompress = new GZipStream(mems, CompressionMode.Decompress, true))
{
int read = Decompress.Read(buffie, 0, buffie.Length);
Decompress.Close();
}
}

Your implementation could use some work. there seems to be some confusion as to which streams should be used where. here is a working example to get you started..
see user content at the bottom of this MSDN page
var original = new byte[65535];
var compressed = GZipTest.Compress(original);
var decompressed = GZipTest.Decompress(compressed);
using System.IO;
using System.IO.Compression;
public class GZipTest
{
public static byte[] Compress(byte[] uncompressedBuffer)
{
using (var ms = new MemoryStream())
{
using (var gzip = new GZipStream(ms, CompressionMode.Compress, true))
{
gzip.Write(uncompressedBuffer, 0, uncompressedBuffer.Length);
}
byte[] compressedBuffer = ms.ToArray();
return compressedBuffer;
}
}
public static byte[] Decompress(byte[] compressedBuffer)
{
using (var gzip = new GZipStream(new MemoryStream(compressedBuffer), CompressionMode.Decompress))
{
byte[] uncompressedBuffer = ReadAllBytes(gzip);
return uncompressedBuffer;
}
}
private static byte[] ReadAllBytes(Stream stream)
{
var buffer = new byte[4096];
using (var ms = new MemoryStream())
{
int bytesRead = 0;
do
{
bytesRead = stream.Read(buffer, 0, buffer.Length);
if (bytesRead > 0)
{
ms.Write(buffer, 0, bytesRead);
}
} while (bytesRead > 0);
return ms.ToArray();
}
}
}

You're not closing the GzipStream you're writing to, so it's probably all buffered. I suggest you close it when you're done writing your data.
By the way, you can get the data out of a MemoryStream much more easily than your current code: use MemoryStream.ToArray.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Failed to use GZipStream to decompress a file (without using CopyToAsync) - c#

ReadAsync is returning number of bytes read - you're ignoring that number. You can only WriteAsync the exact count of bytes you've read first.

Related

Zip file is getting corrupted after downloading from server in C#

uncompressed file is bigger than original file in GZIP

SharpZipLib - zero bytes

C# decode (decompress) Deflate data of PDF File

compressing and decomressing in .net endsup with an decompressed array of zero's

Categories

Resources