The MSDN documentation tells me the following:
The GZipStream class uses the gzip
data format, which includes a cyclic
redundancy check value for detecting
data corruption. The gzip data format
uses the same compression algorithm as
the DeflateStream class.
It seems GZipStream adds some extra data to the output (relative to DeflateStream). I'm wondering, in what type of a scenario would it be essential to use GZipStream and not DeflateStream?
Deflate is just the compression algorithm. GZip is actually a format.
If you use the GZipStream to compress a file (and save it with the extension .gz), the result can actually be opened by archivers such as WinZip or the gzip tool. If you compress with a DeflateStream, those tools won't recognize the file.
If the compressed file is designed to be opened by these tools, then it is essential to use GZipStream instead of DeflateStream.
I would also consider it essential if you're transferring a large amount of data over an unreliable medium (i.e. an internet connection) and not using an error-correcting protocol such as TCP/IP. For example, you might be transmitting over a serial port, raw socket, or UDP. In this case, you would definitely want the CRC information that is embedded in the GZip format in order to ensure that the data is correct.
GZipStream is the same as DeflateStream but it adds some CRC to ensure the data has no error.
Well, i was completely wrong in my first answer. I have looked up in Mono source code and found that GZipStream class actually redirects its read/write(and almost any other) calls to an appropriate calls of methods of an internal DeflateStream object:
public override int Read (byte[] dest, int dest_offset, int count)
{
return deflateStream.Read(dest, dest_offset, count);
}
public override void Write (byte[] src, int src_offset, int count)
{
deflateStream.Write (src, src_offset, count);
}
The only difference, is that it always creates a DeflateStream object with a gzip flag set to true.
This is certainly not an answer to you question, but maybe it'll help a bit.
While GZipStream seems to be using DeflateStream to do decompression, the two algorithms don't seem to be interchangeable. Following test code will give you an exception:
MemoryStream wtt=new MemoryStream();
using (var gs=new GZipStream(wtt,CompressionMode.Compress,true))
{
using (var sw=new StreamWriter(gs,Encoding.ASCII,1024,true))
{
sw.WriteLine("Hello");
}
}
wtt.Position = 0;
using (var ds = new DeflateStream(wtt, CompressionMode.Decompress, true))
{
using (var sr=new StreamReader(ds,Encoding.ASCII,true,1024,true))
{
var txt = sr.ReadLine();
}
}
Dito as per Aaronaught
Note one other important difference as per
http://www.webpronews.com/gzip-vs-deflate-compression-and-performance-2006-12:
I measured the DeflateStream to 41% faster than GZip.
I didn't measure the speed, but I measured the file size to be appx. the same.
Related
I am confused by the behavior of Deflate algorithm, such as, the first chunk of bytes (size 12~13k) always decompress successfully. But the second decompression never turns out successful..
I am using DotNetZip (DeflateStream) with a simple code, later I switched to ZLIB.Net (component ace), Org.Bouncycastle, and variety of c# libraries.
The compression goes in c++ (the server that sends the packets) with deflateInit2, windowSize (-15) -> (15 - nowrap).
What could be incorrectly going so that I'm having zeros at the end of the buffer despite the fact that the decompression went successfully?
a example code with "Org.BouncyCastle.Utilities.Zlib"
it's pretty much the same code for almost any lib (DotNetZip, ZLIB.Net, ...)
internal static bool Inflate(byte[] compressed, out byte[] decompressed)
{
using (var inputStream = new MemoryStream(compressed))
using (var zInputStream = new ZInputStream(inputStream, true))
using (var outputStream = new MemoryStream())
{
zInputStream.CopyTo(outputStream);
decompressed = outputStream.ToArray();
}
return true;
}
To make sure everything working correctly, you should check for the following:
Both zlib versions match on both sides (compression - server, and decompression - client).
Flush mode is set to sync, that means the buffer must be synchronized in order to decompress further packets sent by the server.
Ensure that the packets you received is actually correct, and at my specific case, I was appending a different array size (a constant one in fact, of 0xFFFF) which might be different than the size of the received data (and that happens in most cases).
[EDIT 13th Nov. 19']
Keep in mind that conventionally the server may not send the last 4 bytes (sync 00 00 ff ff) if both has a contract that the flush type is sync, so be aware of adding them manually.
I'm trying to decompress data compressed with zlib algorithm in C# using 2 most legitimate libraries compatible with zlib algorithm and I got similar exception thrown.
Using DotNetZip:
Ionic.Zlib.ZlibException: Bad state (invalid stored block lengths)
Using Zlib.Net:
inflate: invalid stored block lenghts
but using same data as input to zlib-flate command on linux using only default parameters, works great and decompressed without any warnings (output is correct):
zlib-flate -uncompress < ./dbgZlib
Any suggestions what I can do in order to decompress this data in C# or why actually decompression failing in this case?
Compressed data as hex:
root#localhost:~# od -t x1 -An ./dbgZlib |tr -d '\n '
789c626063520b666060606262d26160d05307329999e70a6400e93c2066644080cf8c938c0c0c4d0d0d0d2d839c437c02dcfd0c0c0c11d28ea121013e7e41860ce18e210640e06810141669c080051840012eb970d790800090f99eee409ea189025e806c8e8b5354a89b13d81c136ca60f3a000e5fd6af0fb14a3221873e96400506374cd6c7d52dc8d98980657e7e06460ace0a4ce86e80da9f0249030edf816c16481ab06b60404f03931169c0cdc728c0db0fd928681a3042a481480347336c6e21320d78fb8155195a9090067ca3420387771a400a546aa70100000000ffff
Compressed data as base64:
root#localhost:~# base64 ./dbgZlib
eJxiYGNSC2ZgYGBiYtJhYNBTBzKZmecKZADpPCBmZECAz4yTjAwMTQ0NDS2DnEN8Atz9DAwMEdKO
oSEBPn5BhgzhjiEGQOBoEBQWacCABRhAAS65cNeQgACQ+Z7uQJ6hiQJegGyOi1NUqJsT2BwTbKYP
OgAOX9avD7FKMiGHPpZABQY3TNbH1S3I2YmAZX5+BkYKzgpM6G6A2p8CSQMO34FsFkgasGtgQE8D
kxFpwM3HKMDbD9koaBowQqSBSANHM2xuITINePuBVRlakJAGfKNCA4d3GkAKVGqnAQAAAAD//w==
Data after decompression, encoded with base64 look like this:
root#localhost:~# zlib-flate -uncompress < ./dbgZlib | base64
AAYCJlMAAAACAgIsAAAuJwAAAAMDnRBoAAAAbgAAAAEAAAAAAAAAAAAAAPMBkjIwMTUxMTE5UkNU
TFBHTjAwMQAAAAAAAAAAAABBVVRQTE5SMQBXQVQwMDAwQTBSVlkwAAAAAAAAAAAAAAAAAAAAAAAA
AAAwMDAwMDAwMAAAAAAAAAAAAAAAAAAAAAAAAAAAMDAwMFdFVFBQTFBHTklHMDAwMTQgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAwMAAAAAAAAAAAAABEQlpVRkIAAAAAMDQAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAX14QAAAAAAAAAA
AAAAAAAAAAAAAAAAAAIBAAAAAAAAAAAAAABBVDAwMDBBMFJWWTAAAAAAAAAAUExOAAAAAAAAAABM
RUZSQ0IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATk4wMiBDIAIAAAAAAAAAAAAAAAAA
AAABBfXhAAAAAGQCAgIsAABA9wAAAAQDnRBoAAA+gAAAAAEAAAAAAAAAAAAAAPMBkzIwMTUxMTE5
UkNGTDJQS04AAAAAAAAAAAAAAABBVVRQTE5SMgBXQVQwMDAwQTBZMEE2AAAAAAAAAAAAAAAAAAAA
AAAAAAAwMDAwMDAwMAAAAAAAAAAAAAAAAAAAAAAAAAAAMDAwMFdFVFBQTFBLTjAwMDAwMTggICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAwMAAAAAAAAAAAAABETVpVUUIAAAAAMDQAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAX14QAAAAAA
AAAAAAAAAAAAAAAAAAAAAAIBAAAAAAAAAAAAAABBVDAwMDBBMFkwQTYAAAAAAAAAUExOAAAAAAAA
AABMRUZSQ0IAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATk4wMiBDIAIAAAAAAAAAAAAA
AAAAAAABBfXhAAAAAGQ=
The problem is that you are using zlib-flate as a general-purpose compression algorithm which, according to the manpage for it, you should not do:
This program should not be used as a general purpose compression tool.
Use something like gzip(1) instead.
So perhaps you should follow the instructions given by your tools and not use them for things that they are not intended for. Use gzip and the System.IO.Compression.GZipStream instead, it's much simpler, especially when you're looking for cross-platform compatible compression algorithms.
That said...
The reason that you can't inflate the data is that it lacks a correct GZIP header. If you add the right header to it you will get something that can be decompressed.
For instance:
public static byte[] DecompressZLibRaw(byte[] bCompressed)
{
byte[] bHdr = new byte[] { 0x1F, 0x8b, 0x08, 0, 0, 0, 0, 0 };
using (var sOutput = new MemoryStream())
using (var sCompressed = new MemoryStream())
{
sCompressed.Write(bHdr, 0, bHdr.Length);
sCompressed.Write(bCompressed, 0, bCompressed.Length);
sCompressed.Position = 0;
using (var decomp = new GZipStream(sCompressed, CompressionMode.Decompress))
{
decomp.CopyTo(sOutput);
}
return sOutput.ToArray();
}
}
Adding the header makes all the difference.
NB: There are two bytes in the 10-byte GZIP header that are not stripped from your source. These are normally used to store the compression flags and the source file system. In the compressed data you present they are invalid values. Additionally the file footer is abbreviated to 5 bytes instead of 8 bytes... all of which is not actually required for decompression. Which probably has a lot to do with why the manpage says not to use this for general compression.
The stream you provided is not complete. It appears that you ended it with a Z_SYNC_FLUSH or Z_FULL_FLUSH in your C# code, instead of a Z_FINISH like you're supposed to. That is causing the error. If you terminate the stream properly, you won't have a problem.
zlib-flate is simply ignoring that error.
If you are not in control of the generation of the stream, you can still use zlib to decompress what's there. You just need to use it at a lower level where you operate on blocks of data and get the decompressed data available given the provided input.
Are System.IO.Compression.GZipStream or System.IO.Compression.Deflate compatible with zlib compression?
I ran into this issue with Git objects. In that particular case, they store the objects as deflated blobs with a Zlib header, which is documented in RFC 1950. You can make a compatible blob by making a file that contains:
Two header bytes (CMF and FLG from RFC 1950) with the values 0x78 0x01
CM = 8 = deflate
CINFO = 7 = 32Kb window
FCHECK = 1 = checksum bits for this header
The output of the C# DeflateStream
An Adler32 checksum of the input data to the DeflateStream, big-endian format (MSB first)
I made my own Adler implementation
public class Adler32Computer
{
private int a = 1;
private int b = 0;
public int Checksum
{
get
{
return ((b * 65536) + a);
}
}
private static readonly int Modulus = 65521;
public void Update(byte[] data, int offset, int length)
{
for (int counter = 0; counter < length; ++counter)
{
a = (a + (data[offset + counter])) % Modulus;
b = (b + a) % Modulus;
}
}
}
And that was pretty much it.
DotNetZip includes a DeflateStream, a ZlibStream, and a GZipStream, to handle RFC 1950, 1951, and 1952. The all use the DEFLATE Algorithm but the framing and header bytes are different for each one.
As an advantage, the streams in DotNetZip do not exhibit the anomaly of expanding data size under compression, reported against the built-in streams. Also, there is no built-in ZlibStream, whereas DotNetZip gives you that, for good interop with zlib.
From MSDN about System.IO.Compression.GZipStream:
This class represents the gzip data format, which uses an industry standard algorithm for lossless file compression and decompression.
From the zlib FAQ:
The gz* functions in zlib on the other hand use the gzip format.
So zlib and GZipStream should be interoperable, but only if you use the zlib functions for handling the gzip-format.
System.IO.Compression.Deflate and zlib are reportedly not interoperable.
If you need to handle zip files (you probably don't, but someone else might need this) you need to use SharpZipLib or another third-party library.
I've used GZipStream to compress the output from the .NET XmlSerializer and it has worked perfectly fine to decompress the result with gunzip (in cygwin), winzip and another GZipStream.
For reference, here's what I did in code:
FileStream fs = new FileStream(filename, FileMode.Create, FileAccess.Write);
using (GZipStream gzStream = new GZipStream(fs, CompressionMode.Compress))
{
XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
serializer.Serialize(gzStream, myData);
}
Then, to decompress in c#
FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
using (Stream input = new GZipStream(fs, CompressionMode.Decompress))
{
XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
myData = (MyDataType) serializer.Deserialize(input);
}
Using the 'file' utility in cygwin reveals that there is indeed a difference between the same file compressed with GZipStream and with GNU GZip (probably header information as others has stated in this thread). This difference, however, seems to not matter in practice.
gzip is deflate + some header/footer data, like a checksum and length, etc. So they're not compatible in the sense that one method can use a stream from the other, but they employ the same compression algorithm.
They just compressing the data using zlib or deflate algorithms , but does not provide the output for some specific file format. This means that if you store the stream as-is to the hard drive most probably you will not be able to open it using some application (gzip or winrar) because file headers (magic number, etc ) are not included in stream an you should write them yourself.
Starting from .NET Framework 4.5 the System.IO.Compression.DeflateStream class uses the zlib library.
From the class's MSDN article:
This class represents the Deflate algorithm, which is an industry-standard algorithm for lossless file compression and decompression. Starting with the .NET Framework 4.5, the DeflateStream class uses the zlib library. As a result, it provides a better compression algorithm and, in most cases, a smaller compressed file than it provides in earlier versions of the .NET Framework.
I agree with andreas. You probably won't be able to open the file in an external tool, but if that tool expects a stream you might be able to use it. You would also be able to deflate the file back using the same compression class.
I'm working on EBICS protocol and i want to read a data in an XML File to compare with another file.
I have successfull decode data from base64 using
Convert.FromBase64String(OrderData); but now i have a byte array.
To read the content i have to unzip it. I tried to unzip it using Gzip like this example :
static byte[] Decompress(byte[] data)
{
using (var compressedStream = new MemoryStream(data))
using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
using (var resultStream = new MemoryStream())
{
zipStream.CopyTo(resultStream);
return resultStream.ToArray();
}
}
But it does not work i have an error message :
the magic number in gzip header is not correct. make sure you are passing in a gzip stream
Now i have no idea how i can unzip it, please help me !
Thanks !
The first four bytes provided by the OP in a comment to another answer: 0x78 0xda 0xe5 0x98 is the start of a zlib stream. It is neither gzip, nor zip, but zlib. You need a ZlibStream, which for some reason Microsoft does not provide. That's fine though, since what Microsoft does provide is buggy.
You should use DotNetZip, which provides ZlibStream, and it works.
Try using SharpZipLib. It copes with various compression formats and is free under the GPL license.
As others have pointed out, I suspect you have a zip stream and not gzip. If you check the first 4 bytes in a hex view, ZIP files always start with 0x04034b50 ZIP File Format Wiki whereas GZIP files start with 0x8b1f GZIP File Format Wiki
I think I finally got it - as usual the problem is not what is in the title. Luckily I've noticed the word EBICS in your post. So, according to EBICS spec the data is first compressed, then encrypted and finally base64 encoded. As you see, after decoding base64 you need first to decrypt the data and then try to unzip it.
UPDATE: If that's not the case, it turns out from the EBICS spec Chapter 16 Appendix: Standards and references that ZIP refers to zlib/deflate format, so all you need to do is to replace GZipStream with the DeflateStream
I have a C# application that communicates with a PHP-based SOAP web service for updates and licensing.
I am now working on a feedback system for users to submit errors and tracelogs automatically through the software. Based on a previous question I posted, I felt that a web service would be the best way to do it (most likely to work properly with least configuration).
My current thought is to use .NET built-in gzip compression to compress the text file, convert to base64, send to the web-service, and have the PHP script convert to binary and uncompress the data.
Can PHP decompress data compressed with GZipStream, and if so, how?
I actually tried this. GZipStream doesn't work. On the other hand, compressing with DeflateStream on .NET side and decompressing with gzinflate on PHP side do work. Your mileage may vary...
If the http-level libraries implements it (Both client and server), http has support for gzip-compression, in which case there would be no reason to manually compress anything. You should check if this is already happening before you venture any further.
Since the server is accepting web requests you really should be checking the HTTP headers to determine if any client accepts GZIP encoding rather than just guessing and gzipping each and every time.
If the PHP client can do gzip itll set the header and your code will then react according and do the right thing. Assuming or guessing is a poor choice when the facility is provided for your code to learn the capabilities of the client.
I wrote an article I recently posted that shows how to compress/decompress in C#. I used it for almost the same scenario. I wanted to transfer log files from the client to the server and they were often quite large. However in my case my webservice was running in .NET so I could use the decompress method. But looks like PHP does support a method called gzdecode that would work.
http://coding.infoconex.com/post/2009/05/Compress-and-Decompress-using-net-framework-and-built-in-GZipStream.aspx
Yes, PHP can decompress GZIP compressed strings, with or without headers.
gzdecode for GZIP file format (ie, compatible with gzip)
gzinflate for "raw" DEFLATE format
gzuncompress for ZLIB format (GZIP format without some header info)
I don't know for sure which one you'd want as I'm unfamiliar with .NET GZipStream. It sounds a little like gzuncompress, as the ZLIB format is kind of a "streaming" format, but try all three.
I was able to demo this with Gzip on C# and PHP.
Gzip Compressing in C#:
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
public class Program {
public static void Main() {
string s = "Hi!!";
byte[] byteArray = Encoding.UTF8.GetBytes(s);
byte[] b2 = Compress(byteArray);
Console.WriteLine(System.Convert.ToBase64String(b2));
}
public static byte[] Compress(byte[] bytes) {
using (var memoryStream = new MemoryStream()) {
using (var gzipStream = new GZipStream(memoryStream, CompressionLevel.Optimal)) {
gzipStream.Write(bytes, 0, bytes.Length);
}
return memoryStream.ToArray();
}
}
public static byte[] Decompress(byte[] bytes) {
using (var memoryStream = new MemoryStream(bytes)) {
using (var outputStream = new MemoryStream()) {
using (var decompressStream = new GZipStream(memoryStream, CompressionMode.Decompress)) {
decompressStream.CopyTo(outputStream);
}
return outputStream.ToArray();
}
}
}
}
the code above prints the base64 encoded compressed string which is H4sIAAAAAAAEAPPIVFQEANxaFPgEAAAA for the Hi!! input.
Here's the code to decompress in PHP:
echo gzdecode(base64_decode('H4sIAAAAAAAEAPPIVFQEANxaFPgEAAAA'));