Problem with C# Decompression - c#

I have some data in a Sybase image-type column that I want to use in a C# app. The data was compressed in Java using the java.util.zip package. I wanted to test that I could decompress it in C#, so I wrote a test app that pulls it out of the database:
byte[] bytes = (byte[])reader.GetValue(0);
This gives me a compressed byte[] of length 2479.
Then I pass this to a seemingly standard C# decompression method:
public static byte[] Decompress(byte[] gzBuffer)
{
MemoryStream ms = new MemoryStream();
int msgLength = BitConverter.ToInt32(gzBuffer, 0);
ms.Write(gzBuffer, 4, gzBuffer.Length - 4);
byte[] buffer = new byte[msgLength];
ms.Position = 0;
GZipStream zip = new GZipStream(ms, CompressionMode.Decompress);
zip.Read(buffer, 0, buffer.Length);
return buffer;
}
The value of msgLength is 1503501432, which seems way out of range. The original document should be in the range of 5 KB to 50 KB. When I use that value to create "buffer", not surprisingly I get an OutOfMemoryException.
What is happening?
Jim
The Java compress method is as follows:
public byte[] compress(byte[] bytes) throws Exception {
byte[] results = new byte[bytes.length];
Deflater deflater = new Deflater();
deflater.setInput(bytes);
deflater.finish();
int len = deflater.deflate(results);
byte[] out = new byte[len];
for(int i=0; i<len; i++) {
out[i] = results[i];
}
return(out);
}
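One thing worth noting about the compress method above: deflate(results) fills at most results.length bytes, so for small or incompressible input (where the compressed form can be longer than the input) the output would be silently truncated. A minimal sketch of a loop that drains the deflater until it reports completion is shown here. The class name DeflateLoopSketch and the 1024-byte chunk size are illustrative choices, not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

final class DeflateLoopSketch {
    // Compresses with the default Deflater settings (zlib-wrapped output).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            // Keep draining until the deflater says all output has been produced.
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the native zlib resources
        }
    }

    public static void main(String[] args) {
        byte[] tiny = {1, 2, 3};
        System.out.println(tiny.length + " bytes in, " + compress(tiny).length + " bytes out");
    }
}
```

For a three-byte input the output is longer than the input, which is exactly the case where a results array sized to bytes.length is too small.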

Since I can't see your Java code, I can only guess that you are compressing your data to a zip file stream. If so, it will fail when you try to decompress that stream with gzip decompression in C#. Either change your Java code to use gzip compression (for example with java.util.zip.GZIPOutputStream), or decompress the zip stream in C# with an appropriate library (e.g. SharpZipLib).
Update
OK, now I see you are using deflate for the compression in Java, so you have to use the same algorithm in C#: System.IO.Compression.DeflateStream. One caveat: new Deflater() writes the zlib format (a 2-byte header, the deflate data, then an Adler-32 checksum), while DeflateStream reads raw deflate, so the header has to be skipped (alternatively, the Java side can use new Deflater(level, true) to omit the wrapper). That is also why your msgLength looks random: this data has no 4-byte length prefix, so BitConverter is reading the zlib header and the first compressed bytes as an integer.
public static byte[] Decompress(byte[] compressed)
{
    // java.util.zip.Deflater's default constructor emits zlib-wrapped data.
    // DeflateStream expects raw deflate, so skip the 2-byte zlib header.
    const int zlibHeaderLength = 2;
    byte[] buffer = new byte[compressed.Length * 2]; // initial guess, grown as needed
    int read = 0;
    using (MemoryStream ms = new MemoryStream(compressed, zlibHeaderLength,
        compressed.Length - zlibHeaderLength))
    using (Stream zipStream = new DeflateStream(ms,
        CompressionMode.Decompress, true))
    {
        int chunk;
        while ((chunk = zipStream.Read(buffer, read, buffer.Length - read)) > 0)
        {
            read += chunk;
            if (read == buffer.Length)
            {
                // Buffer is full: double it before reading more.
                Array.Resize(ref buffer, buffer.Length * 2);
            }
        }
    }
    // Trim to the number of bytes actually produced.
    Array.Resize(ref buffer, read);
    return buffer;
}
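To see where the strange msgLength comes from, here is a small sketch on the Java side. It assumes the default Deflater settings used in the question. The value 1503501432 is 0x599D9C78, whose little-endian bytes are 78 9C 9D 59, and 78 9C is the usual zlib header for the default compression level. The class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class ZlibHeaderSketch {
    // nowrap=false gives the zlib format (header + Adler-32); nowrap=true gives raw deflate.
    static byte[] deflate(byte[] input, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            while (!deflater.finished()) {
                out.write(chunk, 0, deflater.deflate(chunk));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Mimics BitConverter.ToInt32(data, 0) on a little-endian machine.
    static int firstFourBytesAsInt(byte[] data) {
        return ByteBuffer.wrap(data, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] sample = "hello hello hello hello".getBytes(StandardCharsets.US_ASCII);
        byte[] wrapped = deflate(sample, false);
        byte[] raw = deflate(sample, true);
        System.out.printf("zlib output starts with %02x %02x%n", wrapped[0], wrapped[1]);
        System.out.println("raw deflate is " + (wrapped.length - raw.length) + " bytes shorter");

        // The four leading bytes implied by the msgLength reported in the question.
        byte[] fromQuestion = {0x78, (byte) 0x9C, (byte) 0x9D, 0x59};
        System.out.println(firstFourBytesAsInt(fromQuestion)); // prints 1503501432
    }
}
```

If the Java side can be changed, passing nowrap=true produces raw deflate that DeflateStream can read from byte 0. Otherwise the reader has to skip the two header bytes.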

Related

uncompressed file is bigger than original file in GZIP

I'm using the following function to compress (thanks to http://www.dotnetperls.com/):
public static void CompressStringToFile(string fileName, string value)
{
// A.
// Write string to temporary file.
string temp = Path.GetTempFileName();
File.WriteAllText(temp, value);
// B.
// Read file into byte array buffer.
byte[] b;
using (FileStream f = new FileStream(temp, FileMode.Open))
{
b = new byte[f.Length];
f.Read(b, 0, (int)f.Length);
}
// C.
// Use GZipStream to write compressed bytes to target file.
using (FileStream f2 = new FileStream(fileName, FileMode.Create))
using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
{
gz.Write(b, 0, b.Length);
}
}
And for decompressing:
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
So my goal is to compress log files, then decompress them in memory and compare the uncompressed data to the original file, in order to check that the compression succeeded and that I can open the compressed file successfully.
The problem is that the uncompressed data is usually bigger than the original file, so my comparison fails even though the compression probably succeeded.
Any idea why?
By the way, here is how I compare the uncompressed data to the original file:
static bool FileEquals(byte[] file1, byte[] file2)
{
if (file1.Length == file2.Length)
{
for (int i = 0; i < file1.Length; i++)
{
if (file1[i] != file2[i])
{
return false;
}
}
return true;
}
return false;
}
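One possible cause, which the thread does not confirm, is the detour through a string: CompressStringToFile takes the log content as a string and File.WriteAllText writes it back out as UTF-8. If the original log file was in a single-byte encoding, any non-ASCII character takes more bytes after that round trip, so the data going into the compressor is already larger than the file on disk. The sketch below shows the effect in Java. ISO-8859-1 is only an assumed example of the original encoding, and the class name is illustrative.

```java
import java.nio.charset.StandardCharsets;

final class EncodingRoundTripSketch {
    // Decodes bytes as ISO-8859-1 (assumed original encoding) and re-encodes them as UTF-8.
    static byte[] reencode(byte[] original) {
        String text = new String(original, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = {(byte) 0xE9, 'a'}; // "é" followed by "a" in ISO-8859-1
        byte[] reencoded = reencode(original);
        System.out.println(original.length + " bytes before, " + reencoded.length + " bytes after");
    }
}
```

Compressing the file's raw bytes directly, without converting to a string first, avoids the issue regardless of the encoding.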
Try this method to compress a file:
public static byte[] Compress(byte[] raw)
{
using (MemoryStream memory = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(memory,
CompressionMode.Compress, true))
{
gzip.Write(raw, 0, raw.Length);
}
return memory.ToArray();
}
}
And this to decompress:
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
Tell me if it worked.
Good luck.
I think you'd be better off with the simplest API call: try Stream.CopyTo(). I can't find the error in your code. If I were working on it, I'd make sure everything is getting flushed properly. Disposing the GZipStream at the end of its using block should flush its remaining output to the FileStream, so that part looks fine, and in any case you are saying that the final file is larger, not smaller.
Anyhow, the best policy in my experience: don't rewrite gotcha-prone code when you don't need to. At least you tested it ;)
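For comparison, the same "let the library copy the stream" idea can be sketched in Java, where InputStream.transferTo (Java 9 and later) plays the role of Stream.CopyTo. The class and method names are illustrative, and this is a round-trip check, not a port of the code in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipCopySketch {
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the gzip stream, which writes the trailer.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            gz.transferTo(sink); // copies until end of stream, no manual buffer loop
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = new byte[10_000];
        Arrays.fill(original, (byte) 'x');
        byte[] roundTripped = gunzip(gzip(original));
        System.out.println("round trip equal: " + Arrays.equals(original, roundTripped));
    }
}
```

Note that the compressed bytes are taken from the sink only after the gzip stream has been closed.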

Extract a bytearray variable using C# DotNetZip

I have a C# function that receives a compressed ByteArray as a parameter.
I need to EXTRACT this byteArray and send the resulting uncompressed byteArray to another function.
I need help extracting zipBytes to unzippedBytes please see below PSEUDO CODE:
SOLUTION using Zlib.net!
byte[] receiveZipByte(byte[] zipBytes)
{
    MemoryStream oInStream = new MemoryStream(zipBytes);
    ZInputStream oZInstream = new ZInputStream(oInStream);
    MemoryStream oOutStream = new MemoryStream();
    byte[] buffer = new byte[2000];
    int len;
    while ((len = oZInstream.read(buffer, 0, 2000)) > 0)
    {
        oOutStream.Write(buffer, 0, len);
    }
    byte[] unzippedBytes = oOutStream.ToArray();
    oZInstream.Close();
    oOutStream.Close();
    return unzippedBytes;
}
If what you're trying to do is decompress data that has been compressed using zlib, I would suggest using the ZLib.NET library: http://www.componentace.com/zlib_.NET.htm
It would depend on what you want to do with the data; but, here's an example of using a Stream with DotNetZip:
byte[] buffer = new byte[4096];
int n;
using (var input = new ZipInputStream(new MemoryStream(zipBytes)))
{
ZipEntry e;
while ((e = input.GetNextEntry()) != null)
{
if (e.IsDirectory) continue;
using (var output = File.Open(e.FileName, FileMode.Create, FileAccess.ReadWrite))
{
while ((n = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, n);
}
}
}
}
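The two answers above address different formats: a zlib stream is a single compressed blob, while a ZIP archive is a container of named entries. For the archive case, here is a rough Java sketch of the same entry loop that keeps everything in memory instead of writing files. It uses java.util.zip rather than DotNetZip, and the class name, method names and the choice of returning a map are all illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class InMemoryUnzipSketch {
    // Reads every file entry of a ZIP archive held in a byte array.
    static Map<String, byte[]> unzip(byte[] zipBytes) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            byte[] buffer = new byte[4096];
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) continue;
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zin.read(buffer, 0, buffer.length)) > 0) {
                    out.write(buffer, 0, n);
                }
                entries.put(entry.getName(), out.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    // Helper for the demo: builds a one-entry archive in memory.
    static byte[] zipSingle(String name, byte[] content) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(sink)) {
            zout.putNextEntry(new ZipEntry(name));
            zout.write(content);
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] archive = zipSingle("hello.txt", new byte[] {'h', 'i'});
        unzip(archive).forEach((name, data) -> System.out.println(name + ": " + data.length + " bytes"));
    }
}
```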

zlib.net code example for C# that takes byte[] as input argument

I spent 3 hours searching for how to decompress a string using Zlib.net.dll and did not find anything useful.
My string is compressed by an old VB6 program that uses zlib.dll, and I do not want to use file access each time I decompress a string.
The problem is that you need to know the original size of the byte[] before compression.
Or you can use a dynamic array for decoding the data.
The code is here:
private string ZlibNetDecompress(string iCompressData, uint OriginalSize)
{
byte[] todecode_byte = Convert.FromBase64String(iCompressData);
byte[] lDecodeData = new byte[OriginalSize];
string lTempoString = System.Text.Encoding.Unicode.GetString(todecode_byte);
todecode_byte = System.Text.Encoding.Default.GetBytes(lTempoString);
string lReVal = "";
MemoryStream outStream = new MemoryStream();
MemoryStream InStream = new MemoryStream(todecode_byte);
zlib.ZOutputStream outZStream = new zlib.ZOutputStream(outStream);
try
{
CopyStream(InStream, outZStream);
// ToArray copies only the bytes actually written; GetBuffer would also return unused capacity.
lDecodeData = outStream.ToArray();
lReVal = System.Text.Encoding.Default.GetString(lDecodeData);
}
finally
{
outZStream.Close();
InStream.Close();
}
return lReVal;
}
private void CopyStream(System.IO.Stream input, System.IO.Stream output)
{
byte[] buffer = new byte[2000];
int len;
while ((len = input.Read(buffer, 0, 2000)) > 0)
{
output.Write(buffer, 0, len);
}
output.Flush();
}
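The "dynamic array" option mentioned in the question, decompressing without knowing the original size up front, can be sketched with Java's built-in Inflater, which reads the same zlib format that zlib.dll's compress functions produce. This is not zlib.net code. The names are illustrative and the 2000-byte chunk size simply mirrors the buffer used above.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class InflateUnknownSizeSketch {
    // Inflates zlib-wrapped data into a buffer that grows as needed.
    static byte[] inflate(byte[] zlibData) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(zlibData);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[2000];
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress possible: avoid spinning forever on bad input.
                    throw new IllegalArgumentException("truncated or dictionary-based zlib data");
                }
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] original = new byte[10_000]; // all zeros, compresses to a few dozen bytes
        Deflater deflater = new Deflater();
        deflater.setInput(original);
        deflater.finish();
        byte[] packed = new byte[1024];
        int packedLength = deflater.deflate(packed);
        deflater.end();

        byte[] restored = inflate(Arrays.copyOf(packed, packedLength));
        System.out.println("restored " + restored.length + " bytes");
    }
}
```

The caller never passes an expected size; the output stream grows chunk by chunk until the inflater reports the end of the data.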
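You could use the compression stream classes from the framework (GZipStream and DeflateStream in System.IO.Compression). Note that DeflateStream expects raw deflate data, so a zlib-wrapped buffer needs its two-byte header skipped first.
var data = new byte[resultSizeMax];
int total = 0, n;
using (Stream ds = new DeflateStream(stream, CompressionMode.Decompress))
    while (total < data.Length && (n = ds.Read(data, total, data.Length - total)) > 0)
        total += n; // stop at end of stream or when the buffer is full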

GZIP Java vs .NET

I'm using the following Java code to compress/decompress byte[] to/from GZIP.
First text bytes to gzip bytes:
public static byte[] fromByteToGByte(byte[] bytes) {
ByteArrayOutputStream baos = null;
try {
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
baos = new ByteArrayOutputStream();
GZIPOutputStream gzos = new GZIPOutputStream(baos);
byte[] buffer = new byte[1024];
int len;
while((len = bais.read(buffer)) >= 0) {
gzos.write(buffer, 0, len);
}
gzos.close();
baos.close();
} catch (IOException e) {
e.printStackTrace();
}
return(baos.toByteArray());
}
Then the method that goes the other way compressed bytes to uncompressed bytes:
public static byte[] fromGByteToByte(byte[] gbytes) {
ByteArrayOutputStream baos = null;
ByteArrayInputStream bais = new ByteArrayInputStream(gbytes);
try {
baos = new ByteArrayOutputStream();
GZIPInputStream gzis = new GZIPInputStream(bais);
byte[] bytes = new byte[1024];
int len;
while((len = gzis.read(bytes)) > 0) {
baos.write(bytes, 0, len);
}
} catch (IOException e) {
e.printStackTrace();
}
return(baos.toByteArray());
}
Do you think it has any effect that I'm not writing out to a gzip file?
Also, I noticed that the standard C# function uses BitConverter to read the first four bytes, and then calls MemoryStream.Write with a start offset of 4 and a length of the input buffer length minus 4. Does that affect the validity of the header?
Jim
I tried it out, and I can't reproduce your 'Invalid GZip Header' issue. Here is what I did:
Java side
I took your Java compression method together with this java snippet:
public static String ToHexString(byte[] bytes){
StringBuilder hexString = new StringBuilder();
for (int i = 0; i < bytes.length; i++)
hexString.append((i == 0 ? "" : "-") +
Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
return hexString.toString();
}
This minimal Java application, which takes the bytes of a test string, compresses them, and converts the compressed data to a hex string:
public static void main(String[] args){
System.out.println(ToHexString(fromByteToGByte("asdf".getBytes())));
}
... outputs the following (I added annotations):
1f-8b-08-00-00-00-00-00-00-00-4b-2c-4e-49-03-00-bd-f3-29-51-04-00-00-00
^------- GZip Header -------^ ^----------- Compressed data -----------^
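A finer-grained reading of that dump: the span labelled "Compressed data" ends with the 8-byte gzip trailer. The last four bytes (04-00-00-00) are the uncompressed length, 4 for "asdf", stored little-endian, and the four bytes before that (bd-f3-29-51) are the CRC-32 of the uncompressed data. A small sketch that pulls those fields out of a gzip buffer and checks them follows. The class and method names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

final class GzipLayoutSketch {
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads an unsigned 32-bit little-endian value at the given offset.
    private static long uint32At(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
    }

    // Trailer layout: CRC-32 of the uncompressed data, then its length modulo 2^32.
    static long storedCrc(byte[] gz) {
        return uint32At(gz, gz.length - 8);
    }

    static long storedLength(byte[] gz) {
        return uint32At(gz, gz.length - 4);
    }

    public static void main(String[] args) {
        byte[] raw = "asdf".getBytes(StandardCharsets.US_ASCII);
        byte[] gz = gzip(raw);

        CRC32 crc = new CRC32();
        crc.update(raw, 0, raw.length);

        System.out.printf("magic: %02x %02x, method: %d%n", gz[0], gz[1], gz[2]);
        System.out.println("stored length: " + storedLength(gz));
        System.out.println("crc matches: " + (storedCrc(gz) == crc.getValue()));
    }
}
```

A length stored inside the gzip trailer like this is separate from any 4-byte prefix that application code may put in front of the gzip data.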
C# side
I wrote two methods for compressing and decompressing a byte array to another byte array (the compression method is just for completeness and for my own testing):
public static byte[] Compress(byte[] uncompressed)
{
using (MemoryStream ms = new MemoryStream())
using (GZipStream gzs = new GZipStream(ms, CompressionMode.Compress))
{
gzs.Write(uncompressed, 0, uncompressed.Length);
gzs.Close();
return ms.ToArray();
}
}
public static byte[] Decompress(byte[] compressed)
{
byte[] buffer = new byte[4096];
using (MemoryStream ms = new MemoryStream(compressed))
using (GZipStream gzs = new GZipStream(ms, CompressionMode.Decompress))
using (MemoryStream uncompressed = new MemoryStream())
{
for (int r = -1; r != 0; r = gzs.Read(buffer, 0, buffer.Length))
if (r > 0) uncompressed.Write(buffer, 0, r);
return uncompressed.ToArray();
}
}
Together with a small function that takes a hex string and turns it back to a byte array... (also just for testing purposes):
public static byte[] ToByteArray(string hexString)
{
hexString = hexString.Replace("-", "");
int NumberChars = hexString.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hexString.Substring(i, 2), 16);
return bytes;
}
... I did the following:
// Just hardcoded the output of the java program, convert it back to byte[]
byte[] fromjava = ToByteArray("1f-8b-08-00-00-00-00-00-00-00-" +
"4b-2c-4e-49-03-00-bd-f3-29-51-04-00-00-00");
// Decompress it with my function above
byte[] uncompr = Decompress(fromjava);
// Get the string out of the byte[] and print it
Console.WriteLine(System.Text.ASCIIEncoding.ASCII
.GetString(uncompr, 0, uncompr.Length));
Et voila, the output is:
asdf
Works perfectly for me. Maybe you should check the decompression method in your C# application.
You said in your previous question you are storing those byte arrays in a database, right? Maybe you want to check whether the bytes come back from the database the way you put them in.
Posting this as an answer so the code looks decent.
Note a couple things:
First, the round trip to the database did not appear to have any effect. Java on both sides produced exactly what I put in. Java in C# out worked fine with the Ionic API, as did C# in and Java out. Which brings me to the second point.
Second, my original decompress was on the order of:
public static string Decompress(byte[] gzBuffer)
{
using (MemoryStream ms = new MemoryStream())
{
int msgLength = BitConverter.ToInt32(gzBuffer, 0);
ms.Write(gzBuffer, 4, gzBuffer.Length - 4);
byte[] buffer = new byte[msgLength];
ms.Position = 0;
using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
{
zip.Read(buffer, 0, buffer.Length);
}
return Encoding.UTF8.GetString(buffer);
}
}
That depended on the internal byte count; yours reads the whole stream regardless of any internal value. I don't know what the Ionic algorithm is. Yours works the same way as the Java methods I've used. That's the only difference I see. Thanks very much for doing all that work. I will remember that way of doing it.
Thanks,
Jim
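The "internal byte count" in that original Decompress is not part of gzip. It is a 4-byte length that a matching Compress routine has to write in front of the gzip bytes. Data produced by the Java methods above has no such prefix, so the first four bytes read are the gzip header itself. A sketch of the framing, with a magic-byte check that tells the two layouts apart, is below. It assumes a little-endian prefix (what BitConverter produces on typical x86 hardware), and all names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class LengthPrefixSketch {
    // Prepends the uncompressed length as a 4-byte little-endian integer.
    static byte[] addPrefix(byte[] compressed, int originalLength) {
        return ByteBuffer.allocate(4 + compressed.length)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(originalLength)
                .put(compressed)
                .array();
    }

    static int readPrefix(byte[] framed) {
        return ByteBuffer.wrap(framed, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    static byte[] stripPrefix(byte[] framed) {
        return Arrays.copyOfRange(framed, 4, framed.length);
    }

    // Plain gzip data starts with the magic bytes 1f 8b.
    static boolean startsWithGzipMagic(byte[] data) {
        return data.length >= 2 && (data[0] & 0xFF) == 0x1F && (data[1] & 0xFF) == 0x8B;
    }

    public static void main(String[] args) {
        byte[] gzipStart = {0x1F, (byte) 0x8B, 0x08, 0x00, 0x00, 0x00};
        // Treating an unprefixed gzip header as a length gives a meaningless number.
        System.out.println("bogus length: " + readPrefix(gzipStart)); // prints 559903

        byte[] framed = addPrefix(gzipStart, 1234);
        System.out.println("prefix: " + readPrefix(framed)
                + ", payload is gzip: " + startsWithGzipMagic(stripPrefix(framed)));
    }
}
```

Checking for the magic bytes before deciding whether to strip a prefix is one way to accept both layouts. Reading to the end of the stream, as in the answer above, avoids needing the length at all.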

Compressing and decompressing in .NET ends up with a decompressed array of zeros

I'm trying to compress and decompress a memory stream in order to send it over a TCP connection.
In the following code snippet I decompress right after compressing, just to get it working first.
Whatever I do, I end up with a decompressed buffer of all zeros, and at the line
int read = Decompress.Read(buffie, 0, buffie.Length);
it seems that 0 bytes are read.
Does anyone have a clue what is wrong?
bytesRead = ms.Read(buf, 0, i);
MemoryStream partialMs = new MemoryStream();
GZipStream gZip = new GZipStream(partialMs, CompressionMode.Compress);
gZip.Write(buf, 0, buf.Length);
partialMs.Position = 0;
byte[] compressedBuf = new byte[partialMs.Length];
partialMs.Read(compressedBuf, 0, (int)partialMs.Length);
partialMs.Close();
byte[] gzBuffer = new byte[compressedBuf.Length + 4];
System.Buffer.BlockCopy(compressedBuf, 0, gzBuffer, 4, compressedBuf.Length);
System.Buffer.BlockCopy(BitConverter.GetBytes(buf.Length), 0, gzBuffer, 0, 4);
using (MemoryStream mems = new MemoryStream())
{
int msgLength = BitConverter.ToInt32(gzBuffer, 0);
byte[] buffie = new byte[msgLength];
mems.Write(gzBuffer, 4, gzBuffer.Length - 4);
mems.Flush();
mems.Position = 0;
using (GZipStream Decompress = new GZipStream(mems, CompressionMode.Decompress, true))
{
int read = Decompress.Read(buffie, 0, buffie.Length);
Decompress.Close();
}
}
Your implementation could use some work. There seems to be some confusion as to which streams should be used where. Here is a working example to get you started.
See the user content at the bottom of this MSDN page.
var original = new byte[65535];
var compressed = GZipTest.Compress(original);
var decompressed = GZipTest.Decompress(compressed);
using System.IO;
using System.IO.Compression;
public class GZipTest
{
public static byte[] Compress(byte[] uncompressedBuffer)
{
using (var ms = new MemoryStream())
{
using (var gzip = new GZipStream(ms, CompressionMode.Compress, true))
{
gzip.Write(uncompressedBuffer, 0, uncompressedBuffer.Length);
}
byte[] compressedBuffer = ms.ToArray();
return compressedBuffer;
}
}
public static byte[] Decompress(byte[] compressedBuffer)
{
using (var gzip = new GZipStream(new MemoryStream(compressedBuffer), CompressionMode.Decompress))
{
byte[] uncompressedBuffer = ReadAllBytes(gzip);
return uncompressedBuffer;
}
}
private static byte[] ReadAllBytes(Stream stream)
{
var buffer = new byte[4096];
using (var ms = new MemoryStream())
{
int bytesRead = 0;
do
{
bytesRead = stream.Read(buffer, 0, buffer.Length);
if (bytesRead > 0)
{
ms.Write(buffer, 0, bytesRead);
}
} while (bytesRead > 0);
return ms.ToArray();
}
}
}
You're not closing the GZipStream you're writing to, so its output is probably still buffered when you read the MemoryStream. I suggest you close it when you're done writing your data.
By the way, you can get the data out of a MemoryStream much more easily than your current code: use MemoryStream.ToArray.
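The buffering effect is easy to observe. The sketch below uses Java's GZIPOutputStream as a stand-in for GZipStream (an assumption about equivalent behaviour, though both wrap the same deflate algorithm) and compares how many bytes have reached the underlying stream before and after the gzip stream is closed. The names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

final class UnclosedGzipSketch {
    // Returns {bytes in the sink before close, bytes in the sink after close}.
    static int[] sizesBeforeAndAfterClose(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            GZIPOutputStream gz = new GZIPOutputStream(sink);
            gz.write(raw);
            int before = sink.size(); // compressed data may still sit inside the deflater
            gz.close();               // flushes the rest and writes the gzip trailer
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesBeforeAndAfterClose(new byte[65535]);
        System.out.println("before close: " + sizes[0] + " bytes, after close: " + sizes[1] + " bytes");
    }
}
```

Anything read from the sink before the close is an incomplete gzip stream, which fits the zero-byte read described in the question.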
