Invalid data while decoding a byte[] using gzip - C#

I have a message payload and one of the properties is "contents", which contains gzip-compressed data. I've extracted the contents property from the message and published it to another queue. The next queue then needs to decompress the byte[] and extract the original string.
I get an "Invalid data while decoding" exception when decompressing the byte[] received from the previous queue. I've run the gzip-compressed data through an online gzip decompressor to confirm it is valid, and it is. I'm not sure why the byte[] can't be decompressed here even though an online gzip decompressor handles it without issue.
Code for how I am compressing and sending the original message (I am using a dynamic object because I am extracting other properties which are not shown in this snippet):
dynamic messageObject = new ExpandoObject();
messageObject.PreviousWorkerResult = message.PreviousWorkersResults;
messageObject.Contents = GzipHelper.Compress(Encoding.UTF8.GetBytes(serializedMessage));
var serializedExpandoObject = JsonConvert.SerializeObject(messageObject, new JsonSerializerSettings
{
    NullValueHandling = NullValueHandling.Ignore,
    ContractResolver = new CustomCamelCasePropertyNamesContractResolver(),
});

body = Encoding.UTF8.GetBytes(serializedExpandoObject);

_bus.Advanced.Publish(exchange,
    queueName,
    mandatory: true,
    messageProperties: properties,
    body: body);
Code for extracting the gzip-compressed "contents" property:
dynamic data = JsonConvert.DeserializeObject(message.Payload);
string contents = data.contents;

var properties = new MessageProperties()
{
    ContentType = "gzip"
};

await messageQueuePublisher.PublishAsync(contents, dto.SendTo, properties);
Code for the receiving end to decompress. This is where the exception is thrown, on GzipHelper.Decompress(body):
if (properties.ContentType == "gzip")
{
    var decompressed = GzipHelper.Decompress(body);
    jsonMessage = Encoding.UTF8.GetString(decompressed);
}
Code for the Compress and Decompress in GzipHelper:
public static byte[] Compress(byte[] data)
{
    using (var memory = new MemoryStream())
    {
        using (var gzip = new GZipStream(memory, CompressionLevel.Optimal))
        {
            gzip.Write(data, 0, data.Length);
        }
        return memory.ToArray();
    }
}

public static byte[] Decompress(byte[] data)
{
    try
    {
        using (var compressedStream = new MemoryStream(data))
        using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        using (var resultStream = new MemoryStream())
        {
            zipStream.CopyTo(resultStream);
            return resultStream.ToArray();
        }
    }
    catch (Exception e)
    {
        throw new Exception(e.ToString());
    }
}
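One likely cause, given the snippets above: Contents is a byte[], and Json.NET serializes byte[] properties as base64 strings, so data.contents holds base64 text rather than raw gzip bytes. If PublishAsync then sends that string as UTF-8 bytes, the receiver's body contains the text of the base64 string, which GZipStream rejects as invalid data. A minimal sketch of a fix on the receiving side, assuming that is what is happening (names come from the snippets above):
if (properties.ContentType == "gzip")
{
    // Assumption: body holds the UTF-8 bytes of the base64 string that
    // Json.NET produced for the byte[] Contents property.
    var base64 = Encoding.UTF8.GetString(body);
    var gzipBytes = Convert.FromBase64String(base64);
    var decompressed = GzipHelper.Decompress(gzipBytes);
    jsonMessage = Encoding.UTF8.GetString(decompressed);
}
Alternatively, call Convert.FromBase64String(contents) in the forwarding service and publish the resulting byte[], so the queue carries the raw gzip bytes end to end.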

Related

Value already read, or no value when trying to read from a Stream

I've been trying this for a long time, but it keeps giving me an error. I have an array of bytes that should represent an NBT document, and I would like to convert it into a C# object with a library: fNbt.
Here is my code:
byte[] buffer = Convert.FromBase64String(value);
byte[] decompressed;
using (var inputStream = new MemoryStream(buffer))
{
    using var outputStream = new MemoryStream();
    using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
    {
        gzip.CopyTo(outputStream);
    }
    fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
    var output = reader.ReadValueAs<AuctionItem>(); // Error: Value already read, or no value to read.
    return output;
}
When I try this, it works:
decompressed = outputStream.ToArray();
outputStream.Seek(0, SeekOrigin.Begin);
outputStream.Read(new byte[1000], 0, decompressed.Count() - 1);
But when I try this, it doesn't:
outputStream.Seek(0, SeekOrigin.Begin);
fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
reader.ReadValueAs<AuctionItem>();
NbtReader, like most stream readers, begins reading from the current position of whatever stream you give it. Since you have just finished writing to outputStream, that position is the end of the stream, which means there is nothing left to read.
The solution is to seek the outputStream back to the beginning before reading from it:
outputStream.Seek(0, SeekOrigin.Begin); // <-- seek to the beginning
// Do the read
fNbt.NbtReader reader = new fNbt.NbtReader(outputStream, true);
var output = reader.ReadValueAs<AuctionItem>(); // No error anymore
return output;
The solution is as follows: NbtReader.ReadValueAs does not consider an NbtCompound or NbtList to be a value. I made this little reader, but it is not done yet (I will update the code once it is done).
public static T ReadValueAs<T>(string value) where T : new()
{
    byte[] buffer = Convert.FromBase64String(value);
    using (var inputStream = new MemoryStream(buffer))
    {
        using var outputStream = new MemoryStream();
        using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
        {
            gzip.CopyTo(outputStream);
        }
        outputStream.Seek(0, SeekOrigin.Begin);
        return new EasyNbt.NbtReader(outputStream).ReadValueAs<T>();
    }
}
This is the NbtReader:
private MemoryStream MemStream { get; set; }

public NbtReader(MemoryStream memStream)
{
    MemStream = memStream;
}

public T ReadValueAs<T>() where T : new()
{
    return ReadTagAs<T>(new fNbt.NbtReader(MemStream, true).ReadAsTag());
}

private T ReadTagAs<T>(fNbt.NbtTag nbtTag)
{
    // Reads to the root and adds to T...
}

protobuf-net returns null when calling Deserialize

My end goal is to use protobuf-net and GZipStream in an attempt to compress a List<MyCustomType> object to store in a varbinary(max) field in SQL Server. I'm working on unit tests to understand how everything works and fits together.
Target .NET framework is 3.5.
My current process is:
Serialize the data with protobuf-net (good).
Compress the serialized data from #1 with GZipStream (good).
Convert the compressed data to a base64 string (good).
At this point, the value from step #3 will be stored in a varbinary(max) field. I have no control over this. The steps resume with needing to take a base64 string and deserialize it to a concrete type.
Convert a base 64 string to a byte[] (good).
Decompress the data with GZipStream (good).
Deserialize the data with protobuf-net (bad).
Can someone assist with why the call to Serializer.Deserialize<string> returns null? I'm stuck on this one and hopefully a fresh set of eyes will help.
FWIW, I tried another version of this using List<T>, where T is a custom class I created, and Deserialize<> still returns null.
FWIW 2, data.txt is a 4 MB plaintext file on my C: drive.
[Test]
public void ForStackOverflow()
{
    string data = "hi, my name is...";
    //string data = File.ReadAllText(@"C:\Temp\data.txt");

    string serializedBase64;
    using (MemoryStream protobuf = new MemoryStream())
    {
        Serializer.Serialize(protobuf, data);
        using (MemoryStream compressed = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(compressed, CompressionMode.Compress))
            {
                byte[] s = protobuf.ToArray();
                gzip.Write(s, 0, s.Length);
                gzip.Close();
            }
            serializedBase64 = Convert.ToBase64String(compressed.ToArray());
        }
    }

    byte[] base64byteArray = Convert.FromBase64String(serializedBase64);
    using (MemoryStream base64Stream = new MemoryStream(base64byteArray))
    {
        using (GZipStream gzip = new GZipStream(base64Stream, CompressionMode.Decompress))
        {
            using (MemoryStream plainText = new MemoryStream())
            {
                byte[] buffer = new byte[4096];
                int read;
                while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
                {
                    plainText.Write(buffer, 0, read);
                }

                // why does this call to Deserialize return null?
                string deserialized = Serializer.Deserialize<string>(plainText);
                Assert.IsNotNull(deserialized);
                Assert.AreEqual(data, deserialized);
            }
        }
    }
}
Because you didn't rewind plainText after writing to it. Actually, that entire Stream is unnecessary - this works:
using (MemoryStream base64Stream = new MemoryStream(base64byteArray))
{
    using (GZipStream gzip = new GZipStream(
        base64Stream, CompressionMode.Decompress))
    {
        string deserialized = Serializer.Deserialize<string>(gzip);
        Assert.IsNotNull(deserialized);
        Assert.AreEqual(data, deserialized);
    }
}
Likewise, this should work for the serialize:
using (MemoryStream compressed = new MemoryStream())
{
    using (GZipStream gzip = new GZipStream(
        compressed, CompressionMode.Compress, true))
    {
        Serializer.Serialize(gzip, data);
    }
    serializedBase64 = Convert.ToBase64String(
        compressed.GetBuffer(), 0, (int)compressed.Length);
}
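If you do keep the intermediate plainText buffer, the other fix implied by the first sentence of this answer is simply to rewind it before deserializing; a minimal sketch using the names from the test above:
// Rewind before deserializing; otherwise the reader starts at the end
// of the stream and protobuf-net returns the default value (null).
plainText.Position = 0;
string deserialized = Serializer.Deserialize<string>(plainText);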

Decode Base64 and Inflate Zlib compressed XML

Sorry for the long post, will try to make this as short as possible.
I'm consuming a json API (which has zero documentation of course) which returns something like this:
{
    uncompressedlength: 743637,
    compressedlength: 234532,
    compresseddata: "lkhfdsbjhfgdsfgjhsgfjgsdkjhfgj"
}
The data (XML in this case) is compressed and then base64-encoded; I am attempting to extract it. All I have is their demo code, written in Perl, to decode it:
use Compress::Zlib qw(uncompress);
use MIME::Base64 qw(decode_base64);
my $uncompresseddata = uncompress(decode_base64($compresseddata));
Seems simple enough.
I've tried a number of methods to decode the base64:
private string DecodeFromBase64(string encodedData)
{
    byte[] encodedDataAsBytes = System.Convert.FromBase64String(encodedData);
    string returnValue = System.Text.Encoding.Unicode.GetString(encodedDataAsBytes);
    return returnValue;
}

public string base64Decode(string data)
{
    try
    {
        System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
        System.Text.Decoder utf8Decode = encoder.GetDecoder();
        byte[] todecode_byte = Convert.FromBase64String(data);
        int charCount = utf8Decode.GetCharCount(todecode_byte, 0, todecode_byte.Length);
        char[] decoded_char = new char[charCount];
        utf8Decode.GetChars(todecode_byte, 0, todecode_byte.Length, decoded_char, 0);
        string result = new String(decoded_char);
        return result;
    }
    catch (Exception e)
    {
        throw new Exception("Error in base64Decode" + e.Message);
    }
}
I have also tried using Ionic.Zip.dll (DotNetZip) and zlib.net to inflate the zlib-compressed data, but everything errors out, and I am trying to track down whether the problem is the base64 decode or the inflate.
I always get an error when inflating: a bad magic number error with zlib.net, and "Bad state (invalid stored block lengths)" with DotNetZip:
string decoded = DecodeFromBase64(compresseddata);
string decompressed = UnZipStr(GetBytes(decoded));

public static string UnZipStr(byte[] input)
{
    using (MemoryStream inputStream = new MemoryStream(input))
    {
        using (Ionic.Zlib.DeflateStream zip =
            new Ionic.Zlib.DeflateStream(inputStream, Ionic.Zlib.CompressionMode.Decompress))
        {
            using (StreamReader reader =
                new StreamReader(zip, System.Text.Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
After reading this post:
http://george.chiramattel.com/blog/2007/09/deflatestream-block-length-does-not-match.html
and one of its comments, I changed the code to this:
MemoryStream memStream = new MemoryStream(Convert.FromBase64String(compresseddata));
memStream.ReadByte();
memStream.ReadByte();
DeflateStream deflate = new DeflateStream(memStream, CompressionMode.Decompress);
string doc = new StreamReader(deflate, System.Text.Encoding.UTF8).ReadToEnd();
And it's working fine.
This was the culprit:
http://george.chiramattel.com/blog/2007/09/deflatestream-block-length-does-not-match.html
By skipping the first two bytes, I was able to simplify it to:
MemoryStream memStream = new MemoryStream(Convert.FromBase64String(compresseddata));
memStream.ReadByte();
memStream.ReadByte();
DeflateStream deflate = new DeflateStream(memStream, CompressionMode.Decompress);
string doc = new StreamReader(deflate, System.Text.Encoding.UTF8).ReadToEnd();
First, use System.IO.Compression.DeflateStream to re-inflate the data. You should be able to use a MemoryStream as the input stream. You can create a MemoryStream using the byte[] result of Convert.FromBase64String.
You are likely causing all kinds of trouble by trying to convert the base64 result to a particular text encoding; feed the raw decoded bytes directly to DeflateStream.
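Some background on why skipping two bytes works (an assumption about this API, but it matches PHP's gzcompress): the data is in zlib format, i.e. a 2-byte header (commonly 0x78 0x9C) followed by raw DEFLATE data and an Adler-32 trailer. DeflateStream in the .NET Framework understands only raw DEFLATE, so skipping the 2-byte header lets it inflate the body; the trailer is simply never read. A minimal sketch:
// Sketch: inflate zlib-wrapped data with DeflateStream by skipping the 2-byte header.
// Assumes compresseddata is the base64 string from the API response.
byte[] zlibBytes = Convert.FromBase64String(compresseddata);
using (var memStream = new MemoryStream(zlibBytes))
{
    memStream.ReadByte(); // skip zlib header byte 1 (CMF)
    memStream.ReadByte(); // skip zlib header byte 2 (FLG)
    using (var deflate = new DeflateStream(memStream, CompressionMode.Decompress))
    using (var reader = new StreamReader(deflate, Encoding.UTF8))
    {
        string xml = reader.ReadToEnd(); // the Adler-32 trailer is never reached
    }
}
On .NET 6 or later, System.IO.Compression.ZLibStream handles the zlib wrapper directly, so no byte-skipping is needed.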

How do I correctly prepare an 'HTTP Redirect Binding' SAML Request using C#

I need to create an SP-initiated SAML 2.0 authentication transaction using the HTTP Redirect Binding method. It turns out this is quite easy: just take the IdP URI and append a single query-string param, SAMLRequest. The param is an encoded block of XML that describes the SAML request. So far so good.
The problem comes when converting the SAML into the query string param. I believe this process of preparation should be:
Build a SAML string
Compress this string
Base64 encode the string
UrlEncode the string.
The SAML Request
<samlp:AuthnRequest
    xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    ID="{0}"
    Version="2.0"
    AssertionConsumerServiceIndex="0"
    AttributeConsumingServiceIndex="0">
    <saml:Issuer>URN:xx-xx-xx</saml:Issuer>
    <samlp:NameIDPolicy
        AllowCreate="true"
        Format="urn:oasis:names:tc:SAML:2.0:nameid-format:transient"/>
</samlp:AuthnRequest>
The Code
private string GetSAMLHttpRedirectUri(string idpUri)
{
    var saml = string.Format(SAMLRequest, Guid.NewGuid());
    var bytes = Encoding.UTF8.GetBytes(saml);
    using (var output = new MemoryStream())
    {
        using (var zip = new DeflaterOutputStream(output))
        {
            zip.Write(bytes, 0, bytes.Length);
        }
        var base64 = Convert.ToBase64String(output.ToArray());
        var urlEncode = HttpUtility.UrlEncode(base64);
        return string.Concat(idpUri, "?SAMLRequest=", urlEncode);
    }
}
I suspect the compression is somehow to blame. I am using the DeflaterOutputStream class from SharpZipLib, which is supposed to implement an industry-standard deflate algorithm, so perhaps there is a setting here I have wrong?
The encoded output can be tested using this SAML 2.0 debugger (a useful online conversion tool). When I decode my output using this tool, it comes out as nonsense.
The question therefore is: Do you know how to convert a SAML string into the correctly deflated and encoded SAMLRequest query-param?
Thank you
EDIT 1
The accepted answer below solves the problem. Here is the final code, as corrected by all subsequent comments and answers.
Encode SAMLRequest - Working Code
private string GenerateSAMLRequestParam()
{
    var saml = string.Format(SAMLRequest, Guid.NewGuid());
    var bytes = Encoding.UTF8.GetBytes(saml);
    using (var output = new MemoryStream())
    {
        using (var zip = new DeflateStream(output, CompressionMode.Compress))
        {
            zip.Write(bytes, 0, bytes.Length);
        }
        var base64 = Convert.ToBase64String(output.ToArray());
        return HttpUtility.UrlEncode(base64);
    }
}
The SAMLRequest variable contains the SAML shown at the top of this question.
Decode SAMLResponse - Working Code
private string DecodeSAMLResponse(string response)
{
    var utf8 = Encoding.UTF8;
    var bytes = utf8.GetBytes(response);
    using (var output = new MemoryStream())
    {
        using (new DeflateStream(output, CompressionMode.Decompress))
        {
            output.Write(bytes, 0, bytes.Length);
        }
        var base64 = utf8.GetString(output.ToArray());
        return utf8.GetString(Convert.FromBase64String(base64));
    }
}
I've just run the following code with your example SAML:
var saml = string.Format(sample, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);

string middle;
using (var output = new MemoryStream())
{
    using (var zip = new DeflaterOutputStream(output))
        zip.Write(bytes, 0, bytes.Length);
    middle = Convert.ToBase64String(output.ToArray());
}

string decoded;
using (var input = new MemoryStream(Convert.FromBase64String(middle)))
using (var unzip = new InflaterInputStream(input))
using (var reader = new StreamReader(unzip, Encoding.UTF8))
    decoded = reader.ReadToEnd();

bool test = decoded == saml;
The test variable is true, which means the zip/base64/unbase64/unzip round trip works correctly. The error must occur later. Maybe the UrlEncode step destroys the data? Could you try a similar urlencode/decode test? Also, check how long the result is; the resulting URL may be truncated due to its length.
(edit: I've added a StreamReader instead of reading into arrays. Earlier my sample used bytes.Length to size the buffer, and that could skew the test. Now the reading uses only the information from the compressed stream.)
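A quick way to run the suggested urlencode/decode check, using the middle variable from the snippet above:
// Round-trip the base64 through URL encoding to see whether that step corrupts it.
var urlEncoded = HttpUtility.UrlEncode(middle);
var urlDecoded = HttpUtility.UrlDecode(urlEncoded);
bool survivedUrlEncoding = urlDecoded == middle;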
edit:
var saml = string.Format(sample, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);

string middle;
using (var output = new MemoryStream())
{
    using (var zip = new DeflateStream(output, CompressionMode.Compress))
        zip.Write(bytes, 0, bytes.Length);
    middle = Convert.ToBase64String(output.ToArray());
}
// MIDDLE is the thing that should now be UrlEncode'd

string decoded;
using (var input = new MemoryStream(Convert.FromBase64String(middle)))
using (var unzip = new DeflateStream(input, CompressionMode.Decompress))
using (var reader = new StreamReader(unzip, Encoding.UTF8))
    decoded = reader.ReadToEnd();

bool test = decoded == saml;
this code produces a middle variable that, once UrlEncoded, passes through the debugger properly. DeflateStream comes from the standard .NET System.IO.Compression namespace. I don't have the slightest idea why SharpZipLib's deflate output is not accepted by the 'debugger' site. It is undeniable that the compression works, since it manages to decompress the data properly; there just has to be some difference between the two deflate implementations, but I cannot tell what it is.
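A plausible explanation for the difference: SharpZipLib's DeflaterOutputStream emits a zlib-wrapped stream (2-byte header plus Adler-32 trailer) by default, whereas the SAML HTTP Redirect binding expects raw DEFLATE, which is what System.IO.Compression.DeflateStream produces. If you prefer to stay with SharpZipLib, a sketch along these lines should yield raw DEFLATE; the second Deflater constructor argument suppresses the zlib header and trailer (this is an assumption sketch, not code from the thread):
// Sketch, assuming SharpZipLib (ICSharpCode.SharpZipLib.Zip.Compression namespaces).
using (var output = new MemoryStream())
{
    var deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // true = no zlib header/trailer
    using (var zip = new DeflaterOutputStream(output, deflater))
    {
        zip.Write(bytes, 0, bytes.Length);
    }
    var base64 = Convert.ToBase64String(output.ToArray());
    // base64 should now decode in the SAML debugger, like the DeflateStream version.
}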
The question at the top contains a "Decode SAMLResponse - Working Code" section, but that code seemed broken: after trying a few things, I discovered that it was reading from and writing to the same stream at the same time. I reworked it by separating the read and write streams. Here is my solution (I am including the request section for convenience and clarity):
Encode SAML Authentication Request:
public static string EncodeSamlAuthnRequest(this string authnRequest)
{
    var bytes = Encoding.UTF8.GetBytes(authnRequest);
    using (var output = new MemoryStream())
    {
        using (var zip = new DeflateStream(output, CompressionMode.Compress))
        {
            zip.Write(bytes, 0, bytes.Length);
        }
        var base64 = Convert.ToBase64String(output.ToArray());
        return HttpUtility.UrlEncode(base64);
    }
}
Decode SAML Authentication Response:
public static string DecodeSamlAuthnRequest(this string encodedAuthnRequest)
{
    var utf8 = Encoding.UTF8;
    var bytes = Convert.FromBase64String(HttpUtility.UrlDecode(encodedAuthnRequest));
    using (var output = new MemoryStream())
    {
        using (var input = new MemoryStream(bytes))
        {
            using (var unzip = new DeflateStream(input, CompressionMode.Decompress))
            {
                unzip.CopyTo(output, bytes.Length);
                unzip.Close();
            }
            return utf8.GetString(output.ToArray());
        }
    }
}

Gzip uncompress from string error, The magic number in GZip header is not correct

I am trying to replicate the PHP function gzuncompress in C#.
So far I have part of the following code working; see the comments and code below.
I think the tricky bit is happening during the byte[] and string conversion.
How can I fix this, and what did I miss?
I am using the .NET 3.5 environment.
var plaintext = Console.ReadLine();
Console.WriteLine("string to byte[] then to string");
byte[] buff = Encoding.UTF8.GetBytes(plaintext);
var compress = GZip.GZipCompress(buff);

// Uncompress working below
try
{
    var unpressFromByte = GZip.GZipUncompress(compress);
    Console.WriteLine("uncompress successful by uncompress byte[]");
}
catch
{
    Console.WriteLine("uncompress failed by uncompress byte[]");
}

var compressString = Encoding.UTF8.GetString(compress);
Console.WriteLine(compressString);
var compressBuff = Encoding.UTF8.GetBytes(compressString);
Console.WriteLine(Encoding.UTF8.GetString(compressBuff));

// Uncompress not working below by using string
// The magic number in GZip header is not correct
try
{
    var uncompressFromString = GZip.GZipUncompress(compressBuff);
    Console.WriteLine("uncompress successful by uncompress string");
}
catch
{
    Console.WriteLine("uncompress failed by uncompress string");
}
Code for the GZip class:
public static class GZip
{
    public static byte[] GZipUncompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static byte[] GZipCompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                input.CopyTo(gzip);
            }
            return output.ToArray();
        }
    }

    public static long CopyTo(this Stream source, Stream destination)
    {
        var buffer = new byte[2048];
        int bytesRead;
        long totalBytes = 0;
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, bytesRead);
            totalBytes += bytesRead;
        }
        return totalBytes;
    }
}
This is inappropriate:
var compressString = Encoding.UTF8.GetString(compress);
compress isn't a UTF-8-encoded piece of text. You should treat it as arbitrary binary data - which isn't appropriate to pass into Encoding.GetString. If you really need to convert arbitrary binary data into text, use Convert.ToBase64String (and then reverse with Convert.FromBase64String):
var compressString = Convert.ToBase64String(compress);
Console.WriteLine(compressString);
var compressBuff = Convert.FromBase64String(compressString);
That may or may not match what PHP does, but it's a safe way of representing arbitrary binary data as text, unlike treating the binary data as if it were valid UTF-8-encoded text.
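Put together, a minimal sketch of the corrected round trip, reusing the GZip helper class from the question:
// Sketch: represent the compressed bytes as base64 text instead of pretending they are UTF-8 text.
byte[] buff = Encoding.UTF8.GetBytes(plaintext);
var compress = GZip.GZipCompress(buff);

var compressString = Convert.ToBase64String(compress);        // safe text representation
var compressBuff = Convert.FromBase64String(compressString);  // identical bytes back
var uncompressed = GZip.GZipUncompress(compressBuff);         // no magic-number error
Console.WriteLine(Encoding.UTF8.GetString(uncompressed));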
I am trying to replicate the php function gzuncompress in C#
Then use GZipStream or DeflateStream classes which are built into the .NET framework for this purpose.
