Decode Base64 and Inflate Zlib compressed XML - c#

Sorry for the long post; I'll try to keep this as short as possible.
I'm consuming a JSON API (which has zero documentation, of course) that returns something like this:
{
uncompressedlength: 743637,
compressedlength: 234532,
compresseddata: "lkhfdsbjhfgdsfgjhsgfjgsdkjhfgj"
}
The data I am trying to extract (XML in this case) is compressed and then Base64 encoded. All I have is their demo code, written in Perl, to decode it:
use Compress::Zlib qw(uncompress);
use MIME::Base64 qw(decode_base64);
my $uncompresseddata = uncompress(decode_base64($compresseddata));
Seems simple enough.
I've tried a number of methods to decode the base64:
private string DecodeFromBase64(string encodedData)
{
byte[] encodedDataAsBytes = System.Convert.FromBase64String(encodedData);
string returnValue = System.Text.Encoding.Unicode.GetString(encodedDataAsBytes);
return returnValue;
}
public string base64Decode(string data)
{
try
{
System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
System.Text.Decoder utf8Decode = encoder.GetDecoder();
byte[] todecode_byte = Convert.FromBase64String(data);
int charCount = utf8Decode.GetCharCount(todecode_byte, 0, todecode_byte.Length);
char[] decoded_char = new char[charCount];
utf8Decode.GetChars(todecode_byte, 0, todecode_byte.Length, decoded_char, 0);
string result = new String(decoded_char);
return result;
}
catch (Exception e)
{
throw new Exception("Error in base64Decode" + e.Message);
}
}
And I have tried using Ionic.Zip.dll (DotNetZip?) and zlib.net to inflate the Zlib compression. But everything errors out. I am trying to track down where the problem is coming from. Is it the base64 decode or the Inflate?
I always get an error when inflating using zlib: I get a bad Magic Number error using zlib.net and I get "Bad state (invalid stored block lengths)" when using DotNetZip:
string decoded = DecodeFromBase64(compresseddata);
string decompressed = UnZipStr(GetBytes(decoded));
public static string UnZipStr(byte[] input)
{
using (MemoryStream inputStream = new MemoryStream(input))
{
using (Ionic.Zlib.DeflateStream zip =
new Ionic.Zlib.DeflateStream(inputStream, Ionic.Zlib.CompressionMode.Decompress))
{
using (StreamReader reader =
new StreamReader(zip, System.Text.Encoding.UTF8))
{
return reader.ReadToEnd();
}
}
}
}
After reading this:
http://george.chiramattel.com/blog/2007/09/deflatestream-block-length-does-not-match.html
and following the advice in one of its comments, I changed the code to this:
MemoryStream memStream = new MemoryStream(Convert.FromBase64String(compresseddata));
memStream.ReadByte();
memStream.ReadByte();
DeflateStream deflate = new DeflateStream(memStream, CompressionMode.Decompress);
string doc = new StreamReader(deflate, System.Text.Encoding.UTF8).ReadToEnd();
And it's working fine.

This was the culprit:
http://george.chiramattel.com/blog/2007/09/deflatestream-block-length-does-not-match.html
By skipping the first two bytes I was able to simplify it to:
MemoryStream memStream = new MemoryStream(Convert.FromBase64String(compresseddata));
memStream.ReadByte();
memStream.ReadByte();
DeflateStream deflate = new DeflateStream(memStream, CompressionMode.Decompress);
string doc = new StreamReader(deflate, System.Text.Encoding.UTF8).ReadToEnd();
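The two skipped bytes are the zlib header (RFC 1950) that Perl's Compress::Zlib writes in front of the deflate data. DeflateStream expects raw deflate data (RFC 1951), which is why it fails when the header is still there. The following is a minimal sketch of the same approach with the header checked instead of skipped blindly. The class and method names are made up for illustration, and it assumes the payload does not use a preset dictionary.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZlibPayload
{
    // Decodes Base64, validates and skips the 2-byte zlib header,
    // then inflates the remaining raw deflate data as UTF-8 text.
    public static string Decode(string base64)
    {
        byte[] raw = Convert.FromBase64String(base64);

        // zlib header: low nibble of byte 0 is 8 (deflate), and the two bytes
        // read as a big-endian 16-bit number are a multiple of 31.
        bool looksLikeZlib = raw.Length >= 2
            && (raw[0] & 0x0F) == 8
            && ((raw[0] << 8) | raw[1]) % 31 == 0;
        if (!looksLikeZlib)
            throw new InvalidDataException("Input does not start with a zlib header.");

        // Start after the header. DeflateStream stops at the end of the deflate
        // data, so the 4-byte Adler-32 trailer is neither read nor verified.
        using (var input = new MemoryStream(raw, 2, raw.Length - 2))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(inflate, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the API response above this would be called as ZlibPayload.Decode(compresseddata). On .NET 6 or later, System.IO.Compression.ZLibStream reads the header and trailer itself, so no manual skipping should be needed there.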

First, use System.IO.Compression.DeflateStream to re-inflate the data. You should be able to use a MemoryStream as the input stream, created from the byte[] result of Convert.FromBase64String.
You are likely causing all kinds of trouble by converting the Base64 result to a string with a given encoding; pass the raw bytes directly to DeflateStream instead.

Related

How to convert an already encoded string representation of a byte array to an actual byte array? C#

The problem: we had a system in which events and projections had a Payload column containing a serialized object. This payload was a string, but to improve performance and save disk space we started saving a compressed version of the string in the database, and we decompress it whenever we fetch it from the database.
Code for compressing and decompressing
using System.IO;
using System.IO.Compression;
using System.Text;
namespace DemoEFCore.Helpers
{
public class CompressionHelper
{
public static byte[] Compress(string stringData)
{
var stringBytes = Encoding.UTF8.GetBytes(stringData);
using var output = new MemoryStream();
using (DeflateStream dstream = new DeflateStream(output, CompressionLevel.Fastest))
{
dstream.Write(stringBytes, 0, stringBytes.Length);
}
return output.ToArray();
}
public static string Decompress(byte[] data)
{
using var input = new MemoryStream(data);
using var output = new MemoryStream();
using (DeflateStream dstream = new DeflateStream(input, CompressionMode.Decompress))
{
dstream.CopyTo(output);
}
var bytes = output.ToArray();
return Encoding.UTF8.GetString(bytes);
}
}
}
It works perfectly fine and it really does improve performance.
But sometimes, when you are fixing a bug, you go straight to the database to look at the payload of a specific record. Previously I could copy it and paste it into a JSON beautifier, but now I can only copy the encoded representation.
screenshot from db
0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B
What we want is the ability to copy this string, paste it into our application and get the decoded JSON string. It sounds easy because we already have the CompressionHelper.Decompress method, but that method accepts a byte array as a parameter. I found 3 solutions for converting such a string to a byte array, but none of them worked for me.
1.
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
var stream = new MemoryStream();
var writer = new StreamWriter(stream);
writer.Write(s);
writer.Flush();
stream.Position = 0;
var s1 = CompressionHelper.Decompress(stream.ToArray());
The line var s1 = CompressionHelper.Decompress(stream.ToArray()); throws the exception System.IO.InvalidDataException: The archive entry was compressed using an unsupported compression method.
2.
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
var b = Convert.FromBase64String(s);
var jsonString = CompressionHelper.Decompress(b);
The line var b = Convert.FromBase64String(s); throws this exception System.FormatException: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.
3.
try
{
var s = "0xB5945F6FDA3014C5BF0ABACF71E5FF2179AB18D22A552D5A190FDBFAE0C437C11A719013B67515DF7D22A40C06A3ADB63EC6F139E7FADE9FFD08571652A054B0445B4164CE389179C249A6D112ADA842458B38CB044470632A84142675684766B1B8A92BE74DEB6A3F416F9D2FAF5DD3CE1C7E8768B7E735F6336C1A5CF421D339D6B60E8371D364184A0CBD69FFFBAAAA9C2FE7A6EA97BB9CD84A9E6405126D634EE490496292212536CF448C4C67B99610C1A8AE2A0CB9338BDB501AEF7E7667E81C146A9170A949A20D25D2E69464C6E4C42844AE5070610A88608AF9DCBBFC8481502C16BC50246668881446902CE78288213342E499A6F1102298D50FA6C49B559561801418D707755D96E8DBA3E2B2586A5B5843322C349142539220CB89B454243197892AD4799BBE7923D32C310CEEE66EB974BE1C5CB7F60222B8FDEE31FC53F46987F3A9D38725422A28A511DCB5A65D3590F2EE6BEA2ABC2D7ACA36E071CA19619C303165C394A994F10BC919535A7FDA70109CCFDDF2702A903E3E713BFEB2A294EBBBC1D4F8AF181A88B6F4BF68E8EB7D6EDEA1B10BE7B181F4F3234C4C681F2065DBA2BBE3B058D1083E6081017D8E93DAF91652A936877C5FAF4203A9B8A0741D9D5677FB8ED49D7FAF5654FE5DCF9FD7F33372D9059D8D1F9E093FA94EF68B3F17BE05E128FCA59D7B3E5C76B5DFF7B89F9C648FDF1E98AFA9675FBD65E24FF54133B6F5FC06E140FEFC208F4038D0BF10C3FBBDCBF3D491F10FD7B490B66185118CBFA16FB7B45F9665C0D2B4F89AA7FD638361B3DFAF168B087616330C8DAB3DA4DD4DD82DFF4F67B6B9636FE32CE337F41EAEEF23D8F5474670D58CE6C69768B73359FF02";
//var substring = s.Substring(2);
var byteArray = new byte[s.Length];
for (var i = 0; i < s.Length; i++)
{
var b = Byte.Parse(s[i].ToString());
byteArray[i] = b;
}
var jsonString = CompressionHelper.Decompress(byteArray);
}
catch (Exception e)
{
Console.WriteLine(e);
throw;
}
This throws the following exception:
System.FormatException: Input string was not in a correct format.
at System.Number.ThrowOverflowOrFormatException(ParsingStatus status, TypeCode type)
at System.Byte.Parse(String s)
Can you please help me to figure out how to solve my problem?
@madmonk46, thank you indeed. Convert.FromHexString really helped. I couldn't find this method at first because my demo project targets .NET Core 3.1, and Convert.FromHexString is only supported from .NET 5 onward. Luckily our project is on .NET 5, and we are migrating to .NET 6 now.
Note that it does not work with a string that starts with 0x:
var stringFromDataBase = "0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B";
var bytesArray = Convert.FromHexString(stringFromDataBase);
var jsonString = CompressionHelper.Decompress(bytesArray);
This code throws
System.FormatException: The input is not a valid hex string as it contains a non-hex character.
so you need to add one line that strips the 0x prefix:
var stringFromDataBase = "0x8590416BC3300C85FFCA7867BBC88E9DC4BE6D1D8C5DB6427BEAD82189952D236D83E3144AE97F1F69BBF3408787D07B9FA4335E033C7496DBB6B546EADCE6D2144ECBD29293A1A6D0385B84820C049691ABC4E131CD16D25A2A25C96CA8F4B6F4E4163A3345A1DD1602AB7808539336A781E1151109ACA781E33AC56EFF058FA7581D1902EFF50F376984FFC010BB2327082D529C5840B9822429496A43E4AFB58538E3ADDA31FC2DEF65D633AE6B18DEDA8515584D75DF8DDF1C6FB73559C921AFA42D9D9626574A56D690AC759B99AC0A2E84664EF833C19FB13CEC866A7FBA83C679F187E6D683C0CC1CE1F753DF8BEBFFEE6A5C1FDAF4CC3D270EF06DD58F7CF977E0F3F20B";
var substring = stringFromDataBase.Substring(2);
var bytesArray = Convert.FromHexString(substring);
var jsonString = CompressionHelper.Decompress(bytesArray);
Thank you one more time for your quick help, madmonk46!
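For projects that cannot use Convert.FromHexString (such as the .NET Core 3.1 demo project mentioned above), the same conversion can be done by hand. This is only a sketch; HexPayload and ToBytes are hypothetical names, not part of any library.

```csharp
using System;

public static class HexPayload
{
    // Converts "0x8590..." (or "8590...") into the bytes the hex digits spell out.
    public static byte[] ToBytes(string hex)
    {
        // Drop the "0x" prefix that the database tool puts in front of the value.
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
            hex = hex.Substring(2);

        if (hex.Length % 2 != 0)
            throw new FormatException("Hex string must have an even number of digits.");

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Each pair of hex digits is one byte.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }
}
```

The result can be passed straight to CompressionHelper.Decompress, for example CompressionHelper.Decompress(HexPayload.ToBytes(stringFromDataBase)).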

How do I correctly prepare an 'HTTP Redirect Binding' SAML Request using C#

I need to create an SP initiated SAML 2.0 Authentication transaction using HTTP Redirect Binding method. It turns out this is quite easy. Just get the IdP URI and concatenate a single query-string param SAMLRequest. The param is an encoded block of xml that describes the SAML request. So far so good.
The problem comes when converting the SAML into the query string param. I believe the preparation steps should be:
1. Build a SAML string
2. Compress this string
3. Base64 encode the string
4. UrlEncode the string
The SAML Request
<samlp:AuthnRequest
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
ID="{0}"
Version="2.0"
AssertionConsumerServiceIndex="0"
AttributeConsumingServiceIndex="0">
<saml:Issuer>URN:xx-xx-xx</saml:Issuer>
<samlp:NameIDPolicy
AllowCreate="true"
Format="urn:oasis:names:tc:SAML:2.0:nameid-format:transient"/>
</samlp:AuthnRequest>
The Code
private string GetSAMLHttpRedirectUri(string idpUri)
{
var saml = string.Format(SAMLRequest, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);
using (var output = new MemoryStream())
{
using (var zip = new DeflaterOutputStream(output))
{
zip.Write(bytes, 0, bytes.Length);
}
var base64 = Convert.ToBase64String(output.ToArray());
var urlEncode = HttpUtility.UrlEncode(base64);
return string.Concat(idpUri, "?SAMLRequest=", urlEncode);
}
}
I suspect the compression is somehow to blame. I am using the DeflaterOutputStream class from SharpZipLib which is supposed to implement an industry standard deflate-algorithm so perhaps there are some settings here I have wrong?
The encoded output can be tested using this SAML2.0 Debugger (it's a useful online conversion tool). When I decode my output using this tool it comes out as nonsense.
The question therefore is: Do you know how to convert a SAML string into the correctly deflated and encoded SAMLRequest query-param?
Thank you
EDIT 1
The accepted answer below gives the answer to the problem. Here is the final code as corrected by all subsequent comments and answers.
Encode SAMLRequest - Working Code
private string GenerateSAMLRequestParam()
{
var saml = string.Format(SAMLRequest, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);
using (var output = new MemoryStream())
{
using (var zip = new DeflateStream(output, CompressionMode.Compress))
{
zip.Write(bytes, 0, bytes.Length);
}
var base64 = Convert.ToBase64String(output.ToArray());
return HttpUtility.UrlEncode(base64);
}
}
The SAMLRequest variable contains the SAML shown at the top of this question.
Decode SAMLResponse - Working Code
private string DecodeSAMLResponse(string response)
{
var utf8 = Encoding.UTF8;
var bytes = utf8.GetBytes(response);
using (var output = new MemoryStream())
{
using (new DeflateStream(output, CompressionMode.Decompress))
{
output.Write(bytes, 0, bytes.Length);
}
var base64 = utf8.GetString(output.ToArray());
return utf8.GetString(Convert.FromBase64String(base64));
}
}
I've just run the following code with your example SAML:
var saml = string.Format(sample, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);
string middle;
using (var output = new MemoryStream())
{
using (var zip = new DeflaterOutputStream(output))
zip.Write(bytes, 0, bytes.Length);
middle = Convert.ToBase64String(output.ToArray());
}
string decoded;
using (var input = new MemoryStream(Convert.FromBase64String(middle)))
using (var unzip = new InflaterInputStream(input))
using (var reader = new StreamReader(unzip, Encoding.UTF8))
decoded = reader.ReadToEnd();
bool test = decoded == saml;
The test variable is true. This means that the zip/base64/unbase64/unzip round trip performs correctly, so the error must occur later. Maybe the URL encoder destroys the data? Could you try a similar urlencode/urldecode test? Also, check how long the result is. It is possible that the resulting URL is truncated because of its length.
(edit: I've added a StreamReader instead of reading to arrays. Earlier my sample used bytes.Length to prepare the buffer and that could damage the test. Now the reading uses only the information from the compressed stream)
edit:
var saml = string.Format(sample, Guid.NewGuid());
var bytes = Encoding.UTF8.GetBytes(saml);
string middle;
using (var output = new MemoryStream())
{
using (var zip = new DeflateStream(output, CompressionMode.Compress))
zip.Write(bytes, 0, bytes.Length);
middle = Convert.ToBase64String(output.ToArray());
}
// MIDDLE is the thing that should be now UrlEncode'd
string decoded;
using (var input = new MemoryStream(Convert.FromBase64String(middle)))
using (var unzip = new DeflateStream(input, CompressionMode.Decompress))
using (var reader = new StreamReader(unzip, Encoding.UTF8))
decoded = reader.ReadToEnd();
bool test = decoded == saml;
This code produces a middle value that, once URL-encoded, passes through the debugger properly. DeflateStream comes from .NET's standard System.IO.Compression namespace. I don't know why the output of SharpZipLib's DeflaterOutputStream is not accepted by the debugger site. The compression itself clearly works, since the data decompresses correctly, so there must be some difference in the output format of the two implementations, but I cannot tell what it is.
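A likely explanation for the difference: SharpZipLib's DeflaterOutputStream, constructed with its defaults, wraps the deflate data in a zlib header and checksum, while System.IO.Compression.DeflateStream writes raw deflate data, which is what the SAML HTTP Redirect binding expects. The sketch below shows the full round trip including the URL encoding step. It is an illustration only: RedirectBindingCodec is a made-up name, and Uri.EscapeDataString is swapped in for HttpUtility.UrlEncode so that the example has no System.Web dependency.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class RedirectBindingCodec
{
    // XML -> raw DEFLATE -> Base64 -> URL-encoded query value.
    public static string Encode(string xml)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(xml);
        using (var output = new MemoryStream())
        {
            // The DeflateStream must be disposed before reading the buffer,
            // otherwise the final compressed block is not flushed.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(bytes, 0, bytes.Length);
            }
            string base64 = Convert.ToBase64String(output.ToArray());
            return Uri.EscapeDataString(base64); // escapes '+', '/' and '='
        }
    }

    // URL-encoded query value -> Base64 -> raw INFLATE -> XML.
    public static string Decode(string queryValue)
    {
        byte[] compressed = Convert.FromBase64String(Uri.UnescapeDataString(queryValue));
        using (var input = new MemoryStream(compressed))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(inflate, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Running the output of Encode back through Decode and comparing it with the input is the urlencode/urldecode test suggested earlier in this answer.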
The question at the top contains a "Decode SAMLResponse - Working Code" section, but that code seemed broken. After trying a few things, I discovered that it was trying to read and write to the same stream at the same time. I reworked it by separating the read and write streams and here is my solution (I am providing the request section for convenience and clarity):
Encode SAML Authentication Request:
public static string EncodeSamlAuthnRequest(this string authnRequest) {
var bytes = Encoding.UTF8.GetBytes(authnRequest);
using (var output = new MemoryStream()) {
using (var zip = new DeflateStream(output, CompressionMode.Compress)) {
zip.Write(bytes, 0, bytes.Length);
}
var base64 = Convert.ToBase64String(output.ToArray());
return HttpUtility.UrlEncode(base64);
}
}
Decode SAML Authentication Response:
public static string DecodeSamlAuthnRequest(this string encodedAuthnRequest) {
var utf8 = Encoding.UTF8;
var bytes = Convert.FromBase64String(HttpUtility.UrlDecode(encodedAuthnRequest));
using (var output = new MemoryStream()) {
using (var input = new MemoryStream(bytes)) {
using (var unzip = new DeflateStream(input, CompressionMode.Decompress)) {
unzip.CopyTo(output, bytes.Length);
unzip.Close();
}
return utf8.GetString(output.ToArray());
}
}
}

Compressing and decompressing a string yields only the first letter of the original string?

I'm compressing a string with Gzip using this code:
public static String Compress(String decompressed)
{
byte[] data = Encoding.Unicode.GetBytes(decompressed);
using (var input = new MemoryStream(data))
using (var output = new MemoryStream())
{
using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
{
input.CopyTo(gzip);
}
return Convert.ToBase64String(output.ToArray());
}
}
and decompressing it with this code:
public static String Decompress(String compressed)
{
byte[] data = Convert.FromBase64String(compressed);
using (MemoryStream input = new MemoryStream(data))
using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
gzip.CopyTo(output);
StringBuilder sb = new StringBuilder();
foreach (byte b in output.ToArray())
sb.Append((char)b);
return sb.ToString();
}
}
When I use these functions in this sample code, the result is only the letter S:
String test = "SELECT * FROM foods f WHERE f.name = 'chicken';";
String com = Compress(test);
String decom = Decompress(com);
Console.WriteLine(decom);
If I debug the code, I see that the value of decom is
S\0E\0L\0E\0C\0T\0 \0*\0 \0F\0R\0O\0M\0 \0f\0o\0o\0d\0s\0 \0f\0 \0W\0H\0E\0R\0E\0 \0f\0.\0n\0a\0m\0e\0 \0=\0 \0'\0c\0h\0i\0c\0k\0e\0n\0'\0;\0
but the value displayed is only the letter S.
These lines are the problem:
foreach (byte b in output.ToArray())
sb.Append((char)b);
You are interpreting each byte as its own character, when in fact that is not the case. Instead, you need the line:
string decoded = Encoding.Unicode.GetString(output.ToArray());
which will convert the byte array to a string, based on the encoding.
The basic problem is that you are converting to a byte array based on an encoding, but then ignoring that encoding when you retrieve the bytes. As well, you may want to use Encoding.UTF8 instead of Encoding.Unicode (though that shouldn't matter, as long as the encodings match up.)
In your compress method replace Unicode with UTF8:
byte[] data = Encoding.UTF8.GetBytes(decompressed);
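Put together, a version of both methods that uses the same encoding on each side could look like the sketch below. GzipText is a hypothetical class name; UTF-8 is chosen here, but any encoding works as long as Compress and Decompress agree.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipText
{
    // Text -> UTF-8 bytes -> gzip -> Base64 string.
    public static string Compress(string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            }
            return Convert.ToBase64String(output.ToArray());
        }
    }

    // Base64 string -> gzip bytes -> UTF-8 bytes -> text.
    public static string Decompress(string compressed)
    {
        byte[] data = Convert.FromBase64String(compressed);
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            // Decode with the same encoding that Compress used,
            // instead of casting each byte to a char.
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```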

Gzip uncompress from string error, The magic number in GZip header is not correct

I am trying to replicate the php function gzuncompress in C#
So far I have got part of the following code working; see the comments and code below.
I think the tricky bit happens during the conversion between byte[] and string.
How can I fix this, and what did I miss?
I am using a .NET 3.5 environment.
var plaintext = Console.ReadLine();
Console.WriteLine("string to byte[] then to string");
byte[] buff = Encoding.UTF8.GetBytes(plaintext);
var compress = GZip.GZipCompress(buff);
//Uncompress working below
try
{
var unpressFromByte = GZip.GZipUncompress(compress);
Console.WriteLine("uncompress successful by uncompress byte[]");
}catch
{
Console.WriteLine("uncompress failed by uncompress byte[]");
}
var compressString = Encoding.UTF8.GetString(compress);
Console.WriteLine(compressString);
var compressBuff = Encoding.UTF8.GetBytes(compressString);
Console.WriteLine(Encoding.UTF8.GetString(compressBuff));
//Uncompress not working below by using string
//The magic number in GZip header is not correct
try
{
var uncompressFromString = GZip.GZipUncompress(compressBuff);
Console.WriteLine("uncompress successful by uncompress string");
}
catch
{
Console.WriteLine("uncompress failed by uncompress string");
}
Code for the GZip class:
public static class GZip
{
public static byte[] GZipUncompress(byte[] data)
{
using (var input = new MemoryStream(data))
using (var gzip = new GZipStream(input, CompressionMode.Decompress))
using (var output = new MemoryStream())
{
gzip.CopyTo(output);
return output.ToArray();
}
}
public static byte[] GZipCompress(byte[] data)
{
using (var input = new MemoryStream(data))
using (var output = new MemoryStream())
{
using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
{
input.CopyTo(gzip);
}
return output.ToArray();
}
}
public static long CopyTo(this Stream source, Stream destination)
{
var buffer = new byte[2048];
int bytesRead;
long totalBytes = 0;
while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
{
destination.Write(buffer, 0, bytesRead);
totalBytes += bytesRead;
}
return totalBytes;
}
}
This is inappropriate:
var compressString = Encoding.UTF8.GetString(compress);
compress isn't a UTF-8-encoded piece of text. You should treat it as arbitrary binary data - which isn't appropriate to pass into Encoding.GetString. If you really need to convert arbitrary binary data into text, use Convert.ToBase64String (and then reverse with Convert.FromBase64String):
var compressString = Convert.ToBase64String(compress);
Console.WriteLine(compressString);
var compressBuff = Convert.FromBase64String(compressString);
That may or may not match what PHP does, but it's a safe way of representing arbitrary binary data as text, unlike treating the binary data as if it were valid UTF-8-encoded text.
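To see the difference concretely, the sketch below pushes the same bytes through both text representations and checks whether they come back unchanged. BinaryAsText and its methods are illustrative names only, and the sketch uses System.Linq for the comparison.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsText
{
    // True if the bytes are unchanged after being treated as UTF-8 text and re-encoded.
    public static bool SurvivesUtf8(byte[] data)
    {
        string text = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(text).SequenceEqual(data);
    }

    // True if the bytes are unchanged after a Base64 round trip.
    public static bool SurvivesBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        return Convert.FromBase64String(text).SequenceEqual(data);
    }
}
```

For the first four bytes of a typical gzip stream, new byte[] { 0x1F, 0x8B, 0x08, 0x00 }, SurvivesUtf8 returns false because 0x8B is not valid UTF-8 on its own and is replaced during decoding, while SurvivesBase64 returns true. That replacement is what corrupts the gzip magic number in the question's code.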
I am trying to replicate the php function gzuncompress in C#
Then use GZipStream or DeflateStream classes which are built into the .NET framework for this purpose.

How to determine size of string, and compress it

I'm currently developing an application in C# that uses Amazon SQS.
The size limit for a message is 8 KB.
I have a method that is something like:
public void QueueMessage(string message)
Within this method, I'd like to first of all, compress the message (most messages are passed in as json, so are already fairly small)
If the compressed string is still larger than 8kb, I'll store it in S3.
My question is:
How can I easily test the size of a string, and what's the best way to compress it?
I'm not looking for massive reductions in size, just something nice and easy - and easy to decompress the other end.
To know the "size" (in kb) of a string we need to know the encoding. If we assume UTF8, then it is (not including BOM etc) like below (but swap the encoding if it isn't UTF8):
int len = Encoding.UTF8.GetByteCount(longString);
As for packing it: I would suggest GZIP over the UTF8 bytes, optionally followed by base-64 if it has to be a string:
using (MemoryStream ms = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(ms, CompressionMode.Compress, true))
{
byte[] raw = Encoding.UTF8.GetBytes(longString);
gzip.Write(raw, 0, raw.Length);
gzip.Close();
}
byte[] zipped = ms.ToArray(); // as a BLOB
string base64 = Convert.ToBase64String(zipped); // as a string
// store zipped or base64
}
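Combining the two parts of this answer, the size check from the question could be sketched as follows. MessagePacker, Pack, Unpack and FitsInQueue are hypothetical names, and the 8 KB limit is simply the figure stated in the question rather than a value taken from any SDK.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class MessagePacker
{
    // Limit taken from the question; adjust to whatever the queue actually enforces.
    public const int MaxBytes = 8 * 1024;

    // Message -> UTF-8 bytes -> gzip.
    public static byte[] Pack(string message)
    {
        byte[] raw = Encoding.UTF8.GetBytes(message);
        using (var ms = new MemoryStream())
        {
            using (var gzip = new GZipStream(ms, CompressionMode.Compress, true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    // gzip -> UTF-8 bytes -> message.
    public static string Unpack(byte[] zipped)
    {
        using (var input = new MemoryStream(zipped))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    // Base64 uses only ASCII characters, so the string length equals its size in bytes.
    public static bool FitsInQueue(string message)
    {
        string base64 = Convert.ToBase64String(Pack(message));
        return base64.Length <= MaxBytes;
    }
}
```

QueueMessage could call FitsInQueue first and fall back to S3 when it returns false. Note that Base64 adds roughly a third to the compressed size, so the check is made on the Base64 text rather than on the raw gzip bytes.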
To decompress, pass the zipped bytes to this function. The best I could come up with was:
public static byte[] ZipToUnzipBytes(byte[] bytesContext)
{
byte[] arrUnZipFile = null;
// Guard kept from the original: inputs of 100 bytes or fewer are not decompressed.
if (bytesContext.Length > 100)
{
// The last 4 bytes of a gzip stream hold the uncompressed length (modulo 2^32, little-endian).
int length = BitConverter.ToInt32(bytesContext, bytesContext.Length - 4);
arrUnZipFile = new byte[length];
using (var inFile = new MemoryStream(bytesContext))
{
using (var decompress = new GZipStream(inFile, CompressionMode.Decompress, false))
{
// Read may return fewer bytes than requested, so loop until the buffer is full.
int offset = 0;
while (offset < length)
{
int read = decompress.Read(arrUnZipFile, offset, length - offset);
if (read <= 0)
{
break;
}
offset += read;
}
}
}
}
return arrUnZipFile;
}
