convert byte array to string but not with Convert.ToBase64 - c#

Hi all,
I have a byte array that is returned from a web server; it is the value of a property in a JSON-serialized object.
It looks like this in the JSON string:
,"n":"y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z-tzPbtHyFVYYU2AC4SVrwQp9-ELz-..._xW3bmMxuwoBgHpWDTw"
Please note that there is no double equals sign at the end, as there would be for a Base64 string. I've used three dots (...) to shorten the string representation.
I can deserialize the object and get the proper byte array:
var kb = JsonConvert.DeserializeObject<KeyBundle>(Properties.Resources.keyBundleJson);
And I can serialize it back to JSON:
JsonSerializerSettings settings = new JsonSerializerSettings
{
TypeNameHandling = TypeNameHandling.None,
Formatting = Formatting.Indented
};
string json = JsonConvert.SerializeObject(kb, settings);
But the problem is that the resulting property value does not match the original string.
From the web server it was:
y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z-tzPbtHyFVYYU2AC4SVrwQp9-ELz-..._xW3bmMxuwoBgHpWDTw
Serialized locally:
y1GpP7FibyTYl40Jhx1B90WOi1mecJfpi4IEhbHPbAB64jhV16UlpEPyGpNIzDS4Lct80sIs7FW5Vnf38Z+tzPbtHyFVYYU2AC4SVrwQp9+ELz+.../xW3bmMxuwoBgHpWDTw==
The differences: underscores became slashes, minus signs became plus signs, and two equals signs were added at the end.
Is it possible to serialize the byte array exactly as the web server does?
My idea is to serialize it with Json.NET and then replace plus with minus, slash with underscore, and remove the two trailing equals signs.
Is there any other method that does this out of the box?
Regards

URLs use a different variant of Base64 (often called base64url) that uses - and _ in place of + and /, so it doesn't require additional encoding (e.g. + would otherwise be encoded as %2B). To produce it, you can simply use the string Replace method to swap those characters and trim the trailing = padding.
If you want an out-of-the-box solution, you can try the Microsoft.IdentityModel.Tokens NuGet package:
var encoded = Base64UrlEncoder.Encode(someString);
var decoded = Base64UrlEncoder.Decode(encoded);
For more info: https://en.wikipedia.org/wiki/Base64#URL_applications
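The Replace approach mentioned above can be sketched with the standard library alone. This is a minimal sketch, not the web server's actual code: the class name Base64Url and its two methods are hypothetical helpers, and it assumes the server uses the unpadded base64url alphabet, as the sample value in the question suggests.

```csharp
using System;

// Hypothetical helper (not part of Json.NET or Microsoft.IdentityModel.Tokens).
public static class Base64Url
{
    // Encode bytes as base64url: '+' becomes '-', '/' becomes '_', '=' padding is dropped.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Decode base64url by restoring the standard alphabet and the padding.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
            case 1: throw new FormatException("Invalid base64url length.");
        }
        return Convert.FromBase64String(s);
    }
}
```

To make Json.NET emit this form directly, one option would be to call such a helper from a custom JsonConverter for the byte[] property, so the replacement does not have to be applied to the finished JSON text.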

Related

Characters added and wrong output during serialization with Json.NET

JSON.NET seems to serialize my code into what appear to be strings, instead of objects. Here's an example of what it returns:
"{\"kvk_nummer\":11111111,\"onderneming\":\"berijf B.V.\",\"vestigingsplaats\":\"AMSTERDAM\",\"actief\":1}"
It also adds strange backslashes. I tried to get rid of them, but none of the answers I've found have helped. Here is the code that returns the string:
getregister r = new getregister
{
kvk_nummer = col1, //contains an 8 digit number
onderneming = checkTotaal[col1], //contains a name
vestigingsplaats = checkTotaal2[col1], //contains a location
actief = 1 // flag that represents whether the company is active or not
};
yield return JsonConvert.SerializeObject(r);
How can I get JSON.NET to output an object instead of a JSON string?
Looks like you're confusing some stuff. Taken from Serialization (C#)
Serialization is the process of converting an object into a stream of bytes to store the object or transmit it to memory, a database, or a file. Its main purpose is to save the state of an object in order to be able to recreate it when needed. The reverse process is called deserialization.
When you serialize into JSON, you get a JSON representation of your object. Which is a string representation. Taken from the JSON Wikipedia page:
JavaScript Object Notation or JSON is an open-standard file format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value).
In short: your code is doing what you're asking it to do. As far as the slashes go: those are escape characters. If you want (JSON.NET to return) an object, return the object you're creating (r).
return new getregister
{
kvk_nummer = col1, //contains an 8 digit number
onderneming = checkTotaal[col1], //contains a name
vestigingsplaats = checkTotaal2[col1], //contains a location
actief = 1 // flag that represents whether the company is active or not
};
If you're looking for a way to have JSON.NET return an object, you should look into deserializing, since that takes the string representation (JSON) of your object and turns it back into an actual object for you.
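A minimal sketch of where the backslashes can come from, assuming the string returned by the question's method is serialized a second time by whatever consumes it (a common cause, but only an assumption here). The Register class and DoubleEncodingDemo are hypothetical names for illustration; only JsonConvert.SerializeObject and JsonConvert.DeserializeObject come from Json.NET.

```csharp
using Newtonsoft.Json;

// Hypothetical stand-in for the question's getregister type.
public class Register
{
    public int kvk_nummer { get; set; }
    public string onderneming { get; set; }
}

public static class DoubleEncodingDemo
{
    // One pass: the object becomes JSON text.
    public static string SerializeOnce(Register r)
    {
        return JsonConvert.SerializeObject(r);
    }

    // Two passes: the JSON text is treated as a plain string value,
    // so it is wrapped in quotes and its inner quotes are escaped.
    public static string SerializeTwice(Register r)
    {
        return JsonConvert.SerializeObject(JsonConvert.SerializeObject(r));
    }

    // Deserializing turns the JSON text back into an object.
    public static Register Parse(string json)
    {
        return JsonConvert.DeserializeObject<Register>(json);
    }
}
```

If the second pass is what happens in the question's setup, returning the object itself (as the answer suggests) leaves only one serialization step and the escapes disappear.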

Why am I getting two different 'formats' of hex in my bytes while evaluating an HMAC?

I'm getting a signed payload from an authentication source that comes in a base64 encoded and URL encoded format. I'm getting confused somewhere while evaluating, and ending up with similar data in different 'formats'.
Here's my code:
//Split the message to payload and signature
string[] split = raw_message.Split('.');
//Payload
string base64_payload = WebUtility.UrlDecode(split[0]);
byte[] payload = Convert.FromBase64String(base64_payload);
//Expected signature
string base64_expected_sig = WebUtility.UrlDecode(split[1]);
byte[] expected_sig = Convert.FromBase64String(base64_expected_sig);
//Signature
byte[] signature = hmacsha256.ComputeHash(payload);
//Output as a string
var foo = System.Text.Encoding.UTF8.GetString(expected_sig);
var bar = BitConverter.ToString(signature);
The expected signature (foo) comes out like so:
76eba09fcb54877299dcbd1e1e35717e3bd42e066e7ecdb131c7d0161dec3418
The computed signature (bar) is as follows:
76-EB-A0-9F-CB-54-87-72-99-DC-BD-1E-1E-35-71-7E-3B-D4-2E-06-6E-7E-CD-B1-31-C7-D0-16-1D-EC-34-18
Obviously, when comparing bytes for bytes, this doesn't work.
I see that I'm having to convert the expected_sig and the signature in different ways to get them to display as a string, but I can't figure out how I need to change the expected signature to get to where I can compare bytes for bytes.
I can obviously work around the issue by simply converting the string bar, but that's dirty and I just don't like it.
Where am I going wrong here? What am I not understanding?
The good news is that the hash computation appears to be working.
The bad news is that you're receiving the hash in a brain-dead fashion. For some reason it seems that the authors decided it was a good idea to:
1. Compute the hash (fine)
2. Convert this binary data to text as hex (fine)
3. Convert the hex back into binary data by applying ASCII/UTF-8/anything-ASCII-compatible encoding (why?)
4. Convert the result back into text using base64 (what?)
5. URL-encode the result (which wouldn't even be necessary with hex...)
Using either base64 or hex on the original binary makes sense, but applying both is crazy.
Anyway, it's fairly easy for you to do the same thing. For example:
string hexSignature = string.Join("", signature.Select(b => b.ToString("x2")));
byte[] hexSignatureUtf8 = Encoding.UTF8.GetBytes(hexSignature);
string finalSignature = Convert.ToBase64String(hexSignatureUtf8);
That should now match WebUtility.UrlDecode(split[1]).
Alternatively, you can work backwards from what's in the result, but I wouldn't go as far as parsing the hex back to bytes - it would be simpler to keep the first line of the above, but use:
string expectedHexBase64 = WebUtility.UrlDecode(split[1]);
byte[] expectedHexUtf8 = Convert.FromBase64String(expectedHexBase64);
string expectedHex = Encoding.UTF8.GetString(expectedHexUtf8);
Then compare it with hexSignature.
Ideally, you should talk to whoever's providing you with the crazy format and hit them with a cluestick though...
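Putting the pieces of this answer together, a verification helper might look like the sketch below. It assumes the message has exactly two dot-separated parts, that the sender writes the hex in lowercase (as in the sample output above), and that CryptographicOperations.FixedTimeEquals is available (.NET Core 2.1 or later). SignedMessage, ToHex, Verify, and the key parameter are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Net;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for the "hex inside base64 inside URL encoding" format.
public static class SignedMessage
{
    // Lowercase hex without separators, e.g. { 0x76, 0xEB } -> "76eb".
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }

    // Returns true when the signature part matches the HMAC-SHA256 of the payload part.
    public static bool Verify(string rawMessage, byte[] key)
    {
        string[] split = rawMessage.Split('.');
        if (split.Length != 2) return false;

        byte[] payload = Convert.FromBase64String(WebUtility.UrlDecode(split[0]));
        // These bytes are the UTF-8 text of the hex digits, not the raw hash.
        byte[] expectedHexUtf8 = Convert.FromBase64String(WebUtility.UrlDecode(split[1]));

        using (var hmac = new HMACSHA256(key))
        {
            byte[] actualHexUtf8 = Encoding.UTF8.GetBytes(ToHex(hmac.ComputeHash(payload)));
            // Compare in constant time to avoid leaking where the first mismatch is.
            return CryptographicOperations.FixedTimeEquals(actualHexUtf8, expectedHexUtf8);
        }
    }
}
```

Comparing the hex text rather than parsing it back to bytes follows the answer's suggestion; if the sender ever switches to uppercase hex, both sides would need to be normalized first.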

unexpected non-whitespace character after JSON data

string result="12334,23432,3453455";
I am getting this string through Ajax call but it gives me the following error:
"unexpected non-whitespace character after JSON data"
When I remove the commas between the numbers it works fine. How should I handle this? I want to put the value, with its commas, into a textarea after the Ajax call.
Whatever's outputting that isn't doing so in JSON format, but more like CSV.
A few options:
1. If you're able, fix the output method to correctly output JSON.
2. Parse the string like a CSV, e.g. "12334,23432,3453455".split(',')
3. Conform the output to JSON first, then parse, e.g. JSON.parse("["+"12334,23432,3453455"+"]") (wrap with [])
4. Specify dataType:'text' in your $.ajax call.
Options 1 and 3 of the above would result in [12334,23432,3453455] as a JavaScript array of numbers, Option 2 gives an array of strings (["12334","23432","3453455"]), and Option 4 will simply result in "12334,23432,3453455" as a string.
BTW, using JSON.NET, this is what it should result in:
// As an array:
Int32[] ary = new[]{ 12334, 23432, 3453455 };
Console.WriteLine(JsonConvert.SerializeObject(ary));
// [12334,23432,3453455]
// As a string:
String str = "12334,23432,3453455";
Console.WriteLine(JsonConvert.SerializeObject(str));
// "12334,23432,3453455"
Your data has to be parseable by your JSON parser.
If your data is an array, your string should look like:
"[12334,23432,3453455]"
or, if it should be a string:
"\"12334,23432,3453455\""

C# UTF8 encoding

I have a C# program that retrieves some JSON data and uses Newtonsoft JSON to deserialize it.
Because I use Persian characters in my program, the JSON shows them escaped, like this: \u060c \u067e\u0644\u0627\u06a9 .... After I retrieve the JSON data in my program, these characters still appear in their escaped form, but after I deserialize it they are converted to ???? characters.
What should I do?
Your JSON deserializer is broken; \uXXXX is supposed to be turned into proper characters.
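To do that yourself, use this function:
// Turns every occurrence of \uXXXX into the corresponding character.
// Requires: using System.Globalization; using System.Text.RegularExpressions;
static string UnencodeJSONUnicode(string str) {
    return Regex.Replace(str,
        @"\\u(?<value>[0-9a-fA-F]{4})",
        match => {
            string digits = match.Groups["value"].Value;
            int number = int.Parse(digits, NumberStyles.HexNumber);
            // Cast instead of char.ConvertFromUtf32, which throws for surrogate halves (D800-DFFF)
            return ((char)number).ToString();
        });
}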
(Untested code; I don't have VS available at the moment. Some exception handling would probably be nice too)
Looks like it has been JSON encoded, so you need to decode it. The DataContractJsonSerializer class can do this.
See this MSDN link for more information.

Can we simplify this string encoding code

Is it possible to simplify this code into a cleaner/faster form?
StringBuilder builder = new StringBuilder();
var encoding = Encoding.GetEncoding(936);
// convert the text into a byte array
byte[] source = Encoding.Unicode.GetBytes(text);
// convert that byte array to the new codepage.
byte[] converted = Encoding.Convert(Encoding.Unicode, encoding, source);
// take multi-byte characters and encode them as separate ascii characters
foreach (byte b in converted)
builder.Append((char)b);
// return the result
string result = builder.ToString();
Simply put, it takes a string with Chinese characters such as 鄆 and converts them to ài.
For example, that Chinese character in decimal is 37126 or 0x9106 in hex.
See http://unicodelookup.com/#0x9106/1
Converted to a byte array, we get [145, 6] (145 * 256 + 6 = 37126). When encoded in CodePage 936 (simplified Chinese), we get [224, 105]. If we break this byte array down into individual characters, we get 224=0xE0=à and 105=0x69=i in Unicode.
See http://unicodelookup.com/#0x00e0/1
and
http://unicodelookup.com/#0x0069/1
Thus, we're doing an encoding conversion and ensuring that all characters in our output Unicode string can be represented using at most two bytes.
Update: I need this final representation because this is the format my receipt printer is accepting. Took me forever to figure it out! :) Since I'm not an encoding expert, I'm looking for simpler or faster code, but the output must remain the same.
Update (Cleaner version):
return Encoding.GetEncoding("ISO-8859-1").GetString(Encoding.GetEncoding(936).GetBytes(text));
Well, for one, you don't need to convert the "built-in" string representation to a byte array before calling Encoding.Convert.
You could just do:
byte[] converted = Encoding.GetEncoding(936).GetBytes(text);
To then reconstruct a string from that byte array whereby the char values directly map to the bytes, you could do...
static string MangleTextForReceiptPrinter(string text) {
return new string(
Encoding.GetEncoding(936)
.GetBytes(text)
.Select(b => (char) b)
.ToArray());
}
I wouldn't worry too much about efficiency; how many MB/sec are you going to print on a receipt printer anyhow?
Joe pointed out that there's an encoding that directly maps byte values 0-255 to code points, and it's age-old Latin1, which allows us to shorten the function to...
return Encoding.GetEncoding("Latin1").GetString(
Encoding.GetEncoding(936).GetBytes(text)
);
By the way, if this is a buggy windows-only API (which it is, by the looks of it), you might be dealing with codepage 1252 instead (which is almost identical). You might try reflector to see what it's doing with your System.String before it sends it over the wire.
Almost anything would be cleaner than this - you're really abusing text here, IMO. You're trying to represent effectively opaque binary data (the encoded text) as text data... so you'll potentially get things like bell characters, escapes etc.
The normal way of encoding opaque binary data in text is base64, so you could use:
return Convert.ToBase64String(Encoding.GetEncoding(936).GetBytes(text));
The resulting text will be entirely ASCII, which is much less likely to cause you hassle.
EDIT: If you need that output, I would strongly recommend that you represent it as a byte array instead of as a string... pass it around as a byte array from that point onwards, so you're not tempted to perform string operations on it.
Does your receipt printer have an API that accepts a byte array rather than a string?
If so you may be able to simplify the code to a single conversion, from a Unicode string to a byte array using the encoding used by the receipt printer.
Also, if you want to convert an array of bytes to a string whose character values correspond 1-1 to the values of the bytes, you can use the code page 28591 aka Latin1 aka ISO-8859-1.
I.e., the following
foreach (byte b in converted)
builder.Append((char)b);
string result = builder.ToString();
can be replaced by:
// All three of the following are equivalent
// string result = Encoding.GetEncoding(28591).GetString(converted);
// string result = Encoding.GetEncoding("ISO-8859-1").GetString(converted);
string result = Encoding.GetEncoding("Latin1").GetString(converted);
Latin1 is a useful encoding when you want to encode binary data in a string, e.g. to send through a serial port.
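The 1-to-1 mapping between byte values and characters can be checked with a small round trip. This is only a sketch: ByteString, FromBytes, and ToBytes are hypothetical names, and the code page 936 step is deliberately left out because on .NET Core and later Encoding.GetEncoding(936) generally needs the System.Text.Encoding.CodePages provider registered first, whereas ISO-8859-1 is built in.

```csharp
using System;
using System.Text;

// Hypothetical helper: treats each byte as one character via ISO-8859-1 (code page 28591).
public static class ByteString
{
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Each byte value 0-255 becomes the character with the same code point.
    public static string FromBytes(byte[] bytes)
    {
        return Latin1.GetString(bytes);
    }

    // Reverses FromBytes; characters above U+00FF cannot be represented
    // and would be replaced by the encoding's fallback character.
    public static byte[] ToBytes(string text)
    {
        return Latin1.GetBytes(text);
    }
}
```

With the bytes [224, 105] from the question, FromBytes returns the two characters à and i, and ToBytes gives the original bytes back, which is why this representation survives being passed through a string-based API.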
