ICQ encoding of Special Characters - c#

I'm working with the ICQ protocol and I've run into a problem with special letters (e.g. diacritics). I read that ICQ uses a different encoding (CP-1251, if I remember correctly).
How can I decode a string into the correct encoding?
I've tried using the UTF8Encoding class, but without success.
I'm using the ICQ-sharp library.
private void ParseMessage (string uin, byte[] data)
{
    ushort capabilities_length = LittleEndianBitConverter.Big.ToUInt16 (data, 2);
    ushort msg_tlv_length = LittleEndianBitConverter.Big.ToUInt16 (data, 6 + capabilities_length);
    string message = Encoding.UTF8.GetString (data, 12 + capabilities_length, msg_tlv_length - 4);
    Debug.WriteLine(message);
}
If the contact is using the same client, it's OK, but if not, incoming and outgoing messages with diacritics are unreadable.
I've determined (using this -> https://stackoverflow.com/a/12853721/846232) that it's in BigEndianUnicode encoding. But if the string does not contain diacritics, decoding it as BigEndianUnicode gives unreadable output (Chinese characters), whereas UTF8 works fine on text without diacritics. I don't know how to make it decode correctly in all cases.

If UTF-8 kind of works (i.e. it works for English, or any US-ASCII characters), then you don't have UTF-16. Latin1 (or Windows-1252, Microsoft's variant), or e.g. Windows-1251 or Windows-1250 are perfectly possible though, since in all of these the first part, containing Latin letters without diacritics, is the same.
Decode like this:
var encoding = Encoding.GetEncoding("Windows-1250");
string message = encoding.GetString(data, 12 + capabilities_length, msg_tlv_length - 4);
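To illustrate why the choice of code page matters, here is a minimal sketch. It assumes .NET Core or .NET 5+, where the Windows-125x code pages have to be registered through CodePagesEncodingProvider first (on .NET Framework they are built in). The class and method names are made up for the example and are not part of ICQ-sharp.

```csharp
using System.Text;

public static class LegacyDecodeSketch
{
    static LegacyDecodeSketch()
    {
        // Assumption: .NET Core / .NET 5+. Legacy code pages such as
        // windows-1250 and windows-1251 are only available after this call.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: decode raw message bytes with a named code page.
    public static string Decode(byte[] data, string codePageName)
    {
        return Encoding.GetEncoding(codePageName).GetString(data);
    }
}
```

With the bytes 0x61 0xE8, both code pages decode 0x61 as "a", but 0xE8 becomes "č" (U+010D) under windows-1250 and "и" (U+0438) under windows-1251. So the ASCII part always looks fine, and only the letters above 0x7F reveal whether you picked the right code page.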

Related

In C#, how to convert the javascript's base64 string into a file

In Angular, I am building a form like this:
form = this.fb.group({
  product_Id: [0],
  product_Name: [''],
  product_Image: ['']
})
Now I want to pass the product image as a base64 string, so I convert it as follows:
fileSelected(files: FileList)
{
  let file = <File>files[0]
  let reader = new FileReader()
  reader.readAsDataURL(file)
  reader.onloadend = () => {
    this.form.get('product_Image').setValue(reader.result)
  }
}
On the ASP.NET Core side, I have implemented the logic below:
var path = Directory.GetCurrentDirectory() + "wwwroot\\Product\\test.jpg";
await File.WriteAllBytesAsync(path, Convert.FromBase64String(vm.Product_Image));
Now ASP.NET Core gives me the error: "The given format is not a valid base 64 string"
PS: I have also tried reader.readAsBinaryString(file), but the error is the same.
The error: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.
I don't know what's wrong.
As per the MDN page for readAsDataURL, you must remove the URI part before decoding. I quote:
Note: The blob's result cannot be directly decoded as Base64 without first removing the Data-URL declaration preceding the Base64-encoded data. To retrieve only the Base64 encoded string, first remove data:*/*;base64, from the result.
Try removing "data:image/jpeg;base64," from the vm.Product_Image string before calling
await File.WriteAllBytesAsync(path, Convert.FromBase64String(vm.Product_Image));
A sample base64 string (without the prefix) looks like this:
/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKC...
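A minimal sketch of stripping the prefix on the server side could look like this. The class and method names are hypothetical. It looks for the ";base64," marker rather than a fixed "data:image/jpeg;base64," prefix, so it should also cope with other MIME types, and it passes the string through unchanged when no marker is present.

```csharp
using System;

public static class DataUrlSketch
{
    // Hypothetical helper: accepts either a full data URL
    // ("data:image/jpeg;base64,....") or a bare base64 string.
    public static byte[] DecodeBase64Payload(string value)
    {
        const string marker = ";base64,";
        int index = value.IndexOf(marker, StringComparison.Ordinal);

        // Keep only what follows the marker; if there is no marker,
        // assume the whole string is already plain base64.
        string payload = index >= 0 ? value.Substring(index + marker.Length) : value;
        return Convert.FromBase64String(payload);
    }
}
```

The returned byte array can then be passed to File.WriteAllBytesAsync as in the question.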

PHP utf8 variable encoding (HMAC Key -> C# Server)

I'm trying to create a PHP client wrapper to talk to a .NET API. What I have is working but I am new to PHP development and what I have now looks like it may not work 100% of the time.
C# code I am trying to replicate:
private static void HMAC_Debug()
{
Console.WriteLine("Secret Key (Base64): 'qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs='");
var secret = Convert.FromBase64String("qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs=");
Console.WriteLine("Value To Hash (UTF8): 'MyHashingValue©'");
var value = Encoding.UTF8.GetBytes("MyHashingValue©");
using (HMACSHA256 hmac = new HMACSHA256(secret))
{
byte[] signatureBytes = hmac.ComputeHash(value);
string requestSignatureBase64String = Convert.ToBase64String(signatureBytes);
Console.WriteLine("Resulting Hash (Base64): '{0}'", requestSignatureBase64String);
}
Console.ReadLine();
}
My PHP equivalent:
$rawKey = base64_decode("qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs=");
// $hashValArr = unpack("C*", utf8_encode("MyHashingValue©"));
//
// $hashVal = call_user_func_array("pack", array_merge(array("C*"), $hashValArr));
$hashVal = "MyHashingValue©";
$raw = hash_hmac("sha256", $hashVal, $rawKey, TRUE);
$rawEnc = base64_encode($raw);
echo $rawEnc;
These two snippets produce the same Base64 output, but I am relying on the string variables in PHP being default encoded to UTF8 - is this a correct assumption or is there something more stable I can do?
You can see from the commented out PHP lines I attempted to manually encode it to UTF8 then extract out the ASCII bytes for the PHP HMAC function but it didn't produce the same output as the c# code.
Thanks
Marlon
Which version of PHP are you using?
In general you cannot rely on the encoding being UTF-8. Most likely you just saved the source file as UTF-8 (I guess without BOM). PHP strings are plain byte sequences: PHP does not work natively with Unicode and simply reads the bytes as they are stored in the file.
That said, if you do not manipulate the string, your example works because you are just processing the bytes that are stored in the variable. If this byte sequence happened to be a UTF-8 encoded string at the time you inserted it into your source code, it stays that way. This is also why the commented-out utf8_encode attempt gave a different result: utf8_encode assumes ISO-8859-1 input, so applying it to bytes that are already UTF-8 encodes the © a second time.
If you get the string from an arbitrary source, you should find out which encoding is used and consider the multibyte string processing functions of PHP, which can work with different encodings [1].
[1] http://us2.php.net/manual/en/ref.mbstring.php
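To see from the C# side why the byte encoding matters, here is a small sketch. The class and method names are made up for illustration. The HMAC is computed over bytes, not characters, so the same text signed as UTF-8 and as ISO-8859-1 gives two different signatures: "MyHashingValue©" is 16 bytes in UTF-8 (© is 0xC2 0xA9) but 15 bytes in ISO-8859-1 (© is 0xA9).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HmacEncodingSketch
{
    // Hypothetical helper: sign a string after turning it into bytes
    // with an explicitly chosen encoding.
    public static string Sign(byte[] key, string value, Encoding encoding)
    {
        using (var hmac = new HMACSHA256(key))
        {
            byte[] signature = hmac.ComputeHash(encoding.GetBytes(value));
            return Convert.ToBase64String(signature);
        }
    }
}
```

Calling Sign with Encoding.UTF8 and with Encoding.GetEncoding("ISO-8859-1") on the same key and value produces different Base64 strings, so both client and server have to agree on the encoding before hashing.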

Encoding issue when handling a string that contains "question mark" (�)

I am parsing some web content in a response from an HttpWebRequest.
This web content is using charset ISO-8859-1 and when parsing it and finally getting the word needed from the response, I am receiving a string with a question mark like this � and I want to know which is the right way to transform it back into a readable string.
So, what I've tried is to convert the current word encoding into UTF-8 like this:
(I am wondering if UTF-8 could solve my problem)
string word = "ESPA�OL";
Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf = Encoding.GetEncoding("UTF-8");
byte[] isoBytes = iso.GetBytes(word);
byte[] utfBytes = Encoding.Convert(iso, utf, isoBytes);
string utfWord = utf.GetString(utfBytes);
Console.WriteLine(utfWord);
However, utfWord variable outputs ESPA?OL which is still wrong. The correct output is supposed to be ESPAÑOL.
Can someone please give me the right directions to solve this, if possible?
The word in question is "ESPAÑOL". This can be encoded correctly in ISO-8859-1 since all characters in the word are represented in ISO-8859-1.
You can see this for yourself using the following simple program:
using System;
using System.Diagnostics;
using System.Text;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Encoding enc = Encoding.GetEncoding("ISO-8859-1");
            string original = "ESPAÑOL";
            byte[] iso_8859_1 = enc.GetBytes(original);
            string roundTripped = enc.GetString(iso_8859_1);
            Debug.Assert(original == roundTripped);
            Console.WriteLine(roundTripped);
        }
    }
}
What this tells you is that you need to properly diagnose where the erroneous character comes from. By the time that you have a � character, it is too late. The information has been lost. The presence of the � character indicates that, at some point, a conversion was performed into a character set that did not contain the character Ñ.
A conversion from ISO-8859-1 to a Unicode encoding will correctly handle "ESPAÑOL" because that word can be encoded in ISO-8859-1.
The most likely explanation is that somewhere along the way, the text "ESPAÑOL" is being converted to a character set that does not contain the letter Ñ.
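One common way this happens, offered as an assumption about the question's setup rather than a confirmed diagnosis, is that the ISO-8859-1 response bytes are decoded as UTF-8. The sketch below uses made-up names and shows the effect on the raw bytes of "ESPAÑOL": the lone byte 0xD1 is not valid UTF-8, so the decoder substitutes U+FFFD (�), and the original letter cannot be recovered from the resulting string.

```csharp
using System.Text;

public static class ReplacementCharSketch
{
    // "ESPAÑOL" as ISO-8859-1 bytes: Ñ is the single byte 0xD1.
    public static readonly byte[] Latin1Bytes =
        { 0x45, 0x53, 0x50, 0x41, 0xD1, 0x4F, 0x4C };

    // Wrong: treats the bytes as UTF-8, so 0xD1 turns into U+FFFD.
    public static string DecodeAsUtf8()
    {
        return Encoding.UTF8.GetString(Latin1Bytes);
    }

    // Right: decodes with the charset the bytes were written in.
    public static string DecodeAsLatin1()
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(Latin1Bytes);
    }

    // True if the string already contains the replacement character.
    public static bool IsDamaged(string text)
    {
        return text.IndexOf('\uFFFD') >= 0;
    }
}
```

The fix therefore belongs at the point where the response bytes are first turned into a string, for example by passing the ISO-8859-1 encoding to the reader that consumes the response stream.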

Converting windows-1252 encoding to UTF-8 in Silverlight

In my Silverlight application I am getting an XML file encoded with windows-1252.
My problem is that it won't display correctly until the windows-1252 string is converted to a UTF-8 string.
In a normal C# environment that wouldn't be much of a problem. There I could do something like this:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = ReadFile(Server.MapPath(HtmlFile));
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
(Convert a string's character encoding from windows-1252 to utf-8)
But Silverlight doesn't support windows-1252 - it is Unicode only.
PS
I stumbled upon "Encoding for Silverlight" http://encoding4silverlight.codeplex.com/ - but it seems there is no support for windows-1252 there either?
EDIT:
I solved my problem on the "Server Side" - The actual problem is still open.
Encoding for Silverlight is a third-party encoding library, but for now it only supports DBCS (Double-Byte Character Set) encodings, whereas windows-1252 is an SBCS (Single-Byte Character Set).
However, you can write an encoder/decoder for Encoding for Silverlight yourself; I think it would be very easy.
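As a rough sketch of how small such a decoder can be: windows-1252 matches Unicode code points one to one for bytes 0x00-0x7F and 0xA0-0xFF, so only the 32 bytes in 0x80-0x9F need a lookup table. The class below is a standalone illustration with made-up names; it is not written against the Encoding for Silverlight API, and it maps the five bytes that windows-1252 leaves undefined to the matching C1 control code points.

```csharp
using System.Text;

public static class Windows1252DecoderSketch
{
    // Unicode code points for the windows-1252 bytes 0x80..0x9F.
    // 0x81, 0x8D, 0x8F, 0x90 and 0x9D are undefined in windows-1252
    // and are passed through here as the matching C1 controls.
    private static readonly char[] HighTable =
    {
        '\u20AC', '\u0081', '\u201A', '\u0192', '\u201E', '\u2026', '\u2020', '\u2021',
        '\u02C6', '\u2030', '\u0160', '\u2039', '\u0152', '\u008D', '\u017D', '\u008F',
        '\u0090', '\u2018', '\u2019', '\u201C', '\u201D', '\u2022', '\u2013', '\u2014',
        '\u02DC', '\u2122', '\u0161', '\u203A', '\u0153', '\u009D', '\u017E', '\u0178'
    };

    public static string Decode(byte[] bytes)
    {
        var result = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
        {
            // Outside 0x80..0x9F the byte value equals the code point.
            result.Append(b >= 0x80 && b <= 0x9F ? HighTable[b - 0x80] : (char)b);
        }
        return result.ToString();
    }
}
```

The resulting .NET string is already Unicode, so no separate conversion to UTF-8 is needed before handing it to an XML parser or displaying it.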

Encoding issue in .NET

I have a requirement to encode and decode Japanese characters. I tried it in Java and it worked fine with the "Cp939" encoding, but I am unable to find that encoding in .NET. The 932 encoding doesn't encode all the characters, so I need to find a way of implementing the 939 encoding in .NET.
Java Code :
convStr = new String(str8859_1.getBytes("Cp037"), "Cp939");
.NET :
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(932), bytesConverted);
// This result is a junk of characters and is totally different
// from the expected output 'ニツポンバ'
convStr = Encoding.GetEncoding(1252).GetString(bytesConverted);
The encoded bytes are in the encoding 932, so why are you using the encoding 1252 when you convert the encoded bytes to a string?
The following should work:
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(932), bytesConverted);
// Decode with the same encoding the bytes were converted to (932),
// not 1252 as in the question
convStr = Encoding.GetEncoding(932).GetString(bytesConverted);
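The general rule is that bytes produced by Encoding.Convert have to be decoded with the target encoding. Here is a minimal sketch with a made-up helper; the usage below picks ISO-8859-1 and UTF-8 purely as stand-ins for code pages 37 and 932, so that it runs without registering any extra code page providers.

```csharp
using System.Text;

public static class ConvertSketch
{
    // Hypothetical helper: convert bytes from one encoding to another,
    // then decode the result with the *target* encoding.
    public static string Transcode(byte[] sourceBytes, Encoding source, Encoding target)
    {
        byte[] converted = Encoding.Convert(source, target, sourceBytes);
        return target.GetString(converted);
    }
}
```

For example, Transcode(new byte[] { 0xD1 }, Encoding.GetEncoding("ISO-8859-1"), Encoding.UTF8) returns "Ñ". Decoding the converted bytes with any other encoding, as the question did with 1252, reinterprets them and yields garbage.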
Is this an error or just how you typed it?
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(932), bytesConverted);
should be:
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(939), bytesConverted);
Surely?
