Can't decode text

Example:
"Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ"
I tried this:
var latinString = "Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ"; // år
Encoding latinEncoding = Encoding.GetEncoding("iso-8859-1");
Encoding utf8Encoding = Encoding.GetEncoding("WINDOWS-1252");
byte[] latinBytes = latinEncoding.GetBytes(latinString);
byte[] utf8Bytes = Encoding.Convert(latinEncoding, utf8Encoding, latinBytes);
var utf8String = Encoding.UTF8.GetString(utf8Bytes);
But it doesn't work. The output is:
�?аполни п�?о�?ил�?
This is Russian text. How can I decode it correctly?

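It seems latinString is UTF-8 text whose bytes were decoded as Windows-1252. Encode it back to Windows-1252 to recover the original bytes, then decode those bytes as UTF-8:
// Uncomment on .NET Core or .NET 5+ (code page encodings are not registered by default)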
// Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var latinString = "Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ";
string result = Encoding.UTF8.GetString(
Encoding.GetEncoding(1252).GetBytes(latinString));
// Let's have a look
Console.Write(result);
Outcome:
Заполни профиль
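To see why this works, here is a minimal round-trip sketch. The class and method names (MojibakeDemo, Garble, Repair) are made up for illustration, and it assumes .NET Core 3.0 or later, where CodePagesEncodingProvider is available. Garble reproduces the damage and Repair is the same fix as above:

```csharp
using System.Text;

public static class MojibakeDemo
{
    // Reproduces the damage: UTF-8 bytes wrongly decoded as Windows-1252.
    public static string Garble(string text)
    {
        // Needed on .NET Core / .NET 5+; registering more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding(1252).GetString(utf8Bytes);
    }

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    public static string Repair(string garbled)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] originalBytes = Encoding.GetEncoding(1252).GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

The attempt in the question fails because ISO-8859-1 has no mapping for characters such as — or €, so those bytes are lost when the string is encoded back.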

Related

Websocket having Error: Frame must be terminated with a null octet while using Cyrillic instead of english

I have a problem sending Cyrillic (Russian) text instead of English text to a server (Java Spring Boot, UTF-8). My frame examples are below. The English one works fine, but for the Cyrillic one the position of the null octet is calculated wrongly. I am using websocket-csharp-net-stomp-client.
I have also tried changing the encoding of the message string to UTF-8.
The one that works:
The one that does not work:
public static string SendMessage(string messageText, string chatID)
{
Encoding utf16 = Encoding.GetEncoding("utf-16"); //also tried encode by 1251 instead of utf-16
Encoding utf8 = Encoding.UTF8;
byte[] utf8Bytes = utf8.GetBytes(messageText);
byte[] utf16Bytes = Encoding.Convert(utf8, utf16, utf8Bytes);
string msg = utf16.GetString(utf16Bytes);
StompMessageSerializer serializer = new StompMessageSerializer();
var content = new MessageContent() { text = msg };
var broad = new StompMessage("SEND", JsonConvert.SerializeObject(content));
broad["token"] = $"{Global.AuthCompTokenFinal}";
broad["contentType"] = "application/json";
broad["destination"] = $"/app/send/{chatID}";
var str = serializer.Serialize(broad);
Console.WriteLine(str);
Global.ws.Send(str);
return str;
}
The content length is set here (library source):
internal StompMessage(string command, string body, Dictionary<string, string> headers)
{
stompCommand = command;
Body = body;
nativeHeaders = headers;
this["content-length"] = body.Length.ToString();
}
What am I missing here?
Here is an error example:

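I deleted the line this["content-length"] = body.Length.ToString(); along with the conversion to UTF-8, and it works fine. The header was wrong because body.Length counts UTF-16 characters, while STOMP's content-length is the number of bytes in the body. For Cyrillic text sent as UTF-8 the two differ.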
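If you would rather keep the header than delete it, a sketch of the difference between the two counts looks like this. StompContentLength is a hypothetical helper, not part of the library, and it assumes the body goes over the wire as UTF-8:

```csharp
using System.Text;

public static class StompContentLength
{
    // What the library does: counts UTF-16 code units.
    public static int CharCount(string body) => body.Length;

    // What content-length should hold: the size of the body in bytes.
    public static int ByteCount(string body) => Encoding.UTF8.GetByteCount(body);
}
```

For "hello" both return 5, but for "привет" CharCount returns 6 and ByteCount returns 12, which is why only the Cyrillic frames were rejected.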

ASP.NET SOAP Webservice ,Encode Problem in Exception

Here is my problem: I'm trying to encode the response of my web service with the following code.
public static string ConvertToUTF8(string Cadena)
{
string mensajeex = Cadena;
Encoding utf8 = Encoding.UTF8;
Encoding unicode = Encoding.Unicode;
// Convert the string into a byte array.
byte[] unicodeBytes = unicode.GetBytes(mensajeex);
// Perform the conversion from one encoding to the other.
byte[] asciiBytes = Encoding.Convert(unicode, utf8, unicodeBytes);
// Convert the new byte[] into a char[] and then into a string.
char[] asciiChars = new char[utf8.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
utf8.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
string Utf8string = new string(asciiChars);
// Display the strings created before and after the conversion.
Console.WriteLine("Original string: {0}", mensajeex);
Console.WriteLine("Ascii converted string: {0}", Utf8string);
return Utf8string;
}
This actually works. But when I encode a string and then pass it as the Message of an exception, like this:
throw new Exception(XMLHelper.ConvertToUTF8(Message));
the response comes out wrong, like:
El valor 'R' no es válido seg&#250
Any ideas? Thanks
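A side note on the helper above: a .NET string is always UTF-16 internally, so converting its bytes to UTF-8 and decoding them again should give back an identical string. A minimal sketch to check this (RoundTripCheck and ViaUtf8 are made-up names for illustration):

```csharp
using System.Text;

public static class RoundTripCheck
{
    // Same steps as ConvertToUTF8: UTF-16 bytes -> UTF-8 bytes -> string.
    public static string ViaUtf8(string text)
    {
        byte[] unicodeBytes = Encoding.Unicode.GetBytes(text);
        byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, unicodeBytes);
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

If the result always equals the input, then the &#250 in the response probably comes from how the fault message is serialized rather than from the string itself, though that part is a guess.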

C# equivalent to parse cryptojs

I'm trying to write C# code that does the same as this CryptoJS code:
var hash = CryptoJS.HmacSHA512(msg, key);
var crypt = CryptoJS.enc.Utf8.parse(hash.toString());
var base64 = CryptoJS.enc.Base64.stringify(crypt);
My question is about the second statement, where the hash variable is converted to a string and then parsed.
Is there an equivalent in C#? Once it is parsed, how do you encode the result as UTF-8?
Thanks

I'm not 100% sure I understand which piece you are looking for here, but there is no such thing as a UTF-8 System.String in C#. However, when you write a string to a stream, you can choose UTF-8 as the encoding of the bytes in the stream,
for example by passing that encoding as an option to a StreamWriter:
using (StreamWriter writer = new StreamWriter(stream, Encoding.UTF8)) {
writer.Write(text);
}

My boss found the answer to this. The difference is that in C# you have to convert the hash bytes to a hexadecimal string before you produce the Base64 string.
var encoder = new UTF8Encoding();
byte[] keyBytes = encoder.GetBytes(key);
var newlinemsg = action + "\n" + msg;
byte[] messageBytes = encoder.GetBytes(newlinemsg);
byte[] hashBytes = new HMACSHA512(keyBytes).ComputeHash(messageBytes);
var hexString = ToHexString(hashBytes);
var base64 = Convert.ToBase64String(encoder.GetBytes(hexString));
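ToHexString is not shown above, so here is a minimal self-contained sketch of the whole thing. The helper body is my assumption: it produces lowercase hex, which is what CryptoJS gives when a hash is turned into a string with toString(). The class and method names are made up for illustration:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class CryptoJsCompat
{
    // Assumed implementation of the ToHexString helper: lowercase hex, no separators.
    public static string ToHexString(byte[] bytes) =>
        BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();

    // HMAC-SHA512 -> hex string -> UTF-8 bytes of that string -> Base64.
    public static string HmacSha512HexBase64(string msg, string key)
    {
        var encoder = new UTF8Encoding();
        using (var hmac = new HMACSHA512(encoder.GetBytes(key)))
        {
            byte[] hashBytes = hmac.ComputeHash(encoder.GetBytes(msg));
            string hexString = ToHexString(hashBytes);
            return Convert.ToBase64String(encoder.GetBytes(hexString));
        }
    }
}
```

Note that this Base64-encodes the 128 hex characters, not the 64 raw hash bytes, which is why Convert.ToBase64String(hashBytes) alone does not match the CryptoJS result.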

convert a string from ISO-8859-5 to UTF8

I'm writing an application for Windows Mobile. I use a scanner and get a string encoded in ISO-8859-5. How do I convert the string to UTF-8?
Here is my code:
var str_source = "³¿±2";
Console.WriteLine(str_source);
Encoding iso = Encoding.GetEncoding("iso-8859-5");
Encoding utf8 = Encoding.UTF32;
byte[] utfBytes = utf8.GetBytes(str_source);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
var str_result = iso.GetString(isoBytes, 0, isoBytes.Length);
Console.WriteLine(str_result);

You should never start your test code from string literals when dealing with encoding issues. Always start with bytes.
Encoding iso = Encoding.GetEncoding("iso-8859-5");
Encoding utf = Encoding.UTF8;
var isoBytes = new byte[] { 228, 232 }; // фш
// iso to utf8
var utfBytes = Encoding.Convert(iso, utf, isoBytes);
// utf8 to iso
var isoBytes2 = Encoding.Convert(utf, iso, utfBytes);
// get all strings (with the correct encoding)
// all 3 strings will contain фш
string s1 = iso.GetString(isoBytes);
string s2 = utf.GetString(utfBytes);
string s3 = iso.GetString(isoBytes2);
Edit: If you do want to use string literals to get you started, then you can use the code below to change their encoding (Encoding.Unicode) to the expected 'incoming text' encoding:
string stringLiteral = "фш";
Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding("iso-8859-5"),
Encoding.Unicode.GetBytes(stringLiteral)); // { 228, 232 }
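For the string in the question, one possibility (an assumption, since the scanner's raw bytes are not shown) is that the ISO-8859-5 bytes were already decoded as ISO-8859-1 somewhere along the way. If so, a sketch of the repair would be to re-encode as ISO-8859-1 to get the raw bytes back and then decode them as ISO-8859-5. ScannerText and Redecode are made-up names:

```csharp
using System.Text;

public static class ScannerText
{
    // Assumes the input was produced by decoding ISO-8859-5 bytes as ISO-8859-1.
    public static string Redecode(string misdecoded)
    {
        // Needed on .NET Core / .NET 5+ for iso-8859-5; not needed on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] rawBytes = Encoding.GetEncoding("iso-8859-1").GetBytes(misdecoded);
        return Encoding.GetEncoding("iso-8859-5").GetString(rawBytes);
    }
}
```

With that assumption, "³¿±2" comes back as "ГПБ2". If you can get at the scanner's bytes directly, decode those with iso-8859-5 instead and skip the string step entirely.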

Convert a string's character encoding from windows-1252 to utf-8

I converted a Word document (docx) to HTML, and the converted HTML has windows-1252 as its character encoding. In .NET, with this 1252 encoding, all the special characters are displayed as '�'. The HTML is shown in a Rad Editor, which displays it correctly if the HTML is in UTF-8.
I tried the following code, but in vain:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(strHtml);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
char[] utf8Chars = new char[utf8.GetCharCount(utf8Bytes, 0, utf8Bytes.Length)];
utf8.GetChars(utf8Bytes, 0, utf8Bytes.Length, utf8Chars, 0);
string utf8String = new string(utf8Chars);
Any suggestions on how to convert the html into UTF-8?

This should do it:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(strHtml);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);

Actually, the problem lies here:
byte[] wind1252Bytes = wind1252.GetBytes(strHtml);
We should not get the bytes from the HTML string, because by then the file has already been decoded with the wrong encoding. Reading the raw bytes from the file instead worked for me:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = ReadFile(Server.MapPath(HtmlFile));
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
public static byte[] ReadFile(string filePath)
{
byte[] buffer;
FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
try
{
int length = (int)fileStream.Length; // get file length
buffer = new byte[length]; // create buffer
int count; // actual number of bytes read
int sum = 0; // total number of bytes read
// read until Read method returns 0 (end of the stream has been reached)
while ((count = fileStream.Read(buffer, sum, length - sum)) > 0)
sum += count; // sum is a buffer offset for next reading
}
finally
{
fileStream.Close();
}
return buffer;
}
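A shorter variant of the same idea, as a sketch: File.ReadAllText accepts an encoding, so the file can be decoded as Windows-1252 in one call. HtmlFileReader and ReadWindows1252 are made-up names, and the provider registration is only needed on .NET Core / .NET 5+:

```csharp
using System.IO;
using System.Text;

public static class HtmlFileReader
{
    // Reads the file's raw bytes and decodes them as Windows-1252.
    public static string ReadWindows1252(string path)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return File.ReadAllText(path, Encoding.GetEncoding(1252));
    }
}
```

The returned string is an ordinary .NET string. To get a UTF-8 file, write it back with File.WriteAllText(path, html, Encoding.UTF8).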

How are you planning to use the resulting HTML? In my opinion, the most appropriate way to solve your problem would be to add a meta tag that specifies the encoding. Something like:
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />

Use the Encoding.Convert method. Details are in its MSDN article.
