ASP.NET SOAP web service: encoding problem in exception - c#

Here is my problem: I'm trying to encode the response of my web service with the following code.
public static string ConvertToUTF8(string Cadena)
{
string mensajeex = Cadena;
Encoding utf8 = Encoding.UTF8;
Encoding unicode = Encoding.Unicode;
// Convert the string into a byte array.
byte[] unicodeBytes = unicode.GetBytes(mensajeex);
// Perform the conversion from one encoding to the other.
byte[] asciiBytes = Encoding.Convert(unicode, utf8, unicodeBytes);
// Convert the new byte[] into a char[] and then into a string.
char[] asciiChars = new char[utf8.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
utf8.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
string Utf8string = new string(asciiChars);
// Display the strings created before and after the conversion.
Console.WriteLine("Original string: {0}", mensajeex);
Console.WriteLine("Ascii converted string: {0}", Utf8string);
return Utf8string;
}
And it actually works! But when I encode a string and then pass it to an exception as its Message property, like this:
throw new Exception(XMLHelper.ConvertToUTF8(Message));
the response comes back wrong, like this:
El valor 'R' no es válido seg&#250
Any ideas? Thanks
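A minimal sketch of why the conversion above cannot change the text: a .NET string is always UTF-16 in memory, so converting its bytes to UTF-8 and decoding them again yields an equal string. The class and method names here are hypothetical. The `&#250` in the response is the numeric character reference for U+00FA (ú), which suggests (as an assumption, not something the question confirms) that the escaping happens when the fault message is serialized, not inside ConvertToUTF8.

```csharp
using System;
using System.Text;

// Hypothetical helper names, for illustration only.
public static class RoundTripSketch
{
    // Same steps as ConvertToUTF8: string -> UTF-16 bytes -> UTF-8 bytes -> string.
    public static string RoundTrip(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);
        byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
        return Encoding.UTF8.GetString(utf8Bytes);
    }

    // The encoding only becomes visible once the text is turned into bytes.
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }
}
```

Calling RoundTripSketch.RoundTrip("seg\u00FAn") returns a string equal to its input, while RoundTripSketch.Utf8Hex("\u00FA") shows the two UTF-8 bytes that only exist at the byte level.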

Related

Can't decode text

Example:
"Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ"
I tried:
var latinString = "Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ"; // år
Encoding latinEncoding = Encoding.GetEncoding("iso-8859-1");
Encoding utf8Encoding = Encoding.GetEncoding("WINDOWS-1252");
byte[] latinBytes = latinEncoding.GetBytes(latinString);
byte[] utf8Bytes = Encoding.Convert(latinEncoding, utf8Encoding, latinBytes);
var utf8String = Encoding.UTF8.GetString(utf8Bytes);
but it doesn't work:
�?аполни п�?о�?ил�?
This is Russian text. Please help.
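It seems latinString is UTF-8 text whose bytes were decoded as Windows-1252. Encode it back to Windows-1252 to recover the original bytes, then decode those bytes as UTF-8:
// Uncomment on .NET Core or .NET 5+, where code page 1252 is not registered by default
// Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var latinString = "Ð—Ð°Ð¿Ð¾Ð»Ð½Ð¸ Ð¿Ñ€Ð¾Ñ„Ð¸Ð»ÑŒ";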
string result = Encoding.UTF8.GetString(
Encoding.GetEncoding(1252).GetBytes(latinString));
// Let's have a look
Console.Write(result);
Outcome:
Заполни профиль
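How text like this typically gets garbled, and why the reversal works, can be sketched as a round trip. This is an illustration under assumptions: the class and method names are hypothetical, and the code assumes .NET Core or .NET 5+, where CodePagesEncodingProvider is available and has to be registered before code page 1252 can be used.

```csharp
using System;
using System.Text;

// Hypothetical helper names, for illustration only.
public static class Cp1252MojibakeSketch
{
    private static Encoding Cp1252()
    {
        // Registering the provider more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    // Simulates the bug: UTF-8 bytes are wrongly decoded as Windows-1252.
    public static string Garble(string text)
    {
        return Cp1252().GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    public static string Repair(string garbled)
    {
        return Encoding.UTF8.GetString(Cp1252().GetBytes(garbled));
    }
}
```

The repair only works when every garbled character maps back to a single Windows-1252 byte. Going through ISO-8859-1 instead, as the question does, loses characters such as the euro sign that exist in Windows-1252 but not in ISO-8859-1.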

Can't convert HttpResponseMessage with UTF8 encoding

I'm struggling with the usual conversion issue, but unfortunately I haven't been able to find anything for my specific problem.
My app receives a System.Net.Http.HttpResponseMessage from a PHP server. It is UTF-8 encoded and contains sequences like \u00c3\u00a0 (à), which I'm not able to convert.
string message = await result.Content.ReadAsStringAsync();
byte[] messageBytes = Encoding.UTF8.GetBytes(message);
string newmessage = Encoding.UTF8.GetString(messageBytes, 0, messageBytes.Length);
This is just one of my attempts, but nothing changes: the resulting string still contains the \u00c3\u00a0 characters.
I have also read answers such as "How to convert a UTF-8 string into Unicode?", but that solution doesn't work for me. This is the solution code:
public static string DecodeFromUtf8(this string utf8String)
{
// copy the string as UTF-8 bytes.
byte[] utf8Bytes = new byte[utf8String.Length];
for (int i=0;i<utf8String.Length;++i) {
//Debug.Assert( 0 <= utf8String[i] && utf8String[i] <= 255, "the char must be in byte's range");
utf8Bytes[i] = (byte)utf8String[i];
}
return Encoding.UTF8.GetString(utf8Bytes,0,utf8Bytes.Length);
}
DecodeFromUtf8("d\u00C3\u00A9j\u00C3\u00A0"); // déjà
I have noticed that when I try the above solution with a simple string like
string str = "Comunit\u00c3\u00a0";
the DecodeFromUtf8 method works perfectly. The problem only appears when I use my response message.
Any advice would be much appreciated.
I've solved this problem myself. I discovered that the server response was UTF-8 JSON delivered as an ISO-8859-1 string, so I had to remove the JSON escape sequences and then convert the ISO-8859-1 text to UTF-8.
So I had to do the following:
private async Task<string> ResponseMessageAsync(HttpResponseMessage result)
{
string message = await result.Content.ReadAsStringAsync();
string parsedString = Regex.Unescape(message);
byte[] isoBytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(parsedString);
return Encoding.UTF8.GetString(isoBytes, 0, isoBytes.Length);
}
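The two steps in this answer can be traced on a small input. The sketch below is not tied to any server; the class and method names are hypothetical, and the input is a literal containing the six-character escape sequences (a backslash followed by u00c3) rather than the characters themselves, which is how the response body looked in the question.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper names, for illustration only.
public static class EscapedUtf8Sketch
{
    public static string Repair(string raw)
    {
        // Step 1: turn literal escapes such as \u00c3 into the characters they name.
        string unescaped = Regex.Unescape(raw);

        // Step 2: ISO-8859-1 maps U+0000..U+00FF one-to-one onto bytes 0x00..0xFF,
        // so this recovers the original UTF-8 bytes.
        byte[] bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(unescaped);

        // Step 3: decode those bytes with the encoding they were written in.
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Regex.Unescape is a regex utility rather than a JSON parser, so it can throw on backslash sequences it does not recognize. For arbitrary JSON, parsing the body with a JSON library before the byte conversion is likely the safer route.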
What worked for me was changing from:
string message = await result.Content.ReadAsStringAsync();
byte[] messageBytes = Encoding.UTF8.GetBytes(message);
string newmessage = Encoding.UTF8.GetString(messageBytes, 0, messageBytes.Length);
to:
byte[] bytes = await result.Content.ReadAsByteArrayAsync();
Encoding utf8 = Encoding.UTF8;
string newmessage = utf8.GetString(bytes);

convert a string from ISO-8859-5 to UTF8

I'm writing an application for Windows Mobile. I use a scanner and get a string in ISO-8859-5 encoding. How do I convert the string to UTF-8?
Here is my code:
var str_source = "³¿±2";
Console.WriteLine(str_source);
Encoding iso = Encoding.GetEncoding("iso-8859-5");
Encoding utf8 = Encoding.UTF32;
byte[] utfBytes = utf8.GetBytes(str_source);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
var str_result = iso.GetString(isoBytes, 0, isoBytes.Length);
Console.WriteLine(str_result);
You should never start your test code from string literals when dealing with encoding issues. Always start with bytes.
Encoding iso = Encoding.GetEncoding("iso-8859-5");
Encoding utf = Encoding.UTF8;
var isoBytes = new byte[] { 228, 232 }; // фш
// iso to utf8
var utfBytes = Encoding.Convert(iso, utf, isoBytes);
// utf8 to iso
var isoBytes2 = Encoding.Convert(utf, iso, utfBytes);
// get all strings (with the correct encoding)
// all 3 strings will contain фш
string s1 = iso.GetString(isoBytes);
string s2 = utf.GetString(utfBytes);
string s3 = iso.GetString(isoBytes2);
Edit: If you do want to use string literals to get you started, then you can use the code below to change their encoding (Encoding.Unicode) to the expected 'incoming text' encoding:
string stringLiteral = "фш";
Encoding.Convert(Encoding.Unicode, Encoding.GetEncoding("iso-8859-5"),
Encoding.Unicode.GetBytes(stringLiteral)); // { 228, 232 }

Convert a string to byte[] for socket

I am writing a simple FTP client in C#.
I am not experienced with C#. Is there a way to convert a string to byte[] and write it to the socket?
For example, when sending the username, this is the socket content:
5553455220736f726f7573680d0a
and the ASCII equivalent is:
USER soroush
I want a method that converts the string, something like this:
public byte[] getByte(string str)
{
byte[] ret;
//some code here
return ret;
}
Try
byte[] array = Encoding.ASCII.GetBytes(input);
// C# to convert a string to a byte array.
public static byte[] StrToByteArray(string str)
{
Encoding encoding = Encoding.UTF8; //or below line
//System.Text.UTF8Encoding encoding=new System.Text.UTF8Encoding();
return encoding.GetBytes(str);
}
and
// C# to convert a byte array to a string.
byte [] dBytes = ...
string str;
Encoding enc = Encoding.UTF8; //or below line
//System.Text.UTF8Encoding enc = new System.Text.UTF8Encoding();
str = enc.GetString(dBytes);
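Applied to the FTP example from the question, the conversion can be checked against the hex dump given there. This is a small sketch with hypothetical helper names (BuildCommand, ToHex). It assumes the command is plain ASCII and ends with CR LF, which matches the trailing 0d0a in the dump.

```csharp
using System;
using System.Text;

// Hypothetical helper names, for illustration only.
public static class FtpCommandSketch
{
    // Builds the bytes for a command line such as "USER soroush" plus CR LF.
    public static byte[] BuildCommand(string verb, string argument)
    {
        return Encoding.ASCII.GetBytes(verb + " " + argument + "\r\n");
    }

    // Formats bytes as lowercase hex so they can be compared with a capture.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```

FtpCommandSketch.ToHex(FtpCommandSketch.BuildCommand("USER", "soroush")) gives 5553455220736f726f7573680d0a, the same bytes as in the question. The resulting array can then be passed to Socket.Send or NetworkStream.Write.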

Text Decoding Problem

So given this input string:
=?ISO-8859-1?Q?TEST=2C_This_Is_A_Test_of_Some_Encoding=AE?=
And this function:
private string DecodeSubject(string input)
{
StringBuilder sb = new StringBuilder();
MatchCollection matches = Regex.Matches(inputText.Text, @"=\?(?<encoding>[\S]+)\?.\?(?<data>[\S]+[=]*)\?=");
foreach (Match m in matches)
{
string encoding = m.Groups["encoding"].Value;
string data = m.Groups["data"].Value;
Encoding enc = Encoding.GetEncoding(encoding.ToLower());
if (enc == Encoding.UTF8)
{
byte[] d = Convert.FromBase64String(data);
sb.Append(Encoding.ASCII.GetString(d));
}
else
{
byte[] bytes = Encoding.Default.GetBytes(data);
string decoded = enc.GetString(bytes);
sb.Append(decoded);
}
}
return sb.ToString();
}
The result is the same as the data extracted from the input string. What am I doing wrong that this text is not getting decoded properly?
UPDATE
So I have this code for decoding quoted-printable:
public string DecodeQuotedPrintable(string encoded)
{
byte[] buffer = new byte[1];
return Regex.Replace(encoded, "=(\r\n?|\n)|=([A-F0-9]{2})", delegate(Match m)
{
if (byte.TryParse(m.Groups[2].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture, out buffer[0]))
{
return Encoding.ASCII.GetString(buffer);
}
else
{
return string.Empty;
}
});
}
And that just leaves the underscores. Do I manually convert those to spaces (Replace("_"," ")), or is there something else I need to do to handle that?
It looks like you don't fully understand the format of the input line. Check it here: http://www.ietf.org/rfc/rfc2047.txt
The format is: encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
So you have to:
1. Extract the charset (the encoding, in .NET terms). It can be any charset, not just UTF8 or Encoding.Default.
2. Extract the encoding: either B for base64 or Q for quoted-printable (your case!).
3. Decode the encoded text to bytes, then decode the bytes to a string using the charset.
The function's not even trying to decode the quoted-printable encoded stuff (the hex codes and underscores). You need to add that.
It's handling the encoding wrong (UTF-8 gets decoded with Encoding.ASCII for some bizarre reason)
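The steps described in these answers can be sketched as follows. This is a minimal illustration, not a complete RFC 2047 implementation: the class and method names are hypothetical, malformed "=XX" sequences are passed through as literal characters, and whitespace between adjacent encoded words is not removed.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper names, for illustration only.
public static class Rfc2047Sketch
{
    // encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
    private static readonly Regex EncodedWord = new Regex(
        @"=\?(?<charset>[^?\s]+)\?(?<encoding>[BbQq])\?(?<text>[^?\s]*)\?=");

    public static string Decode(string input)
    {
        return EncodedWord.Replace(input, m =>
        {
            string text = m.Groups["text"].Value;
            // B means base64, Q means the quoted-printable variant used in headers.
            byte[] bytes = m.Groups["encoding"].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(text)
                : DecodeQ(text);
            // Decode the bytes with the charset named in the encoded word itself.
            return Encoding.GetEncoding(m.Groups["charset"].Value).GetString(bytes);
        });
    }

    private static byte[] DecodeQ(string text)
    {
        var bytes = new List<byte>(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            byte value;
            if (c == '_')
            {
                bytes.Add(0x20); // underscore stands for a space
            }
            else if (c == '=' && i + 2 < text.Length &&
                     byte.TryParse(text.Substring(i + 1, 2), NumberStyles.HexNumber,
                                   CultureInfo.InvariantCulture, out value))
            {
                bytes.Add(value); // "=XX" is one byte written as two hex digits
                i += 2;
            }
            else
            {
                bytes.Add((byte)c); // any other character is a literal ASCII byte
            }
        }
        return bytes.ToArray();
    }
}
```

For the input from the question, Rfc2047Sketch.Decode returns "TEST, This Is A Test of Some Encoding®". Collecting bytes first and decoding them once with the declared charset is what lets multi-byte charsets such as UTF-8 work, which a per-byte Encoding.ASCII.GetString call cannot do.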
