In my Silverlight application I am receiving an XML file encoded with windows-1252.
My problem is that it won't display correctly until the windows-1252 data is converted to a UTF-8 string.
In a normal C# environment that wouldn't be a big problem; there I could do something like this:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = ReadFile(Server.MapPath(HtmlFile));
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
(Convert a string's character encoding from windows-1252 to utf-8)
But Silverlight doesn't support windows-1252 - it is Unicode only.
PS
I stumbled upon "Encoding for Silverlight" http://encoding4silverlight.codeplex.com/ - but it seems there is no support for windows-1252 there either?
EDIT:
I solved my problem on the "Server Side" - the actual question is still open.
Encoding for Silverlight is a third-party encoding library, but at the moment it only supports DBCS (double-byte character sets), while windows-1252 is an SBCS (single-byte character set).
However, you can write your own encoder/decoder for Encoding for Silverlight; I think that would be very easy.
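For illustration, here is a minimal sketch of such a single-byte decoder (a standalone helper of my own, not part of the Encoding for Silverlight API). A single-byte code page is essentially a 256-entry lookup table: bytes 0x00-0x7F and 0xA0-0xFF map directly to the Unicode code point with the same value, so only the 0x80-0x9F range needs an explicit table.

public static class Windows1252Decoder
{
    // Unicode code points for bytes 0x80-0x9F in windows-1252;
    // U+FFFD marks the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    private static readonly char[] High = new char[]
    {
        '\u20AC', '\uFFFD', '\u201A', '\u0192', '\u201E', '\u2026', '\u2020', '\u2021',
        '\u02C6', '\u2030', '\u0160', '\u2039', '\u0152', '\uFFFD', '\u017D', '\uFFFD',
        '\uFFFD', '\u2018', '\u2019', '\u201C', '\u201D', '\u2022', '\u2013', '\u2014',
        '\u02DC', '\u2122', '\u0161', '\u203A', '\u0153', '\uFFFD', '\u017E', '\u0178'
    };

    public static string GetString(byte[] bytes)
    {
        var chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
        {
            byte b = bytes[i];
            // 0x00-0x7F and 0xA0-0xFF are identical to the corresponding Unicode code points.
            chars[i] = (b < 0x80 || b > 0x9F) ? (char)b : High[b - 0x80];
        }
        return new string(chars);
    }
}

The resulting string is ordinary UTF-16, so it can be handed straight to XDocument.Parse or any other Silverlight API.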
Related
I am trying to encode a string in Windows-1252 with a StreamWriter. The input string (dataString) is encoded in UTF-8.
StreamWriter sw = new StreamWriter(@"C:\Temp\data.txt", true, Encoding.GetEncoding(1252));
sw.Write(dataString);
sw.Close();
When I open the file in Notepad++ it is reported as an ANSI file, but I need a Windows-1252 encoded file.
Does anyone have an idea?
Your file is Windows-1252 encoded. A non-Unicode file contains no data to indicate how it is encoded; in this case ANSI just means "not Unicode". If you were to encode the file as Russian/Windows-1251 and open it in Notepad++, Notepad++ would report it as ANSI as well.
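As a quick sanity check (a rough sketch; the path and sample character are only placeholders), you can skip the editor entirely and inspect the raw bytes: Windows-1252 encodes 'é' as the single byte 0xE9, while UTF-8 would produce 0xC3 0xA9.

string path = @"C:\Temp\data.txt";
File.WriteAllText(path, "é", Encoding.GetEncoding(1252));

byte[] raw = File.ReadAllBytes(path);
// Windows-1252 stores 'é' as the single byte 0xE9 (UTF-8 would give C3 A9).
Console.WriteLine(BitConverter.ToString(raw));   // prints "E9"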
See Unicode, UTF, ASCII, ANSI format differences for more info.
I'm working with the ICQ protocol and I have found a problem with special letters (diacritics, for example). I read that ICQ uses a different encoding (CP-1251, if I remember correctly).
How can I decode the text with the correct encoding?
I've tried using the UTF8Encoding class, but without success.
I'm using the ICQ-sharp library.
private void ParseMessage (string uin, byte[] data)
{
    ushort capabilities_length = LittleEndianBitConverter.Big.ToUInt16 (data, 2);
    ushort msg_tlv_length = LittleEndianBitConverter.Big.ToUInt16 (data, 6 + capabilities_length);
    string message = Encoding.UTF8.GetString (data, 12 + capabilities_length, msg_tlv_length - 4);
    Debug.WriteLine(message);
}
If the contact is using the same client it's OK, but if not, incoming and outgoing messages with diacritics are just unreadable.
I've determined (using this -> https://stackoverflow.com/a/12853721/846232) that the text is in BigEndianUnicode encoding. But if the string contains no diacritics, that gives unreadable output (Chinese letters), whereas UTF-8 works fine on text without diacritics. I don't know how to decode it correctly in all cases.
If UTF-8 kind of works (i.e. it works for "english", or any US-ASCII characters), then you don't have UTF-16. Latin-1 (or Windows-1252, Microsoft's variant), or e.g. Windows-1251 or Windows-1250 are perfectly possible though, since in all of these the first part, containing the Latin letters without diacritics, is the same.
Decode like this:
var encoding = Encoding.GetEncoding("Windows-1250");
string message = encoding.GetString(data, 12 + capabilities_length, msg_tlv_length - 4);
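If you are not sure which of those single-byte code pages the other client uses, one rough way to narrow it down (a sketch; the candidate list is just the set mentioned above) is to decode the same payload with each of them and see which output reads correctly:

string[] candidates = { "Windows-1250", "Windows-1251", "Windows-1252", "ISO-8859-1" };
foreach (string name in candidates)
{
    // Decode the same message bytes with each candidate code page and compare by eye.
    string text = Encoding.GetEncoding(name).GetString(data, 12 + capabilities_length, msg_tlv_length - 4);
    Debug.WriteLine(name + ": " + text);
}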
I can't handle the encoding for my language (Polish).
When I write żółw it works like a charm, but when I write ślimak there is no ś in my array.
I also tried UTF-8, but with no results.
Here is the encoding with code page 1250. It works with ż, ó, ł, but not with ą, ź....
byte[] buffer = Encoding.GetEncoding(1250).GetBytes(postdata);
The code above is used to communicate with a web server, so I think the problem occurs before the communication.
I also tried:
byte[] buffer = Encoding.GetEncoding(28592).GetBytes(postdata); //iso-8859-2 Central European (ISO)
Solved: iso-8859-2 Central European (ISO) was the correct answer. (I was running an old exe of the project.)
You should not expect there to be a ś in the array; it needs to be encoded, and the encoded value is different. I would advise using UTF-8 here, in which case you should expect 0xC5 0x9B in the output, as that is the UTF-8 encoding of ś.
If you use 28592, then 0xB6 is the encoded form, and it round-trips successfully.
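As a quick check (a standalone sketch, not part of your posting code), you can print the encoded bytes of ś under both encodings:

// 'ś' is U+015B; the encoded bytes differ per encoding, but both round-trip.
byte[] utf8 = Encoding.UTF8.GetBytes("ś");                  // C5 9B
byte[] latin2 = Encoding.GetEncoding(28592).GetBytes("ś");  // B6
Console.WriteLine(BitConverter.ToString(utf8));             // "C5-9B"
Console.WriteLine(BitConverter.ToString(latin2));           // "B6"
Console.WriteLine(Encoding.GetEncoding(28592).GetString(latin2)); // "ś"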
I am dynamically creating CSV files using C#, and I am encountering some strange encoding issues. I currently use the ASCII encoding, which works fine in Excel 2010, which I use at home and on my work machine. However, the customer uses Excel 2007, and for them there are some strange formatting issues, namely that the '£' sign (UK pound sign) is preceded by an accented 'A' character.
What encoding should I use? The annoying thing is that I can hardly test these fixes as I don't have access to Excel 2007!
I'm using Windows ANSI codepage 1252 without any problems on Excel 2003. I explicitly changed to this because of the same issue you are seeing.
private const int WIN_1252_CP = 1252; // Windows ANSI codepage 1252
this._writer = new StreamWriter(fileName, false, Encoding.GetEncoding(WIN_1252_CP));
I've successfully used UTF8 encoding when writing CSV files intended to work with Excel.
The only problem I had was making sure to use the overload of the StreamWriter constructor that takes an encoding as a parameter. The default encoding of StreamWriter says it is UTF-8, but it's really UTF-8 without a byte order mark, and without a BOM Excel will mess up characters that use multiple bytes.
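For example (the file name and contents here are just placeholders), constructing the UTF8Encoding with encoderShouldEmitUTF8Identifier set to true makes StreamWriter emit the BOM (EF BB BF), which lets Excel detect the encoding:

// new UTF8Encoding(true) writes the byte order mark at the start of the file.
using (var writer = new StreamWriter("output.csv", false, new UTF8Encoding(true)))
{
    writer.WriteLine("Name,Price");
    writer.WriteLine("Widget,£9.99");
}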
You need to add the preamble (BOM) to the file:
var data = Encoding.UTF8.GetBytes(csv);
var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
return File(new MemoryStream(result), "application/octet-stream", "file.csv");
I have a requirement to encode and decode Japanese characters. I tried this in Java and it worked fine with the "Cp939" encoding, but I am unable to find that encoding in .NET. The 932 encoding doesn't encode all the characters, so I need to find a way of implementing the 939 encoding in .NET.
Java Code :
convStr = new String(str8859_1.getBytes("Cp037"), "Cp939");
.NET :
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(932), bytesConverted);
// This result is a junk of characters and is totally different
// from the expected output 'ニツポンバ'
convStr = Encoding.GetEncoding(1252).GetString(bytesConverted);
The encoded bytes are in encoding 932, so why are you using encoding 1252 when you convert them to a string?
The following should work:
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
    Encoding.GetEncoding(932), bytesConverted);
// Decode with the same code page (932) that the bytes were converted to.
convStr = Encoding.GetEncoding(932).GetString(bytesConverted);
Is this an error or just how you typed it?
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(932), bytesConverted);
should be:
bytesConverted = Encoding.Convert(Encoding.GetEncoding(37),
Encoding.GetEncoding(939), bytesConverted);
Surely?