Byte[] from Registry returns only one letter


I'm trying to read data from the registry at "SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs\".
The return value I get is a System.Byte[], which I convert to a string as suggested here.
It works (I think), but I only get one letter returned and not the whole string.
Perhaps I'm doing something wrong? I'm fairly certain there can't be only one letter in there.
I've tried Encoding.ASCII.GetString(bytes), Encoding.UTF8.GetString(bytes), and Encoding.Default.GetString(bytes), but they all return only one character.
I've checked out this link as well, but that's for C++. I'm using C# and don't see the function they suggested (RegGetValueA).
Here is my code:
RegistryKey pRegKey = Registry.CurrentUser;
pRegKey = pRegKey.OpenSubKey("SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\RecentDocs\\");
Object val = pRegKey.GetValue("0");
byte[] bytes = (byte[])pRegKey.GetValue("0");
string str = Encoding.ASCII.GetString(bytes);
System.Windows.MessageBox.Show("The value is: " + str);
Thanks in advance for any help :)

The string is encoded using UTF-16, so you should use Encoding.Unicode.
However, the value does not seem to contain only a UTF-16 encoded string; there is some additional data. For me (when decoded as UTF-16), it displays as
Stažené soubory□Š6□□□□□Stažené soubory.lnk□T□□뻯□□□□*□□□□□□□□□□□□Stažené soubory.lnk□6□
Stažené soubory means Downloads in Czech, which is the language of my Windows. The U+25A1 squares in the text above are actually null characters (U+0000).
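As a rough sketch of the point above, the leading name can be pulled out by decoding the bytes as UTF-16 and cutting at the first null character. The class name, the method name, and the sample bytes are made up for illustration (real values come from the registry), and the remark about the message box stopping at the null character is a likely explanation rather than something confirmed in this thread.

```csharp
using System;
using System.Text;

public static class RecentDocsDemo
{
    // Hypothetical helper: returns the leading null-terminated UTF-16LE string of a binary value.
    public static string ReadLeadingUtf16String(byte[] data)
    {
        // Decode everything as UTF-16LE; trailing non-text bytes simply become junk characters.
        string decoded = Encoding.Unicode.GetString(data);
        // Keep only the part before the first null character.
        int terminator = decoded.IndexOf('\0');
        return terminator >= 0 ? decoded.Substring(0, terminator) : decoded;
    }

    public static void Main()
    {
        // Stand-in for registry data: a UTF-16 name, a null terminator, then unrelated binary data.
        byte[] name = Encoding.Unicode.GetBytes("Downloads\0");
        byte[] extra = { 0x8A, 0x00, 0x36, 0x00, 0x00, 0x00 };
        byte[] sample = new byte[name.Length + extra.Length];
        name.CopyTo(sample, 0);
        extra.CopyTo(sample, name.Length);

        Console.WriteLine(ReadLeadingUtf16String(sample)); // Downloads

        // Decoded as ASCII, every second byte of UTF-16 text becomes '\0',
        // so a null-terminated display probably stops after the first letter.
        Console.WriteLine(Encoding.ASCII.GetString(sample).IndexOf('\0')); // 1
    }
}
```

With the question's code, the same helper could be applied to the byte[] returned by GetValue("0").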

Are you sure that the encoding is ASCII?
I would suspect a Unicode encoding such as Encoding.UTF8 or Encoding.Unicode. Try that.

Related

how to fix corrupt japanese character encoding

I have the following string that I know is supposed to be displayed as Japanese text:
25“ú‚¨“¾‚ÈƒAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O
Is there any way to decode and re-encode the text so it displays properly? I already tried using Shift-JIS, but it did not produce a readable string.
string main = "25“ú‚¨“¾‚ÈƒAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O.zip";
byte[] mainBytes = System.Text.Encoding.GetEncoding("shift-jis").GetBytes(main);
string jpn = System.Text.Encoding.GetEncoding("shift-jis").GetString(mainBytes);
thanks!

I think the original is Shift-JIS, but you didn't show how you tried to decode it. Here is my attempt to re-code it:
string s1 = "25“ú‚¨“¾‚ÈƒAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O";
byte[] bs = Encoding.GetEncoding(1252).GetBytes(s1);
string s2 = Encoding.GetEncoding(932).GetString(bs);
And s2 is now "25日お得なアルティャbトコスセット記念", which looks a lot more like Japanese.
My assumption is that a byte array representing Shift-JIS encoded text was read using a different encoding, maybe Windows-1252. So first I try to get back the original byte array, and then I use the proper encoding to get the correct text.
A few notes about my code:
1252 is the numeric ID for Windows-1252, the encoding most commonly used by mistake. This is just a guess; you can try other encodings and see if the result makes more sense.
932 is the numeric ID for Shift-JIS (you can also use the string name). This is also a guess, but likely right.
Keep in mind that decoding with the wrong encoding is not generally a reversible operation, so some characters may be lost in translation.
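The same re-coding idea can be wrapped in a small helper. This is only a sketch: the class and method names are hypothetical, and it assumes .NET Core 3.0 or later (or .NET 5+), where legacy code pages such as 1252 and 932 are only available after registering CodePagesEncodingProvider. On .NET Framework the registration line is not needed.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: undo "decoded with the wrong code page" damage.
    public static string Repair(string garbled, int wrongCodePage, int actualCodePage)
    {
        // Makes legacy code pages (1252, 932, ...) available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Step 1: recover the original bytes by re-encoding with the wrongly used encoding.
        byte[] originalBytes = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled);
        // Step 2: decode those bytes with the encoding they were really written in.
        return Encoding.GetEncoding(actualCodePage).GetString(originalBytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // First characters of the string from the question.
        Console.WriteLine(Repair("25“ú‚¨“¾‚È", 1252, 932));
    }
}
```

Both code pages remain guesses, as noted above, so the two arguments are worth varying when the output still looks wrong.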

.NET C# conversion from UTF 16 LE to UTF 16 BE failing

I'm trying to convert some strings from UTF 16 LE to UTF 16 BE but it fails to encode the second Chinese character.
Sample string: test馨俞
Code:
byte[] bytes = Encoding.Unicode.GetBytes(sendMsg.Text);
sendMsg.Text = Encoding.BigEndianUnicode.GetString(bytes);
I've also tried
var encode = new UnicodeEncoding(false, true, true);
var messageAsBytes = encode.GetBytes(sendMsg.Text);
var enc = new UnicodeEncoding(true, true, true);
sendMsg.Text = enc.GetString(messageAsBytes);
Which results in the following error: Unable to translate bytes [DE][4F] at index 184 from specified code page to Unicode on the line:
sendMsg.Text = enc.GetString(messageAsBytes);
Thanks.

I think you should process your input string with Encoding.BigEndianUnicode.
I made this code from the one you provided. It works fine, without error:
String input = "馨俞";
var messageAsBytes = Encoding.BigEndianUnicode.GetBytes(input);
input = Encoding.BigEndianUnicode.GetString(messageAsBytes);
If I also process "input" with Encoding.Unicode and print out both byte arrays (the one produced with Unicode and the one produced with big endian), the difference is visible: each pair of bytes appears in the opposite order.
So the input is converted to the endianness you need.

The result of encoding a string is a byte array, not another string.
Just use
byte[] bytes = Encoding.BigEndianUnicode.GetBytes(sendMsg.Text);
to encode the string to bytes using the UTF 16 BE encoding.
Then send those bytes to the mainframe.
How you send those bytes to the mainframe may be the topic of another question, but it sounds like you somehow need to present those encoded bytes in a variable of type string. That sounds like a bug in the library you are using. We would need to understand the nature of that library and its possible bug to find a workaround. One option you could try, but it's a shot in the dark, is this:
string toSend = Encoding.Default.GetString(bytes);
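That will produce a string where each character represents one byte of the encoded data, in UTF-16 BE order. Its length will be double the length of the original string. Note that this relies on Encoding.Default being a single-byte ANSI code page, as on .NET Framework; on .NET Core and later, Encoding.Default is UTF-8 and this does not hold.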
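For background on why the round trip in the question fails on the second character, here is a small sketch (class and helper names are made up). It prints the UTF-16 bytes of 俞 in both byte orders and shows that the little-endian bytes, read as big endian, form an unpaired surrogate. That would explain the "Unable to translate bytes [DE][4F]" error from a UnicodeEncoding created with throwOnInvalidBytes set to true.

```csharp
using System;
using System.Text;

public static class EndianDemo
{
    // Formats a byte array as hex pairs, e.g. "DE-4F".
    public static string Hex(byte[] bytes) => BitConverter.ToString(bytes);

    // Interprets the first two bytes as one big-endian UTF-16 code unit.
    public static char ReadBigEndianUnit(byte[] bytes) => (char)((bytes[0] << 8) | bytes[1]);

    public static void Main()
    {
        string text = "俞";
        byte[] littleEndian = Encoding.Unicode.GetBytes(text);       // UTF-16 LE
        byte[] bigEndian = Encoding.BigEndianUnicode.GetBytes(text); // UTF-16 BE

        Console.WriteLine(Hex(littleEndian)); // DE-4F
        Console.WriteLine(Hex(bigEndian));    // 4F-DE

        // The LE bytes DE 4F read as BE give U+DE4F, which lies in the surrogate range
        // and is invalid on its own, so a strict decoder rejects it.
        char misread = ReadBigEndianUnit(littleEndian);
        Console.WriteLine(char.IsSurrogate(misread)); // True
    }
}
```

In other words, GetString is for turning bytes back into text; it is not a byte-order conversion for strings.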

I got it working by setting this property without any conversion.
sendMsg.SetIntProperty(XMSC.JMS_IBM_CHARACTER_SET, 1201);

What's going wrong with C#'s string formatter?

I'm getting the following behavior from C#'s string encoder:
[Test Case Screenshot][1]
poundFromBytes should be "£", but instead it's "?".
It's as if it's trying to encode the byte array using ASCII instead of UTF-8.
Is this a bug in Windows 7 / C#'s string encoder, or am I missing something?
My real issue here is that I get the same problem when I use File.ReadAllText on an ANSI text file, and I get a related issue in a third party library.
EDIT
I found my problem: I was running under the assumption that UTF-8 was backwards compatible with ANSI, but it's actually only backwards compatible with ASCII. Cheers anyway; at least I'll know to make sure I have no immaterial problems with my test case next time.

The single-byte representation of the pound sign is not valid UTF-8.
Use Encoding.GetBytes instead:
byte[] poundBytes = Encoding.GetEncoding("UTF-8").GetBytes(sPound);
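To make the single-byte point concrete, the sketch below compares the bytes for "£" in a single-byte encoding and in UTF-8. ISO-8859-1 is used as a stand-in for "ANSI" here because it is built into .NET on every platform; that choice and the class and helper names are assumptions for illustration only.

```csharp
using System;
using System.Text;

public static class PoundDemo
{
    // Stand-in for a single-byte "ANSI" encoding.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Hex dump of the bytes an encoding produces for a piece of text.
    public static string Hex(Encoding encoding, string text) =>
        BitConverter.ToString(encoding.GetBytes(text));

    // Decodes a single byte as UTF-8.
    public static string DecodeByteAsUtf8(byte value) =>
        Encoding.UTF8.GetString(new[] { value });

    public static void Main()
    {
        Console.WriteLine(Hex(Latin1, "£"));        // A3
        Console.WriteLine(Hex(Encoding.UTF8, "£")); // C2-A3

        // 0xA3 on its own is a UTF-8 continuation byte with no lead byte, so the decoder
        // substitutes U+FFFD, which many consoles and fonts render as '?'.
        Console.WriteLine(DecodeByteAsUtf8(0xA3) == "\uFFFD"); // True

        // Decoding with the encoding that produced the byte recovers the character.
        Console.WriteLine(Latin1.GetString(new byte[] { 0xA3 }) == "£"); // True
    }
}
```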

The correct block of code should read something like:
var testChar = '£';
var bytes = Encoding.UTF32.GetBytes(new []{testChar});
string testConvert = Encoding.UTF32.GetString(bytes, 0, bytes.Length);
As others have said, you need to use a UTF encoder to get the bytes for a character. Incidentally characters are UTF-16 format by default (see: http://msdn.microsoft.com/en-us/library/x9h8tsay.aspx)

If you want to use an Encoding's GetString() method, you should probably also use its corresponding GetBytes() method:
// Requires: using System; using System.Text;
static void Main(string[] args)
{
    char cPound = '£';
    byte bPound = (byte)cPound; // not really valid: a cast is not an encoding step
    string sPound = "" + cPound;
    byte[] poundBytes = Encoding.UTF8.GetBytes(sPound);          // encode with UTF-8 ...
    string poundFromBytes = Encoding.UTF8.GetString(poundBytes); // ... and decode with UTF-8
    Console.WriteLine(poundFromBytes);
    Console.ReadKey(true);
}

Check out the documents here. As mentioned in the comments you can't just cast your char to a byte. I'll edit with a more succinct answer but I want to avoid just copy/pasting what msdn has. http://msdn.microsoft.com/en-us/library/ds4kkd55(v=vs.110).aspx
char[] pound = new char[] { '£' };
byte[] poundAsBytes = Encoding.UTF8.GetBytes(pound);
Also, why is everyone using this GetEncoding with a hard coded argument rather than accessing UTF8 directly?

German character ß encoding in Livelink using C#

I have a folder name that contains German special characters such as äÄéöÖüß. The following screenshot displays the contents of the Livelink server.
I want to extract the folder from the Livelink server using C#.
value is obtained from LLserver.
var bytes = new List<byte>(value.Length);
foreach (var c in value)
{
bytes.Add((byte)c);
}
var result = Encoding.UTF8.GetString(bytes.ToArray());
Finally, the result is äÄéöÖü�x, where ß is shown as the box character '�x'. All other characters in the folder name are decoded properly except ß.
I am just wondering why the same code works for all the other German special characters but not for ß.
Could anybody help me fix this problem in C#?
Thanks in advance.

Go to the admin panel of the Livelink server at Livelink/livelink.exe?func=admin.sysvars
and set Character Set: UTF-8.
Then change the code section as follows:
byte[] bytes = Encoding.Default.GetBytes(value);
var retValue = Encoding.UTF8.GetString(bytes);
It works fine.

You guessed your encoding to be UTF8 and it obviously is not. You will need to find out what encoding the byte stream really represents and use that instead. We cannot help you with that, you will have to ask the sender of said bytes.
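One scenario that would produce exactly "�x" is sketched below. It is a guess, not something the thread confirms: assume the server string is the UTF-8 bytes of "ß" (C3 9F) decoded as Windows-1252, which gives "ÃŸ". Casting Ÿ (U+0178) to a byte keeps only the low 8 bits (0x78, the letter x), whereas re-encoding with the code page restores 0x9F. Under the same assumption, Ä and Ö would be affected as well, so this may not be the whole story. The class and method names are hypothetical, and the provider registration assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CastVersusEncoding
{
    // The question's approach: keep only the low 8 bits of every char.
    public static byte[] CastChars(string value)
    {
        var bytes = new List<byte>(value.Length);
        foreach (char c in value) bytes.Add(unchecked((byte)c));
        return bytes.ToArray();
    }

    public static string ViaCast(string value) => Encoding.UTF8.GetString(CastChars(value));

    // Re-encode with the code page that was (by assumption) used to misread the bytes.
    public static string ViaCodePage(string value, int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); // needed on .NET Core / .NET 5+
        return Encoding.UTF8.GetString(Encoding.GetEncoding(codePage).GetBytes(value));
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        string fromServer = "\u00C3\u0178"; // "ÃŸ": the assumed misread form of "ß"

        Console.WriteLine(BitConverter.ToString(CastChars(fromServer))); // C3-78
        Console.WriteLine(ViaCast(fromServer) == "\uFFFDx");             // True
        Console.WriteLine(ViaCodePage(fromServer, 1252));                // ß
    }
}
```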

Decoding Base64 / Quoted Printable encoded UTF8 string

In my ASP.NET application, I need to work with a string that looks something like
=?utf-8?B?SWhyZSBCZXN0ZWxsdW5nIC0gVmVyc2FuZGJlc3TDpHRpZ3VuZyAtIDExMDU4OTEyNDY=?=
How can I decode it into human-readable text?
Thanks in advance!
Update:
Convert.FromBase64String() does not work for a string like
=?UTF-8?Q?Bestellbest=C3=A4tigung?=
I get the exception "The format of s is invalid. s contains a non-base-64 character, more than two padding characters, or a non-white space-character among the padding characters."
Update:
Solution Here
Alternative solution
Update:
What kind of string encoding is that: Nweiß ???

It's actually a base-64 string:
string zz = "SWhyZSBCZXN0ZWxsdW5nIC0gVmVyc2FuZGJlc3TDpHRpZ3VuZyAtIDExMDU4OTEyNDY=";
byte[] dd = Convert.FromBase64String(zz);
// Returns Ihre Bestellung - Versandbestätigung - 1105891246
string yy = System.Text.Encoding.UTF8.GetString(dd);

I've written a library that will decode these sorts of strings. You can find it at http://github.com/jstedfast/MimeKit
Specifically, take a look at MimeKit.Utils.Rfc2047.DecodeText()

This seems to be MIME Header Encoding. The Q in your second example indicates that it is Quoted Printable.
This question seems to cover the variants fairly well. In a quick search I didn't find any .NET libraries to decode this automatically, but it shouldn't be hard to do manually if you need to.

That's not UTF-8; that's a Base64-encoded string.
The UTF-8 only indicates that the target string is in UTF-8 format.
After decoding the Base64 string:
SWhyZSBCZXN0ZWxsdW5nIC0gVmVyc2FuZGJlc3TDpHRpZ3VuZyAtIDExMDU4OTEyNDY=
You'll get the following result:
Ihre Bestellung - Versandbestätigung - 1105891246
See Base64 online decode/encode

Looks like a base64 string.
Try Convert.FromBase64String
http://msdn.microsoft.com/en-us/library/system.convert.frombase64string.aspx

This is an encoded word, which is used in email headers when there is non-ASCII content. Encoded words are defined in RFC 2047:
https://www.rfc-editor.org/rfc/rfc2047#section-2
The BNF for an encoded word is:
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
So the correct way to interpret this is:
The data is the stuff between the 3rd and 4th question marks.
It has been Base64 encoded (the 'B' stands for Base64; if it were a 'Q' then it would be quoted-printable).
Once you decode the data, it will be in the UTF-8 character set.
The result, as @Shai correctly pointed out, is:
Ihre Bestellung - Versandbestätigung - 1105891246
This is German. The umlaut is obviously the reason for the UTF-8 and thus the need for an encoded word. The translation is:
Your order - Delivery confirmation - 1105891246
Apparently it's a tracking number for an order.
All modern email clients (and Outlook) transparently support encoded words.
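The interpretation steps above can be sketched in a few lines of C#. This is a minimal illustration, not a complete RFC 2047 implementation: the class and method names are made up, refinements such as joining adjacent encoded words are ignored, and the charset name is assumed to be one that Encoding.GetEncoding recognizes. For real mail handling, a library such as the MimeKit one mentioned earlier is the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class EncodedWordDemo
{
    // encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
    private static readonly Regex EncodedWord =
        new Regex(@"=\?([^?]+)\?([BbQq])\?([^?]*)\?=");

    public static string Decode(string input)
    {
        return EncodedWord.Replace(input, match =>
        {
            Encoding charset = Encoding.GetEncoding(match.Groups[1].Value);
            string payload = match.Groups[3].Value;
            // 'B' means Base64, 'Q' means the quoted-printable-like encoding.
            byte[] bytes = match.Groups[2].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(payload)
                : DecodeQ(payload);
            return charset.GetString(bytes);
        });
    }

    private static byte[] DecodeQ(string payload)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < payload.Length; i++)
        {
            char c = payload[i];
            if (c == '_')
            {
                bytes.Add(0x20); // underscore stands for a space
            }
            else if (c == '=' && i + 2 < payload.Length)
            {
                // "=XX" is one byte written as two hex digits.
                bytes.Add(byte.Parse(payload.Substring(i + 1, 2), NumberStyles.HexNumber));
                i += 2;
            }
            else
            {
                bytes.Add((byte)c); // the remaining characters are plain ASCII
            }
        }
        return bytes.ToArray();
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Decode("=?UTF-8?Q?Bestellbest=C3=A4tigung?=")); // Bestellbestätigung
    }
}
```

The same Decode call handles the Base64 example from the question, because the encoding letter is read from the encoded word itself.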

This is a bit of guesswork, but let's try:
Remove =? from the start and ?= from the end.
Keep the start up to the next ? as the character set.
Remove the B? (I don't know what it is).
Convert the rest to a byte[] via System.Convert.FromBase64String().
Convert this to the final string via Encoding.GetString(), using the character set remembered in the second step.
