VB6 to C# XML String Conversion Special Characters - c#

I've been using the DOMDocument object from VB6 (MSXML) to create and save an XML file that has an encrypted string. However, this string I think has certain special characters...
<EncryptedPassword>ÆÔ¤ïÎ
߯8KHÖN›¢)Þ,qiãÔÙ</EncryptedPassword>
With this, I go into my C# Project, and de-serialise this XML file in UTF-8 encoding and it fails on this string. I've tried serialisation via ASCII and this gets a couple characters more, but still fails. If I put a plain text string in this place, all is ok! :(
I'm thinking that maybe I'm better converting the string into an MD5 type string from VB6 first, and decoding the MD5 string in .NET and then decrypting the actual string with special characters, but it's an extra step to code all this up and was hoping someone might have a better idea for me here?
Thanks in advance!

The best thing for you to do is to encode your encrypted string in something that will use the ASCII charset. The easiest way to do this is to take your encrypted string and then encode it into Base64 and write this encoded value to the XML element.
And in .NET, simply take the value of the XML element and decode it from Base64 and 'voila', you have your encrypted string.
.NET can easily decode a Base64 string with Convert.FromBase64String; see also: http://msdn.microsoft.com/en-us/library/system.text.encoding.ascii.aspx. (This page may make it look a bit more complicated than it really is).
VB6 does not have native support for Base64 encoding but a quick trawl on Google throws up some examples on how it can be achieved quite easily:
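On the .NET side, the round trip might look something like the sketch below. The class and method names (EncryptedPasswordCodec, ToXmlText, FromXmlText) are made up for illustration; only Convert.ToBase64String and Convert.FromBase64String are real framework calls.

```csharp
using System;

public static class EncryptedPasswordCodec
{
    // Turns the raw encrypted bytes into plain ASCII text that is safe inside an XML element.
    public static string ToXmlText(byte[] encrypted)
    {
        return Convert.ToBase64String(encrypted);
    }

    // Recovers the raw encrypted bytes from the element text.
    public static byte[] FromXmlText(string elementText)
    {
        return Convert.FromBase64String(elementText);
    }
}
```

The point is that the XML only ever carries letters, digits, '+', '/' and '=', so no XML parser has to cope with control characters or code-page-specific bytes.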
http://www.vbforums.com/showthread.php?t=379072
http://www.nonhostile.com/howto-encode-decode-base64-vb6.asp
http://www.mcmillan.org.nz/Programming/post/Base64-Class.aspx
http://www.freevbcode.com/ShowCode.asp?ID=2038

I've concluded that storing these characters in the XML file is wrong. VB6 allows this, but .NET doesn't! Therefore I have converted the string to a Base64 array in line with this link: -
http://www.nonhostile.com/howto-encode-decode-base64-vb6.asp
Now, on the .NET side, the file will de-serialise back into my class where I now store the password as a byte array. I then convert this back to the string I need to decrypt which now appears to raise another problem!
string password = Encoding.UTF7.GetString(this.EncryptedPassword);
With this encoding conversion, I get the string almost exactly back to how I want, but there is a small greater than character that is just not translating properly! Then a colleague found a stack overflow post that had the final answer! There's a discrepancy between VB6 and .NET on this type of conversion. Doing the following instead did the trick: -
string password = Encoding.GetEncoding(1252).GetString(this.EncryptedPassword);
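For anyone trying this on a newer runtime, here is a minimal sketch of the same conversion. It assumes the VB6 side ran under the Windows-1252 ANSI code page (as in my case) and that you are on .NET Core / .NET 5+, where code page 1252 has to be registered first; on the classic .NET Framework the registration line is not needed. The class and method names are hypothetical.

```csharp
using System.Text;

public static class Vb6Strings
{
    static Vb6Strings()
    {
        // On .NET Core / .NET 5+ code page 1252 is only available after
        // registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Interprets each byte as a Windows-1252 character (assumed VB6 code page).
    public static string FromVb6Bytes(byte[] bytes)
    {
        return Encoding.GetEncoding(1252).GetString(bytes);
    }
}
```

Byte 0x9B is a good test case: in Windows-1252 it is the small "›" character from my encrypted string, which is exactly the one that other encodings got wrong.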
Thanks for all the help, much appreciated. The original post about this is ".Net unicode problem, vb6 legacy".

Related

Change encoding from Default to Unicode for encryption

I have a project where everything that is stored in database is encrypted. For encoding we use System.Text.Encoding.Default.GetBytes(text).
The problem is that now the client wants to add support for Polish (and other Nordic) characters and using the Default encoding doesn't work: the Polish characters get converted to English characters (e.g. Ą gets converted to A).
I can't change the encoding (Unicode seems to work) as the previous data will be lost.
Is there any way to get around this and add support for new characters while keeping the old data?
To be clear; you realise that "encoding" is not "encrypting", but I suppose you encrypt the byte array you get from encoding your string data?
Then I'd suggest either decrypting and re-encoding and re-encrypting all existing data using UTF-8 (the most efficient encoding for Western alphabets), or add a "version" or "encoding" column indicating with which encoding the data was encrypted.
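The "encoding column" option could be sketched roughly as follows. Everything here is an assumption for illustration: the version numbers, the class name, and the guess that the legacy Default encoding was Windows-1252 (check what Encoding.Default actually was on the machine that wrote the old rows). The registration call is only needed on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class StoredText
{
    // Hypothetical values for the new "encoding version" column.
    public const int LegacyVersion = 0; // rows written with the old Default encoding
    public const int Utf8Version = 1;   // rows written after the change

    static StoredText()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    private static Encoding EncodingFor(int version)
    {
        switch (version)
        {
            case LegacyVersion: return Encoding.GetEncoding(1252); // assumed legacy code page
            case Utf8Version: return Encoding.UTF8;
            default: throw new ArgumentOutOfRangeException(nameof(version));
        }
    }

    // New data is always encoded as UTF-8 before it is encrypted.
    public static byte[] ToBytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Decrypted bytes are decoded with whatever encoding the row was written with.
    public static string FromBytes(byte[] decrypted, int version)
    {
        return EncodingFor(version).GetString(decrypted);
    }
}
```

Old rows keep decrypting and decoding exactly as before, while new rows can hold Ą and friends without loss.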

UWP img in notification from base64 string

When I create a notification in a UWP app and try to set the image, it works when I do something like:
((XmlElement)imageAttribute[0]).SetAttribute("src", "ms-appx:///Assets/Test.png");
This works fine. But what I need is to set the image from base64 string and not from the Assets folder. Does anyone have any solutions?
You cannot read a string and have it work as binary data. You need to first read the base64 string and convert it back to binary which would be usually in a type of byte array or something.
After you read the base64 string and convert it back to binary data, you can use that data in your attribute instead of referencing a packaged resource.
There are multiple sources out there for converting base64 to binary data and/or files so an internet search should yield the results you are looking for... without knowing anything about the language you are writing in, it is impossible to give examples here but the method is the same.
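Since the question is in C#, here is a rough sketch of the decode step, under the assumption that the notification needs a file to point at rather than raw bytes. The class and method names are hypothetical, and plain System.IO is used so the snippet stands on its own. In a UWP app the folder would presumably be the app's local folder, and the src attribute would then be something like "ms-appdata:///local/" plus the file name, but treat that part as an assumption to verify.

```csharp
using System;
using System.IO;

public static class NotificationImages
{
    // Decodes the Base64 text to bytes, writes them to folder/fileName,
    // and returns the full path of the written file.
    public static string SaveBase64Image(string base64, string folder, string fileName)
    {
        byte[] imageBytes = Convert.FromBase64String(base64);
        string path = Path.Combine(folder, fileName);
        File.WriteAllBytes(path, imageBytes);
        return path;
    }
}
```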

Wrong characters for accents in one Windows-1252 encoded XML

In the XML I need to read in C#, I find characters such as
é, É.
As far as I know, I should not find those characters in a Windows-1252 encoded XML. Can I fix that problem in C#, or must the XML itself be updated?
Thanks in advance.
It does look like the XML needs to be updated.
You could certainly write something that reads it in as the UTF-8 it really is and writes it back out as the Windows-1252 it claimed to be, but why bother? XML in Windows-1252 is like someone using their smart-phone while dressed as ye olde knight at a Renaissance Faire anyway. Just drop the incorrect declaration from the first line and away you go.
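If you cannot touch the file, one way to read it "as the UTF-8 it really is" is to do the decoding yourself and hand the parser characters instead of bytes. A minimal sketch, with hypothetical class and method names; it relies on the XML reader only switching encodings when it is given a raw stream, so with a TextReader the wrong declaration is effectively ignored.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlEncodingFix
{
    // Decodes the bytes as UTF-8 ourselves, regardless of what the XML declaration claims.
    public static XDocument LoadAsUtf8(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return XDocument.Load(reader);
        }
    }
}
```

Usage would be something like XmlEncodingFix.LoadAsUtf8(File.OpenRead(path)), where path is your file.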
The simple answer is: you're probably using the wrong encoding. From this I'd say you should be using UTF-8. You can force it by downloading the document before parsing it.
I should note that downloading URL's is tricky: web servers often report the wrong encoding. That is also the reason why the HTML5 standard includes a section on encoding detection. I'm afraid there's no easy generic solution for this -- we ended up implementing our own encoding detection algorithms for our web crawlers.

Convert string to normal text

When I open one file it contains something like this:
It&#39;s that
What is this and how do I convert it to ASCII ?
This is HTML encoding, use WebUtility.HtmlDecode (in System.Net namespace):
string encoded = "It&#39;s that";
string decoded = System.Net.WebUtility.HtmlDecode(encoded);
HttpUtility.HtmlDecode() will do the trick.
Those are HTML entities. They represent ASCII characters. You can decode them using HttpUtility.HtmlDecode().
If you're just trying to read this one line, you could also rename the file to a .html file and open it in your browser of choice. There are even tools that do this online.
The number between the &# ; is likely an ASCII code.
Convert the numbers manually or use HtmlDecode to save yourself some time...
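To show what "convert the numbers manually" amounts to, here is a small sketch. It only handles decimal numeric entities such as &#39; and is not a replacement for HtmlDecode, which also covers named and hexadecimal entities. The class name is made up.

```csharp
using System.Text.RegularExpressions;

public static class NumericEntities
{
    // Replaces each decimal entity "&#NN;" with the character whose code point is NN.
    public static string Decode(string text)
    {
        return Regex.Replace(
            text,
            @"&#(\d+);",
            m => char.ConvertFromUtf32(int.Parse(m.Groups[1].Value)));
    }
}
```

For example, 39 is the code for the apostrophe, so "It&#39;s that" comes back as "It's that".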
If you're using .Net Framework 4.0 or higher then the System.Net.WebUtility.HtmlDecode(s) will work.
I needed this solution for an SSRS report where only 3.5 was supported. Since the namespace above wasn't available I went the alternate route of
System.Web.HttpUtility.HtmlDecode(rawString)

string encoding in C# - strange characters

I have a file that i need to import.
The problem is that I have problems with a lot of characters in that file.
For example these names are wrong:
BjÃ¶rn (in file) - Should be Björn
Ã…ke (in file) - Should be Åke
Unfortunately I can't recreate the file with the correct encoding.
Also there are a lot of characters that are wrong (these were just examples). I can't do a search and replace on all of them (unless there is a dictionary with all conversions).
Can I decode the strings in some way?
thanks Patrik
Edit:
Just some more info that I should have added before (I blame my tiredness).
The file is an .xlsx file.
I debugged this with Notepad++. I copied the correct strings into Notepad++. I used Encoding | Convert to UTF-8. Then I selected Encoding | Encode as ANSI. This has the effect of interpreting the UTF-8 bytes as if they were ANSI. And when I did this I ended up with the same erroneous values as you. So clearly when you read the file you are interpreting it as ANSI rather than UTF-8.
The solution then is that your file has been encoded as UTF-8. Make sure that the file is interpreted as UTF-8 when you read it. I can't tell you exactly how to do that since you didn't show how you were reading the file in the first place.
It's possible that your file does not contain a byte-order-mark (BOM). If so then specify the encoding when you read the file by passing Encoding.UTF8.
I've just tried your first example, and it definitely looks like that's UTF-8.
It's unclear what you're using to look at the file in the first place, but if you load it with a text editor which understands UTF-8 and tell it that it's a UTF-8 file, it should be fine.
When you load it with .NET, you should just be able to use File.OpenText, File.ReadAllText etc - most IO dealing with encodings in .NET defaults to UTF-8 anyway.
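As a rough sketch of both routes: reading with an explicit UTF-8 encoding, and, if all you have is the already-garbled string, undoing the wrong decode. The second part assumes the "ANSI" code page involved was Windows-1252, which matches the examples in the question but is still a guess. The class and method names are hypothetical, and the provider registration is only needed on .NET Core / .NET 5+.

```csharp
using System.IO;
using System.Text;

public static class Utf8Text
{
    static Utf8Text()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Reads a text file, decoding its bytes as UTF-8 explicitly.
    public static string Read(string path)
    {
        return File.ReadAllText(path, Encoding.UTF8);
    }

    // Re-encodes the garbled text as Windows-1252 to get the original bytes back,
    // then decodes those bytes as the UTF-8 they really were.
    public static string RepairMisdecoded(string garbled)
    {
        byte[] originalBytes = Encoding.GetEncoding(1252).GetBytes(garbled);
        return Encoding.UTF8.GetString(originalBytes);
    }
}
```

Fixing the read is the better option whenever you control it; the repair step is a fallback and can fail on text that was damaged in some other way.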
