Converting SQL Server varbinary data into a string in C#

I need help figuring out how to convert data that comes in from a SQL Server table column that is set as varBinary(max) into a string in order to display it in a label.
This is in C# and I'm using a DataReader.
I can pull the data in using:
var BinaryString = reader[1];
I know that this column holds text that was previously converted to binary.

It really depends on which encoding was used when you originally converted from string to binary:
byte[] binaryString = (byte[])reader[1];
// if the original encoding was ASCII
string x = Encoding.ASCII.GetString(binaryString);
// if the original encoding was UTF-8
string y = Encoding.UTF8.GetString(binaryString);
// if the original encoding was UTF-16
string z = Encoding.Unicode.GetString(binaryString);
// etc

The binary data must be encoded text - and you need to know which encoding was used in order to accurately convert it back to text. So for example, you might use:
byte[] binaryData = (byte[])reader[1];
string text = Encoding.UTF8.GetString(binaryData);
or
byte[] binaryData = (byte[])reader[1];
string text = Encoding.Unicode.GetString(binaryData);
or various other options... but you need to know the right encoding. Otherwise it's like trying to load a JPEG file into an image viewer which only reads PNG... but worse, because if you get the wrong encoding it may appear to work for some strings.
The next thing to work out is why it's being stored as binary in the first place... if it's meant to be text, why isn't it being stored that way?
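To illustrate the "may appear to work for some strings" point, here is a minimal sketch. The class and method names are made up for the example, and the sample strings are hypothetical. It encodes text one way and decodes it another, the way a wrong guess would:

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Encodes with one encoding and decodes with another, as happens when
    // the reader guesses the wrong encoding for the stored bytes.
    public static string Reinterpret(string text, Encoding storedAs, Encoding readAs)
    {
        return readAs.GetString(storedAs.GetBytes(text));
    }

    public static void Main()
    {
        // ASCII-only text looks fine even with the wrong guess...
        Console.WriteLine(Reinterpret("hello", Encoding.UTF8, Encoding.ASCII) == "hello");
        // ...but text with an accented letter does not survive it.
        Console.WriteLine(Reinterpret("caf\u00E9", Encoding.UTF8, Encoding.ASCII) == "caf\u00E9");
    }
}
```

So testing with plain English samples alone will not tell you whether you picked the right encoding.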

You need to know what encoding was used to create the binary. Then you can use
System.Text.Encoding.UTF8.GetString((byte[])reader[1]);
and swap UTF8 for whatever encoding was actually used.
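Pulling the answers together, a small helper might look like the sketch below. ReadText and the class name are hypothetical, and the in-memory DataTable only stands in for the real SQL Server reader so the example can run on its own. It also handles a NULL column, which a direct cast would not:

```csharp
using System;
using System.Data;
using System.Text;

public static class VarBinaryText
{
    // Returns the column decoded as text, or null when the column is NULL.
    public static string ReadText(IDataRecord record, int ordinal, Encoding encoding)
    {
        if (record.IsDBNull(ordinal))
            return null;
        return encoding.GetString((byte[])record.GetValue(ordinal));
    }

    public static void Main()
    {
        // Stand-in for the real table: column 1 holds UTF-8 encoded text.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Body", typeof(byte[]));
        table.Rows.Add(1, Encoding.UTF8.GetBytes("hello"));
        table.Rows.Add(2, DBNull.Value);

        using (IDataReader reader = table.CreateDataReader())
        {
            while (reader.Read())
                Console.WriteLine(ReadText(reader, 1, Encoding.UTF8) ?? "(null)");
        }
    }
}
```

With a SqlDataReader the call would be the same, since it also implements IDataRecord; the encoding argument is still something you have to know.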

Related

how to fix corrupt japanese character encoding

I have the following string that I know is supposed to be displayed as Japanese text:
25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O
Is there any way to decode and re-encode the text so it displays properly? I already tried using shift-jis but it did not produce a readable string.
string main = "25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O.zip";
byte[] mainBytes = System.Text.Encoding.GetEncoding("shift-jis").GetBytes(main);
string jpn = System.Text.Encoding.GetEncoding("shift-jis").GetString(mainBytes);
thanks!
I think that the original is Shift-JIS, but you didn't show how you tried. So here is my attempt to re-code it:
string s1 = "25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O";
byte[] bs = Encoding.GetEncoding(1252).GetBytes(s1);
string s2 = Encoding.GetEncoding(932).GetString(bs);
And s2 is now "25日お得なアルティャbトコスセット記念", which looks a lot more like Japanese.
What I assume is that a byte array representing Shift-JIS encoded text was read using a different encoding, maybe Windows-1252. So first I try to get back the original byte array. Then I use the proper encoding to get the correct text.
A few notes about my code:
1252 is the numeric ID for Windows-1252, the encoding most often used by mistake. But this is just a guess; you can try other encodings and see if the result makes more sense.
932 is the numeric ID for Shift-JIS (you can also use the string name). This is also a guess, but likely right.
Take into account that decoding with the wrong encoding is not generally reversible, so some characters may be lost in translation.
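The same idea as a reusable sketch, run in both directions so you can see how the garbage is produced and then undone. Recode and the class name are made up for the example. The RegisterProvider call is an assumption about your runtime: on .NET Core / .NET 5+ code pages 1252 and 932 are not available until the code pages provider is registered, while on .NET Framework they are built in and that line is not needed.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Recovers the bytes using the encoding that was applied by mistake,
    // then decodes them with the encoding that should have been used.
    public static string Recode(string garbled, int wrongCodePage, int rightCodePage)
    {
        byte[] originalBytes = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled);
        return Encoding.GetEncoding(rightCodePage).GetString(originalBytes);
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string original = "記念";
        // Simulate the damage: Shift-JIS bytes read as Windows-1252.
        string garbled = Recode(original, 932, 1252);
        // Undo it: Windows-1252 back to bytes, then read as Shift-JIS.
        string repaired = Recode(garbled, 1252, 932);

        Console.WriteLine(repaired == original);
    }
}
```

This only works when every byte survived the wrong decoding step, which is why the result above still has a couple of damaged characters.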

Converting byte[] to string while preserving the original byte format

I have a large amount of data that includes tables, fonts, bold text, sizes, etc. This data is stored as byte[] in a database.
When I retrieve the data I need to convert the byte[] into a string, because I need to do some find and replace on it. After I convert the string back into byte[], I lose the original data structure: the tables, fonts, bold text, etc. no longer display properly. So how can I do a find and replace on the byte[] by converting it to a string, while keeping the data in its original format?
The short answer is: don't. Figure out the format of the data and see what you can do to perform the manipulation. If the data is actually text, just stored as byte[], your approach would work, provided you encode the string correctly (i.e. if your DB expects UTF-8, use UTF-8 encoding; if it's windows-1251, use that).
If you have a structure where only a part of it is a string, what you're doing can't really work well. First, you probably want to modify just the relevant parts of the field. On MS SQL, you have handy functions for that. But even then, you should know what's actually stored there, not just assume that a string replace will magically work.
Now, a hack could be to use an explicit encoding that doesn't break the non-string data. That would be some single-byte encoding that doesn't do anything fancy. This is OK as long as you use the same encoding when reading the text data. However, if you use any variant of Unicode, you're out of luck; due to features like string normalization, you can't really guarantee that what comes in comes out the same way, byte for byte. It's generally bad practice anyway.
Don't forget that it's quite possible the string you are looking for is actually somewhere outside of the text fields - even by pure chance, it can happen, and certain practices make that even more likely.
Again: figure out the data format inside that data field - then you can decide how to do what you want.
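For completeness, the single-byte "hack" mentioned above could be sketched like this. ISO-8859-1 is my choice here, not something the answer names: it maps each byte value 0..255 to the character with the same code, so bytes that are not text pass through unchanged. ReplaceAscii and the sample bytes are hypothetical.

```csharp
using System;
using System.Text;

public static class ByteReplace
{
    // ISO-8859-1 maps every byte 0..255 to the code point of the same value.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // find and replacement should contain only characters below U+0100,
    // otherwise they cannot be written back as single bytes.
    public static byte[] ReplaceAscii(byte[] data, string find, string replacement)
    {
        string text = Latin1.GetString(data);
        return Latin1.GetBytes(text.Replace(find, replacement));
    }

    public static void Main()
    {
        // Two non-text bytes, the text "foo", one more non-text byte.
        byte[] data = { 0x00, 0xFF, (byte)'f', (byte)'o', (byte)'o', 0x80 };
        byte[] result = ReplaceAscii(data, "foo", "bar");
        Console.WriteLine(BitConverter.ToString(result));
    }
}
```

All the warnings above still apply: a replacement of a different length shifts every offset after it, and the search text can match bytes that were never text, so this is only safe once you know the format.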
Try this:
string result = System.Text.Encoding.UTF8.GetString(byteArray);
To convert a byte[] to a hexadecimal string:
byte[] byteArray = new byte[10]; // put your byte array here
string stringTemp;
public void byteToString()
{
    // BitConverter.ToString yields "0A-1B-..."; strip the dashes to get plain hex
    stringTemp = BitConverter.ToString(byteArray).Replace("-", "");
}
And your data is still in byteArray. :)
If the byte array contains binary data and is not a string, try converting it to Base64:
Convert.ToBase64String(yourByteArray);
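A short sketch of why Base64 is the safe option for arbitrary bytes: unlike decoding with a text encoding, it always converts back to exactly the same array. The class name and sample bytes are made up for the example.

```csharp
using System;
using System.Linq;

public static class BinaryAsText
{
    public static void Main()
    {
        byte[] original = { 0x00, 0xFF, 0x10 };

        // Text-safe representations of the raw bytes.
        string hex = BitConverter.ToString(original).Replace("-", "");
        string base64 = Convert.ToBase64String(original);
        Console.WriteLine(hex);
        Console.WriteLine(base64);

        // Base64 converts back to the identical byte sequence.
        byte[] restored = Convert.FromBase64String(base64);
        Console.WriteLine(restored.SequenceEqual(original));
    }
}
```

Neither form lets you search for text inside the data, though; they are only for storing or displaying the bytes.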

Byte array to text for ScintillaNET

I'm writing a Windows Forms application in C#. The application allows the user to select source code files from a listbox and displays them with syntax coloring using ScintillaNET. The files are saved as byte arrays in a database. I've managed to convert a file on my hard drive to a byte array and store it. The user should also be able to edit the code and then save it to the database without having to download the file to their local hard drive first, and I don't know how to approach this.
Basically I want to save the text from the ScintillaNET control and convert it to a byte array.
And the other way around, take a byte array and print out the text as it originally appeared in ScintillaNET.
You can use the "Encoding" class from System.Text.
System.Text.Encoding.Unicode.GetBytes("Example");
This will return a byte array with the bytes equivalent to the text "Example" using the Unicode encoding (which in .NET means UTF-16). There are other encodings available, but I suggest Unicode since it supports more characters than ASCII or a single code page (anything you find in Windows charmap, for example). In my case that matters because I write with accented Latin letters that ASCII cannot represent. UTF-8 can represent the same characters too, so it would also work.
Now to convert from the byte array to string use:
byte[] exampleByteArray = MemStream.ToArray();
System.Text.Encoding.Unicode.GetString(exampleByteArray);
This code returns the string that was previously saved as a byte array in a memory stream. You can load the byte array by other means; in your case you would load it from the database and then call System.Text.Encoding.Unicode.GetString().
I believe you are looking for the System.Text.Encoding class...
// a sample string...
string example = "A string example...";
// convert string to bytes
byte[] bytes = Encoding.UTF8.GetBytes(example);
// convert bytes to string
string str = System.Text.Encoding.UTF8.GetString(bytes);
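One thing to watch for, since the stored bytes originally came from files on disk: a file may start with a byte order mark (BOM), and Encoding.GetString does not strip it. A hedged sketch follows. The helper names are hypothetical, and I'm assuming UTF-8 as the fallback when no BOM is present. It reads the bytes through a StreamReader so a BOM is detected and removed:

```csharp
using System;
using System.IO;
using System.Text;

public static class SourceFileText
{
    // Decodes file bytes, honouring a UTF-8 or UTF-16 BOM if one is present
    // and falling back to UTF-8 otherwise (an assumption for this sketch).
    public static string BytesToText(byte[] fileBytes)
    {
        using (var stream = new MemoryStream(fileBytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Encodes the edited text as UTF-8 without a BOM for storage.
    public static byte[] TextToBytes(string text)
    {
        return new UTF8Encoding(false).GetBytes(text);
    }

    public static void Main()
    {
        byte[] withBom = { 0xEF, 0xBB, 0xBF, (byte)'h', (byte)'i' };
        Console.WriteLine(BytesToText(withBom).Length);              // BOM removed
        Console.WriteLine(Encoding.UTF8.GetString(withBom).Length);  // BOM kept as a character
    }
}
```

In the application, the string returned by BytesToText would be assigned to the control's Text property, and the control's Text would be passed to TextToBytes before saving.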

Convert UCS-2 characters to UTF-8 Using C#

I'm pulling some internationalized text from a MS SQL Server 2005 database. As per the defaults for that DB, the characters are stored as UCS-2. However, I need to output the data in UTF-8 format, as I'm sending it out over the web. Currently, I have the following code to convert:
SqlString dbString = resultReader.GetSqlString(0);
byte[] dbBytes = dbString.GetUnicodeBytes();
byte[] utf8Bytes = System.Text.Encoding.Convert(System.Text.Encoding.Unicode,
System.Text.Encoding.UTF8, dbBytes);
System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
string outputString = encoder.GetString(utf8Bytes);
However, when I examine the output in the browser, it appears to be garbage, no matter what I set the encoding to.
What am I missing?
EDIT:
In response to the answers below, the reason I thought I had to perform a conversion is because I can output literal multibyte strings just fine. For example:
OutputControl.Text = "カルフォルニア工科大学とチューリッヒ工科大学は共同で、太陽光を保管可能な燃料に直接変えることのできる装置の開発に成功したとのこと";
works. Here, OutputControl is an ASP.Net Literal. However,
OutputControl.Text = outputString; //Output from above snippet
results in mangled output as described above. My hypothesis was that the database's output was somehow getting mangled by ASP.Net. If that's not the case, then what are some other possibilities?
EDIT 2:
Okay, I'm stupid. It turns out that there's nothing wrong with the database at all. When I tried inserting my own literal double byte characters (材料,原料;木料), I could read and output them just fine even without any conversion process at all. It seems to me that whatever is inserting the data into the DB is mangling the characters somehow, so I'm going to look at that. With my verified, "clean" data, the following code works:
OutputControl.Text = dbString.ToString();
as the responses below indicate it should.
Your code does essentially the same as:
SqlString dbString = resultReader.GetSqlString(0);
string outputString = dbString.ToString();
string itself is a Unicode string (specifically UTF-16, which is almost the same as UCS-2, except that it can also represent code points that do not fit into 16 bits). In other words, the conversions you are performing are redundant.
Your web app most likely mangles the encoding somewhere else as well, or sets a wrong encoding for the HTML output. However, that can't be diagnosed from the information you provided so far.
A string in .NET is 'encoding agnostic'.
You can convert bytes to a string using a particular encoding to tell .NET how to interpret your bytes.
You can convert a string to bytes using a particular encoding to tell .NET how you want your bytes served.
But trying to convert a string to another string using encodings makes no sense at all.
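A small sketch of the point both answers make. The class name is made up and the sample text is borrowed from the question. Converting to UTF-8 bytes and decoding them again just gives back the original string; UTF-8 only matters at the moment the string is written out as bytes.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RedundantConversion
{
    public static void Main()
    {
        string original = "材料";

        // What the question's code does: UTF-16 bytes -> UTF-8 bytes -> string.
        byte[] utf16 = Encoding.Unicode.GetBytes(original);
        byte[] utf8 = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16);
        string roundTripped = Encoding.UTF8.GetString(utf8);
        Console.WriteLine(roundTripped == original);

        // What is actually needed when sending bytes over the wire.
        byte[] forTheWire = Encoding.UTF8.GetBytes(original);
        Console.WriteLine(forTheWire.SequenceEqual(utf8));
    }
}
```

In ASP.NET the framework performs that last step itself when it writes the response, so assigning the string to the control is enough as long as the response encoding is set correctly.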

Converting Byte Array to delimited String of raw byte values

I'm in the process of creating an application which will monitor specific registry key values for changes and write those changes to a text file.
At present I can monitor the changes, know when specific values have changed, and collect the data held in those values. The problem I'm having at the moment is that the data is returned as Byte and I want to convert it to String, initially for display so I know it returns the right value, and then so it can be saved to a text file.
The reason I'm asking is that the next time the user logs onto a system, those keys will be created or changed to match the previous values. (We're doing this as a way to save user preferences, as we're currently using mandatory profiles.)
If anyone has any advice it would be appreciated.
It depends what the bytes are.
You need to figure out what encoding the bytes were generated from, then write something like this:
string str = Encoding.UTF8.GetString(bytes);
Depending on how the bytes were made, you may need to use Encoding.ASCII or Encoding.GetEncoding.
You first need to determine what encoding the bytes are in before converting them to a string.
Then:
System.Text.Encoding enc = System.Text.Encoding.ASCII;
string myString = enc.GetString(myByteArray );
The System.Text.Encoding class exposes ready-made encodings that convert bytes into strings. In your case, you will want to use System.Text.Encoding.ASCII.GetString(bytes). Use other encodings (UTF8, or Unicode for UTF-16) as appropriate.
Instead of using a specified encoding, use the default one of your system:
public string ReadBytes(byte[] rawData)
{
    // Encoding.Default is the system's ANSI code page on .NET Framework
    // (and UTF-8 on .NET Core and later)
    return Encoding.Default.GetString(rawData);
}
Right I've now been able to resolve this by storing the registry value in a byte array, then adding each part of the array to a string, using the following code:
RegistryKey key = Registry.Users.OpenSubKey(e.RegistryValueChangeData.KeyPath);
Byte[] byteValue = (Byte[])key.GetValue(e.RegistryValueChangeData.ValueName);
string stringValue = "";
for (int i = 0; i < byteValue.Length; i++)
{
stringValue += string.Format("{0:X2} ", byteValue[i]);
}
Thanks for all the suggestions
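Since the values have to be written back to the registry at the next logon, the text form needs to be reversible. Here is a sketch of that round trip using the same "X2 plus space" format as the loop above. The class and method names are made up for the example.

```csharp
using System;
using System.Linq;

public static class HexText
{
    // Formats each byte as two uppercase hex digits, separated by spaces.
    public static string ToHex(byte[] bytes)
    {
        return string.Join(" ", bytes.Select(b => b.ToString("X2")));
    }

    // Parses the space-delimited hex text back into the original bytes.
    public static byte[] FromHex(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(token => Convert.ToByte(token, 16))
                   .ToArray();
    }

    public static void Main()
    {
        byte[] value = { 0x0A, 0xFF, 0x00 };
        string saved = ToHex(value);
        Console.WriteLine(saved);
        Console.WriteLine(FromHex(saved).SequenceEqual(value));
    }
}
```

The array returned by FromHex can then be passed to RegistryKey.SetValue with RegistryValueKind.Binary to restore the value.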
