Unicode Hex String to String - c#

I have a unicode string like this:
0030003100320033
Which should turn into 0123.
This is a simple case, the string 0123, but other inputs contain letters and non-ASCII Unicode characters as well. How can I turn this kind of Unicode hex string into a string in C#?
For the normal US charset the first part is always 00, so 0031 is "1" in ASCII, 0032 is "2", and so on.
When it's an actual Unicode char, such as Arabic or Chinese, the first part is not 00; for Arabic, for instance, it's 06XX, like 0663.
I need to be able to turn this type of hex string into a decoded C# string.

There are several encodings that can represent Unicode, of which UTF-8 is today's de facto standard. However, your example is actually a string representation of UTF-16 using the big-endian byte order. You can convert your hex string back into bytes, then use Encoding.BigEndianUnicode to decode this:
public static void Main()
{
    var bytes = StringToByteArray("0030003100320033");
    var decoded = System.Text.Encoding.BigEndianUnicode.GetString(bytes);
    Console.WriteLine(decoded); // gives "0123"
}
// https://stackoverflow.com/a/311179/1149773
public static byte[] StringToByteArray(string hex)
{
    byte[] bytes = new byte[hex.Length / 2];
    for (int i = 0; i < hex.Length; i += 2)
        bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
    return bytes;
}
Since Char in .NET represents a UTF-16 code unit, this answer should give identical results to Slai's, including for surrogate pairs.
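To make the surrogate-pair remark concrete, here is a minimal sketch. The class name HexUtf16 and the method Decode are made up for illustration, and the code assumes the input consists of whole UTF-16 code units (a multiple of 4 hex digits):

```csharp
using System;
using System.Text;

public static class HexUtf16
{
    // Decodes a hex string holding big-endian UTF-16 code units (4 hex digits each).
    public static string Decode(string hex)
    {
        if (hex.Length % 4 != 0)
            throw new FormatException("Expected a multiple of 4 hex digits.");

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);

        return Encoding.BigEndianUnicode.GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(Decode("0030003100320033")); // 0123

        // U+1F600 is stored as the surrogate pair D83D DE00: two chars, one code point.
        string emoji = Decode("D83DDE00");
        Console.WriteLine(emoji.Length);                                // 2
        Console.WriteLine(char.ConvertToUtf32(emoji, 0).ToString("X")); // 1F600
    }
}
```

The length check is an added safeguard, not part of the answer above; without it a trailing odd group would be silently dropped or decoded incorrectly.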

Shorter less efficient alternative:
Regex.Replace("0030003100320033", "....", m => (char)Convert.ToInt32(m + "", 16) + "");

You should try this solution
public static void Main()
{
string hexString = "0030003100320033"; //Hexa pair numeric values
//string hexStrWithDash = "00-30-00-31-00-32-00-33"; //Hexa pair numeric values separated by dashed. This occurs using BitConverter.ToString()
byte[] data = ParseHex(hexString);
string result = System.Text.Encoding.BigEndianUnicode.GetString(data);
Console.Write("Data: {0}", result);
}
public static byte[] ParseHex(string hexString)
{
hexString = hexString.Replace("-", "");
byte[] output = new byte[hexString.Length / 2];
for (int i = 0; i < output.Length; i++)
{
output[i] = Convert.ToByte(hexString.Substring(i * 2, 2), 16);
}
return output;
}

Related

How to convert json to hexadecimal in c#

I have json string as in example below
{"SaleToPOIRequest":{"MessageHeader":{"ProtocolVersion":"2.0","MessageClass":"Service","MessageCategory":"Login","MessageType":"Request","ServiceID":"498","SaleID":"SaleTermA","POIID":"POITerm1"},"LogoutRequest":{}}}
I want to convert the JSON request to hexadecimal. I tried the example in this link, but I cannot get an exact conversion because of the {, :, " and } characters.
I do get a hexadecimal result, but when I convert it back to a string I get the following:
{"SaleToPOIReque§7B#§²$ÖW76vTVder":{"ProtocolV¦W'6öâ#¢#"ã"Â$ÚessageClass":"Se§'f6R"Â$ÖW76vT:ategory":"Login"¢Â$ÖW76vUGR#¢*Request","Servic¤B#¢#C"Â%6ÆZID":"SaleTermA",¢%ôB#¢%ôFW&Ú1"},"LogoutReque§7B#§·×
That is not useful to me.
Is there any way to convert this?
So basically the problem is not only converting to hex but also converting back.
This is nothing more than combining two answers already on SO:
First for converting we use the answer given here: Convert string to hex-string in C#
Then for the converting back you can use this answer:
https://stackoverflow.com/a/724905/10608418
For you it would then look something like this:
class Program
{
static void Main(string[] args)
{
var input = "{\"SaleToPOIRequest\":{\"MessageHeader\":{\"ProtocolVersion\":\"2.0\",\"MessageClass\":\"Service\",\"MessageCategory\":\"Login\",\"MessageType\":\"Request\",\"ServiceID\":\"498\",\"SaleID\":\"SaleTermA\",\"POIID\":\"POITerm1\"},\"LogoutRequest\":{}}}";
var hex = string.Join("",
input.Select(c => String.Format("{0:X2}", Convert.ToInt32(c))));
var output = Encoding.ASCII.GetString(FromHex(hex));
Console.WriteLine($"input: {input}");
Console.WriteLine($"hex: {hex}");
Console.WriteLine($"output: {output}");
Console.ReadKey();
}
public static byte[] FromHex(string hex)
{
byte[] raw = new byte[hex.Length / 2];
for (int i = 0; i < raw.Length; i++)
{
raw[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
}
return raw;
}
}
See it in action in a fiddle here:
https://dotnetfiddle.net/axUC5n
Hope this helps and good luck with your project
You should most probably use Encoding.Unicode to convert the string to a byte array: it's quite possible that some characters cannot be represented in ASCII.
Encoding.Unicode (UTF-16LE) uses 2 bytes per UTF-16 code unit, so it's predictable: every 4 characters of the hex string represent one UTF-16 code unit (characters outside the Basic Multilingual Plane take two code units, i.e. 8 hex characters),
no matter what characters the input string contains.
Convert string to HEX:
string input = "Yourstring \"Ваша строка\"{あなたのひも},آپ کی تار";
string hex = string.Concat(Encoding.Unicode.GetBytes(input).Select(b => b.ToString("X2")));
Convert back to string:
var bytes = new List<byte>();
for (int i = 0; i < hex.Length; i += 2) {
bytes.Add(byte.Parse(hex.Substring(i, 2), NumberStyles.HexNumber));
}
string original = Encoding.Unicode.GetString(bytes.ToArray());
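One thing to watch when mixing this with hex strings from other sources is byte order. The following sketch (HexByteOrder and ToHex are illustrative names, not from the answer) shows that Encoding.Unicode and Encoding.BigEndianUnicode produce different hex for the same text, so both sides have to agree on one of them:

```csharp
using System;
using System.Linq;
using System.Text;

public static class HexByteOrder
{
    // Encodes the text with the given encoding and formats each byte as two hex digits.
    public static string ToHex(string text, Encoding encoding) =>
        string.Concat(encoding.GetBytes(text).Select(b => b.ToString("X2")));

    public static void Main()
    {
        Console.WriteLine(ToHex("A1", Encoding.Unicode));          // 41003100 (little-endian)
        Console.WriteLine(ToHex("A1", Encoding.BigEndianUnicode)); // 00410031 (big-endian)
    }
}
```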

How do you convert UTF8 number into written text

I am writing a winform to convert written text into Unicode numbers and UTF8 numbers. This bit is working well
//------------------------------------------------------------------------
// Convert to UTF8
// The return will be either 1 byte, 2 bytes or 3 bytes.
//-----------------------------------------------------------------------
UTF8Encoding utf8 = new UTF8Encoding();
StringBuilder builder = new StringBuilder();
string utext = rchtxbx_text.Text;
// do one char at a time
for (int text_index = 0; text_index < utext.Length; text_index++)
{
byte[] encodedBytes = utf8.GetBytes(utext.Substring(text_index, 1));
for (int index = 0; index < encodedBytes.Length; index++)
{
builder.AppendFormat("{0}", Convert.ToString(encodedBytes[index], 16));
}
builder.Append(" ");
}
rchtxtbx_UTF8.SelectionFont = new System.Drawing.Font("San Serif", 20);
rchtxtbx_UTF8.AppendText(builder.ToString() + "\r");
As an example the characters 乘义ש give me e4b998 e4b989 d7a9, note I have a mix LtoR and RtoL text. Now if the user inputs the number e4b998 I want to show them it is 乘, in Unicode 4E58
I have tried a few things and the closest I got, but still far away, is
Encoding utf8 = Encoding.UTF8;
rchtxbx_text.Text = Encoding.ASCII.GetString(utf8.GetBytes(e4b998));
What do I need to do to input e4b998 and write 乘 to a textbox?
Something like this:
Split source into 2-character chunks: "e4b998" -> {"e4", "b9", "98"}
Convert chunks into bytes
Encode bytes into the final string
Implementation:
string source = "e4b998";
string result = Encoding.UTF8.GetString(Enumerable
.Range(0, source.Length / 2)
.Select(i => Convert.ToByte(source.Substring(i * 2, 2), 16))
.ToArray());
If you have an int as the source (for example 0xe4b998), convert it to a hex string first and then apply the code above:
int value = 0xe4b998;
string source = value.ToString("x"); // "e4b998"
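The question also mentions showing the Unicode number (4E58) for the decoded character. A possible sketch of that step follows; Utf8Hex, Decode and CodePoint are hypothetical names, and the code assumes the hex input is valid UTF-8 with an even number of digits:

```csharp
using System;
using System.Linq;
using System.Text;

public static class Utf8Hex
{
    // Turns a hex string such as "e4b998" into the text those UTF-8 bytes encode.
    public static string Decode(string hex)
    {
        byte[] bytes = Enumerable.Range(0, hex.Length / 2)
            .Select(i => Convert.ToByte(hex.Substring(i * 2, 2), 16))
            .ToArray();
        return Encoding.UTF8.GetString(bytes);
    }

    // Formats the first code point of the text as U+XXXX.
    public static string CodePoint(string text) =>
        "U+" + char.ConvertToUtf32(text, 0).ToString("X4");

    public static void Main()
    {
        string text = Decode("e4b998");
        Console.WriteLine(CodePoint(text)); // U+4E58
    }
}
```

char.ConvertToUtf32 is used instead of a plain cast to int so that characters stored as surrogate pairs also report their full code point.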

C# - Get Integer Byte Array in String

I have a random integer value which I need to represent in String as a Byte array. For example:
int value = 32;
String strValue = getStringByteArray(value);
Console.WriteLine(strValue); // should write: " \0\0\0"
If value = 11 then getStringByteArray(value) should return "\v\0\0\0".
If value = 13 then getStringByteArray(value) should return "\r\0\0\0".
And so on.
Any idea on how to implement the method getStringByteArray(int value) in C#?
UPDATE
This is the code that receives the data from the C# NamedPipe Server:
bool CFilePipe::ReadString(int m_handle, string &value)
{
//--- check for data
if(WaitForRead(sizeof(int)))
{
ResetLastError();
int size=FileReadInteger(m_handle);
if(GetLastError()==0)
{
//--- check for data
if(WaitForRead(size))
{
value=FileReadString(m_handle,size);
return(size==StringLen(value));
}
}
}
//--- failure
return(false);
}
Don't take this approach at all. You should be writing to a binary stream of some description - and write the binary data for the length of the packet/message, followed by the message itself. For example:
BinaryWriter writer = new BinaryWriter(stream);
byte[] data = Encoding.UTF8.GetBytes(text);
writer.Write(data.Length);
writer.Write(data);
Then at the other end, you'd use:
BinaryReader reader = new BinaryReader(stream);
int length = reader.ReadInt32();
byte[] data = reader.ReadBytes(length);
string text = Encoding.UTF8.GetString(data);
No need to treat binary data as text at all.
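As a rough end-to-end illustration of the length-prefixed approach, the sketch below packs and unpacks a message through a MemoryStream instead of a real pipe. LengthPrefixed, Pack and Unpack are illustrative names only; whether the receiving side expects exactly this layout is an assumption you would need to check against its protocol:

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixed
{
    // Writes a 4-byte length followed by the UTF-8 bytes of the text.
    public static byte[] Pack(string text)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            byte[] data = Encoding.UTF8.GetBytes(text);
            writer.Write(data.Length); // BinaryWriter writes Int32 as 4 little-endian bytes
            writer.Write(data);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the 4-byte length, then that many bytes, and decodes them as UTF-8.
    public static string Unpack(byte[] packet)
    {
        using (var reader = new BinaryReader(new MemoryStream(packet)))
        {
            int length = reader.ReadInt32();
            return Encoding.UTF8.GetString(reader.ReadBytes(length));
        }
    }

    public static void Main()
    {
        byte[] packet = Pack("hello");
        Console.WriteLine(BitConverter.ToString(packet)); // 05-00-00-00-68-65-6C-6C-6F
        Console.WriteLine(Unpack(packet));                // hello
    }
}
```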
Well, first of all you should get the bytes from the integer. You can do that with BitConverter:
var bytes = BitConverter.GetBytes(value);
Next, there are three variants. First: if you want the result in binary format, write each byte as 8 binary digits:
var str = string.Concat(bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));
Second: if you want to convert your byte array to a hexadecimal string:
var hex = BitConverter.ToString(bytes).Replace("-", "");
Third: your representation ("\v\0\0\0") is a simple conversion of each byte to a char. Use this:
var s = bytes.Aggregate(string.Empty, (current, t) => current + Convert.ToChar(t));
This should help with that.
class Program
{
static void Main(string[] args)
{
Random rand = new Random();
int number = rand.Next(1, 1000);
byte[] intBytes = BitConverter.GetBytes(number);
string answer = "";
for (int i = 0; i < intBytes.Length; i++)
{
answer += intBytes[i] + @"\";
}
Console.WriteLine(answer);
Console.WriteLine(number);
Console.ReadKey();
}
}
Obviously, you should implement two steps to achieve the goal:
Extract bytes from the integer in the appropriate order (little-endian or big-endian, it's up to you to decide), using bit arithmetics.
Merge extracted bytes into string using the format you need.
Possible implementation:
using System;
using System.Text;
public class Test
{
public static void Main()
{
Int32 value = 5152;
byte[] bytes = new byte[4];
for (int i = 0; i < 4; i++)
{
bytes[i] = (byte)((value >> i * 8) & 0xFF);
}
StringBuilder result = new StringBuilder();
for (int i = 0; i < 4; i++)
{
result.Append("\\" + bytes[i].ToString("X2"));
}
Console.WriteLine(result);
}
}
Ideone snippet: http://ideone.com/wLloo1
I think you are saying that you want to convert each byte into a character literal, using escape sequences for the non printable characters.
After converting the integer to 4 bytes, cast to char. Then use Char.IsControl() to identify the non-printing characters. Use the printable char directly, and use a lookup table to find the corresponding escape sequence for each non-printable char.
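The lookup-table idea could be sketched roughly as follows. ByteEscaper and Escape are hypothetical names, the "\xNN" fallback for control characters without a short escape is my own choice, and the byte order assumes a little-endian machine (which is what BitConverter.GetBytes returns on typical x86/x64 and ARM systems):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ByteEscaper
{
    // Lookup table for the control characters that have a short C# escape sequence.
    private static readonly Dictionary<char, string> Escapes = new Dictionary<char, string>
    {
        { '\0', "\\0" }, { '\a', "\\a" }, { '\b', "\\b" }, { '\t', "\\t" },
        { '\n', "\\n" }, { '\v', "\\v" }, { '\f', "\\f" }, { '\r', "\\r" },
    };

    // Converts the 4 bytes of the integer to chars, escaping the non-printable ones.
    public static string Escape(int value)
    {
        var result = new StringBuilder();
        foreach (byte b in BitConverter.GetBytes(value))
        {
            char c = (char)b;
            if (Escapes.TryGetValue(c, out string escape))
                result.Append(escape);
            else if (char.IsControl(c))
                result.Append("\\x").Append(b.ToString("X2")); // no short escape exists
            else
                result.Append(c); // printable: use the char directly
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Escape(32)); // a space followed by \0\0\0
        Console.WriteLine(Escape(11)); // \v\0\0\0
        Console.WriteLine(Escape(13)); // \r\0\0\0
    }
}
```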

Expressing byte values > 127 in .Net Strings

I'm writing some binary protocol messages in .Net using strings, and it mostly works, except for one particular case.
The message I'm trying to send is:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(Encoding.ASCII.GetBytes(cmdPacket));
(to help decode, those bytes are 253, 11, 22, then the ASCII chars: "MBEPEXE1.").
Except when I do the Encoding.ASCII.GetBytes, the 0xFD comes out as byte 0x3F
(value 253 changed to 63).
(I should point out that the \x0B and \x16 are interpreted correctly as Hex 0B & Hex 16)
I've also tried Encoding.UTF8 and Encoding.UTF7, to no avail.
I feel there is probably a good simple way to express values above 127 in Strings, and convert them to bytes, but I'm missing it.
Any guidance?
Ignoring if it's good or bad what you are doing, the encoding ISO-8859-1 maps all its characters to the characters with the same code in Unicode.
// Bytes with all the possible values 0-255
var bytes = Enumerable.Range(0, 256).Select(p => (byte)p).ToArray();
// String containing the values
var all1bytechars = new string(bytes.Select(p => (char)p).ToArray());
// Sanity check
Debug.Assert(all1bytechars.Length == 256);
// The encoder, you could make it static readonly
var enc = Encoding.GetEncoding("ISO-8859-1"); // It is the codepage 28591
// string-to-bytes
var bytes2 = enc.GetBytes(all1bytechars);
// bytes-to-string
var all1bytechars2 = enc.GetString(bytes);
// check string-to-bytes
Debug.Assert(bytes.SequenceEqual(bytes2));
// check bytes-to-string
Debug.Assert(all1bytechars.SequenceEqual(all1bytechars2));
From the wiki:
ISO-8859-1 was incorporated as the first 256 code points of ISO/IEC 10646 and Unicode.
Or a simple and fast method to convert a string to a byte[] (with unchecked and checked variant)
public static byte[] StringToBytes(string str)
{
var bytes = new byte[str.Length];
for (int i = 0; i < str.Length; i++)
{
bytes[i] = checked((byte)str[i]); // Slower but throws OverflowException if there is an invalid character
//bytes[i] = unchecked((byte)str[i]); // Faster
}
return bytes;
}
ASCII is a 7-bit code. The high-order bit used to be used as a parity bit, so "ASCII" could have even, odd or no parity. You may notice that 0x3F (decimal 63) is the ASCII character ?. That is what non-ASCII octets (those greater than 0x7F/decimal 127) are converted to by the CLR's ASCII encoding. The reason is that there is no standard ASCII character representation of the code points in the range 0x80–0xFF.
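A small sketch of that replacement behaviour, compared with ISO-8859-1, is shown here. AsciiFallbackDemo and FirstByte are illustrative names, and the string uses \u escapes instead of \x only to avoid the variable-length parsing of \x in C#:

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Returns the first byte produced when the text is encoded with the given encoding.
    public static byte FirstByte(string text, Encoding encoding) =>
        encoding.GetBytes(text)[0];

    public static void Main()
    {
        string packet = "\u00FD\u000B\u0016MBEPEXE1.";
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        Console.WriteLine(FirstByte(packet, Encoding.ASCII).ToString("X2")); // 3F, i.e. '?'
        Console.WriteLine(FirstByte(packet, latin1).ToString("X2"));         // FD
    }
}
```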
C# strings are UTF-16 encoded Unicode internally. If what you care about are the byte values of the strings, and you know that the strings are, in fact, characters whose Unicode code points are in the range U+0000 through U+00FF, then it's easy. Unicode's first 256 code points (0x00–0xFF), the Unicode blocks C0 Controls and Basic Latin (\x00-\x7F) and C1 Controls and Latin-1 Supplement (\x80-\xFF), are the "normal" ISO-8859-1 characters. A simple incantation like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] buffer = cmdPacket.Select(c => (byte)c).ToArray();
myDevice.Write(buffer);
will get you the byte[] you want, in this case
// \xFD \x0B \x16 M B E P E X E 1 .
[ 0xFD , 0x0B , 0x16 , 0x4d , 0x42 , 0x45, 0x50 , 0x45 , 0x58 , 0x45 , 0x31 , 0x2E ]
With LINQ, you could do something like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(cmdPacket.Select(Convert.ToByte).ToArray());
Edit: Added an explanation
First, you recognize that your string is really just an array of characters. What you want is an "equivalent" array of bytes, where each byte corresponds to a character.
To get the array, you have to "map" each character of the original array as a byte in the new array. To do that, you can use the built-in System.Convert.ToByte(char) method.
Once you've described your mapping from characters to bytes, it's as simple as projecting the input string, through the mapping, into an array.
Hope that helps!
I use Windows-1252 as it seems to give the most bang for the byte
And is compatible with all .NET string values
You will probably want to comment out the ToLower
This was built for compatibility with SQL char (single byte)
namespace String1byte
{
/// <summary>
/// Interaction logic for MainWindow.xaml
/// </summary>
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
String8bit s1 = new String8bit("cat");
String8bit s2 = new String8bit("cat");
String8bit s3 = new String8bit("\xFD\x0B\x16MBEPEXE1.");
HashSet<String8bit> hs = new HashSet<String8bit>();
hs.Add(s1);
hs.Add(s2);
hs.Add(s3);
System.Diagnostics.Debug.WriteLine(hs.Count.ToString());
System.Diagnostics.Debug.WriteLine(s1.Value + " " + s1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s2.Value + " " + s2.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s3.Value + " " + s3.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s1.Equals(s2).ToString());
System.Diagnostics.Debug.WriteLine(s1.Equals(s3).ToString());
System.Diagnostics.Debug.WriteLine(s1.MatchStart("ca").ToString());
System.Diagnostics.Debug.WriteLine(s3.MatchStart("ca").ToString());
}
}
public struct String8bit
{
private static Encoding EncodingUnicode = Encoding.Unicode;
private static Encoding EncodingWin1252 = System.Text.Encoding.GetEncoding("Windows-1252");
private byte[] bytes;
public override bool Equals(Object obj)
{
// Check for null values and compare run-time types.
if (obj == null) return false;
if (!(obj is String8bit)) return false;
String8bit comp = (String8bit)obj;
if (comp.Bytes.Length != this.Bytes.Length) return false;
for (Int32 i = 0; i < comp.Bytes.Length; i++)
{
if (comp.Bytes[i] != this.Bytes[i])
return false;
}
return true;
}
public override int GetHashCode()
{
// Fold every byte into the hash (the original XORed Bytes[0] on each iteration).
UInt32 hash = 0;
for (Int32 i = 0; i < Bytes.Length; i++) hash ^= (UInt32)Bytes[i] << ((i % 4) * 8);
return unchecked((Int32)hash);
}
public bool MatchStart(string start)
{
if (string.IsNullOrEmpty(start)) return false;
if (start.Length > this.Length) return false;
start = start.ToLowerInvariant(); // SQL is case insensitive
// Convert the string into a byte array
byte[] unicodeBytes = EncodingUnicode.GetBytes(start);
// Perform the conversion from one encoding to the other
byte[] win1252Bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
for (Int32 i = 0; i < win1252Bytes.Length; i++) if (Bytes[i] != win1252Bytes[i]) return false;
return true;
}
public byte[] Bytes { get { return bytes; } }
public String Value { get { return EncodingWin1252.GetString(Bytes); } }
public Int32 Length { get { return Bytes.Count(); } }
public String8bit(string word)
{
word = word.ToLowerInvariant(); // SQL is case insensitive
// Convert the string into a byte array
byte[] unicodeBytes = EncodingUnicode.GetBytes(word);
// Perform the conversion from one encoding to the other
bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
}
public String8bit(Byte[] win1252bytes)
{ // if reading from SQL char then read as System.Data.SqlTypes.SqlBytes
bytes = win1252bytes;
}
}
}

How can convert a hex string into a string whose ASCII values have the same value in C#?

Assume that I have a string containing a hex value. For example:
string command = "0xABCD1234";
How can I convert that string into another string (for example, string codedString = ...) such that this new string's ASCII-encoded representation has the same binary as the original strings contents?
The reason I need to do this is that I have a library from a hardware manufacturer that can transmit data from their piece of hardware to another piece of hardware over SPI. Their functions take strings as input, but when I try to send "AA" I expect the SPI to transmit the binary 10101010; instead it transmits the ASCII representation of "AA", which is 0100000101000001.
Also, this hex string is going to be 32 hex characters long (that is, 128 bits long).
string command = "AA";
int num = int.Parse(command,NumberStyles.HexNumber);
string bits = Convert.ToString(num,2); // <-- 10101010
I think I understand what you need. Here is the main code part; asciiStringWithTheRightBytes is what you would send as your command.
var command = "ABCD1234";
var byteCommand = GetBytesFromHexString(command);
// ISO-8859-1 maps bytes 0x00-0xFF one-to-one to chars; Encoding.ASCII would turn bytes above 0x7F into '?'.
var asciiStringWithTheRightBytes = Encoding.GetEncoding("ISO-8859-1").GetString(byteCommand);
And the subroutines it uses are here...
static byte[] GetBytesFromHexString(string str)
{
    // Two hex digits make one byte: the first is the high nibble, the second the low nibble.
    byte[] bytes = new byte[str.Length / 2];
    for (var i = 0; i < bytes.Length; i++)
        bytes[i] = (byte)((HexToInt(str[2 * i]) << 4) | HexToInt(str[2 * i + 1]));
    return bytes;
}
static byte HexToInt(char hexChar)
{
    hexChar = char.ToUpper(hexChar); // accept lowercase hex digits too
    return (byte)((int)hexChar < (int)'A' ?
        ((int)hexChar - (int)'0') :
        10 + ((int)hexChar - (int)'A'));
}
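For comparison, here is a self-contained sketch of the same idea that also strips an optional "0x" prefix and checks the input length. HexToRawString and Convert8Bit are made-up names, and it is only an assumption that the manufacturer's library sends each char of the string as exactly one byte; if it encodes the string as ASCII or UTF-8 internally, bytes above 0x7F will not survive and a byte-array API would be needed instead:

```csharp
using System;
using System.Text;

public static class HexToRawString
{
    // Builds a string whose chars have the same numeric values as the bytes in the hex input.
    public static string Convert8Bit(string hex)
    {
        if (hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase))
            hex = hex.Substring(2);
        if (hex.Length % 2 != 0)
            throw new FormatException("Expected an even number of hex digits.");

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);

        // ISO-8859-1 maps each byte 0x00-0xFF to the char with the same value.
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
    }

    public static void Main()
    {
        string coded = Convert8Bit("0xABCD1234");
        Console.WriteLine(coded.Length);                   // 4
        Console.WriteLine(((int)coded[0]).ToString("X2")); // AB
    }
}
```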
