Expressing byte values > 127 in .NET Strings - C#

I'm writing some binary protocol messages in .Net using strings, and it mostly works, except for one particular case.
The message I'm trying to send is:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(Encoding.ASCII.GetBytes(cmdPacket));
(to help decode, those bytes are 253, 11, 22, then the ASCII chars: "MBEPEXE1.").
Except when I do the Encoding.ASCII.GetBytes, the 0xFD comes out as byte 0x3F
(value 253 changed to 63).
(I should point out that the \x0B and \x16 are interpreted correctly as Hex 0B & Hex 16)
I've also tried Encoding.UTF8 and Encoding.UTF7, to no avail.
I feel there is probably a good, simple way to express values above 127 in strings and convert them to bytes, but I'm missing it.
Any guidance?

Setting aside whether what you are doing is a good idea or not, the ISO-8859-1 encoding maps every one of its characters to the Unicode character with the same code point.
// Bytes with all the possible values 0-255
var bytes = Enumerable.Range(0, 256).Select(p => (byte)p).ToArray();
// String containing the values
var all1bytechars = new string(bytes.Select(p => (char)p).ToArray());
// Sanity check
Debug.Assert(all1bytechars.Length == 256);
// The encoder, you could make it static readonly
var enc = Encoding.GetEncoding("ISO-8859-1"); // It is the codepage 28591
// string-to-bytes
var bytes2 = enc.GetBytes(all1bytechars);
// bytes-to-string
var all1bytechars2 = enc.GetString(bytes);
// check string-to-bytes
Debug.Assert(bytes.SequenceEqual(bytes2));
// check bytes-to-string
Debug.Assert(all1bytechars.SequenceEqual(all1bytechars2));
From Wikipedia:
ISO-8859-1 was incorporated as the first 256 code points of ISO/IEC 10646 and Unicode.
Or here is a simple, fast method to convert a string to a byte[] (with checked and unchecked variants):
public static byte[] StringToBytes(string str)
{
    var bytes = new byte[str.Length];
    for (int i = 0; i < str.Length; i++)
    {
        bytes[i] = checked((byte)str[i]); // Slower, but throws OverflowException if there is an invalid character
        //bytes[i] = unchecked((byte)str[i]); // Faster
    }
    return bytes;
}
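A minimal usage sketch with the question's cmdPacket and myDevice, assuming the StringToBytes method above:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] payload = StringToBytes(cmdPacket); // 0xFD, 0x0B, 0x16, then the ASCII bytes for "MBEPEXE1."
myDevice.Write(payload);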

ASCII is a 7-bit code. The high-order bit used to be used as a parity bit, so "ASCII" could have even, odd or no parity. You may notice that 0x3F (decimal 63) is the ASCII character ?. That is what non-ASCII octets (those greater than 0x7F/decimal 127) are converted to by the CLR's ASCII encoding. The reason is that there is no standard ASCII character representation of the code points in the range 0x80–0xFF.
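You can see that substitution directly; this small sketch (reusing the question's cmdPacket string) prints 3F for the first byte:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] asciiBytes = Encoding.ASCII.GetBytes(cmdPacket);
Console.WriteLine(asciiBytes[0].ToString("X2")); // "3F" - the '?' fallback, not FD
Console.WriteLine(asciiBytes[1].ToString("X2")); // "0B" - values up to 0x7F pass through unchanged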
C# strings are UTF-16 encoded Unicode internally. If what you care about are the byte values of the strings, and you know that the strings are, in fact, characters whose Unicode code points are in the range U+0000 through U+00FF, then it's easy. Unicode's first 256 code points (0x00-0xFF), the Unicode blocks C0 Controls and Basic Latin (\x00-\x7F) and C1 Controls and Latin-1 Supplement (\x80-\xFF), are the "normal" ISO-8859-1 characters. A simple incantation like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] buffer = cmdPacket.Select(c=>(byte)c).ToArray() ;
myDevice.Write(buffer);
will get you the byte[] you want, in this case
// \xFD \x0B \x16 M B E P E X E 1 .
[ 0xFD , 0x0B , 0x16 , 0x4d , 0x42 , 0x45, 0x50 , 0x45 , 0x58 , 0x45 , 0x31 , 0x2E ]
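Going back the other way is the mirror-image projection, and it round-trips because every character in the packet stays in the U+0000 to U+00FF range:
String roundTripped = new String(buffer.Select(b => (char)b).ToArray());
Debug.Assert(roundTripped == cmdPacket); // same 12 characters as the original string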

With LINQ, you could do something like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(cmdPacket.Select(Convert.ToByte).ToArray());
Edit: Added an explanation
First, you recognize that your string is really just an array of characters. What you want is an "equivalent" array of bytes, where each byte corresponds to a character.
To get the array, you have to "map" each character of the original array as a byte in the new array. To do that, you can use the built-in System.Convert.ToByte(char) method.
Once you've described your mapping from characters to bytes, it's as simple as projecting the input string, through the mapping, into an array.
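If the projection feels too terse, the same mapping can be written as an explicit loop; this sketch is equivalent to the one-liner above:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] buffer = new byte[cmdPacket.Length];
for (int i = 0; i < cmdPacket.Length; i++)
{
    buffer[i] = Convert.ToByte(cmdPacket[i]); // throws OverflowException for chars above U+00FF
}
myDevice.Write(buffer);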
Hope that helps!

I use Windows-1252, as it seems to give the most bang for the byte, and it is compatible with .NET string values.
You will probably want to comment out the ToLowerInvariant calls.
This was built for compatibility with the SQL char type (single byte).
namespace String1byte
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();
            String8bit s1 = new String8bit("cat");
            String8bit s2 = new String8bit("cat");
            String8bit s3 = new String8bit("\xFD\x0B\x16MBEPEXE1.");
            HashSet<String8bit> hs = new HashSet<String8bit>();
            hs.Add(s1);
            hs.Add(s2);
            hs.Add(s3);
            System.Diagnostics.Debug.WriteLine(hs.Count.ToString());
            System.Diagnostics.Debug.WriteLine(s1.Value + " " + s1.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s2.Value + " " + s2.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s3.Value + " " + s3.GetHashCode().ToString());
            System.Diagnostics.Debug.WriteLine(s1.Equals(s2).ToString());
            System.Diagnostics.Debug.WriteLine(s1.Equals(s3).ToString());
            System.Diagnostics.Debug.WriteLine(s1.MatchStart("ca").ToString());
            System.Diagnostics.Debug.WriteLine(s3.MatchStart("ca").ToString());
        }
    }

    public struct String8bit
    {
        private static Encoding EncodingUnicode = Encoding.Unicode;
        private static Encoding EncodingWin1252 = System.Text.Encoding.GetEncoding("Windows-1252");
        private byte[] bytes;

        public override bool Equals(Object obj)
        {
            // Check for null values and compare run-time types.
            if (obj == null) return false;
            if (!(obj is String8bit)) return false;
            String8bit comp = (String8bit)obj;
            if (comp.Bytes.Length != this.Bytes.Length) return false;
            for (Int32 i = 0; i < comp.Bytes.Length; i++)
            {
                if (comp.Bytes[i] != this.Bytes[i])
                    return false;
            }
            return true;
        }

        public override int GetHashCode()
        {
            UInt32 hash = (UInt32)(Bytes[0]);
            // XOR each byte into the hash, shifted by its position within a 32-bit word
            for (Int32 i = 1; i < Bytes.Length; i++) hash = hash ^ (UInt32)(Bytes[i] << (i % 4) * 8);
            return (Int32)hash;
        }

        public bool MatchStart(string start)
        {
            if (string.IsNullOrEmpty(start)) return false;
            if (start.Length > this.Length) return false;
            start = start.ToLowerInvariant(); // SQL is case insensitive
            // Convert the string into a byte array
            byte[] unicodeBytes = EncodingUnicode.GetBytes(start);
            // Perform the conversion from one encoding to the other
            byte[] win1252Bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
            for (Int32 i = 0; i < win1252Bytes.Length; i++) if (Bytes[i] != win1252Bytes[i]) return false;
            return true;
        }

        public byte[] Bytes { get { return bytes; } }
        public String Value { get { return EncodingWin1252.GetString(Bytes); } }
        public Int32 Length { get { return Bytes.Length; } }

        public String8bit(string word)
        {
            word = word.ToLowerInvariant(); // SQL is case insensitive
            // Convert the string into a byte array
            byte[] unicodeBytes = EncodingUnicode.GetBytes(word);
            // Perform the conversion from one encoding to the other
            bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
        }

        public String8bit(Byte[] win1252bytes)
        {
            // if reading from SQL char then read as System.Data.SqlTypes.SqlBytes
            bytes = win1252bytes;
        }
    }
}

Related

How to turn an ASCII value back into a character

I have a text message I have converted to ASCII. I've then used the ASCII value and a keyword to convert the ASCII value to that of its corresponding letter in a unique alphabet. How would I convert the ASCII number back into a character? I am currently using characters 97 - 122.
foreach (char c in txtEncryption.Text) // Finding ascii values for each character
{
    byte[] TempAsciiValue = Encoding.ASCII.getChars(c); // Fix,,,
    string TempAsciiValStr = TempAsciiValue.ToString();
    int TempAsciiVal = int.Parse(TempAsciiValStr);
    if (TempAsciiVal == 32)
    {
        ArrayVal[Count] = TempAsciiVal;
    }
    else
    {
        // Calculations
        int Difference = TempAsciiVal - 65; // Finds what letter after A it is
        int TempValue = ArrayAlphabet[Count, 1]; // Find the starting ASCII value for new alphabet
        int TempValuePlusDifference = TempValue + Difference;
        // Convert the ASCII value to the letter
        ArrayVal[Count] = TempValuePlusDifference; // Store the letter's ASCII code
        Count++;
        if (Count > 3)
        {
            Count = 1;
        }
    }
    for (int d = 1; d < CountMessageLength; d++)
    {
        string TempArrayVal = ArrayVal[Count].ToString();
        txtEncryption2.Text = TempArrayVal;
        // Convert TempArrayVal to = Letter (TempLetterStorage),,,,
        // String FinalMessage = all TempLetterStorage values
    }
}
Starting with bytes that are ASCII character codes (say, for the sake of this exercise, 97 to 122):
Byte[] asciiBytes = Enumerable.Range(97, 122 + 1 - 97).Select(i => (Byte)i).ToArray();
Use an encoding object for the ASCII encoding with desirable behaviors, to wit: validate our assumption that the input character codes are in the ASCII range:
Encoding asciiEncoding = Encoding.GetEncoding(
    Encoding.ASCII.CodePage,
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);
Decode to a .NET String (UTF-16)
String text = asciiEncoding.GetString(asciiBytes);
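If all you need is to turn an individual ASCII value back into its character (the TempValuePlusDifference value in the question's code, for instance), a plain cast or Convert.ToChar is enough; a minimal sketch:
int asciiValue = 97;
char letter = (char)asciiValue;         // 'a'
char same = Convert.ToChar(asciiValue); // also 'a'
string word = new string(new[] { (char)99, (char)97, (char)116 }); // "cat"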

Unicode Hex String to String

I have a unicode string like this:
0030003100320033
Which should turn into 0123.
This is a simple case of 0123 string, but there are some string and unicode chars as well. How can I turn this type of unicode hex string to string in C#?
For normal US charset, first part is always 00, so 0031 is "1" in ASCII, 0032 is "2" and so on.
When its actual unicode char, like Arabic and Chinese, first part is not 00, for instance for Arabic its 06XX, like 0663.
I need to be able to turn this type of Hex string into C# decimal string.
There are several encodings that can represent Unicode, of which UTF-8 is today's de facto standard. However, your example is actually a string representation of UTF-16 using the big-endian byte order. You can convert your hex string back into bytes, then use Encoding.BigEndianUnicode to decode this:
public static void Main()
{
    var bytes = StringToByteArray("0030003100320033");
    var decoded = System.Text.Encoding.BigEndianUnicode.GetString(bytes);
    Console.WriteLine(decoded); // gives "0123"
}

// https://stackoverflow.com/a/311179/1149773
public static byte[] StringToByteArray(string hex)
{
    byte[] bytes = new byte[hex.Length / 2];
    for (int i = 0; i < hex.Length; i += 2)
        bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
    return bytes;
}
Since Char in .NET represents a UTF-16 code unit, this answer should give identical results to Slai's, including for surrogate pairs.
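For example (an illustrative value, not from the original question), the hex string for the surrogate pair D83D DE00 decodes to a single emoji:
var emojiBytes = StringToByteArray("D83DDE00");
var emoji = System.Text.Encoding.BigEndianUnicode.GetString(emojiBytes);
Console.WriteLine(emoji); // "😀" (U+1F600), encoded as two UTF-16 code units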
A shorter, less efficient alternative:
Regex.Replace("0030003100320033", "....", m => (char)Convert.ToInt32(m + "", 16) + "");
You should try this solution:
public static void Main()
{
    string hexString = "0030003100320033"; // Hex pairs of numeric values
    //string hexStrWithDash = "00-30-00-31-00-32-00-33"; // Hex pairs separated by dashes; this occurs when using BitConverter.ToString()
    byte[] data = ParseHex(hexString);
    string result = System.Text.Encoding.BigEndianUnicode.GetString(data);
    Console.Write("Data: {0}", result);
}

public static byte[] ParseHex(string hexString)
{
    hexString = hexString.Replace("-", "");
    byte[] output = new byte[hexString.Length / 2];
    for (int i = 0; i < output.Length; i++)
    {
        output[i] = Convert.ToByte(hexString.Substring(i * 2, 2), 16);
    }
    return output;
}

I Want to Convert Hex to ASCII

I want to convert hex to ASCII.
I have tried two different methods, but I could not get either to work.
Method1:
public void ConvertHex(String hexString)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < hexString.Length; i += 2)
    {
        String hs = hexString.Substring(i, i + 2);
        System.Convert.ToChar(System.Convert.ToUInt32(hexString.Substring(0, 2), 16)).ToString();
    }
    String ascii = sb.ToString();
    StreamWriter wrt = new StreamWriter("D:\\denemeASCII.txt");
    wrt.Write(ascii);
}
Method 2:
public string HEX2ASCII(string hex)
{
    string res = String.Empty;
    for (int a = 0; a < hex.Length; a = a + 2)
    {
        string Char2Convert = hex.Substring(a, 2);
        int n = Convert.ToInt32(Char2Convert, 16);
        char c = (char)n;
        res += c.ToString();
    }
    return res;
}
I get an error message :(
What should I do?
Your "Method1" has a few chances of being rewritten to work. (Your "Method2" is hopeless.)
So, in "Method1", you do String hs = hexString.Substring( i, i + 2 ) and then you forget that hs ever existed. (Shouldn't the compiler be giving you a warning about that?) Then you proceed to do System.Convert.ToChar( System.Convert.ToUInt32( hexString.Substring( 0, 2 ), 16 ) ) but hexString.Substring( 0, 2 ) will always pick the first two characters of the hexString, not the two characters pointed by i. What you probably meant to do is this instead: System.Convert.ToChar( System.Convert.ToUInt32( hs , 16) )
Also, you are declaring a StringBuilder sb; but you are never adding anything to it. At the same time, System.Convert.ToChar() does not work by side effect; it returns a value; if you don't do anything with the returned value, the returned value is lost forever. What you probably meant to do is add the result of System.Convert.ToChar() to your StringBuilder.
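Putting those fixes together, a corrected Method1 might look like this (a sketch that keeps the question's output path and assumes two hex digits per character):
public void ConvertHex(String hexString)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < hexString.Length; i += 2)
    {
        String hs = hexString.Substring(i, 2); // the two hex digits at position i
        sb.Append(System.Convert.ToChar(System.Convert.ToUInt32(hs, 16)));
    }
    String ascii = sb.ToString();
    using (StreamWriter wrt = new StreamWriter("D:\\denemeASCII.txt"))
    {
        wrt.Write(ascii);
    }
}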
You don't have valid characters in your input. A character in C# is a two-byte UTF-16 code unit; the encoding classes (Unicode/UTF-16, UTF-7, UTF-8) normally handle the conversion between characters and bytes. It is hard to tell from your input whether each value should become one byte or two, and whether the input is big endian or little endian. The code below converts to byte[] and Int16[].
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "0178 0000 0082 f000 0063 6500 00da 6400 00be 0000 00ff ffff ffff ffff ffd6 6600";
            ConvertHex(input);
        }

        static void ConvertHex(String hexString)
        {
            Int16[] hexArray = hexString.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(x => Int16.Parse(x, NumberStyles.HexNumber)).ToArray();
            byte[] byteArray = hexArray.Select(x => new byte[] { (byte)((x >> 8) & 0xff), (byte)(x & 0xff) }).SelectMany(x => x).ToArray();
        }
    }
}

C# - Get Integer Byte Array in String

I have a random integer value which I need to represent in String as a Byte array. For example:
int value = 32;
String strValue = getStringByteArray(value);
Console.WriteLine(strValue); // should write: " \0\0\0"
If value = 11 then getStringByteArray(value) should return "\v\0\0\0".
If value = 13 then getStringByteArray(value) should return "\r\0\0\0".
And so on.
Any idea on how to implement the method getStringByteArray(int value) in C#?
UPDATE
This is the code that receives the data from the C# NamedPipe Server:
bool CFilePipe::ReadString(int m_handle, string &value)
{
    //--- check for data
    if(WaitForRead(sizeof(int)))
    {
        ResetLastError();
        int size=FileReadInteger(m_handle);
        if(GetLastError()==0)
        {
            //--- check for data
            if(WaitForRead(size))
            {
                value=FileReadString(m_handle,size);
                return(size==StringLen(value));
            }
        }
    }
    //--- failure
    return(false);
}
Don't take this approach at all. You should be writing to a binary stream of some description - and write the binary data for the length of the packet/message, followed by the message itself. For example:
BinaryWriter writer = new BinaryWriter(stream);
byte[] data = Encoding.UTF8.GetBytes(text);
writer.Write(data.Length);
writer.Write(data);
Then at the other end, you'd use:
BinaryReader reader = new BinaryReader(stream);
int length = reader.ReadInt32();
byte[] data = reader.ReadBytes(length);
string text = Encoding.UTF8.GetString(data);
No need to treat binary data as text at all.
Well, first of all you should get the bytes from the integer. You can do that with BitConverter:
var bytes = BitConverter.GetBytes(value);
Next, here are three variants. First, if you want to get the result in binary format, just take all your bytes and write them out as they are:
var str = string.Concat(bytes.Select(b => Convert.ToString(b, 2)));
Second variant: if you want to convert your byte array to a hexadecimal string:
var hex = BitConverter.ToString(bytes).Replace("-", "");
Third variant: your representation ("\v\0\0\0") is a simple conversion of each byte to a char. Use this:
var s = bytes.Aggregate(string.Empty, (current, t) => current + Convert.ToChar(t));
This should help with that.
class Program
{
    static void Main(string[] args)
    {
        Random rand = new Random();
        int number = rand.Next(1, 1000);
        byte[] intBytes = BitConverter.GetBytes(number);
        string answer = "";
        for (int i = 0; i < intBytes.Length; i++)
        {
            answer += intBytes[i] + @"\";
        }
        Console.WriteLine(answer);
        Console.WriteLine(number);
        Console.ReadKey();
    }
}
Obviously, you should implement two steps to achieve the goal:
Extract the bytes from the integer in the appropriate order (little-endian or big-endian; it's up to you to decide), using bit arithmetic.
Merge the extracted bytes into a string using the format you need.
Possible implementation:
using System;
using System.Text;

public class Test
{
    public static void Main()
    {
        Int32 value = 5152;
        byte[] bytes = new byte[4];
        for (int i = 0; i < 4; i++)
        {
            bytes[i] = (byte)((value >> i * 8) & 0xFF);
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < 4; i++)
        {
            result.Append("\\" + bytes[i].ToString("X2"));
        }
        Console.WriteLine(result);
    }
}
Ideone snippet: http://ideone.com/wLloo1
I think you are saying that you want to convert each byte into a character literal, using escape sequences for the non-printable characters.
After converting the integer to 4 bytes, cast to char. Then use Char.IsControl() to identify the non-printing characters. Use the printable char directly, and use a lookup table to find the corresponding escape sequence for each non-printable char.
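A minimal sketch of that idea (the escape table here covers only the common C#-style escapes and is an assumption, not part of the original answer):
static readonly Dictionary<char, string> Escapes = new Dictionary<char, string>
{
    { '\0', @"\0" }, { '\a', @"\a" }, { '\b', @"\b" }, { '\t', @"\t" },
    { '\n', @"\n" }, { '\v', @"\v" }, { '\f', @"\f" }, { '\r', @"\r" }
};

static string getStringByteArray(int value)
{
    var sb = new StringBuilder();
    foreach (byte b in BitConverter.GetBytes(value)) // 4 bytes of the int, low byte first
    {
        char c = (char)b;
        if (!Char.IsControl(c))
            sb.Append(c);                        // printable: use the char directly
        else if (Escapes.TryGetValue(c, out string esc))
            sb.Append(esc);                      // known control char: use its escape sequence
        else
            sb.Append(@"\x" + b.ToString("X2")); // anything else: fall back to a hex escape
    }
    return sb.ToString();
}
// getStringByteArray(11) -> "\v\0\0\0", getStringByteArray(32) -> " \0\0\0"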

How can I convert a hex string into a string whose ASCII values have the same value in C#?

Assume that I have a string containing a hex value. For example:
string command = "0xABCD1234";
How can I convert that string into another string (for example, string codedString = ...) such that this new string's ASCII-encoded representation has the same binary content as the original string's contents?
The reason I need to do this is because I have a library from a hardware manufacturer that can transmit data from their piece of hardware to another piece of hardware over SPI. Their functions take strings as an input, but when I try to send "AA" I am expecting the SPI to transmit the binary 10101010; instead it transmits the ASCII representation of "AA", which is 0100000101000001.
Also, this hex string is going to be 32 hex characters long (that is, 256-bits long).
string command = "AA";
int num = int.Parse(command,NumberStyles.HexNumber);
string bits = Convert.ToString(num,2); // <-- 10101010
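Note that Convert.ToString(num, 2) drops leading zero bits; if you want a fixed-width bit string (four bits per hex digit), you could pad it, for example:
string paddedBits = Convert.ToString(num, 2).PadLeft(command.Length * 4, '0');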
I think I understand what you need... here is the main code part. asciiStringWithTheRightBytes is what you would send as your command.
var command = "ABCD1234";
var byteCommand = GetBytesFromHexString(command);
var asciiStringWithTheRightBytes = Encoding.ASCII.GetString(byteCommand);
And the subroutines it uses are here...
static byte[] GetBytesFromHexString(string str)
{
    // Two hex characters form one byte
    byte[] bytes = new byte[str.Length / 2];
    for (var i = 0; i < bytes.Length; i++)
        bytes[i] = (byte)((HexToInt(str[i * 2]) << 4) | HexToInt(str[i * 2 + 1]));
    return bytes;
}
static byte HexToInt(char hexChar)
{
    hexChar = char.ToUpper(hexChar); // may not be necessary
    return (byte)((int)hexChar < (int)'A' ?
        ((int)hexChar - (int)'0') :
        10 + ((int)hexChar - (int)'A'));
}
