C# Reading String inside a binary file - c#

I have some problems to understand how reading files from different format than the text format. I know that inside a given file there are some information as string. I managed to write the hex code to a text file which is helping me a lot for another function of the process, because I know that after some combinations of hex codes there might be string writed in the file.
For instance, I have this batch of hex codes.
00 39 AF 32 DD 24 BA 09 07 06 03 DB
I know that when the hex codes are equal to AF 32 the next information should be the string. For instance: "Invoice Number 223232"
Any help or reference will be appreciated.
Kind regards,
static void Main(string[] args)
{
StreamWriter writer = new StreamWriter("output.txt", true);
FileStream fs = new FileStream("File", FileMode.Open);
int hexIn;
String hex;
for (int i = 0; (hexIn = fs.ReadByte()) != -1; i++)
{
writer.Write(hexIn + " ");
hex = string.Format("{0:X2}", hexIn);
writer.Write(hex + " ");
}
}

The sample code you have looks like you are trying to read a binary file, not a hex-encoded text file.
If the source file is binary (which is ideal), you would read it byte-by-byte, and run through a state machine to know when to expect a string. You would have to know how long the string is. In the sample below I am assuming a null-terminated C-style string. For pascal style strings you would read a length prefix, or for fixed width just keep track of the expected number of character.
bool done = false;
int state = 0;
StringBuilder result = new StringBuilder();
while (!done) {
int byteValue = fs.ReadByte();
if (bytesValue == -1)
done = true;
else {
switch (state) {
case 0: //looking for 0xAF
if (byteValue == 0xAF)
state = 1;
break;
case 1: //looking for 0x32
if (byteValue == 0x32)
state = 2;
else
state = 0;
break;
case 2: //start reading string
if (byteValue == 0) {//end of C-style string
//Do something with result.ToString()
result.Clear();
state = 0; //go back to looking for more strings
} else {
result.Append((char)byteValue); //assuming 8-bit ASCII string
}
break;
}
}
}
If you are reading a hex-encoded text file, it would be more difficult, as you would have to read hex nibbles at a time and reconstruct the bytes, but the state machine approach would be similar.

Related

byte[] with ASCII values to string to int

So someone took int value, converted it to string then converted it to ASCII values and then finally to byte[] with inconsistent length 1 - 4 bytes.
e.g. 100 -> "100" -> { 49, 48, 48 }.
Now I need that int value and I did it like this:
{ 49, 48, 48 } -> '1' + '0' + '0' -> "100" -> 100
switch (header[25].Count)
{
case 1:
hex = "" + (char)header[25][0];
amountOfData = Convert.ToInt32(hex, 16);
break;
case 2:
hex = "" + (char)header[25][0] + (char)header[25][1];
amountOfData = Convert.ToInt32(hex, 16);
break;
case 3:
hex = "" + (char)header[25][0] + (char)header[25][1] + (char)header[25][2];
amountOfData = Convert.ToInt32(hex, 16);
break;
case 4:
hex = "" + (char)header[25][0] + (char)header[25][1] + (char)header[25][2] + (char)header[25][3];
amountOfData = Convert.ToInt32(hex, 16); ;
break;
default:
break;
}
but maybe there is better solution...
EDIT: sorry for not mentioning that, but header is List<List<byte>>
You can use the Encoding/GetString method to convert bytes of different encodings (e.g. ASCII in your case) to a .NET string:
var input = new byte[] { 49, 48, 48 };
var str = Encoding.ASCII.GetString(input);
var result = int.Parse(str, NumberStyles.None, CultureInfo.InvariantCulture);
You can use library functions to parse from byte-like data to primitives; you're talking about ASCII, which means that Utf8Parser will work fine for us (all ASCII is also valid UTF8, although the reverse is obviously not true); normally, we would expect that header[25] is a byte[], a segment there-of, or some other raw binary source, but: ultimately, something like:
var span = new ReadOnlySpan<byte>(header[25], 0, header[25].Count);
if (!Utf8Parser.TryParse(span, out int amountOfData, out _))
ThrowSomeError(); // not an integer
If header[25] is something less convenient (like a List<byte> - I notice that in your example, your header[25] has a .Count not a .Length, which suggests it isn't a byte[]), then you can always either stackalloc a local buffer and copy the data out, or you can peek inside the list with CollectionMarshal.AsSpan<T>(List<T>), which returns a Span<T> from the underlying data:
var span = CollectionMarshal.AsSpan(header[25]);
if (!Utf8Parser.TryParse(span, out int amountOfData, out _))
ThrowSomeError(); // not an integer
As a runnable example that just shows the API:
using System;
using System.Buffers.Text;
Span<byte> span = stackalloc byte[] { 49, 48, 48 };
if (!Utf8Parser.TryParse(span, out int amountOfData, out _))
throw new FormatException();
Console.WriteLine(amountOfData); // 100

C# - Get Integer Byte Array in String

I have a random integer value which I need to represent in String as a Byte array. For example:
int value = 32;
String strValue = getStringByteArray(value);
Console.WriteLine(strValue); // should write: " \0\0\0"
If value = 11 then getStringByteArray(value) shuld return "\v\0\0\0".
If value = 13 then getStringByteArray(value) shuld return "\r\0\0\0".
And so on.
Any idea on how to implement the method getStringByteArray(int value) in C#?
UPDATE
This is the code that receives the data from the C# NamedPipe Server:
bool CFilePipe::ReadString(int m_handle, string &value)
{
//--- check for data
if(WaitForRead(sizeof(int)))
{
ResetLastError();
int size=FileReadInteger(m_handle);
if(GetLastError()==0)
{
//--- check for data
if(WaitForRead(size))
{
value=FileReadString(m_handle,size);
return(size==StringLen(value));
}
}
}
//--- failure
return(false);
}
Don't take this approach at all. You should be writing to a binary stream of some description - and write the binary data for the length of the packet/message, followed by the message itself. For example:
BinaryWriter writer = new BinaryWriter(stream);
byte[] data = Encoding.UTF8.GetBytes(text);
writer.Write(data.Length);
writer.Write(data);
Then at the other end, you'd use:
BinaryReader reader = new BinaryReader(stream);
int length = reader.ReadInt32();
byte[] data = reader.ReadBytes(length);
string text = Encoding.UTF8.GetString(data);
No need to treat binary data as text at all.
Well. First of all you should get bytes from integer. You can do it with BitConverter:
var bytes = BitConverter.GetBytes(value);
Next, here is three variants. First - if you want to get result in binary format. Just take all your bytes and write as it is:
var str = string.Concat(bytes.Select(b => Convert.ToString(b, 2)));
Second variant. If you want convert your byte array to hexadecimal string:
var hex = BitConverter.ToString(array).Replace("-","");
Third variant. Your representation ("\v\0\0\0") - it is simple converting byte to char. Use this:
var s = bytes.Aggregate(string.Empty, (current, t) => current + Convert.ToChar(t));
This should help with that.
class Program
{
static void Main(string[] args)
{
Random rand = new Random();
int number = rand.Next(1, 1000);
byte[] intBytes = BitConverter.GetBytes(number);
string answer = "";
for (int i = 0; i < intBytes.Length; i++)
{
answer += intBytes[i] + #"\";
}
Console.WriteLine(answer);
Console.WriteLine(number);
Console.ReadKey();
}
}
Obviously, you should implement two steps to achieve the goal:
Extract bytes from the integer in the appropriate order (little-endian or big-endian, it's up to you to decide), using bit arithmetics.
Merge extracted bytes into string using the format you need.
Possible implementation:
using System;
using System.Text;
public class Test
{
public static void Main()
{
Int32 value = 5152;
byte[] bytes = new byte[4];
for (int i = 0; i < 4; i++)
{
bytes[i] = (byte)((value >> i * 8) & 0xFF);
}
StringBuilder result = new StringBuilder();
for (int i = 0; i < 4; i++)
{
result.Append("\\" + bytes[i].ToString("X2"));
}
Console.WriteLine(result);
}
}
Ideone snippet: http://ideone.com/wLloo1
I think you are saying that you want to convert each byte into a character literal, using escape sequences for the non printable characters.
After converting the integer to 4 bytes, cast to char. Then use Char.IsControl() to identify the non-printing characters. Use the printable char directly, and use a lookup table to find the corresponding escape sequence for each non-printable char.

Expressing byte values > 127 in .Net Strings

I'm writing some binary protocol messages in .Net using strings, and it mostly works, except for one particular case.
The message I'm trying to send is:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(Encoding.ASCII.GetBytes(cmdPacket));
(to help decode, those bytes are 253, 11, 22, then the ASCII chars: "MBEPEXE1.").
Except when I do the Encoding.ASCII.GetBytes, the 0xFD comes out as byte 0x3F
(value 253 changed to 63).
(I should point out that the \x0B and \x16 are interpreted correctly as Hex 0B & Hex 16)
I've also tried Encoding.UTF8 and Encoding.UTF7, to no avail.
I feel there is probably a good simple way to express values above 128 in Strings, and convert them to bytes, but I'm missing it.
Any guidance?
Ignoring if it's good or bad what you are doing, the encoding ISO-8859-1 maps all its characters to the characters with the same code in Unicode.
// Bytes with all the possible values 0-255
var bytes = Enumerable.Range(0, 256).Select(p => (byte)p).ToArray();
// String containing the values
var all1bytechars = new string(bytes.Select(p => (char)p).ToArray());
// Sanity check
Debug.Assert(all1bytechars.Length == 256);
// The encoder, you could make it static readonly
var enc = Encoding.GetEncoding("ISO-8859-1"); // It is the codepage 28591
// string-to-bytes
var bytes2 = enc.GetBytes(all1bytechars);
// bytes-to-string
var all1bytechars2 = enc.GetString(bytes);
// check string-to-bytes
Debug.Assert(bytes.SequenceEqual(bytes2));
// check bytes-to-string
Debug.Assert(all1bytechars.SequenceEqual(all1bytechars2));
From the wiki:
ISO-8859-1 was incorporated as the first 256 code points of ISO/IEC 10646 and Unicode.
Or a simple and fast method to convert a string to a byte[] (with unchecked and checked variant)
public static byte[] StringToBytes(string str)
{
var bytes = new byte[str.Length];
for (int i = 0; i < str.Length; i++)
{
bytes[i] = checked((byte)str[i]); // Slower but throws OverflowException if there is an invalid character
//bytes[i] = unchecked((byte)str[i]); // Faster
}
return bytes;
}
ASCII is a 7-bit code. The high-order bit used to be used as a parity bit, so "ASCII" could have even, odd or no parity. You may notice that 0x3F (decimal 63) is the ASCII character ?. That is what non-ASCII octets (those greater than 0x7F/decimal 127) are converted to by the CLR's ASCII encoding. The reason is that there is no standard ASCII character representation of the code points in the range 0x80–0xFF.
C# strings are UTF-16 encoded Unicode internally. If what you care about are the byte values of the strings, and you know that the strings are, in fact, characters whose Unicode code points are in the range U+0000 through U+00FF, then its easy. Unicode's first 256 codepoints (0x00–0xFF), the Unicode blocks C0 Controls and Basic Latin (\x00-\x7F) and C1 Controls and Latin Supplement (\x80-\xFF) are the "normal" ISO-8859-1 characters. A simple incantation like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
byte[] buffer = cmdPacket.Select(c=>(byte)c).ToArray() ;
myDevice.Write(buffer);
will get you the byte[] you want, in this case
// \xFD \x0B \x16 M B E P E X E 1 .
[ 0xFD , 0x0B , 0x16 , 0x4d , 0x42 , 0x45, 0x50 , 0x45 , 0x58 , 0x45 , 0x31 , 0x2E ]
With LINQ, you could do something like this:
String cmdPacket = "\xFD\x0B\x16MBEPEXE1.";
myDevice.Write(cmdPacket.Select(Convert.ToByte).ToArray());
Edit: Added an explanation
First, you recognize that your string is really just an array of characters. What you want is an "equivalent" array of bytes, where each byte corresponds to a character.
To get the array, you have to "map" each character of the original array as a byte in the new array. To do that, you can use the built-in System.Convert.ToByte(char) method.
Once you've described your mapping from characters to bytes, it's as simple as projecting the input string, through the mapping, into an array.
Hope that helps!
I use Windows-1252 as it seems to give the most bang for the byte
And is compatible with all .NET string values
You will probably want to comment out the ToLower
This was built for compatibility with SQL char (single byte)
namespace String1byte
{
/// <summary>
/// Interaction logic for MainWindow.xaml
/// </summary>
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
String8bit s1 = new String8bit("cat");
String8bit s2 = new String8bit("cat");
String8bit s3 = new String8bit("\xFD\x0B\x16MBEPEXE1.");
HashSet<String8bit> hs = new HashSet<String8bit>();
hs.Add(s1);
hs.Add(s2);
hs.Add(s3);
System.Diagnostics.Debug.WriteLine(hs.Count.ToString());
System.Diagnostics.Debug.WriteLine(s1.Value + " " + s1.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s2.Value + " " + s2.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s3.Value + " " + s3.GetHashCode().ToString());
System.Diagnostics.Debug.WriteLine(s1.Equals(s2).ToString());
System.Diagnostics.Debug.WriteLine(s1.Equals(s3).ToString());
System.Diagnostics.Debug.WriteLine(s1.MatchStart("ca").ToString());
System.Diagnostics.Debug.WriteLine(s3.MatchStart("ca").ToString());
}
}
public struct String8bit
{
private static Encoding EncodingUnicode = Encoding.Unicode;
private static Encoding EncodingWin1252 = System.Text.Encoding.GetEncoding("Windows-1252");
private byte[] bytes;
public override bool Equals(Object obj)
{
// Check for null values and compare run-time types.
if (obj == null) return false;
if (!(obj is String8bit)) return false;
String8bit comp = (String8bit)obj;
if (comp.Bytes.Length != this.Bytes.Length) return false;
for (Int32 i = 0; i < comp.Bytes.Length; i++)
{
if (comp.Bytes[i] != this.Bytes[i])
return false;
}
return true;
}
public override int GetHashCode()
{
UInt32 hash = (UInt32)(Bytes[0]);
for (Int32 i = 1; i < Bytes.Length; i++) hash = hash ^ (UInt32)(Bytes[0] << (i%4)*8);
return (Int32)hash;
}
public bool MatchStart(string start)
{
if (string.IsNullOrEmpty(start)) return false;
if (start.Length > this.Length) return false;
start = start.ToLowerInvariant(); // SQL is case insensitive
// Convert the string into a byte array
byte[] unicodeBytes = EncodingUnicode.GetBytes(start);
// Perform the conversion from one encoding to the other
byte[] win1252Bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
for (Int32 i = 0; i < win1252Bytes.Length; i++) if (Bytes[i] != win1252Bytes[i]) return false;
return true;
}
public byte[] Bytes { get { return bytes; } }
public String Value { get { return EncodingWin1252.GetString(Bytes); } }
public Int32 Length { get { return Bytes.Count(); } }
public String8bit(string word)
{
word = word.ToLowerInvariant(); // SQL is case insensitive
// Convert the string into a byte array
byte[] unicodeBytes = EncodingUnicode.GetBytes(word);
// Perform the conversion from one encoding to the other
bytes = Encoding.Convert(EncodingUnicode, EncodingWin1252, unicodeBytes);
}
public String8bit(Byte[] win1252bytes)
{ // if reading from SQL char then read as System.Data.SqlTypes.SqlBytes
bytes = win1252bytes;
}
}
}

How can convert a hex string into a string whose ASCII values have the same value in C#?

Assume that I have a string containing a hex value. For example:
string command "0xABCD1234";
How can I convert that string into another string (for example, string codedString = ...) such that this new string's ASCII-encoded representation has the same binary as the original strings contents?
The reason I need to do this is because I have a library from a hardware manufacturer that can transmit data from their piece of hardware to another piece of hardware over SPI. Their functions take strings as an input, but when I try to send "AA" I am expecting the SPI to transmit the binary 10101010, but instead it transmits the ascii representation of AA which is 0110000101100001.
Also, this hex string is going to be 32 hex characters long (that is, 256-bits long).
string command = "AA";
int num = int.Parse(command,NumberStyles.HexNumber);
string bits = Convert.ToString(num,2); // <-- 10101010
I think I understand what you need... here is the main code part.. asciiStringWithTheRightBytes is what you would send to your command.
var command = "ABCD1234";
var byteCommand = GetBytesFromHexString(command);
var asciiStringWithTheRightBytes = Encoding.ASCII.GetString(byteCommand);
And the subroutines it uses are here...
static byte[] GetBytesFromHexString(string str)
{
byte[] bytes = new byte[str.Length * sizeof(byte)];
for (var i = 0; i < str.Length; i++)
bytes[i] = HexToInt(str[i]);
return bytes;
}
static byte HexToInt(char hexChar)
{
hexChar = char.ToUpper(hexChar); // may not be necessary
return (byte)((int)hexChar < (int)'A' ?
((int)hexChar - (int)'0') :
10 + ((int)hexChar - (int)'A'));
}

save string array in binary format

string value1 , value1 ;
int length1 , length2 ;
System.Collections.BitArray bitValue1 = new System.Collections.BitArray(Length1);
System.Collections.BitArray bitValue2 = new System.Collections.BitArray(Length2);
I'm looking for the fastest way to covert each string to BitArray with defined length for each string (the string should be trimmed if it is larger than defined length and if strings size is smaller remaining bits will be filled with false) and then put this two strings together and write it in a binary file .
Edit :
#dtb : a simple example can be like this value1 = "A" ,value2 = "B" and length1 =8 and length2 = 16 and the result will be 010000010000000001000010
the first 8 bits are from "A" and next 16 bits from "B"
//Source string
string value1 = "t";
//Length in bits
int length1 = 2;
//Convert the text to an array of ASCII bytes
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(value1);
//Create a temp BitArray from the bytes
System.Collections.BitArray tempBits = new System.Collections.BitArray(bytes);
//Create the output BitArray setting the maximum length
System.Collections.BitArray bitValue1 = new System.Collections.BitArray(length1);
//Loop through the temp array
for(int i=0;i<tempBits.Length;i++)
{
//If we're outside of the range of the output array exit
if (i >= length1) break;
//Otherwise copy the value from the temp to the output
bitValue1.Set(i, tempBits.Get(i));
}
And I'm going to keep saying it, this assumes ASCII characters so anything above ASCII 127 (such as the é in résumé) will freak out and probably return ASCII 63 which is the question mark.
When converting a string to something else you need to consider what encoding you want to use. Here's a version that uses UTF-8
bitValue1 = System.Text.Encoding.UTF8.GetBytes(value1, 0, length1);
Edit
Hmm... saw that you're looking for a BitArray and not a ByteArray, this won't help you probably.
Since this is not a very clear question I'll give this a shot nonetheless,
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
public static void RunSnippet()
{
string s = "123";
byte[] b = System.Text.ASCIIEncoding.ASCII.GetBytes(s);
System.Collections.BitArray bArr = new System.Collections.BitArray(b);
Console.WriteLine("bArr.Count = {0}", bArr.Count);
for(int i = 0; i < bArr.Count; i++)
Console.WriteLin(string.Format("{0}", bArr.Get(i).ToString()));
BinaryFormatter bf = new BinaryFormatter();
using (FileStream fStream = new FileStream("test.bin", System.IO.FileMode.CreateNew)){
bf.Serialize(fStream, (System.Collections.BitArray)bArr);
Console.WriteLine("Serialized to test.bin");
}
Console.ReadLine();
}
Is that what you are trying to achieve?
Hope this helps,
Best regards,
Tom.

Categories