Keep the last eight bytes when reading raw image file with C# - c#

Im trying to read a raw image file with C# and keep the last 2 bytes, and last 8 bytes in a variable. However there is something wrong in my if-statement so the variables just appends.. Like this:
twoBytes= eightBytes=
twoBytes=00 eightBytes=00
twoBytes=0000 eightBytes=0000
twoBytes=000000 eightBytes=000000
twoBytes=00000000 eightBytes=00000000
twoBytes=0000000000 eightBytes=0000000000
twoBytes=000000000000 eightBytes=000000000000
twoBytes=00000000000000 eightBytes=00000000000000
twoBytes=0000000000000000 eightBytes=0000000000000000
twoBytes=000000000000000000 eightBytes=000000000000000000
twoBytes=00000000000000000000 eightBytes=00000000000000000000
twoBytes=0000000000000000000000 eightBytes=0000000000000000000000
twoBytes=000000000000000000000000 eightBytes=000000000000000000000000
twoBytes=00000000000000000000000000 eightBytes=00000000000000000000000000
twoBytes=0000000000000000000000000000 eightBytes=0000000000000000000000000000
twoBytes=000000000000000000000000000000 eightBytes=000000000000000000000000000000
twoBytes=00000000000000000000000000000000 eightBytes=00000000000000000000000000000000
twoBytes=0000000000000000000000000000000000 eightBytes=0000000000000000000000000000000000
twoBytes=000000000000000000000000000000000000 eightBytes=000000000000000000000000000000000000
twoBytes=00000000000000000000000000000000000000 eightBytes=00000000000000000000000000000000000000
I want something like "twoBytes=55AA", and eightBytes="55AA454649205041"
My code:
// Read file, byte at the time (example 00, 5A)
FileStream fs = new FileStream("C:\\Users\\user\\image_files\\usb_guid_exfat.001", FileMode.Open);
int hexIn;
String hex;
String twoBytes = "";
String eightBytes = "";
for (int i = 0; (hexIn = fs.ReadByte()) != -1; i++)
{
hex = string.Format("{0:X2}", hexIn);
Console.WriteLine("twoBytes=" + twoBytes + " eightBytes=" + eightBytes);
// Transfer two bytes
twoBytes = twoBytes + hex;
if (twoBytes.Length < 4)
{
if (twoBytes.Length > 6) {
twoBytes = twoBytes.Substring(2, 4);
}
}
// Transfer eight bytes
eightBytes = eightBytes + hex;
if(eightBytes.Length < 8)
{
if (twoBytes.Length > 10) {
eightBytes = eightBytes.Substring(2, 8);
}
}
}

Your if statements are wrong. A value can't be less than 4 and greater than 6 at the same time.
If length is <=4, you have 1 or 2 bytes, so you need to inspect only if length is grater than 4 (6,8,etc). Otherwise the value stays the same.
The code for inspecting string bigger than 4:
twoBytes = twoBytes + hex;
if (twoBytes.Length > 4) {
twoBytes = twoBytes.Substring(twoBytes.Length-4, 4);
}
The similar with eightBytes.
Good luck!! :)

Related

C# How to fix loss of data during file to binary to string to binary conversion

I read a file as binary, convert to hex string, convert back to binary, and write to a new file.
I expect a duplicate, but get a corrupted file.
I have been trying different ways to convert the binary into the hex string but can't seem to retain the entire file.
byte[] binary1 = File.ReadAllBytes(#"....Input.jpg");
string hexString = "";
int counter1 = 0;
foreach (byte b in binary1)
{
counter1++;
hexString += (Convert.ToString(b, 16));
}
List<byte> bytelist = new List<byte>();
int counter2 = 0;
for (int i = 0; i < hexString.Length/2; i++)
{
counter2++;
string ch = hexString.Substring(i*2,2);
bytelist.Add(Convert.ToByte(ch, 16));
}
byte[] binary2 = bytelist.ToArray();
File.WriteAllBytes(#"....Output.jpg", binary2);
Counter 1 and counter 2 should be the same count, but counter 2 is always about 10% smaller.
I want to get a duplicate file output that I could have transferred around via that string value.
Converting byte 10 will give a single char, and not 2 characters. Your convert back method (logically) build on 2 chars per byte.
this case works
byte[] binary1 = new byte[] { 100 }; // convert will result in "64"
and this case fails
byte[] binary1 = new byte[] { 10 }; // convert will result in "a"
I quick fixed your code, by padding with a "0" in case of a single char.
so working code:
byte[] binary1 = new byte[] { 100 };
string hexString = "";
int counter1 = 0;
foreach (byte b in binary1)
{
counter1++;
var s = (Convert.ToString(b, 16));
// new
if (s.Length < 2)
{
hexString += "0";
}
// end new
hexString += s;
}
List<byte> bytelist = new List<byte>();
int counter2 = 0;
for (int i = 0; i < hexString.Length / 2; i++)
{
counter2++;
string ch = hexString.Substring(i * 2, 2);
var item = Convert.ToByte(ch, 16);
bytelist.Add(item);
}
byte[] binary2 = bytelist.ToArray();
Please note, your code could use some refactoring, e.g. don't string concat in a loop and maybe check the Single Responsibility Principle.
Update, got it fixed, but there are better solutions here: How do you convert a byte array to a hexadecimal string, and vice versa?

C# - Convert Bytes Representing Char Numbers to Int16

If I have a byte array representing a number read from a file, how can the byte array be converted to an Int16/short?
byte[] bytes = new byte[]{45,49,54,50 } //Byte array representing "-162" from text file
short value = 0; //How to convert to -162 as a short here?
Tried using BitConverter.ToInt16(bytes, 0), but the value is not correct.
Edit: Looking for a solution that does NOT use string conversion.
This function performs some validations that you may be able to exclude. You could simplify it if you know your input array will always contain at least one element and that the value will be a valid Int16.
const byte Negative = (byte)'-';
const byte Zero = (byte)'0';
static Int16 BytesToInt16(byte[] bytes)
{
if (null == bytes || bytes.Length == 0)
return 0;
int result = 0;
bool isNegative = bytes[0] == Negative;
int index = isNegative ? 1 : 0;
for (; index < bytes.Length; index++)
{
result = 10 * result + (bytes[index] - Zero);
}
if (isNegative)
result *= -1;
if (result < Int16.MinValue)
return Int16.MinValue;
if (result > Int16.MaxValue)
return Int16.MaxValue;
return (Int16)result;
}
Like willaien said, you want to first convert your bytes to a string.
byte[] bytes = new byte[]{ 45,49,54,50 };
string numberString = Encoding.UTF8.GetString(bytes);
short value = Int16.Parse(numberString);
If you're not sure that your string can be parsed, I recommend using Int16.TryParse:
byte[] bytes = new byte[]{ 45,49,54,50 };
string numberString = Encoding.UTF8.GetString(bytes);
short value;
if (!Int16.TryParse(numberString, out value))
{
// Parsing failed
}
else
{
// Parsing worked, `value` now contains your value.
}

Bitwise operation performance, how to improve

I have a simple task: determine how many bytes is necessary to encode some number (byte array length) to byte array and encode final value (implement this article: Encoded Length and Value Bytes).
Originally I wrote a quick method that accomplish the task:
public static Byte[] Encode(Byte[] rawData, Byte enclosingtag) {
if (rawData == null) {
return new Byte[] { enclosingtag, 0 };
}
List<Byte> computedRawData = new List<Byte> { enclosingtag };
// if array size is less than 128, encode length directly. No questions here
if (rawData.Length < 128) {
computedRawData.Add((Byte)rawData.Length);
} else {
// convert array size to a hex string
String hexLength = rawData.Length.ToString("x2");
// if hex string has odd length, align it to even by prepending hex string
// with '0' character
if (hexLength.Length % 2 == 1) { hexLength = "0" + hexLength; }
// take a pair of hex characters and convert each octet to a byte
Byte[] lengthBytes = Enumerable.Range(0, hexLength.Length)
.Where(x => x % 2 == 0)
.Select(x => Convert.ToByte(hexLength.Substring(x, 2), 16))
.ToArray();
// insert padding byte, set bit 7 to 1 and add byte count required
// to encode length bytes
Byte paddingByte = (Byte)(128 + lengthBytes.Length);
computedRawData.Add(paddingByte);
computedRawData.AddRange(lengthBytes);
}
computedRawData.AddRange(rawData);
return computedRawData.ToArray();
}
This is an old code and was written in an awful way.
Now I'm trying to optimize the code by using either, bitwise operators, or BitConverter class. Here is an example of bitwise-edition:
public static Byte[] Encode2(Byte[] rawData, Byte enclosingtag) {
if (rawData == null) {
return new Byte[] { enclosingtag, 0 };
}
List<Byte> computedRawData = new List<Byte>(rawData);
if (rawData.Length < 128) {
computedRawData.Insert(0, (Byte)rawData.Length);
} else {
// temp number
Int32 num = rawData.Length;
// track byte count, this will be necessary further
Int32 counter = 1;
// simply make bitwise AND to extract byte value
// and shift right while remaining value is still more than 255
// (there are more than 8 bits)
while (num >= 256) {
counter++;
computedRawData.Insert(0, (Byte)(num & 255));
num = num >> 8;
}
// compose final array
computedRawData.InsertRange(0, new[] { (Byte)(128 + counter), (Byte)num });
}
computedRawData.Insert(0, enclosingtag);
return computedRawData.ToArray();
}
and the final implementation with BitConverter class:
public static Byte[] Encode3(Byte[] rawData, Byte enclosingtag) {
if (rawData == null) {
return new Byte[] { enclosingtag, 0 };
}
List<Byte> computedRawData = new List<Byte>(rawData);
if (rawData.Length < 128) {
computedRawData.Insert(0, (Byte)rawData.Length);
} else {
// convert integer to a byte array
Byte[] bytes = BitConverter.GetBytes(rawData.Length);
// start from the end of a byte array to skip unnecessary zero bytes
for (int i = bytes.Length - 1; i >= 0; i--) {
// once the byte value is non-zero, take everything starting
// from the current position up to array start.
if (bytes[i] > 0) {
// we need to reverse the array to get the proper byte order
computedRawData.InsertRange(0, bytes.Take(i + 1).Reverse());
// compose final array
computedRawData.Insert(0, (Byte)(128 + i + 1));
computedRawData.Insert(0, enclosingtag);
return computedRawData.ToArray();
}
}
}
return null;
}
All methods do their work as expected. I used an example from Stopwatch class page to measure performance. And performance tests surprised me. My test method performed 1000 runs of the method to encode a byte array (actually, only array sixe) with 100 000 elements and average times are:
Encode -- around 200ms
Encode2 -- around 270ms
Encode3 -- around 320ms
I personally like method Encode2, because the code looks more readable, but its performance isn't that good.
The question: what you woul suggest to improve Encode2 method performance or to improve Encode readability?
Any help will be appreciated.
===========================
Update: Thanks to all who participated in this thread. I took into consideration all suggestions and ended up with this solution:
public static Byte[] Encode6(Byte[] rawData, Byte enclosingtag) {
if (rawData == null) {
return new Byte[] { enclosingtag, 0 };
}
Byte[] retValue;
if (rawData.Length < 128) {
retValue = new Byte[rawData.Length + 2];
retValue[0] = enclosingtag;
retValue[1] = (Byte)rawData.Length;
} else {
Byte[] lenBytes = new Byte[3];
Int32 num = rawData.Length;
Int32 counter = 0;
while (num >= 256) {
lenBytes[counter] = (Byte)(num & 255);
num >>= 8;
counter++;
}
// 3 is: len byte and enclosing tag
retValue = new byte[rawData.Length + 3 + counter];
rawData.CopyTo(retValue, 3 + counter);
retValue[0] = enclosingtag;
retValue[1] = (Byte)(129 + counter);
retValue[2] = (Byte)num;
Int32 n = 3;
for (Int32 i = counter - 1; i >= 0; i--) {
retValue[n] = lenBytes[i];
n++;
}
}
return retValue;
}
Eventually I moved away from lists to fixed-sized byte arrays. Avg time against the same data set is now about 65ms. It appears that lists (not bitwise operations) gives me a significant penalty in performance.
The main problem here is almost certainly the allocation of the List, and the allocation needed when you are inserting new elements, and when the list is converted to an array in the end. This code probably spend most of its time in the garbage collector and memory allocator. The use vs non-use of bitwise operators probably means very little in comparison, and I would have looked into ways to reduce the amount of memory you allocate first.
One way is to send in a reference to a byte array allocated in advance and and an index to where you are in the array instead of allocating and returning the data, and then return an integer telling how many bytes you have written. Working on large arrays is usually more efficient than working on many small objects. As others have mentioned, use a profiler, and see where your code spend its time.
Of cause the optimization I mentioned will makes your code more low level in nature, and more close to what you would typically do in C, but there is often a trade off between readability and performance.
Using "reverse, append, reverse" instead of "insert at front", and preallocating everything, it might be something like this: (not tested)
public static byte[] Encode4(byte[] rawData, byte enclosingtag) {
if (rawData == null) {
return new byte[] { enclosingtag, 0 };
}
List<byte> computedRawData = new List<byte>(rawData.Length + 6);
computedRawData.AddRange(rawData);
if (rawData.Length < 128) {
computedRawData.InsertRange(0, new byte[] { enclosingtag, (byte)rawData.Length });
} else {
computedRawData.Reverse();
// temp number
int num = rawData.Length;
// track byte count, this will be necessary further
int counter = 1;
// simply cast to byte to extract byte value
// and shift right while remaining value is still more than 255
// (there are more than 8 bits)
while (num >= 256) {
counter++;
computedRawData.Add((byte)num);
num >>= 8;
}
// compose final array
computedRawData.Add((byte)num);
computedRawData.Add((byte)(counter + 128));
computedRawData.Add(enclosingtag);
computedRawData.Reverse();
}
return computedRawData.ToArray();
}
I don't know for sure whether it's going to be faster, but it would make sense - now the expensive "insert at front" operation is mostly avoided, except in the case where there would be only one of them (probably not enough to balance with the two reverses).
An other idea is to limit the insert at front to only one time in an other way: collect all the things that have to be inserted there and then do it once. Could look something like this: (not tested)
public static byte[] Encode5(byte[] rawData, byte enclosingtag) {
if (rawData == null) {
return new byte[] { enclosingtag, 0 };
}
List<byte> computedRawData = new List<byte>(rawData);
if (rawData.Length < 128) {
computedRawData.InsertRange(0, new byte[] { enclosingtag, (byte)rawData.Length });
} else {
// list of all things that will be inserted
List<byte> front = new List<byte>(8);
// temp number
int num = rawData.Length;
// track byte count, this will be necessary further
int counter = 1;
// simply cast to byte to extract byte value
// and shift right while remaining value is still more than 255
// (there are more than 8 bits)
while (num >= 256) {
counter++;
front.Insert(0, (byte)num); // inserting in tiny list, not so bad
num >>= 8;
}
// compose final array
front.InsertRange(0, new[] { (byte)(128 + counter), (byte)num });
front.Insert(0, enclosingtag);
computedRawData.InsertRange(0, front);
}
return computedRawData.ToArray();
}
If it's not good enough or didn't help (or if this is worse - hey, could be), I'll try to come up with more ideas.

How to convert a byte array (MD5 hash) into a string (36 chars)?

I've got a byte array that was created using a hash function. I would like to convert this array into a string. So far so good, it will give me hexadecimal string.
Now I would like to use something different than hexadecimal characters, I would like to encode the byte array with these 36 characters: [a-z][0-9].
How would I go about?
Edit: the reason I would to do this, is because I would like to have a smaller string, than a hexadecimal string.
I adapted my arbitrary-length base conversion function from this answer to C#:
static string BaseConvert(string number, int fromBase, int toBase)
{
var digits = "0123456789abcdefghijklmnopqrstuvwxyz";
var length = number.Length;
var result = string.Empty;
var nibbles = number.Select(c => digits.IndexOf(c)).ToList();
int newlen;
do {
var value = 0;
newlen = 0;
for (var i = 0; i < length; ++i) {
value = value * fromBase + nibbles[i];
if (value >= toBase) {
if (newlen == nibbles.Count) {
nibbles.Add(0);
}
nibbles[newlen++] = value / toBase;
value %= toBase;
}
else if (newlen > 0) {
if (newlen == nibbles.Count) {
nibbles.Add(0);
}
nibbles[newlen++] = 0;
}
}
length = newlen;
result = digits[value] + result; //
}
while (newlen != 0);
return result;
}
As it's coming from PHP it might not be too idiomatic C#, there are also no parameter validity checks. However, you can feed it a hex-encoded string and it will work just fine with
var result = BaseConvert(hexEncoded, 16, 36);
It's not exactly what you asked for, but encoding the byte[] into hex is trivial.
See it in action.
Earlier tonight I came across a codereview question revolving around the same algorithm being discussed here. See: https://codereview.stackexchange.com/questions/14084/base-36-encoding-of-a-byte-array/
I provided a improved implementation of one of its earlier answers (both use BigInteger). See: https://codereview.stackexchange.com/a/20014/20654. The solution takes a byte[] and returns a Base36 string. Both the original and mine include simple benchmark information.
For completeness, the following is the method to decode a byte[] from an string. I'll include the encode function from the link above as well. See the text after this code block for some simple benchmark info for decoding.
const int kByteBitCount= 8; // number of bits in a byte
// constants that we use in FromBase36String and ToBase36String
const string kBase36Digits= "0123456789abcdefghijklmnopqrstuvwxyz";
static readonly double kBase36CharsLengthDivisor= Math.Log(kBase36Digits.Length, 2);
static readonly BigInteger kBigInt36= new BigInteger(36);
// assumes the input 'chars' is in big-endian ordering, MSB->LSB
static byte[] FromBase36String(string chars)
{
var bi= new BigInteger();
for (int x= 0; x < chars.Length; x++)
{
int i= kBase36Digits.IndexOf(chars[x]);
if (i < 0) return null; // invalid character
bi *= kBigInt36;
bi += i;
}
return bi.ToByteArray();
}
// characters returned are in big-endian ordering, MSB->LSB
static string ToBase36String(byte[] bytes)
{
// Estimate the result's length so we don't waste time realloc'ing
int result_length= (int)
Math.Ceiling(bytes.Length * kByteBitCount / kBase36CharsLengthDivisor);
// We use a List so we don't have to CopyTo a StringBuilder's characters
// to a char[], only to then Array.Reverse it later
var result= new System.Collections.Generic.List<char>(result_length);
var dividend= new BigInteger(bytes);
// IsZero's computation is less complex than evaluating "dividend > 0"
// which invokes BigInteger.CompareTo(BigInteger)
while (!dividend.IsZero)
{
BigInteger remainder;
dividend= BigInteger.DivRem(dividend, kBigInt36, out remainder);
int digit_index= Math.Abs((int)remainder);
result.Add(kBase36Digits[digit_index]);
}
// orientate the characters in big-endian ordering
result.Reverse();
// ToArray will also trim the excess chars used in length prediction
return new string(result.ToArray());
}
"A test 1234. Made slightly larger!" encodes to Base64 as "165kkoorqxin775ct82ist5ysteekll7kaqlcnnu6mfe7ag7e63b5"
To decode that Base36 string 1,000,000 times takes 12.6558909 seconds on my machine (I used the same build and machine conditions as provided in my answer on codereview)
You mentioned that you were dealing with a byte[] for the MD5 hash, rather than a hexadecimal string representation of it, so I think this solution provide the least overhead for you.
If you want a shorter string and can accept [a-zA-Z0-9] and + and / then look at Convert.ToBase64String
Using BigInteger (needs the System.Numerics reference)
Using BigInteger (needs the System.Numerics reference)
const string chars = "0123456789abcdefghijklmnopqrstuvwxyz";
// The result is padded with chars[0] to make the string length
// (int)Math.Ceiling(bytes.Length * 8 / Math.Log(chars.Length, 2))
// (so that for any value [0...0]-[255...255] of bytes the resulting
// string will have same length)
public static string ToBaseN(byte[] bytes, string chars, bool littleEndian = true, int len = -1)
{
if (bytes.Length == 0 || len == 0)
{
return String.Empty;
}
// BigInteger saves in the last byte the sign. > 7F negative,
// <= 7F positive.
// If we have a "negative" number, we will prepend a 0 byte.
byte[] bytes2;
if (littleEndian)
{
if (bytes[bytes.Length - 1] <= 0x7F)
{
bytes2 = bytes;
}
else
{
// Note that Array.Resize doesn't modify the original array,
// but creates a copy and sets the passed reference to the
// new array
bytes2 = bytes;
Array.Resize(ref bytes2, bytes.Length + 1);
}
}
else
{
bytes2 = new byte[bytes[0] > 0x7F ? bytes.Length + 1 : bytes.Length];
// We copy and reverse the array
for (int i = bytes.Length - 1, j = 0; i >= 0; i--, j++)
{
bytes2[j] = bytes[i];
}
}
BigInteger bi = new BigInteger(bytes2);
// A little optimization. We will do many divisions based on
// chars.Length .
BigInteger length = chars.Length;
// We pre-calc the length of the string. We know the bits of
// "information" of a byte are 8. Using Log2 we calc the bits of
// information of our new base.
if (len == -1)
{
len = (int)Math.Ceiling(bytes.Length * 8 / Math.Log(chars.Length, 2));
}
// We will build our string on a char[]
var chs = new char[len];
int chsIndex = 0;
while (bi > 0)
{
BigInteger remainder;
bi = BigInteger.DivRem(bi, length, out remainder);
chs[littleEndian ? chsIndex : len - chsIndex - 1] = chars[(int)remainder];
chsIndex++;
if (chsIndex < 0)
{
if (bi > 0)
{
throw new OverflowException();
}
}
}
// We append the zeros that we skipped at the beginning
if (littleEndian)
{
while (chsIndex < len)
{
chs[chsIndex] = chars[0];
chsIndex++;
}
}
else
{
while (chsIndex < len)
{
chs[len - chsIndex - 1] = chars[0];
chsIndex++;
}
}
return new string(chs);
}
public static byte[] FromBaseN(string str, string chars, bool littleEndian = true, int len = -1)
{
if (str.Length == 0 || len == 0)
{
return new byte[0];
}
// This should be the maximum length of the byte[] array. It's
// the opposite of the one used in ToBaseN.
// Note that it can be passed as a parameter
if (len == -1)
{
len = (int)Math.Ceiling(str.Length * Math.Log(chars.Length, 2) / 8);
}
BigInteger bi = BigInteger.Zero;
BigInteger length2 = chars.Length;
BigInteger mult = BigInteger.One;
for (int j = 0; j < str.Length; j++)
{
int ix = chars.IndexOf(littleEndian ? str[j] : str[str.Length - j - 1]);
// We didn't find the character
if (ix == -1)
{
throw new ArgumentOutOfRangeException();
}
bi += ix * mult;
mult *= length2;
}
var bytes = bi.ToByteArray();
int len2 = bytes.Length;
// BigInteger adds a 0 byte for positive numbers that have the
// last byte > 0x7F
if (len2 >= 2 && bytes[len2 - 1] == 0)
{
len2--;
}
int len3 = Math.Min(len, len2);
byte[] bytes2;
if (littleEndian)
{
if (len == bytes.Length)
{
bytes2 = bytes;
}
else
{
bytes2 = new byte[len];
Array.Copy(bytes, bytes2, len3);
}
}
else
{
bytes2 = new byte[len];
for (int i = 0; i < len3; i++)
{
bytes2[len - i - 1] = bytes[i];
}
}
for (int i = len3; i < len2; i++)
{
if (bytes[i] != 0)
{
throw new OverflowException();
}
}
return bytes2;
}
Be aware that they are REALLY slow! REALLY REALLY slow! (2 minutes for 100k). To speed them up you would probably need to rewrite the division/mod operation so that they work directly on a buffer, instead of each time recreating the scratch pads as it's done by BigInteger. And it would still be SLOW. The problem is that the time needed to encode the first byte is O(n) where n is the length of the byte array (this because all the array needs to be divided by 36). Unless you want to work with blocks of 5 bytes and lose some bits. Each symbol of Base36 carries around 5.169925001 bits. So 8 of these symbols would carry 41.35940001 bits. Very near 40 bytes.
Note that these methods can work both in little-endian mode and in big-endian mode. The endianness of the input and of the output is the same. Both methods accept a len parameter. You can use it to trim excess 0 (zeroes). Note that if you try to make an output too much small to contain the input, an OverflowException will be thrown.
System.Text.Encoding enc = System.Text.Encoding.ASCII;
string myString = enc.GetString(myByteArray);
You can play with what encoding you need:
System.Text.ASCIIEncoding,
System.Text.UnicodeEncoding,
System.Text.UTF7Encoding,
System.Text.UTF8Encoding
To match the requrements [a-z][0-9] you can use it:
Byte[] bytes = new Byte[] { 200, 180, 34 };
string result = String.Join("a", bytes.Select(x => x.ToString()).ToArray());
You will have string representation of bytes with char separator. To convert back you will need to split, and convert the string[] to byte[] using the same approach with .Select().
Usually a power of 2 is used - that way one character maps to a fixed number of bits. An alphabet of 32 bits for instance would map to 5 bits. The only challenge in that case is how to deserialize variable-length strings.
For 36 bits you could treat the data as a large number, and then:
divide by 36
add the remainder as character to your result
repeat until the division results in 0
Easier said than done perhaps.
you can use modulu.
this example encode your byte array to string of [0-9][a-z].
change it if you want.
public string byteToString(byte[] byteArr)
{
int i;
char[] charArr = new char[byteArr.Length];
for (i = 0; i < byteArr.Length; i++)
{
int byt = byteArr[i] % 36; // 36=num of availible charachters
if (byt < 10)
{
charArr[i] = (char)(byt + 48); //if % result is a digit
}
else
{
charArr[i] = (char)(byt + 87); //if % result is a letter
}
}
return new String(charArr);
}
If you don't want to lose data for de-encoding you can use this example:
public string byteToString(byte[] byteArr)
{
int i;
char[] charArr = new char[byteArr.Length*2];
for (i = 0; i < byteArr.Length; i++)
{
charArr[2 * i] = (char)((int)byteArr[i] / 36+48);
int byt = byteArr[i] % 36; // 36=num of availible charachters
if (byt < 10)
{
charArr[2*i+1] = (char)(byt + 48); //if % result is a digit
}
else
{
charArr[2*i+1] = (char)(byt + 87); //if % result is a letter
}
}
return new String(charArr);
}
and now you have a string double-lengthed when odd char is the multiply of 36 and even char is the residu. for example: 200=36*5+20 => "5k".

BitConverter.ToString() in reverse? [duplicate]

This question already has answers here:
How do you convert a byte array to a hexadecimal string, and vice versa?
(53 answers)
Closed 8 years ago.
I have an array of bytes that I would like to store as a string. I can do this as follows:
byte[] array = new byte[] { 0x01, 0x02, 0x03, 0x04 };
string s = System.BitConverter.ToString(array);
// Result: s = "01-02-03-04"
So far so good. Does anyone know how I get this back to an array? There is no overload of BitConverter.GetBytes() that takes a string, and it seems like a nasty workaround to break the string into an array of strings and then convert each of them.
The array in question may be of variable length, probably about 20 bytes.
Not a built in method, but an implementation. (It could be done without the split though).
String[] arr=str.Split('-');
byte[] array=new byte[arr.Length];
for(int i=0; i<arr.Length; i++) array[i]=Convert.ToByte(arr[i],16);
Method without Split: (Makes many assumptions about string format)
int length=(s.Length+1)/3;
byte[] arr1=new byte[length];
for (int i = 0; i < length; i++)
arr1[i] = Convert.ToByte(s.Substring(3 * i, 2), 16);
And one more method, without either split or substrings. You may get shot if you commit this to source control though. I take no responsibility for such health problems.
int length=(s.Length+1)/3;
byte[] arr1=new byte[length];
for (int i = 0; i < length; i++)
{
char sixteen = s[3 * i];
if (sixteen > '9') sixteen = (char)(sixteen - 'A' + 10);
else sixteen -= '0';
char ones = s[3 * i + 1];
if (ones > '9') ones = (char)(ones - 'A' + 10);
else ones -= '0';
arr1[i] = (byte)(16*sixteen+ones);
}
(basically implementing base16 conversion on two chars)
You can parse the string yourself:
byte[] data = new byte[(s.Length + 1) / 3];
for (int i = 0; i < data.Length; i++) {
data[i] = (byte)(
"0123456789ABCDEF".IndexOf(s[i * 3]) * 16 +
"0123456789ABCDEF".IndexOf(s[i * 3 + 1])
);
}
The neatest solution though, I believe, is using extensions:
byte[] data = s.Split('-').Select(b => Convert.ToByte(b, 16)).ToArray();
If you don't need that specific format, try using Base64, like this:
var bytes = new byte[] { 0x12, 0x34, 0x56 };
var base64 = Convert.ToBase64String(bytes);
bytes = Convert.FromBase64String(base64);
Base64 will also be substantially shorter.
If you need to use that format, this obviously won't help.
byte[] data = Array.ConvertAll<string, byte>(str.Split('-'), s => Convert.ToByte(s, 16));
I believe the following will solve this robustly.
public static byte[] HexStringToBytes(string s)
{
const string HEX_CHARS = "0123456789ABCDEF";
if (s.Length == 0)
return new byte[0];
if ((s.Length + 1) % 3 != 0)
throw new FormatException();
byte[] bytes = new byte[(s.Length + 1) / 3];
int state = 0; // 0 = expect first digit, 1 = expect second digit, 2 = expect hyphen
int currentByte = 0;
int x;
int value = 0;
foreach (char c in s)
{
switch (state)
{
case 0:
x = HEX_CHARS.IndexOf(Char.ToUpperInvariant(c));
if (x == -1)
throw new FormatException();
value = x << 4;
state = 1;
break;
case 1:
x = HEX_CHARS.IndexOf(Char.ToUpperInvariant(c));
if (x == -1)
throw new FormatException();
bytes[currentByte++] = (byte)(value + x);
state = 2;
break;
case 2:
if (c != '-')
throw new FormatException();
state = 0;
break;
}
}
return bytes;
}
it seems like a nasty workaround to break the string into an array of strings and then convert each of them.
I don't think there's another way... the format produced by BitConverter.ToString is quite specific, so if there is no existing method to parse it back to a byte[], I guess you have to do it yourself
the ToString method is not really intended as a conversion, rather to provide a human-readable format for debugging, easy printout, etc.
I'd rethink about the byte[] - String - byte[] requirement and probably prefer SLaks' Base64 solution

Categories