I am currently working on a network tool that needs to decode/encode a particular protocol that packs fields into dense bit arrays at arbitrary positions. For example, one part of the protocol uses 3 bytes to represent a number of different fields:
Bit Position(s) Length (In Bits) Type
0 1 bool
1-5 5 int
6-13 8 int
14-22 9 uint
23 1 bool
As you can see, several of the fields span multiple bytes. Many (most) are also shorter than the built-in type that might be used to represent them, such as the first int field which is only 5 bits long. In these cases, the most significant bits of the target type (such as an Int32 or Int16) should be padded with 0 to make up the difference.
My problem is that I am having a difficult time processing this kind of data. Specifically, I am having a hard time figuring out how to efficiently get arbitrary length bit arrays, populate them with the appropriate bits from the source buffer, pad them to match the target type, and convert the padded bit arrays to the target type. In an ideal world, I would be able to take the byte[3] in the example above and call a method like GetInt32(byte[] bytes, int startBit, int length).
The closest thing in the wild that I've found is a BitStream class, but it appears to want individual values to line up on byte/word boundaries (and the half-streaming/half-indexed access convention of the class makes it a little confusing).
My own first attempt was to use the BitArray class, but that proved somewhat unwieldy. It's easy enough to stuff all the bits from the buffer into a large BitArray, transfer only the ones you want from the source BitArray to a new temporary BitArray, and then convert that into the target value...but it seems wrong, and very time consuming.
I am now considering a class like the following that references (or creates) a source/target byte[] buffer along with an offset and provides get and set methods for certain target types. The tricky part is that getting/setting values may span multiple bytes.
class BitField
{
private readonly byte[] _bytes;
private readonly int _offset;
public BitField(byte[] bytes)
: this(bytes, 0)
{
}
public BitField(byte[] bytes, int offset)
{
_bytes = bytes;
_offset = offset;
}
public BitField(int size)
: this(new byte[size], 0)
{
}
public bool this[int bit]
{
get { return IsSet(bit); }
set { if (value) Set(bit); else Clear(bit); }
}
public bool IsSet(int bit)
{
return (_bytes[_offset + (bit / 8)] & (1 << (bit % 8))) != 0;
}
public void Set(int bit)
{
_bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (bit % 8)));
}
public void Clear(int bit)
{
_bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (bit % 8)));
}
//startIndex = the index of the bit at which to start fetching the value
//length = the number of bits to include - may be less than 32 in which case
//the most significant bits of the target type should be padded with 0
public int GetInt32(int startIndex, int length)
{
//NEED CODE HERE
}
//startIndex = the index of the bit at which to start storing the value
//length = the number of bits to use, if less than the number of bits required
//for the source type, precision may be lost
//value = the value to store
public void SetValue(int startIndex, int length, int value)
{
//NEED CODE HERE
}
//Other Get.../Set... methods go here
}
I am looking for any guidance in this area such as third-party libraries, algorithms for getting/setting values at arbitrary bit positions that span multiple bytes, feedback on my approach, etc. I included the class above for clarification and am not necessarily looking for code to fill it in (though I won't argue if someone wants to work it out!).
As promised, here is the class I ended up creating for this purpose. It will wrap an arbitrary byte array at an optionally specified index and allowing reading/writing at the bit level. It provides methods for reading/writing arbitrary blocks of bits from other byte arrays or for reading/writing primitive values with user-defined offsets and lengths. It works very well for my situation and solves the exact question I asked above. However, it does have a couple shortcomings. The first is that it is obviously not greatly documented - I just haven't had the time. The second is that there are no bounds or other checks. It also currently requires the MiscUtil library to provide endian conversion. All that said, hopefully this can help solve or serve as a starting point for someone else with a similar use case.
internal class BitField
{
private readonly byte[] _bytes;
private readonly int _offset;
private EndianBitConverter _bitConverter = EndianBitConverter.Big;
public BitField(byte[] bytes)
: this(bytes, 0)
{
}
//offset = the offset (in bytes) into the wrapped byte array
public BitField(byte[] bytes, int offset)
{
_bytes = bytes;
_offset = offset;
}
public BitField(int size)
: this(new byte[size], 0)
{
}
//fill == true = initially set all bits to 1
public BitField(int size, bool fill)
: this(new byte[size], 0)
{
if (!fill) return;
for(int i = 0 ; i < size ; i++)
{
_bytes[i] = 0xff;
}
}
public byte[] Bytes
{
get { return _bytes; }
}
public int Offset
{
get { return _offset; }
}
public EndianBitConverter BitConverter
{
get { return _bitConverter; }
set { _bitConverter = value; }
}
public bool this[int bit]
{
get { return IsBitSet(bit); }
set { if (value) SetBit(bit); else ClearBit(bit); }
}
public bool IsBitSet(int bit)
{
return (_bytes[_offset + (bit / 8)] & (1 << (7 - (bit % 8)))) != 0;
}
public void SetBit(int bit)
{
_bytes[_offset + (bit / 8)] |= unchecked((byte)(1 << (7 - (bit % 8))));
}
public void ClearBit(int bit)
{
_bytes[_offset + (bit / 8)] &= unchecked((byte)~(1 << (7 - (bit % 8))));
}
//index = the index of the source BitField at which to start getting bits
//length = the number of bits to get
//size = the total number of bytes required (0 for arbitrary length return array)
//fill == true = set all padding bits to 1
public byte[] GetBytes(int index, int length, int size, bool fill)
{
if(size == 0) size = (length + 7) / 8;
BitField bitField = new BitField(size, fill);
for(int s = index, d = (size * 8) - length ; s < index + length && d < (size * 8) ; s++, d++)
{
bitField[d] = IsBitSet(s);
}
return bitField._bytes;
}
public byte[] GetBytes(int index, int length, int size)
{
return GetBytes(index, length, size, false);
}
public byte[] GetBytes(int index, int length)
{
return GetBytes(index, length, 0, false);
}
//bytesIndex = the index (in bits) into the bytes array at which to start copying
//index = the index (in bits) in this BitField at which to put the value
//length = the number of bits to copy from the bytes array
public void SetBytes(byte[] bytes, int bytesIndex, int index, int length)
{
BitField bitField = new BitField(bytes);
for (int i = 0; i < length; i++)
{
this[index + i] = bitField[bytesIndex + i];
}
}
public void SetBytes(byte[] bytes, int index, int length)
{
SetBytes(bytes, 0, index, length);
}
public void SetBytes(byte[] bytes, int index)
{
SetBytes(bytes, 0, index, bytes.Length * 8);
}
//UInt16
//index = the index (in bits) at which to start getting the value
//length = the number of bits to use for the value, if less than required the value is padded with 0
public ushort GetUInt16(int index, int length)
{
return _bitConverter.ToUInt16(GetBytes(index, length, 2), 0);
}
public ushort GetUInt16(int index)
{
return GetUInt16(index, 16);
}
//valueIndex = the index (in bits) of the value at which to start copying
//index = the index (in bits) in this BitField at which to put the value
//length = the number of bits to copy from the value
public void Set(ushort value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(ushort value, int index)
{
Set(value, 0, index, 16);
}
//UInt32
public uint GetUInt32(int index, int length)
{
return _bitConverter.ToUInt32(GetBytes(index, length, 4), 0);
}
public uint GetUInt32(int index)
{
return GetUInt32(index, 32);
}
public void Set(uint value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(uint value, int index)
{
Set(value, 0, index, 32);
}
//UInt64
public ulong GetUInt64(int index, int length)
{
return _bitConverter.ToUInt64(GetBytes(index, length, 8), 0);
}
public ulong GetUInt64(int index)
{
return GetUInt64(index, 64);
}
public void Set(ulong value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(ulong value, int index)
{
Set(value, 0, index, 64);
}
//Int16
public short GetInt16(int index, int length)
{
return _bitConverter.ToInt16(GetBytes(index, length, 2, IsBitSet(index)), 0);
}
public short GetInt16(int index)
{
return GetInt16(index, 16);
}
public void Set(short value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(short value, int index)
{
Set(value, 0, index, 16);
}
//Int32
public int GetInt32(int index, int length)
{
return _bitConverter.ToInt32(GetBytes(index, length, 4, IsBitSet(index)), 0);
}
public int GetInt32(int index)
{
return GetInt32(index, 32);
}
public void Set(int value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(int value, int index)
{
Set(value, 0, index, 32);
}
//Int64
public long GetInt64(int index, int length)
{
return _bitConverter.ToInt64(GetBytes(index, length, 8, IsBitSet(index)), 0);
}
public long GetInt64(int index)
{
return GetInt64(index, 64);
}
public void Set(long value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(long value, int index)
{
Set(value, 0, index, 64);
}
//Char
public char GetChar(int index, int length)
{
return _bitConverter.ToChar(GetBytes(index, length, 2), 0);
}
public char GetChar(int index)
{
return GetChar(index, 16);
}
public void Set(char value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(char value, int index)
{
Set(value, 0, index, 16);
}
//Bool
public bool GetBool(int index, int length)
{
return _bitConverter.ToBoolean(GetBytes(index, length, 1), 0);
}
public bool GetBool(int index)
{
return GetBool(index, 8);
}
public void Set(bool value, int valueIndex, int index, int length)
{
SetBytes(_bitConverter.GetBytes(value), valueIndex, index, length);
}
public void Set(bool value, int index)
{
Set(value, 0, index, 8);
}
//Single and double precision floating point values must always use the correct number of bits
public float GetSingle(int index)
{
return _bitConverter.ToSingle(GetBytes(index, 32, 4), 0);
}
public void SetSingle(float value, int index)
{
SetBytes(_bitConverter.GetBytes(value), 0, index, 32);
}
public double GetDouble(int index)
{
return _bitConverter.ToDouble(GetBytes(index, 64, 8), 0);
}
public void SetDouble(double value, int index)
{
SetBytes(_bitConverter.GetBytes(value), 0, index, 64);
}
}
If your packets are always smaller than 8 or 4 bytes it would be easier to store each packet in an Int32 or Int64. The byte array only complicates things. You do have to pay attention to High-Endian vs Low-Endian storage.
And then, for a 3 byte package:
public static void SetValue(Int32 message, int startIndex, int length, int value)
{
// we want lengthx1
int mask = (1 << length) - 1;
value = value & mask; // or check and throw
int offset = 24 - startIndex - length; // 24 = 3 * 8
message = message | (value << offset);
}
First up, it seems you have re invented the wheel with the System.Collections.BitArray class. As for actually finding the value of a specific field of bits, I think that can easily be accomplished with a little math magic of the following pseudocode:
Start at the most distant digit in your selection (startIndex + length).
If it is set, add 2^(distance from digit). In this case, it would be 0 (mostDistance - self = 0). So Add 2^0 (1).
Move one bit to the left.
Repeat for each digit in the length you want.
In that situation, if you had a bit array like so:
10001010
And you want the value of digits 0-3, you would get something like:
[Index 3] [Index 2] [Index 1] [Index 0]
(3 - 3) (3 - 2) (3 - 1) (3 - 0)
=============================================
(0 * 2^0) + (0 * 2^1) + (0 * 2^2) + (1 * 2^3) = 8
Since 1000 (binary) == 8, the math works out.
What's the problem with just using simple bit shifts to get your values out?
int data = Convert.ToInt32( "110000000010000000100001", 2 );
bool v1 = ( data & 1 ) == 1; // True
int v2 = ( data >> 1 ) & 0x1F; // 16
int v3 = ( data >> 6 ) & 0xFF; // 128
uint v4 = (uint )( data >> 14 ) & 0x1FF; // 256
bool v5 = ( data >> 23 ) == 1; // True
This is a pretty good article covering the subject. it's in C, but the same concepts still apply.
I have a string representing bits, such as:
"0000101000010000"
I want to convert it to get an array of bytes such as:
{0x0A, 0x10}
The number of bytes is variable but there will always be padding to form 8 bits per byte (so 1010 becomes 000010101).
Use the builtin Convert.ToByte() and read in chunks of 8 chars without reinventing the thing..
Unless this is something that should teach you about bitwise operations.
Update:
Stealing from Adam (and overusing LINQ, probably. This might be too concise and a normal loop might be better, depending on your own (and your coworker's!) preferences):
public static byte[] GetBytes(string bitString) {
return Enumerable.Range(0, bitString.Length/8).
Select(pos => Convert.ToByte(
bitString.Substring(pos*8, 8),
2)
).ToArray();
}
public static byte[] GetBytes(string bitString)
{
byte[] output = new byte[bitString.Length / 8];
for (int i = 0; i < output.Length; i++)
{
for (int b = 0; b <= 7; b++)
{
output[i] |= (byte)((bitString[i * 8 + b] == '1' ? 1 : 0) << (7 - b));
}
}
return output;
}
Here's a quick and straightforward solution (and I think it will meet all your requirements): http://vbktech.wordpress.com/2011/07/08/c-net-converting-a-string-of-bits-to-a-byte-array/
This should get you to your answer: How can I convert bits to bytes?
You could just convert your string into an array like that article has, and from there use the same logic to perform the conversion.
Get the characers in groups of eight, and parse to a byte:
string bits = "0000101000010000";
byte[] data =
Regex.Matches(bits, ".{8}").Cast<Match>()
.Select(m => Convert.ToByte(m.Groups[0].Value, 2))
.ToArray();
private static byte[] GetBytes(string bitString)
{
byte[] result = Enumerable.Range(0, bitString.Length / 8).
Select(pos => Convert.ToByte(
bitString.Substring(pos * 8, 8),
2)
).ToArray();
List<byte> mahByteArray = new List<byte>();
for (int i = result.Length - 1; i >= 0; i--)
{
mahByteArray.Add(result[i]);
}
return mahByteArray.ToArray();
}
private static String ToBitString(BitArray bits)
{
var sb = new StringBuilder();
for (int i = bits.Count - 1; i >= 0; i--)
{
char c = bits[i] ? '1' : '0';
sb.Append(c);
}
return sb.ToString();
}
You can go any of below,
byte []bytes = System.Text.Encoding.UTF8.GetBytes("Hi");
string str = System.Text.Encoding.UTF8.GetString(bytes);
byte []bytesNew = System.Convert.FromBase64String ("Hello!");
string strNew = System.Convert.ToBase64String(bytesNew);
How can I write bits to a stream (System.IO.Stream) or read in C#? thanks.
You could create an extension method on Stream that enumerates the bits, like this:
public static class StreamExtensions
{
public static IEnumerable<bool> ReadBits(this Stream input)
{
if (input == null) throw new ArgumentNullException("input");
if (!input.CanRead) throw new ArgumentException("Cannot read from input", "input");
return ReadBitsCore(input);
}
private static IEnumerable<bool> ReadBitsCore(Stream input)
{
int readByte;
while((readByte = input.ReadByte()) >= 0)
{
for(int i = 7; i >= 0; i--)
yield return ((readByte >> i) & 1) == 1;
}
}
}
Using this extension method is easy:
foreach(bool bit in stream.ReadBits())
{
// do something with the bit
}
Attention: you should not call ReadBits multiple times on the same Stream, otherwise the subsequent calls will forget the current bit position and will just start reading the next byte.
This is not possible with the default stream class. The C# (BCL) Stream class operates on the granularity of bytes at it's lowest level. What you can do is write a wrapper class which reads bytes and partititions them out to bits.
For example:
class BitStream : IDisposable {
private Stream m__stream;
private byte? m_current;
private int m_index;
public byte ReadNextBit() {
if ( !m_current.HasValue ) {
m_current = ReadNextByte();
m_index = 0;
}
var value = (m_byte.Value >> m_index) & 0x1;
m_index++;
if (m_index == 8) {
m_current = null;
}
return value;
}
private byte ReadNextByte() {
...
}
// Dispose implementation omitted
}
Note: This will read the bits in right to left fashion which may or may not be what you're intending.
If you need to retrieve separate sections of your byte stream a few bits at a time, you need to remember the position of the bit to read next between calls. The following class takes care of caching the current byte and the bit position within it between calls.
// Binary MSB-first bit enumeration.
public class BitStream
{
private Stream wrapped;
private int bitPos = -1;
private int buffer;
public BitStream(Stream stream) => this.wrapped = stream;
public IEnumerable<bool> ReadBits()
{
do
{
while (bitPos >= 0)
{
yield return (buffer & (1 << bitPos--)) > 0;
}
buffer = wrapped.ReadByte();
bitPos = 7;
} while (buffer > -1);
}
}
Call like this:
var bStream = new BitStream(<existing Stream>);
var firstBits = bStream.ReadBits().Take(2);
var nextBits = bStream.ReadBits().Take(3);
...
For your purpose, I wrote an easy-to-use, fast and open-source (MIT license) library for this, called "BitStream", which is available at github (https://github.com/martinweihrauch/BitStream).
In this example, you can see how 5 unsigned integers, which can be represented with 6 bits (all below the value 63) are written with 6 bits each to a stream and then read back. Please note that the library takes and returns long or ulong values for the ease of it, so just convert your e. g. int, uint, etc to long/ulong first.
using SharpBitStream;
uint[] testDataUnsigned = { 5, 62, 17, 50, 33 };
var ms = new MemoryStream();
var bs = new BitStream(ms);
Console.WriteLine("Test1: \r\nFirst testing writing and reading small numbers of a max of 6 bits.");
Console.WriteLine("There are 5 unsigned ints , which shall be written into 6 bits each as they are all small than 64: 5, 62, 17, 50, 33");
foreach(var bits in testDataUnsigned)
{
bs.WriteUnsigned(6, (ulong)bits);
}
Console.WriteLine("The original data are of the size: " + testDataUnsigned.Length + " bytes. The size of the stream is now: " + ms.Length + " bytes\r\nand the bytes in it are: ");
ms.Position = 0;
Console.WriteLine("The resulting bytes in the stream look like this: ");
for (int i = 0; i < ms.Length; i++)
{
uint bits = (uint)ms.ReadByte();
Console.WriteLine("Byte #" + Convert.ToString(i).PadLeft(4, '0') + ": " + Convert.ToString(bits, 2).PadLeft(8, '0'));
}
Console.WriteLine("\r\nNow reading the bits back:");
ms.Position = 0;
bs.SetPosition(0, 0);
foreach (var bits in testDataUnsigned)
{
ulong number = (uint)bs.ReadUnsigned(6);
Console.WriteLine("Number read: " + number);
}
I have a BitArray with the length of 8, and I need a function to convert it to a byte. How to do it?
Specifically, I need a correct function of ConvertToByte:
BitArray bit = new BitArray(new bool[]
{
false, false, false, false,
false, false, false, true
});
//How to write ConvertToByte
byte myByte = ConvertToByte(bit);
var recoveredBit = new BitArray(new[] { myByte });
Assert.AreEqual(bit, recoveredBit);
This should work:
byte ConvertToByte(BitArray bits)
{
if (bits.Count != 8)
{
throw new ArgumentException("bits");
}
byte[] bytes = new byte[1];
bits.CopyTo(bytes, 0);
return bytes[0];
}
A bit late post, but this works for me:
public static byte[] BitArrayToByteArray(BitArray bits)
{
byte[] ret = new byte[(bits.Length - 1) / 8 + 1];
bits.CopyTo(ret, 0);
return ret;
}
Works with:
string text = "Test";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(text);
BitArray bits = new BitArray(bytes);
bytes[] bytesBack = BitArrayToByteArray(bits);
string textBack = System.Text.Encoding.ASCII.GetString(bytesBack);
// bytes == bytesBack
// text = textBack
.
A poor man's solution:
protected byte ConvertToByte(BitArray bits)
{
if (bits.Count != 8)
{
throw new ArgumentException("illegal number of bits");
}
byte b = 0;
if (bits.Get(7)) b++;
if (bits.Get(6)) b += 2;
if (bits.Get(5)) b += 4;
if (bits.Get(4)) b += 8;
if (bits.Get(3)) b += 16;
if (bits.Get(2)) b += 32;
if (bits.Get(1)) b += 64;
if (bits.Get(0)) b += 128;
return b;
}
Unfortunately, the BitArray class is partially implemented in .Net Core class (UWP). For example BitArray class is unable to call the CopyTo() and Count() methods. I wrote this extension to fill the gap:
public static IEnumerable<byte> ToBytes(this BitArray bits, bool MSB = false)
{
int bitCount = 7;
int outByte = 0;
foreach (bool bitValue in bits)
{
if (bitValue)
outByte |= MSB ? 1 << bitCount : 1 << (7 - bitCount);
if (bitCount == 0)
{
yield return (byte) outByte;
bitCount = 8;
outByte = 0;
}
bitCount--;
}
// Last partially decoded byte
if (bitCount < 7)
yield return (byte) outByte;
}
The method decodes the BitArray to a byte array using LSB (Less Significant Byte) logic. This is the same logic used by the BitArray class. Calling the method with the MSB parameter set on true will produce a MSB decoded byte sequence. In this case, remember that you maybe also need to reverse the final output byte collection.
This should do the trick. However the previous answer is quite likely the better option.
public byte ConvertToByte(BitArray bits)
{
if (bits.Count > 8)
throw new ArgumentException("ConvertToByte can only work with a BitArray containing a maximum of 8 values");
byte result = 0;
for (byte i = 0; i < bits.Count; i++)
{
if (bits[i])
result |= (byte)(1 << i);
}
return result;
}
In the example you posted the resulting byte will be 0x80. In other words the first value in the BitArray coresponds to the first bit in the returned byte.
That's should be the ultimate one. Works with any length of array.
private List<byte> BoolList2ByteList(List<bool> values)
{
List<byte> ret = new List<byte>();
int count = 0;
byte currentByte = 0;
foreach (bool b in values)
{
if (b) currentByte |= (byte)(1 << count);
count++;
if (count == 7) { ret.Add(currentByte); currentByte = 0; count = 0; };
}
if (count < 7) ret.Add(currentByte);
return ret;
}
In addition to #JonSkeet's answer you can use an Extension Method as below:
public static byte ToByte(this BitArray bits)
{
if (bits.Count != 8)
{
throw new ArgumentException("bits");
}
byte[] bytes = new byte[1];
bits.CopyTo(bytes, 0);
return bytes[0];
}
And use like:
BitArray foo = new BitArray(new bool[]
{
false, false, false, false,false, false, false, true
});
foo.ToByte();
byte GetByte(BitArray input)
{
int len = input.Length;
if (len > 8)
len = 8;
int output = 0;
for (int i = 0; i < len; i++)
if (input.Get(i))
output += (1 << (len - 1 - i)); //this part depends on your system (Big/Little)
//output += (1 << i); //depends on system
return (byte)output;
}
Cheers!
Little endian byte array converter : First bit (indexed with "0") in the BitArray
assumed to represents least significant bit (rightmost bit in the bit-octet) which interpreted as "zero" or "one" as binary.
public static class BitArrayExtender {
public static byte[] ToByteArray( this BitArray bits ) {
const int BYTE = 8;
int length = ( bits.Count / BYTE ) + ( (bits.Count % BYTE == 0) ? 0 : 1 );
var bytes = new byte[ length ];
for ( int i = 0; i < bits.Length; i++ ) {
int bitIndex = i % BYTE;
int byteIndex = i / BYTE;
int mask = (bits[ i ] ? 1 : 0) << bitIndex;
bytes[ byteIndex ] |= (byte)mask;
}//for
return bytes;
}//ToByteArray
}//class