Integer to Bn conversion - c#

I have this piece of c# code that I cannot understand. During the first iteration of the IntToBin loop, I understand that the shift operator convert it to byte value of 7 but on the second pass, the byte value is 224. How is the 224 achieved.
static void Main(string[] args)
{
IntToBin(2016,2);
//Console.Write((byte)2016);
}
public static byte[] IntToBin(int from, int len)
{
byte[] to = new byte[len];
int max = len;
int t;
for (int i_move = max - 1, i_to = 0; i_move >= 0; i_move--, i_to++)
{
to[i_to] = (byte)(from >> (8 * i_move));
}
return to;
}

As far as I can see, you have difficulty with this line
to[i_to] = (byte)(from >> (8 * i_move));
You can easily test that
2016 == 7 * 256 + 224
Now how to get these numbers?
Shift operator >> is, in fact, an integer division by
powers of two:
>> 0 - no division (division by 1)
>> 1 - division by 2
>> 2 - division by 4
...
>> 8 * i_move - dision by 2**(8*i_move) i.e. division by 256 ** i_move
while (byte) cast is, in fact, remainder operator % 256
since (byte) returns the last byte.
Now let's try to unwrap the loop
i_move == 1 // max - 1 where max = 2
to[0] = (from / 256) % 256; // = 7
i_move == 0
to[1] = (from / 1) % 256; // = 224
In general case
to[0] = from / (256 ** (len - 1)) % 256;
to[1] = from / (256 ** (len - 2)) % 256;
...
to[len - 3] = from / (256 ** 2) % 256;
to[len - 2] = from / (256) % 256;
to[len - 1] = from / (1) % 256;

Related

How to convert a float/double/half to a minifloat the optimal way (improve my already working code)?

I've written an IEEE 754 "quarter" 8-bit minifloat in a 1.3.4.−3 format in C#.
It was mostly a fun little side-project, testing whether or not I understand floats.
Actually, though, I find myself using it more than I'd like to admit :) (bandwidth > clock ticks)
Here's my code for converting the minifloat to a 32-bit float:
public static implicit operator float(quarter q)
{
int sign = (q.value & 0b1000_0000) << 24;
int fusedExponentMantissa = (q.value & 0b0111_1111) << (23 - MANTISSA_BITS);
if ((q.value & 0b0111_0000) == 0b0111_0000) // NaN/Infinity
{
return asfloat(sign | (255 << 23) | fusedExponentMantissa);
}
else // normal and subnormal
{
float magic = asfloat((255 - 1 + EXPONENT_BIAS) << 23);
return magic * asfloat(sign | fusedExponentMantissa);
}
}
where quarter.value is the stored byte and "asfloat" is simply *(float*)&myUInt.The "magic" number makes use of mantissa overflow in the subnormal case, which affects the f_32 exponent (integer multiplication and mask + add is slower than FPU-switch and float multiplication). I guess one could optimize away the branch, too.
But here comes the problematic code - float_32 to float_8:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) & 0x8000_0000u) >> 24);
uint f32_exponent = asuint(f) & 0x7F80_0000u;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < (120 << 23)) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > (130 << 23)) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
switch (f32_exponent)
{
case 120 << 23: // 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
{
return new quarter { value = (byte)(f8_sign | touint8(f32_mantissa != 0)) };
}
case 121 << 23: // 2^(-6) * (1 + mantissa): return +/- quarter.epsilon = 2^(-2) * (0 + 2^(-4)); if the mantissa is > 0.5 i.e. 2^(-6) * max(mantissa, 1.75), return 2^(-2) * 2^(-3)
{
return new quarter { value = (byte)(f8_sign | (Epsilon.value + touint8(f32_mantissa > 0x0040_0000))) };
}
case 122 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0010u | (f32_mantissa >> 22)) };
}
case 123 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0100u | (f32_mantissa >> 21)) };
}
case 124 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_1000u | (f32_mantissa >> 20)) };
}
default:
{
const uint exponentDelta = (127 + EXPONENT_BIAS) << 23;
return new quarter { value = (byte)(f8_sign | (((f32_exponent - exponentDelta) | f32_mantissa) >> 19)) };
}
}
}
}
... where the function
"asuint" is simply *(uint*)&myFloat and
"touint8" is simply *(byte*)&myBoolean i.e. myBoolean ? 1 : 0.
The first five cases deal with numbers that can only be represented as subnormals in a "quarter".
I want to get rid of the switch at the very least. There's obviously a pattern (same as with float8_to_float32) but I haven't been able to figure out how I could unify the entire switch for days... I tried to google how hardware converts doubles to floats but that yielded no results either.
My requirements are to hold on to the IEEE-754 standard, meaning:
NaN, infinity preservation and clamping to infinity/zero in case of over-/underflow, aswell as rounding to epsilon when the larger type's value is closer to epsilon than 0 (first switch case aswell as the underflow limit in the first if statement).
Can anyone at least push me in the right direction please?
This may not be optimal, but it uses strictly conforming C code except as noted in the first comment, so no pointer aliasing or other manipulation of the bits of a floating-point object. A thorough test program is included.
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
/* Notes on portability:
uint8_t is an optional type. Its use here is easily replaced by
unsigned char.
Round-to-nearest is required in FloatToMini.
Floating-point must be base two, and the constant in the
Dekker-Veltkamp split is hardcoded for IEEE-754 binary64 but could be
adopted to other formats. (Change the exponent in 0x1p48 to the number
of bits in the significand minus five.)
*/
/* Convert a double to a 1-3-4 floating-point format. Round-to-nearest is
required.
*/
static uint8_t FloatToMini(double x)
{
// Extract the sign bit of x, moved into its position in a mini-float.
uint8_t s = !!signbit(x) << 7;
x = fabs(x);
/* If x is a NaN, return a quiet NaN with the copied sign. Significand
bits are not preserved.
*/
if (x != x)
return s | 0x78;
/* If |x| is greater than or equal to the rounding point between the
maximum finite value and infinity, return infinity with the copied sign.
(0x1.fp0 is the largest representable significand, 0x1.f8 is that plus
half an ULP, and the largest exponent is 3, so 0x1.f8p3 is that
rounding point.)
*/
if (0x1.f8p3 <= x)
return s | 0x70;
// If x is subnormal, encode with zero exponent.
if (x < 0x1p-2 - 0x1p-7)
return s | (uint8_t) nearbyint(x * 0x1p6);
/* Round to five significand bits using the Dekker-Veltkamp Split. (The
cast eliminates the excess precision that the C standard allows.)
*/
double d = x * (0x1p48 + 1);
x = d - (double) (d-x);
/* Separate the significand and exponent. C's frexp scales the exponent
so the significand is in [.5, 1), hence the e-1 below.
*/
int e;
x = frexp(x, &e) - .5;
return s | (e-1+3) << 4 | (uint8_t) (x*0x1p5);
}
static void Show(double x)
{
printf("%g -> 0x%02" PRIx8 ".\n", x, FloatToMini(x));
}
static void Test(double x, uint8_t expected)
{
uint8_t observed = FloatToMini(x);
if (expected != observed)
{
printf("Error, %.9g (%a) produced 0x%02" PRIx8
" but expected 0x%02" PRIx8 ".\n",
x, x, observed, expected);
exit(EXIT_FAILURE);
}
}
int main(void)
{
// Set the value of an ULP in [1, 2).
static const double ULP = 0x1p-4;
// Test all even significands with normal exponents.
for (double s = 1; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP / (s == 1 ? 4 : 2); t <= +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Test all odd significands with normal exponents.
for (double s = 1 + 1*ULP; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP/2+ULP/16; t < +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Set the value of an ULP in the subnormal range.
static const double subULP = ULP * 0x1p-2;
// Test all even significands with the subnormal exponent.
for (double s = 0; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = s == 0 ? 0 : -subULP/2; t <= +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test all odd significands with the subnormal exponent.
for (double s = 0 + 1*subULP; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -subULP/2 + subULP/16; t < +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test at and slightly under the point of rounding to infinity.
Test(+15.75, 0x70);
Test(-15.75, 0xf0);
Test(nexttoward(+15.75, 0), 0x6f);
Test(nexttoward(-15.75, 0), 0xef);
// Test infinities and NaNs.
Test(+INFINITY, 0x70);
Test(-INFINITY, 0xf0);
Test(+NAN, 0x78);
Test(-NAN, 0xf8);
Show(0);
Show(0x1p-6);
Show(0x1p-2);
Show(0x1.1p-2);
Show(0x1.2p-2);
Show(0x1.4p-2);
Show(0x1.8p-2);
Show(0x1p-1);
Show(15.5);
Show(15.75);
Show(16);
Show(NAN);
Show(1./6);
Show(1./3);
Show(2./3);
}
I hate to answer my own question... But this may still not be the optimal solution.
Although #Eric Postpischil's solution uses an established algorithm, it is not very well suited for minifloats, since there are so few denormals in 4 mantissa bits. Additionally, the overhead of multiple float arithmetic operations - and because of the actual code behind frexp in particular, it only has one branch less (or two when inlined and optimized) than my original solution and is also not that great in regards to instruction level parallelism.
So here's my current solution:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) >> 31) << 7);
uint f32_exponent = (asuint(f) >> 23) & 0x00FFu;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < 120) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > 130) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
int cmp = 125 - (int)f32_exponent;
int cmpIsZeroOrNegativeMask = (cmp - 1) >> 31;
int denormalExponent = andnot(0b0001_0000 >> cmp, cmpIsZeroOrNegativeMask); // special case 121: sets it to quarter.Epsilon
denormalExponent += touint8((f32_exponent == 121) & (f32_mantissa >= 0x0040_0000)); // case 121: 2^(-6) * (1 + mantissa): return +/- quarter.Epsilon = 2^(-2) * 2^(-4); if the mantissa is >= 0.5 return 2^(-2) * 2^(-3)
denormalExponent |= touint8((f32_exponent == 120) & (f32_mantissa != 0)); // case 120: 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
int normalExponent = (cmpIsZeroOrNegativeMask & ((int)f32_exponent - (127 + EXPONENT_BIAS))) << 4;
int mantissaShift = 19 + andnot(cmp, cmpIsZeroOrNegativeMask);
return new quarter { value = (byte)((f8_sign | normalExponent) | (denormalExponent | (f32_mantissa >> mantissaShift))) };
}
}
But note that the particular andnot(int a, int b) function I use returns a & ~b and...not ~a & b.
Thanks for your help :) I'm keeping this open since, as mentioned, this may very well not be the best solution - but at least it's my own...
PS: This is probably a good example for why PREMATURE optimization is bad; Your code is much less readable. Make sure you have the functionality backed up by unit tests and make sure you even need the optimization in the first place.
...And after some time and in the spirit of transparent progression, I want to show the final version, since I believe to have found the optimal implementation; more later.
First off, here it is (the code should speak for itself, which is why it is this "much"):
unsafe struct quarter
{
const bool IEEE_754_STANDARD = true; //standard: true
const bool SIGN_BIT = IEEE_754_STANDARD || true; //standard: true
const int BITS = 8 * sizeof(byte); //standard: 8
const int EXPONENT_BITS = 3 + (SIGN_BIT ? 0 : 1); //standard: 3
const int MANTISSA_BITS = BITS - EXPONENT_BITS - (SIGN_BIT ? 1 : 0); //standard: 4
const int EXPONENT_BIAS = -(((1 << BITS) - 1) >> (BITS - (EXPONENT_BITS - 1))); //standard: -3
const int MAX_EXPONENT = EXPONENT_BIAS + ((1 << EXPONENT_BITS) - 1) - (IEEE_754_STANDARD ? 1 : 0); //standard: 3
const int SIGNALING_EXPONENT = (MAX_EXPONENT - EXPONENT_BIAS + (IEEE_754_STANDARD ? 1 : 0)) << MANTISSA_BITS; //standard: 0b0111_0000
const int F32_BITS = 8 * sizeof(float);
const int F32_EXPONENT_BITS = 8;
const int F32_MANTISSA_BITS = 23;
const int F32_EXPONENT_BIAS = -(int)(((1L << F32_BITS) - 1) >> (F32_BITS - (F32_EXPONENT_BITS - 1)));
const int F32_MAX_EXPONENT = F32_EXPONENT_BIAS + ((1 << F32_EXPONENT_BITS) - 1 - 1);
const int F32_SIGNALING_EXPONENT = (F32_MAX_EXPONENT - F32_EXPONENT_BIAS + 1) << F32_MANTISSA_BITS;
const int F32_SHL_LOSE_SIGN = (F32_BITS - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_SHR_PLACE_MANTISSA = MANTISSA_BITS + ((1 + F32_EXPONENT_BITS) - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_MAGIC = (((1 << F32_EXPONENT_BITS) - 1) - (1 + EXPONENT_BITS)) << F32_MANTISSA_BITS;
byte _value;
static quarter Epsilon => new quarter { _value = 1 };
static quarter MaxValue => new quarter { _value = (byte)(SIGNALING_EXPONENT - 1) };
static quarter NaN => new quarter { _value = (byte)(SIGNALING_EXPONENT | 1) };
static quarter PositiveInfinity => new quarter { _value = (byte)SIGNALING_EXPONENT };
static uint asuint(float f) => *(uint*)&f;
static float asfloat(uint u) => *(float*)&u;
static byte tobyte(bool b) => *(byte*)&b;
static float ToFloat(quarter q, bool promiseInRange)
{
uint fusedExponentMantissa = ((uint)q._value << F32_SHL_LOSE_SIGN) >> F32_SHR_PLACE_MANTISSA;
uint sign = ((uint)q._value >> (BITS - 1)) << (F32_BITS - 1);
if (!promiseInRange)
{
bool nanInf = (q._value & SIGNALING_EXPONENT) == SIGNALING_EXPONENT;
uint ifNanInf = asuint(float.PositiveInfinity) & (uint)(-tobyte(nanInf));
return (nanInf ? 1f : asfloat(F32_MAGIC)) * asfloat(sign | fusedExponentMantissa | ifNanInf);
}
else
{
return asfloat(F32_MAGIC) * asfloat(sign | fusedExponentMantissa);
}
}
static quarter ToQuarter(float f, bool promiseInRange)
{
float inRange = f * (1f / asfloat(F32_MAGIC));
uint q = asuint(inRange) >> (F32_MANTISSA_BITS - (1 + EXPONENT_BITS));
uint f8_sign = asuint(f) >> (F32_BITS - 1);
if (!promiseInRange)
{
uint f32_exponent = asuint(f) & F32_SIGNALING_EXPONENT;
bool overflow = f32_exponent > (uint)(-F32_EXPONENT_BIAS + MAX_EXPONENT << F32_MANTISSA_BITS);
bool notNaNInf = f32_exponent != F32_SIGNALING_EXPONENT;
f8_sign ^= tobyte(!notNaNInf);
if (overflow & notNaNInf)
{
q = PositiveInfinity._value;
}
}
f8_sign <<= (BITS - 1);
return new quarter{ _value = (byte)(q ^ f8_sign) };
}
}
Turns out that in fact, the reverse operation of converting the mini-float to a 32 bit float by multiplying with a magic constant is also the reverse operation of a multiplication (wow...): a floating point division by that constant.
Luckily "by that constant" and not the other way around; we can calculate the reciprocal at compile time and multiply by it instead. This only fails, as with the reverse operation, when converting to- and from 'INF' and 'NaN'. Absolute overflow with any biased 32 exponent with exponent % (MAX_EXPONENT + 1) != 0 is not translated into 'INF' and positive 'INF' is translated into negative 'INF'.
Although this enables some optimizations through the bool paramater, this mostly just reduces code size and more importantly (especially for SIMD versions, where small data types really shine) reduces the need for constants. Speaking of SIMD: This scalar version can be optimized a little by using SSE/SSE2 intrinsics.
The (disabled) optimizations (would) run completely in parallel to the floating point multiplication followed by a shift, taking a total of 5 to 6+ clock cycles (very CPU dependant), which is astonishingly close to native hardware instructions (~4 to 5 clock cycles).

C#.NET to PHP same code with different result

C#.NET:
public int NextRandom(int n) {
int n2 = (n + 7) * 3;
n = ((int)((uint)n >> 8) | n << 24);
n ^= ((int)((uint)n >> 7) & 0x3FF) * ((int)((uint)n >> 22) & 0x3FF) + 5 * (n2 + 3);
return n;
}
NextRandom(1337);
C# RETURN: 956321482
PHP:
public function NextRandom($n) {
$n2 = ($n + 7) * 3;
$n = ((int)(abs($n) >> 8) | $n << 24);
$n ^= ((int)(abs($n) >> 7) & 0x3FF) * ((int)(abs($n) >> 22) &
0x3FF) + 5 * ($n2 + 3);
return $n;
}
NextRandom(1337);
PHP RETURN: 22431157962
What is wrong in my PHP code?
Tanks for help.
SOLVED:
I add
$n &= 0xFFFFFFFF;
to put the integer back into 32-bit range.
the result of your operation is 22431157962 the value that PHP shows
But the max value an int(32bit) can show is 2147483647, so it can not fit in the return type(int) you have defined, try changing the return type to long(64 bit number) (+any other cast if needed) and you should be fine, not a master at PHP but i think PHP is using 64 bit number in this case
Just for more debug info, the way to debug this is look at HEX values
956321482 = 0x39004ECA
22431157962 = 0x539004ECA
If you look close the first 32bit are same, but your number needs more than that

fast way to convert integer array to byte array (11 bit)

I have integer array and I need to convert it to byte array
but I need to take (only and just only) first 11 bit of each element of the هinteger array
and then convert it to a byte array
I tried this code
// ***********convert integer values to byte values
//***********to avoid the left zero padding on the byte array
// *********** first step : convert to binary string
// ***********second step : convert binary string to byte array
// *********** first step
string ByteString = Convert.ToString(IntArray[0], 2).PadLeft(11,'0');
for (int i = 1; i < IntArray.Length; i++)
ByteString = ByteString + Convert.ToString(IntArray[i], 2).PadLeft(11, '0');
// ***********second step
int numOfBytes = ByteString.Length / 8;
byte[] bytes = new byte[numOfBytes];
for (int i = 0; i < numOfBytes; ++i)
{
bytes[i] = Convert.ToByte(ByteString.Substring(8 * i, 8), 2);
}
But it takes too long time (if the file size large , the code takes more than 1 minute)
I need a very very fast code (very few milliseconds only )
can any one help me ?
Basically, you're going to be doing a lot of shifting and masking. The exact nature of that depends on the layout you want. If we assume that we pack little-endian from each int, appending on the left, so two 11-bit integers with positions:
abcdefghijk lmnopqrstuv
become the 8-bit chunks:
defghijk rstuvabc 00lmnopq
(i.e. take the lowest 8 bits of the first integer, which leaves 3 left over, so pack those into the low 3 bits of the next byte, then take the lowest 5 bits of the second integer, then finally the remaining 6 bits, padding with zero), then something like this should work:
using System;
using System.Linq;
static class Program
{
static string AsBinary(int val) => Convert.ToString(val, 2).PadLeft(11, '0');
static string AsBinary(byte val) => Convert.ToString(val, 2).PadLeft(8, '0');
static void Main()
{
int[] source = new int[1432];
var rand = new Random(123456);
for (int i = 0; i < source.Length; i++)
source[i] = rand.Next(0, 2047); // 11 bits
// Console.WriteLine(string.Join(" ", source.Take(5).Select(AsBinary)));
var raw = Encode(source);
// Console.WriteLine(string.Join(" ", raw.Take(6).Select(AsBinary)));
var clone = Decode(raw);
// now prove that it worked OK
if (source.Length != clone.Length)
{
Console.WriteLine($"Length: {source.Length} vs {clone.Length}");
}
else
{
int failCount = 0;
for (int i = 0; i < source.Length; i++)
{
if (source[i] != clone[i] && failCount++ == 0)
{
Console.WriteLine($"{i}: {source[i]} vs {clone[i]}");
}
}
Console.WriteLine($"Errors: {failCount}");
}
}
static byte[] Encode(int[] source)
{
long bits = source.Length * 11;
int len = (int)(bits / 8);
if ((bits % 8) != 0) len++;
byte[] arr = new byte[len];
int bitOffset = 0, index = 0;
for (int i = 0; i < source.Length; i++)
{
// note: this encodes little-endian
int val = source[i] & 2047;
int bitsLeft = 11;
if(bitOffset != 0)
{
val = val << bitOffset;
arr[index++] |= (byte)val;
bitsLeft -= (8 - bitOffset);
val >>= 8;
}
if(bitsLeft >= 8)
{
arr[index++] = (byte)val;
bitsLeft -= 8;
val >>= 8;
}
if(bitsLeft != 0)
{
arr[index] = (byte)val;
}
bitOffset = bitsLeft;
}
return arr;
}
private static int[] Decode(byte[] source)
{
int bits = source.Length * 8;
int len = (int)(bits / 11);
// note no need to worry about remaining chunks - no ambiguity since 11 > 8
int[] arr = new int[len];
int bitOffset = 0, index = 0;
for(int i = 0; i < source.Length; i++)
{
int val = source[i] << bitOffset;
int bitsLeftInVal = 11 - bitOffset;
if(bitsLeftInVal > 8)
{
arr[index] |= val;
bitOffset += 8;
}
else if(bitsLeftInVal == 8)
{
arr[index++] |= val;
bitOffset = 0;
}
else
{
arr[index++] |= (val & 2047);
if(index != arr.Length) arr[index] = val >> 11;
bitOffset = 8 - bitsLeftInVal;
}
}
return arr;
}
}
If you need a different layout you'll need to tweak it.
This encodes 512 MiB in just over a second on my machine.
Overview to the Encode method:
The first thing is does is pre-calculate the amount of space that is going to be required, and allocate the output buffer; since each input contributes 11 bits to the output, this is just some modulo math:
long bits = source.Length * 11;
int len = (int)(bits / 8);
if ((bits % 8) != 0) len++;
byte[] arr = new byte[len];
We know the output position won't match the input, and we know we're going to be starting each 11-bit chunk at different positions in bytes each time, so allocate variables for those, and loop over the input:
int bitOffset = 0, index = 0;
for (int i = 0; i < source.Length; i++)
{
...
}
return arr;
So: taking each input in turn (where the input is the value at position i), take the low 11 bits of the value - and observe that we have 11 bits (of this value) still to write:
int val = source[i] & 2047;
int bitsLeft = 11;
Now, if the current output value is partially written (i.e. bitOffset != 0), we should deal with that first. The amount of space left in the current output is 8 - bitOffset. Since we always have 11 input bits we don't need to worry about having more space than values to fill, so: left-shift our value by bitOffset (pads on the right with bitOffset zeros, as a binary operation), and "or" the lowest 8 bits of this with the output byte. Essentially this says "if bitOffset is 3, write the 5 low bits of val into the 5 high bits of the output buffer"; finally, fixup the values: increment our write position, record that we have fewer bits of the current value still to write, and use right-shift to discard the 8 low bits of val (which is made of bitOffset zeros and 8 - bitOffset "real" bits):
if(bitOffset != 0)
{
val = val << bitOffset;
arr[index++] |= (byte)val;
bitsLeft -= (8 - bitOffset);
val >>= 8;
}
The next question is: do we have (at least) an entire byte of data left? We might not, if bitOffset was 1 for example (so we'll have written 7 bits already, leaving just 4). If we do, we can just stamp that down and increment the write position - then once again track how many are left and throw away the low 8 bits:
if(bitsLeft >= 8)
{
arr[index++] = (byte)val;
bitsLeft -= 8;
val >>= 8;
}
And it is possible that we've still got some left-over; for example, if bitOffset was 7 we'll have written 1 bit in the first chunk, 8 bits in the second, leaving 2 more to write - or if bitOffset was 0 we won't have written anything in the first chunk, 8 in the second, leaving 3 left to write. So, stamp down whatever is left, but do not increment the write position - we've written to the low bits, but the next value might need to write to the high bits. Finally, update bitOffset to be however many low bits we wrote in the last step (which could be zero):
if(bitsLeft != 0)
{
arr[index] = (byte)val;
}
bitOffset = bitsLeft;
The Decode operation is the reverse of this logic - again, calculate the sizes and prepare the state:
int bits = source.Length * 8;
int len = (int)(bits / 11);
int[] arr = new int[len];
int bitOffset = 0, index = 0;
Now loop over the input:
for(int i = 0; i < source.Length; i++)
{
...
}
return arr;
Now, bitOffset is the start position that we want to write to in the current 11-bit value, so if we start at the start, it will be 0 on the first byte, then 8; 3 bits of the second byte join with the first 11-bit integer, so the 5 bits become part of the second - so bitOffset is 5 on the 3rd byte, etc. We can calculate the number of bits left in the current integer by subtracting from 11:
int val = source[i] << bitOffset;
int bitsLeftInVal = 11 - bitOffset;
Now we have 3 possible scenarios:
1) if we have more than 8 bits left in the current value, we can stamp down our input (as a bitwise "or") but do not increment the write position (as we have more to write for this value), and note that we're 8-bits further along:
if(bitsLeftInVal > 8)
{
arr[index] |= val;
bitOffset += 8;
}
2) if we have exactly 8 bits left in the current value, we can stamp down our input (as a bitwise "or") and increment the write position; the next loop can start at zero:
else if(bitsLeftInVal == 8)
{
arr[index++] |= val;
bitOffset = 0;
}
3) otherwise, we have less than 8 bits left in the current value; so we need to write the first bitsLeftInVal bits to the current output position (incrementing the output position), and whatever is left to the next output position. Since we already left-shifted by bitOffset, what this really means is simply: stamp down (as a bitwise "or") the low 11 bits (val & 2047) to the current position, and whatever is left (val >> 11) to the next if that wouldn't exceed our output buffer (padding zeros). Then calculate our new bitOffset:
else
{
arr[index++] |= (val & 2047);
if(index != arr.Length) arr[index] = val >> 11;
bitOffset = 8 - bitsLeftInVal;
}
And that's basically it. Lots of bitwise operations - shifts (<< / >>), masks (&) and combinations (|).
If you wanted to store the least significant 11 bits of an int into two bytes such that the least significant byte has bits 1-8 inclusive and the most significant byte has 9-11:
int toStore = 123456789;
byte msb = (byte) ((toStore >> 8) & 7); //or 0b111
byte lsb = (byte) (toStore & 255); //or 0b11111111
To check this, 123456789 in binary is:
0b111010110111100110100010101
MMMLLLLLLLL
The bits above L are lsb, and have a value of 21, above M are msb and have a value of 5
Doing the work is the shift operator >> where all the binary digits are slid to the right 8 places (8 of them disappear from the right hand side - they're gone, into oblivion):
0b111010110111100110100010101 >> 8 =
0b1110101101111001101
And the mask operator & (the mask operator works by only keeping bits where, in each position, they're 1 in the value and also 1 in the mask) :
0b111010110111100110100010101 &
0b000000000000000000011111111 (255) =
0b000000000000000000000010101
If you're processing an int array, just do this in a loop:
byte[] bs = new byte[ intarray.Length*2 ];
for(int x = 0, b=0; x < intarray.Length; x++){
int toStore = intarray[x];
bs[b++] = (byte) ((toStore >> 8) & 7);
bs[b++] = (byte) (toStore & 255);
}

Converting an Int to a BCD byte array [duplicate]

I want to convert an int to a byte[2] array using BCD.
The int in question will come from DateTime representing the Year and must be converted to two bytes.
Is there any pre-made function that does this or can you give me a simple way of doing this?
example:
int year = 2010
would output:
byte[2]{0x20, 0x10};
static byte[] Year2Bcd(int year) {
if (year < 0 || year > 9999) throw new ArgumentException();
int bcd = 0;
for (int digit = 0; digit < 4; ++digit) {
int nibble = year % 10;
bcd |= nibble << (digit * 4);
year /= 10;
}
return new byte[] { (byte)((bcd >> 8) & 0xff), (byte)(bcd & 0xff) };
}
Beware that you asked for a big-endian result, that's a bit unusual.
Use this method.
public static byte[] ToBcd(int value){
if(value<0 || value>99999999)
throw new ArgumentOutOfRangeException("value");
byte[] ret=new byte[4];
for(int i=0;i<4;i++){
ret[i]=(byte)(value%10);
value/=10;
ret[i]|=(byte)((value%10)<<4);
value/=10;
}
return ret;
}
This is essentially how it works.
If the value is less than 0 or greater than 99999999, the value won't fit in four bytes. More formally, if the value is less than 0 or is 10^(n*2) or greater, where n is the number of bytes, the value won't fit in n bytes.
For each byte:
Set that byte to the remainder of the value-divided-by-10 to the byte. (This will place the last digit in the low nibble [half-byte] of the current byte.)
Divide the value by 10.
Add 16 times the remainder of the value-divided-by-10 to the byte. (This will place the now-last digit in the high nibble of the current byte.)
Divide the value by 10.
(One optimization is to set every byte to 0 beforehand -- which is implicitly done by .NET when it allocates a new array -- and to stop iterating when the value reaches 0. This latter optimization is not done in the code above, for simplicity. Also, if available, some compilers or assemblers offer a divide/remainder routine that allows retrieving the quotient and remainder in one division step, an optimization which is not usually necessary though.)
Here's a terrible brute-force version. I'm sure there's a better way than this, but it ought to work anyway.
int digitOne = year / 1000;
int digitTwo = (year - digitOne * 1000) / 100;
int digitThree = (year - digitOne * 1000 - digitTwo * 100) / 10;
int digitFour = year - digitOne * 1000 - digitTwo * 100 - digitThree * 10;
byte[] bcdYear = new byte[] { digitOne << 4 | digitTwo, digitThree << 4 | digitFour };
The sad part about it is that fast binary to BCD conversions are built into the x86 microprocessor architecture, if you could get at them!
Here is a slightly cleaner version then Jeffrey's
static byte[] IntToBCD(int input)
{
if (input > 9999 || input < 0)
throw new ArgumentOutOfRangeException("input");
int thousands = input / 1000;
int hundreds = (input -= thousands * 1000) / 100;
int tens = (input -= hundreds * 100) / 10;
int ones = (input -= tens * 10);
byte[] bcd = new byte[] {
(byte)(thousands << 4 | hundreds),
(byte)(tens << 4 | ones)
};
return bcd;
}
maybe a simple parse function containing this loop
i=0;
while (id>0)
{
twodigits=id%100; //need 2 digits per byte
arr[i]=twodigits%10 + twodigits/10*16; //first digit on first 4 bits second digit shifted with 4 bits
id/=100;
i++;
}
More common solution
private IEnumerable<Byte> GetBytes(Decimal value)
{
Byte currentByte = 0;
Boolean odd = true;
while (value > 0)
{
if (odd)
currentByte = 0;
Decimal rest = value % 10;
value = (value-rest)/10;
currentByte |= (Byte)(odd ? (Byte)rest : (Byte)((Byte)rest << 4));
if(!odd)
yield return currentByte;
odd = !odd;
}
if(!odd)
yield return currentByte;
}
Same version as Peter O. but in VB.NET
Public Shared Function ToBcd(ByVal pValue As Integer) As Byte()
If pValue < 0 OrElse pValue > 99999999 Then Throw New ArgumentOutOfRangeException("value")
Dim ret As Byte() = New Byte(3) {} 'All bytes are init with 0's
For i As Integer = 0 To 3
ret(i) = CByte(pValue Mod 10)
pValue = Math.Floor(pValue / 10.0)
ret(i) = ret(i) Or CByte((pValue Mod 10) << 4)
pValue = Math.Floor(pValue / 10.0)
If pValue = 0 Then Exit For
Next
Return ret
End Function
The trick here is to be aware that simply using pValue /= 10 will round the value so if for instance the argument is "16", the first part of the byte will be correct, but the result of the division will be 2 (as 1.6 will be rounded up). Therefore I use the Math.Floor method.
I made a generic routine posted at IntToByteArray that you could use like:
var yearInBytes = ConvertBigIntToBcd(2010, 2);
static byte[] IntToBCD(int input) {
byte[] bcd = new byte[] {
(byte)(input>> 8),
(byte)(input& 0x00FF)
};
return bcd;
}

Please help figure out what this C# method is doing?

I am a little confused as to what is being accomplished by this method. It seems to be attempting to break bytes into nibbles and reassemble the nibbles with nibbles from other bytes to form new bytes and then return a new sequence of bytes.
However, I didn't think, you could take nibbles from a byte using modulus and subtraction and division, nor reassemble them with simple multiplication and addition.
I want to better understand what how this method works, and what it is doing, so I can get some comments around it and then see if it can be converted to make more sense using more more standard methods of nibbling bytes and even take advantage of .Net 4.0 if possible.
private static byte[] Process(byte[] bytes)
{
Queue<byte> newBytes = new Queue<byte>();
int phase = 0;
byte nibble1 = 0;
byte nibble2 = 0;
byte nibble3 = 0;
int length = bytes.Length-1;
for (int i = 0; i < length; i++)
{
switch (phase)
{
case 0:
nibble1 = (byte)((bytes[i] - (bytes[i] % 4)) / 4);
nibble2 = (byte)(byte[i] % 4);
nibble3 = 0;
break;
case 1:
nibble2 = (byte)((nibble2 * 4) + (bytes[i] - (bytes[i] % 16))/16);
nibble3 = (byte)(bytes[i] % 16);
if (i < 4)
{
newBytes.Clear();
newBytes.Enqueue((byte)((16 * nibble1) + nibble2));
}
else
newBytes.Enqueue((byte)((16 * nibble1) + nibble2));
break;
case 2:
nibble1 = nibble3;
nibble2 = (byte)((bytes[i] - (bytes[i] % 4)) / 4);
nibble3 = (byte)(bytes[i] % 4);
newBytes.Enqueue((byte)((16 * nibble1) + nibble2));
break;
case 3:
nibble1 = (byte)((nibble3 * 4) + (bytes[i] - (bytes[i] % 16))/16);
nibble2 = (byte)(bytes[i] % 16);
newBytes.Enqueue((byte)((16 * nibble1) + nibble2));
break;
}
phase = (phase + 1) % 4;
}
return newBytes.ToArray();
}
Multiplication by 2 is the same as shifting bits one place to the left. (So multiply by 4 is shifting 2 places, and so on).
Division by 2 is the same as shifting bits one place to the right.
The modulus operator is being used to mask parts of the values. Modulus N where N = 2^p, will give you the value contained in (p-1) bits of the original value. So
value % 4
Would be the same as
value & 7 // 7 the largest value you can make with 3 bits (4-1). 4 + 2 +1.
Addition and subtraction can be used to combine the values. For instance if you know n and z to be 4-bit values, then both the following statements would combine them into one byte, with n placed in the upper 4 bits:
value = (n * 16) + z;
Versus
value = (n << 4) | z;
I am not entirely sure, but the code appears to be rearranging the nibbles in each byte and flipping them (so 0xF0 becomes 0x0F). It may be trying to compress or encrypt the bytes - difficult to tell without representative input.
In regards to the different things happening in the function:
Dividing by 4 is the same a rightshifting twice (>> 2)
Dividing by 16 is the same a rightshifting four times (>> 4)
Multiplying by 4 is the same a leftshifting twice (<< 2)
Multiplyingby 16 is the same a leftshifting four times (<< 4)
These parts reconstruct a byte from nibbles, the first nibble is placed in the higher order part, the second in the lower order:
(byte)((16 * nibble1) + nibble2)
So if nibble1 is 0x0F and nibble2 is 0x0C, the operation results in a leftshift of the nibble1 by 4, resulting in 0xF0 then nibble2 is added, resulting in 0xFF.

Categories