My current hobby project provides Monte Carlo simulations for card games with French decks (52 cards, 2 through Ace).
To simulate as fast as possible, I represent multiple cards as bitmasks in some places. Here is some (simplified) code:
public struct Card
{
public enum CardColor : byte { Diamonds = 0, Hearts = 1, Spades = 2, Clubs = 3 }
public enum CardValue : byte { Two = 0, Three = 1, Four = 2, Five = 3, Six = 4, Seven = 5, Eight = 6, Nine = 7, Ten = 8, Jack = 9, Queen = 10, King = 11, Ace = 12 }
public CardColor Color { get; private set; }
public CardValue Value { get; private set; }
// ID provides a unique value for each card, ranging from 0 to 51 (2 of Diamonds to Ace of Clubs)
public byte ID { get { return (byte)(((byte)this.Value * 4) + (byte)this.Color); } }
// --- Constructors ---
public Card(CardColor color, CardValue value)
{
this.Color = color;
this.Value = value;
}
public Card(byte id)
{
this.Color = (CardColor)(id % 4);
this.Value = (CardValue)(id / 4); // integer division; same as (id - (byte)Color) / 4, but avoids reading a property before the struct is fully assigned
}
}
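A quick sanity check of the ID mapping (my own example, not part of the original code):
var aceOfClubs = new Card(Card.CardColor.Clubs, Card.CardValue.Ace);
Console.WriteLine(aceOfClubs.ID);   // 51 = 12 * 4 + 3
var roundTrip = new Card(51);
Console.WriteLine(roundTrip.Value); // Ace
Console.WriteLine(roundTrip.Color); // Clubs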
The structure which holds multiple cards as a bitmask:
public struct CardPool
{
private const ulong FULL_POOL = 0xFFFFFFFFFFFFF; // 2^52 - 1 = 4503599627370495, one set bit per card
internal ulong Pool { get; private set; } // Holds all cards as set bit at Card.ID position
public int Count()
{
ulong i = this.Pool;
i = i - ((i >> 1) & 0x5555555555555555);
i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333);
return (int)((((i + (i >> 4)) & 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >> 56);
}
public CardPool Clone()
{
return new CardPool() { Pool = this.Pool };
}
public void Add(Card card)
{
Add(card.ID);
}
public void Add(byte cardID)
{
this.Pool = this.Pool | ((ulong)1 << cardID);
}
public void Remove(Card card)
{
Remove(card.ID);
}
public void Remove(byte cardID)
{
this.Pool = this.Pool & ~((ulong)1 << cardID);
}
public bool Contains(Card card)
{
ulong mask = ((ulong)1 << card.ID);
return (this.Pool & mask) == mask;
}
// --- Constructor ---
public CardPool(bool filled)
{
if (filled)
this.Pool = FULL_POOL;
else
this.Pool = 0;
}
}
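And a short usage sketch of CardPool (again my own illustration):
var deck = new CardPool(true);                 // full deck: 52 set bits
deck.Remove(new Card(0));                      // remove the 2 of Diamonds
Console.WriteLine(deck.Count());               // 51
Console.WriteLine(deck.Contains(new Card(0))); // False
CardPool copy = deck.Clone();                  // cheap: copies a single ulong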
I want to draw one or more cards at random from the second struct, CardPool, but I cannot imagine how to do that without iterating over single bits in the pool.
Is there any known algorithm to perform this? If not, do you have any idea how to do it fast?
Update:
The structure is not intended to have all cards drawn from it. It gets cloned frequently, and cloning an array is not an option. I am really thinking of bit operations for drawing one or multiple cards.
Update2:
I wrote a class which holds the cards as List for comparison.
public class CardPoolClass
{
private List<Card> Cards;
public void Add(Card card)
{
this.Cards.Add(card);
}
public CardPoolClass Clone()
{
return new CardPoolClass(this.Cards);
}
public CardPoolClass()
{
this.Cards = new List<Card> { };
}
public CardPoolClass(List<Card> cards)
{
this.Cards = cards.ToList();
}
}
Comparing 1,000,000 clone operations on full decks:
- struct: 17 ms
- class: 73 ms
Admittedly, the difference is not as big as I thought.
But taking into account that I would additionally give up the easy lookup of precalculated values, it makes a big difference.
Of course, it would be faster to draw a random card with this class, but I would then have to calculate an index for the lookup, which just moves the problem to another spot.
I repeat my initial question: is there a known algorithm for choosing a random set bit from an integer value, or does someone have an idea for getting this done faster than iterating over all bits?
The discussion about using a class with a List or an array leads nowhere; that is not my question, and I can work out on my own whether I would be better off using a class.
Update3, the lookup code:
CODE DELETED: it might be misleading, because it does not refer to the passages whose performance suffers, which is what this thread is about.
Since the same card cannot be drawn twice, you can place every card (in your case, the indices of Pool's set bits) in an array, shuffle it, and pop the cards one by one from either end of this array.
Here's pseudo-code (because I don't know C#).
declare cards as an array of indices
for each bit in Pool
push its index into cards
shuffle cards
when a card needs to be drawn
pop an index from cards
look up the card with Card(byte id)
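A minimal C# sketch of that pseudo-code against the CardPool struct above (the DrawAll name and the Random parameter are my additions; requires System and System.Collections.Generic):
// Collect the IDs of all cards in the pool, Fisher-Yates shuffle them,
// then yield them one by one - each draw is O(1) after the shuffle.
static IEnumerable<Card> DrawAll(CardPool pool, Random rng)
{
    var ids = new List<byte>();
    for (byte id = 0; id < 52; id++)
        if (pool.Contains(new Card(id)))
            ids.Add(id);
    for (int i = ids.Count - 1; i > 0; i--) // Fisher-Yates shuffle
    {
        int j = rng.Next(i + 1);
        byte tmp = ids[i]; ids[i] = ids[j]; ids[j] = tmp;
    }
    foreach (byte id in ids)
        yield return new Card(id); // look up the card with Card(byte id)
}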
Edit
Here's an algorithm to get a random set bit of a 64-bit integer in one pass, using code from Bit Twiddling Hacks that finds the position of the bit with a given rank (number of more significant set bits).
ulong v = this.Pool;
// ulong a = (v & ~0UL/3) + ((v >> 1) & ~0UL/3);
ulong a = v - ((v >> 1) & ~0UL/3);
// ulong b = (a & ~0UL/5) + ((a >> 2) & ~0UL/5);
ulong b = (a & ~0UL/5) + ((a >> 2) & ~0UL/5);
// ulong c = (b & ~0UL/0x11) + ((b >> 4) & ~0UL/0x11);
ulong c = (b + (b >> 4)) & ~0UL/0x11;
// ulong d = (c & ~0UL/0x101) + ((c >> 8) & ~0UL/0x101);
ulong d = (c + (c >> 8)) & ~0UL/0x101;
ulong t = (d >> 32) + (d >> 48);
int bitCount = (int)((c * (~0UL / 0xff)) >> 56);
ulong r = (ulong)Randomizer.Next(1, bitCount + 1);
ulong s = 64;
// if (r > t) {s -= 32; r -= t;}
s -= ((t - r) & 256) >> 3; r -= (t & ((t - r) >> 8));
t = (d >> (int)(s - 16)) & 0xff; // C# shift counts must be int, hence the casts
// if (r > t) {s -= 16; r -= t;}
s -= ((t - r) & 256) >> 4; r -= (t & ((t - r) >> 8));
t = (c >> (int)(s - 8)) & 0xf;
// if (r > t) {s -= 8; r -= t;}
s -= ((t - r) & 256) >> 5; r -= (t & ((t - r) >> 8));
t = (b >> (int)(s - 4)) & 0x7;
// if (r > t) {s -= 4; r -= t;}
s -= ((t - r) & 256) >> 6; r -= (t & ((t - r) >> 8));
t = (a >> (int)(s - 2)) & 0x3;
// if (r > t) {s -= 2; r -= t;}
s -= ((t - r) & 256) >> 7; r -= (t & ((t - r) >> 8));
t = (v >> (int)(s - 1)) & 0x1;
// if (r > t) s--;
s -= ((t - r) & 256) >> 8;
s--; // s is now the position of a random set bit in v
The commented lines give an alternative version, with branches. If you want to compare the two versions, uncomment those lines and comment out the lines following them.
In the original code, the last line is s = 65 - s, but since you use 1 << cardID for manipulations on card pools, and r is random anyway, s-- gives the correct value.
The only thing to watch out for is a zero value for v. But executing this code on an empty pool would be meaningless anyway.
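For comparison, here is a much simpler baseline (my own sketch; the RandomSetBit name is an assumption): pick a random rank, clear that many low set bits, and isolate the next one. It iterates, but only up to the chosen rank, and like the code above it assumes v != 0.
static int RandomSetBit(ulong v, Random rng)
{
    // SWAR popcount, same as CardPool.Count().
    ulong i = v;
    i = i - ((i >> 1) & 0x5555555555555555);
    i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333);
    int count = (int)((((i + (i >> 4)) & 0x0F0F0F0F0F0F0F0F) * 0x0101010101010101) >> 56);
    int rank = rng.Next(count);    // 0 .. count-1
    for (int k = 0; k < rank; k++)
        v &= v - 1;                // clear the lowest set bit
    ulong isolated = v & (~v + 1); // isolate the lowest remaining set bit
    int pos = 0;
    while ((isolated >>= 1) != 0)  // index of that bit
        pos++;
    return pos;
}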
Related
I need to convert a C# app which makes extensive use of byte manipulation.
An example:
public abstract class BinRecord
{
public static int version => 1;
public virtual int LENGTH => 1 + 7 + 8 + 2 + 1; // 19
public char type;
public ulong timestamp; // 7 byte
public double p;
public ushort size;
public char callbackType;
public virtual void FillBytes(byte[] bytes)
{
bytes[0] = (byte)type;
var t = BitConverter.GetBytes(timestamp);
Buffer.BlockCopy(t, 0, bytes, 1, 7);
Buffer.BlockCopy(BitConverter.GetBytes(p), 0, bytes, 8, 8);
Buffer.BlockCopy(BitConverter.GetBytes(size), 0, bytes, 16, 2);
bytes[18] = (byte)callbackType;
}
}
Basically, BitConverter and Buffer.BlockCopy get called hundreds of times per second.
There are several classes that inherit from the base class above doing more specific tasks. For example:
public class SpecRecord : BinRecord
{
public override int LENGTH => base.LENGTH + 2;
public ushort num;
public SpecRecord() { }
public SpecRecord(ushort num)
{
this.num = num;
}
public override void FillBytes(byte[] bytes)
{
var idx = base.LENGTH;
base.FillBytes(bytes);
Buffer.BlockCopy(BitConverter.GetBytes(num), 0, bytes, idx + 0, 2);
}
}
What approach in C++ should I look into?
Best option, in my opinion, is to actually go to C - use memcpy to copy over the bytes of any object.
Your above code would then be re-written as follows:
void FillBytes(uint8_t* bytes)
{
bytes[0] = (uint8_t)type;
memcpy((bytes + 1), &timestamp, sizeof(uint64_t) - 1); // low 7 bytes of the timestamp (little-endian)
memcpy((bytes + 8), &p, sizeof(double));
memcpy((bytes + 16), &size, sizeof(uint16_t));
bytes[18] = (uint8_t)callbackType;
}
Here, I use uint8_t, uint16_t, and uint64_t as replacements for the byte, ushort, and ulong types.
Keep in mind, your timestamp copy is not portable to a big-endian CPU - it will cut off the lowest byte rather than the highest. Solving that would require copying in each byte manually, like so:
//Copy a 7 byte timestamp into the buffer.
bytes[1] = (t >> 0) & 0xFF;
bytes[2] = (t >> 8) & 0xFF;
bytes[3] = (t >> 16) & 0xFF;
bytes[4] = (t >> 24) & 0xFF;
bytes[5] = (t >> 32) & 0xFF;
bytes[6] = (t >> 40) & 0xFF;
bytes[7] = (t >> 48) & 0xFF;
I've written an IEEE 754 "quarter" 8-bit minifloat in a 1.3.4.−3 format (1 sign bit, 3 exponent bits, 4 mantissa bits, exponent bias −3) in C#.
It was mostly a fun little side-project, testing whether or not I understand floats.
Actually, though, I find myself using it more than I'd like to admit :) (bandwidth > clock ticks)
Here's my code for converting the minifloat to a 32-bit float:
public static implicit operator float(quarter q)
{
int sign = (q.value & 0b1000_0000) << 24;
int fusedExponentMantissa = (q.value & 0b0111_1111) << (23 - MANTISSA_BITS);
if ((q.value & 0b0111_0000) == 0b0111_0000) // NaN/Infinity
{
return asfloat(sign | (255 << 23) | fusedExponentMantissa);
}
else // normal and subnormal
{
float magic = asfloat((255 - 1 + EXPONENT_BIAS) << 23);
return magic * asfloat(sign | fusedExponentMantissa);
}
}
where quarter.value is the stored byte and "asfloat" is simply *(float*)&myUInt. The "magic" number makes use of mantissa overflow in the subnormal case, which affects the f_32 exponent (integer multiplication and mask + add is slower than the FPU switch and a float multiplication). I guess one could optimize away the branch, too.
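To make the magic constant concrete, here is a worked decode for one value in the 1.3.4 format (my own example; it assumes MANTISSA_BITS = 4 and EXPONENT_BIAS = -3, matching the code above):
// q.value = 0b0_001_0000: sign 0, exponent field e = 1, mantissa m = 0
// fusedExponentMantissa = 0b001_0000 << 19 = 1 << 23
// asfloat(sign | fused) = 2^(1 - 127) = 2^-126
// magic = asfloat((255 - 1 - 3) << 23) = 2^(251 - 127) = 2^124
// result: 2^-126 * 2^124 = 2^-2 = 0.25f, exactly the quarter value 2^(e - 3) * 1.0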
But here comes the problematic code - float_32 to float_8:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) & 0x8000_0000u) >> 24);
uint f32_exponent = asuint(f) & 0x7F80_0000u;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < (120 << 23)) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > (130 << 23)) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
switch (f32_exponent)
{
case 120 << 23: // 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
{
return new quarter { value = (byte)(f8_sign | touint8(f32_mantissa != 0)) };
}
case 121 << 23: // 2^(-6) * (1 + mantissa): return +/- quarter.epsilon = 2^(-2) * (0 + 2^(-4)); if the mantissa is > 0.5 i.e. 2^(-6) * max(mantissa, 1.75), return 2^(-2) * 2^(-3)
{
return new quarter { value = (byte)(f8_sign | (Epsilon.value + touint8(f32_mantissa > 0x0040_0000))) };
}
case 122 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0010u | (f32_mantissa >> 22)) };
}
case 123 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0100u | (f32_mantissa >> 21)) };
}
case 124 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_1000u | (f32_mantissa >> 20)) };
}
default:
{
const uint exponentDelta = (127 + EXPONENT_BIAS) << 23;
return new quarter { value = (byte)(f8_sign | (((f32_exponent - exponentDelta) | f32_mantissa) >> 19)) };
}
}
}
}
... where the function
"asuint" is simply *(uint*)&myFloat and
"touint8" is simply *(byte*)&myBoolean i.e. myBoolean ? 1 : 0.
The first five cases deal with numbers that can only be represented as subnormals in a "quarter".
I want to get rid of the switch at the very least. There's obviously a pattern (the same as with float8_to_float32), but I haven't been able to figure out for days how to unify the entire switch... I tried to google how hardware converts doubles to floats, but that yielded no results either.
My requirements are to hold on to the IEEE 754 standard, meaning:
NaN and infinity preservation, clamping to infinity/zero in case of over-/underflow, as well as rounding to epsilon when the larger type's value is closer to epsilon than to 0 (the first switch case, as well as the underflow limit in the first if statement).
Can anyone at least push me in the right direction please?
This may not be optimal, but it uses strictly conforming C code except as noted in the first comment, so no pointer aliasing or other manipulation of the bits of a floating-point object. A thorough test program is included.
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
/* Notes on portability:
uint8_t is an optional type. Its use here is easily replaced by
unsigned char.
Round-to-nearest is required in FloatToMini.
Floating-point must be base two, and the constant in the
Dekker-Veltkamp split is hardcoded for IEEE-754 binary64 but could be
adopted to other formats. (Change the exponent in 0x1p48 to the number
of bits in the significand minus five.)
*/
/* Convert a double to a 1-3-4 floating-point format. Round-to-nearest is
required.
*/
static uint8_t FloatToMini(double x)
{
// Extract the sign bit of x, moved into its position in a mini-float.
uint8_t s = !!signbit(x) << 7;
x = fabs(x);
/* If x is a NaN, return a quiet NaN with the copied sign. Significand
bits are not preserved.
*/
if (x != x)
return s | 0x78;
/* If |x| is greater than or equal to the rounding point between the
maximum finite value and infinity, return infinity with the copied sign.
(0x1.fp0 is the largest representable significand, 0x1.f8 is that plus
half an ULP, and the largest exponent is 3, so 0x1.f8p3 is that
rounding point.)
*/
if (0x1.f8p3 <= x)
return s | 0x70;
// If x is subnormal, encode with zero exponent.
if (x < 0x1p-2 - 0x1p-7)
return s | (uint8_t) nearbyint(x * 0x1p6);
/* Round to five significand bits using the Dekker-Veltkamp Split. (The
cast eliminates the excess precision that the C standard allows.)
*/
double d = x * (0x1p48 + 1);
x = d - (double) (d-x);
/* Separate the significand and exponent. C's frexp scales the exponent
so the significand is in [.5, 1), hence the e-1 below.
*/
int e;
x = frexp(x, &e) - .5;
return s | (e-1+3) << 4 | (uint8_t) (x*0x1p5);
}
static void Show(double x)
{
printf("%g -> 0x%02" PRIx8 ".\n", x, FloatToMini(x));
}
static void Test(double x, uint8_t expected)
{
uint8_t observed = FloatToMini(x);
if (expected != observed)
{
printf("Error, %.9g (%a) produced 0x%02" PRIx8
" but expected 0x%02" PRIx8 ".\n",
x, x, observed, expected);
exit(EXIT_FAILURE);
}
}
int main(void)
{
// Set the value of an ULP in [1, 2).
static const double ULP = 0x1p-4;
// Test all even significands with normal exponents.
for (double s = 1; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP / (s == 1 ? 4 : 2); t <= +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Test all odd significands with normal exponents.
for (double s = 1 + 1*ULP; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP/2+ULP/16; t < +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Set the value of an ULP in the subnormal range.
static const double subULP = ULP * 0x1p-2;
// Test all even significands with the subnormal exponent.
for (double s = 0; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = s == 0 ? 0 : -subULP/2; t <= +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test all odd significands with the subnormal exponent.
for (double s = 0 + 1*subULP; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -subULP/2 + subULP/16; t < +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test at and slightly under the point of rounding to infinity.
Test(+15.75, 0x70);
Test(-15.75, 0xf0);
Test(nexttoward(+15.75, 0), 0x6f);
Test(nexttoward(-15.75, 0), 0xef);
// Test infinities and NaNs.
Test(+INFINITY, 0x70);
Test(-INFINITY, 0xf0);
Test(+NAN, 0x78);
Test(-NAN, 0xf8);
Show(0);
Show(0x1p-6);
Show(0x1p-2);
Show(0x1.1p-2);
Show(0x1.2p-2);
Show(0x1.4p-2);
Show(0x1.8p-2);
Show(0x1p-1);
Show(15.5);
Show(15.75);
Show(16);
Show(NAN);
Show(1./6);
Show(1./3);
Show(2./3);
}
I hate to answer my own question... But this may still not be the optimal solution.
Although @Eric Postpischil's solution uses an established algorithm, it is not very well suited to minifloats, since there are so few denormals with 4 mantissa bits. Additionally, because of the overhead of multiple floating-point arithmetic operations, and in particular the actual code behind frexp, it has only one branch fewer (or two when inlined and optimized) than my original solution, and it is also not that great in regards to instruction-level parallelism.
So here's my current solution:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) >> 31) << 7);
uint f32_exponent = (asuint(f) >> 23) & 0x00FFu;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < 120) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > 130) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
int cmp = 125 - (int)f32_exponent;
int cmpIsZeroOrNegativeMask = (cmp - 1) >> 31;
int denormalExponent = andnot(0b0001_0000 >> cmp, cmpIsZeroOrNegativeMask); // special case 121: sets it to quarter.Epsilon
denormalExponent += touint8((f32_exponent == 121) & (f32_mantissa >= 0x0040_0000)); // case 121: 2^(-6) * (1 + mantissa): return +/- quarter.Epsilon = 2^(-2) * 2^(-4); if the mantissa is >= 0.5 return 2^(-2) * 2^(-3)
denormalExponent |= touint8((f32_exponent == 120) & (f32_mantissa != 0)); // case 120: 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
int normalExponent = (cmpIsZeroOrNegativeMask & ((int)f32_exponent - (127 + EXPONENT_BIAS))) << 4;
int mantissaShift = 19 + andnot(cmp, cmpIsZeroOrNegativeMask);
return new quarter { value = (byte)((f8_sign | normalExponent) | (denormalExponent | (f32_mantissa >> mantissaShift))) };
}
}
But note that the particular andnot(int a, int b) function I use returns a & ~b and...not ~a & b.
Thanks for your help :) I'm keeping this open since, as mentioned, this may very well not be the best solution - but at least it's my own...
PS: This is probably a good example of why premature optimization is bad; your code becomes much less readable. Make sure you have the functionality backed up by unit tests, and make sure you even need the optimization in the first place.
...And after some time, in the spirit of transparent progression, I want to show the final version, since I believe I have found the optimal implementation; more on that later.
First off, here it is (the code should speak for itself, which is why there is this "much" of it):
unsafe struct quarter
{
const bool IEEE_754_STANDARD = true; //standard: true
const bool SIGN_BIT = IEEE_754_STANDARD || true; //standard: true
const int BITS = 8 * sizeof(byte); //standard: 8
const int EXPONENT_BITS = 3 + (SIGN_BIT ? 0 : 1); //standard: 3
const int MANTISSA_BITS = BITS - EXPONENT_BITS - (SIGN_BIT ? 1 : 0); //standard: 4
const int EXPONENT_BIAS = -(((1 << BITS) - 1) >> (BITS - (EXPONENT_BITS - 1))); //standard: -3
const int MAX_EXPONENT = EXPONENT_BIAS + ((1 << EXPONENT_BITS) - 1) - (IEEE_754_STANDARD ? 1 : 0); //standard: 3
const int SIGNALING_EXPONENT = (MAX_EXPONENT - EXPONENT_BIAS + (IEEE_754_STANDARD ? 1 : 0)) << MANTISSA_BITS; //standard: 0b0111_0000
const int F32_BITS = 8 * sizeof(float);
const int F32_EXPONENT_BITS = 8;
const int F32_MANTISSA_BITS = 23;
const int F32_EXPONENT_BIAS = -(int)(((1L << F32_BITS) - 1) >> (F32_BITS - (F32_EXPONENT_BITS - 1)));
const int F32_MAX_EXPONENT = F32_EXPONENT_BIAS + ((1 << F32_EXPONENT_BITS) - 1 - 1);
const int F32_SIGNALING_EXPONENT = (F32_MAX_EXPONENT - F32_EXPONENT_BIAS + 1) << F32_MANTISSA_BITS;
const int F32_SHL_LOSE_SIGN = (F32_BITS - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_SHR_PLACE_MANTISSA = MANTISSA_BITS + ((1 + F32_EXPONENT_BITS) - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_MAGIC = (((1 << F32_EXPONENT_BITS) - 1) - (1 + EXPONENT_BITS)) << F32_MANTISSA_BITS;
byte _value;
static quarter Epsilon => new quarter { _value = 1 };
static quarter MaxValue => new quarter { _value = (byte)(SIGNALING_EXPONENT - 1) };
static quarter NaN => new quarter { _value = (byte)(SIGNALING_EXPONENT | 1) };
static quarter PositiveInfinity => new quarter { _value = (byte)SIGNALING_EXPONENT };
static uint asuint(float f) => *(uint*)&f;
static float asfloat(uint u) => *(float*)&u;
static byte tobyte(bool b) => *(byte*)&b;
static float ToFloat(quarter q, bool promiseInRange)
{
uint fusedExponentMantissa = ((uint)q._value << F32_SHL_LOSE_SIGN) >> F32_SHR_PLACE_MANTISSA;
uint sign = ((uint)q._value >> (BITS - 1)) << (F32_BITS - 1);
if (!promiseInRange)
{
bool nanInf = (q._value & SIGNALING_EXPONENT) == SIGNALING_EXPONENT;
uint ifNanInf = asuint(float.PositiveInfinity) & (uint)(-tobyte(nanInf));
return (nanInf ? 1f : asfloat(F32_MAGIC)) * asfloat(sign | fusedExponentMantissa | ifNanInf);
}
else
{
return asfloat(F32_MAGIC) * asfloat(sign | fusedExponentMantissa);
}
}
static quarter ToQuarter(float f, bool promiseInRange)
{
float inRange = f * (1f / asfloat(F32_MAGIC));
uint q = asuint(inRange) >> (F32_MANTISSA_BITS - (1 + EXPONENT_BITS));
uint f8_sign = asuint(f) >> (F32_BITS - 1);
if (!promiseInRange)
{
uint f32_exponent = asuint(f) & F32_SIGNALING_EXPONENT;
bool overflow = f32_exponent > (uint)(-F32_EXPONENT_BIAS + MAX_EXPONENT << F32_MANTISSA_BITS);
bool notNaNInf = f32_exponent != F32_SIGNALING_EXPONENT;
f8_sign ^= tobyte(!notNaNInf);
if (overflow & notNaNInf)
{
q = PositiveInfinity._value;
}
}
f8_sign <<= (BITS - 1);
return new quarter{ _value = (byte)(q ^ f8_sign) };
}
}
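A round-trip usage sketch (my own; it assumes ToQuarter/ToFloat are exposed - as written above they have no access modifier and are therefore private):
quarter q = quarter.ToQuarter(0.25f, promiseInRange: false);
float back = quarter.ToFloat(q, promiseInRange: false);
Console.WriteLine(back); // 0.25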
It turns out that the reverse of converting the minifloat to a 32-bit float by multiplying with a magic constant is, in fact, the reverse operation of a multiplication (wow...): a floating-point division by that constant.
Luckily it is "by that constant" and not the other way around; we can calculate the reciprocal at compile time and multiply by it instead. As with the reverse operation, this only fails when converting to and from 'INF' and 'NaN': absolute overflow with any biased 32-bit exponent where exponent % (MAX_EXPONENT + 1) != 0 is not translated into 'INF', and positive 'INF' is translated into negative 'INF'.
Although this enables some optimizations through the bool parameter, it mostly just reduces code size and, more importantly (especially for SIMD versions, where small data types really shine), reduces the need for constants. Speaking of SIMD: this scalar version can be optimized a little by using SSE/SSE2 intrinsics.
The (disabled) optimizations (would) run completely in parallel with the floating-point multiplication followed by a shift, taking a total of 5 to 6+ clock cycles (very CPU dependent), which is astonishingly close to native hardware instructions (~4 to 5 clock cycles).
I have code that turns an integer into its binary representation, but I was wondering if there is a simpler or "easier" way of doing so. I know that there is a built-in method in C# that does this for you automatically, but that is not what I want to use.
This version loops over each of the 32 bit positions while writing ones and zeros, and uses TrimStart to remove leading zeroes.
For example, converting the integer 10 to its string representation in
binary as "1010".
static string IntToBinary(int n)
{
char[] b = new char[32];
int pos = 31;
int i = 0;
while (i < 32) // Loops over each of the 32-bit positions while writing ones and zeros.
{
if ((n & (1 << i)) != 0)
{
b[pos] = '1';
}
else
{
b[pos] = '0';
}
pos--;
i++;
}
return new string(b).TrimStart('0'); // TrimStart removes leading zeroes.
}
static void Main()
{
Console.WriteLine(IntToBinary(300));
}
I suppose you could use a nibble lookup table:
static string[] nibbles = {
"0000", "0001", "0010", "0011",
"0100", "0101", "0110", "0111",
"1000", "1001", "1010", "1011",
"1100", "1101", "1110", "1111"
};
public static string IntToBinary(int n)
{
    return (
        nibbles[(n >> 28) & 0xF] +
        nibbles[(n >> 24) & 0xF] +
        nibbles[(n >> 20) & 0xF] +
        nibbles[(n >> 16) & 0xF] +
        nibbles[(n >> 12) & 0xF] +
        nibbles[(n >> 8) & 0xF] +
        nibbles[(n >> 4) & 0xF] +
        nibbles[(n >> 0) & 0xF]
    ).TrimStart('0'); // the parentheses matter: trim the whole string, not just the last nibble
}
Here is a simple LINQ implementation:
static string IntToBinary(int n)
{
return string.Concat(Enumerable.Range(0, 32)
.Select(i => (n & (1 << (31 - i))) != 0 ? '1' : '0')
.SkipWhile(ch => ch == '0'));
}
Another one using a for loop:
static string IntToBinary(int n)
{
var chars = new char[32];
int start = chars.Length;
for (uint bits = (uint)n; bits != 0; bits >>= 1)
chars[--start] = (char)('0' + (bits & 1));
return new string(chars, start, chars.Length - start);
}
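A quick sanity check that applies to all three versions (my own test lines; note that each version reduces 0 to an empty string rather than "0"):
Console.WriteLine(IntToBinary(10));  // 1010
Console.WriteLine(IntToBinary(300)); // 100101100
Console.WriteLine(IntToBinary(0));   // empty line; special-case 0 if you need a "0" output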
I'm trying to compare two strings (Tx & Rx data) and find the quantity of unequal chars.
With the help of the following code, I managed to get the quantity,
string TxData = "00001111";
string RxData = "00000000";
int distorted = 0;
for (int i = 0; i < TxData.Length; i++)
{
if (TxData[i] != RxData[i])
distorted++;
}
Console.Write("Distorted Bits (qty) : {0}", distorted);
Result:
Distorted Bits (qty) : 4
But I'm very curious to know if there's any better way to do this task?
Thanks for your time...:)
If they're always the same length:
int distorted = TxData.Zip(RxData, (a,b) => a == b ? 0 : 1).Sum();
I like okrumnow's answer for its simplicity, but assuming that you really already have bytes (or ints) and don't need to convert them to strings in the first place, you would probably be better off doing something like:
int myMethod(byte byte1, byte byte2)
{
//byte1 = Convert.ToByte("10010101",2);
//byte2 = Convert.ToByte("10011101",2);
byte xorvalue = (byte)( byte1 ^ byte2);
return NumberOfSetBits(xorvalue);
}
private static int NumberOfSetBits(uint i)
{
i = i - ((i >> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
return (int)((((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24); // cast needed: the expression is uint
}
This will be much faster.
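A quick check with the data from the question (assuming the two methods above are in scope):
byte tx = Convert.ToByte("00001111", 2); // Tx bits as a byte
byte rx = Convert.ToByte("00000000", 2); // Rx bits as a byte
Console.WriteLine(myMethod(tx, rx));     // 4 distorted bits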
I have the code below to convert a 32 bit BCD value (supplied in two uint halves) to a uint binary value.
The values supplied can be up to 0x9999, to form a maximum value of 0x99999999.
Is there a better (ie. quicker) way to achieve this?
/// <summary>
/// Convert two PLC words in BCD format (forming 8 digit number) into single binary integer.
/// e.g. If Lower = 0x5678 and Upper = 0x1234, then Return is 12345678 decimal, or 0xbc614e.
/// </summary>
/// <param name="lower">Least significant 16 bits.</param>
/// <param name="upper">Most significant 16 bits.</param>
/// <returns>32 bit unsigned integer.</returns>
/// <remarks>If the parameters supplied are invalid, returns zero.</remarks>
private static uint BCD2ToBin(uint lower, uint upper)
{
uint binVal = 0;
if ((lower | upper) != 0)
{
int shift = 0;
uint multiplier = 1;
uint bcdVal = (upper << 16) | lower;
for (int i = 0; i < 8; i++)
{
uint digit = (bcdVal >> shift) & 0xf;
if (digit > 9)
{
binVal = 0;
break;
}
else
{
binVal += digit * multiplier;
shift += 4;
multiplier *= 10;
}
}
}
return binVal;
}
If you've space to spare for a 39,322 element array, you could always just look the value up.
If you unroll the loop, remember to keep the bit shift.
value = ( lo & 0xF);
value += ((lo >> 4 ) & 0xF) * 10;
value += ((lo >> 8 ) & 0xF) * 100;
value += ((lo >> 12) & 0xF) * 1000;
value += ( hi & 0xF) * 10000;
value += ((hi >> 4 ) & 0xF) * 100000;
value += ((hi >> 8 ) & 0xF) * 1000000;
value += ((hi >> 12) & 0xF) * 10000000;
Your code seems rather complicated; do you require the specific error checking?
Otherwise, you could just use the following code, which shouldn't be slower; in fact, it's mostly the same:
uint result = 0;
uint multiplier = 1;
uint value = lo | hi << 0x10;
while (value > 0) {
uint digit = value & 0xF;
value >>= 4;
result += multiplier * digit;
multiplier *= 10;
}
return result;
I suppose you could unroll the loop:
value = ( lo & 0xF);
value+= ((lo>>4) & 0xF) *10;
value+= ((lo>>8) & 0xF) *100;
value+= ((lo>>12)& 0xF) *1000;
value+= ( hi & 0xF) *10000;
value+= ((hi>>4) & 0xF) *100000;
value+= ((hi>>8) & 0xF) *1000000;
value+= ((hi>>12)& 0xF) *10000000;
And you can check for invalid BCD digits like this:
invalid = lo & ((lo&0x8888)>>2)*3
This sets invalid to a non-zero value if any single hex digit > 9.
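A worked example of that check (my own, in C#): the invalid digit 0xA in lo = 0x5A78 is flagged, while the legal digit 8 is not.
uint lo = 0x5A78;                // digits 5, A, 7, 8 - the A is not valid BCD
uint bit3 = lo & 0x8888;         // 0x0808: digits with bit 3 set (A and 8)
uint mask = (bit3 >> 2) * 3;     // 0x0606: selects bits 1-2 of those digits
uint invalid = lo & mask;        // 0x0200: non-zero because A = 1010 has bit 1 set
Console.WriteLine(invalid != 0); // True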
Try this:
public static int bcd2int(int bcd) {
return int.Parse(bcd.ToString("X"));
}
public static uint BCDToNum(int num)
{
return uint.Parse(num.ToString(), System.Globalization.NumberStyles.HexNumber);
}
Of course, there is a more efficient method; this is just an example, so you can tune it as a lesson ^^
function bcd_to_bin ($bcd) {
$mask_sbb = 0x33333333;
$mask_msb = 0x88888888;
$mask_opp = 0xF;
for($i=28;$i;--$i) {
$mask_msb <<= 1;
$mask_opp <<= 1;
$mask_sbb <<= 1;
for($j=0;$j<$i;$j+=4) {
$mask_opp_j = $mask_opp << $j;
if ($bcd & $mask_msb & $mask_opp_j ) {
$bcd -= $mask_sbb & $mask_opp_j;
}
}
}
return $bcd;
}