I've written an IEEE 754 "quarter" 8-bit minifloat in a 1.3.4.−3 format in C#.
It was mostly a fun little side-project, testing whether or not I understand floats.
Actually, though, I find myself using it more than I'd like to admit :) (bandwidth > clock ticks)
Here's my code for converting the minifloat to a 32-bit float:
public static implicit operator float(quarter q)
{
int sign = (q.value & 0b1000_0000) << 24;
int fusedExponentMantissa = (q.value & 0b0111_1111) << (23 - MANTISSA_BITS);
if ((q.value & 0b0111_0000) == 0b0111_0000) // NaN/Infinity
{
return asfloat(sign | (255 << 23) | fusedExponentMantissa);
}
else // normal and subnormal
{
float magic = asfloat((255 - 1 + EXPONENT_BIAS) << 23);
return magic * asfloat(sign | fusedExponentMantissa);
}
}
where quarter.value is the stored byte and "asfloat" is simply *(float*)&myUInt.The "magic" number makes use of mantissa overflow in the subnormal case, which affects the f_32 exponent (integer multiplication and mask + add is slower than FPU-switch and float multiplication). I guess one could optimize away the branch, too.
But here comes the problematic code - float_32 to float_8:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) & 0x8000_0000u) >> 24);
uint f32_exponent = asuint(f) & 0x7F80_0000u;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < (120 << 23)) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > (130 << 23)) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
switch (f32_exponent)
{
case 120 << 23: // 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
{
return new quarter { value = (byte)(f8_sign | touint8(f32_mantissa != 0)) };
}
case 121 << 23: // 2^(-6) * (1 + mantissa): return +/- quarter.epsilon = 2^(-2) * (0 + 2^(-4)); if the mantissa is > 0.5 i.e. 2^(-6) * max(mantissa, 1.75), return 2^(-2) * 2^(-3)
{
return new quarter { value = (byte)(f8_sign | (Epsilon.value + touint8(f32_mantissa > 0x0040_0000))) };
}
case 122 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0010u | (f32_mantissa >> 22)) };
}
case 123 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_0100u | (f32_mantissa >> 21)) };
}
case 124 << 23:
{
return new quarter { value = (byte)(f8_sign | 0b0000_1000u | (f32_mantissa >> 20)) };
}
default:
{
const uint exponentDelta = (127 + EXPONENT_BIAS) << 23;
return new quarter { value = (byte)(f8_sign | (((f32_exponent - exponentDelta) | f32_mantissa) >> 19)) };
}
}
}
}
... where the function
"asuint" is simply *(uint*)&myFloat and
"touint8" is simply *(byte*)&myBoolean i.e. myBoolean ? 1 : 0.
The first five cases deal with numbers that can only be represented as subnormals in a "quarter".
I want to get rid of the switch at the very least. There's obviously a pattern (same as with float8_to_float32) but I haven't been able to figure out how I could unify the entire switch for days... I tried to google how hardware converts doubles to floats but that yielded no results either.
My requirements are to hold on to the IEEE-754 standard, meaning:
NaN, infinity preservation and clamping to infinity/zero in case of over-/underflow, aswell as rounding to epsilon when the larger type's value is closer to epsilon than 0 (first switch case aswell as the underflow limit in the first if statement).
Can anyone at least push me in the right direction please?
This may not be optimal, but it uses strictly conforming C code except as noted in the first comment, so no pointer aliasing or other manipulation of the bits of a floating-point object. A thorough test program is included.
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
/* Notes on portability:
uint8_t is an optional type. Its use here is easily replaced by
unsigned char.
Round-to-nearest is required in FloatToMini.
Floating-point must be base two, and the constant in the
Dekker-Veltkamp split is hardcoded for IEEE-754 binary64 but could be
adopted to other formats. (Change the exponent in 0x1p48 to the number
of bits in the significand minus five.)
*/
/* Convert a double to a 1-3-4 floating-point format. Round-to-nearest is
required.
*/
static uint8_t FloatToMini(double x)
{
// Extract the sign bit of x, moved into its position in a mini-float.
uint8_t s = !!signbit(x) << 7;
x = fabs(x);
/* If x is a NaN, return a quiet NaN with the copied sign. Significand
bits are not preserved.
*/
if (x != x)
return s | 0x78;
/* If |x| is greater than or equal to the rounding point between the
maximum finite value and infinity, return infinity with the copied sign.
(0x1.fp0 is the largest representable significand, 0x1.f8 is that plus
half an ULP, and the largest exponent is 3, so 0x1.f8p3 is that
rounding point.)
*/
if (0x1.f8p3 <= x)
return s | 0x70;
// If x is subnormal, encode with zero exponent.
if (x < 0x1p-2 - 0x1p-7)
return s | (uint8_t) nearbyint(x * 0x1p6);
/* Round to five significand bits using the Dekker-Veltkamp Split. (The
cast eliminates the excess precision that the C standard allows.)
*/
double d = x * (0x1p48 + 1);
x = d - (double) (d-x);
/* Separate the significand and exponent. C's frexp scales the exponent
so the significand is in [.5, 1), hence the e-1 below.
*/
int e;
x = frexp(x, &e) - .5;
return s | (e-1+3) << 4 | (uint8_t) (x*0x1p5);
}
static void Show(double x)
{
printf("%g -> 0x%02" PRIx8 ".\n", x, FloatToMini(x));
}
static void Test(double x, uint8_t expected)
{
uint8_t observed = FloatToMini(x);
if (expected != observed)
{
printf("Error, %.9g (%a) produced 0x%02" PRIx8
" but expected 0x%02" PRIx8 ".\n",
x, x, observed, expected);
exit(EXIT_FAILURE);
}
}
int main(void)
{
// Set the value of an ULP in [1, 2).
static const double ULP = 0x1p-4;
// Test all even significands with normal exponents.
for (double s = 1; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP / (s == 1 ? 4 : 2); t <= +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Test all odd significands with normal exponents.
for (double s = 1 + 1*ULP; s < 2; s += 2*ULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -ULP/2+ULP/16; t < +ULP/2; t += ULP/16)
// Test with all normal exponents.
for (int e = 1-3; e < 7-3; ++e)
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (e+3) << 4
| (uint8_t) ((s-1) * 0x1p4);
Test(sign * ldexp(s+t, e), expected);
}
// Set the value of an ULP in the subnormal range.
static const double subULP = ULP * 0x1p-2;
// Test all even significands with the subnormal exponent.
for (double s = 0; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = s == 0 ? 0 : -subULP/2; t <= +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test all odd significands with the subnormal exponent.
for (double s = 0 + 1*subULP; s < 0x1p-2; s += 2*subULP)
// Test with trailing bits less than or equal to 1/2 ULP in magnitude.
for (double t = -subULP/2 + subULP/16; t < +subULP/2; t += subULP/16)
{
// Test with both signs.
for (int sign = -1; sign <= +1; sign += 2)
{
// Prepare the expected encoding.
uint8_t expected =
(0 < sign ? 0 : 1) << 7
| (uint8_t) (s/subULP);
Test(sign * (s+t), expected);
}
}
// Test at and slightly under the point of rounding to infinity.
Test(+15.75, 0x70);
Test(-15.75, 0xf0);
Test(nexttoward(+15.75, 0), 0x6f);
Test(nexttoward(-15.75, 0), 0xef);
// Test infinities and NaNs.
Test(+INFINITY, 0x70);
Test(-INFINITY, 0xf0);
Test(+NAN, 0x78);
Test(-NAN, 0xf8);
Show(0);
Show(0x1p-6);
Show(0x1p-2);
Show(0x1.1p-2);
Show(0x1.2p-2);
Show(0x1.4p-2);
Show(0x1.8p-2);
Show(0x1p-1);
Show(15.5);
Show(15.75);
Show(16);
Show(NAN);
Show(1./6);
Show(1./3);
Show(2./3);
}
I hate to answer my own question... But this may still not be the optimal solution.
Although #Eric Postpischil's solution uses an established algorithm, it is not very well suited for minifloats, since there are so few denormals in 4 mantissa bits. Additionally, the overhead of multiple float arithmetic operations - and because of the actual code behind frexp in particular, it only has one branch less (or two when inlined and optimized) than my original solution and is also not that great in regards to instruction level parallelism.
So here's my current solution:
public static explicit operator quarter(float f)
{
byte f8_sign = (byte)((asuint(f) >> 31) << 7);
uint f32_exponent = (asuint(f) >> 23) & 0x00FFu;
uint f32_mantissa = asuint(f) & 0x007F_FFFFu;
if (f32_exponent < 120) // underflow => preserve +/- 0
{
return new quarter { value = f8_sign };
}
else if (f32_exponent > 130) // overflow => +/- infinity or preserve NaN
{
return new quarter { value = (byte)(f8_sign | PositiveInfinity.value | touint8(isnan(f))) };
}
else
{
int cmp = 125 - (int)f32_exponent;
int cmpIsZeroOrNegativeMask = (cmp - 1) >> 31;
int denormalExponent = andnot(0b0001_0000 >> cmp, cmpIsZeroOrNegativeMask); // special case 121: sets it to quarter.Epsilon
denormalExponent += touint8((f32_exponent == 121) & (f32_mantissa >= 0x0040_0000)); // case 121: 2^(-6) * (1 + mantissa): return +/- quarter.Epsilon = 2^(-2) * 2^(-4); if the mantissa is >= 0.5 return 2^(-2) * 2^(-3)
denormalExponent |= touint8((f32_exponent == 120) & (f32_mantissa != 0)); // case 120: 2^(-7) * 1.(mantissa > 0) means the value is closer to quarter.epsilon than 0
int normalExponent = (cmpIsZeroOrNegativeMask & ((int)f32_exponent - (127 + EXPONENT_BIAS))) << 4;
int mantissaShift = 19 + andnot(cmp, cmpIsZeroOrNegativeMask);
return new quarter { value = (byte)((f8_sign | normalExponent) | (denormalExponent | (f32_mantissa >> mantissaShift))) };
}
}
But note that the particular andnot(int a, int b) function I use returns a & ~b and...not ~a & b.
Thanks for your help :) I'm keeping this open since, as mentioned, this may very well not be the best solution - but at least it's my own...
PS: This is probably a good example for why PREMATURE optimization is bad; Your code is much less readable. Make sure you have the functionality backed up by unit tests and make sure you even need the optimization in the first place.
...And after some time and in the spirit of transparent progression, I want to show the final version, since I believe to have found the optimal implementation; more later.
First off, here it is (the code should speak for itself, which is why it is this "much"):
unsafe struct quarter
{
const bool IEEE_754_STANDARD = true; //standard: true
const bool SIGN_BIT = IEEE_754_STANDARD || true; //standard: true
const int BITS = 8 * sizeof(byte); //standard: 8
const int EXPONENT_BITS = 3 + (SIGN_BIT ? 0 : 1); //standard: 3
const int MANTISSA_BITS = BITS - EXPONENT_BITS - (SIGN_BIT ? 1 : 0); //standard: 4
const int EXPONENT_BIAS = -(((1 << BITS) - 1) >> (BITS - (EXPONENT_BITS - 1))); //standard: -3
const int MAX_EXPONENT = EXPONENT_BIAS + ((1 << EXPONENT_BITS) - 1) - (IEEE_754_STANDARD ? 1 : 0); //standard: 3
const int SIGNALING_EXPONENT = (MAX_EXPONENT - EXPONENT_BIAS + (IEEE_754_STANDARD ? 1 : 0)) << MANTISSA_BITS; //standard: 0b0111_0000
const int F32_BITS = 8 * sizeof(float);
const int F32_EXPONENT_BITS = 8;
const int F32_MANTISSA_BITS = 23;
const int F32_EXPONENT_BIAS = -(int)(((1L << F32_BITS) - 1) >> (F32_BITS - (F32_EXPONENT_BITS - 1)));
const int F32_MAX_EXPONENT = F32_EXPONENT_BIAS + ((1 << F32_EXPONENT_BITS) - 1 - 1);
const int F32_SIGNALING_EXPONENT = (F32_MAX_EXPONENT - F32_EXPONENT_BIAS + 1) << F32_MANTISSA_BITS;
const int F32_SHL_LOSE_SIGN = (F32_BITS - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_SHR_PLACE_MANTISSA = MANTISSA_BITS + ((1 + F32_EXPONENT_BITS) - (MANTISSA_BITS + EXPONENT_BITS));
const int F32_MAGIC = (((1 << F32_EXPONENT_BITS) - 1) - (1 + EXPONENT_BITS)) << F32_MANTISSA_BITS;
byte _value;
static quarter Epsilon => new quarter { _value = 1 };
static quarter MaxValue => new quarter { _value = (byte)(SIGNALING_EXPONENT - 1) };
static quarter NaN => new quarter { _value = (byte)(SIGNALING_EXPONENT | 1) };
static quarter PositiveInfinity => new quarter { _value = (byte)SIGNALING_EXPONENT };
static uint asuint(float f) => *(uint*)&f;
static float asfloat(uint u) => *(float*)&u;
static byte tobyte(bool b) => *(byte*)&b;
static float ToFloat(quarter q, bool promiseInRange)
{
uint fusedExponentMantissa = ((uint)q._value << F32_SHL_LOSE_SIGN) >> F32_SHR_PLACE_MANTISSA;
uint sign = ((uint)q._value >> (BITS - 1)) << (F32_BITS - 1);
if (!promiseInRange)
{
bool nanInf = (q._value & SIGNALING_EXPONENT) == SIGNALING_EXPONENT;
uint ifNanInf = asuint(float.PositiveInfinity) & (uint)(-tobyte(nanInf));
return (nanInf ? 1f : asfloat(F32_MAGIC)) * asfloat(sign | fusedExponentMantissa | ifNanInf);
}
else
{
return asfloat(F32_MAGIC) * asfloat(sign | fusedExponentMantissa);
}
}
static quarter ToQuarter(float f, bool promiseInRange)
{
float inRange = f * (1f / asfloat(F32_MAGIC));
uint q = asuint(inRange) >> (F32_MANTISSA_BITS - (1 + EXPONENT_BITS));
uint f8_sign = asuint(f) >> (F32_BITS - 1);
if (!promiseInRange)
{
uint f32_exponent = asuint(f) & F32_SIGNALING_EXPONENT;
bool overflow = f32_exponent > (uint)(-F32_EXPONENT_BIAS + MAX_EXPONENT << F32_MANTISSA_BITS);
bool notNaNInf = f32_exponent != F32_SIGNALING_EXPONENT;
f8_sign ^= tobyte(!notNaNInf);
if (overflow & notNaNInf)
{
q = PositiveInfinity._value;
}
}
f8_sign <<= (BITS - 1);
return new quarter{ _value = (byte)(q ^ f8_sign) };
}
}
Turns out that in fact, the reverse operation of converting the mini-float to a 32 bit float by multiplying with a magic constant is also the reverse operation of a multiplication (wow...): a floating point division by that constant.
Luckily "by that constant" and not the other way around; we can calculate the reciprocal at compile time and multiply by it instead. This only fails, as with the reverse operation, when converting to- and from 'INF' and 'NaN'. Absolute overflow with any biased 32 exponent with exponent % (MAX_EXPONENT + 1) != 0 is not translated into 'INF' and positive 'INF' is translated into negative 'INF'.
Although this enables some optimizations through the bool paramater, this mostly just reduces code size and more importantly (especially for SIMD versions, where small data types really shine) reduces the need for constants. Speaking of SIMD: This scalar version can be optimized a little by using SSE/SSE2 intrinsics.
The (disabled) optimizations (would) run completely in parallel to the floating point multiplication followed by a shift, taking a total of 5 to 6+ clock cycles (very CPU dependant), which is astonishingly close to native hardware instructions (~4 to 5 clock cycles).
I am trying to use AForge.Net Genetics library to create a simple application for optimization purposes. I have a scenario where I have four input parameters, therefore I tried to modify the "OptimizationFunction2D.cs" class located in the AForge.Genetic project to handle four parameters.
While converting the binary chromosomes into 4 parameters (type = double) I am not sure if my approach is correct as I don't know how to verify the extracted values. Below is the code segment where my code differs from the original AForge code:
public double[] Translate( IChromosome chromosome )
{
// get chromosome's value
ulong val = ((BinaryChromosome) chromosome).Value;
// chromosome's length
int length = ((BinaryChromosome) chromosome).Length;
// length of W component
int wLength = length/4;
// length of X component
int xLength = length / 4;
// length of Y component
int yLength = length / 4;
// length of Z component
int zLength = length / 4;
// W maximum value - equal to X mask
ulong wMax = 0xFFFFFFFFFFFFFFFF >> (64 - wLength);
// X maximum value
ulong xMax = 0xFFFFFFFFFFFFFFFF >> (64 - xLength);
// Y maximum value - equal to X mask
ulong yMax = 0xFFFFFFFFFFFFFFFF >> (64 - yLength);
// Z maximum value
ulong zMax = 0xFFFFFFFFFFFFFFFF >> (64 - zLength);
// W component
double wPart = val & wMax;
// X component;
double xPart = (val >> wLength) & xMax;
// Y component;
double yPart = (val >> (wLength + xLength) & yMax);
// Z component;
double zPart = val >> (wLength + xLength + yLength);
// translate to optimization's function space
double[] ret = new double[4];
ret[0] = wPart * _rangeW.Length / wMax + _rangeW.Min;
ret[1] = xPart * _rangeX.Length / xMax + _rangeX.Min;
ret[2] = yPart * _rangeY.Length / yMax + _rangeY.Min;
ret[3] = zPart * _rangeZ.Length / zMax + _rangeZ.Min;
return ret;
}
I am not sure if am correctly separating the chromosome value into four part (wPart/xPart/yPart/zPart). The original function in the AForge.Genetic library looks like this:
public double[] Translate( IChromosome chromosome )
{
// get chromosome's value
ulong val = ( (BinaryChromosome) chromosome ).Value;
// chromosome's length
int length = ( (BinaryChromosome) chromosome ).Length;
// length of X component
int xLength = length / 2;
// length of Y component
int yLength = length - xLength;
// X maximum value - equal to X mask
ulong xMax = 0xFFFFFFFFFFFFFFFF >> ( 64 - xLength );
// Y maximum value
ulong yMax = 0xFFFFFFFFFFFFFFFF >> ( 64 - yLength );
// X component
double xPart = val & xMax;
// Y component;
double yPart = val >> xLength;
// translate to optimization's function space
double[] ret = new double[2];
ret[0] = xPart * rangeX.Length / xMax + rangeX.Min;
ret[1] = yPart * rangeY.Length / yMax + rangeY.Min;
return ret;
}
Can someone please confirm if my conversion process is correct or is there a better way of doing it.
No, it works but you don't need it to be so complicated.
ulong wMax = 0xFFFFFFFFFFFFFFFF >> (64 - wLength);
this returns the same value to all the results wMax xMax yMax zMax so just do one and call it componentMask
part = (val >> (wLength * pos) & componentMask);
where pos is the 0 based position of the component. so 0 for w, 1 for x ...
the rest is ok.
EDIT:
if the Length is not divided by 4 you can make the last part be just val >> (wLength * pos) to make it have the remaining bits.
I am trying to get the original 12-bit value from from a base15 (edit) string. I figured that I need a zerofill right shift operator like in Java to deal with the zero padding. How do I do this?
No luck so far with the following code:
static string chars = "0123456789ABCDEFGHIJKLMNOP";
static int FromStr(string s)
{
int n = (chars.IndexOf(s[0]) << 4) +
(chars.IndexOf(s[1]) << 4) +
(chars.IndexOf(s[2]));
return n;
}
Edit; I'll post the full code to complete the context
static string chars = "0123456789ABCDEFGHIJKLMNOP";
static void Main()
{
int n = FromStr(ToStr(182));
Console.WriteLine(n);
Console.ReadLine();
}
static string ToStr(int n)
{
if (n <= 4095)
{
char[] cx = new char[3];
cx[0] = chars[n >> 8];
cx[1] = chars[(n >> 4) & 25];
cx[2] = chars[n & 25];
return new string(cx);
}
return string.Empty;
}
static int FromStr(string s)
{
int n = (chars.IndexOf(s[0]) << 8) +
(chars.IndexOf(s[1]) << 4) +
(chars.IndexOf(s[2]));
return n;
}
Your representation is base26, so the answer that you are going to get from a three-character value is not going to be 12 bits: it's going to be in the range 0..17575, inclusive, which requires 15 bits.
Recall that shifting left by k bits is the same as multiplying by 2^k. Hence, your x << 4 operations are equivalent to multiplying by 16. Also recall that when you convert a base-X number, you need to multiply its digits by a power of X, so your code should be multiplying by 26, rather than shifting the number left, like this:
int n = (chars.IndexOf(s[0]) * 26*26) +
(chars.IndexOf(s[1]) * 26) +
(chars.IndexOf(s[2]));
I was wondering if it would be possible to get a vector with an X and a Y value as a single number, knowing that both X and Y can range from -65000 to +65000.
Is this possible in any way?
Code examples on how to convert from this kind of number and to it would be nice.
Store it in a ulong:
ulong rslt = (uint)x;
rslt = rslt << 32;
rslt |= ((uint)y);
To get it out:
int x = (int)(rslt >> 32);
int y = (int)(rslt & 0xFFFFFFFF);
Assuming X and Y are both integer values and there is no overflow (32bit values is not enough) you can use e.g. (pseudocode)
V = fromXY(X, Y) = (y+65000)*130001+(x+65000)
(X,Y) = toXY(V) = (V%130001-65000,V/130001-65000) // <= / is integer division
(130001 is the number of distinct values for X or Y)
To combine:
var limit = 65000;
var x = 1;
var y = 2;
var single = x * (limit + 1) + y;
And then:
y = single % (limit + 1);
x = single - y / (limit + 1);
See it in action.
Of course, you have to assume that the maximum value for single fits within the size of the data type that stores it (which in this case it does).
the union does what you want very easily.
See also: http://www.cplusplus.com/doc/tutorial/other_data_types/
typedef long int64;
typedef int int32;
union {
struct { int32 a, b; };
int64 a_and_b;
} stacker;
int main ()
{
stacker.a = -1000;
stacker.b = 2000;
cout << stacker.a << ", " << stacker.b << endl;
cout << stacker.a_and_b << endl;
}
this will output:
-1000, 2000 <-- a and b read as two int32
8594229558296 <-- a and b interprested as a single int64
I've asked before about the opposite of Bitwise AND(&) and you told me its impossible to reverse.
Well,this is the situation: The server sends an image,which is encoded with the function I want to reverse,then it is encoded with zlib.
This is how I get the image from the server:
UInt32[] image = new UInt32[200 * 64];
int imgIndex = 0;
byte[] imgdata = new byte[compressed];
byte[] imgdataout = new byte[uncompressed];
Array.Copy(data, 17, imgdata, 0, compressed);
imgdataout = zlib.Decompress(imgdata);
for (int h = 0; h < height; h++)
{
for (int w = 0; w < width; w++)
{
imgIndex = (int)((height - 1 - h) * width + w);
image[imgIndex] = 0xFF000000;
if (((1 << (Int32)(0xFF & (w & 0x80000007))) & imgdataout[((h * width + w) >> 3)]) > 0)
{
image[imgIndex] = 0xFFFFFFFF;
}
}
}
Width,Height,Image decompressed and Image compressed length are always the same.
When this function is done I put image(UInt32[] array) in a Bitmap and I've got it.
Now I want to be the server and send that image.I have to do two things:
Reverse that function and then compress it with zlib.
How do I reverse that function so I can encode the picture?
for (int h = 0; h < height; h++)
{
for (int w = 0; w < width; w++)
{
imgIndex = (int)((height - 1 - h) * width + w);
image[imgIndex] = 0xFF000000;
if (((1 << (Int32)(0xFF & (w & 0x80000007))) & imgdataout[((h * width + w) >> 3)]) > 0)
{
image[imgIndex] = 0xFFFFFFFF;
}
}
}
EDIT:The format is 32bppRGB
The assumption that the & operator is always irreversible is incorrect.
Yes, in general if you have
c = a & b
and all you know is the value of c, then you cannot know what values a or b had before hand.
However it's very common for & to be used to extract certain bits from a longer value, where those bits were previously combined together with the | operator and where each 'bit field' is independent of every other. The fundamental difference with the generic & or | operators that makes this reversible is that the original bits were all zero beforehand, and the other bits in the word are left unchanged. i.e:
0xc0 | 0x03 = 0xc3 // combine two nybbles
0xc3 & 0xf0 = 0xc0 // extract the top nybble
0xc3 & 0x0f = 0x03 // extract the bottom nybble
In this case your current function appears to be extracting a 1 bit-per-pixel (monochrome image) and converting it to 32-bit RGBA.
You'll need something like:
int source_image[];
byte dest_image[];
for (int h = 0; h < height; ++h) {
for (int w = 0; w < width; ++w) {
int offset = (h * width) + w;
if (source_image[offset] == 0xffffffff) {
int mask = w % 8; // these two lines convert from one int-per-pixel
offset /= 8; // offset to one-bit-per-pixel
dest_image[offset] |= (1 << mask); // only changes _one_ bit
}
}
}
NB: assumes the image is a multiple of 8 pixels wide, that the dest_image array was previously all zeroes. I've used % and / in that inner test because it's easier to understand and the compiler should convert to mask / shift itself. Normally I'd do the masking and shifting myself.