Can anyone explain in a simple way the code below:
public unsafe static float sample() {
    int result = 154 + (153 << 8) + (25 << 16) + (64 << 24);
    return *(float*)(&result); // don't know what for... please explain
}
Note: the above code uses an unsafe method.
I'm having a hard time with the code above because I don't understand how its return value differs from the return value of:
return (float)(result);
Is it necessary to use an unsafe method if you're returning *(float*)(&result)?
On .NET a float is represented as an IEEE binary32 single-precision floating-point number stored in 32 bits. Apparently the code constructs this number by assembling the bits into an int and then casting it to a float using unsafe code. The cast is what in C++ terms is called a reinterpret_cast: no conversion is done when the cast is performed, the bits are just reinterpreted as a new type.
The number assembled is 4019999A in hexadecimal or 01000000 00011001 10011001 10011010 in binary:
The sign bit is 0 (it is a positive number).
The exponent bits are 10000000 (or 128) resulting in the exponent 128 - 127 = 1 (the fraction is multiplied by 2^1 = 2).
The fraction bits are 00110011001100110011010 which, if nothing else, almost have a recognizable pattern of zeros and ones.
The float returned has the exact same bits as 2.4 converted to floating point and the entire function can simply be replaced by the literal 2.4f.
The final zero that sort of "breaks the bit pattern" of the fraction is there perhaps to make the float match something that can be written using a floating point literal?
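A quick way to check this claim (a small sketch, not part of the original answer) is to dump the bits of 2.4f and compare:
// Sketch: confirm that 2.4f has the bit pattern 0x4019999A.
float f = 2.4f;
int bits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
Console.WriteLine(bits.ToString("X8")); // prints 4019999A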
So what is the difference between a regular cast and this weird "unsafe cast"?
Assume the following code:
int result = 0x4019999A; // 1075419546
float normalCast = (float) result;
float unsafeCast = *(float*) &result; // Only possible in an unsafe context
The first cast takes the integer 1075419546 and converts it to its floating-point representation, i.e. 1075419546f (or rather the nearest float to that value). This involves computing the sign, exponent and fraction bits required to represent the original integer as a floating-point number, which is a non-trivial computation.
The second cast is more sinister (and can only be performed in an unsafe context). The &result takes the address of result, returning a pointer to the location where the integer 1075419546 is stored. The pointer dereferencing operator * can then be used to retrieve the value pointed to by the pointer. Using *&result would retrieve the integer stored at that location; however, by first casting the pointer to a float* (a pointer to a float), a float is instead retrieved from the memory location, resulting in the float 2.4f being assigned to unsafeCast. So the narrative of *(float*)&result is: give me a pointer to result, assume it is a pointer to a float, and retrieve the value it points to.
As opposed to the first cast, the second cast doesn't require any computation. It just shoves the 32 bits stored in result into unsafeCast (which fortunately is also 32 bits wide).
In general, performing a cast like that can fail in many ways, but by using unsafe you are telling the compiler that you know what you are doing.
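A small sketch that prints both results side by side (assumes the project is compiled with /unsafe):
static unsafe void Demo()
{
    int result = 0x4019999A;
    float normalCast = (float)result;    // value conversion
    float unsafeCast = *(float*)&result; // bit reinterpretation
    Console.WriteLine(normalCast);       // roughly 1.075E+09 (the integer's value as a float)
    Console.WriteLine(unsafeCast);       // 2.4 (the integer's bits read as a float)
}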
If I'm interpreting what the method is doing correctly, this is a safe equivalent:
public static float sample() {
    int result = 154 + (153 << 8) + (25 << 16) + (64 << 24);
    byte[] data = BitConverter.GetBytes(result);
    return BitConverter.ToSingle(data, 0);
}
As has been said already, it is re-interpreting the int value as a float.
This looks like an optimization attempt. Instead of doing floating-point calculations you are doing integer calculations on the integer representation of a floating-point number.
Remember, floats are stored as binary values just like ints.
After the calculation is done you are using pointers and casting to convert the integer into the float value.
This is not the same as casting the value to a float. That will turn the int value 1 into the float 1.0. In this case you turn the int value into the floating point number described by the binary value stored in the int.
It's quite hard to explain properly. I will look for an example. :-)
Edit:
Look here: http://en.wikipedia.org/wiki/Fast_inverse_square_root
Your code is basically doing the same as described in this article.
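For reference, the trick described in that article looks roughly like this when translated to C# unsafe code (a sketch of the classic algorithm, not code from the question):
public static unsafe float FastInverseSqrt(float x)
{
    float half = 0.5f * x;
    int i = *(int*)&x;                // reinterpret the float's bits as an int
    i = 0x5f3759df - (i >> 1);        // the famous "magic constant" step
    x = *(float*)&i;                  // reinterpret the adjusted bits back as a float
    return x * (1.5f - half * x * x); // one Newton-Raphson refinement step
}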
Re : What is it doing?
It is taking the bytes stored in the int and interpreting them as a float instead (without any conversion).
Fortunately, floats and ints have the same data size of 4 bytes.
Because Sarge Borsch asked, here's the 'Union' equivalent:
[StructLayout(LayoutKind.Explicit)]
struct ByteFloatUnion {
    [FieldOffset(0)] internal byte byte0;
    [FieldOffset(1)] internal byte byte1;
    [FieldOffset(2)] internal byte byte2;
    [FieldOffset(3)] internal byte byte3;
    [FieldOffset(0)] internal float single;
}
public static float sample() {
    ByteFloatUnion result;
    result.single = 0f;
    result.byte0 = 154;
    result.byte1 = 153;
    result.byte2 = 25;
    result.byte3 = 64;
    return result.single;
}
As others have already described, it's treating the bytes of an int as if they were a float.
You might get the same result without using unsafe code like this:
public static float sample()
{
    int result = 154 + (153 << 8) + (25 << 16) + (64 << 24);
    return BitConverter.ToSingle(BitConverter.GetBytes(result), 0);
}
But then it won't be very fast any more and you might as well use floats/doubles and the Math functions.
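On newer runtimes (.NET Core 2.0+ / .NET Standard 2.1) there is also BitConverter.Int32BitsToSingle, which performs the same reinterpretation without allocating a byte array; check whether your target framework has it:
public static float sample()
{
    int result = 154 + (153 << 8) + (25 << 16) + (64 << 24);
    return BitConverter.Int32BitsToSingle(result); // reinterprets the int's bits as a float
}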
Related
I'm having an issue working from Arduino to Unity; essentially I'm building a payload struct in C++ (Arduino) and sending it to a PC using the following serial protocol:
1 byte - header
1 byte - payload size
X bytes - payload
1 byte - checksum (LRC)
The payload looks like so:
struct SamplePayload {
    double A;
    double B;
    double C;
    double D;
} payload;
When I call sizeof(payload) I get 16 bytes, even though I believe a double is an 8-byte data type; if I add another double the struct is 20 bytes, and so on. Am I misunderstanding something? This is causing issues because the struct is then converted to a byte stream and cast back to a struct on receipt in C#, and I'm not sure what the equivalent data type would be (casting as a double gives wrong values). The serial protocol in Unity also relies on the correct payload size to read the stream.
It is probably a straightforward answer but I couldn't find it anywhere, many thanks!
If your application is not numerically sensitive, you could use the following approach:
Instead of using doubles within your struct (which aren't strictly standardized, as mentioned in the comments), you could use two int32_t values, a and b, representing a significand and an exponent such that
a*2^b = original_double
So your struct will look something like this:
struct SamplePayload {
    int32_t A_sig;
    int32_t A_exp;
    // B, C...
} payload;
Then on the receiving side, you will only have to multiply according to the formula above to get the original double.
C provides you with a neat function, frexp, to ease things up.
But since we store both a and b as integers, we need to modify the results a bit in order to get high precision.
Specifically, since a is guaranteed to be between 0.5 and 1, you need to multiply a by 2^30, and subtract 30 from b in order not to overflow.
Here's an example:
#include <stdio.h>  /* printf */
#include <math.h>   /* frexp */
#include <stdint.h>

int main()
{
    double param, result;
    int32_t a, b;

    param = +235.0123123;
    result = frexp(param, &b);
    a = (int32_t)(result * (1 << 30)); /* 2^30 */
    b -= 30;
    printf("%f = %d * 2^%d\n", param, a, b); // 235.012312 = 985713081 * 2^-22
    return 0;
}
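On the C# (Unity) side the receiving code only needs to apply the same formula; a minimal sketch, assuming a and b have already been read out of the payload:
// Sketch: reconstruct the original double from the (significand, exponent) pair.
int a = 985713081, b = -22;             // values from the C example above
double original = a * Math.Pow(2.0, b); // ~235.01231 (precision limited by the 2^30 scaling)
Console.WriteLine(original);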
I'm reading a C# book, and it has this example. The question is: why the heck does the float lose the "1" from the int value???
Doesn't a float have a bigger magnitude?
int i1 = 100000001;
float f = i1; // Magnitude preserved, precision lost (WHY? #_#)
int i2 = (int)f; // 100000000
A float is a 32-bit number made up of a sign bit, an 8-bit exponent and a 23-bit mantissa (24 bits of precision counting the implicit leading bit). What happens with
float f = i1;
is an attempt to squeeze a 32-bit integer into a 24-bit mantissa. The mantissa can only hold 24 bits (around 6-7 significant decimal digits), so anything past the 6th or 7th digit is lost.
If the assignment is made to a double, which has more significant digits, the value will be preserved.
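To see it concretely: around 10^8 adjacent floats are 8 apart, so the assignment has to round to the nearest representable value. A small sketch:
int i1 = 100000001;
float f = i1;                      // rounds to the nearest float, which is 100000000
Console.WriteLine((int)f);         // 100000000
Console.WriteLine((double)i1 - f); // 1: the difference lost in the conversion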
float was not designed for big integer numbers. If you want to use big numbers and you know they won't always be integers, use double.
int i1 = 100000001;
double f = Convert.ToDouble(i1);
int i2 = Convert.ToInt32(f); // 100000001
If they are all integers and you want to be able to do calculations with them, use Int64 (long) instead of int.
I have been looking for a way to determine the scale and precision of a decimal in C#, which led me to several SO questions, yet none of them seem to have correct answers, or they have misleading titles (they are really about SQL Server or some other database, not C#), or no answers at all. The following post, I think, is the closest to what I'm after, but even this seems wrong:
Determine the decimal precision of an input number
First, there seems to be some confusion about the difference between scale and precision. Per Google (per MSDN):
Precision is the number of digits in a number. Scale is the number of digits to the right of the decimal point in a number.
With that being said, the number 12345.67890M would have a scale of 5 and a precision of 10. I have not discovered a single code example that would accurately calculate this in C#.
I want to make two helper methods, decimal.Scale(), and decimal.Precision(), such that the following unit test passes:
[TestMethod]
public void ScaleAndPrecisionTest()
{
    // arrange
    var number = 12345.67890M;

    // act
    var scale = number.Scale();
    var precision = number.Precision();

    // assert
    Assert.IsTrue(precision == 10);
    Assert.IsTrue(scale == 5);
}
but I have yet to find a snippet that will do this, though several people have suggested using decimal.GetBits(), and others have said, convert it to a string and parse it.
Converting it to a string and parsing it is, in my mind, an awful idea, even disregarding the localization issue with the decimal point. The math behind the GetBits() method, however, is like Greek to me.
Can anyone describe what the calculations would look like for determining scale and precision in a decimal value for C#?
This is how you get the scale using the GetBits() function:
decimal x = 12345.67890M;
int[] bits = decimal.GetBits(x);
byte scale = (byte) ((bits[3] >> 16) & 0x7F);
And the best way I can think of to get the precision is by removing the fraction point (i.e. use the Decimal Constructor to reconstruct the decimal number without the scale mentioned above) and then use the logarithm:
decimal x = 12345.67890M;
int[] bits = decimal.GetBits(x);
//We will use false for the sign (false = positive), because we don't care about it.
//We will use 0 for the last argument instead of bits[3] to eliminate the fraction point.
decimal xx = new Decimal(bits[0], bits[1], bits[2], false, 0);
int precision = (int)Math.Floor(Math.Log10((double)xx)) + 1;
Now we can put them into extensions:
public static class Extensions {
    public static int GetScale(this decimal value) {
        if (value == 0)
            return 0;
        int[] bits = decimal.GetBits(value);
        return (int)((bits[3] >> 16) & 0x7F);
    }

    public static int GetPrecision(this decimal value) {
        if (value == 0)
            return 0;
        int[] bits = decimal.GetBits(value);
        // We will use false for the sign (false = positive), because we don't care about it.
        // We will use 0 for the last argument instead of bits[3] to eliminate the fraction point.
        decimal d = new Decimal(bits[0], bits[1], bits[2], false, 0);
        return (int)Math.Floor(Math.Log10((double)d)) + 1;
    }
}
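Usage with the value from the question (should print 5 and 10):
var number = 12345.67890M;
Console.WriteLine(number.GetScale());     // 5
Console.WriteLine(number.GetPrecision()); // 10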
First of all, solve the "physical" problem: how you're going to decide which digits are significant. The fact is, "precision" has no physical meaning unless you know or guess the absolute error.
Now, there are 2 fundamental ways to determine each digit (and thus, their number):
get+interpret the meaningful parts
calculate mathematically
The 2nd way can't detect trailing zeros in the fractional part (which may or may not be significant depending on your answer to the "physical" problem), so I won't cover it unless requested.
For the first one, in the Decimal's interface, I see 2 basic methods to get the parts: ToString() (a few overloads) and GetBits().
ToString(String, IFormatInfo) is actually a reliable way since you can define the format exactly.
E.g. use the F specifier and pass a culture-neutral NumberFormatInfo in which you have manually set all the fields that affect this particular format.
Regarding the NumberDecimalDigits field: a test shows that it is the minimal number of digits, so set it to 0 (the docs are unclear on this), and trailing zeros are printed all right if there are any.
The semantics of GetBits() result are documented clearly in its MSDN article (so laments like "it's Greek to me" won't do ;) ). Decompiling with ILSpy shows that it's actually a tuple of the object's raw data fields:
public static int[] GetBits(decimal d)
{
    return new int[]
    {
        d.lo,
        d.mid,
        d.hi,
        d.flags
    };
}
And their semantics are:
|high|mid|low| - binary digits (96 bits), interpreted as an integer (=aligned to the right)
flags:
bits 16 to 23 - "the power of 10 to divide the integer number" (=number of fractional decimal digits)
(thus (flags>>16)&0xFF is the raw value of this field)
bit 31 - sign (doesn't concern us)
As you can see, this is very similar to IEEE 754 floats.
So, the number of fractional digits is the exponent value. The number of total digits is the number of digits in the decimal representation of the 96-bit integer.
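For example, for the value from the earlier question the raw parts come out like this (a small sketch based on the layout described above):
int[] bits = decimal.GetBits(12345.67890M);
// bits[0] = 1234567890   (low 32 bits of the 96-bit integer)
// bits[1] = 0, bits[2] = 0
// bits[3] = 0x00050000   (scale 5 stored in bits 16..23, sign bit clear)
Console.WriteLine((bits[3] >> 16) & 0xFF); // 5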
Racil's answer gives you the value of the decimal's internal scale field, which is correct, although if the internal representation ever changes that code will break.
In the current format the precision portion of decimal is fixed at 96 bits, which is between 28 and 29 decimal digits depending on the number. All .NET decimal values share this precision. Since this is constant there's no internal value you can use to determine it.
What you're apparently after though is the number of digits, which we can easily determine from the string representation. We can also get the scale at the same time or at least using the same method.
public struct DecimalInfo
{
    public int Scale;
    public int Length;

    public override string ToString()
    {
        return string.Format("Scale={0}, Length={1}", Scale, Length);
    }
}

public static class Extensions
{
    public static DecimalInfo GetInfo(this decimal value)
    {
        string decStr = value.ToString().Replace("-", "");
        int decpos = decStr.IndexOf(".");
        int length = decStr.Length - (decpos < 0 ? 0 : 1);
        int scale = decpos < 0 ? 0 : length - decpos;
        return new DecimalInfo { Scale = scale, Length = length };
    }
}
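Note that this relies on the current culture using "." as the decimal separator; if that is not guaranteed, format with CultureInfo.InvariantCulture first. Usage for the value from the question:
var info = 12345.67890M.GetInfo();
Console.WriteLine(info); // Scale=5, Length=10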
I am attempting to do some math on two UInt64 values and store the result in a float:
UInt64 val64 = 18446744073709551615;
UInt64 val64_2 = 18446744073709551000;
float val = (float)val64 - val64_2;
Console.WriteLine(val64);
Console.WriteLine(val.ToString("f"));
Console.ReadKey();
I am expecting the val to be 615.0 but instead I am getting 0.0!
Using double instead for val seems to work but surely float is capable of storing 615.0. What am I missing here?
It's not the result that is being truncated, it's the values used in the calculation. You are casting val64 to a float in the expression. This also means val64_2 will be converted to a float (to match val64). Both have lost enough precision that they are the same value when represented as a float, so the difference is 0.
You want to keep them as UInt64 for the subtraction, and have the result as a float. i.e.
float val = (float)(val64 - val64_2);
Float is an approximation that can store only 6 or 7 significant digits (see https://msdn.microsoft.com/en-us/library/hd7199ke.aspx)
In your code both UInt64s end up as 1.84467441E+19 when cast to float.
As various people have already mentioned the solution is to keep the values as UInt64s for the subtraction.
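A quick side-by-side sketch of the two orderings:
UInt64 val64 = 18446744073709551615;
UInt64 val64_2 = 18446744073709551000;
Console.WriteLine((float)val64 - val64_2);   // 0: both operands are rounded to the same float first
Console.WriteLine((float)(val64 - val64_2)); // 615: exact 64-bit subtraction, then conversion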
I am trying to explicitly cast an int to a ushort but am getting the error Cannot implicitly convert type 'int' to 'ushort':
ushort quotient = ((12 * (ushort)(channel)) / 16);
I am using the .NET Micro Framework, so BitConverter is unavailable. Why I am using ushort in the first place has to do with how my data is being sent over SPI. I understand this particular error has been brought up before on this site, but I cannot see why it applies when I am explicitly declaring that I don't care if any data goes missing; just chop the 32 bits down to 16 bits and I will be happy.
public void SetGreyscale(int channel, int percent)
{
    // Calculate value in range of 0 through 4095 representing pwm greyscale data: refer to datasheet, 2^12 - 1
    ushort value = (ushort)System.Math.Ceiling((double)percent * 40.95);
    // Determine the index position within GsData where our data starts
    ushort quotient = ((12 * (ushort)(channel)) / 16); // there are 12 pieces of 16 bits
I would prefer not to change int channel, to ushort channel. How can I solve the error?
(ushort)channel is a ushort, but 12 * (ushort)(channel) is an int; do this instead:
ushort quotient = (ushort) ((12 * channel) / 16);
Multiplication of an int and any smaller type produces an int. So in your case 12 * ushort produces an int.
ushort quotient = (ushort)(12 * channel / 16);
Note that the code above is not exactly equivalent to the original sample: the cast of channel to ushort may significantly change the result if the value of channel is outside the ushort range (0..0xFFFF). If that matters, you still need the inner cast. The sample below produces 0 for channel = 0x10000 (which is what the original sample in the question does), unlike the more regular-looking code above (which gives 49152):
ushort quotient = (ushort)((12 * (ushort)channel) / 16);
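A quick check of the difference for a channel value outside the ushort range:
int channel = 0x10000; // 65536
Console.WriteLine((ushort)((12 * (ushort)channel) / 16)); // 0     (channel is truncated to ushort first)
Console.WriteLine((ushort)(12 * channel / 16));           // 49152 (truncation happens only at the end)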