I am trying to decode the magnetic heading that is contained in a 10-bit field. I am not sure how the above instructions are to be interpreted. What I did was just take the 10 bits and convert them to decimal like this:
int magneticheading = Convert.ToInt32(olotoMEbinary.Substring(14, 10), 2);
But then I checked that 259 degrees only needs 9 bits to be expressed in binary (100000011). I am confused about what a most significant bit of 180 degrees and an LSB of 360/1024 mean.
For example, if I receive the following 10 bits, 0100001010, how are they converted to degrees according to the above instructions?
Using floating-point math, multiply by 360 and divide by 1024.
The instructions the question references are missing, but Stephen Cleary's method appears to fit the two data points provided. It may help to think of it as a unit conversion from 1024 divisions of a circle to 360.
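In C#, a sketch of that conversion applied to the asker's example bits (variable names are mine):

int raw = Convert.ToInt32("0100001010", 2);   // 266
double degrees = raw * 360.0 / 1024.0;        // one LSB = 360/1024 degrees; the MSB alone (512) maps to 180
Console.WriteLine(degrees);                   // 93.515625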
What is the best data type to use when storing geopositional data in C#? I would use decimal for its exactness, but operations on decimal floating point numbers are slower than on binary floating point numbers (double).
I read that most of the time you won't need any more than 6 or 7 digits of precision for latitude or longitude. Does the inexactness of doubles even matter then or can it be ignored?
Go for double; there are several reasons:
Trigonometric functions are available only for double
Precision of double (range of 100 nanometers) is far beyond anything you'll ever require for Lat/Lon values
The GeoCoordinate class and third-party modules (e.g. DotSpatial) also use double for coordinates
A double has up to 15 decimal digits of precision. So, let's assume three of those digits are going to be on the left of the decimal point for lat/long values (max of 180 degrees). This leaves 12 digits of precision on the right. Since a degree of lat/long is ~111 km, 5 of those 12 digits would give us precision to the meter, 3 more digits would give us precision to the millimeter, and the remaining 4 digits would get us precision to around 100 nanometers. Since double wins from the perspective of both performance and memory, I see no reason to even consider using decimal.
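To illustrate the point (the coordinate values here are mine), two longitudes that differ by roughly 100 nanometres are still distinct as doubles:

double lon1 = 121.123456789012;   // ~12 digits after the decimal point
double lon2 = 121.123456789013;   // differs by ~1e-12 degrees (~100 nm on the ground)
Console.WriteLine(lon1 == lon2);  // False - double still resolves the difference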
I faced this question quite a while ago when I started with spatial programming.
I read a book a while ago that led me to this.
// SQL Server has a really cool library that deals with spatial data,
// such as geography points and so on.
// Add this namespace:
using Microsoft.SqlServer.Types;
// SqlGeography.Point(dblLat, dblLon, srid)
var lat_lon_point = Microsoft.SqlServer.Types.SqlGeography.Point(lat, lon, 4326);
This is the best way of working with spatial data in your application.
Then, to save the data, use this in SQL:
CREATE TABLE myGeoTable
(
    LatLonPoint GEOGRAPHY
)
Otherwise, if you are using something that isn't SQL Server, just convert the point to hexadecimal and store it. After a long time working with spatial data, I know this is the safest approach.
Double
Combining the answers, this is how Microsoft itself represents it in the SqlGeography library:
[get: Microsoft.SqlServer.Server.SqlMethod(IsDeterministic=true, IsPrecise=true)]
public System.Data.SqlTypes.SqlDouble Lat { get; }
Property Value
SqlDouble
A SqlDouble value that specifies the latitude.
If you are using .NET EF Core, I would recommend the NetTopologySuite library.
Read the full documentation at the link below:
https://learn.microsoft.com/en-us/ef/core/modeling/spatial
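Creating a point with NetTopologySuite looks roughly like this (a sketch based on the EF Core spatial docs; note that NetTopologySuite uses (X, Y) = (longitude, latitude) order):

using NetTopologySuite.Geometries;

var location = new Point(longitude, latitude) { SRID = 4326 };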
I'm trying to translate the raw binary data from a thread context into a human-readable format, and have come up empty when trying to translate quadruple-precision floating point values into a readable format in C#.
Ultimately, I'd like to display it in standard scientific notation, e.g. 1.234567×10^89. I'm not worried about loss of precision in the process - I just want a reasonable idea of what the value is.
My first thought was to manually compute the value as a double by raising the exponent, but of course I'm going to exceed the maximum value in many cases. I don't mind losing precision, but not being able to display it at all isn't acceptable.
Is there some kind of simple mathematical hack I can use for this?
You could install a third-party library that handles that. For example it looks like QPFloat gives you a new struct called System.Quadruple which overrides ToString, so you could try that.
(I wonder when .NET will support something like System.Quadruple.)
So here's an answer to expand on the comment I made earlier. I hope you don't mind that I'm using Python, since I know where to find everything I need in that language; maybe someone else can translate this into a suitable answer in C#.
Suppose that you've got a sequence of 128 bits representing a number in IEEE 754 binary128 format, and that we've currently read those 128 bits in in the form of an unsigned integer x. For example:
>>> x = 0x4126f07c18386f74e697bd57a865a9d0
(I guess this would be a bit messier in C#, since as far as I can tell it doesn't have a 128-bit integer type; you'd need to either use two 64-bit integers for the high and low words, or use the BigInteger type.)
We can extract the exponent and significand via bit operations as usual (I'm assuming that you already got this far, but I wanted to include the computation for completeness):
>>> significand_mask = (1 << 112) - 1
>>> exponent_mask = (1 << 127) - (1 << 112)
>>> trailing_significand = x & significand_mask
>>> significand = 1.0 + float(trailing_significand) / (2.0**112)
>>> biased_exponent = (x & exponent_mask) >> 112
>>> exponent = biased_exponent - 16383
Note that while the exponent is exact, we've lost most of the precision of significand at this point, keeping only 52-53 bits of precision.
>>> significand
1.9393935334951098
>>> exponent
295
So the value represented is around 1.9393935334951098 * 2**295, or around 1.234567e+89. But you can't do the computation directly at this stage because it might overflow a Double (in this case it doesn't, but if the exponent were bigger you'd have a problem). So here's where the logs come in: let's compute the natural log of the value represented by x:
>>> from math import log, exp
>>> log_of_value = log(significand) + exponent*log(2)
>>> log_of_value
205.14079357778544
Then we can divide by log(10) to get the exponent and mantissa for the decimal part: the quotient of the division gives the decimal exponent, while the remainder gives the log of the significand, so we have to apply exp to it to retrieve the actual significand:
>>> exp10, mantissa10 = divmod(log_of_value, log(10))
>>> exp10
89.0
>>> significand10 = exp(mantissa10)
>>> significand10
1.234566999999967
And formatting the answer nicely:
>>> print("{:.10f}e{:+d}".format(significand10, int(exp10)))
1.2345670000e+89
That's the basic idea: to do this generally you'd also need to handle the sign bit and the special bit patterns for zeros, subnormal numbers, infinities and NaNs. Depending on the application, you may not need all of those.
There's some precision loss involved firstly in the conversion of the integer significand to a double precision float, but also in taking logs and exponents. The worst case for precision loss occurs when the exponent is large, since a large exponent magnifies the absolute error involved in the log(2) computation, which in turn contributes a larger relative error when taking exp to get the final significand. But since the (unbiased) exponent doesn't exceed 16384, it's not hard to bound the error. I haven't done the formal computations, but this should be good for around 12 digits of precision across the range of the binary128 format, and precision should be a bit better for numbers with small exponent.
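For what it's worth, here is a rough C# translation of the idea above, assuming the 128 bits arrive as two 64-bit words (high and low); the sign bit and the special bit patterns are again ignored, and the method name is mine:

static string Binary128ToScientific(ulong hi, ulong lo)
{
    // Biased exponent: bits 126..112 of the value (bits 62..48 of the high word).
    long exponent = (long)((hi >> 48) & 0x7FFF) - 16383;

    // Trailing significand: the low 48 bits of hi plus all 64 bits of lo.
    double significand = 1.0
        + (hi & 0x0000FFFFFFFFFFFFUL) * Math.Pow(2.0, -48)
        + lo * Math.Pow(2.0, -112);

    // Work with log10 of the value so large exponents cannot overflow a double.
    double log10 = (Math.Log(significand) + exponent * Math.Log(2.0)) / Math.Log(10.0);
    double exp10 = Math.Floor(log10);
    double significand10 = Math.Pow(10.0, log10 - exp10);

    return string.Format("{0:F10}e{1:+0;-0}", significand10, (int)exp10);
}

// The example value from above, 0x4126F07C18386F74E697BD57A865A9D0:
// Binary128ToScientific(0x4126F07C18386F74UL, 0xE697BD57A865A9D0UL) -> "1.2345670000e+89"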
There are a few hacks for that...
1. Compute a hex string for the number.
The mantissa and exponent are in binary, so there should be no problem; just do not forget to add a zero digit for each 2^4 of the exponent and shift the mantissa by the remaining exponent & 3 bits. Negative exponents need a few tweaks but are very similar.
All of this can be done with bit and shift operations, so there is no precision loss if coded right...
2. Convert the hex string to a decimal string.
There are quite a few examples here on SO; here is mine. You can also tweak it a little to skip zero processing for more speed...
3. Scan the decimal string.
If you look at my dec2hex and hex2dec conversions in the link above, the scan is already there. You need to find:
the position of the first nonzero decimal digit from the left and from the right
the position of the decimal point
From these you can easily compute the exponent.
4. Convert the decimal string to mantissa * 10^exponent form.
It is quite straightforward: just remove zeros, move the decimal point to its new position, and then add the exponent part.
5. Add the sign to the mantissa.
You can add it directly in steps #1 and #2, but if you do it at the end it will spare you a few ifs...
Hope this helps...
Can anyone explain to me why this program:
for(float i = -1; i < 1; i += .1F)
Console.WriteLine(i);
Outputs this:
-1
-0.9
-0.8
-0.6999999
-0.5999999
-0.4999999
-0.3999999
-0.2999999
-0.1999999
-0.09999993
7.450581E-08
0.1000001
0.2000001
0.3000001
0.4000001
0.5000001
0.6000001
0.7000001
0.8000001
0.9000002
Where is the rounding error coming from??
I'm sure this question must have been asked in some form before but I can't find it anywhere quickly. :)
The answer comes down to the way that floating point numbers are represented. You can go into the technical detail via Wikipedia, but simply put, a decimal number doesn't necessarily have an exact floating point representation...
The way floating point numbers (base 2 floating point anyway, like doubles and floats) work[0] is by adding up powers of 1/2 to get to what you want. So 0.5 is just 1/2, 0.75 is 1/2 + 1/4, and so on.
The problem is that you can never represent 0.1 in this binary system without an unending stream of increasingly smaller powers of 2, so the best a computer can do is store a number that is very close to, but not quite, 0.1.
Usually you don't notice these differences, but they are there, and sometimes you can make them manifest themselves. There are a lot of ways to deal with these issues, and which one you use is very much dependent on what you are actually doing with it.
[0] in the slightly handwavey close enough kind of way
Floating point numbers are not exact decimal values; they are approximations, because they must be rounded. They are only precise in their binary representation.
Different CPUs or PCs can produce different results.
Take a look at the Wikipedia page.
The big issue is that 0.1 cannot be represented in binary, just like 1 / 3 or 1 / 7 cannot be represented in decimal. So since the computer has to cut off at some point, it will accumulate a rounding error.
Try doing 0.1 + 0.7 == 0.8 in pretty much any programming language, you'll get false as a result.
In C# to get around this, use the decimal type to get better precision.
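A quick C# illustration of both points:

Console.WriteLine(0.1 + 0.7 == 0.8);     // False: binary doubles cannot hold 0.1 or 0.7 exactly
Console.WriteLine(0.1m + 0.7m == 0.8m);  // True: decimal represents these values exactly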
This will explain everything about floating-point:
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
The rounding error comes from the fact that float is not a precise data type (when converted to decimal); it is an approximation. Note that in the C# reference, float is specified as having 7 digits of decimal precision.
It is fundamental to any floating point variable. The reasons are complex, but there is plenty of information if you google it.
Try using Decimal instead.
As other posters have intimated, the problem stems from the assumption that floating point numbers are a precise decimal representation. They are not: they are a precise binary (base-2) representation of a number. The problem you are experiencing is that you cannot always express a precise decimal number in binary format - just like you cannot express 1/3 in decimal format (.33333333...). At some point, rounding must occur.
In your example, rounding is occurring when you express .1F (because that is not a value that can be expressed precisely in base-2).
Refreshing my memory on floating point (also PDF) and IEEE-754, and taking part in this discussion on floating point rounding when converting to strings, got me tinkering: how can I get the maximum and minimum decimal values for a given floating point number whose binary representations are equal?
Disclaimer: for this discussion, I like to stick to 32 bit and 64 bit floating point as described by IEEE-754. I'm not interested in extended floating point (80-bits) or quads (128 bits IEEE-754-2008) or any other standard (IEEE-854).
Background: Computers are bad at representing 0.1 in binary representation. In C#, a float represents this as 3DCCCCCD internally (C# uses round-to-nearest) and a double as 3FB999999999999A. The same bit patterns are used for decimal 0.100000005 (float) and 0.1000000000000000124 (double), but not for 0.1000000000000000144 (double).
For convenience, the following C# code gives these internal representations:
string GetHex(float f)
{
return BitConverter.ToUInt32(BitConverter.GetBytes(f), 0).ToString("X");
}
string GetHex(double d)
{
return BitConverter.ToUInt64(BitConverter.GetBytes(d), 0).ToString("X");
}
// float
Console.WriteLine(GetHex(0.1F));
// double
Console.WriteLine(GetHex(0.1));
In the case of 0.1, there is no lower decimal number that is represented with the same bit pattern; any 0.99...99 will yield a different bit representation (e.g., the float for 0.999999937 yields 3F7FFFFF internally).
My question is simple: how can I find the lowest and highest decimal value for a given float (or double) that is internally stored in the same binary representation.
Why: (I know you'll ask) to find the error in rounding in .NET when it converts to a string and when it converts from a string, to find the internal exact value and to understand my own rounding errors better.
My guess is something like: take the mantissa, remove the rest, get its exact value, get one (mantissa-bit) higher, and calculate the mean: anything below that will yield the same bit pattern. My main problem is: how to get the fractional part as an integer (bit manipulation is not my strongest asset). Jon Skeet's DoubleConverter class may be helpful.
One way to get at your question is to find the size of an ULP, or Unit in the Last Place, of your floating-point number. Simplifying a little bit, this is the distance between a given floating-point number and the next larger number. Again, simplifying a little bit, given a representable floating-point value x, any decimal string whose value is between (x - 1/2 ulp) and (x + 1/2 ulp) will be rounded to x when converted to a floating-point value.
The trick is that (x +/- 1/2 ulp) is not a representable floating-point number, so actually calculating its value requires that you use a wider floating-point type (if one is available) or an arbitrary width big decimal or similar type to do the computation.
How do you find the size of an ulp? One relatively easy way is roughly what you suggested, written here in C-ish pseudocode because I don't know C#:
float absX = absoluteValue(x);
uint32_t bitPattern = getRepresentationOfFloat(absX);
bitPattern++;
float nextFloatNumber = getFloatFromRepresentation(bitPattern);
float ulpOfX = (nextFloatNumber - absX);
This works because adding one to the bit pattern of x exactly corresponds to adding one ulp to the value of x. No floating-point rounding occurs in the subtraction because the values involved are so close (in particular, there is a theorem of IEEE-754 floating-point arithmetic that if two numbers x and y satisfy y/2 <= x <= 2y, then x - y is computed exactly). The only caveats here are:
if x happens to be the largest finite floating point number, this won't work (it will return inf, which is clearly wrong).
if your platform does not correctly support gradual underflow (say an embedded device running in flush-to-zero mode), this won't work for very small values of x.
It sounds like you're not likely to be in either of those situations, so this should work just fine for your purposes.
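In C#, a minimal version of that pseudocode might look like this (the helper name is mine; the same caveats apply):

static float UlpOf(float x)
{
    float absX = Math.Abs(x);
    uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(absX), 0);
    bits++;                                                // next representable float above absX
    float next = BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    return next - absX;                                    // exact, per the argument above
}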
Now that you know what an ulp of x is, you can find the interval of values that rounds to x. You can compute ulp(x)/2 exactly in floating-point, because floating-point division by 2 is exact (again, barring underflow). Then you need only compute the value of x +/- ulp(x)/2 in a suitably larger floating-point type (double will work if you're interested in float) or in a big-decimal type, and you have your interval.
I made a few simplifying assumptions through this explanation. If you need this to really be spelled out exactly, leave a comment and I'll expand on the sections that are a bit fuzzy when I get the chance.
One other note: the following statement in your question:
In the case of 0.1, there is no lower
decimal number that is represented
with the same bit pattern
is incorrect. You just happened to be looking at the wrong values (0.999999... instead of 0.099999... -- an easy typo to make).
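As for the "fractional part as an integer" piece of the question, one possible sketch for a double (the masks follow the IEEE-754 layout; the variable names are mine):

ulong bits = BitConverter.ToUInt64(BitConverter.GetBytes(0.1), 0);
ulong fraction = bits & 0x000FFFFFFFFFFFFFUL;       // low 52 bits: the stored fraction
int biasedExponent = (int)((bits >> 52) & 0x7FF);   // next 11 bits: the biased exponent
Console.WriteLine("{0:X} {1}", fraction, biasedExponent);  // 999999999999A 1019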
Python 3.1 just implemented something like this: see the changelog (scroll down a bit), bug report.
After one hour of trying to find a bug in my code I've finally found the reason. I was trying to add a very small float to 1f, but nothing was happening. While trying to figure out why I found that adding that small float to 0f worked perfectly.
Why is this happening?
Does this have to do with 'orders of magnitude'?
Is there any workaround to this problem?
Thanks in advance.
Edit:
Changing to double precision or decimal is not an option at the moment.
Because the precision of a single-precision (32-bit) floating-point value is only around 7 significant decimal digits. This means the value you are adding is essentially zero, at least when added to 1. The value itself, however, can effortlessly be stored in a float, since its exponent is small in that case. But to successfully add it to 1 you have to use the exponent of the larger number ... and then the digits after the zeroes disappear in rounding.
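A minimal repro of the effect (1e-8f is just an example of a "very small float"):

float tiny = 1e-8f;
Console.WriteLine(0f + tiny);  // 1E-08: fine on its own, the exponent simply becomes small
Console.WriteLine(1f + tiny);  // 1: less than half an ulp of 1, so it is rounded away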
You can use double if you need more precision. Performance-wise this shouldn't make a difference on today's hardware and memory is often also not as constrained that you have to think about every single variable.
EDIT: As you stated that using double is not an option you could use Kahan summation, as akuhn pointed out in a comment.
Another option may be to perform intermediary calculations in double-precision and afterwards cast to float again. This will only help, however, when there are a few more operations than just adding a very small number to a larger one.
Floating-point arithmetic
This probably happens because the number of digits of precision in a float is constant, but the exponent can obviously vary.
This means that although you can add your small number to 0, you cannot expect to add it to a number that has an exponent different from 0, since there just won't be enough digits of precision left.
You should read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
It looks like it has something to do with floating point precision. If I were you, I'd use a different type, like decimal. That should fix precision errors.
With float, you only get an accuracy of about seven digits, so your number will be rounded into 1f. If you want to store such a number, use double instead:
http://msdn.microsoft.com/en-us/library/ayazw934.aspx
In addition to the accepted answer: If you need to sum up many small number and some larger ones, you should use Kahan Summation.
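A small sketch of Kahan summation in C# (illustrative only, not from the linked answer):

static float KahanSum(float[] values)
{
    float sum = 0f;
    float compensation = 0f;              // running low-order error term
    foreach (float v in values)
    {
        float y = v - compensation;       // correct the incoming value
        float t = sum + y;                // low-order bits of y may be lost here
        compensation = (t - sum) - y;     // recover what was lost
        sum = t;
    }
    return sum;
}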
If performance is an issue (because you can't use double), then binary scaling/fixed-point may be an option. Values are stored as integers, scaled by a large factor (say, 2^16). Intermediate arithmetic is done with (relatively fast) integer operations. The final answer can be converted back to floating point at the end, by dividing by the scaling factor.
This is often done if the target processor lacks a hardware floating-point unit.
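A toy fixed-point (Q16.16) sketch of what that looks like; all constants here are illustrative:

const int Scale = 1 << 16;                 // 2^16 fractional steps per unit
int a = (int)(1.0f * Scale);               // 1.0 in fixed point
int b = (int)(1.5e-3f * Scale);            // the small value survives as 98 steps
int sum = a + b;                           // fast integer add, no float rounding
float result = (float)sum / Scale;         // convert back at the end
Console.WriteLine(result);                 // ~1.0014954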
You're using the f suffix on your literals, which will make these floats instead of doubles. So your very small float will vanish in the bigger float.