C# float.ToString Rounding Values - c#

Maybe I am missing something but the float.ToString() method rounds numbers, which causes me a lot of headache.
Take a look at the following small code:
When entering 12345678 as input, the float number in the debugger is correct but the output of the ToString methods is 12345680 (in any format I've tried...)
Console.WriteLine("Please enter a float number");
float theFloat = float.Parse(Console.ReadLine());
Console.WriteLine(string.Format("You have entered {0}", theFloat.ToString("F")));
And the output:
Please enter a float number
12345678
You have entered 12345680.00
Help will be much appreciated!

From the documentation for float:
The range of a float is -3.4 × 10^38 to +3.4 × 10^38
The precision of a float is 7 digits.
Your number, 12345678, at 8 digits long exceeds the precision, so it is by default being rounded to 7 significant digits, which yields 12345680. (Note the by default.)
However, despite what that Microsoft article says about the precision of a float, in reality it holds up to 9 digits of precision.
The Microsoft documentation for Single.ToString() states:
By default, the return value only contains 7 digits of precision although a maximum of 9 digits is maintained internally.
It then goes on to say:
If you require more precision, specify format with the "G9" format specification, which always returns 9 digits of precision, or "R", which returns 7 digits if the number can be represented with that precision or 9 digits if the number can only be represented with maximum precision.
Armed with this information, we can write this code:
Console.WriteLine(12345678f.ToString("G9"));
Which does indeed print 12345678.
What I can't explain is why Microsoft state that a float has 7 digits of precision, and then goes on to let us use 9 digits...
However, note that not all 8 (or 9) decimal digit integers will have an exact representation in a float, as the following code demonstrates (the last digit differs):
Console.WriteLine(16777217f.ToString("R")); // Prints 16777216

Related

What's the meaning of precision digits in float / double / decimal? [duplicate]

This question already has answers here:
How to calculate float type precision and does it make sense?
(4 answers)
Closed 2 years ago.
On the C# reference for floating-point numeric types one can read that
float has a precision of 6 to 9 digits
double has a precision of 15 to 17 digits
decimal has a precision of 28 to 29 digits
What does precision mean in this context and especially, how can the precision be a range? As the number of bits for the exponent and mantissa are fixed, how can the precision be variable (in the described range)? Can someone please give an example for e.g. a float with a precision of 6 and one with a precision of 9?
float and double
(I'll explain float, that is IEEE-754 Single-precision floating-point format, but double, that is IEEE-754 Double-precision floating-point format is the same but with bigger numbers.
In general you can imagine a float to be:
mantissa₂ * (2 ^ exponent₂)
where mantissa₂ means mantissa in base two, and exponent₂ means exponent in base two
The mantissa₂ is 23 bits, the exponent₂ 8 bits. There is an extra bit for the sign, and the exponent₂ has a special format with special range that we will see much below
There is another trick: floating points are normally saved in "normalized" form:
1₂ mantissa₂ * (2 ^ exponent₂)
so the first digit is always 1₂, and so there is a 1₂ plus 23 binary digits for the mantissa₂, so a total of 24 digits for the complete mantissa₂.
Now, with 24 bits you can have numbers between 0 and 16,777,216, that is 7 full digits plus the 8th that is "partial" (you can't have 26,777,216 for example). In fact log₁₀ 2^24 = 7.22471989594
The exponent "moves" a floating decimal point, so that you can have, for example
1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂ . 1₂ (there are a total of 24 binary digits 1, I hope... I counted them)
or
1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂ . 1₂1₂
or
1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂0₂
or
1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂1₂0₂0₂
and so on.
The exponent₂ has three ranges: [-1;-127], [1;127], 0 for denormalized numbers and 255 for NaN and Infinite (where 255 means that all the bits of the exponent are at 1)
In the range [-1;-127] the decimal point is moved to the left, for a number of steps equal to the range, in the range [1;127]` the decimal point is moved to the right in the same way.
If the exponent is 0, the number is "denormalized". They are ugly floating point numbers that have special handling and are slower for this reason. When the number is "denormalized" then there is no implicit 1₂ at the beginning of the number, so you only have 23 bits of mantissa, that is 6 dot something digits of precision (log₁₀ 2^23 = 6.9236899)
Can't explain how the 9 digits of precision come out.
decimal
With decimal it is easy: the format is:
mantissa₂ / (10 ^ exponent₂)
where mantissa₂ is 96 bits, exponent₂ is 5 bits (a little less, the range is [0;28]), plus there is a sign bit, and many unused bits. The exact format is written in the reference source. In decimals there is no implicit initial 1₂, so it is pure 96 bits, and log₁₀ 2^96 = 28.8988795837, so 28 or 29 digits.
What's the meaning of precision digits in float / double / decimal?
The decimal digits needed to round trip between text and the type.
text-FP-text: When a number is decimal text and then converted to floating point type and then converted back to text with the same number of digits and get the same value, over the entire exponent range of the FP type, the max number of significant decimal digits in the text version is the lower number like 6 for float. As long as the text version has only 6 digits display, float can encode a value close enough.
FP-text-FP: When a FP number is converted to decimal text and then converted back to FP and get the same FP value, the number of significant decimal digits needed for the text version is the higher number like 9 for float. As long as a text version reports 9+ significant digits, the original FP value can be recovered exactly.
float has 24 bits of binary precession. To translate that into decimal, the above context is important. The minimum non-zero double takes about 330+ decimal digits to printout exactly, yet that is rarely thought of as the precession of that number.
Can someone please give an example for e.g. a float with a precision of 6 and one with a precision of 9?
6 decimal digits always works .... "9999979e3" and "9999978e3" both convert to 9.999978e+09, so 9 significant text digits needed to round-trip.

How to calculate float type precision and does it make sense?

I have a problem understanding the precision of float type.
The msdn writes that precision from 6 to 9 digits. But I note that precision depends from on the size of the number:
float smallNumber = 1.0000001f;
Console.WriteLine(smallNumber); // 1.0000001
bigNumber = 100000001f;
Console.WriteLine(bigNumber); // 100000000
The smallNumber is more precise than big, I understand IEEE754, but I don't understand how MSDN calculate precision, and does it make sense?
Also, you can play with the representation of numbers in float format here. Please write 100000000 value in "You entered" input and click "+1" on the right. Then change the input's value to 1, and click "+1" again. You may see the difference in precision.
The MSDN documentation is nonsensical and wrong.
Bad concept. Binary-floating-point format does not have any precision in decimal digits because it has no decimal digits at all. It represents numbers with a sign, a fixed number of binary digits (bits), and an exponent for a power of two.
Wrong on the high end. The floating-point format represents many numbers exactly, with infinite precision. For example, “3” is represented exactly. You can write it in decimal arbitrarily far, 3.0000000000…, and all of the decimal digits will be correct. Another example is 1.40129846432481707092372958328991613128026194187651577175706828388979108268586060148663818836212158203125•10−45. This number has 105 significant digits in decimal, but the float format represents it exactly (it is 2−149).
Wrong on the low end. When “999999.97” is converted from decimal to float, the result is 1,000,000. So not even one decimal digit is correct.
Not a measure of accuracy. Because the float significand has 24 bits, the resolution of its lowest bit is about 223 times finer than the resolution of its highest bit. This is about 6.9 digits in the sense that log10223 is about 6.9. But that just tells us the resolution—the coarseness—of the representation. When we convert a number to the float format, we get a result that differs from the number by at most ½ of this resolution, because we round to the nearest representable value. So a conversion to float has a relative error of at most 1 part in 224, which corresponds to about 7.2 digits in the above sense. If we are using digits to measure resolution, then we say the resolution is about 7.2 digits, not that it is 6-9 digits.
Where do these numbers came from?
So, if “~6-9 digits” is not a correct concept, does not come from actual bounds on the digits, and does not measure accuracy, where does it come from? We cannot be sure, but 6 and 9 do appear in two descriptions of the float format.
6 is the largest number x for which this is guaranteed:
If any decimal numeral with at most x significant digits is within the normal exponent bounds of the float format and is converted to the nearest value represented in the format, then, when the result is converted to the nearest decimal numeral with at most x significant digits, the result of that conversion equals the original number.
So it is reasonable to say float can preserve at least six decimal digits. However, as we will see, there is no bound involving nine digits.
9 is the smallest number x that guarantees this:
If any finite float number is converted to the nearest decimal numeral with x digits, then, when the result is converted to the nearest value representable in float, the result of that conversion equals the original number.
As an analogy, if float is a container, then the largest “decimal container” guaranteed to fit inside it is six digits, and the smallest “decimal container” guaranteed to hold it is nine digits. 6 and 9 are akin to interior and exterior measurements of the float container.
Suppose you had a block 7.2 units long, and you were looking at its placement on a line of bricks each 1 unit long. If you put the start of the block at the start of a brick, it will extend 7.2 bricks. However, if somebody else chooses where it starts, they might start it in the middle of a brick. Then it would cover part of that brick, all of the next 6 bricks, and and part of the last brick (e.g., .5 + 6 + .7 = 7.2). So a 7.2-unit block is only guaranteed to cover 6 bricks. Conversely, 8 bricks can cover the 7.2-unit block if you choose where they are placed. But if somebody else chooses where they start, the first might cover just .1 units of the block. Then you need 7 more and another fraction, so 9 bricks are needed.
The reason this analogy holds is that powers of two and powers of 10 are irregularly spaced relative to each other. 210 (1024) is near 103 (1000). 10 is the exponent used in the float format for numbers from 1024 (inclusive) to 2048 (exclusive). So this interval from 1024 to 2048 is like a block that has been placed just after the 100-1000 ends and the 1000-10,000 block starts.
But note that this property involving 9 digits is the exterior measurement—it is not a capability that float can perform or a service that it can provide. It is something that float needs (if it is to be held in a decimal format), not something it provides. So it is not a bound on how many digits a float can store.
Further Reading
For better understanding of floating-point arithmetic, consider studying the IEEE-754 Standard for Floating-Point Arithmetic or a good textbook like Handbook of Floating-Point Arithmetic by Jean-Michel Muller et al.
Yes number of digits before rounding errors is a measure of precision but you can not asses precision from just 2 numbers because you might be just closer or further from the rounding threshold.
To better understand the situation then you need to see how floats are represented.
The IEEE754 32bit floats are stored as:
bool(1bit sign) * integer(24bit mantisa) << integer(8bit exponent)
Yes mantissa is 24 bit instead of 23 as it's MSB is implicitly set to 1.
As you can see there are only integers and bitshift. So if you are representing natural number up to 2^24 you are without rounding completely. Fro bigger numbers binary zero padding occurs from the right that causes the difference.
In case of digits after decimal points the zero padding occurs from the left. But there is another problem as in binary you can not store some decadic numbers exactly. For example:
0.3 dec = 0.100110011001100110011001100110011001100... bin
0.25 dec = 0.01 bin
As you can see the sequence of 0.3 dec in binary is infinite (like we can not write 1/3 in decadic) hence if crop it to only 24 bits you lose the rest and the number is not what you want anymore.
If you compare 0.3 and 0.125 the 0.125 is exact and 0.3 is not but 0.125 is much smaller than 0.3. So your measure is not correct unless explored more very close values that will cover the rounding steps and computing the max difference from such set. For example you could compare
1.0000001f
1.0000002f
1.0000003f
1.0000004f
1.0000005f
1.0000006f
1.0000007f
1.0000008f
1.0000009f
and remember the max difference of fabs(x-round(x)) and than do the same for
100000001
100000002
100000003
100000004
100000005
100000006
100000007
100000008
100000009
And then compare the two differences.
On top of all this you are missing one very important thing. And that is the errors while converting from text to binary and back which are usually even bigger. First of all try to print your numbers without rounding (for example force to print 20 decimal digits after decimal point).
Also the numbers are stored in binary base so in order to print them you need to convert to decadic base which involves multiplication and division by 10. The more bits are missing (zero pad) from the number the bigger the print errors are. To be as precise as you can a trick is used and that is to print the number in hex (no rounding errors) and then convert the hex string itself to decadic base on integer math. That is much more accurate then naive floating point prints. for more info see related QAs:
my best attempt to print 32 bit floats with least rounding errors (integer math only)
How do libraries/programming languages convert floats to strings
How do I convert a very long binary number to decimal?
Now to get back to number of "precise" digits represented by float. For integer part of number is that easy:
dec_digits = floor(log10(2^24)) = floor(7.22) = 7
However for digits after decimal point is this not as precise (for first few decadic digits) as there are a lot rounding going on. For more info see:
How do you print the EXACT value of a floating point number?
I think what they mean in their documentation is that depending on the number that the precision ranges from 6 to 9 decimal places. Go by the standard that is explained on the page you linked, sometimes Microsoft are a bit lazy when it comes to documentation, like the rest of us.
The problem with floating point is that it is inaccurate. If you put the number 1.05 into the site in your link you will notice that it cannot be accurately stored in floating point. It's actually stored as 1.0499999523162841796875. It's stored this way to do calculations faster. It's not great for money, e.g. what if your item is priced at $1.05 and you sell a billion of them.
The smallNumber is more precise than big
Incorrect compare. The other number has more significant digits.
1.0000001f is attempting N digits of decimal precision.
100000001f attempts N+1.
I have a problem understanding the precision of float type.
To best understand float precision, think binary. Use "%a" for printing with a C99 or later compiler.
float is stored base 2. The significand is a Dyadic rational, some integer/power-of-2.
float commonly has 24 bits of binary precision. (23-bit explicitly encoded, 1 implied)
Between [1.0 ... 2.0), there are 223 different float values.
Between [2.0 ... 4.0), there are 223 different float values.
Between [4.0 ... 8.0), there are 223 different float values.
...
The possible values of a float are not distributed uniformly among powers-of-10. The grouping of float values to power-of-10 (decimal precision) results in the wobbling 6 to 9 decimal digits of precision.
How to calculate float type precision?
To find the difference between subsequent float values, since C99, use nextafterf()
Illustrative code:
#include<math.h>
#include<stdio.h>
void foooo(float b) {
float a = nextafterf(b, 0);
float c = nextafterf(b, b * 2.0f);
printf("%-15a %.9e\n", a, a);
printf("%-15a %.9e\n", b, b);
printf("%-15a %.9e\n", c, c);
printf("Local decimal precision %.2f digits\n", 1.0 - log10((c - b) / b));
}
int main(void) {
foooo(1.0000001f);
foooo(100000001.0f);
return 0;
}
Output
0x1p+0 1.000000000e+00
0x1.000002p+0 1.000000119e+00
0x1.000004p+0 1.000000238e+00
Local decimal precision 7.92 digits
0x1.7d783ep+26 9.999999200e+07
0x1.7d784p+26 1.000000000e+08
0x1.7d7842p+26 1.000000080e+08
Local decimal precision 8.10 digits

c# print float values with more precision

I want to print floats with greater precision than is the default.
For example when I print the value of PI from a float i get 6 decimals. But if I copy the same value from the float into a double and print it i get 14 decimals.
Why do I get more precision when printing the same value but as a double?
How can I get Console.WriteLine() to output more decimals when printing floats without needing to copy it into a double first?
I also tried the 0.0000000000 but it did not write with more precision, it just added more zeroes. :-/
My test code:
float x = (float)Math.PI;
double xx = x;
Console.WriteLine(x);
Console.WriteLine(xx);
Console.WriteLine($"{x,18:0.0000000000}"); // Try to force more precision
Console.WriteLine($"{xx,18:0.0000000000}");
Output:
3,141593
3,14159274101257
3,1415930000 <-- The program just added more zeroes :-(
3,1415927410
I also tried to enter PI at https://www.h-schmidt.net/FloatConverter/IEEE754.html
The binary float representation of PI is 0b01000000010010010000111111011011
So the value is: 2 * 0b1.10010010000111111011011 = 2 * 1.57079637050628662109375 = 3.1415927410125732421875
So there are more decimals to output. How would I get C# to output this whole value?
There is no more precision in a float. When you convert it to a double, the accuracy is worse, even though the precision (number of digits) increased - basically, all those extra digits you see in the double print are conversion artifacts - they are wrong, just the best representation of that given float number you can get in a double. This has everything to do with how binary floating point numbers work.
Let's look at the binary representation of the float:
01000000010010010000111111011100
The first bit is sign, the next eight are the exponent, and the rest is the mantissa. What do we get when we cast it to a double? The exponent stays the same, but the mantissa is filled in with zeroes. But that actually changes the number - you get 3.1415929794311523 (rounded, as always with binary floats) instead of the correct 3.14159265358979 double value of pi. You get the illusion of greater precision, but it's only an illusion - the number is no more accurate than before, you just replaced zeroes with noise.
There's no 1:1 mapping between floats and doubles. The same float value can be represented by many different double values, and the decimal representation of the number can change accordingly. Every cast of float to double should be followed by a rounding operation if you care about decimal precision. Consider this code instead:
double.Parse(((float)Math.PI).ToString())
Instead of casting the float to a double, I first changed it to a decimal representation (in a string), and created the double from that. Now instead of having a "truncated" double, you have a proper double that doesn't lie about extra precision; when you print it out, you get 3.1415930000. Still rounded, of course, since it's still a binary->decimal conversion, but no longer pretending to have more precision than is actually there - the rounding happens at a later digit than in the float version, the zeroes are really zeroes, except for the last one (which is only approximately zero).
If you want real decimal precision, you might want to use a decimal type (e.g. int or decimal). float and double are both binary numbers, and only have binary precision. A finite decimal number isn't necessarily finite in binary, just like a finite trinary number isn't necessarily finite in decimal (e.g. 1/3 doesn't have a finite decimal representation).
A float within c# has a precision of 7 digits and no more. That means 1 digit before the decimal and 6 after.
If you do have any more digits in your output, they might be entirely wrong.
If you do need more digits, you have to use either double which has 15-16 digits precision or decimal which has 28-29 digits.
See MSDN for reference.
You can easily verify that as the digits of PI are a little different from your output. The correct first 40 digits are: 3.14159 26535 89793 23846 26433 83279 50288 4197
To print the value with 6 places after the decimal.
Console.WriteLine("{0:N6}", (double)2/3);
Output :
0.666667

Why does Convert.ToDecimal(Double) round to 15 significant figures?

I have a double with 17 digits after the decimal point, i.e.:
double myDouble = 0.12345678901234567;
If I convert this to a decimal like this:
decimal myDecimal = Convert.ToDecimal(myDouble);
then the value of myDecimal is rounded, as per the Convert.ToDecimal documentation, to 15 digits (i.e. 0.0123456789012345). My question is, why is this rounding performed?
I understand that if my original number could be accurately represented in base 10 and I was trying to store it as a double, then we could only have confidence in the first 15 digits. The final two digits would be subject to rounding error. But, that's a base 10 biased point of view. My number may be more accurately represented by a double and I wish to convert it to decimal while preserving as much accuracy as possible.
Shouldn't Convert.ToDecimal aim to minimise the difference between myDouble and (double)Convert.ToDecimal(myDouble)?
From the documentation of Double:
A Double value has up to 15 decimal digits of precision, although a
maximum of 17 digits is maintained internally
So, as the double value itself has a maximum of 15 decimal places, converting it to Decimal will result in a Decimal value with 15 significant figures.
The behavior of the rounding guarantees that conversion of any Decimal which has at most fifteen significant figures to double and back to Decimal will yield the original value unchanged. If values were rounded to sixteen figures rather than fifteen, such a guarantee would not only fail to hold for number with sixteen figures, but it wouldn't even hold for much shorter values. For example, the closest Double value to 9000.04 is approximately 9000.040000000000873115; rounding that to sixteen figures would yield 9000.040000000001.
The choice of rounding one should use depends upon whether regards the best Decimal value of double value 9000.04 as being 9000.04m, 9000.040000000001m, 9000.0400000000008731m, or perhaps something else. Microsoft probably decided that any representation other than 9000.04m would be confusing.
The following is from the documentation of the method in question.
http://msdn.microsoft.com/en-us/library/a69w9ca0(v=vs.110).aspx
"The Decimal value returned by this method contains a maximum of 15 significant digits. If the value parameter contains more than 15 significant digits, it is rounded using rounding to nearest.
Every terminating binary fraction is exactly representable as a decimal fraction, so the minimum possible difference for a finite number is always 0. The IEEE 754 64-bit representation of your number is exactly equal to 0.1234567890123456634920984242853592149913311004638671875
Every conversion from binary floating point to decimal or decimal string must embody some decision about rounding. In theory, they could preserve all the digits, but that would result in some very long outputs with most of the digits having little meaning.
One option, used in Java's Double.toString, is to stop at the shortest decimal representation that would convert back to the original double.
Most set some fairly arbitrary limit. 15 significant digits preserves most meaningful digits.

Why do float and int have such different maximum values even though they're the same number of bits? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
what the difference between the float and integer data type when the size is same in java?
As you probably know, both of these types are 32-bits.int can hold only integer numbers, whereas float also supports floating point numbers (as the type names suggest).
How is it possible then that the max value of int is 231, and the max value of float is 3.4*1038, while both of them are 32 bits?
I think that int's max value capacity should be higher than the float because it doesn't save memory for the floating number and accepts only integer numbers. I'll be glad for an explanation in that case.
Your intuition quite rightly tells you that there can be no more information content in one than the other, because they both have 32 bits. But that doesn't mean we can't use those bits to represent different values.
Suppose I invent two new datatypes, uint4 and foo4. uint4 uses 4 bits to represent an integer, in the standard binary representation, so we have
bits value
0000 0
0001 1
0010 2
...
1111 15
But foo4 uses 4 bits to represent these values:
bits value
0000 0
0001 42
0010 -97
0011 1
...
1110 pi
1111 e
Now foo4 has a much wider range of values than uint4, despite having the same number of bits! How? Because there are some uint4 values that can't be represented by foo4, so those 'slots' in the bit mapping are available for other values.
It is the same for int and float - they can both store values from a set of 232 values, just different sets of 232 values.
A float might store a higher number value, but it will not be precise even on digits before the decimal dot.
Consider the following example:
float a = 123456789012345678901234567890f; //30 digits
Console.WriteLine(a); // 1.234568E+29
Notice that barely any precision is kept.
An integer on the other hand will always precisely store any number within its range of values.
For the sake of comparison, let's look at a double precision floating point number:
double a = 123456789012345678901234567890d; //30 digits
Console.WriteLine(a); // 1.23456789012346E+29
Notice that roughly twice as much significant digits are preserved.
These are based on IEEE754 floating point specification, that is why it is possible. Please read this documentation. It is not just about how many bits.
The hint is in the "floating" part of "floating point". What you say basically assumes fixed point. A floating point number does not "reserve space" for the digits after the decimal point - it has a limited number of digits (23 binary) and remembers what power of two to multiply it by.

Categories