I have a float for totalPoints which is being automatically increased by the Timer, but after totalPoints reaches certain number, it doesn't seem to increase no more. I did not put any limits so I'm not sure why is it happening.
So, totalPoints stops increasing when it reaches "2097152" value.
Here's part of my code:
public float totalPoins;
void AccumulatePoints()
{
timer += Time.deltaTime * miningPowerValue;
if (timer > 1f)
{
totalPoints += someValue;
timer = 0;
}
}
So basically it accumulates points depending on miningPowerValue. If its low, it will accumulate on a slower rate, higher - faster. Please help. I'm confused.
Floating-point numbers get less precise as they get bigger. You've ran into a case where they're big enough that the amount you're trying to add is smaller than the smallest possible difference between numbers of that size. Switch to double or something else with more precision.
To expand on Joseph Sible's answer in a way that I can't simply through a comment.
A single-precision floating-point value is a method of specifying a floating-point value, and in most programming languages is defined by the specification of IEEE 754. We can assume that C#'s floating-point values conform to IEEE 754. Floating point values are given a certain amount of bytes, which they use in three chunks: The sign (positive or negative), the exponent ('y' in x * 10^y), and the value ('x' in x * 10^y). (Note: The definition of these three chunks is a little more elaborate than stated here, but for this example, it suffices).
In other words, a floating point value would encode the value "2,000.00" as "positive, 3, 2". That is, positive 2 * 10^3, or 2 * 1000, or 2,000.
This means that a floating point value can represent a very large value, but it also means that very large values tend to encounter errors. At a certain point, floats need to start "lopping off" the end of their information, in order to get the best approximation for the space they have.
For example, assume we were trying to use a very small version of a floating point value to define the value 1,234,567, and it only has the space for 4 digits in its x. 1,234,567 would become "positive, 6, 1.234"; 1.234 + 10 ^ 6, which calculates to 1.234 * 1,000,000, which calculates to 1,234,000 - the last three digits are removed, because they are the "least significant" for the purposes of estimation. Assume we store this value to the variable Foo.
Now, say you try to add 1 to this value, via Foo += 1. The value would become 1,234,001. But since our floating-point value can only store the 4 largest digits, that 1 gets ignored. 1,234,000 is a good enough approximation for 1,234,001. Even if you added 1 a thousand times, it wouldn't change anything because it gets rounded off every time you add. Meanwhile, adding 1,000 directly would, in fact, have an effect.
This is what is happening in your code, on a much larger scale, and what Joseph Sible was trying to convey. This also explains why totalPoints += 0.5f will work when 0.1136f wont - 0.5 is a larger number than 0.1136, which means that 0.1136 stops being "significant" before 0.5 does. He recommended you use double, which is more precise. A double-precision floating-point value is similar to a single-precision floating point value, but considerably larger; that is, a double can store more bits than a float (about twice as many, in fact), so smaller numbers won't be lost as easily. This would solve your problem, at least up until the point where double needs to start lopping off small numbers again!
The problem is that you are exceeding the precision of a `float' which is only 7 digits.
Look at the following simple code:
void Main()
{
float f = 0.0f;
float i = 0.1136f;
int j = 0;
while (f <= 3000000.0f) //I arbitrarily chose 3 million here for illustration
{
f += i;
//Only print after every 1000th iteration of the loop
if (j % 1000 == 0)
{
Console.WriteLine(f);
}
j++;
}
}
At the beginning of this code you will see values like the following:
0.1136
113.7148
227.3164
340.9067
454.4931
568.0796
Then, after a bit, the decimal portion starts shrinking as the whole number portion grows:
9430.525
9543.807
9657.088
9770.369
9883.65
9996.932
10110.21
10223.49
10336.78
10450.06
10563.34
But as the value gets higher and higher, it eventually just outputs this:
999657.9
999782.9
999907.9
1000033
1000158
1000283
1000408
1000533
1000658
1000783
Notice how the part after decimal shrinks but the part before it increases? Since the float data type only has 7 digits of precision, the least significant digits after the decimal point are chopped off. double on the other hand has 16 digits of precision.
I hope this helps to explain what is happening.
Related
I have a problem understanding the precision of float type.
The msdn writes that precision from 6 to 9 digits. But I note that precision depends from on the size of the number:
float smallNumber = 1.0000001f;
Console.WriteLine(smallNumber); // 1.0000001
bigNumber = 100000001f;
Console.WriteLine(bigNumber); // 100000000
The smallNumber is more precise than big, I understand IEEE754, but I don't understand how MSDN calculate precision, and does it make sense?
Also, you can play with the representation of numbers in float format here. Please write 100000000 value in "You entered" input and click "+1" on the right. Then change the input's value to 1, and click "+1" again. You may see the difference in precision.
The MSDN documentation is nonsensical and wrong.
Bad concept. Binary-floating-point format does not have any precision in decimal digits because it has no decimal digits at all. It represents numbers with a sign, a fixed number of binary digits (bits), and an exponent for a power of two.
Wrong on the high end. The floating-point format represents many numbers exactly, with infinite precision. For example, “3” is represented exactly. You can write it in decimal arbitrarily far, 3.0000000000…, and all of the decimal digits will be correct. Another example is 1.40129846432481707092372958328991613128026194187651577175706828388979108268586060148663818836212158203125•10−45. This number has 105 significant digits in decimal, but the float format represents it exactly (it is 2−149).
Wrong on the low end. When “999999.97” is converted from decimal to float, the result is 1,000,000. So not even one decimal digit is correct.
Not a measure of accuracy. Because the float significand has 24 bits, the resolution of its lowest bit is about 223 times finer than the resolution of its highest bit. This is about 6.9 digits in the sense that log10223 is about 6.9. But that just tells us the resolution—the coarseness—of the representation. When we convert a number to the float format, we get a result that differs from the number by at most ½ of this resolution, because we round to the nearest representable value. So a conversion to float has a relative error of at most 1 part in 224, which corresponds to about 7.2 digits in the above sense. If we are using digits to measure resolution, then we say the resolution is about 7.2 digits, not that it is 6-9 digits.
Where do these numbers came from?
So, if “~6-9 digits” is not a correct concept, does not come from actual bounds on the digits, and does not measure accuracy, where does it come from? We cannot be sure, but 6 and 9 do appear in two descriptions of the float format.
6 is the largest number x for which this is guaranteed:
If any decimal numeral with at most x significant digits is within the normal exponent bounds of the float format and is converted to the nearest value represented in the format, then, when the result is converted to the nearest decimal numeral with at most x significant digits, the result of that conversion equals the original number.
So it is reasonable to say float can preserve at least six decimal digits. However, as we will see, there is no bound involving nine digits.
9 is the smallest number x that guarantees this:
If any finite float number is converted to the nearest decimal numeral with x digits, then, when the result is converted to the nearest value representable in float, the result of that conversion equals the original number.
As an analogy, if float is a container, then the largest “decimal container” guaranteed to fit inside it is six digits, and the smallest “decimal container” guaranteed to hold it is nine digits. 6 and 9 are akin to interior and exterior measurements of the float container.
Suppose you had a block 7.2 units long, and you were looking at its placement on a line of bricks each 1 unit long. If you put the start of the block at the start of a brick, it will extend 7.2 bricks. However, if somebody else chooses where it starts, they might start it in the middle of a brick. Then it would cover part of that brick, all of the next 6 bricks, and and part of the last brick (e.g., .5 + 6 + .7 = 7.2). So a 7.2-unit block is only guaranteed to cover 6 bricks. Conversely, 8 bricks can cover the 7.2-unit block if you choose where they are placed. But if somebody else chooses where they start, the first might cover just .1 units of the block. Then you need 7 more and another fraction, so 9 bricks are needed.
The reason this analogy holds is that powers of two and powers of 10 are irregularly spaced relative to each other. 210 (1024) is near 103 (1000). 10 is the exponent used in the float format for numbers from 1024 (inclusive) to 2048 (exclusive). So this interval from 1024 to 2048 is like a block that has been placed just after the 100-1000 ends and the 1000-10,000 block starts.
But note that this property involving 9 digits is the exterior measurement—it is not a capability that float can perform or a service that it can provide. It is something that float needs (if it is to be held in a decimal format), not something it provides. So it is not a bound on how many digits a float can store.
Further Reading
For better understanding of floating-point arithmetic, consider studying the IEEE-754 Standard for Floating-Point Arithmetic or a good textbook like Handbook of Floating-Point Arithmetic by Jean-Michel Muller et al.
Yes number of digits before rounding errors is a measure of precision but you can not asses precision from just 2 numbers because you might be just closer or further from the rounding threshold.
To better understand the situation then you need to see how floats are represented.
The IEEE754 32bit floats are stored as:
bool(1bit sign) * integer(24bit mantisa) << integer(8bit exponent)
Yes mantissa is 24 bit instead of 23 as it's MSB is implicitly set to 1.
As you can see there are only integers and bitshift. So if you are representing natural number up to 2^24 you are without rounding completely. Fro bigger numbers binary zero padding occurs from the right that causes the difference.
In case of digits after decimal points the zero padding occurs from the left. But there is another problem as in binary you can not store some decadic numbers exactly. For example:
0.3 dec = 0.100110011001100110011001100110011001100... bin
0.25 dec = 0.01 bin
As you can see the sequence of 0.3 dec in binary is infinite (like we can not write 1/3 in decadic) hence if crop it to only 24 bits you lose the rest and the number is not what you want anymore.
If you compare 0.3 and 0.125 the 0.125 is exact and 0.3 is not but 0.125 is much smaller than 0.3. So your measure is not correct unless explored more very close values that will cover the rounding steps and computing the max difference from such set. For example you could compare
1.0000001f
1.0000002f
1.0000003f
1.0000004f
1.0000005f
1.0000006f
1.0000007f
1.0000008f
1.0000009f
and remember the max difference of fabs(x-round(x)) and than do the same for
100000001
100000002
100000003
100000004
100000005
100000006
100000007
100000008
100000009
And then compare the two differences.
On top of all this you are missing one very important thing. And that is the errors while converting from text to binary and back which are usually even bigger. First of all try to print your numbers without rounding (for example force to print 20 decimal digits after decimal point).
Also the numbers are stored in binary base so in order to print them you need to convert to decadic base which involves multiplication and division by 10. The more bits are missing (zero pad) from the number the bigger the print errors are. To be as precise as you can a trick is used and that is to print the number in hex (no rounding errors) and then convert the hex string itself to decadic base on integer math. That is much more accurate then naive floating point prints. for more info see related QAs:
my best attempt to print 32 bit floats with least rounding errors (integer math only)
How do libraries/programming languages convert floats to strings
How do I convert a very long binary number to decimal?
Now to get back to number of "precise" digits represented by float. For integer part of number is that easy:
dec_digits = floor(log10(2^24)) = floor(7.22) = 7
However for digits after decimal point is this not as precise (for first few decadic digits) as there are a lot rounding going on. For more info see:
How do you print the EXACT value of a floating point number?
I think what they mean in their documentation is that depending on the number that the precision ranges from 6 to 9 decimal places. Go by the standard that is explained on the page you linked, sometimes Microsoft are a bit lazy when it comes to documentation, like the rest of us.
The problem with floating point is that it is inaccurate. If you put the number 1.05 into the site in your link you will notice that it cannot be accurately stored in floating point. It's actually stored as 1.0499999523162841796875. It's stored this way to do calculations faster. It's not great for money, e.g. what if your item is priced at $1.05 and you sell a billion of them.
The smallNumber is more precise than big
Incorrect compare. The other number has more significant digits.
1.0000001f is attempting N digits of decimal precision.
100000001f attempts N+1.
I have a problem understanding the precision of float type.
To best understand float precision, think binary. Use "%a" for printing with a C99 or later compiler.
float is stored base 2. The significand is a Dyadic rational, some integer/power-of-2.
float commonly has 24 bits of binary precision. (23-bit explicitly encoded, 1 implied)
Between [1.0 ... 2.0), there are 223 different float values.
Between [2.0 ... 4.0), there are 223 different float values.
Between [4.0 ... 8.0), there are 223 different float values.
...
The possible values of a float are not distributed uniformly among powers-of-10. The grouping of float values to power-of-10 (decimal precision) results in the wobbling 6 to 9 decimal digits of precision.
How to calculate float type precision?
To find the difference between subsequent float values, since C99, use nextafterf()
Illustrative code:
#include<math.h>
#include<stdio.h>
void foooo(float b) {
float a = nextafterf(b, 0);
float c = nextafterf(b, b * 2.0f);
printf("%-15a %.9e\n", a, a);
printf("%-15a %.9e\n", b, b);
printf("%-15a %.9e\n", c, c);
printf("Local decimal precision %.2f digits\n", 1.0 - log10((c - b) / b));
}
int main(void) {
foooo(1.0000001f);
foooo(100000001.0f);
return 0;
}
Output
0x1p+0 1.000000000e+00
0x1.000002p+0 1.000000119e+00
0x1.000004p+0 1.000000238e+00
Local decimal precision 7.92 digits
0x1.7d783ep+26 9.999999200e+07
0x1.7d784p+26 1.000000000e+08
0x1.7d7842p+26 1.000000080e+08
Local decimal precision 8.10 digits
Assume I have an array of bytes which are truly random (e.g. captured from an entropy source).
byte[] myTrulyRandomBytes = MyEntropyHardwareEngine.GetBytes(8);
Now, I want to get a random double precision floating point value, but between the values of 0 and positive 1 (like the Random.NextDouble() function performs).
Simply passing an array of 8 random bytes into BitConverter.ToDouble() can yield strange results, but most importantly, the results will almost never be less than 1.
I am fine with bit-manipulation, but the formatting of floating point numbers has always been mysterious to me. I tried many combinations of bits to apply randomness to and always ended up finding the numbers were either just over 1, always VERY close to 0, or very large.
Can someone explain which bits should be made random in a double in order to make it random within the range 0 and 1?
Though working answers have been given, I'll give an other one, that looks worse but isn't:
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) / long.MaxValue;
The issue with casting from an ulong to double is that it's not directly supported by hardware, so it compiles to this:
vxorps xmm0,xmm0,xmm0
vcvtsi2sd xmm0,xmm0,rcx ; interpret ulong as long and convert it to double
test rcx,rcx ; add fixup if it was "negative"
jge 000000000000001D
vaddsd xmm0,xmm0,mmword ptr [00000060h]
vdivsd xmm0,xmm0,mmword ptr [00000068h]
Whereas with my suggestion it will compile more nicely:
vxorps xmm0,xmm0,xmm0
vcvtsi2sd xmm0,xmm0,rcx
vdivsd xmm0,xmm0,mmword ptr [00000060h]
Both tested with the x64 JIT in .NET 4, but this applies in general, there just isn't a nice way to convert an ulong to a double.
Don't worry about the bit of entropy being lost: there are only 262 doubles between 0.0 and 1.0 in the first place, and most of the smaller doubles cannot be chosen so the number of possible results is even less.
Note that this as well as the presented ulong examples can result in exactly 1.0 and distribute the values with slightly differing gaps between adjacent results because they don't divide by a power of two. You can change them exclude 1.0 and get a slightly more uniform spacing (but see the first plot below, there is a bunch of different gaps, but this way it is very regular) like this:
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) / ((double)long.MaxValue + 1);
As a really nice bonus, you can now change the division to a multiplication (powers of two usually have inverses)
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) * 1.08420217248550443400745280086994171142578125E-19;
Same idea for ulong, if you really want to use that.
Since you also seemed interested specifically in how to do it with double-bits trickery, I can show that too.
Because of the whole significand/exponent deal, it can't really be done in a super direct way (just reinterpreting the bits and that's it), mainly because choosing the exponent uniformly spells trouble (with a uniform exponent, the numbers are necessarily clumped preferentially near 0 since most exponents are there).
But if the exponent is fixed, it's easy to make a double that's uniform in that region. That cannot be 0 to 1 because that spans a lot of exponents, but it can be 1 to 2 and then we can subtract 1.
So first mask away the bits that won't be part of the significand:
x &= (1L << 52) - 1;
Put in the exponent (1.0 - 2.0 range, excluding 2)
x |= 0x3ff0000000000000;
Reinterpret and adjust for the offset of 1:
return BitConverter.Int64BitsToDouble(x) - 1;
Should be pretty fast, too. An unfortunate side effect is that this time it really does cost a bit of entropy, because there are only 52 but there could have been 53. This way always leaves the least significant bit zero (the implicit bit steals a bit).
There were some concerns about the distributions, which I will address now.
The approach of choosing a random (u)long and dividing it by the maximum value clearly has a uniformly chosen (u)long, and what happens after that is actually interesting. The result can justifiably be called a uniform distribution, but if you look at it as a discrete distribution (which it actually is) it looks (qualitatively) like this: (all examples for minifloats)
Ignore the "thicker" lines and wider gaps, that's just the histogram being funny. These plots used division by a power of two, so there is no spacing problem in reality, it's only plotted strangely.
Top is what happens when you use too many bits, as happens when dividing a complete (u)long by its max value. This gives the lower floats a better resolution, but lots of different (u)longs get mapped onto the same float in the higher regions. That's not necessarily a bad thing, if you "zoom out" the density is the same everywhere.
The bottom is what happens when the resolution is limited to the worst case (0.5 to 1.0 region) everywhere, which you can do by limiting the number of bits first and then doing the "scale the integer" deal. My second suggesting with the bit hacks does not achieve this, it's limited to half that resolution.
For what it's worth, NextDouble in System.Random scales a non-negative int into the 0.0 .. 1.0 range. The resolution of that is obviously a lot lower than it could be. It also uses an int that cannot be int.MaxValue and therefore scales by approximately 1/(231-1) (cannot be represented by a double, so slightly rounded), so there are actually 33 slightly different gaps between adjacent possible results, though the majority of the gaps is the same distance.
Since int.MaxValue is small compared to what can be brute-forced these days, you can easily generate all possible results of NextDouble and examine them, for example I ran this:
const double scale = 4.6566128752458E-10;
double prev = 0;
Dictionary<long, int> hist = new Dictionary<long, int>();
for (int i = 0; i < int.MaxValue; i++)
{
long bits = BitConverter.DoubleToInt64Bits(i * scale - prev);
if (!hist.ContainsKey(bits))
hist[bits] = 1;
else
hist[bits]++;
prev = i * scale;
if ((i & 0xFFFFFF) == 0)
Console.WriteLine("{0:0.00}%", 100.0 * i / int.MaxValue);
}
This is easier than you think; its all about scaling (also true when going from a 0-1 range to some other range).
Basically, if you know that you have 64 truly random bits (8 bytes) then just do this:
double zeroToOneDouble = (double)(BitConverter.ToUInt64(bytes) / (decimal)ulong.MaxValue);
The trouble with this kind of algorithm comes when your "random" bits aren't actually uniformally random. That's when you need a specialized algorithm, such as a Mersenne Twister.
I don't know wether it's the best solution for this, but it should do the job:
ulong asLong = BitConverter.ToUInt64(myTrulyRandomBytes, 0);
double number = (double)asLong / ulong.MaxValue;
All I'm doing is converting the byte array to a ulong which is then divided by it's max value, so that the result is between 0 and 1.
To make sure the long value is within the range from 0 to 1, you can apply the following mask:
long longValue = BitConverter.ToInt64(myTrulyRandomBytes, 0);
longValue &= 0x3fefffffffffffff;
The resulting value is guaranteed to lay in the range [0, 1).
Remark. The 0x3fefffffffffffff value is very-very close to 1 and will be printed as 1, but it is really a bit less than 1.
If you want to make the generated values greater, you could set a number higher bits of an exponent to 1. For instance:
longValue |= 0x03c00000000000000;
Summarizing: example on dotnetfiddle.
If you care about the quality of the random numbers generated, be very suspicious of the answers that have appeared so far.
Those answers that use Int64BitsToDouble directly will definitely have problems with NaNs and infinities. For example, 0x7ff0000000000001, a perfectly good random bit pattern, converts to NaN (and so do thousands of others).
Those that try to convert to a ulong and then scale, or convert to a double after ensuring that various bit-pattern constraints are met, won't have NaN problems, but they are very likely to have distributional problems. Representable floating point numbers are not distributed uniformly over (0, 1), so any scheme that randomly picks among all representable values will not produce values with the required uniformity.
To be safe, just use ToInt32 and use that int as a seed for Random. (To be extra safe, reject 0.) This won't be as fast as the other schemes, but it will be much safer. A lot of research and effort has gone into making RNGs good in ways that are not immediately obvious.
Simple piece of code to print the bits out for you.
for (double i = 0; i < 1.0; i+=0.05)
{
var doubleToInt64Bits = BitConverter.DoubleToInt64Bits(i);
Console.WriteLine("{0}:\t{1}", i, Convert.ToString(doubleToInt64Bits, 2));
}
0.05: 11111110101001100110011001100110011001100110011001100110011010
0.1: 11111110111001100110011001100110011001100110011001100110011010
0.15: 11111111000011001100110011001100110011001100110011001100110100
0.2: 11111111001001100110011001100110011001100110011001100110011010
0.25: 11111111010000000000000000000000000000000000000000000000000000
0.3: 11111111010011001100110011001100110011001100110011001100110011
0.35: 11111111010110011001100110011001100110011001100110011001100110
0.4: 11111111011001100110011001100110011001100110011001100110011001
0.45: 11111111011100110011001100110011001100110011001100110011001100
0.5: 11111111011111111111111111111111111111111111111111111111111111
0.55: 11111111100001100110011001100110011001100110011001100110011001
0.6: 11111111100011001100110011001100110011001100110011001100110011
0.65: 11111111100100110011001100110011001100110011001100110011001101
0.7: 11111111100110011001100110011001100110011001100110011001100111
0.75: 11111111101000000000000000000000000000000000000000000000000001
0.8: 11111111101001100110011001100110011001100110011001100110011011
0.85: 11111111101011001100110011001100110011001100110011001100110101
0.9: 11111111101100110011001100110011001100110011001100110011001111
0.95: 11111111101110011001100110011001100110011001100110011001101001
Background:
I have an opc tag whose value is a 32 bit floating point number. The tag represents a running tally - so at some point the total will hit the max value and start again at 0. An additional bit of information about the tag is that it's data type is single float but it's value is always a whole number ( I didn't create the tag so I'm not sure why float was chosen ).
In my application when a certain event is fired I record the current value of this running total and when another event is fired I take the current value of the total and then need to calculate the difference.
The only issue I have is that if the value were to be near max value when I take the initial reading, and then rollover to 0+ before I take the second reading I will end up with a negative value for the difference.
I've got this so far (it's in c#):
float diff (float a, float b) {
if (b < a) {
float f = float.MaxValue - a;
return f + b;
}
return b - a;
}
but this doesn't work since unless a is huge, float.MaxValue - a == 0. Does anyone know how I might work this out?
Floats don't overflow the same way integers do.
If you add two floats that should add to a number larger than float.MaxValue, the result will be float.PositiveInfinity, not a large negative number as with signed integers or a small positive number as with unsigned integers.
But before that happens, you will most likely run into a different problem: If you add a small number to a large number, the result might be the large number, unchanged. So, if the score always increases by some (relatively) small number, you will most likely find that, after a certain point, the score doesn't change anymore.
Because of all this, your “roll-over” will never happen and so maybe you don't have to do anything. But even better solution would be to use an integer type to represent integers, thus avoiding all those pitfalls.
I have a double value that equals 1.212E+25
When I throw it out to text I do myVar.ToString("0000000000000000000000")
The problem is even if I do myVar++ 3 or 4 times the value seems to stay the same.
Why is that?
That is because the precision of a double is not sufficient. It simply can't hold that many significant digits.
It will not fit into a long, but probably into a Decimal.
But... do you really need this level of precision?
To expand on the other answer the smallest increase you can make to a double is one Unit in the Last Place, or ULP, as double is a floating point type then the size of an ULP changes, at 1E+25 it will be about 1E+10.
as you can see compared to 1E+10 incrementing by 1 really might as well be adding nothing. which is exactly what double will do, so it wouldnt matter if you tried it 10^25 times it still won't increase unless you try to increase by at least 1 ULP
if incrementing by an ULP is useful you can do this by casting the bits to long and back here is a quick extension method to do that
public static double UlpChange(this double val, int ulp)
{
if (!double.IsInfinity(val) && !double.IsNaN(val))
{
//should probably do something if we are at max or min values
//but its not clear what
long bits = BitConverter.DoubleToInt64Bits(val);
return BitConverter.Int64BitsToDouble(bits + ulp);
}
return val;
}
double (Double) holds about 16 digits of precision and long (Int64) about 18 digits.
Neither of these appear to have sufficient precision for your needs.
However decimal (Decimal) holds up to 30 digits of precision. Although this appears to be great enough for your needs I'd recommend caution in case your requirement grows even larger. In that case you may need a third party numeric library.
Related StackOverflow entries are:
How can I represent a very large integer in .NET?
Big integers in C#
You may want to read the Floating-Point Guide to understand how doubles work.
Basically, a double only has about 16 decimal digits of precision. At a magnitude of 10^25, an increase of 1.0 is below the threshold of precision and gets lost. Due to the binary representation, this may not be obvious.
The smallest increment that'll work is 2^30+1 which will actually increment the double by 2^31. You can test this kind of thing easily enough with LINQPad:
double inc = 1.0;
double num = 1.212e25;
while(num+inc == num) inc*=2;
inc.Dump(); //2147483648 == 2^31
(num+inc == num).Dump(); //false due to loop invariant
(num+(inc/2.0) == num).Dump();//true due to loop invariant
(num+(inc/2.0+1.0) == num).Dump();//false - 2^30+1 suffices to change the number
(num+(inc/2.0+1.0) == num + inc).Dump();//true - 2^30+1 and 2^31 are equiv. increments
((num+(inc/2.0+1.0)) - num == inc ).Dump();//true - the effective increment is 2^31
Since a double is essentially a binary number with limited precision, that means that the smallest possible increment will itself always be a power of two (this increment can be determined directly from the bit-pattern of the double, but it's probably clearer to do so with a loop as above since that's portable across float, double and other floating point representations (which don't exist in .NET).
One error I stumble upon every few month is this one:
double x = 19.08;
double y = 2.01;
double result = 21.09;
if (x + y == result)
{
MessageBox.Show("x equals y");
}
else
{
MessageBox.Show("that shouldn't happen!"); // <-- this code fires
}
You would suppose the code to display "x equals y" but that's not the case.
The short explanation is that the decimal places are, represented as a binary digit, do not fit into double.
Example:
2.625 would look like:
10.101
because
1-------0-------1---------0----------1
1 * 2 + 0 * 1 + 1 * 0.5 + 0 * 0.25 + 1 * 0,125 = 2.65
And some values (like the result of 19.08 plus 2.01) cannot be be represented with the bits of a double.
One solution is to use a constant:
double x = 19.08;
double y = 2.01;
double result = 21.09;
double EPSILON = 10E-10;
if ( x + y - result < EPSILON )
{
MessageBox.Show("x equals y"); // <-- this code fires
}
else
{
MessageBox.Show("that shouldn't happen!");
}
If I use decimal instead of double in the first example, the result is "x equals y".
But I'm asking myself If this is because of "decimal" type is not vulnerable of this behaviour or it just works in this case because the values "fit" into 128 bit.
Maybe someone has a better solution than using a constant?
Btw. this is not a dotNet/C# problem, it happens in most programming languages I think.
Decimal will be accurate so long as you stay within values which are naturally decimals in an appropriate range. So if you just add and subtract, for example, without doing anything which would skew the range of digits required too much (adding a very very big number to a very very small number) you will end up with easily comparable results. Multiplication is likely to be okay too, but I suspect it's easier to get inaccuracies with it.
As soon as you start dividing, that's where the problems can come - particularly if you start dividing by numbers which include prime factors other than 2 or 5.
Bottom line: it's safe in certain situations, but you really need to have a good handle on exactly what operations you'll be performing.
Note that it's not the 128-bitness of decimal which is helping you here - it's the representation of numbers as floating decimal point values rather than floating binary point values. See my articles on .NET binary floating point and decimal floating point for more information.
System.Decimal is just a floating point number with a different base so, in theory, it is still vulnerable to the sort of error you point out. I think you just happened on a case where rounding doesn't happen. More information here.
Yes, the .NET System.Double structure is subject to the problem you describe.
from http://msdn.microsoft.com/en-us/library/system.double.epsilon.aspx:
Two apparently equivalent floating-point numbers might not compare equal because of differences in their least significant digits. For example, the C# expression, (double)1/3 == (double)0.33333, does not compare equal because the division operation on the left side has maximum precision while the constant on the right side is precise only to the specified digits. If you create a custom algorithm that determines whether two floating-point numbers can be considered equal, you must use a value that is greater than the Epsilon constant to establish the acceptable absolute margin of difference for the two values to be considered equal. (Typically, that margin of difference is many times greater than Epsilon.)