Can C# store more precise data than doubles? - c#

A double in C# doesn't hold enough precision for my needs. I am writing a fractal program, and after zooming in a few times I run out of precision.
Is there a data type that can hold more precise floating-point information (i.e. more decimal places) than a double?

Yes, decimal is designed for just that.
However, do be aware that the range of the decimal type is smaller than that of a double. That is, double can hold a larger value, but it does so by losing precision. Or, as stated on MSDN:
The decimal keyword denotes a 128-bit data type. Compared to floating-point types, the decimal type has a greater precision and a smaller range, which makes it suitable for financial and monetary calculations. The approximate range and precision for the decimal type are shown in the following table.
The primary difference between decimal and double is the base: decimal is a base-10 floating-point type and double is a base-2 floating-point type. That means decimal stores decimal fractions such as 0.1 exactly, while double stores the nearest binary fraction and is less precise. A decimal is 128 bits, so it takes twice the space of a double to store. Calculations on decimal are also slower (measure!).
If you need even more precision, then BigInteger can be used from .NET 4 (you will need to handle decimal points yourself). Be aware that BigInteger is immutable, so every arithmetic operation on it creates a new instance; if the numbers are large, this might cripple performance.
I suggest you look into exactly how precise you need to be. Perhaps your algorithm can work with normalized values that are smaller? If performance is an issue, one of the built-in floating-point types is likely to be faster.
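To make "handle decimal points yourself" concrete, here is a minimal sketch of fixed-point arithmetic on top of BigInteger. The FixedPoint class, its method names, and the choice of 50 decimal places are all assumptions made for illustration; nothing like this ships with .NET.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: a value x is stored as the integer x * 10^Digits.
static class FixedPoint
{
    public const int Digits = 50;   // decimal places kept (an arbitrary choice)
    public static readonly BigInteger Scale = BigInteger.Pow(10, Digits);

    public static BigInteger FromInt(int n) => n * Scale;

    // Both operands share the same scale, so addition needs no adjustment.
    public static BigInteger Add(BigInteger a, BigInteger b) => a + b;

    // The raw product carries the scale twice, so divide one copy back out.
    public static BigInteger Mul(BigInteger a, BigInteger b) => a * b / Scale;

    // Scale the dividend first so the quotient keeps Digits decimal places.
    public static BigInteger Div(BigInteger a, BigInteger b) => a * Scale / b;

    public static string Format(BigInteger v)
    {
        BigInteger whole = BigInteger.DivRem(BigInteger.Abs(v), Scale, out BigInteger frac);
        string sign = v.Sign < 0 ? "-" : "";
        return sign + whole + "." + frac.ToString().PadLeft(Digits, '0');
    }

    static void Main()
    {
        BigInteger third = Div(FromInt(1), FromInt(3));
        Console.WriteLine(Format(third));   // 0. followed by fifty 3s
    }
}
```

Results are truncated rather than rounded here, and every operation allocates a new BigInteger, so this trades speed for digits; it is worth profiling before using it in an inner fractal loop.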

The .NET Framework 4 introduces the System.Numerics.BigInteger struct, which can hold arbitrarily large integers.

Check out BigInteger (.NET 4) if you need even more precision than Decimal gives you.

Related

C# decimal and double

From what I understand decimal is used for precision and is recommended for monetary calculations. Double gives better range, but less precision and is a lot faster than decimal.
What if I have time and rate, I feel like double is suited for time and decimal for rate. I can't mix the two and run calculations without casting which is yet another performance bottleneck. What's the best approach here? Just use decimal for time and rate?
Use double for both. decimal is for currency or other situations where the base-10 representation of the number is important. If you don't care about the base-10 representation of a number, don't use decimal. For things like time or rates of change of physical quantities, the base-10 representation generally doesn't matter, so decimal is not the most appropriate choice.
The important thing to realize is that decimal is still a floating-point type. It still suffers from rounding error and cannot represent certain "simple" numbers (such as 1/3). Its one advantage (and one purpose) is that it can represent decimal numbers with fewer than 29 significant digits exactly. That means numbers like 0.1 or 12345.6789. Basically any decimal you can write down on paper with fewer than 29 digits. If you have a repeating decimal or an irrational number, decimal offers no major benefits.
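A small sketch of that distinction, using only built-in types (the class and method names are made up for the example): 0.1 survives in decimal, while 1/3 does not.

```csharp
using System;

static class DecimalLimits
{
    // 0.1 is a short decimal fraction, so decimal stores it exactly.
    public static bool TenthIsExact()
    {
        decimal tenth = 0.1m;
        return tenth * 10 == 1m;
    }

    // 1/3 has no finite base-10 expansion, so decimal has to round it.
    public static bool ThirdIsExact()
    {
        decimal third = 1m / 3m;
        return third * 3 == 1m;
    }

    static void Main()
    {
        Console.WriteLine(TenthIsExact());   // True
        Console.WriteLine(ThirdIsExact());   // False
    }
}
```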
The rule of thumb is to use the type that is more suitable to the values you will handle. This means that you should use DateTime or TimeSpan for time, unless you only care about a specific unit, like seconds, days, etc., in which case you can use any integer type. Usually for time you need precision and don't want any error due to rounding, so I wouldn't use any floating point type like float or double.
For anything related to money, of course you don't want any rounding error either, so you should really use decimal here.
Finally, only if for some very specific requirements you need absolute speed in a calculation that is done millions of times and for which decimal happens not to be fast enough, only then I would think of using another faster type. I would first try with integer values (maybe multiplying your value by a power of 10 if you have decimals) and only divide by this power of 10 at the end. If this can't be done, only then I would think of using a double. Don't do a premature optimization if you are not sure it's needed.

How big is the precision loss converting long to double?

I have read in different posts on Stack Overflow and in the C# documentation that converting long (or any other data type representing a number) to double loses precision. This is quite obvious due to the representation of floating point numbers.
My question is, how big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X ?
The reason I would like to know this, is that I have to deal with a continuous counter which is a long. This value is read by my application as string, needs to be cast and has to be divided by e.g. 10 or some other small number and is then processed further.
Would decimal be more appropriate for this task?
converting long (or any other data type representing a number) to double loses precision. This is quite obvious due to the representation of floating point numbers.
This is less obvious than it seems, because precision loss depends on the value of the long. For values between -2^53 and 2^53 there is no precision loss at all, because a double has a 53-bit significand.
How big is the loss of precision if I convert a larger number to double? Do I have to expect differences larger than +/- X
For numbers with magnitude above 2^53 you will experience some precision loss, depending on how far above the 53-bit limit you go. If the absolute value of your long needs, say, 58 bits, then you lose 58-53=5 bits: representable values are 32 apart, so the converted value can be off by up to +/-16.
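A rough way to see the size of the error for a particular value is to convert it to double and back. The sketch below assumes the value is well inside the long range (converting a double near long.MaxValue back to long is not safe); RoundTripError is a name invented for the example.

```csharp
using System;

static class LongToDouble
{
    // Absolute difference between a long and the double it converts to.
    public static long RoundTripError(long value)
    {
        double d = (double)value;   // explicit cast forces rounding to 64-bit double
        return Math.Abs((long)d - value);
    }

    static void Main()
    {
        long exactLimit = 1L << 53;   // 9,007,199,254,740,992
        Console.WriteLine(RoundTripError(exactLimit));        // 0
        Console.WriteLine(RoundTripError(exactLimit + 1));    // 1
        Console.WriteLine(RoundTripError((1L << 57) + 15));   // 15: neighbours are 32 apart here
    }
}
```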
Would decimal be more appropriate for this task?
decimal has a different representation than double, and it uses a different base. Since you are planning to divide your number by "small numbers", different representations would give you different errors on division. Specifically, double will be better at handling division by powers of two (2, 4, 8, 16, etc.) because such division can be accomplished by subtracting from the exponent, without touching the mantissa. Similarly, large decimals would suffer no loss of significant digits when divided by ten, a hundred, etc.
long
long is a 64-bit integer type and can hold values from –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (max. 19 digits).
double
double is a 64-bit floating-point type that has a precision of 15 to 16 digits. So data can be lost once the magnitude of your numbers exceeds 2^53 = 9,007,199,254,740,992.
decimal
decimal is a 128-bit decimal type and can hold up to 28-29 digits. So it's always safe to cast long to decimal.
Recommendation
I would advise that you find out the exact expectations about the numbers you will be working with. Then you can make an informed decision in choosing the appropriate data type. Since you are reading your numbers from a string, isn't it possible that they will be even greater than 28 digits? In that case, none of the types listed will work for you, and instead you'll have to use some sort of a BigInt implementation.

How does decimal work?

I looked at decimal in C# but I wasn't 100% sure what it did.
Is it lossy? In C#, writing 1.0000000000001f+1.0000000000001f results in 2 when using float (double gets you 2.0000000000002, which is correct). Is it possible to add two things with decimal and not get the correct answer?
How many decimal places can I use? I see the MaxValue is 79228162514264337593543950335, but if I subtract 1, how many decimal places can I use?
Are there quirks I should know of? In C# it's 128 bits; in other languages, how many bits is it, and will it work the same way as C# decimal does (when adding, dividing, multiplying)?
What you're showing isn't decimal - it's float. They're very different types. f is the suffix for float, aka System.Single. m is the suffix for decimal, aka System.Decimal. It's not clear from your question whether you thought this was actually using decimal, or whether you were just using float to demonstrate your fears.
If you use 1.0000000000001m + 1.0000000000001m you'll get exactly the right value. Note that the double version wasn't able to express either of the individual values exactly, by the way.
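As a quick check of the suffix difference, here is a sketch with the same literal under the f and m suffixes (the class and method names are only for illustration):

```csharp
using System;

static class SuffixDemo
{
    // float keeps about 7 significant digits, so the literal is stored as exactly 1.
    public static float AsFloat()
    {
        float a = 1.0000000000001f;
        return a + a;
    }

    // decimal stores all 14 digits of the literal, so the sum is exact.
    public static decimal AsDecimal()
    {
        decimal a = 1.0000000000001m;
        return a + a;
    }

    static void Main()
    {
        Console.WriteLine(AsFloat() == 2f);                   // True
        Console.WriteLine(AsDecimal() == 2.0000000000002m);   // True
    }
}
```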
I have articles on both kinds of floating point in .NET, and you should read them thoroughly, along with other resources:
Binary floating point (float/double)
Decimal floating point (decimal)
All floating point types have their limits of course, but in particular you should not expect binary floating point to accurately represent decimal values such as 0.1. decimal still can't represent anything that isn't exactly representable in 28/29 decimal digits though, so if you divide 1 by 3, you won't get the exact answer of course.
You should also note that the range of decimal is considerably smaller than that of double. So while it can have 28-29 decimal digits of precision, you can't represent truly huge numbers (e.g. 10^200) or minuscule numbers (e.g. 10^-200).
Decimals in programming are (almost) never 100% accurate. Sometimes it's even better to multiply the decimal value by a very high number and then calculate, but only if you're sure, for example, that the value is always between 0 and 100 (so it won't go beyond the range of MaxValue).
Floating point is inherently imprecise. Some numbers can't be represented faithfully. Decimal is a large floating point type with high precision. If you look at the page on MSDN you can see there are "28-29 significant digits." The .NET Framework classes are language agnostic. They will work the same in every language that uses .NET.
edit (in response to Jon Skeet): If you initialize a Decimal with the numbers above, which have fewer than 28 digits after the decimal point, the number will be stored faithfully, because decimal keeps its digits in base 10. Some numbers, such as 0.1, are never exactly representable in the binary types (float and double) because they are a repeating sequence in binary.

How to safely convert from a double to a decimal in c#

We are storing financial data in a SQL Server database using the decimal data type and we need 6-8 digits of precision in the decimal. When we get this value back through our data access layer into our C# server, it is coming back as the decimal data type.
Due to some design constraints that are beyond my control, this needs to be converted. Converting to a string isn't a problem. Converting to a double is a problem because, as the MS documentation says, "[converting from decimal to double] can produce round-off errors because a double-precision floating-point number has fewer significant digits than a decimal."
As the double (or string) we can round to 2 decimal places after any calculations are done, so what is the "right" way to do the decimal conversion to ensure that we don't lose any precision before the rounding?
The conversion won't produce errors within the first 8 digits. double has 15-16 digits of precision - less than the 28-29 of decimal, but enough for your purposes by the sounds of it.
You should definitely put in place some sort of plan to avoid using double in the future, however - it's an unsuitable datatype for financial calculations.
If you round to 2dp, IMO the "right" way would be to store an integer that is the multiple, i.e. for 12.34 you store the integer 1234. No more double rounding woe.
If you must use double, this still works; all integers with magnitude up to 2^53 are guaranteed to be stored exactly in double, so still use the same trick.
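The integer trick could look roughly like this. The Cents helper and its rounding mode (half away from zero) are assumptions for the sketch; use whatever rounding rule your business rules require.

```csharp
using System;

// Hypothetical helper for carrying 2dp amounts as whole cents.
static class Cents
{
    public static long ToCents(decimal amount) =>
        (long)Math.Round(amount * 100m, MidpointRounding.AwayFromZero);

    public static decimal FromCents(long cents) => cents / 100m;

    static void Main()
    {
        long cents = ToCents(12.34m);
        Console.WriteLine(cents);                        // 1234

        // A double holds this integer exactly, so it survives the forced conversion.
        double asDouble = cents;
        Console.WriteLine((long)asDouble == cents);      // True
        Console.WriteLine(FromCents(cents) == 12.34m);   // True
    }
}
```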

When should I use double instead of decimal?

I can name three advantages to using double (or float) instead of decimal:
Uses less memory.
Faster because floating point math operations are natively supported by processors.
Can represent a larger range of numbers.
But these advantages seem to apply only to calculation intensive operations, such as those found in modeling software. Of course, doubles should not be used when precision is required, such as financial calculations. So are there any practical reasons to ever choose double (or float) instead of decimal in "normal" applications?
Edited to add:
Thanks for all the great responses, I learned from them.
One further question: A few people made the point that doubles can more precisely represent real numbers. When declared I would think that they usually more accurately represent them as well. But is it a true statement that the accuracy may decrease (sometimes significantly) when floating point operations are performed?
I think you've summarised the advantages quite well. You are however missing one point. The decimal type is only more accurate at representing base 10 numbers (e.g. those used in currency/financial calculations). In general, the double type is going to offer at least as great precision (someone correct me if I'm wrong) and definitely greater speed for arbitrary real numbers. The simple conclusion is: when considering which to use, always use double unless you need the base 10 accuracy that decimal offers.
Edit:
Regarding your additional question about the decrease in accuracy of floating-point numbers after operations, this is a slightly more subtle issue. Indeed, precision (I use the term interchangeably for accuracy here) will steadily decrease after each operation is performed. This is due to two reasons:
the fact that certain numbers (most obviously decimals) can't be truly represented in floating point form
rounding errors occur, just as if you were doing the calculation by hand. It depends greatly on the context (how many operations you're performing) whether these errors are significant enough to warrant much thought however.
In all cases, if you want to compare two floating-point numbers that should in theory be equivalent (but were arrived at using different calculations), you need to allow a certain degree of tolerance (how much varies, but is typically very small).
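One common shape for such a tolerance check is sketched below. The NearlyEqual name and the default tolerances are assumptions, not a standard API; suitable values depend on the magnitudes and the number of operations involved.

```csharp
using System;

static class Approx
{
    // True when a and b differ by no more than a relative tolerance,
    // with an absolute floor so values near zero can still compare equal.
    public static bool NearlyEqual(double a, double b,
                                   double relTol = 1e-9, double absTol = 1e-12)
    {
        double diff = Math.Abs(a - b);
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return diff <= Math.Max(absTol, relTol * scale);
    }

    static void Main()
    {
        double sum = 0.1 + 0.2;
        Console.WriteLine(sum == 0.3);              // False
        Console.WriteLine(NearlyEqual(sum, 0.3));   // True
    }
}
```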
For a more detailed overview of the particular cases where errors in accuracies can be introduced, see the Accuracy section of the Wikipedia article. Finally, if you want a seriously in-depth (and mathematical) discussion of floating-point numbers/operations at machine level, try reading the oft-quoted article What Every Computer Scientist Should Know About Floating-Point Arithmetic.
You seem spot on with the benefits of using a floating point type. I tend to design for decimals in all cases, and rely on a profiler to let me know if operations on decimal are causing bottlenecks or slow-downs. In those cases, I will "down cast" to double or float, but only do it internally, and carefully try to manage precision loss by limiting the number of significant digits in the mathematical operation being performed.
In general, if your value is transient (not reused), you're safe to use a floating point type. The real problem with floating point types is the following three scenarios.
You are aggregating floating point values (in which case the precision errors compound)
You build values based on the floating point value (for example in a recursive algorithm)
You are doing math with a very wide number of significant digits (for example, 123456789.1 * .000000000000000987654321)
EDIT
According to the reference documentation on C# decimals:
The decimal keyword denotes a 128-bit data type. Compared to floating-point types, the decimal type has a greater precision and a smaller range, which makes it suitable for financial and monetary calculations.
So to clarify my above statement:
I tend to design for decimals in all cases, and rely on a profiler to let me know if operations on decimal is causing bottlenecks or slow-downs.
I have only ever worked in industries where decimals are favorable. If you're working on physics or graphics engines, it's probably much more beneficial to design for a floating point type (float or double).
Decimal is not infinitely precise (it is impossible to represent infinite precision for non-integral values in a primitive data type), but it is far more precise than double:
decimal = 28-29 significant digits
double = 15-16 significant digits
float = 7 significant digits
EDIT 2
In response to Konrad Rudolph's comment, item # 1 (above) is definitely correct. Aggregation of imprecision does indeed compound. See the below code for an example:
    using System;

    class Program
    {
        private const float THREE_FIFTHS = 3f / 5f;
        private const int ONE_MILLION = 1000000;

        public static void Main(string[] args)
        {
            Console.WriteLine("Three Fifths: {0}", THREE_FIFTHS.ToString("F10"));
            float asSingle = 0f;
            double asDouble = 0d;
            decimal asDecimal = 0M;
            // Add the same constant one million times in each type.
            for (int i = 0; i < ONE_MILLION; i++)
            {
                asSingle += THREE_FIFTHS;
                asDouble += THREE_FIFTHS;
                asDecimal += (decimal) THREE_FIFTHS;
            }
            // Compare each running total against a single multiplication.
            Console.WriteLine("Six Hundred Thousand: {0:F10}", THREE_FIFTHS * ONE_MILLION);
            Console.WriteLine("Single: {0}", asSingle.ToString("F10"));
            Console.WriteLine("Double: {0}", asDouble.ToString("F10"));
            Console.WriteLine("Decimal: {0}", asDecimal.ToString("F10"));
            Console.ReadLine();
        }
    }
This outputs the following:
Three Fifths: 0.6000000000
Six Hundred Thousand: 600000.0000000000
Single: 599093.4000000000
Double: 599999.9999886850
Decimal: 600000.0000000000
As you can see, even though we are adding from the same source constant, the result for double is less precise (although it will probably round correctly), and the result for float is far less precise, to the point where only two significant digits are still correct.
Use decimal for base 10 values, e.g. financial calculations, as others have suggested.
But double is generally more accurate for arbitrary calculated values.
For example if you want to calculate the weight of each line in a portfolio, use double as the result will more nearly add up to 100%.
In the following example, doubleResult is closer to 1 than decimalResult:
// Add one third + one third + one third with decimal
decimal decimalValue = 1M / 3M;
decimal decimalResult = decimalValue + decimalValue + decimalValue;
// Add one third + one third + one third with double
double doubleValue = 1D / 3D;
double doubleResult = doubleValue + doubleValue + doubleValue;
So again taking the example of a portfolio:
The market value of each line in the portfolio is a monetary value and would probably be best represented as decimal.
The weight of each line in the portfolio (= Market Value / SUM(Market Value)) is usually better represented as double.
Use a double or a float when you don't need precision. For example, in a platformer game I wrote, I used a float to store the player velocities. Obviously I don't need super precision here because I eventually round to an int for drawing on the screen.
In some accounting scenarios, consider the possibility of using integral types instead or in conjunction. For example, let's say that the rules you operate under require every calculation result to be carried forward with at least 6 decimal places, and the final result will be rounded to the nearest penny.
A calculation of 1/6th of $100 yields $16.66666666666666..., so the value carried forth in a worksheet will be $16.666667. Both double and decimal should yield that result accurately to 6 decimal places. However, we can avoid any cumulative error by carrying the result forward as an integer 16666667. Each subsequent calculation can be made with the same precision and carried forward similarly. Continuing the example, I calculate Texas sales tax on that amount (16666667 * .0825 = 1375000). Adding the two (it's a short worksheet): 16666667 + 1375000 = 18041667. Moving the decimal point back in gives us 18.041667, or $18.04.
While this short example wouldn't yield a cumulative error using double or decimal, it's fairly easy to show cases where simply calculating the double or decimal and carrying forward would accumulate significant error. If the rules you operate under require a limited number of decimal places, storing each value as an integer by multiplying by 10^(required # of decimal place), and then dividing by 10^(required # of decimal places) to get the actual value will avoid any cumulative error.
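The worksheet above can be sketched with plain long arithmetic. The Worksheet class and the DivRound helper are invented for the example, and DivRound assumes non-negative operands.

```csharp
using System;

static class Worksheet
{
    public const long Scale = 1000000;   // 10^6: six decimal places carried forward

    // Integer division rounded to nearest (halves round up), non-negative operands only.
    public static long DivRound(long numerator, long denominator) =>
        (numerator + denominator / 2) / denominator;

    static void Main()
    {
        long share = DivRound(100 * Scale, 6);        // $100 / 6
        long tax = DivRound(share * 825, 10000);      // 8.25% sales tax
        long total = share + tax;
        long cents = DivRound(total, Scale / 100);    // round to the nearest penny

        Console.WriteLine(share);   // 16666667
        Console.WriteLine(tax);     // 1375000
        Console.WriteLine(total);   // 18041667
        Console.WriteLine(cents);   // 1804, i.e. $18.04
    }
}
```

Expressing the tax rate as 825 / 10000 keeps the whole calculation in integers, so no floating-point value is ever created.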
In situations where fractions of pennies do not occur (for example, a vending machine), there is no reason to use non-integral types at all. Simply think of it as counting pennies, not dollars. I have seen code where every calculation involved only whole pennies, yet use of double led to errors! Integer only math removed the issue. So my unconventional answer is, when possible, forgo both double and decimal.
If you need binary interop with other languages or platforms, then you might need to use float or double, which are standardized.
Depends on what you need it for.
Because float and double are binary data types, you get some difficulties and errors in the way they round numbers. For instance, float would round 0.1 to 0.100000001490116, and float would also round 1 / 3 to 0.333333343267441. Simply put, not all real numbers have an accurate representation in binary floating-point types.
Luckily C# also supports so-called decimal floating-point arithmetic, where numbers are represented via the decimal numeric system rather than the binary system. Thus, decimal floating-point arithmetic does not lose accuracy when storing and processing decimal fractions that fit within its 28-29 significant digits. This makes it well suited to calculations where a high level of accuracy is needed.
Note: this post is based on information of the decimal type's capabilities from http://csharpindepth.com/Articles/General/Decimal.aspx and my own interpretation of what that means. I will assume Double is normal IEEE double precision.
Note2: smallest and largest in this post refer to the magnitude of the number.
Pros of "decimal".
"decimal" can represent exactly numbers that can be written as (sufficiently short) decimal fractions, double cannot. This is important in financial ledgers and similar where it is important that the results exactly match what a human doing the calculations would give.
"decimal" has a much larger mantissa than "double". That means that for values within its normalised range "decimal" will have a much higher precision than double.
Cons of decimal
It will be much slower (I don't have benchmarks, but I would guess at least an order of magnitude, maybe more). decimal will not benefit from any hardware acceleration, and arithmetic on it will require relatively expensive multiplication/division by powers of 10 (which is far more expensive than multiplication and division by powers of 2) to match the exponent before addition/subtraction and to bring the exponent back into range after multiplication/division.
decimal will overflow earlier than double will. decimal can only represent numbers up to ±(2^96 - 1). By comparison, double can represent numbers up to nearly ±2^1024.
decimal will underflow earlier. The smallest numbers representable in decimal are ±10^-28. By comparison, double can represent values down to 2^-1074 (approx. 4.9 × 10^-324) if subnormal numbers are supported and 2^-1022 (approx. 2.2 × 10^-308) if they are not.
decimal takes up twice as much memory as double.
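The range difference is easy to observe. In this sketch (class and method names are illustrative only), decimal throws on overflow and quietly returns zero on underflow, while double goes to infinity or keeps a tiny non-zero value.

```csharp
using System;

static class RangeDemo
{
    // decimal arithmetic throws instead of producing an infinity.
    public static bool DecimalOverflows()
    {
        try
        {
            decimal big = decimal.MaxValue;
            big *= 2;
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Dividing the smallest positive decimal by 10 leaves nothing representable.
    public static bool DecimalUnderflowsToZero()
    {
        decimal tiny = 1e-28m;
        return tiny / 10 == 0m;
    }

    static void Main()
    {
        Console.WriteLine(DecimalOverflows());          // True
        Console.WriteLine(DecimalUnderflowsToZero());   // True

        double max = double.MaxValue;
        Console.WriteLine(double.IsInfinity(max * 2));  // True: no exception, just infinity
        double small = 1e-300;
        Console.WriteLine(small / 10 > 0);              // True: still representable
    }
}
```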
My opinion is that you should default to using "decimal" for money work and other cases where matching human calculation exactly is important, and that you should use double as your default choice the rest of the time.
Use floating points if you value performance over correctness.
Choose the type according to your application. If you need precision, as in financial analysis, you have answered your own question. But if your application can settle for an estimate, you're OK with double.
Does your application need a fast calculation, or will it have all the time in the world to give you an answer? It really depends on the type of application.
Graphic hungry? float or double is enough. Financial data analysis, meteor striking a planet kind of precision ? Those would need a bit of precision :)
Decimal is wider (16 bytes against 8 for double), and double is natively supported by the CPU. Decimal is base-10, so its arithmetic is carried out in software rather than by a hardware instruction.
For accounting - decimal
For finance - double
For heavy computation - double
Keep in mind .NET CLR only supports Math.Pow(double,double). Decimal is not supported.
.NET Framework 4
[SecuritySafeCritical]
public static extern double Pow(double x, double y);
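If you need a power of a decimal, one option is to cast through double and accept the rounding; another, for non-negative integer exponents, is to stay in decimal with repeated squaring. The DecimalPow helper below is a sketch of the second option and is not part of the framework.

```csharp
using System;

static class DecimalPow
{
    // x raised to a non-negative integer power, computed entirely in decimal.
    public static decimal Pow(decimal x, int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        decimal result = 1m;
        while (n > 0)
        {
            if ((n & 1) == 1) result *= x;   // this bit of the exponent is set
            n >>= 1;
            if (n > 0) x *= x;               // square only while bits remain
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Pow(1.05m, 2) == 1.1025m);   // True
        Console.WriteLine(Pow(2m, 10) == 1024m);       // True
    }
}
```

Fractional exponents still need Math.Pow on double, with the precision loss that implies.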
A double value will serialize to scientific notation by default if that notation is shorter than the decimal display (e.g. .00000003 will be 3e-8). Decimal values will never serialize to scientific notation. When serializing for consumption by an external party, this may be a consideration.
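With plain ToString the difference looks like the sketch below; the exact text a given serializer emits may differ (ToString writes the exponent as "E-08"). The Render and Fixed helper names are made up for the example, and the invariant culture is used so the decimal separator is predictable.

```csharp
using System;
using System.Globalization;

static class SerializationDemo
{
    public static string Render(double value) =>
        value.ToString(CultureInfo.InvariantCulture);

    public static string Render(decimal value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // A fixed-point format string avoids the exponent for double.
    public static string Fixed(double value, int places) =>
        value.ToString("F" + places, CultureInfo.InvariantCulture);

    static void Main()
    {
        Console.WriteLine(Render(0.00000003));     // 3E-08
        Console.WriteLine(Render(0.00000003m));    // 0.00000003
        Console.WriteLine(Fixed(0.00000003, 8));   // 0.00000003
    }
}
```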
