According to the documentation, the decimal.Round method uses a round-to-even algorithm which is not common for most applications. So I always end up writing a custom function to do the more natural round-half-up algorithm:
public static decimal RoundHalfUp(this decimal d, int decimals)
{
    if (decimals < 0)
    {
        throw new ArgumentException("The decimals must be non-negative",
            "decimals");
    }
    decimal multiplier = (decimal)Math.Pow(10, decimals);
    decimal number = d * multiplier;
    // Floor(number + 0.5) rounds to nearest, with ties going up (toward positive infinity).
    return decimal.Floor(number + 0.5m) / multiplier;
}
Does anybody know the reason behind this framework design decision?
Is there any built-in implementation of the round-half-up algorithm in the framework? Or maybe some unmanaged Windows API?
It could be misleading for beginners that simply write decimal.Round(2.5m, 0) expecting 3 as a result but getting 2 instead.
The other answers with reasons why the Banker's algorithm (aka round half to even) is a good choice are quite correct. It does not suffer from negative or positive bias as much as the round half away from zero method over most reasonable distributions.
But the question was why .NET uses banker's rounding as the default, and the answer is that Microsoft has followed the IEEE 754 standard. This is also mentioned in MSDN for Math.Round under Remarks.
Also note that .NET supports the alternative method specified by IEEE by providing the MidpointRounding enumeration. They could of course have provided more alternatives for resolving ties, but they chose to just fulfill the IEEE standard.
Probably because it's a better algorithm. Over the course of many roundings, the .5's end up being rounded up and down equally often. This gives better estimates of actual results if you are, for instance, adding a bunch of rounded numbers. I would say that even though it isn't what some may expect, it's probably the more correct thing to do.
While I cannot answer the question of "Why did Microsoft's designers choose this as the default?", I just want to point out that an extra function is unnecessary.
Math.Round allows you to specify a MidpointRounding:
ToEven - When a number is halfway between two others, it is rounded toward the nearest even number.
AwayFromZero - When a number is halfway between two others, it is rounded toward the nearest number that is away from zero.
Decimals are mostly used for money; banker’s rounding is common when working with money. Or you could say:
It is mostly bankers that need the decimal type; therefore it does “banker’s rounding”.
Banker’s rounding has the advantage that on average you will get the same result if you:
round a set of “invoice lines” before adding them up,
or add them up then round the total
Rounding before adding up saved a lot of work in the days before computers.
(In the UK, when we went decimal, banks would not deal with half pence, but for many years there was still a half pence coin and shops often had prices ending in half pence, so there was lots of rounding.)
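To make the "same result on average" point concrete, here is a minimal sketch. The class name RoundingBias and the four sample values are made up for illustration; only decimal.Round and MidpointRounding come from the framework.

```csharp
using System;
using System.Linq;

public static class RoundingBias
{
    // Rounds each value to a whole number with the given tie rule,
    // then adds the rounded values up.
    public static decimal SumRounded(decimal[] values, MidpointRounding mode)
    {
        return values.Sum(v => decimal.Round(v, 0, mode));
    }
}
```

For the values 0.5, 1.5, 2.5 and 3.5 the exact total is 8. With MidpointRounding.ToEven the rounded lines are 0, 2, 2 and 4, which still add up to 8; with MidpointRounding.AwayFromZero they are 1, 2, 3 and 4, which add up to 10.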
Use another overload of Round function like this:
decimal.Round(2.5m, 0, MidpointRounding.AwayFromZero)
It will output 3. And if you use
decimal.Round(2.5m, 0, MidpointRounding.ToEven)
you will get banker's rounding.
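A small sketch of the two modes side by side may help; the wrapper class MidpointDemo and its method names are only for illustration. One thing worth knowing is that AwayFromZero is not quite the same as "round half up" for negative numbers: -2.5 goes to -3, whereas a strict half-up rule would give -2.

```csharp
using System;

public static class MidpointDemo
{
    // Ties go to the nearest even integer (banker's rounding).
    public static decimal HalfEven(decimal d)
    {
        return decimal.Round(d, 0, MidpointRounding.ToEven);
    }

    // Ties go to the integer with the larger magnitude.
    public static decimal HalfAwayFromZero(decimal d)
    {
        return decimal.Round(d, 0, MidpointRounding.AwayFromZero);
    }
}
```

HalfEven(2.5m) gives 2 and HalfEven(3.5m) gives 4, while HalfAwayFromZero(2.5m) gives 3 and HalfAwayFromZero(-2.5m) gives -3.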
I have a float I need to turn into a string with 5 decimals of precision (X.XXXXX), which means I need at least 6 decimals to round up or down. The issue is that the operation to get the integer representation results in a very big number which I can't store (I'd need something like a big integer, but I can't rely on any built-in stuff for compatibility reasons, and I won't pretend I understand how to reinvent one in a fairly simple manner). I can pre-emptively limit it:
result = (m * Pow(5, +exp) / Pow(10,8));
but this will only give correct results for a handful of normalized floats like 0.3f; something like 1E-5 or 113.754f (which now has 3 more "leading" digits for the "ceil" part) will be wrong.
Taking into account I need 5 (6) decimals precision max - is there a shortcut I can take?
is there a shortcut I can take?
No and yes.
No in getting the best conversion result. Shortcuts run into the table-maker's dilemma. In short, there will be corner cases that oblige a fair amount of code for float to string conversion. Typically this means doing most of the conversion using integer math. Example.
Yes if code is willing to tolerate some error. This error results from the accumulated rounding of floating point operations. As a typical float has 24 bits of binary precision (akin to at least 6 significant decimal digits), the "5 decimals precision (X.XXXXX)" (which is really 6 significant decimal digits) will be hard to obtain without error.
Using wider math can greatly reduce errors (perhaps by a factor of hundreds of millions), yet not eliminate them.
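As a rough sketch of the "wider math" shortcut (not the exact-conversion route), the value can be widened to double, scaled by 10^5 and rounded into a 64-bit integer, after which the digits are split with integer division. The question's language is not stated, so C# is used here only for illustration; the name ToFixed5 is hypothetical, and the sketch assumes the magnitude stays small enough (well under about 9e13) for the scaled value to fit in a long.

```csharp
using System;

public static class FixedFormat
{
    // Formats a float with exactly 5 decimals using double and long arithmetic.
    // Assumption: |value| * 100000 fits in a long; no NaN or infinity handling.
    public static string ToFixed5(float value)
    {
        bool negative = value < 0;
        double magnitude = Math.Abs((double)value);
        // Adding 0.5 before truncating rounds the magnitude half up.
        long scaled = (long)(magnitude * 100000.0 + 0.5);
        long whole = scaled / 100000;
        long frac = scaled % 100000;
        return (negative ? "-" : "") + whole + "." + frac.ToString("D5");
    }
}
```

ToFixed5(0.3f) returns "0.30000" and ToFixed5(113.754f) returns "113.75400". Values that sit extremely close to a rounding boundary can still come out one unit off in the last digit, which is the tolerated error mentioned above.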
I have a class that does some length calculations based on a height on a ticket. It's been in place for years and working quite well... Until we got a unique ticket size.
They are entered by sales people in inches, are normally nice numbers like 3, 4 or 3.5, and are stored in a database. This one, however, is 3.66666 recurring (or 11/3), but it is being entered as 3.666 and causing the calculation to fail due to lost precision.
I have thought of a bit of a hack to restore precision for certain numbers, but thought maybe someone knows of a better way of getting a 3.666 or a 93.1333 back to its number + two thirds status?
Thanks,
Mick.
As you explained in comments I see your point now. I've checked the numbers:
168000 / 3.666 = 45826.5139
168000 / 3.666666 = 45818.1901488
168000 * 3 / 11 = 45818.1818182
It makes a difference of 8 tickets. I have a feeling that your issue can be solved in many ways. On the side of user input for example. Or on the side of database. But back to your question:
How do I convert 3.666 or a 93.1333 back to its number + two thirds status?
You are looking for a way to convert a decimal (or double) to a fraction.
There is already a question on SO, "Algorithm for simplifying decimal to fractions", which has many answers. I've tested some of them, and none of them were satisfying. Some of them don't even handle recurrence. Perhaps I've missed the correct one; you can look for yourself.
Anyway, I believe you don't need to fully implement a conversion from 1.666 to 5/3, since it's not easy and you have real-world sizes. You've said that most of the time the numbers are around 3, 3.5, 4 etc. So I suggest you take a look at the question I've linked above and search for an algorithm for detecting the recurring digit. It was also discussed here: How to know the repeating decimal in a fraction?
After that, just convert 1.666 to 1.666666, since 1/1000000 of an inch won't mess up your calculations, as the numbers above show.
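If the real-world sizes only ever involve small denominators, a simpler alternative to recurrence detection is a brute-force search for the smallest denominator that reproduces the entered value within the data-entry precision. This is only a sketch under that assumption; FractionSnap, TrySnap, the denominator limit and the tolerance are all hypothetical names and parameters, not something taken from the linked questions.

```csharp
using System;

public static class FractionSnap
{
    // Looks for the smallest denominator d (1..maxDenominator) such that some
    // n/d lies within 'tolerance' of 'value'. Returns false if none is found.
    public static bool TrySnap(double value, int maxDenominator, double tolerance,
                               out long numerator, out int denominator)
    {
        for (int d = 1; d <= maxDenominator; d++)
        {
            long n = (long)Math.Round(value * d);
            if (Math.Abs(value - (double)n / d) < tolerance)
            {
                numerator = n;
                denominator = d;
                return true;
            }
        }
        numerator = 0;
        denominator = 0;
        return false;
    }
}
```

With a tolerance of 0.001 (three entered decimals), TrySnap(3.666, 12, 0.001, ...) yields 11/3, so the height can then be used as 11.0 / 3 in the length calculation.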
It would be difficult to get the accurate value of double as double is floating point.
The MSDN says:
Remember that a floating-point number can only approximate a decimal number, and that the precision of a floating-point number determines how accurately that number approximates a decimal number. By default, a Double value contains 15 decimal digits of precision, although a maximum of 17 digits is maintained internally. The precision of a floating-point number has several consequences:
Two floating-point numbers that appear equal for a particular precision might not compare equal because their least significant digits are different.
A mathematical or comparison operation that uses a floating-point number might not yield the same result if a decimal number is used because the floating-point number might not exactly approximate the decimal number.
A number can have multiple representations if we use a float, so the results of a division of floats may produce bitwise different floats. But what if the denominator is a power of 2?
AFAIK, dividing by a power of 2 would only shift the exponent, leaving the same mantissa, always producing bitwise identical floats. Is that right?
float n = xxx;
float result = n/1024f; // always the same result?
--- UPDATE ----------------------
Sorry for my lack of knowledge in the IEEE black magic for floating points :) , but I'm talking about those numbers Guvante mentioned: no representation for certain decimal numbers, 'inaccurate' floats. For the rest of this post I'll use 'accurate' and 'inaccurate' considering Guvante's definition of these words.
To simplify, let's say the numerator is always an 'accurate' number. Also, let's divide not by any power of 2, but always by 1024. Additionally, I'm doing the operation the same way every time (same method), so I'm talking about getting the same results in different executions (for the same inputs, sure).
I'm asking all this because I see different numbers coming from the same inputs, so I thought: well if I only use 'accurate' floats as numerators and divide by 1024 I will only shift the exponent, still having an 'accurate' float.
You asked for an example. The real problem is this: I have a simulator producing sometimes 0.02999994 and sometimes 0.03000000 for the same inputs. I thought I could multiply these numbers by 1024, round to get an 'integer' ('accurate' float) that would be the same for those two numbers, and then divide by 1024 to get an 'accurate' rounded float.
I was told (in my other question) that I could convert to decimal, round and cast to float, but I want to know if this way works.
A number can have multiple representations if we use a float
The question appears to be predicated on an incorrect premise; the only number that has multiple representations as a float is zero, which can be represented as either "positive zero" or "negative zero". Other than zero a given number only has one representation as a float, assuming that you are talking about the "double" or "float" types.
Or perhaps I misunderstand. Is the issue that you are referring to the fact that the compiler is permitted to do floating point operations in higher precision than the 32 or 64 bits available for storage? That can cause divisions and multiplications to produce different results in some cases.
Since people often don't fully grasp floating point numbers, I will go over some of your points real quick. Each particular combination of bits in a floating point number represents a unique number. However, because that number has a base 2 fractional component, there is no representation for certain decimal numbers, for instance 1.1. In those cases you take the closest number. IEEE 754-2008 specifies round to nearest, ties to even in these cases.
The real difficulty is when you combine two of these 'inaccurate' numbers. This can introduce problems as each intermediate step will involve rounding. If you calculate the same value using two different methods, you could come up with subtly different values. Typically this is handled with an epsilon when you want equality.
Now onto your real question: can you divide by a power of two and avoid introducing any additional 'inaccuracies'? Normally you can; however, as with all floating point numbers, denormals and other odd cases have their own logic, and obviously if your mantissa overflows you will have difficulty. And again note that no mathematical errors are introduced during any of this; it is simply math being done with limited precision, which involves intermittent rounding of results.
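One way to see the "only the exponent moves" behaviour is to pull the raw bit fields out of a float before and after the division. This is just an illustrative sketch for values in the normal range; the helper class FloatBits is a made-up name, and it relies on BitConverter to reinterpret the 32 bits.

```csharp
using System;

public static class FloatBits
{
    // Raw IEEE 754 binary32 bit pattern of a float.
    public static int Bits(float f)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
    }

    // Lower 23 bits: the stored fraction (mantissa) field.
    public static int Mantissa(float f)
    {
        return Bits(f) & 0x007FFFFF;
    }

    // Next 8 bits: the biased exponent field.
    public static int Exponent(float f)
    {
        return (Bits(f) >> 23) & 0xFF;
    }
}
```

For 0.1f and 0.1f / 1024f the Mantissa values are identical and the Exponent values differ by exactly 10. Once the result gets small enough to fall into the denormal range, low mantissa bits can be rounded away, which is the odd case mentioned above.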
EDIT: In response to new question
What you are saying could work, but is pretty much equivalent to rounding. Additionally, if you are just looking for equality, you should use an epsilon as I mentioned earlier: |a - b| < e for some small value e (0.0001 would work in your example). If you are looking to print out a pretty number, and the framework you are using isn't doing it to your liking, some rounding would be the most direct way of describing your solution, which is always a plus.
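Both options can be sketched in a few lines. The class and method names here (FloatCompare, NearlyEqual, SnapTo1024ths) are hypothetical, and 0.0001 is simply the epsilon suggested above for this particular simulator output.

```csharp
using System;

public static class FloatCompare
{
    // Equality within an absolute tolerance.
    public static bool NearlyEqual(float a, float b, float epsilon)
    {
        return Math.Abs(a - b) < epsilon;
    }

    // The idea from the question: scale by 1024, round to an integer,
    // then divide by 1024 again (the division itself is exact).
    public static float SnapTo1024ths(float value)
    {
        return (float)Math.Round(value * 1024f) / 1024f;
    }
}
```

Both 0.02999994f and 0.03f snap to 31/1024, and NearlyEqual(0.02999994f, 0.03f, 0.0001f) is true. Note that snapping can still separate two nearly equal inputs if they happen to fall on opposite sides of a rounding boundary, which is why the epsilon comparison is usually the safer test for equality.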
I can name three advantages to using double (or float) instead of decimal:
Uses less memory.
Faster because floating point math operations are natively supported by processors.
Can represent a larger range of numbers.
But these advantages seem to apply only to calculation intensive operations, such as those found in modeling software. Of course, doubles should not be used when precision is required, such as financial calculations. So are there any practical reasons to ever choose double (or float) instead of decimal in "normal" applications?
Edited to add:
Thanks for all the great responses, I learned from them.
One further question: A few people made the point that doubles can more precisely represent real numbers. When declared I would think that they usually more accurately represent them as well. But is it a true statement that the accuracy may decrease (sometimes significantly) when floating point operations are performed?
I think you've summarised the advantages quite well. You are however missing one point. The decimal type is only more accurate at representing base 10 numbers (e.g. those used in currency/financial calculations). In general, the double type is going to offer at least as great precision (someone correct me if I'm wrong) and definitely greater speed for arbitrary real numbers. The simple conclusion is: when considering which to use, always use double unless you need the base 10 accuracy that decimal offers.
Edit:
Regarding your additional question about the decrease in accuracy of floating-point numbers after operations, this is a slightly more subtle issue. Indeed, precision (I use the term interchangeably for accuracy here) will steadily decrease after each operation is performed. This is due to two reasons:
the fact that certain numbers (most obviously decimals) can't be truly represented in floating point form
rounding errors occur, just as if you were doing the calculation by hand. It depends greatly on the context (how many operations you're performing) whether these errors are significant enough to warrant much thought however.
In all cases, if you want to compare two floating-point numbers that should in theory be equivalent (but were arrived at using different calculations), you need to allow a certain degree of tolerance (how much varies, but is typically very small).
For a more detailed overview of the particular cases where errors in accuracies can be introduced, see the Accuracy section of the Wikipedia article. Finally, if you want a seriously in-depth (and mathematical) discussion of floating-point numbers/operations at machine level, try reading the oft-quoted article What Every Computer Scientist Should Know About Floating-Point Arithmetic.
You seem spot on with the benefits of using a floating point type. I tend to design for decimals in all cases, and rely on a profiler to let me know if operations on decimal are causing bottlenecks or slow-downs. In those cases, I will "down cast" to double or float, but only do it internally, and carefully try to manage precision loss by limiting the number of significant digits in the mathematical operation being performed.
In general, if your value is transient (not reused), you're safe to use a floating point type. The real problem with floating point types shows up in the following three scenarios.
You are aggregating floating point values (in which case the precision errors compound)
You build values based on the floating point value (for example in a recursive algorithm)
You are doing math with a very wide number of significant digits (for example, 123456789.1 * .000000000000000987654321)
EDIT
According to the reference documentation on C# decimals:
The decimal keyword denotes a 128-bit data type. Compared to floating-point types, the decimal type has a greater precision and a smaller range, which makes it suitable for financial and monetary calculations.
So to clarify my above statement:
I tend to design for decimals in all cases, and rely on a profiler to let me know if operations on decimal are causing bottlenecks or slow-downs.
I have only ever worked in industries where decimals are favorable. If you're working on physics or graphics engines, it's probably much more beneficial to design for a floating point type (float or double).
Decimal is not infinitely precise (it is impossible to represent infinite precision for non-integral in a primitive data type), but it is far more precise than double:
decimal = 28-29 significant digits
double = 15-16 significant digits
float = 7 significant digits
EDIT 2
In response to Konrad Rudolph's comment, item # 1 (above) is definitely correct. Aggregation of imprecision does indeed compound. See the below code for an example:
using System;

class Program
{
    private const float THREE_FIFTHS = 3f / 5f;
    private const int ONE_MILLION = 1000000;

    public static void Main(string[] args)
    {
        Console.WriteLine("Three Fifths: {0}", THREE_FIFTHS.ToString("F10"));
        float asSingle = 0f;
        double asDouble = 0d;
        decimal asDecimal = 0M;
        // Add the same constant one million times in each type.
        for (int i = 0; i < ONE_MILLION; i++)
        {
            asSingle += THREE_FIFTHS;
            asDouble += THREE_FIFTHS;
            asDecimal += (decimal) THREE_FIFTHS;
        }
        Console.WriteLine("Six Hundred Thousand: {0:F10}", THREE_FIFTHS * ONE_MILLION);
        Console.WriteLine("Single: {0}", asSingle.ToString("F10"));
        Console.WriteLine("Double: {0}", asDouble.ToString("F10"));
        Console.WriteLine("Decimal: {0}", asDecimal.ToString("F10"));
        Console.ReadLine();
    }
}
This outputs the following:
Three Fifths: 0.6000000000
Six Hundred Thousand: 600000.0000000000
Single: 599093.4000000000
Double: 599999.9999886850
Decimal: 600000.0000000000
As you can see, even though we are adding from the same source constant, the result for the double is less precise (although it will probably round correctly), and the float is far less precise, to the point where it has been reduced to only two significant digits.
Use decimal for base 10 values, e.g. financial calculations, as others have suggested.
But double is generally more accurate for arbitrary calculated values.
For example if you want to calculate the weight of each line in a portfolio, use double as the result will more nearly add up to 100%.
In the following example, doubleResult is closer to 1 than decimalResult:
// Add one third + one third + one third with decimal
decimal decimalValue = 1M / 3M;
decimal decimalResult = decimalValue + decimalValue + decimalValue;
// Add one third + one third + one third with double
double doubleValue = 1D / 3D;
double doubleResult = doubleValue + doubleValue + doubleValue;
So again taking the example of a portfolio:
The market value of each line in the portfolio is a monetary value and would probably be best represented as decimal.
The weight of each line in the portfolio (= Market Value / SUM(Market Value)) is usually better represented as double.
Use a double or a float when you don't need precision. For example, in a platformer game I wrote, I used a float to store the player velocities. Obviously I don't need super precision here because I eventually round to an int for drawing on the screen.
In some accounting scenarios, consider the possibility of using integral types instead of, or in conjunction with, double or decimal. For example, let's say that the rules you operate under require that every calculation result be carried forward with at least 6 decimal places and that the final result be rounded to the nearest penny.
A calculation of 1/6th of $100 yields $16.66666666666666..., so the value carried forth in a worksheet will be $16.666667. Both double and decimal should yield that result accurately to 6 decimal places. However, we can avoid any cumulative error by carrying the result forward as the integer 16666667. Each subsequent calculation can be made with the same precision and carried forward similarly. Continuing the example, I calculate Texas sales tax on that amount (16666667 * .0825 = 1375000 after rounding). Adding the two (it's a short worksheet) gives 16666667 + 1375000 = 18041667. Moving the decimal point back in gives us 18.041667, or $18.04.
While this short example wouldn't yield a cumulative error using double or decimal, it's fairly easy to show cases where simply calculating the double or decimal and carrying forward would accumulate significant error. If the rules you operate under require a limited number of decimal places, storing each value as an integer by multiplying by 10^(required # of decimal places), and then dividing by 10^(required # of decimal places) to get the actual value, will avoid any cumulative error.
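The worksheet above can be sketched with long values scaled by 10^6. The class ScaledMoney and its method names are hypothetical, and the choice of MidpointRounding.AwayFromZero is an assumption; the actual tie rule would come from whatever accounting rules apply.

```csharp
using System;

public static class ScaledMoney
{
    // 10^6: every carried value keeps six decimal places (assumed rule).
    public const long Scale = 1000000;

    // Converts a currency amount to its scaled integer form.
    public static long ToScaled(decimal amount)
    {
        return (long)Math.Round(amount * Scale, MidpointRounding.AwayFromZero);
    }

    // Applies a rate (such as a tax rate) and rounds back to a scaled integer.
    public static long ApplyRate(long scaled, decimal rate)
    {
        return (long)Math.Round(scaled * rate, MidpointRounding.AwayFromZero);
    }

    // Moves the decimal point back in and rounds to whole cents.
    public static decimal ToCurrency(long scaled)
    {
        return Math.Round((decimal)scaled / Scale, 2, MidpointRounding.AwayFromZero);
    }
}
```

ToScaled(100m / 6m) is 16666667, ApplyRate of that value with 0.0825m is 1375000, and ToCurrency of their sum (18041667) is 18.04, matching the worksheet.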
In situations where fractions of pennies do not occur (for example, a vending machine), there is no reason to use non-integral types at all. Simply think of it as counting pennies, not dollars. I have seen code where every calculation involved only whole pennies, yet use of double led to errors! Integer only math removed the issue. So my unconventional answer is, when possible, forgo both double and decimal.
If you need binary interop with other languages or platforms, then you might need to use float or double, which are standardized.
Depends on what you need it for.
Because float and double are binary data types, you run into some difficulties and errors in the way they round numbers. For instance, float would round 0.1 to 0.100000001490116, and would also round 1 / 3 to 0.333333343267441. Simply put, not all real numbers have an accurate representation in binary floating-point types.
Luckily C# also supports so-called decimal floating-point arithmetic, where numbers are represented via the decimal numeric system rather than the binary system. Thus, decimal floating-point arithmetic does not lose accuracy when storing and processing decimal fractions that fit within its precision. This makes it immensely suited to calculations where a high level of accuracy is needed.
Note: this post is based on information of the decimal type's capabilities from http://csharpindepth.com/Articles/General/Decimal.aspx and my own interpretation of what that means. I will assume Double is normal IEEE double precision.
Note2: smallest and largest in this post refer to the magnitude of the number.
Pros of "decimal".
"decimal" can represent exactly numbers that can be written as (sufficiently short) decimal fractions, double cannot. This is important in financial ledgers and similar where it is important that the results exactly match what a human doing the calculations would give.
"decimal" has a much larger mantissa than "double". That means that for values within its normalised range "decimal" will have a much higher precision than double.
Cons of decimal
It will be much slower (I don't have benchmarks, but I would guess at least an order of magnitude, maybe more). decimal will not benefit from any hardware acceleration, and arithmetic on it will require relatively expensive multiplication/division by powers of 10 (which is far more expensive than multiplication and division by powers of 2) to match the exponent before addition/subtraction and to bring the exponent back into range after multiplication/division.
decimal will overflow earlier than double will. decimal can only represent numbers up to ±(2^96 - 1). By comparison, double can represent numbers up to nearly ±2^1024.
decimal will underflow earlier. The smallest numbers representable in decimal are ±10^-28. By comparison, double can represent values down to 2^-1074 (approx 5 × 10^-324) if subnormal numbers are supported and 2^-1022 (approx 2 × 10^-308) if they are not.
decimal takes up twice as much memory as double.
My opinion is that you should default to using "decimal" for money work and other cases where matching human calculation exactly is important, and that you should use double as your default choice the rest of the time.
Use floating points if you value performance over correctness.
Choose the type according to your application. If you need precision, as in financial analysis, you have answered your own question. But if your application can settle for an estimate, you're OK with double.
Does your application need fast calculations, or does it have all the time in the world to give you an answer? It really depends on the type of application.
Graphics hungry? float or double is enough. Financial data analysis, or meteor-striking-a-planet kind of precision? Those would need a bit of precision :)
Decimal is wider (16 bytes versus 8 for double), and double is natively supported by the CPU. Decimal is base-10, so its arithmetic is carried out in software rather than by the floating-point hardware.
For accounting - decimal
For finance - double
For heavy computation - double
Keep in mind .NET CLR only supports Math.Pow(double,double). Decimal is not supported.
.NET Framework 4
[SecuritySafeCritical]
public static extern double Pow(double x, double y);
A double value will serialize to scientific notation by default if that notation is shorter than the decimal display (e.g. .00000003 will be 3e-8). Decimal values will never serialize to scientific notation. When serializing for consumption by an external party, this may be a consideration.
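A quick way to check this on your own runtime is to compare the default ToString output of both types. The wrapper class NumberText is a made-up name; CultureInfo.InvariantCulture is used only so the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

public static class NumberText
{
    // Default ("G") formatting of a double.
    public static string FromDouble(double value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    // Default formatting of a decimal.
    public static string FromDecimal(decimal value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```

FromDouble(0.00000003) comes back in exponent form, while FromDecimal(0.00000003m) comes back as 0.00000003.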
I always tell people that in C# a variable of type double is not suitable for money, because all sorts of weird things could happen. But I can't seem to create an example that demonstrates some of these issues. Can anyone provide such an example?
(edit; this post was originally tagged C#; some replies refer to specific details of decimal, which therefore means System.Decimal).
(edit 2: I was specifically asking for some C# code, so I don't think this is language agnostic only)
Very, very unsuitable. Use decimal.
double x = 3.65, y = 0.05, z = 3.7;
Console.WriteLine((x + y) == z); // false
(example from Jon's page here - recommended reading ;-p)
You will get odd errors effectively caused by rounding. In addition, comparisons with exact values are extremely tricky - you usually need to apply some sort of epsilon to check for the actual value being "near" a particular one.
Here's a concrete example:
using System;
class Test
{
    static void Main()
    {
        double x = 0.1;
        double y = x + x + x;
        Console.WriteLine(y == 0.3); // Prints False
    }
}
Yes it's unsuitable.
If I remember correctly double has about 17 significant numbers, so normally rounding errors will take place far behind the decimal point. Most financial software uses 4 decimals behind the decimal point, that leaves 13 decimals to work with so the maximum number you can work with for single operations is still very much higher than the USA national debt. But rounding errors will add up over time. If your software runs for a long time you'll eventually start losing cents. Certain operations will make this worse. For example adding large amounts to small amounts will cause a significant loss of precision.
You need fixed point datatypes for money operations. Most people don't mind if you lose a cent here and there, but accountants aren't like most people.
edit
According to this site http://msdn.microsoft.com/en-us/library/678hzkk9.aspx Doubles actually have 15 to 16 significant digits instead of 17.
@Jon Skeet: decimal is more suitable than double because of its higher precision, 28 or 29 significant decimals. That means less chance of accumulated rounding errors becoming significant. Fixed point datatypes (i.e. integers that represent cents or 100ths of a cent, like I've seen used), as Boojum mentions, are actually better suited.
Since decimal uses a scaling factor of multiples of 10, numbers like 0.1 can be represented exactly. In essence, the decimal type represents this as 1 / 10 ^ 1, whereas a double would represent this as 104857 / 2 ^ 20 (in reality it would be more like really-big-number / 2 ^ 1023).
A decimal can exactly represent any base 10 value with up to 28/29 significant digits (like 0.1). A double can't.
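The difference is easy to see by repeating the 0.1 + 0.1 + 0.1 comparison in both types. The class ExactTenths below is only an illustrative wrapper.

```csharp
using System;

public static class ExactTenths
{
    // 0.1 has no exact binary representation, so the sum misses 0.3.
    public static bool DoubleSumMatches()
    {
        double x = 0.1;
        return x + x + x == 0.3;
    }

    // 0.1m is stored exactly as 1 / 10^1, so the sum is exactly 0.3m.
    public static bool DecimalSumMatches()
    {
        decimal x = 0.1m;
        return x + x + x == 0.3m;
    }
}
```

DoubleSumMatches() returns false and DecimalSumMatches() returns true.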
My understanding is that most financial systems express currency using integers -- i.e., counting everything in cents.
IEEE double precision actually can represent all integers exactly in the range -2^53 through +2^53. (Hacker's Delight, pg. 262) If you use only addition, subtraction and multiplication, and keep everything to integers within this range then you should see no loss of precision. I'd be very wary of division or more complex operations, however.
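A minimal sketch of that boundary, assuming amounts are kept as whole cents in a double: the class DoubleCents, the Limit constant and the overflow check are illustrative choices, not an established API.

```csharp
using System;

public static class DoubleCents
{
    // 2^53: up to this magnitude every integer is exactly representable in a double.
    public const double Limit = 9007199254740992d;

    // Adds two whole-cent amounts and refuses results outside the exact range.
    public static double Add(double a, double b)
    {
        double sum = a + b;
        if (Math.Abs(sum) > Limit)
        {
            throw new OverflowException("Result is outside the exactly representable integer range.");
        }
        return sum;
    }
}
```

Inside the range, Add(10d, 20d) is exactly 30. Right at the edge the exactness stops: DoubleCents.Limit + 1d compares equal to DoubleCents.Limit, because 2^53 + 1 has no double representation.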
Using double when you don't know what you are doing is unsuitable.
"double" can represent an amount of a trillion dollars with an error of 1/90th of a cent. So you will get highly precise results. Want to calculate how much it costs to put a man on Mars and get him back alive? double will do just fine.
But with money there are often very specific rules saying that a certain calculation must give a certain result and no other. If you calculate an amount that is very very very close to $98.135 then there will often be a rule that determines whether the result should be $98.14 or $98.13 and you must follow that rule and get the result that is required.
Depending on where you live, using 64 bit integers to represent cents or pennies or kopeks or whatever is the smallest unit in your country will usually work just fine. For example, 64 bit signed integers representing cents can represent values up to 92,233 trillion dollars. 32 bit integers are usually unsuitable.
No, a double will always have rounding errors; use "decimal" if you're on .NET...
Actually floating-point double is perfectly well suited to representing amounts of money as long as you pick a suitable unit.
See http://www.idinews.com/moneyRep.html
So is fixed-point long. Either consumes 8 bytes, surely preferable to the 16 consumed by a decimal item.
Whether or not something works (i.e. yields the expected and correct result) is not a matter of either voting or individual preference. A technique either works or it doesn't.