I have some code that is behaving strangely and seems to be rounding the result of the addition of two double values. This is causing issues with my code.
Unfortunately I cannot fix this issue because my unit testing environment is working OK (not rounding) and my application is not OK (rounding).
In my test environment:
a = -3.7468700408935547
b = 525218.0
c = b + a
c = 525214.25312995911
Inside my application:
a = -3.7468700408935547
b = 525218.0
c = b + a
c = 525214.25
What can be causing this? Project config? (I'm using Visual Studio, btw)
Edit (from comments)
I'm stepping through the same code using the Visual Studio debugger, so it's the exact same piece of code.
I have more code but I narrowed the problem down to that particular sum.
The binary representations of each value are:
test environment:
System.BitConverter.ToString(System.BitConverter.GetBytes(a)) "00-00-00-00-97-F9-0D-C0" string
System.BitConverter.ToString(System.BitConverter.GetBytes(b)) "00-00-00-00-44-07-20-41" string
System.BitConverter.ToString(System.BitConverter.GetBytes(c)) "00-40-9A-81-3C-07-20-41" string
inside application:
System.BitConverter.ToString(System.BitConverter.GetBytes(a)) "00-00-00-00-97-F9-0D-C0" string
System.BitConverter.ToString(System.BitConverter.GetBytes(b)) "00-00-00-00-44-07-20-41" string
System.BitConverter.ToString(System.BitConverter.GetBytes(c)) "00-00-00-80-3C-07-20-41" string
Edit 2:
As Alexei Levenkov points out, this issue is caused by a library that changes the FPU config.
For anyone who is curious what this meant for me:
I was able to mitigate this issue for my particular piece of code by making some assumptions about my input values and preemptively rounding them, which in turn made my calculations consistent.
Your application may be doing something unusual with the FPU configuration, e.g. using a math library that reconfigures the floating-point precision.
Direct3D is a possible suspect; see, for example, "Pow implementation for double".
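The byte dumps in the question are consistent with this: the application's result has zeros exactly where a single-precision significand would run out. A minimal sketch, assuming a runtime that performs double addition in full double precision (as in the test environment), reproduces the application's value by rounding the sum through float. The cast only imitates a lowered FPU precision setting; it does not change any FPU state.

```csharp
using System;

double a = -3.7468700408935547;
double b = 525218.0;

// Full double-precision sum, as seen in the test environment.
double full = b + a;

// The same sum rounded to a 24-bit significand, imitating an FPU whose
// precision control has been lowered to single precision.
double reduced = (double)(float)(b + a);

Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(full)));
Console.WriteLine(BitConverter.ToString(BitConverter.GetBytes(reduced)));

// reduced is exactly 525214.25, the value reported inside the application.
Console.WriteLine(reduced == 525214.25);
```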
Use decimal if you need exact decimal precision, such as in financial calculations.
Edit
See:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Five Tips for Floating Point Programming
While upgrading from dotnet core 2.2 to dotnet 5.0 we experienced failing tests that depend on consistent serialization of objects to generate a hash. I have tracked this down to the way the double type is serialized when passed to the JsonSerializer.Serialize method. During serialization we seem to be losing precision: a double is written with one fewer decimal digit, but not every number is affected.
For example
var d = 50.494329039350461;
var dAsText = System.Text.Json.JsonSerializer.Serialize(d);
//Value of dAsText is 50.49432903935046
When we deserialize we get the original number back, but we need to act on the serialized data to generate our hash. Why has this behaviour changed between frameworks? Is it a bug or is it intended? Are there any settings we can change to restore the previous implementation (while remaining on .NET 5, of course)? The same behaviour can be seen with Newtonsoft Json (I have raised an issue with them as well).
The behaviour seems consistent on Windows 10 x64 and in a Linux Docker image. This is running in Visual Studio 2019 (latest update).
The problem can also be seen with these numbers:
50.494328391915907
30.316339899700989
50.494128852095287
I think it's the consequence of Floating-Point Parsing and Formatting improvements in .NET Core 3.0. There are many changes so worth reading the article, but in particular:
ToString(), ToString("G"), and ToString("R") will now return the
shortest roundtrippable string
Not sure about System.Text.Json, but Newtonsoft Json uses the "R" specifier when writing a double (and this makes sense). Using d.ToString("R") reproduces this issue.
Note that your number, and the number it serializes to, are actually equal:
var d = 50.494329039350461d;
var d2 = 50.49432903935046d;
var sameThing = d == d2; // true
bool sameBytes = BitConverter.GetBytes(d).SequenceEqual(BitConverter.GetBytes(d2)); // true
So the last "1" digit is basically meaningless (due to the limited precision of double). As I understand from that article, the shortest roundtrippable value is now returned, and d2 is indeed the shortest.
The article also says:
For ToString("R"), there is no mechanism to fallback to the old
behavior. The previous behavior would first try “G15” and then using
the internal buffer would see if it roundtrips; if that failed, it
would instead return “G17”.
And indeed using d.ToString("G17") returns the result you expect.
So this is intended behaviour (the previous behaviour is considered a "bug" and was fixed), and as far as I can tell there are no settings to get the previous behaviour back.
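If the hash has to be computed from text, one option is to stop relying on the serializer's default and format doubles explicitly. The following is only a sketch: ToStableText is a hypothetical helper, not part of System.Text.Json or Newtonsoft Json, and whether "G17" matches what your old runtime produced for every value is something to verify against your existing hashes.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not a library API): always emit 17 significant digits
// with the invariant culture, instead of the shortest roundtrippable string.
static string ToStableText(double value) =>
    value.ToString("G17", CultureInfo.InvariantCulture);

double d = 50.494329039350461;
double d2 = 50.49432903935046;

// Both literals denote the same double, so they format to the same text.
Console.WriteLine(d == d2);
Console.WriteLine(ToStableText(d) == ToStableText(d2));
Console.WriteLine(ToStableText(d));

// Seventeen significant digits are enough to round-trip any double.
double parsed = double.Parse(ToStableText(d), CultureInfo.InvariantCulture);
Console.WriteLine(parsed == d);
```

Hashing the raw bytes from BitConverter.GetBytes(d) instead of any text form would avoid the formatting question altogether, if that fits your data flow.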
I am working on a calculation module using C#, and I ran into this:
double v = 4 / 100;
I know this is a wrong initialization that returns v = 0.0 instead of v = 0.04.
The C# rules say I must ensure at least one of the operands is a double, like this:
double v = (double) 4 / 100;
double v = 4.0 / 100;
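To make the difference concrete with variables rather than literals, here is a small sketch (the variable names are made up for illustration):

```csharp
using System;

int numerator = 4;
int denominator = 100;

// int / int is evaluated as integer division first, then widened to double.
double truncated = numerator / denominator;

// Casting one operand makes the whole division a double division.
double viaCast = (double)numerator / denominator;

// A double literal has the same effect.
double viaLiteral = 4.0 / 100;

Console.WriteLine(truncated == 0.0);    // True
Console.WriteLine(viaCast == 0.04);     // True
Console.WriteLine(viaLiteral == 0.04);  // True
```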
However, I have many initializations of this kind involving operations on integer variables, and I would rather not go through my code line by line to detect such mistakes.
Instead, is it possible to have the compiler warn me about this?
Alright, after some playing around and whatnot, I have a solution. I used this article to come to this solution. I use StyleCop, so you'll need to get and install that. Then, you can download my C# project MathematicsAnalyzer.
First off, I did not account for all type conversion mismatches. In fact, I only accommodate one part.
Basically, I check to see if the line contains "double" followed by a space. I do know that could lead to false warnings, because the end of a class could be double or any number of other things, but I'll leave that to you to figure out how to properly isolate the type.
If a match is found, I check to see that it matches this regex:
double[ ][A-Za-z0-9]*[ ]?=(([ ]?[0-9]*d[ ]?/[ ]?[0-9]*;)|[ ]?[0-9]*[ ]?/[ ]?[0-9]*d;)
If it does -not- match this regex, then I add a violation. What this regex will match on is any of the following:
double i=4d / 100;
double i = 4d / 100;
double i = 4 / 100d;
double i = 4/ 100d;
double i = 4 /100d;
double i = 4/100d;
double i=4d / 100;
double i=4 / 100d;
double i=4/100d;
None of the above will create a violation. As it is currently written, pretty much anything without a 'd' suffix raises a violation. You'll need to add extra logic to account for the other possible ways of explicitly casting an operand. As I'm writing this, I've just realized that having a 'd' on both operands will most likely raise a violation too. Whoops.
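The pattern can be tried outside StyleCop with a few lines of C#. This is only a sketch of the matching step, not of the StyleCop rule itself; the sample lines are taken from the list above plus two that do not match.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the answer, copied verbatim.
var pattern = new Regex(
    @"double[ ][A-Za-z0-9]*[ ]?=(([ ]?[0-9]*d[ ]?/[ ]?[0-9]*;)|[ ]?[0-9]*[ ]?/[ ]?[0-9]*d;)");

string[] samples =
{
    "double i = 4d / 100;",   // matches: no violation
    "double i=4/100d;",       // matches: no violation
    "double i = 4 / 100;",    // no match: the case the rule is meant to catch
    "double i = 4d / 100d;",  // no match: flagged although the division is already double
};

foreach (string line in samples)
{
    // A line that fails to match the pattern is reported as a violation.
    bool violation = !pattern.IsMatch(line);
    Console.WriteLine($"{line,-24} violation: {violation}");
}
```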
And lastly, I could not get StyleCop to display my violation properly. It kept giving me an error about the rule not existing, and even with a second pair of eyes on it, we could not find a solution, so I hacked it. The error shows the name of the rule you were trying to find, so I just put the name of the rule as something descriptive and included the line number in it.
To install the custom rule, build the MathematicalAnalyzer project. Close Visual Studio and copy the DLL into the StyleCop install directory. When you open Visual Studio, you should see the rule in the StyleCop settings. Step 5 and 6 of the article I used shows where to do that.
This only gets one violation at a time throughout the solution, so you'll have to fix the violation it shows, and run StyleCop again to find the next one. There may be a way around that, but I ran out of juice and stopped here.
Enjoy!
This article explains how to set up custom Code Analysis rules that, when you run Code Analysis, can show warnings and what not.
http://blog.tatham.oddie.com.au/2010/01/06/custom-code-analysis-rules-in-vs2010-and-how-to-make-them-run-in-fxcop-and-vs2008-too/
I am running into something that should be very simple to answer, but I can't put my finger on it. It has been quite some time since I have done any trigonometry.
double cosValue = -2.7105054312E-20;
// (ACos) returns the angle
var deducedAngleInRadian = System.Math.Acos(cosValue);
var cos = System.Math.Cos(deducedAngleInRadian);
Console.WriteLine(cosValue);
Console.WriteLine(deducedAngleInRadian);
Console.WriteLine(cos);
Output:
-2.7105054312E-20
1.5707963267949
6.12303176911189E-17
How come that cosValue and cos are not the same?
Did you notice how close both values are to 0, and to each other?
Floating-point (im)precision and the implementation of each method probably explain that perfectly.
Those methods are not perfect; for example, they rely on an approximation of Pi (of course, as Pi can't be stored exactly in a computer ;)).
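A rough sketch of where the information is lost, using only System.Math. The exact angle differs from pi/2 by about 2.7e-20, but neighbouring doubles around 1.57 are about 2.2e-16 apart, so Acos cannot return anything that encodes such a small offset. The exact digits printed may vary between runtimes and platforms, so no expected output is shown.

```csharp
using System;

double cosValue = -2.7105054312E-20;
double angle = Math.Acos(cosValue);

// Gap between adjacent doubles in the range [1, 2), which contains pi/2.
double spacing = Math.Pow(2, -52);

// How far the returned angle is from the double nearest to pi/2.
Console.WriteLine(angle - Math.PI / 2);
Console.WriteLine(spacing);

// Math.PI / 2 is itself only an approximation of pi/2, so its cosine is
// a tiny non-zero number rather than 0 (or the original cosValue).
Console.WriteLine(Math.Cos(Math.PI / 2));
Console.WriteLine(Math.Cos(angle));
```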
You could probably achieve better precision (do you really need it?) with a scientific library dedicated to this, using higher-precision types than Double.
You may find some interesting material in "Math operations using System.Decimal in C#?" or https://stackoverflow.com/questions/1387430/recommended-math-library-for-c-net
Is there a generally accepted best approach to coding complex math? For example:
double someNumber = .123 + .456 * Math.Pow(Math.E, .789 * Math.Pow((homeIndex + .22), .012));
Is this a point where hard-coding the numbers is okay? Or should each number have a constant associated with it? Or is there even another way, like storing the calculations in config and invoking them somehow?
There will be a lot of code like this, and I'm trying to keep it maintainable.
Note: The example shown above is just one line. There would be tens or hundreds of these lines of code. And not only could the numbers change, but the formula could as well.
Generally, there are two kinds of constants - ones with the meaning to the implementation, and ones with the meaning to the business logic.
It is OK to hard-code the constants of the first kind: they are private to understanding your algorithm. For example, if you are using a ternary search and need to divide the interval in three parts, dividing by a hard-coded 3 is the right approach.
Constants with the meaning outside the code of your program, on the other hand, should not be hard-coded: giving them explicit names gives someone who maintains your code after you leave the company non-zero chances of making correct modifications without having to rewrite things from scratch or e-mailing you for help.
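As an illustration, the line from the question could be rewritten with named constants. The names below are purely hypothetical, since the question does not say what the numbers stand for; the point is only the shape of the code.

```csharp
using System;

// Hypothetical names: replace them with terms from your own domain.
const double BaseOffset = .123;
const double Scale = .456;
const double GrowthRate = .789;
const double IndexShift = .22;
const double IndexPower = .012;

// Same arithmetic as the original one-liner, split into two named steps.
double SomeNumber(double homeIndex)
{
    double exponent = GrowthRate * Math.Pow(homeIndex + IndexShift, IndexPower);
    return BaseOffset + Scale * Math.Pow(Math.E, exponent);
}

Console.WriteLine(SomeNumber(1.0));
```

Keeping the constants next to the formula (rather than in config) assumes they change only with a code release; if the business changes them more often, configuration may be the better home.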
"Is it okay"? Sure. As far as I know, there's no paramilitary police force rounding up those who sin against the one true faith of programming. (Yet.).
Is it wise?
Well, there are all sorts of ways of deciding that - performance, scalability, extensibility, maintainability etc.
On the maintainability scale, this is pure evil. It makes extensibility very hard; performance and scalability are probably not a huge concern.
If you left behind a single method with loads of lines similar to the above, your successor would have no chance maintaining the code. He'd be right to recommend a rewrite.
If you broke it down like
public float CalculateTax(Person person)
{
    float taxFreeAmount = CalcTaxFreeAmount(person);
    float taxableAmount = CalcTaxableAmount(person, taxFreeAmount);
    float taxAmount = CalcTaxAmount(person, taxableAmount);
    return taxAmount;
}
and each of the inner methods is a few lines long, but you left some hardcoded values in there - well, not brilliant, but not terrible.
However, if some of those hardcoded values are likely to change over time (like the tax rate), leaving them as hardcoded values is not okay. It's awful.
The best advice I can give is:
Spend an afternoon with Resharper, and use its automatic refactoring tools.
Assume the guy picking this up from you is an axe-wielding maniac who knows where you live.
I usually ask myself whether I can maintain and fix the code at 3 AM being sleep deprived six months after writing the code. It has served me well. Looking at your formula, I'm not sure I can.
Ages ago I worked in the insurance industry. Some of my colleagues were tasked to convert the actuarial formulas into code, first FORTRAN and later C. Mathematical and programming skills varied from colleague to colleague. What I learned from reviewing their code was the following:
document the actual formula in code; without it, years later you'll have trouble remembering the actual formula. External documentation goes missing, becomes dated or simply may not be accessible.
break the formula into discrete components that can be documented, reused and tested.
use constants to document equations; magic numbers have very little context and often require existing knowledge for other developers to understand.
rely on the compiler to optimize code where possible. A good compiler will inline methods, reduce duplication and optimize the code for the particular architecture. In some cases it may duplicate portions of the formula for better performance.
That said, there are times where hard-coding simply makes things easier, especially if those values are well understood within a particular context. For example, dividing (or multiplying) something by 100 or 1000 because you're converting a value to dollars. Another one is to multiply something by 3600 when you'd like to convert hours to seconds. Their meaning is often implied from the greater context. The following doesn't say much about the magic number 100:
public static double a(double b, double c)
{
return (b - c) * 100;
}
but the following may give you a better hint:
public static double calculateAmountInCents(double amountDue, double amountPaid)
{
return (amountDue - amountPaid) * 100;
}
As the above comment states, this is far from complex.
You can, however, store the magic numbers in constants/app.config values, so as to make it easier for the next developer to maintain your code.
When storing such constants, make sure to explain to the next developer (read: yourself in 1 month) what your thoughts were, and what they need to keep in mind.
Also explain what the actual calculation is for and what it is doing.
Do not leave in-line like this.
Use a constant so you can reuse it, easily find it and easily change it; it also makes maintenance easier when someone looks at your code for the first time.
You can do a config if it can/should be customized. What is the impact of a customer altering the value(s)? Sometimes it is best to not give them that option. They could change it on their own then blame you when things don't work. Then again, maybe they have it in flux more often than your release schedules.
It's worth noting that the JIT compiler in the CLR (rather than the C# compiler) will usually inline small one-line methods, so if you can extract certain formulas into one-liners you can generally extract them as methods without a noticeable performance loss.
EDIT:
Constants and such more or less depend on the team and the quantity of use. Obviously if you're using the same hard-coded number more than once, make it a constant. However, if you're writing a formula that it's likely only you will ever edit (small team), then hard-coding the values is fine. It all depends on your team's views on documentation and maintenance.
If the calculation in your line explains something to the next developer then you can leave it; otherwise it's better to have a calculated constant value in your code or configuration files.
I found one line in production code which was like:
int interval = 1 * 60 * 60 * 1000;
Even without any comment, it wasn't hard to see that the original developer meant 1 hour in milliseconds, which would be much less obvious from a bare value of 3600000.
IMO leaving the calculation spelled out may be better for scenarios like that.
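For durations specifically, the unit can also be carried by the type. A small sketch comparing the spelled-out arithmetic with System.TimeSpan (variable names are illustrative only):

```csharp
using System;

// Spelled-out arithmetic, as in the production line quoted above.
int intervalMs = 1 * 60 * 60 * 1000;

// Alternative: let TimeSpan state the unit explicitly.
TimeSpan oneHour = TimeSpan.FromHours(1);
double viaTimeSpan = oneHour.TotalMilliseconds;

Console.WriteLine(intervalMs);                 // 3600000
Console.WriteLine(viaTimeSpan == intervalMs);  // True
```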
Names can be added for documentation purposes. The amount of documentation needed depends largely on the purpose.
Consider following code:
float e = m * 8.98755179e16;
And contrast it with the following one:
const float c = 299792458;
float e = m * c * c;
Even though the variable names are not very 'descriptive' in the latter, you'll have a much better idea of what the code is doing than in the first one. Arguably there is no need to rename c to speedOfLight, m to mass and e to energy, as the names are self-explanatory in their domain.
const float speedOfLight = 299792458;
float energy = mass * speedOfLight * speedOfLight;
I would argue that the second code is the clearest one, especially if the programmer can expect to find STR in the code (an LHC simulator or something similar). To sum up: you need to find an optimal point. The more verbose the code, the more context you provide, which might both help to understand the meaning (what are e and c vs. we do something with mass and speed of light) and obscure the big picture (we square c and multiply by m vs. needing to scan the whole line to get the equation).
Most constants have some deeper meaning and/or established notation, so I would consider at least naming them by the convention (c for speed of light, R for gas constant, sPerH for seconds in an hour). If the notation is not clear, longer names should be used (sPerH in a class named Date or Time is probably fine while it is not in Paginator). The really obvious constants could be hard-coded (say, division by 2 when calculating the new array length in merge sort).
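For completeness, a runnable C# sketch of the named-constant version. It uses double rather than the float of the snippets above; that is a substitution on my part, made because a float keeps only 24 significant bits and cannot hold 299792458 exactly. The names SpeedOfLight and Energy are illustrative.

```csharp
using System;

// Speed of light in metres per second.
const double SpeedOfLight = 299792458;

// E = m * c^2, with the constant named after what it represents.
double Energy(double mass) => mass * SpeedOfLight * SpeedOfLight;

// Assigning the same constant to a float silently changes its value.
float asFloat = 299792458;
Console.WriteLine((double)asFloat);   // 299792448

Console.WriteLine(Energy(1.0));
```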
I've been playing with Script#, and I was wondering how the C# numbers were converted to Javascript. I wrote this little bit of code
int a = 3 / 2;
and looked at the relevant bit of compiled Javascript:
var $0=3/2;
In C#, the result of 3 / 2 assigned to an int is 1, but in Javascript, which only has one number type, it is 1.5.
Because of this disparity between the C# and Javascript behaviour, and since the compiled code doesn't seem to compensate for it, should I assume that my numeric calculations written in C# might behave incorrectly when compiled to Javascript?
Should I assume that my numeric calculations written in C# might behave incorrectly when compiled to Javascript?
Yes.
Like you said, "the compiled code doesn't seem to compensate for it" - though for the case you mention where a was declared as an int it would be easy enough to compensate by using var $0 = Math.floor(3/2);. But if you don't control how the "compiler" works you're in a pickle. (You could correct the JavaScript manually, but you'd have to do that every time you regenerated it. Yuck.)
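One caveat about the floor-based compensation: it agrees with C# only for non-negative results, because C# integer division truncates toward zero while floor rounds down. The sketch below uses C#'s Math.Floor and Math.Truncate as stand-ins for JavaScript's Math.floor and Math.trunc to show the difference.

```csharp
using System;

int positive = 3 / 2;     // 1
int negative = -3 / 2;    // -1: integer division truncates toward zero

double floored = Math.Floor(-3 / 2.0);       // -2: differs from the C# int result
double truncated = Math.Truncate(-3 / 2.0);  // -1: matches the C# int result

Console.WriteLine(positive);
Console.WriteLine(negative);
Console.WriteLine(floored == negative);    // False
Console.WriteLine(truncated == negative);  // True
```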
Note also that you are likely to have problems with decimal numbers too due to the way JavaScript represents decimal places. Most people are surprised the first time they find out that JavaScript will tell you that 0.4 * 3 works out to be 1.2000000000000002. For more details see one of the many other questions on this issue, e.g., How to deal with floating point number precision in JavaScript?. (Actually I think C# handles decimals the same way, so maybe this issue won't be such a surprise. Still, it can be a trap for new players...)
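On the C# side, a quick sketch shows which types share this behaviour: double uses the same binary floating-point format as JavaScript numbers, while the decimal type stores base-10 digits and does not show this particular artifact.

```csharp
using System;

double viaDouble = 0.4 * 3;
decimal viaDecimal = 0.4m * 3;

// double: same binary rounding as in JavaScript, so the product is not exactly 1.2.
Console.WriteLine(viaDouble == 1.2);    // False

// decimal: 0.4 and 1.2 are stored exactly in base 10.
Console.WriteLine(viaDecimal == 1.2m);  // True
```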