I have little experience with MongoDB; I usually work on large-scale SQL Server databases.
MongoDB only supports double and there is no decimal. The C# driver serializes decimals as strings.
What functionality do I miss if I store decimals as strings in MongoDB?
Is there a way to set a default serialization of decimals as double (AllowTruncation) without having to put an attribute on each property?
What do I lose in precision if I use a BSON double?
Thanks for your help!
UPDATE
I have an existing application model that uses decimals in C#. I want to use MongoDB as a new DB layer and change as little in the existing app as possible. That's why I am looking for a way to map decimals in C# to double in MongoDB.
I understand that I lose precision and would have to analyze the side effects of it. My only remaining question is whether there is a way to set a default serialization of decimals as double.
Thanks again. Great answers and comments so far.
As of MongoDB 3.4 and the 2.4 MongoDB C# driver, decimal types are supported.
The properties of your document must have the [BsonRepresentation(BsonType.Decimal128)] attribute found in the MongoDB.Bson.Serialization.Attributes namespace.
This will map to "YourDecimalValue" : NumberDecimal("100.0000") in MongoDB. Robomongo supports the new decimal type from version 1.1 Beta.
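For example, a document class might look like this (a minimal sketch; the class name and Id property are illustrative assumptions, while the decimal property matches the answer's example):

using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

public class Order
{
    public ObjectId Id { get; set; }

    // Serialized as NumberDecimal("...") in MongoDB rather than as a string or a double.
    [BsonRepresentation(BsonType.Decimal128)]
    public decimal YourDecimalValue { get; set; }
}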
I will answer your question partially (because I do not know much about C#).
So what will you lose if you store decimals as strings?
Your numbers will, on average, take more space (each double costs 8 bytes to store, which means every string longer than 8 characters takes more space). Because of this, any indexes built on this field will grow.
You will not be able to use operators that take numbers as arguments: $inc, $bit, $mod, $min, $max and, in version 2.6, $mul. (Maybe I forgot something.)
You will not be able to compare numbers correctly (maybe '1.65' and '1.23' compare fine as strings, but definitely not numbers with exponents and minus signs in them). Because of this, operations built on top of comparison, like $sort and all of $gte, $gt, $lte, $lt, will not work correctly; see the sketch below.
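A quick C# sketch of the comparison problem (purely illustrative; strings stored in MongoDB are ordered in roughly the same lexicographic way):

using System;

Console.WriteLine(string.CompareOrdinal("10.5", "9.5") < 0); // True: '1' sorts before '9', so "10.5" comes before "9.5"
Console.WriteLine(10.5 < 9.5);                               // False: numerically 10.5 is larger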
What will you lose in precision if you store decimal as double:
Based on this, decimal in C# has 28-29 significant digits, whereas the double-precision spec (see my first link) gives 15-17 significant digits. That gap is basically what you will lose; the sketch after the shell example below shows it directly.
Another really important thing that people sometimes forget when dealing with doubles is illustrated below:
db.c.insert({_id : 1, b : 3.44})
db.c.update({_id : 1}, {$inc : {b : 1}})
db.c.find({b : 4.44}) // where is my document? Nothing is returned: 3.44 + 1 is not exactly the double 4.44
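The precision gap itself is easy to see from C# (a small sketch; the exact digits printed for the double depend on the runtime's default formatting):

using System;

decimal m = 1.2345678901234567890123456789m; // 29 significant digits fit in a decimal
double d = (double)m;                        // only ~15-17 of them survive the conversion
Console.WriteLine(m);                        // 1.2345678901234567890123456789
Console.WriteLine(d);                        // roughly 1.2345678901234568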
Regarding the second sub-question:
Is there a way to set a default serialization of decimals as double (AllowTruncation) without having to put an Attribute on each property?
I do not really understand it, so I hope someone else will be able to answer it.
If I execute the following expression in C#:
double i = 10*0.69;
i is: 6.8999999999999995. Why?
I understand that numbers such as 1/3 can be hard to represent in binary, as they have infinitely recurring decimal places, but this is not the case for 0.69. And 0.69 can easily be represented in binary, one binary number for 69 and another to denote the position of the decimal place.
How do I work around this? Use the decimal type?
Because you've misunderstood floating point arithmetic and how data is stored.
In fact, your code isn't actually performing any arithmetic at execution time in this particular case - the compiler will have done it, then saved a constant in the generated executable. However, it can't store an exact value of 6.9, because that value cannot be precisely represented in floating point format, just like 1/3 can't be precisely stored in a finite decimal representation.
See if this article helps you.
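One way to see what the compiler actually stored is to print the values with the round-trip ("R") format, which emits enough digits to distinguish nearby doubles (a small sketch):

using System;

double i = 10 * 0.69;                  // folded by the compiler into a single double constant
Console.WriteLine(i.ToString("R"));    // 6.8999999999999995
Console.WriteLine(6.9.ToString("R"));  // 6.9 - a different (closer) double than 10 * 0.69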
why doesn't the framework work around this and hide this problem from me and give me the right answer, 0.69!!!
Stop behaving like a Dilbert manager, and accept that computers, though cool and awesome, have limits. In your specific case, it doesn't just "hide" the problem, because you have specifically told it not to. The language (the computer) provides alternatives to the format that you didn't choose. You chose double, which has certain advantages over decimal, and certain downsides. Now, knowing the answer, you're upset that the downsides don't magically disappear.
As a programmer, you are responsible for hiding this downside from managers, and there are many ways to do that. However, the makers of C# have a responsibility to make floating point work correctly, and correct floating point will occasionally result in incorrect math.
So will every other number storage method, as we do not have infinite bits. Our job as programmers is to work with limited resources to make cool things happen. They got you 90% of the way there, just get the torch home.
And 0.69 can easily be represented in binary, one binary number for 69 and another to denote the position of the decimal place.
I think this is a common mistake - you're thinking of floating point numbers as if they are base-10 (i.e. decimal - hence my emphasis).
So you're thinking that there are two whole-number parts to this double: 69, and a divide-by-100 that tells you where to put the decimal point - which could also be expressed as:
69 x 10 to the power of -2.
However floats store the 'position of the point' as base-2.
Your float actually gets stored as:
68999999999999995 x 2 to the power of some big negative number
This isn't as much of a problem once you're used to it - most people know and expect that 1/3 can't be expressed accurately as a decimal or percentage. It's just that the fractions that can't be expressed in base-2 are different.
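If you want to see that base-2 representation for yourself, you can look at the raw bits (a small sketch):

using System;

long bits = BitConverter.DoubleToInt64Bits(10 * 0.69);
Console.WriteLine(Convert.ToString(bits, 2).PadLeft(64, '0'));
// Layout: 1 sign bit | 11 exponent bits | 52 mantissa bits.
// The mantissa scaled by 2 to the exponent is the nearest representable value to 6.9, not 6.9 itself.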
but why doesn't the framework work around this and hide this problem from me and give me the right answer, 0.69!!!
Because you told it to use binary floating point. The fix is to use decimal floating point, so you are effectively suggesting that the framework should disregard the type you specified and use decimal instead, which is much slower because it is not implemented directly in hardware.
A more efficient solution is to not output the full value of the representation and to explicitly specify the accuracy required by your output. If you format the output to two decimal places, you will see the result you expect. However, if this is a financial application, decimal is precisely what you should use - you've seen Superman III (and Office Space), haven't you? ;)
Note that it is all a finite approximation of an infinite range, it is merely that decimal and double use a different set of approximations. The advantage of decimal is it produces the same approximations that you would if you were performing the calculation yourself. For example if you calculated 1/3, you would eventually stop writing 3's when it was 'good enough'.
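A short sketch of the two approaches mentioned above: format the double for display, or do the arithmetic in decimal when exactness matters:

using System;

double d = 10 * 0.69;
Console.WriteLine(d.ToString("F2")); // "6.90" - two decimal places, as suggested above

decimal m = 10 * 0.69m;              // the 'm' suffix makes 0.69 a decimal literal
Console.WriteLine(m);                // 6.90 - exact, at the cost of slower arithmetic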
For the same reason that 1 / 3 in a decimal system comes out as 0.3333333333333333333333333333333333333333333 and not the exact fraction, which is infinitely long.
To work around it (e.g. to display on screen) try this:
double i = (double) Decimal.Multiply(10, (Decimal) 0.69);
Everyone seems to have answered your first question, but ignored the second part.
What is the best data type to use when storing geopositional data in C#? I would use decimal for its exactness, but operations on decimal floating point numbers are slower than on binary floating point numbers (double).
I read that most of the time you won't need any more than 6 or 7 digits of precision for latitude or longitude. Does the inexactness of doubles even matter then or can it be ignored?
Go for double, there are several reasons.
Trigonometric functions are available only for double
Precision of double (range of 100 nanometers) is far beyond anything you'll ever require for Lat/Lon values
The GeoCoordinate class and third-party modules (e.g. DotSpatial) also use double for coordinates
A double has up to 15 decimal digits of precision. So, let's assume three of those digits are going to be on the left of the decimal point for lat/long values (a maximum of 180 degrees). This leaves 12 digits of precision on the right. Since a degree of lat/long is ~111 km, 5 of those 12 digits would give us precision to the meter. 3 more digits would give us precision to the millimeter. The remaining 4 digits would get us precision to around 100 nanometers. Since double will win from the perspective of performance and memory, I see no reason to even consider using decimal.
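A rough sanity check of that arithmetic (assuming ~111 km per degree; purely illustrative):

using System;

double metersPerDegree = 111000.0;
for (int decimals = 5; decimals <= 12; decimals++)
    Console.WriteLine($"{decimals} decimal places of a degree ~ {metersPerDegree * Math.Pow(10, -decimals):0.#########} m");
// 5 decimal places ~ 1.11 m, 8 ~ 0.00111 m (about a millimeter), 12 ~ 0.000000111 m (about 100 nanometers)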
I faced this question quite a while ago when I started with spatial programming.
I read a book a while ago that led me to this.
//SQL Server has a really cool DLL that deals with spatial data, such as
//geography points and so on.
//add this namespace
using Microsoft.SqlServer.Types;
//SqlGeography.Point(dblLat, dblLon, srid)
var lat_lon_point = Microsoft.SqlServer.Types.SqlGeography.Point(lat, lon, 4326);
This is the best way to work with spatial data in your application.
Then, to save the data, use this in SQL:
CREATE TABLE myGeoTable
(
    LatLonPoint GEOGRAPHY
)
Otherwise, if you are using something that isn't SQL Server, just convert the point to hexadecimal and store it. After a long time working with spatial data, I know this is the safest approach.
Double. Combining the answers, this is how Microsoft itself represents it in the SqlGeography library:
[get: Microsoft.SqlServer.Server.SqlMethod(IsDeterministic=true, IsPrecise=true)]
public System.Data.SqlTypes.SqlDouble Lat { get; }
Property value: SqlDouble. A SqlDouble value that specifies the latitude.
If you are using .NET EF Core, I would recommend the NetTopologySuite library.
Read the full documentation at the link below:
https://learn.microsoft.com/en-us/ef/core/modeling/spatial
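A hedged sketch of the setup that documentation page describes (the entity, property names and connection string below are illustrative assumptions, not from the original answer):

using Microsoft.EntityFrameworkCore;
using NetTopologySuite.Geometries;

// NetTopologySuite points are (x, y) = (longitude, latitude); SRID 4326 is the usual GPS reference system.
var home = new Point(-122.12, 47.64) { SRID = 4326 };
System.Console.WriteLine(home);

public class City
{
    public int Id { get; set; }
    public Point Location { get; set; }          // mapped to a spatial column by the provider
}

public class GeoContext : DbContext
{
    public DbSet<City> Cities { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options) =>
        options.UseSqlServer(
            "<your connection string>",           // placeholder
            sql => sql.UseNetTopologySuite());    // requires the SqlServer NetTopologySuite package
}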
I hope I have researched this enough that my premise is not totally off base. If so, then the mathematicians out there can set me straight.
My premise is that a Double value such as 12.5 should be rounded to 5 significant figures (NOT decimal places) as 12.500. Instead, using the following C# code, I get 12.5:
Double d = 12.5;
Console.WriteLine(d.ToString("G5"));
I came across this post from 2007 which seems to echo my problem. In fact, I am using those example numbers just to keep things consistent.
My goal here is to better understand the following:
Is my understanding of sig figs mathematically correct? I.e., is my expectation reasonable, or is the output "12.5" somehow correct?
Is this really a (very long-lived) bug in the framework? If so, can/will it be fixed?
Assuming it is a bug, what might I do about it now? Write a hack to determine how many sig figs you actually got back and then pad it? Roll my own code to do what the "G" format string was supposed to do? I have come across examples of this on SO already, so perhaps that is evidence that a clean option does not exist.
Additionally, I do realize that the storage issues with Double might negatively impact the rounding aspect of this problem, but for now, I am only concerned with the issue of more sig figs than original digits.
EDIT: I have tested this up to framework 4.5.
See this link on G-Format Specifier. It clearly states:
The result contains a decimal point if required, and trailing zeros after the decimal point are omitted.
A Double value is rounded to 15 significant figures, not five.
Reference: The General ("G") format specifier
Rounding a number to any number of significant figures doesn't mean that the formatted string has to contain that number of digits. If the value is rounded to 12.5000000000000 then it will be formatted into "12.5" because that is the most compact way to represent the value.
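If you really want trailing zeros up to a given number of significant figures, one option is to compute the required decimal places yourself and use a fixed-point format (a sketch along the lines of the "roll my own" option, assuming a positive, non-zero value):

using System;

double d = 12.5;
Console.WriteLine(d.ToString("G5"));           // "12.5" - trailing zeros are dropped

int sigFigs = 5;
int integerDigits = (int)Math.Floor(Math.Log10(Math.Abs(d))) + 1;
int decimals = Math.Max(sigFigs - integerDigits, 0);
Console.WriteLine(d.ToString("F" + decimals)); // "12.500"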
I have an application in which I want to be able to use large numbers and very precise numbers. For this, I need arbitrary precision, and IntX only works for integers.
Is there a class in the .NET Framework, or a third-party one (preferably free), that would do this?
Is there another way to do this?
Maybe the Decimal type would work for you?
You can use the freely available, arbitrary precision, BigDecimal from java.math, which is part of the J# redistributable package from Microsoft and is a managed .NET library.
Place a reference to vjslib in your project and you can do something like this:
using java.math;
using System.Diagnostics;

public void main()
{
    BigDecimal big = new BigDecimal("1234567890123456789011223344556677889900.0000009876543210000987654321");
    big = big.add(new BigDecimal(1.0)); // BigDecimal is immutable, so keep the returned result
    Debug.Print(big.toString());
}
Will print the following to the debug console:
1234567890123456789011223344556677889901.0000009876543210000987654321
Note that, as already mentioned, .NET 4 (shipped with Visual Studio 2010) contains a BigInteger class which, as a matter of fact, was already available in earlier versions, but only as an internal class (i.e., you'd need some reflection to get it to work).
The F# library has some really big number types as well if you're okay with using that...
I've been searching for a solution for this for a long time, and today came across this library:
Quadruple Precision Double in C#
Signed 128-bit floating point data type library, with 64 effective bits of precision (vs. 53 for Doubles) and a 64 bit exponent (vs. 11 for Doubles). Quads have greater precision and far greater range than Doubles and are especially useful when dealing with very large or very small values, such as those in probabilistic models. As of version 2.0, all Quad arithmetic is checked (underflowing to 0, overflowing to +/- infinity), has special PositiveInfinity, NegativeInfinity, and NaN values, and follows the same rules as .Net Double arithmetic and comparison operators (e.g. 1/0 == PositiveInfinity, 0 * PositiveInfinity == NaN, NaN != NaN), making it a convenient drop-in replacement for Doubles in existing code.
If Decimal doesn't work for you, try implementing (or grabbing code from somewhere) Rational arithmetic using large integers. That will provide the precision you need.
Shameless plug: QPFloat emulates the IEEE standard to full precision.
Use decimal for this if possible.
Decimal is a 128-bit (16 byte) value type that is used for highly precise calculations. It is a floating point type that is represented internally as base 10 instead of base 2 (i.e. binary). If you need to be highly precise, you should use Decimal - but the drawback is that Decimal is about 20 times slower than using floats.
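A quick illustration of that base-10 versus base-2 difference:

using System;

Console.WriteLine(0.1 + 0.2 == 0.3);    // False - binary doubles cannot hold 0.1, 0.2 or 0.3 exactly
Console.WriteLine(0.1m + 0.2m == 0.3m); // True  - decimal stores these base-10 fractions exactly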
Well, I'm about 12 years late to the party. 🙃
I just thought you'd like to use my arbitrary-precision floating point class called BigDecimal (I should have named it BigFloat, but it's kinda late for that now).
Well, more correctly, it will provide precision up to the number of digits you specify (by setting the static BigDecimal.Precision member); that way it doesn't use up all your RAM trying to represent irrational numbers.
I put a lot of effort into ensuring its correctness and working out all the bugs. It comes with a test project that tests every method in multiple ways, and each bug I fixed started by adding a test case.
And unlike QPFloat and the J# redistributable's BigDecimal, the code is not an incomprehensible mess and follows C# coding style and naming conventions (to be fair, part of the unreadability of J#'s version comes from the fact that you have to decompile the assembly first, so it'll be missing the names of all the private members).
Link drop:
BigDecimal on GitHub
ExtendedNumerics.BigDecimal on NuGet
Decimal is 128 bits if that would work.
What is the best data type to use for money in C#?
As described in the documentation for decimal:
The decimal keyword indicates a 128-bit data type. Compared to floating-point types, the decimal type has more precision and a smaller range, which makes it appropriate for financial and monetary calculations.
You can use a decimal as follows:
decimal myMoney = 300.5m;
System.Decimal
The Decimal value type represents decimal numbers ranging from positive 79,228,162,514,264,337,593,543,950,335 to negative 79,228,162,514,264,337,593,543,950,335. The Decimal value type is appropriate for financial calculations requiring large numbers of significant integral and fractional digits and no round-off errors. The Decimal type does not eliminate the need for rounding. Rather, it minimizes errors due to rounding.
I'd like to point to this excellent answer by zneak on why double shouldn't be used.
Use the Money pattern from Patterns of Enterprise Application Architecture: specify the amount as decimal and the currency as an enum.
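A minimal sketch of that pattern (the type and member names below are illustrative, not taken from the book):

using System;

public enum Currency { USD, EUR, GBP }

public readonly struct Money
{
    public decimal Amount { get; }
    public Currency Currency { get; }

    public Money(decimal amount, Currency currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // Keep money logic in one place; refuse to mix currencies by accident.
    public static Money operator +(Money a, Money b)
    {
        if (a.Currency != b.Currency)
            throw new InvalidOperationException("Cannot add amounts in different currencies.");
        return new Money(a.Amount + b.Amount, a.Currency);
    }

    public override string ToString() => $"{Amount:N2} {Currency}";
}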
Decimal. If you choose double, you're leaving yourself open to rounding errors.
decimal has a smaller range, but greater precision - so you don't lose all those pennies over time!
Full details here:
http://msdn.microsoft.com/en-us/library/364x0z75.aspx
Agree with the Money pattern: Handling currencies is just too cumbersome when you use decimals.
If you create a Currency class, you can put all the logic relating to money there, including a correct ToString() method, more control over parsing values, and better control of division.
Also, with a Currency class, there is no chance of unintentionally mixing money up with other data.
Another option (especially if you're rolling your own class) is to use an int or an Int64, and designate the lower four digits (or possibly even 2) as "right of the decimal point". So "on the edges" you'll need some "* 10000" on the way in and some "/ 10000" on the way out. This is the storage mechanism used by Microsoft's SQL Server, see http://msdn.microsoft.com/en-au/library/ms179882.aspx
The nicety of this is that all your summation can be done using (fast) integer arithmetic.
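A small sketch of that idea, using a scale of 10,000 as in the SQL Server money type linked above (the helper names are illustrative):

using System;

const int Scale = 10_000;                                   // four digits to the right of the decimal point

long ToScaled(decimal amount) => (long)(amount * Scale);    // the "* 10000" on the way in
decimal FromScaled(long scaled) => (decimal)scaled / Scale; // the "/ 10000" on the way out

long a = ToScaled(19.99m);
long b = ToScaled(0.01m);
long sum = a + b;                                           // the summation itself is fast integer arithmetic
Console.WriteLine(FromScaled(sum).ToString("F2"));          // 20.00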
Most applications I've worked with use decimal to represent money. This is based on the assumption that the application will never be concerned with more than one currency.
This assumption may be based on another assumption, that the application will never be used in other countries with different currencies. I've seen cases where that proved to be false.
Now that assumption is being challenged in a new way: New currencies such as Bitcoin are becoming more common, and they aren't specific to any country. It's not unrealistic that an application used in just one country may still need to support multiple currencies.
Some people will say that creating or even using a type just for money is "gold plating," or adding extra complexity beyond the known requirements. I strongly disagree. The more ubiquitous a concept is within your domain, the more important it is to make a reasonable effort to use the correct abstraction up front. If you want to see complexity, try working in an application that used to use decimal and now there's an additional Currency property next to every decimal property.
If you use the wrong abstraction up front, replacing it later will be a hundred times more work. That means potentially introducing defects into existing code, and the best part is that those defects will likely involve amounts of money, transactions with money, or just anything with money.
And it's not that difficult to use something other than decimal. Google "nuget money type" and you'll see that numerous developers have created such abstractions (including me.) It's easy. It's as easy as using DateTime instead of storing a date in a string.
Create your own class. This seems odd, but no single .NET type adequately covers different currencies.