Following on from this question, what would be the best way to represent a System.Decimal object in a Protocol Buffer?
Well, protobuf-net will simply handle this for you; it runs off the properties of types, and has full support for decimal. Since there is no direct way of expressing decimal in proto, it won't (currently) generate a decimal property from a ".proto" file, but it would be a nice tweak to recognise some common type ("BCL.Decimal" or similar) and interpret it as decimal.
As for representing it - I had a discussion document on this (now out of date I suspect) in the protobuf-net wiki area; there is now a working version in protobuf-net that simply does it for you.
No doubt Jon and I will hammer this out more later today ;-p
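To make the first point concrete, here is a minimal protobuf-net usage sketch (the type and field number are illustrative, not from the answer itself):
using System;
using System.IO;
using ProtoBuf;

[ProtoContract]
class Invoice
{
    // protobuf-net serializes decimal members out of the box
    [ProtoMember(1)]
    public decimal Amount { get; set; }
}

class Demo
{
    static void Main()
    {
        var ms = new MemoryStream();
        Serializer.Serialize(ms, new Invoice { Amount = 1.0316584m });
        ms.Position = 0;
        var copy = Serializer.Deserialize<Invoice>(ms);
        Console.WriteLine(copy.Amount); // 1.0316584
    }
}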
The protobuf-net version of this (in .proto) is something like (from here):
message Decimal {
  optional uint64 lo = 1;        // the first 64 bits of the underlying value
  optional uint32 hi = 2;        // the last 32 bits of the underlying value
  optional sint32 signScale = 3; // the number of decimal digits, and the sign
}
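To make the mapping concrete, here is a rough sketch of how a decimal could be split into those three fields; the names and the exact signScale packing are my guesses, not necessarily what protobuf-net puts on the wire:
using System;

static class DecimalPacking
{
    // Sketch only: one plausible mapping from decimal.GetBits()
    // to the lo/hi/signScale fields of the proto message above.
    public static (ulong lo, uint hi, int signScale) Pack(decimal value)
    {
        int[] bits = decimal.GetBits(value);                     // [low, mid, high, flags]
        ulong lo = (uint)bits[0] | ((ulong)(uint)bits[1] << 32); // low 64 bits
        uint hi = (uint)bits[2];                                 // high 32 bits
        int scale = (bits[3] >> 16) & 0xFF;                      // number of decimal digits
        bool negative = bits[3] < 0;                             // sign lives in the top bit
        return (lo, hi, (scale << 1) | (negative ? 1 : 0));      // sign in the low bit
    }
}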
Marc and I have very vague plans to come up with a "common PB message" library such that you can represent pretty common types (date/time and decimal springing instantly to mind) in a common way, with conversions available in .NET and Java (and anything else anyone wants to contribute).
If you're happy to stick to .NET, and you're looking for compactness, I'd possibly go with something like:
message Decimal {
  // 96-bit mantissa broken into two chunks
  optional uint64 mantissa_msb = 1;
  optional uint32 mantissa_lsb = 2;
  required sint32 exponent_and_sign = 3;
}
The sign can just be represented by the sign of exponent_and_sign, with the exponent being the absolute value.
Making both parts of the mantissa optional means that 0 is represented very compactly (but still differentiating between 0m and 0.0000m etc). exponent_and_sign could be optional as well if we really wanted.
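For illustration, a pack/unpack pair for exponent_and_sign might look like this (a hypothetical helper of mine; note that a negative value with exponent 0 would collide with +0 under this naive encoding, so a real implementation would need an offset or a spare bit):
using System;

static class ExponentAndSign
{
    // The field's sign carries the decimal's sign; its absolute
    // value carries the exponent (the number of decimal digits).
    public static int Pack(int exponent, bool negative) =>
        negative ? -exponent : exponent; // caveat: cannot encode "-0"

    public static (int exponent, bool negative) Unpack(int field) =>
        (Math.Abs(field), field < 0);
}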
I don't know about Marc's project, but in my port I generate partial classes, so you can then put a conversion between System.Decimal and Protobuf.Common.Decimal (or whatever) into the partial class.
A slightly simpler-to-implement approach than Jon's or Marc's is to store the value as four sint32 fields, which maps trivially onto the output of decimal.GetBits().
The proto file will look like:
message ProtoDecimal {
  sint32 v1 = 1;
  sint32 v2 = 2;
  sint32 v3 = 3;
  sint32 v4 = 4;
}
And the converter will be:
public decimal ConvertToDecimal(ProtoDecimal value)
{
    // decimal's constructor accepts the same four ints that GetBits produces
    return new decimal(new int[] { value.V1, value.V2, value.V3, value.V4 });
}

public ProtoDecimal ConvertFromDecimal(decimal value)
{
    // GetBits returns [low, mid, high, flags], where flags holds the scale and sign
    var bits = decimal.GetBits(value);
    return new ProtoDecimal
    {
        V1 = bits[0],
        V2 = bits[1],
        V3 = bits[2],
        V4 = bits[3]
    };
}
This might not be as simple in other languages, but if you only have to target C# it will take up the same maximum of 16 bytes as the other approach (although values like 0 might not be stored as compactly - I don't know enough about the intricate details of how protobuf stores ints), while being much clearer to dumb-dumb programmers like me :)
Obviously you will have to race the horses if you want to test performance but I doubt there is much in it.
When you know you have a limited number of decimal places, you can use the smallest required unit as an integer value. For example, when handling money you don't need a decimal type; instead you can define cents as the unit. An integer with value 2 would then refer to 0.02 in whatever currency is used.
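A minimal sketch of that idea, assuming cents as the unit and banker's rounding (both are choices, not requirements):
using System;

static class Money
{
    // Store amounts as a whole number of cents in a long
    public static long ToCents(decimal amount) =>
        (long)decimal.Round(amount * 100m, 0, MidpointRounding.ToEven);

    public static decimal FromCents(long cents) =>
        cents / 100m; // 2 -> 0.02
}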
I put together a patch for protobuf-csharp-port with hooks that generate protobuf classes with native Decimal and DateTime structs. Wire-format-wise, they are represented by two "built-in" proto messages.
Here is the link:
https://code.google.com/p/protobuf-csharp-port/issues/detail?can=2&start=0&num=100&q=&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary&groupby=&sort=&id=78
Related
I am developing an application for some guys who are very picky about numerical accuracy (they are dealing with, among other things, accounting and very precise telecom variables). For this reason I am using the decimal type everywhere, because the float and double types are not suitable (do not hesitate to redirect me to an existing better datatype).
My problem comes when I need to format those numbers. The requirement is to display as many decimal digits as needed, but at least 2, and also to use a group separator for thousands.
For example:
Value      Formatted
1          1.00
1000       1,000.00
1.5        1.50
1000.355   1,000.355
0.000035   0.000035
So I went to MSDN looking for numeric string formats. I found the useful resource Standard Numeric Formats, but none of my attempts worked as expected (N, N2, F, F2, G, G2, etc. - I tried various combinations even when I didn't believe in them ^^ - I even tried F2# for fun).
My conclusion is there is not a built-in format to do what I want. Right?
So I checked out the next chapter, Custom Numeric Formats. But I couldn't find a combination that suits my needs. So I went to SO and found a lot of questions about that (1, 2, and so on).
These questions lead me to fear that the only solution is this one: #,##0.00####### with as many trailing # as I need digits of precision.
Am I right?
I guess that with 12 #s my guys won't hit any accuracy issue, but have I missed the magical format I need?
This is probably what you're looking for:
static string FormatNumber(string input)
{
    var dec = decimal.Parse(input);
    // The scale (number of decimal digits) lives in bits 16-23 of the
    // flags word, which is the fourth element returned by GetBits
    var bits = decimal.GetBits(dec);
    var prec = (bits[3] >> 16) & 0xFF;
    if (prec < 2)
        prec = 2; // always show at least two decimal places
    // "N" formats with group separators and the requested precision
    return dec.ToString("N" + prec);
}
When you call it, pass in the decimal's ToString() result, and convert the output back to decimal if needed.
I tried your example numbers, and the results came out as expected.
Based on this SO answer.
I created a little function: it formats the number based on how many decimal places it has:
public static void Main()
{
    decimal theValue;

    theValue = 0.000035M;
    Console.WriteLine(theFormat(theValue));

    theValue = 1.5M;
    Console.WriteLine(theFormat(theValue));

    theValue = 1;
    Console.WriteLine(theFormat(theValue));
}

public static decimal theFormat(decimal theValue)
{
    // The third byte of the flags word holds the scale (number of decimal places)
    int count = BitConverter.GetBytes(decimal.GetBits(theValue)[3])[2];
    // More than one decimal place: leave as-is; otherwise force two places
    return count > 1 ? theValue : Convert.ToDecimal(string.Format("{0:F2}", theValue));
}
This produces the following output:
0.000035
1.50
1.00
If you want total control over formatting, you can implement your own IFormatProvider for decimals. Inside it you can use a StringBuilder and do anything you need, free of the restrictions of string.Format().
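A minimal sketch of that approach, wired to the "at least two decimal places, with group separators" rule from the question (the class name and the fallback behaviour are my own choices):
using System;
using System.Globalization;

public class MinTwoDecimalsProvider : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType) =>
        formatType == typeof(ICustomFormatter) ? this : null;

    public string Format(string format, object arg, IFormatProvider provider)
    {
        if (arg is decimal d)
        {
            // The scale (number of decimal digits) sits in bits 16-23 of the flags word
            int scale = (decimal.GetBits(d)[3] >> 16) & 0xFF;
            return d.ToString("N" + Math.Max(scale, 2)); // "N" adds group separators
        }
        // Anything that isn't a decimal falls back to its own formatting
        return arg is IFormattable f
            ? f.ToString(format, CultureInfo.CurrentCulture)
            : arg?.ToString() ?? string.Empty;
    }
}

// Usage: string.Format(new MinTwoDecimalsProvider(), "{0}", 1000.355m) -> "1,000.355"
//        string.Format(new MinTwoDecimalsProvider(), "{0}", 1.5m)      -> "1.50"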
There are already solutions to this problem for small numbers:
Here: Difference between 2 numbers
Here: C# function to find the delta of two numbers
Here: How can I find the difference between 2 values in C#?
I'll summarise the answer to them all:
Math.Abs(a - b)
The problem is that when the numbers are large this gives the wrong answer (by means of an overflow). Worse still, if (a - b) = Int32.MinValue then Math.Abs crashes with an exception (because |Int32.MinValue| is Int32.MaxValue + 1, which cannot be represented as an Int32):
System.OverflowException occurred
  HResult=0x80131516
  Message=Negating the minimum value of a twos complement number is invalid.
  Source=mscorlib
  StackTrace:
    at System.Math.AbsHelper(Int32 value)
    at System.Math.Abs(Int32 value)
Its specific nature leads to difficult-to-reproduce bugs.
Maybe I'm missing some well known library function, but is there any way of determining the difference safely?
As suggested by others, use BigInteger, as defined in System.Numerics (you'll have to add a reference to the System.Numerics assembly and a using directive).
Then you can just do:
BigInteger a = new BigInteger();
BigInteger b = new BigInteger();
// Assign values to a and b somewhere in here...
// Then just use included BigInteger.Abs method
BigInteger result = BigInteger.Abs(a - b);
Jeremy Thompson's answer is still valid, but note that System.Numerics includes a BigInteger.Abs method, so there shouldn't be any need for special logic. Also, Math.Abs has no overload for BigInteger, so it will give you grief if you try to pass one in.
Keep in mind there are caveats to using BigIntegers. If you have a ludicrously large number, C# will try to allocate memory for it, and you may run into out of memory exceptions. On the flip side, BigIntegers are great because the amount of memory allotted to them is dynamically changed as the number gets larger.
Check out the Microsoft reference for more info: https://msdn.microsoft.com/en-us/library/system.numerics.biginteger(v=vs.110).aspx
The question is, how do you want to hold the difference between two large numbers? If you're calculating the difference between two signed long (64-bit) integers, for example, and the difference will not fit into a signed long integer, how do you intend to store it?
long a = (1L << 62) + 1000; // note the L suffix: 1 << 62 would be an Int32 shift
long b = -(1L << 62);
long dif = a - b;           // overflow, bit truncation
The difference between a and b is wider than 64 bits, so when it's stored into a long integer, its high-order bits are truncated, and you get a strange value for dif.
In other words, you cannot store all possible differences between signed integer values of a given width into a signed integer of the same width. (You can only store half of all of the possible values; the other half require an extra bit.)
Your options are either to use a wider type to hold the difference (which won't help you if you're already using the widest integer type, long), or to use a different arithmetic type. If you need at least 64 signed bits of precision, you'll probably need to use BigInteger.
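As an aside, for the specific case of two longs the non-negative difference always fits in ulong, so BigInteger isn't strictly needed; a sketch of that trick (my code, not from the answer):
static ulong AbsDiff(long a, long b)
{
    // |a - b| for two 64-bit signed values always fits in 64 unsigned bits,
    // and wrap-around subtraction in ulong computes it exactly as long as
    // we subtract the smaller value from the larger one.
    return a >= b
        ? unchecked((ulong)a - (ulong)b)
        : unchecked((ulong)b - (ulong)a);
}

// AbsDiff(long.MaxValue, long.MinValue) == ulong.MaxValue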
BigInteger was introduced in .NET 4.0.
There are some open-source implementations available for earlier versions of the .NET Framework, but you'd be wise to go with the standard one.
If Math.Abs still gives you grief, you can implement the function yourself: if the number is negative (a - b < 0), simply negate it so it's non-negative.
Also, have you tried using doubles? They hold much larger values.
Here's an alternative that might be interesting to you, but it is very much within the confines of a particular int size. This example uses Int32, and uses bitwise operators to compute the difference and then the absolute value. The implementation tolerates your scenario where a - b equals the min int value: it naturally returns the min int value (there's not much else you can do without casting to a larger data type). I don't think this is as good an answer as using BigInteger, but it is fun to play with if nothing else:
static int diff(int a, int b)
{
    // Identify the bits where a and b differ
    int xorResult = (a ^ b);
    // Subtract only the differing parts; this equals a - b
    int diff = (a & xorResult) - (b & xorResult);
    // Branch-free absolute value: (x + (x >> 31)) ^ (x >> 31)
    return (diff + (diff >> 31)) ^ (diff >> 31);
}
Here are some cases I ran it through to play with the behavior:
Console.WriteLine(diff(13, 14)); // 1
Console.WriteLine(diff(11, 9)); // 2
Console.WriteLine(diff(5002000, 2346728)); // 2655272
Console.WriteLine(diff(int.MinValue, 0)); // Should be 2147483648, but int data type can't go that large. Actual result will be -2147483648.
Because I needed to look at some methods in BigInteger, I DotPeeked into the assembly. And then I found something rather odd:
internal int _sign;
Why would you use an int for the sign of a number? Is there no reason, or is there something I'm missing? I mean, they could have used a BitArray, or a bool, or a byte. Why an int?
If you look at some of the usages of _sign field in the decompiled code, you may find things like this:
if ((this._sign ^ other._sign) < 0)
    return this._sign >= 0 ? 1 : -1;
Basically, the int type allows comparing the signs of two values with plain integer arithmetic; in the snippet above, the XOR of the two sign words is negative exactly when the signs differ. Obviously neither byte nor bool would allow this.
Still, there is a question: why not Int16 then, as it would consume less memory? This is perhaps connected with alignment.
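A small illustration of that trick (my example, not decompiled code):
// With signs stored as negative/zero/positive ints, the XOR of two sign
// words is negative exactly when one is negative and the other is not.
static bool SignsDiffer(int sign1, int sign2) => (sign1 ^ sign2) < 0;

// SignsDiffer(-1, 1) == true; SignsDiffer(1, 1) == false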
Storing the sign as an int allows you to simply multiply by the sign to apply it to the result of a calculation. This could come in handy when converting to simpler types.
A bool can have only two states. The advantage of an int is that it also makes it simple to track the special value 0:
public bool get_IsZero()
{
    return (this._sign == 0);
}
And several more shortcuts like that when you read the rest of the code.
The size of any class object is going to be rounded up to 32 bits (four bytes), so "saving" three bytes won't buy anything. One might be able to shave four bytes off the size of a typical BigInteger by stealing a bit from one of the words that holds the numeric value, but the extra processing required for such usage would outweigh the cost of wasting a 32-bit integer.
A more interesting possibility might be to have BigInteger be an abstract class, with derived classes PositiveBigInteger and NegativeBigInteger. Since every class object is going to have a word that says what class it is, such an approach would save 32 bits for each BigInteger that's created. Use of an abstract class in such fashion would add an extra virtual member dispatch to each function call, but would likely save an "if" test on most of them (since the methods of e.g. NegativeBigInteger would know by virtue of the fact that they are invoked that this is negative, they wouldn't have to test it). Such a design could also improve efficiency if there were classes for TinyBigInteger (a BigInteger whose value could fit in a single Integer) and SmallBigInteger (a BigInteger whose value could fit in a Long). I have no idea if Microsoft considered such a design, or what the trade-offs would have been.
Gets a number that indicates the sign (negative, positive, or zero) of the current System.Numerics.BigInteger object:
-1  The value of this object is negative.
 0  The value of this object is 0 (zero).
 1  The value of this object is positive.
That means
class Program
{
    static void Main(string[] args)
    {
        BigInteger bInt1 = BigInteger.Parse("0");
        BigInteger bInt2 = BigInteger.Parse("-5");
        BigInteger bInt3 = BigInteger.Parse("5");

        division10(bInt1); // it is Impossible
        division10(bInt2); // it is Possible : -2
        division10(bInt3); // it is Possible : 2
    }

    static void division10(BigInteger bInt)
    {
        double d = 10;
        if (bInt.IsZero)
        {
            Console.WriteLine("it is Impossible");
        }
        else
        {
            Console.WriteLine("it is Possible : {0}", d / (int)bInt);
        }
    }
}
Don't use byte, uint, sbyte, ushort, or short for this either, because several of those types (uint, sbyte, and ushort) are not CLS-compliant.
What is the best way to convert/parse a string into a ulong in C# and keep precision?
A direct cast is not possible, and the Convert class doesn't provide the conversion I'm after, so I used an intermediate decimal variable, but I am losing precision.
decimal d = Decimal.Parse("1.0316584");
Console.Write(d); // displays 1.0316584
ulong u = (ulong)d;
Console.Write(u); // displays 1 - precision is lost
I first tried to use a long parser, but I got thrown out:
long l = Int64.Parse("1.0316584"); // throws System.FormatException
EDIT:
OK, sorry, my bad: my question was very badly put. "long" is indeed an integer in C#; I was confused by other languages I had used previously. Plus, I had to use ulong because that's what the third-party code I am using requires. So the multiplying factor suggested in an answer was indeed the way to go.
ulong is an integer type and can never have any precision for fractional/decimal values.
Strings can store a lot more data than a long. If you convert to a long, you run the risk of not being able to convert it back.
e.g. if I have the string Now is the time for all good men to come to the aid of their party. that can't really be converted to a long. "Precision" will be lost.
Having said that, a long is a 64-bit integer. It can't store that kind of data unless you're willing to change the "encoding" somehow. If you have code that looks like this:
decimal d = Decimal.Parse("1.0316584");
Console.Write(d); // displays 1.0316584
ulong u = (ulong)(d * 1000000000m);
Console.Write(u / 1000000000m); // displays 1.0316584 - precision is not lost
In C#, how can one store and calculate with numbers that significantly exceed UInt64's max value (18,446,744,073,709,551,615)?
Can you use the .NET 4.0 beta? If so, you can use BigInteger.
Otherwise, if you're sticking within 28 digits, you can use decimal - but be aware that obviously that's going to perform decimal arithmetic, so you may need to round at various places to compensate.
By using a BigInteger class; there's one in the J# libraries (definitely accessible from C#), another in F# (need to test that one), and there are freestanding implementations, such as this one in pure C#.
What is it that you wish to use these numbers for? If you are doing calculations with really big numbers, do you still need the accuracy down to the last digit?
If not, you should consider using floating-point values instead. They can be huge; the max value for the double type is 1.79769313486231570E+308 (in case you are not used to scientific notation, that means 1.7976931348623157 multiplied by a 1 followed by 308 zeros).
That should be large enough for most applications.
BigInteger represents an arbitrarily large signed integer.
using System.Numerics;
var a = BigInteger.Parse("91389681247993671255432112000000");
var b = new BigInteger(1790322312);
var c = a * b;
Decimal has a greater range than UInt64.
There is support for BigInteger in .NET 4.0, but that is still not out of beta.
There are several libraries for computing with big integers, most of the cryptography libraries also offer a class for that. See this for a free library.
Also, do check that you truly need a variable with greater capacity than Int64 and aren't falling foul of C#'s integer arithmetic.
For example, this code will yield an overflow error:
Int64 myAnswer = 20000*1024*1024;
At first glance that might seem to be because the result is too large for an Int64 variable, but actually it's because each of the numbers on the right side of the formula is implicitly typed as Int32, so the temporary memory space reserved for the result of the calculation will be Int32-sized, and that's what causes the overflow.
The result will actually easily fit into an Int64, it just needs one of the values on the right to be cast to Int64:
Int64 myAnswer = (Int64)20000*1024*1024;
This is discussed in more detail in this answer.
(I appreciate this doesn't apply in the OP's case, but it was just this sort of issue that brought me here!)
You can use decimal. Its range is greater than Int64's, and it has 28-29 significant digits.
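A quick illustration of decimal comfortably holding a value beyond UInt64.MaxValue (my example):
using System;

class DecimalRangeDemo
{
    static void Main()
    {
        // UInt64.MaxValue is about 1.8e19; decimal reaches about 7.9e28
        decimal beyondUlong = ulong.MaxValue * 1000m; // ulong converts implicitly to decimal
        Console.WriteLine(beyondUlong); // 18446744073709551615000
    }
}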