What is the point of using an int for the sign? - c#

Because I needed to look at some methods in BigInteger, I DotPeeked into the assembly. And then I found something rather odd:
internal int _sign;
Why would you use an int for the sign of a number? Is there no reason, or is there something I'm missing. I mean, they could use a BitArray, or a bool, or a byte. Why an int?

If you look at some of the usages of _sign field in the decompiled code, you may find things like this:
if ((this._sign ^ other._sign) < 0)
return this._sign >= 0 ? 1 : -1;
Basically, the int type lets you compare the signs of two values with a single XOR (or, equivalently, a multiplication): the result is negative exactly when the signs differ. Obviously neither byte nor bool would allow this.
That still leaves the question: why not Int16, which would consume less memory? That is most likely down to field alignment.
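As a quick illustration of that trick with plain ints (a sketch, not the BigInteger internals):
static bool SignsDiffer(int x, int y)
{
    // XOR leaves the sign bit set only when the two sign bits differ,
    // so the result is negative exactly when x and y have different signs.
    return (x ^ y) < 0;
}

// SignsDiffer(3, -7)  -> true
// SignsDiffer(-3, -7) -> false
// SignsDiffer(0, 5)   -> false (zero counts as non-negative)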

Storing the sign as an int allows you to simply multiply by the sign to apply it to the result of a calculation. This could come in handy when converting to simpler types.
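For example (a hypothetical sketch, not code from the BigInteger source):
// With the sign stored as an int (-1, 0 or +1), applying it is a plain multiplication.
int sign = -1;            // imagine this came from a _sign-style field
long magnitude = 12345;   // absolute value of the number
long value = sign * magnitude;   // -12345, no branching required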

A bool can have only 2 states. The advantage of an int is that it also makes it simple to keep track of the special value, 0:
public bool get_IsZero()
{
    return (this._sign == 0);
}
And several more shortcuts like that when you read the rest of the code.

The size of any class object is going to be rounded up to 32 bits (four bytes), so "saving" three bytes won't buy anything. One might be able to shave four bytes off the size of a typical BigInteger by stealing a bit from one of the words that holds the numeric value, but the extra processing required for such usage would outweigh the cost of wasting a 32-bit integer.
A more interesting possibility might be to have BigInteger be an abstract class, with derived classes PositiveBigInteger and NegativeBigInteger. Since every class object is going to have a word that says what class it is, such an approach would save 32 bits for each BigInteger that's created. Use of an abstract class in such fashion would add an extra virtual member dispatch to each function call, but would likely save an "if" test on most of them (since the methods of e.g. NegativeBigInteger would know by virtue of the fact that they are invoked that this is negative, they wouldn't have to test it). Such a design could also improve efficiency if there were classes for TinyBigInteger (a BigInteger whose value could fit in a single Integer) and SmallBigInteger (a BigInteger whose value could fit in a Long). I have no idea if Microsoft considered such a design, or what the trade-offs would have been.
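A rough sketch of the kind of hierarchy described above (purely hypothetical; System.Numerics.BigInteger is actually a struct and is not implemented this way):
// Hypothetical design: the sign is encoded in the runtime type instead of a field.
abstract class SignedMagnitude
{
    protected readonly uint[] Magnitude;   // absolute value, least significant word first
    protected SignedMagnitude(uint[] magnitude) => Magnitude = magnitude;
    public abstract int Sign { get; }
}

sealed class PositiveBigInteger : SignedMagnitude
{
    public PositiveBigInteger(uint[] magnitude) : base(magnitude) { }
    public override int Sign => 1;   // methods here "know" the value is positive
}

sealed class NegativeBigInteger : SignedMagnitude
{
    public NegativeBigInteger(uint[] magnitude) : base(magnitude) { }
    public override int Sign => -1;  // and here that it is negative, so no sign tests needed
}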

Gets a number that indicates the sign (negative, positive, or zero) of the current System.Numerics.BigInteger object.
-1: The value of this object is negative.
0: The value of this object is 0 (zero).
1: The value of this object is positive.
That means
class Program
{
    static void Main(string[] args)
    {
        BigInteger bInt1 = BigInteger.Parse("0");
        BigInteger bInt2 = BigInteger.Parse("-5");
        BigInteger bInt3 = BigInteger.Parse("5");

        division10(bInt1); // it is Impossible
        division10(bInt2); // it is Possible : -2
        division10(bInt3); // it is Possible : 2
    }

    static void division10(BigInteger bInt)
    {
        double d = 10;
        if (bInt.IsZero)
        {
            Console.WriteLine("it is Impossible");
        }
        else
        {
            Console.WriteLine("it is Possible : {0}", d / (int)bInt);
        }
    }
}
Don't use byte, sbyte, ushort, short or uint for this: those types are not CLS-compliant, and the CLS does not support them.

Related

Difference between two large numbers C#

There are already solutions to this problem for small numbers:
Here: Difference between 2 numbers
Here: C# function to find the delta of two numbers
Here: How can I find the difference between 2 values in C#?
I'll summarise the answer to them all:
Math.Abs(a - b)
The problem is that when the numbers are large this gives the wrong answer (because of an overflow). Worse still, if (a - b) == Int32.MinValue then Math.Abs crashes with an exception (because -Int32.MinValue is Int32.MaxValue + 1, which does not fit in an Int32):
System.OverflowException occurred
HResult=0x80131516
Message=Negating the minimum value of a twos complement number is invalid.
Source=mscorlib
StackTrace:
   at System.Math.AbsHelper(Int32 value)
   at System.Math.Abs(Int32 value)
Its specific nature leads to difficult-to-reproduce bugs.
Maybe I'm missing some well known library function, but is there any way of determining the difference safely?
As suggested by others, use BigInteger, as defined in System.Numerics (you'll need a reference to the System.Numerics assembly and a using directive for the namespace).
Then you can just do:
BigInteger a = new BigInteger();
BigInteger b = new BigInteger();
// Assign values to a and b somewhere in here...
// Then just use included BigInteger.Abs method
BigInteger result = BigInteger.Abs(a - b);
Jeremy Thompson's answer is still valid, but note that the BigInteger namespace includes an absolute value method, so there shouldn't be any need for special logic. Also, Math.Abs has no overload that accepts a BigInteger, so it will give you grief if you try to pass one in.
Keep in mind there are caveats to using BigIntegers. If you have a ludicrously large number, C# will try to allocate memory for it, and you may run into out of memory exceptions. On the flip side, BigIntegers are great because the amount of memory allotted to them is dynamically changed as the number gets larger.
Check out the microsoft reference here for more info: https://msdn.microsoft.com/en-us/library/system.numerics.biginteger(v=vs.110).aspx
The question is, how do you want to hold the difference between two large numbers? If you're calculating the difference between two signed long (64-bit) integers, for example, and the difference will not fit into a signed long integer, how do you intend to store it?
long a = (1L << 62) + 1000;
long b = -(1L << 62);
long dif = a - b; // Overflow, bit truncation
The difference between a and b does not fit in a signed 64-bit integer, so when it's stored into a long, the result wraps around and you get a strange value for dif.
In other words, you cannot store all possible differences between signed integer values of a given width into a signed integer of the same width. (You can only store half of all of the possible values; the other half require an extra bit.)
Your options are to either use a wider type to hold the difference (which won't help you if you're already using the widest long integer type), or to use a different arithmetic type. If you need at least 64 signed bits of precision, you'll probably need to use BigInteger.
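For instance, a minimal sketch of the BigInteger route for two longs (assuming a reference to System.Numerics):
using System;
using System.Numerics;

class Demo
{
    static void Main()
    {
        long a = long.MaxValue;
        long b = long.MinValue;

        // Promote to BigInteger *before* subtracting so the difference cannot overflow.
        BigInteger difference = BigInteger.Abs((BigInteger)a - b);
        Console.WriteLine(difference); // 18446744073709551615
    }
}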
The BigInteger was introduced in .Net 4.0.
There are some open source implementations available in lower versions of the .Net Framework, however you'd be wise to go with the standard.
If Math.Abs still gives you grief you can implement the function yourself: if the number is negative (a - b < 0), simply drop the negative sign so it's treated as unsigned.
Also, have you tried using Doubles? They hold much larger values.
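If the inputs are plain ints, a simple way to sidestep the overflow entirely is to widen to long before subtracting (a sketch, not a library function):
static long AbsoluteDifference(int a, int b)
{
    // (long)a - b can never overflow, and the absolute value always fits in a long.
    return Math.Abs((long)a - b);
}

// AbsoluteDifference(int.MinValue, 0) -> 2147483648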
Here's an alternative that might be interesting to you, but it is very much within the confines of a particular int size. This example uses Int32 and bitwise operators to compute the difference and then the absolute value. This implementation is tolerant of your scenario where a - b equals the minimum int value: it naturally returns the minimum int value (there's not much else it can do without casting to a larger data type). I don't think this is as good an answer as using BigInteger, but it is fun to play with if nothing else:
static int diff(int a, int b)
{
    int xorResult = (a ^ b);
    int diff = (a & xorResult) - (b & xorResult);
    return (diff + (diff >> 31)) ^ (diff >> 31);
}
Here are some cases I ran it through to play with the behavior:
Console.WriteLine(diff(13, 14)); // 1
Console.WriteLine(diff(11, 9)); // 2
Console.WriteLine(diff(5002000, 2346728)); // 2655272
Console.WriteLine(diff(int.MinValue, 0)); // Should be 2147483648, but int data type can't go that large. Actual result will be -2147483648.

What is the limit of the Value Type BigInteger in C#?

As described in MSDN BigInteger is :
An immutable type that represents an arbitrarily large integer whose
value in theory has no upper or lower bounds.
As I can see BigInteger is a ValueType, as much as I know, a ValueType must have a maximum size of 16 bytes.
MSDN goes further saying :
an OutOfMemoryException can be thrown for any operation that causes a
BigInteger value to grow too large.
and more :
Although this process is transparent to the caller, it does incur a
performance penalty. In some cases, especially when repeated
operations are performed in a loop on very large BigInteger values
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
I was told that it has reference type objects inside it, but all I can find here in its definition in Visual Studio is value types.
What's its real limit ? And even if doesn't have one, how can it "as a value type" manage to store all that amount of data ?
As I can see BigInteger is a ValueType, as much as I know, a ValueType must have a maximum size of 16 bytes.
No, that's not true. It's a conventional limit, but it's entirely feasible for a value type to take more than that. For example:
public struct Foo {
    private readonly int a, b, c, d, e; // Look ma, 20 bytes!
}
However, I strongly suspect that BigInteger actually includes a reference to a byte array:
public struct BigInteger {
    private readonly byte[] data;
    // Some other fields...
}
(Moslem Ben Dhaou's answer shows one current implementation using int and uint[], but of course the details of this are intentionally hidden.)
So the value of a BigInteger can still be small, but it can refer to a big chunk of memory - and if there isn't enough memory to allocate what's required when you perform some operation, you'll get an exception.
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
Well BigInteger is for integers, so I wouldn't particularly want to use it for anything to do with double... but fundamentally the limitations are going to be around how much memory you've got and the size of array the CLR can cope with. In reality, you'd be talking about enormous numbers before actually hitting the limit for any specific number - but if you have gazillions of smaller numbers, that obviously has large memory requirements too.
To confirm Jon Skeet's answer, I looked at the source code of BigInteger. It actually contains two internal fields, as follows:
internal int _sign;
internal uint[] _bits;
_bits is used by almost all private/public methods within the class which are used to read/write the actual data.
_sign is used to keep the sign of the BigInteger.
The private methods make extensive use of binary operators and calculations. Here is a small list of constants used in the class that hint at the limits:
private const int knMaskHighBit = -2147483648;
private const uint kuMaskHighBit = 2147483648U;
private const int kcbitUint = 32;
private const int kcbitUlong = 64;
private const int DecimalScaleFactorMask = 16711680;
private const int DecimalSignMask = -2147483648;
PS: I should have commented on J.S. answer, but a comment is too short. To view the source code, either download it or decompile System.Numerics.dll.
TL;DR: BigInteger's max value is 2^68,685,922,272.
In .Net 4.7.2 BigInteger uses a uint array for the bits.
A uint holds 32 bits of data.
An array's max size is defined as internal const int MaxArrayLength = 0X7FEFFFFF;
7FEFFFFF = 2146435071
Now, to calculate: the max size of the array times the capacity of each uint is 2146435071 x 32 = 68685922272. But that's only the count of the bits in a BigInteger.
Which means BigInteger's max value is 2^68'685'922'272, which is stupendously large (' used for easier readability).
If they ever decide to increase the array's max size, then it will also increase the max value for BigInteger.
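A quick back-of-the-envelope check of that figure (the constant below is the documented array-length limit quoted above):
const long MaxArrayLength = 0x7FEFFFFF;   // 2,146,435,071 elements
long maxBits = MaxArrayLength * 32L;      // each uint element contributes 32 bits
Console.WriteLine(maxBits);               // 68685922272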
I just did some quick experiments on this. The max seems to be around 2^65,000,000,000, but the practical limit is more like 2^2,146,435,071.
I get a System.OverflowException in Test 1 below at i = 0x1F. It overflowed somewhere between 0xE_FFFF_FFE2 and 0xF_7FFF_FFE1 bits (or between 2^64,424,509,410 and 2^66,571,993,057).
// Test 1
BigInteger test = 1;
for (int i = 0x00; i < 0xFF; i++)
    test <<= 0x7FFFFFFF;

// Test 2
BigInteger.Pow((BigInteger)2, 0x7FEFFFF0); // OK - I think - never finished
BigInteger.Pow((BigInteger)2, 0x7FEFFFFF); // Immediate OutOfMemoryException
I should also note that while exponents around 66,571,993,057 seem to be supported, the useful limit is more like 2^2,146,435,071, because Pow and shifts don't work with an exponent larger than 2,146,435,071 (for Pow()) or a shift amount larger than 2,147,483,647. Larger shifts can be done in several rounds, but that hurts efficiency. Everything is also slow at these sizes: a single shift was taking about 7 seconds, and BigInteger.Pow() took at least 5 minutes.
.Net 5, AMD Threadripper, 32GB RAM, Windows 10 x64

Reading and setting base 3 digits from base 2 integer

Part of my application data contains a set of 9 ternary (base-3) "bits". To keep the data compact for the database, I would like to store that data as a single short. Since 3^9 < 2^15 I can represent any possible 9 digit base-3 number as a short.
My current method is to work with it as a string of length 9. I can read or set any digit by index, and it is nice and easy. To convert it to a short though, I am currently converting to base 10 by hand (using a shift-add loop) and then using Int16.Parse to convert it back to a binary short. To convert a stored value back to the base 3 string, I run the process in reverse. All of this takes time, and I would like to optimize it if at all possible.
What I would like to do is always store the value as a short, and read and set ternary bits in place. Ideally, I would have functions to get and set individual digits from the binary in place.
I have tried playing with some bit shifts and mod functions, but haven't quite come up with the right way to do this. I'm not even sure it is possible without going through the full conversion.
Can anyone give me any bitwise arithmetic magic that can help out with this?
public class Base3Handler
{
    // Powers of 3: idx[p] == 3^p, so digit p of n is (n % 3^(p+1)) / 3^p.
    private static int[] idx = {1, 3, 9, 27, 81, 243, 729, 729*3, 729*9, 729*27};

    public static byte ReadBase3Bit(short n, byte position)
    {
        if (position > 8)
            throw new Exception("Out of range...");
        return (byte)((n % idx[position + 1]) / idx[position]);
    }

    public static short WriteBase3Bit(short n, byte position, byte newBit)
    {
        byte oldBit = ReadBase3Bit(n, position);
        return (short)(n + (newBit - oldBit) * idx[position]);
    }
}
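For example, a quick usage sketch of the helper above:
short packed = 0;

// Set digit 0 to 2 and digit 8 to 1: the value becomes 2*3^0 + 1*3^8 = 2 + 6561 = 6563.
packed = Base3Handler.WriteBase3Bit(packed, 0, 2);
packed = Base3Handler.WriteBase3Bit(packed, 8, 1);

Console.WriteLine(Base3Handler.ReadBase3Bit(packed, 0)); // 2
Console.WriteLine(Base3Handler.ReadBase3Bit(packed, 8)); // 1
Console.WriteLine(packed);                               // 6563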
These are small numbers. Store them as you wish, efficiently in memory, but then use a table lookup to convert from one form to another as needed.
You can't do bit operations on ternary values. You need to use multiply, divide and modulo to extract and combine values.
To use bit operations you need to limit the packing to 8 ternaries per short (i.e. 2 bits each)
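A sketch of that 2-bits-per-digit layout (it wastes one of the four codes in each 2-bit field, but gains shift/mask access):
// Each ternary digit (0..2) occupies its own 2-bit field; 8 digits fit in a 16-bit value.
static int ReadDigit(ushort packed, int position)        // position 0..7
{
    return (packed >> (position * 2)) & 0b11;
}

static ushort WriteDigit(ushort packed, int position, int digit)  // digit 0..2
{
    int cleared = packed & ~(0b11 << (position * 2));    // clear the old 2-bit field
    return (ushort)(cleared | (digit << (position * 2)));
}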

enum with value 0x0001?

I have an enum declaration like this:
public enum Filter
{
    a = 0x0001,
    b = 0x0002
}
What does that mean? They are using this to filter an array.
It means they're the integer values assigned to those names. Enums are basically just named numbers. You can cast between the underlying type of an enum and the enum value.
For example:
public enum Colour
{
    Red = 1,
    Blue = 2,
    Green = 3
}
Colour green = (Colour) 3;
int three = (int) Colour.Green;
By default an enum's underlying type is int, but you can use any of byte, sbyte, short, ushort, int, uint, long or ulong:
public enum BigEnum : long
{
    BigValue = 0x5000000000 // Couldn't fit this in an int
}
It just means that if you use Filter.a, you get 1, and Filter.b is 2.
The weird hex notation is just that, notation.
EDIT:
Since this is a 'filter' the hex notation makes a little more sense.
By writing 0x1, you specify the following bit pattern:
0000 0001
And 0x2 is:
0000 0010
This makes it clearer on how to use a filter.
So for example, if you wanted to filter out data that has the lower 2 bits set, you could do:
Filter.a | Filter.b
which would correspond to:
0000 0011
The hex notation makes the concept of a filter clearer (for some people). For example, it's relatively easy to figure out the binary of 0x83F0 by looking at it, but much more difficult for 33776 (the same number in base 10).
It's not clear what it is that you find unclear, so let's discuss it all:
The enum values have been given explicit numerical values. Each enum value is always represented as a numerical value for the underlying storage, but if you want to be sure what that numerical value is you have to specify it.
The numbers are written in hexadecimal notation, this is often used when you want the numerical values to contain a single set bit for masking. It's easier to see that the value has only one bit set when it's written as 0x8000 than when it's written as 32768.
In your example it's not as obvious as you have only two values, but for bit filtering each value represents a single bit so that each value is twice as large as the previous:
public enum Filter {
    First = 0x0001,
    Second = 0x0002,
    Third = 0x0004,
    Fourth = 0x0008
}
You can use such an enum to filter out single bits in a value:
if ((num & Filter.First) != 0 && (num & Filter.Third) != 0) {
    Console.WriteLine("First and third bits are set.");
}
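As a small aside (not part of the original answer): decorating such an enum with [Flags] makes combined values print with their names, which can make debugging filters easier. A sketch, assuming the Filter enum above is additionally declared with the [Flags] attribute:
Filter mask = Filter.First | Filter.Third;

Console.WriteLine(mask);                        // "First, Third" instead of "5"
Console.WriteLine(mask.HasFlag(Filter.Third));  // True
Console.WriteLine((mask & Filter.Second) != 0); // False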
It could mean anything. We need to see more code than that to be able to understand what it's doing.
0x001 is the number 1. Anytime you see the 0x it means the programmer has entered the number in hexadecimal.
Those are literal hexadecimal numbers.
The main reason is that hex notation is easier to read when you need numbers of the form "2 to the power of x".
To use an enum type as a bit flag, we need to make the enum values powers of 2:
1, 2, 4, 8, 16, 32, 64, etc. To keep this readable, hex notation is used.
For example, 2^16 is 0x10000 in hex (neat and clean), but it is written 65536 in classical decimal notation. The same goes for 0x200 (hex notation) and 512 (2^9).
Those look like they are bit masks of some sort. But their actual values are 1 and 2...
You can assign values to enums such as:
enum Example {
    a = 10,
    b = 23,
    c = 0x00FF
}
etc...
Using hexadecimal notation like that usually indicates that there may be some bit manipulation. I've used this notation often when dealing with this very thing, for the very reason you asked this question - this notation sort of pops out at you and says "Pay attention to me, I'm important!"
In fact you could use plain integers, or no explicit values at all, since by default an enum assigns 0 to its first member and an incremented value to each following member. Many developers use hex values hoping to hit two targets with one arrow:
make the code look clever, even if it becomes harder to understand, and
gain performance, on the theory that hex values are "closer" to binary.
In my view the second point is moot: the notation has no effect on the compiled values, and if closeness to binary were really the goal we might as well write binary directly. That said, hex notation is a genuinely good fit when you are playing with bits, for example in flag masks or encryption/decryption code.

What's the best way to represent System.Decimal in Protocol Buffers?

Following on from this question, what would be the best way to represent a System.Decimal object in a Protocol Buffer?
Well, protobuf-net will simply handle this for you; it runs off the properties of types, and has full support for decimal. Since there is no direct way of expressing decimal in proto, it won't (currently) generate a decimal property from a ".proto" file, but it would be a nice tweak to recognise some common type ("BCL.Decimal" or similar) and interpret it as decimal.
As for representing it - I had a discussion document on this (now out of date I suspect) in the protobuf-net wiki area; there is now a working version in protobuf-net that simply does it for you.
No doubt Jon and I will hammer this out more later today ;-p
The protobuf-net version of this (in .proto) is something like (from here):
message Decimal {
    optional uint64 lo = 1;        // the first 64 bits of the underlying value
    optional uint32 hi = 2;        // the last 32 bits of the underlying value
    optional sint32 signScale = 3; // the number of decimal digits, and the sign
}
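For reference, a rough sketch of how a System.Decimal decomposes into that shape via decimal.GetBits. The exact signScale packing protobuf-net uses is an assumption here (sign in the low bit), so treat this as illustrative only:
static (ulong lo, uint hi, int signScale) ToProtoParts(decimal value)
{
    int[] bits = decimal.GetBits(value);                       // [lo, mid, hi, flags]
    ulong lo  = (uint)bits[0] | ((ulong)(uint)bits[1] << 32);  // first 64 bits of the 96-bit mantissa
    uint  hi  = (uint)bits[2];                                 // remaining 32 bits
    int scale = (bits[3] >> 16) & 0xFF;                        // decimal digits after the point (0..28)
    bool negative = bits[3] < 0;                               // sign is the top bit of the flags word
    int signScale = (scale << 1) | (negative ? 1 : 0);         // assumed packing: sign in the low bit
    return (lo, hi, signScale);
}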
Marc and I have very vague plans to come up with a "common PB message" library such that you can represent pretty common types (date/time and decimal springing instantly to mind) in a common way, with conversions available in .NET and Java (and anything else anyone wants to contribute).
If you're happy to stick to .NET, and you're looking for compactness, I'd possibly go with something like:
message Decimal {
    // 96-bit mantissa broken into two chunks
    optional uint64 mantissa_msb = 1;
    optional uint32 mantissa_lsb = 2;
    required sint32 exponent_and_sign = 3;
}
The sign can just be represented by the sign of exponent_and_sign, with the exponent being the absolute value.
Making both parts of the mantissa optional means that 0 is represented very compactly (but still differentiating between 0m and 0.0000m etc). exponent_and_sign could be optional as well if we really wanted.
I don't know about Marc's project, but in my port I generate partial classes, so you can then put a conversion between System.Decimal and Protobuf.Common.Decimal (or whatever) into the partial class.
A slightly simpler approach to implement than Jon's or Marc's is to store it as 4 sint32 values, which maps trivially onto the output of decimal.GetBits().
The proto file will look like:
message ProtoDecimal {
    sint32 v1 = 1;
    sint32 v2 = 2;
    sint32 v3 = 3;
    sint32 v4 = 4;
}
And the converter will be:
public decimal ConvertToDecimal(ProtoDecimal value)
{
    return new decimal(new int[] { value.V1, value.V2, value.V3, value.V4 });
}

public ProtoDecimal ConvertFromDecimal(decimal value)
{
    var bits = decimal.GetBits(value);
    return new ProtoDecimal
    {
        V1 = bits[0],
        V2 = bits[1],
        V3 = bits[2],
        V4 = bits[3]
    };
}
This might not be as simple in other languages, but if you only have to target C# then it will take up the same maximum of 16 bytes that the other approach will (although values like 0 might not be as compactly stored - I don't know enough about the intricate details of how protobuf stores ints), while being much clearer to dumb-dumb programmers like me :)
Obviously you will have to race the horses if you want to test performance but I doubt there is much in it.
When you know you have a limited number of decimal places, you can use the smallest possible unit as an integer value. For example, when handling money you don't require a decimal type; instead you can define the unit to be cents. Then an integer with value 2 would refer to 0.02 in whatever currency is used.
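A minimal sketch of that idea for a money amount (the variable names are made up for illustration):
// On the wire: a plain 64-bit integer number of cents, e.g. an int64/sint64 proto field.
long cents = 199;                        // represents 1.99 in the chosen currency

// Back on the .NET side:
decimal amount = cents / 100m;           // 1.99m
long roundTrip = (long)(amount * 100m);  // 199 again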
I put together a patch for protobuf-csharp-port with hooks that generate protobuf classes with native Decimal and DateTime structs. Wire-format-wise, they are represented by two "built-in" proto messages.
Here is the link:
https://code.google.com/p/protobuf-csharp-port/issues/detail?can=2&start=0&num=100&q=&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary&groupby=&sort=&id=78
