Different integer size usages - C#

Is there a difference between the memory usage of the integer 1 and the integer 234234? How much space would int.MaxValue use compared to just the integer 1?
I'm incrementing a value every time the object gets accessed, so I can keep the most-used objects in memory and flush the others to disk. I was therefore wondering whether the counter (an integer) might grow large and end up using a lot of memory just for the counter.

No. int.MaxValue is the largest value which can be represented by a 32-bit integer. If you want a larger value you use long, which consumes 64 bits. Basically, the amount of memory an integer consumes has nothing to do with its value.
int small = 1;           // stored as 0x00000001
while
int big = int.MaxValue;  // stored as 0x7FFFFFFF
They still consume the same 4 bytes of memory; they just have different values for the bits. The values in the snippet are written in hexadecimal; if you don't know how that translates into actual bits, look it up on Wikipedia.
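If it helps to see this directly, here is a minimal sketch (the class and variable names are just for illustration) showing that both values report the same size:
using System;

class IntSizeDemo
{
    static void Main()
    {
        int small = 1;           // stored as 0x00000001
        int big = int.MaxValue;  // stored as 0x7FFFFFFF

        // sizeof(int) is a compile-time constant: 4 bytes, whatever the value.
        Console.WriteLine(sizeof(int));            // 4
        Console.WriteLine($"{small:X8} {big:X8}"); // 00000001 7FFFFFFF
    }
}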

Related

Decompression of 2 byte to 4 byte float

I have some animation data (x, y, z) which is represented as 2-byte structures written in little-endian order. I know that they should be 4-byte floating-point values, so I have to unpack them. I collected a few sample values as precisely as possible (they don't represent the packed values exactly, but are very close) and roughly divided the packed values into a few ranges.
Sample values (Little Endian):
0.048879981 - 0x0046
0.056879997 - 0x0047
0.253880024 - 0x0050
0.313879967 - 0x0051
0.623880029 - 0x0055
1.003879905 - 0x0058
-0.066120029 - 0x00C8
-0.1561199428 - 0x00CD
-0.8691199871 - 0x00D7
Ranges:
0x0000 : zero
[0x0000,0x0014] : invisible changes (increasing probably)
[0x0014, ....] : increasing (visible)
0x0080 : zero, probably the point of sign change
[0x0080,0x00B0] : invisible changes (decreasing probably)
[0x00B0, ....] : decreasing (visible)
There are gaps (....) at the ends of the ranges because it is hard to check them accurately, but I assume such large values lying close to those ends aren't used in practice.
Also, there appears to be symmetry between the positive and negative ranges; for example, I tested 0x0058, which gave 1.003879905, and 0x00D8, which gave a value close to -1.003879905 but not exactly. Maybe that is because of the slight offset observed after 0x0080: visible decreasing starts from 0x00B0, whereas it should start around 0x0094 if the whole range were perfectly symmetric. But it might just as well be measurement inaccuracy.
So, how do I write a function in C# that converts the source data to a 4-byte floating-point value?
Some initial comments based on the information in the question so far:
byte[] buffer = new byte[4]; is a bad approach because it addresses bytes individually while the rest of the code manipulates bits using shifts within words, and C# does not define the endianness of that mapping. Simply use an unsigned 32-bit integer for all the work; the code will actually be simpler.
The code does not handle subnormal values properly. If num2 is zero and num3 is not zero, the significand (num3) must be shifted and the exponent (num2) must be adjusted.
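The packed format in the question may well not be standard IEEE half precision, but a binary16 → binary32 conversion is a reasonable sketch of the significand shift and exponent adjustment the subnormal case needs (here exp and mant play the roles of num2 and num3 from the original, unshown code; BitConverter.Int32BitsToSingle is available on modern .NET):
using System;

static class HalfDecoder
{
    // Converts an IEEE 754 binary16 value (read from the stream as a
    // little-endian ushort) to a 32-bit float, including subnormals.
    public static float HalfToFloat(ushort h)
    {
        uint sign = (uint)(h >> 15) & 0x1u;
        uint exp  = (uint)(h >> 10) & 0x1Fu;  // 5-bit exponent
        uint mant = (uint)h & 0x3FFu;         // 10-bit significand

        uint bits;
        if (exp == 0)
        {
            if (mant == 0)
            {
                bits = sign << 31;            // signed zero
            }
            else
            {
                // Subnormal: shift the significand left until its implicit
                // leading 1 appears, adjusting the exponent as we go.
                exp = 113;                    // 127 - 15 + 1, corrected by the loop below
                while ((mant & 0x400u) == 0)
                {
                    mant <<= 1;
                    exp--;
                }
                mant &= 0x3FFu;               // drop the now-implicit leading 1
                bits = (sign << 31) | (exp << 23) | (mant << 13);
            }
        }
        else if (exp == 0x1F)
        {
            bits = (sign << 31) | 0x7F800000u | (mant << 13); // Inf / NaN
        }
        else
        {
            bits = (sign << 31) | ((exp + 112u) << 23) | (mant << 13); // rebias: 127 - 15
        }
        return BitConverter.Int32BitsToSingle((int)bits);
    }
}
For instance, HalfToFloat(0x3C00) returns 1.0f and HalfToFloat(0x0001) returns the smallest positive binary16 subnormal, about 5.96e-8.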

Why is BigInteger in C# a struct if it has an unbounded size?

Why is BigInteger declared as a value type (struct) in C#? It seems very similar to the string type, which is declared as a reference type.
Both are immutable. Both can be arbitrarily large.
The recommendation I have heard is that a struct should never be more than 16 bytes. BigInteger can get much larger than 16 bytes, and I would think this would make frequent operations extremely slow since it is always copied by value.
Copying a BigInteger does not cause the underlying data to be copied. Instead, just a reference to the data is copied.
Since BigInteger values are immutable it is safe for two or more values to share a common data buffer.
BigInteger has two instance fields:
int _sign - probably tells whether it's a positive or negative value.
uint[] _bits - this is a reference to the data buffer.
An int is 4 bytes and a reference is 8 bytes (on a 64-bit system). Therefore the size of a BigInteger is ≤ 16 bytes.
If you look at the source for BigInteger and strip it down to only the instance-level fields (the things that count toward its size), all the struct contains is
public struct BigInteger : IFormattable, IComparable, IComparable<BigInteger>, IEquatable<BigInteger>
{
    internal int _sign;
    internal uint[] _bits;
}
So you have 4 bytes for _sign and 4 or 8 bytes for the uint[] reference, depending on whether you are on a 32-bit or 64-bit system, because arrays are reference types. That gives a total of 8 or 12 bytes, well below the 16-byte recommendation. (Note: the CLR will pad the 12-byte version to 16 bytes to make it a multiple of 8, for alignment reasons.)
When a BigInteger is copied, the _bits array is shared between the two instances. Because the type is immutable (you can't change the value of any cell of _bits), it is safe for the two copies to share the array.
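A quick way to check the struct-size claim, sketched here with Unsafe.SizeOf (in-box on modern .NET, otherwise available via the System.Runtime.CompilerServices.Unsafe package):
using System;
using System.Numerics;
using System.Runtime.CompilerServices;

class BigIntegerSize
{
    static void Main()
    {
        // Prints 16 on a 64-bit runtime (4-byte _sign + 8-byte array
        // reference, padded) and 8 on a 32-bit runtime.
        Console.WriteLine(Unsafe.SizeOf<BigInteger>());
    }
}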
Here are the fields of a BigInteger:
// For values int.MinValue < n <= int.MaxValue, the value is stored in sign
// and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
internal int _sign;
internal uint[] _bits;
So, one int and one uint[], which is a reference type. The struct itself can't grow arbitrarily large: it'll be 8 bytes on x86 and 16 bytes on x64 (12 bytes for the fields + 4 bytes of padding).
string and arrays are the only types in the framework which have a varying size and are special-cased in the runtime.
To actually answer the question: there is less overhead in using a struct. Having a class wrapper over the two fields would cause more indirection and more GC pressure for no good reason. Besides, a BigInteger is semantically a value.
The size of a struct matters only because the entire struct has to be copied each time you pass it from one function to another. If it were not for the copying, nobody would care.
However, BigInteger consists of two parts:
The actual struct, which is the part that gets copied when you pass a BigInteger around, and is fairly small, and
The array of bits, which is of arbitrary length, but which is not copied each time the struct is copied.
So, when you pass a BigInteger, this is what happens:
Before copying:
[BigInteger instance 1] ---------> [array of bits]
After copying:
[BigInteger instance 1] ---------> [array of bits]
                                          ^
                                          |
[BigInteger instance 2] ------------------+
Notice how there is always just one array of bits.
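A minimal sketch of the same idea, using a made-up TinyBigInt struct (not the real type) to show that copying the struct copies only the array reference, never the array itself:
using System;

public readonly struct TinyBigInt
{
    internal readonly int _sign;
    internal readonly uint[] _bits;

    public TinyBigInt(int sign, uint[] bits) { _sign = sign; _bits = bits; }
}

public static class SharingDemo
{
    public static void Main()
    {
        var a = new TinyBigInt(1, new uint[] { 0xDEADBEEF });
        var b = a; // struct copy: the sign and one reference; the array is not duplicated

        Console.WriteLine(ReferenceEquals(a._bits, b._bits)); // True: same buffer
    }
}
Immutability is what makes this sharing safe: since neither copy can write to the buffer, neither can observe changes made through the other.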

What is the limit of the Value Type BigInteger in C#?

As described on MSDN, BigInteger is:
An immutable type that represents an arbitrarily large integer whose
value in theory has no upper or lower bounds.
As far as I can see, BigInteger is a value type, and as much as I know a value type must have a maximum size of 16 bytes.
MSDN goes further, saying:
an OutOfMemoryException can be thrown for any operation that causes a
BigInteger value to grow too large.
and more:
Although this process is transparent to the caller, it does incur a
performance penalty. In some cases, especially when repeated
operations are performed in a loop on very large BigInteger values
How could it store such big values, as big as double.MaxValue + double.MaxValue?
I was told that it has reference-type objects inside it, but all I can find here in its definition in Visual Studio is value types.
What's its real limit? And even if it doesn't have one, how can it, as a value type, manage to store all that amount of data?
As far as I can see, BigInteger is a value type, and as much as I know a value type must have a maximum size of 16 bytes.
No, that's not true. It's a conventional limit, but it's entirely feasible for a value type to take more than that. For example:
public struct Foo {
    private readonly int a, b, c, d, e; // Look ma, 20 bytes!
}
However, I strongly suspect that BigInteger actually includes a reference to a byte array:
public struct BigInteger {
    private readonly byte[] data;
    // Some other fields...
}
(Moslem Ben Dhaou's answer shows one current implementation using int and uint[], but of course the details of this are intentionally hidden.)
So the value of a BigInteger can still be small, but it can refer to a big chunk of memory - and if there isn't enough memory to allocate what's required when you perform some operation, you'll get an exception.
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
Well BigInteger is for integers, so I wouldn't particularly want to use it for anything to do with double... but fundamentally the limitations are going to be around how much memory you've got and the size of array the CLR can cope with. In reality, you'd be talking about enormous numbers before actually hitting the limit for any specific number - but if you have gazillions of smaller numbers, that obviously has large memory requirements too.
As a confirmation of Jon Skeet's answer, I looked at the source code of BigInteger. It actually contains two internal fields, as follows:
internal int _sign;
internal uint[] _bits;
_bits is used by almost all the private/public methods within the type to read/write the actual data.
_sign is used to keep the sign of the BigInteger.
The private methods make extensive use of binary operators and bit-level calculations. Here is a small list of constants used in the type that reflect its limits a bit:
private const int knMaskHighBit = -2147483648;
private const uint kuMaskHighBit = 2147483648U;
private const int kcbitUint = 32;
private const int kcbitUlong = 64;
private const int DecimalScaleFactorMask = 16711680;
private const int DecimalSignMask = -2147483648;
PS: I would have commented on J.S.'s answer, but a comment is too short. To view the source code, either download it or decompile System.Numerics.dll.
TL;DR: BigInteger's maximum value is about 2^68,685,922,272.
In .NET 4.7.2, BigInteger uses a uint array for the bits.
A uint holds 32 bits of data.
An array's maximum size is defined as internal const int MaxArrayLength = 0x7FEFFFFF;
0x7FEFFFFF = 2,146,435,071
Now, to calculate: max array size × bits per uint = 2,146,435,071 × 32 = 68,685,922,272. But that is only the number of bits a BigInteger can hold.
Which means BigInteger's maximum value is 2^68,685,922,272, which is stupendously large.
If they ever decide to increase the array's maximum size, that will also increase the maximum value of BigInteger.
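A tiny back-of-the-envelope sketch of that arithmetic (the array-length cap and the 32 bits per element are assumptions about the internal representation):
using System;

class BigIntegerLimit
{
    static void Main()
    {
        const long maxArrayLength = 0x7FEFFFFF; // 2,146,435,071 elements
        const long bitsPerUInt = 32;

        // 68,685,922,272 bits, so the largest magnitude is on the order
        // of 2^68,685,922,272.
        Console.WriteLine(maxArrayLength * bitsPerUInt);
    }
}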
I just did some quick experiments on this. The maximum seems to be around 2^65,000,000,000, but the practical limit is more like 2^2,146,435,071.
I get a System.OverflowException in the loop below at i = 0x1F. It overflowed somewhere between E FFFF FFE2 and F 7FFF FFE1 bits (that is, between 2^64,424,509,410 and 2^66,571,993,057).
// Test 1
BigInteger test = 1;
for (int i = 0x00; i < 0xFF; i++)
    test <<= 0x7FFFFFFF;
// Test 2
BigInteger.Pow((BigInteger)2, 0x7FEFFFF0); // OK - I think - never finished
BigInteger.Pow((BigInteger)2, 0x7FEFFFFF); // Immediate OutOfMemoryException
I should also note that while ~2^66,571,993,057 seems to be supported, the useful limit is more like 2^2,146,435,071, because Pow() and shifts don't work with an exponent larger than 2,146,435,071 (for Pow()) or a shift amount larger than 2,147,483,647. Larger shifts can be done, but they would take several rounds, ruining efficiency. Operations are slow at those sizes anyway: a single shift took about 7 seconds and BigInteger.Pow() took at least 5 minutes.
.NET 5, AMD Threadripper, 32 GB RAM, Windows 10 x64

Increment forever and you get -2147483648?

For a clever and complicated reason that I don't really want to explain (because it involves making a timer in an extremely ugly and hacky way), I wrote some C# code sort of like this:
int i = 0;
while (i >= 0) i++; //Should increment forever
Console.Write(i);
I expected the program to hang forever or crash or something, but, to my surprise, after waiting for about 20 seconds or so, I get this output:
-2147483648
Well, programming has taught me many things, but I still cannot grasp why continually incrementing a number causes it to eventually be negative...what's going on here?
In C#, the built-in integers are represented by a sequence of bit values of a predefined length. For the basic int datatype that length is 32 bits. Since 32 bits can only represent 4,294,967,296 different possible values (since that is 2^32), clearly your code will not loop forever with continually increasing values.
Since int can hold both positive and negative numbers, the sign of the number must be encoded somehow. This is done with the first bit: if the first bit is 1, then the number is negative.
Here are the int values laid out on a number-line in hexadecimal and decimal:
Hexadecimal Decimal
----------- -----------
0x80000000 -2147483648
0x80000001 -2147483647
0x80000002 -2147483646
... ...
0xFFFFFFFE -2
0xFFFFFFFF -1
0x00000000 0
0x00000001 1
0x00000002 2
... ...
0x7FFFFFFE 2147483646
0x7FFFFFFF 2147483647
As you can see from this chart, the bit pattern that represents the smallest possible value is what you would get by adding one to the largest possible value, while ignoring the interpretation of the sign bit. When an addition wraps around like this, it is called "integer overflow". Whether an integer overflow is allowed or treated as an error is configurable with the checked and unchecked statements in C#. The default is unchecked, which is why no error occurred and you got that crazy small number in your program.
This representation is called 2's Complement.
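A minimal sketch of the same wrap-around, contrasting the default unchecked behaviour with checked:
using System;

class OverflowDemo
{
    static void Main()
    {
        int x = int.MaxValue;        // 0x7FFFFFFF

        unchecked { x++; }           // wraps silently (the C# default)
        Console.WriteLine(x);        // -2147483648, i.e. 0x80000000

        x = int.MaxValue;
        try
        {
            checked { x++; }         // same operation, but overflow is now an error
        }
        catch (OverflowException)
        {
            Console.WriteLine("Overflow detected");
        }
    }
}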
The value is overflowing the positive range of 32-bit signed integer storage, wrapping around to 0x80000000, which is -2147483648 in decimal. Only 31 of the 32 bits hold the magnitude, which is why the overflow happens at 2^31 - 1.
It's been pointed out elsewhere that if you use an unsigned int you'll get different behaviour, as the 32nd bit isn't being used to store the sign of the number.
What you are experiencing is Integer Overflow.
In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is larger than can be represented within the available storage space. For instance, adding 1 to the largest value that can be represented constitutes an integer overflow. The most common result in these cases is for the least significant representable bits of the result to be stored (the result is said to wrap).
int is a signed integer. Once past the max value, it starts from the min value (large negative) and marches towards 0.
Try again with uint and see what is different.
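As a hint at what the uint version does, a tiny sketch (runnable as top-level statements on modern .NET): the unsigned value wraps to zero instead of going negative.
using System;

uint u = uint.MaxValue;   // 0xFFFFFFFF = 4,294,967,295
u++;                      // wraps around to 0; there is no negative range
Console.WriteLine(u);     // 0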
Try it like this:
int i = 0;
while (i >= 0)
    checked { i++; } // Should increment forever
Console.Write(i);
And explain the results
What the others have been saying. If you want something that can go on forever (and I won't remark on why you would need something of this sort), use the BigInteger type in the System.Numerics namespace (.NET 4+). You can do the comparison against an arbitrarily large number.
It has a lot to do with how positive numbers and negative numbers are really stored in memory (at bit level).
If you're interested, check this video: Programming Paradigms at 12:25 and onwards. Pretty interesting and you will understand why your code behaves the way it does.
This happens because when the variable "i" reaches the maximum int limit, the next value will be a negative one.
I hope this does not sound like smart-ass advice, because it's well meant and not meant to be snarky.
What you are asking is for us to describe what is pretty fundamental behaviour for integer data types.
There is a reason why data types are covered in the first year of any computer science course: it's really fundamental to understanding how and where things can go wrong (you can probably already see how the behaviour above, if unexpected, causes unexpected behaviour, i.e. a bug in your application).
My advice is to get hold of the reading material for first-year computer science plus Knuth's seminal work "The Art of Computer Programming", and for ~$500 you will have everything you need to become a great programmer, much cheaper than a whole uni course ;-)

Most significant bit

I haven't dealt with programming against hardware devices in a long while and have forgotten pretty much all the basics.
I have a spec of what I should send in a byte, and each bit is defined from the most significant bit (bit 7) to the least significant (bit 0). How do I build this byte? From MSB to LSB, or vice versa?
If these bits are being 'packeted' (which they usually are), then the order of bits is the native order: 0 being the LSB and 7 being the MSB. Bits are not usually sent one by one, but as bytes (usually more than one byte...).
According to Wikipedia, bit ordering can sometimes be from 7 to 0, but that is probably the rarer case.
If you're going to write the whole byte at the same time, i.e. do a parallel transfer as opposed to a serial, the order of the bits doesn't matter.
If the transfer is serial, then you must find out which order the device expects the bits in, it's impossible to tell from the outside.
To just assemble a byte from eight bits, use bitwise OR to set the bits one at a time:
byte value = 0;
value |= (byte)(1 << n); // 'n' is the index of the bit to set, with 0 as the LSB; the cast is needed because | on bytes yields an int.
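For example, here is a short sketch (assuming a using System; context) that assembles a byte for a spec that numbers bits MSB-first; the flag names are made up, not from the question:
byte frame = 0;
frame |= (byte)(1 << 7);   // bit 7 (MSB), e.g. a "start" flag
frame |= (byte)(1 << 2);   // bit 2,       e.g. an "ack" flag
frame |= (byte)(1 << 0);   // bit 0 (LSB), e.g. a "parity" flag

Console.WriteLine(Convert.ToString(frame, 2).PadLeft(8, '0')); // 10000101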
If the spec says MSB, then build it MSB. Otherwise if the spec says LSB, then build it LSB. Otherwise, ask for more information.
