How does unchecked int overflow work in C#?

We know that an int value has a maximum value of 2^31 - 1 and a minimum value of -2^31. If we were to set int to the maximum value:
int x = int.MaxValue;
And we take that x and add one to it in an unchecked block:
unchecked
{
x++;
}
Then we get x = the minimum value of int. My question is: why does this happen, and how does it happen in terms of binary?

In C#, the built-in integers are represented by a sequence of bits of a predefined length. For the basic int datatype that length is 32 bits, so it can represent only 4,294,967,296 (2^32) different values.
Since int can hold both positive and negative numbers, the sign of the number must be encoded somehow. This is done with the first (most significant) bit: if it is 1, the number is negative.
Here are the int values laid out on a number-line in hexadecimal and decimal:
Hexadecimal Decimal
----------- -----------
0x80000000 -2147483648
0x80000001 -2147483647
0x80000002 -2147483646
... ...
0xFFFFFFFE -2
0xFFFFFFFF -1
0x00000000 0
0x00000001 1
0x00000002 2
... ...
0x7FFFFFFE 2147483646
0x7FFFFFFF 2147483647
As you can see from this chart, the bits that represent the smallest possible value are what you would get by adding one to the largest possible value, while ignoring the interpretation of the sign bit. When a signed number is added in this way, it is called "integer overflow". Whether or not an integer overflow is allowed or treated as an error is configurable with the checked and unchecked statements in C#.
This representation is called two's complement. You can read up on it if you want to go deeper.
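To tie the chart back to the question, here is a minimal, self-contained sketch of both behaviors (`OverflowDemo` is just an illustrative name): the unchecked increment wraps around to int.MinValue, while the checked one throws OverflowException.

```csharp
using System;

class OverflowDemo
{
    static void Main()
    {
        int x = int.MaxValue;

        // In an unchecked context the addition wraps around silently:
        // 0x7FFFFFFF + 1 = 0x80000000, which is int.MinValue.
        unchecked
        {
            x++;
        }
        Console.WriteLine(x == int.MinValue); // True

        // In a checked context the same addition throws instead.
        try
        {
            int y = int.MaxValue;
            checked
            {
                y++;
            }
        }
        catch (OverflowException)
        {
            Console.WriteLine("checked overflow throws");
        }
    }
}
```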

Int has a maximum value of 2^31 - 1 because int is an alias for Int32, which uses 4 bytes (32 bits) of memory to represent the value.
Now let's come to the concept of int.MaxValue + 1. Here you need to know about the signed number representations used to store negative values. Binary itself has no notion of negative numbers; they are represented using one's-complement or two's-complement encodings.
Let's say my int1 type has a storage of 1 byte, i.e. 8 bits. Being signed, the maximum value you can store in int1 is 2^7 - 1 = 127, which is 01111111 in binary. Now let's add 1 to this value:
01111111
+ 00000001
----------
10000000
The output is 10000000. Read as an unsigned number that would be 128, but it is beyond the positive range of a signed 8-bit value: since the first bit is now set, it represents the negative value -128, the minimum.
This is the reason adding 1 to int.MaxValue becomes int.MinValue, i.e.
`int.MaxValue + 1 = int.MinValue`
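The same wraparound can be demonstrated with C#'s real 8-bit signed type, sbyte (a small sketch; the cast back from int must be unchecked because the addition promotes to int first):

```csharp
using System;

class SByteWrap
{
    static void Main()
    {
        sbyte max = sbyte.MaxValue;                   // 127 = 0111_1111
        sbyte wrapped = unchecked((sbyte)(max + 1));  // 1000_0000

        Console.WriteLine(wrapped);                   // -128
        Console.WriteLine(wrapped == sbyte.MinValue); // True
    }
}
```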

Okay, I'm going to assume for the sake of convenience that there's a 4-bit integer type in C#. The most significant bit is used to store the sign, the remaining three are used to store the magnitude.
The maximum number that can be stored in such a representation is +7 (positive 7 in base 10). In binary, this is:
0111
Now let's add positive one:
0111
+0001
_____
1000
Whoops, that carried over from the magnitudes and flipped the sign bit! The sum you see above is actually the number -8, the smallest possible number that can be stored in this representation. If you're not familiar with two's complement, which is how signed numbers are commonly represented in binary, you might think that number is negative zero (which would be the case in a sign and magnitude representation). You can read up on two's complement to understand this better.
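C# has no 4-bit integer type, but the hypothetical one above can be simulated by masking to 4 bits and sign-extending. This is a sketch; `ToFourBit` is an illustrative helper name, not a real API:

```csharp
using System;

class FourBit
{
    // Interpret the low 4 bits of v as a 4-bit two's-complement value.
    static int ToFourBit(int v)
    {
        v &= 0xF;                            // keep only 4 bits
        return (v & 0x8) != 0 ? v - 16 : v;  // sign-extend if the sign bit is set
    }

    static void Main()
    {
        // 0111 + 0001 = 1000, which reads as -8 in 4-bit two's complement.
        Console.WriteLine(ToFourBit(0b0111 + 0b0001)); // -8
    }
}
```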

Related

Explicit conversion from Single to Decimal results in different bit representation

If I convert Single s into Decimal d, I've noticed its bit representation differs from that of a Decimal created directly.
For example:
Single s = 0.01f;
Decimal d = 0.01m;
int[] bitsSingle = Decimal.GetBits((decimal)s);
int[] bitsDecimal = Decimal.GetBits(d);
Returns (middle elements removed for brevity):
bitsSingle:
[0] = 10
[3] = 196608
bitsDecimal:
[0] = 1
[3] = 131072
Both of these are decimal numbers, and both appear to accurately represent 0.01:
Looking at the spec sheds no light except perhaps:
§4.1.7 Contrary to the float and double data types, decimal fractional
numbers such as 0.1 can be represented exactly in the decimal
representation.
Suggesting that this is somehow affected by single not being able to accurately represent 0.01 before the conversion, therefore:
Why is this not accurate by the time the conversion is done?
Why do we seem to have two ways to represent 0.01 in the same datatype?
TL;DR
Both decimals precisely represent 0.01. It's just that the decimal format allows multiple bitwise-different values that represent the exact same number.
Explanation
It isn't about single not being able to represent 0.01 precisely. As per the documentation of GetBits:
The binary representation of a Decimal number consists of a 1-bit
sign, a 96-bit integer number, and a scaling factor used to divide the
integer number and specify what portion of it is a decimal fraction.
The scaling factor is implicitly the number 10, raised to an exponent
ranging from 0 to 28.
The return value is a four-element array of 32-bit signed integers.
The first, second, and third elements of the returned array contain
the low, middle, and high 32 bits of the 96-bit integer number.
The fourth element of the returned array contains the scale factor and
sign. It consists of the following parts:
Bits 0 to 15, the lower word, are unused and must be zero.
Bits 16 to 23 must contain an exponent between 0 and 28, which
indicates the power of 10 to divide the integer number.
Bits 24 to 30 are unused and must be zero.
Bit 31 contains the sign: 0 mean positive, and 1 means negative.
Note that the bit representation differentiates between negative and
positive zero. These values are treated as being equal in all
operations.
The fourth integer of each decimal in your example is 0x00030000 for bitsSingle and 0x00020000 for bitsDecimal. In binary this maps to:
             sign   unused    exponent    unused
              |    /-------\ /--------\ /----------------\
bitsSingle    0     0000000   00000011   0000000000000000
bitsDecimal   0     0000000   00000010   0000000000000000
NOTE: the exponent represents multiplication by a negative power of 10
Therefore, in the first case the 96-bit integer is divided by an additional factor of 10 compared to the second -- bits 16 to 23 give the value 3 instead of 2. But that is offset by the 96-bit integer itself, which in the first case is also 10 times greater than in the second (obvious from the values of the first elements).
The difference in observed values can therefore be attributed simply to the fact that the conversion from single uses subtly different logic to derive the internal representation compared to the "straight" constructor.
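A small program along the lines of the question makes this concrete: the bit patterns differ, but comparing the values (which ignores scale) shows they are equal. The exact array contents depend on the runtime's float-to-decimal conversion, so the comments below don't assert specific numbers:

```csharp
using System;

class DecimalBitsDemo
{
    static void Main()
    {
        float s = 0.01f;
        decimal d = 0.01m;

        int[] bitsSingle = decimal.GetBits((decimal)s); // via float conversion
        int[] bitsDecimal = decimal.GetBits(d);         // straight literal

        // The scale lives in bits 16-23 of the fourth element.
        Console.WriteLine("scale via float:   " + ((bitsSingle[3] >> 16) & 0xFF));
        Console.WriteLine("scale via literal: " + ((bitsDecimal[3] >> 16) & 0xFF));

        // Different representations, same value.
        Console.WriteLine((decimal)s == d); // True
    }
}
```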

Why do float and int have such different maximum values even though they're the same number of bits? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
what the difference between the float and integer data type when the size is same in java?
As you probably know, both of these types are 32 bits. int can hold only integer numbers, whereas float also supports floating point numbers (as the type names suggest).
How is it possible then that the max value of int is 2³¹, and the max value of float is 3.4×10³⁸, while both of them are 32 bits?
I think that int's max value capacity should be higher than the float because it doesn't save memory for the floating number and accepts only integer numbers. I'll be glad for an explanation in that case.
Your intuition quite rightly tells you that there can be no more information content in one than the other, because they both have 32 bits. But that doesn't mean we can't use those bits to represent different values.
Suppose I invent two new datatypes, uint4 and foo4. uint4 uses 4 bits to represent an integer, in the standard binary representation, so we have
bits value
0000 0
0001 1
0010 2
...
1111 15
But foo4 uses 4 bits to represent these values:
bits value
0000 0
0001 42
0010 -97
0011 1
...
1110 pi
1111 e
Now foo4 has a much wider range of values than uint4, despite having the same number of bits! How? Because there are some uint4 values that can't be represented by foo4, so those 'slots' in the bit mapping are available for other values.
It is the same for int and float - they can both store values from a set of 2³² values, just different sets of 2³² values.
A float might store a larger value, but it will not be precise even in the digits before the decimal point.
Consider the following example:
float a = 123456789012345678901234567890f; //30 digits
Console.WriteLine(a); // 1.234568E+29
Notice that barely any precision is kept.
An integer on the other hand will always precisely store any number within its range of values.
For the sake of comparison, let's look at a double precision floating point number:
double a = 123456789012345678901234567890d; //30 digits
Console.WriteLine(a); // 1.23456789012346E+29
Notice that roughly twice as many significant digits are preserved.
Both formats follow the IEEE 754 floating point specification, which is how this is possible; it is not just about how many bits there are.
The hint is in the "floating" part of "floating point". What you describe basically assumes fixed point. A floating point number does not "reserve space" for the digits after the decimal point; it has a limited number of significand digits (24 binary digits, of which 23 are stored) and remembers what power of two to multiply them by.
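One way to see the trade-off concretely: with 24 effective significand bits, 2^24 = 16,777,216 is the last point up to which a float can represent every integer exactly. A sketch:

```csharp
using System;

class FloatPrecisionDemo
{
    static void Main()
    {
        float f = 16_777_216f;           // 2^24, exactly representable

        // 16,777,217 is not representable as a float,
        // so adding 1 is lost to rounding.
        Console.WriteLine(f + 1f == f);  // True

        // The same integer is stored exactly in an int.
        int i = 16_777_217;
        Console.WriteLine(i);            // 16777217
    }
}
```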

Is there any practical difference between the .net decimal values 1m and 1.0000m?

Is there any practical difference between the .net decimal values 1m and 1.0000m?
The internal storage is different:
1m : 0x00000001 0x00000000 0x00000000 0x00000000
1.0000m : 0x000186a0 0x00000000 0x00000000 0x00050000
But, is there a situation where the knowledge of "significant digits" would be used by a method in the BCL?
I ask because I'm working on a means of compressing the space required for decimal values for disk storage or network transport, and am toying with the idea of "normalizing" the value before I store it to improve its compressibility. But I'd like to know if that is likely to cause issues down the line. I'm guessing it should be fine, but only because I don't see any methods or properties that expose the precision of the value. Does anyone know otherwise?
The reason for the difference in encoding is because the Decimal data type stores the number as a whole number (96 bit integer), with a scale which is used to form the divisor to get the fractional number. The value is essentially
integer / 10^scale
Internally the Decimal type is represented as 4 Int32, see the documentation of Decimal.GetBits for more detail. In summary, GetBits returns an array of 4 Int32s, where each element represents the follow portion of the Decimal encoding
Element 0,1,2 - Represent the low, middle and high 32 bits on the 96 bit integer
Element 3 - Bits 0-15 Unused
Bits 16-23 exponent which is the power of 10 to divide the integer by
Bits 24-30 Unused
Bit 31 the sign where 0 is positive and 1 is negative
So in your example, very simply put: when 1.0000m is encoded as a decimal, the actual representation is 10000 / 10^4, while 1m is represented as 1 / 10^0; mathematically the same value, just encoded differently.
If you use the native .NET operators for the decimal type and do not manipulate/compare the bit/bytes yourself you should be safe.
You will also notice that string conversions take this binary representation into consideration and produce different strings, so you need to be careful if you ever rely on the string representation.
The decimal type tracks scale because it's important in arithmetic. If you do long multiplication, by hand, of two numbers — for instance, 3.14 * 5.00 — the result has 6 digits of precision and a scale of 4.
To do the multiplication, ignore the decimal points (for now) and treat the two numbers as integers.
3.14
* 5.00
------
0000 -- 0 * 314 (0 in the one's place)
00000 -- 0 * 314 (0 in the 10's place)
157000 -- 5 * 314 (5 in the 100's place)
------
157000
That gives you the unscaled results. Now, count the total number of digits to the right of the decimal point in the expression (that would be 4) and insert the decimal point 4 places to the left:
15.7000
That result, while equivalent in value to 15.7, is more precise than the value 15.7. The value 15.7000 has 6 digits of precision and a scale of 4; 15.7 has 3 digits of precision and a scale of 1.
If one is trying to do precision arithmetic, it is important to track the precision and scale of your values and results, as they tell you something about the precision of your results. (Note that precision isn't the same as accuracy: measure something with a ruler graduated in 1/10ths of an inch, and the best you can say about the resulting measurement, no matter how many trailing zeros you put to the right of the decimal point, is that it is accurate to, at best, 1/10th of an inch. Another way of putting it would be to say that your measurement is accurate, at best, within +/- 5/100ths of the stated value.)
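The worked multiplication above can be checked directly: .NET's decimal multiplication carries the combined scale through (when it fits), so 3.14m * 5.00m keeps four fractional digits:

```csharp
using System;
using System.Globalization;

class DecimalScaleDemo
{
    static void Main()
    {
        decimal product = 3.14m * 5.00m;

        // Scale 2 + scale 2 gives scale 4 in the result.
        Console.WriteLine(product.ToString(CultureInfo.InvariantCulture)); // 15.7000

        // Equal in value to the shorter form, despite the different scale.
        Console.WriteLine(product == 15.7m); // True
    }
}
```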
The only reason I can think of is so that invoking `ToString()` returns the exact textual representation in the source code.
Console.WriteLine(1m); // 1
Console.WriteLine(1.000m); // 1.000
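Putting that together as a runnable sketch: the two values compare equal, while their string forms and internal scales differ.

```csharp
using System;
using System.Globalization;

class TrailingZerosDemo
{
    static void Main()
    {
        decimal a = 1m;
        decimal b = 1.0000m;

        Console.WriteLine(a == b);                               // True: equal as numbers
        Console.WriteLine(a.ToString(CultureInfo.InvariantCulture)); // 1
        Console.WriteLine(b.ToString(CultureInfo.InvariantCulture)); // 1.0000

        // The scale (bits 16-23 of the fourth GetBits element) differs.
        Console.WriteLine((decimal.GetBits(a)[3] >> 16) & 0xFF);
        Console.WriteLine((decimal.GetBits(b)[3] >> 16) & 0xFF);
    }
}
```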

If byte is 8 bit integer then how can we set it to 255?

The byte keyword denotes an integral
type that stores values as indicated
in the following table. It's an Unsigned 8-bit integer.
If it's only 8 bits then how can we assign it to equal 255?
byte myByte = 255;
I thought 8 bits was the same thing as just one character?
There are 256 different configurations of bits in a byte:
0000 0000
0000 0001
0000 0010
...
1111 1111
So you can assign a byte a value in the 0-255 range.
Characters are described (in a basic sense) by a numeric representation that fits inside an 8 bit structure. If you look at the ASCII Codes for ascii characters, you'll see that they're related to numbers.
The maximum integer a bit sequence can represent is given by the formula 2^n - 1 (as partially described above by @Marc Gravell). So an 8-bit structure can hold 256 values including 0 (note, too, that an IPv4 address is 4 separate 8-bit structures). If this were a signed integer, the first bit would be a flag for the sign and the remaining 7 would indicate the magnitude; it would still hold 256 values, but the maximum would be determined by the 7 trailing bits (so 2^7 - 1 = 127).
When you get into Unicode characters and "high ASCII" characters, the encoding requires more than an 8-bit structure. So, in your example, if you were to assign a byte a value of 76, a lookup table could be consulted to derive the ASCII character 'L'.
11111111 (8 on bits) is 255: 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
Perhaps you're confusing this with 256, which is 2^8?
8 bits (unsigned) is 0 thru 255, or (2^8)-1.
It sounds like you are confusing integer vs text representations of data.
i thought 8 bits was the same thing as
just one character?
I think you're confusing the number 255 with the string "255."
Think about it this way: if computers stored numbers internally using characters, how would it store those characters? Using bits, right?
So in this hypothetical scenario, a computer would use bits to represent characters which it then in turn used to represent numbers. Aside from being horrendous from an efficiency standpoint, this is just redundant. Bits can represent numbers directly.
255 = 2^8 − 1 = FF[hex] = 11111111[bin]
range of values for unsigned 8 bits is 0 to 255. so this is perfectly valid
8 bits is not the same as one character in C#. In C#, a character is 16 bits. And even if a character were 8 bits, that would have no relevance to the main question.
I think you're confusing character encoding with the actual integral value stored in the variable.
An 8-bit value can have 256 configurations, as answered by Arkain.
Optionally, in ASCII, each of those configurations represents a different ASCII character.
So, basically, it depends how you interpret the value: as an integer value or as a character.
ASCII Table
Wikipedia on ASCII
Sure, a bit late to answer, but for those who get this in a google search, here we go...
Like others have said, a character is definitely different to an integer. Whether it's 8-bits or not is irrelevant, but I can help by simply stating how each one works:
for an 8-bit integer, a value range between 0 and 255 is possible (or -128..127 if it's signed; in that case, the first bit decides the polarity)
for an 8-bit character, it will most likely be an ASCII character, which is usually referenced by an index specified with a hexadecimal value, e.g. FF or 0A. Because computers back in the day were only 8-bit, the result was a 16×16 table, i.e. 256 possible characters in the extended ASCII character set.
Either way, if the byte is 8 bits long, then either an ASCII code or an 8-bit integer will fit in the variable's data. I would recommend using a more dedicated data type, though, for simplicity (e.g. char for ASCII or raw data, int for general-purpose integers, which are usually 32-bit).
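A short sketch tying these answers together: a byte holds 0..255, while a C# char is a distinct 16-bit type whose numeric value maps to a character:

```csharp
using System;

class ByteVsCharDemo
{
    static void Main()
    {
        byte b = 255;                 // fits: 8 unsigned bits cover 0..255
        Console.WriteLine(b);         // 255

        char c = (char)76;            // 76 is the ASCII/Unicode code for 'L'
        Console.WriteLine(c);         // L

        Console.WriteLine(sizeof(byte) * 8); // 8
        Console.WriteLine(sizeof(char) * 8); // 16
    }
}
```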

Why has the Int32 type a maximum value of 2³¹ − 1?

I know Int32 has a length of 32 bits (4 bytes). I assume it has 2³² values but as half of them must be under zero, I guess it has something to do with this.
I would like to know why exactly Int32 has maximum positive number 2³¹ − 1.
This most significant bit is used to code the sign (1 meaning negative), so only 31 bits are available for the actual value.
Int32.MaxValue = 2^31 - 1 = 01111111111111111111111111111111
1 = 00000000000000000000000000000001
0 = 00000000000000000000000000000000
-1 = 11111111111111111111111111111111
Int32.MinValue = -2^31 = 10000000000000000000000000000000
2³² possible values
− 2³¹ values used for negative integers
− 1 value used for zero
= 2³¹ − 1 values available for positive integers
2³² is about 4.2 billion. This is the maximum number of VALUES that a binary number with 32 digits (a 32-bit number) can represent.
Those values can be any values in any range. In an UNSIGNED 32-bit number, the valid values are from 0 to 2³² − 1 (instead of 1 to 2³², but the same number of VALUES, about 4.2 billion).
In a SIGNED 32-bit number, one of the 32 bits is used to indicate whether the number is negative or not. This reduces the number of values by a factor of 2¹, or by half. This leaves 2³¹, which is about 2.1 billion. This means the range is now about −2.1 billion to 2.1 billion. Same number of values, different range.
You have 2^31 values below zero (minimum value = -2^31), 2^31-1 values above zero and zero itself. That makes 2^31 + 2^31-1 + 1 = 2*2^31 = 2^32 values :) ...
The other explanation involves the way how negative numbers are represented (using the two-complement): Shortly, the most-significant bit indicates a negative number, so you have 2^31 positive numbers (including zero) left, which gives us the range 0..2^31-1
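The bit patterns behind these ranges can be printed with Convert.ToString(value, 2), which emits the two's-complement bits for negative numbers:

```csharp
using System;

class Int32BitsDemo
{
    // Pad to the full 32-bit width so the sign bit position is visible.
    static string Bits(int v) => Convert.ToString(v, 2).PadLeft(32, '0');

    static void Main()
    {
        Console.WriteLine(Bits(int.MaxValue)); // 01111111111111111111111111111111
        Console.WriteLine(Bits(int.MinValue)); // 10000000000000000000000000000000
        Console.WriteLine(Bits(-1));           // 11111111111111111111111111111111
    }
}
```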
