Related
This is most probably the dumbest question anyone would ask, but regardless I hope I will find a clear answer for this.
My question is - How is an integer stored in computer memory?
In c# an integer is of size 32 bit. MSDN says we can store numbers from -2,147,483,648 to 2,147,483,647 inside an integer variable.
As per my understanding a bit can store only 2 values i.e 0 & 1. If I can store only 0 or 1 in a bit, how will I be able to store numbers 2 to 9 inside a bit?
More precisely, say I have this code int x = 5; How will this be represented in memory or in other words how is 5 converted into 0's and 1's, and what is the convention behind it?
It's represented in binary (base 2). Read more about number bases. In base 2 you only need 2 different symbols to represent a number. We usually use the symbols 0 and 1. In our usual base we use 10 different symbols to represent all the numbers, 0, 1, 2, ... 8, and 9.
For comparison, think about a number that doesn't fit in our usual system. Like 14. We don't have a symbol for 14, so how to we represent it? Easy, we just combine two of our symbols 1 and 4. 14 in base 10 means 1*10^1 + 4*10^0.
1110 in base 2 (binary) means 1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 = 8 + 4 + 2 + 0 = 14. So despite not having enough symbols in either base to represent 14 with a single symbol, we can still represent it in both bases.
In another commonly used base, base 16, which is also known as hexadecimal, we have enough symbols to represent 14 using only one of them. You'll usually see 14 written using the symbol e in hexadecimal.
For negative integers we use a convenient representation called twos-complement which is the complement (all 1s flipped to 0 and all 0s flipped to 1s) with one added to it.
There are two main reasons this is so convenient:
We know immediately if a number is positive of negative by looking at a single bit, the most significant bit out of the 32 we use.
It's mathematically correct in that x - y = x + -y using regular addition the same way you learnt in grade school. This means that processors don't need to do anything special to implement subtraction if they already have addition. They can simply find the twos-complement of y (recall, flip the bits and add one) and then add x and y using the addition circuit they already have, rather than having a special circuit for subtraction.
This is not a dumb question at all.
Let's start with uint because it's slightly easier. The convention is:
You have 32 bits in a uint. Each bit is assigned a number ranging from 0 to 31. By convention the rightmost bit is 0 and the leftmost bit is 31.
Take each bit number and raise 2 to that power, and then multiply it by the value of the bit. So if bit number three is one, that's 1 x 23. If bit number twelve is zero, that's 0 x 212.
Add up all those numbers. That's the value.
So five would be 00000000000000000000000000000101, because 5 = 1 x 20 + 0 x 21 + 1 x 22 + ... the rest are all zero.
That's a uint. The convention for ints is:
Compute the value as a uint.
If the value is greater than or equal to 0 and strictly less than 231 then you're done. The int and uint values are the same.
Otherwise, subtract 232 from the uint value and that's the int value.
This might seem like an odd convention. We use it because it turns out that it is easy to build chips that perform arithmetic in this format extremely quickly.
Binary works as follows (as your 32 bits).
1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1 | 1 1 1 1
2^ 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16......................................0
x
x = sign bit (if 1 then negative number if 0 then positive)
So the highest number is 0111111111............1 (all ones except the negative bit), which is 2^30 + 2 ^29 + 2^28 +........+2^1 + 2^0 or 2,147,483,647.
The lowest is 1000000.........0, meaning -2^31 or -2147483648.
Is this what high level languages lead to!? Eeek!
As other people have said it's a base 2 counting system. Humans are naturally base 10 counters mostly, though time for some reason is base 60, and 6 x 9 = 42 in base 13. Alan Turing was apparently adept at base 17 mental arithmetic.
Computers operate in base 2 because it's easy for the electronics to be either on or off - representing 1 and 0 which is all you need for base 2. You could build the electronics in such a way that it was on, off or somewhere in between. That'd be 3 states, allowing you to do tertiary math (as opposed to binary math). However the reliability is reduced because it's harder to tell the difference between those three states, and the electronics is much more complicated. Even more levels leads to worse reliability.
Despite that it is done in multi level cell flash memory. In these each memory cell represents on, off and a number of intermediate values. This improves the capacity (each cell can store several bits), but it is bad news for reliability. This sort of chip is used in solid state drives, and these operate on the very edge of total unreliability in order to maximise capacity.
I'm writing up a QuadTree class and I need to include neighbor searches. I'm following along with this paper which includes a C implementation. (I'm basically converting it to C#.) And I'm having trouble because I'm not very good at working with binary numbers. I need to do the following things:
given #100 and #010, combine them to get #110. 1s will never overlap. right now what I do is store 100 and 10, when I add them it gives me 110.
perform left and right binary shifts on numbers like #011. right now I was storing 011 and applying the >> or << operator to it. This gives me completely the wrong number but strangely the logic I used it in to find common quad tree parents was working just fine.
given #011 and #001 sum them to get #100. Right now I'm storing 11 and 1, this obviously doesn't work at all
.
Perform bitwise subtraction, same as above but - instead of + :)
perform ANDs, ORs and XORs.
So, right now I'm storing everything as an integer that looks like the binary value, and the math seems to be working for the ANDs, ORs, XORs and shifts. The numbers are crazy in the middle but I get the right answer in the end.
I'm trying an implementation right now where I store the integer value that is the value of the binary number but I have to convert back and forth a lot, this is particularly a pain because I'm converting between an integer representation of the Value of the number and a string representation of the binary number itself. Creating a new string every second line of code is going to be a performance nightmare.
Can you guys recommend a basic strategy for working with binary numbers in C#?
P.S. I don't ever need to see the integer representation of these numbers. I really don't care that #010 == 2, the work I need to do sees it as #010 end of story.
Hopefully this is more clear than my original post!
I hope this will clear up some confusion:
The literal 0x00000001 represents 1 using hexadecimal notation.
The literal 1 represents 1 using decimal notation.
Here are some pairs of equivalent literals:
0x9 == 9
0xA == 10
0xF == 15
0x10 == 16
0x11 == 17
0x1F == 31
The primitive integral data types (byte, sbyte, ushort, short, char, uint, int, ulong, long) are are always stored in the computer's memory in binary format. Here are the above values notated with five bits:
1 == #00001
9 == #01001
10 == #01010
15 == #01111
16 == #10000
17 == #10001
31 == #11111
Now, consider:
unsigned int xLeftLocCode = xLocCode - 0x00000001;
Either of the following is equivalent in C#:
uint xLeftLocCode = xLocCode - 1;
uint xLeftLocCode = xLocCode - 0x00000001;
To assign the binary #10 to a uint variable, do this:
uint binaryOneZero = 2; //or 0x2
Hexadecimal notation is useful for bit manipulation because there is a one-to-one correspondence between hexadecimal and binary digits. More accurately, there is a one-to-four correspondence, as each of the 16 hexadecimal digits represents one of the 16 possible states of four bits:
0 == 0 == 0000
1 == 1 == 0001
2 == 2 == 0010
3 == 3 == 0011
4 == 4 == 0100
5 == 5 == 0101
6 == 6 == 0110
7 == 7 == 0111
8 == 8 == 1000
9 == 9 == 1001
A == 10 == 1010
B == 11 == 1011
C == 12 == 1100
D == 13 == 1101
E == 14 == 1110
F == 15 == 1111
It's therefore fairly easy to know that 0x700FF, for example, has all the bits of the least significant byte set, none of the next most significant byte, and the lower three bits of the next most significant byte. It's much harder to see that information at a glance if you use the literal 459007 to represent that value.
In conclusion:
Can you guys recommend a basic strategy for working with binary numbers in C#?
Learn the table above, and use hexadecimal literals.
why that line of interest does bitwise subtraction in C but integer subtraction in C#.
They should both do the same. Why do you think otherwise? And what do you mean with bitwise subtraction?
I'm storing the binary number 10 as the integer 10, not as the integer 2. I could be doing this all wrong.
Yes, That's wrong. #010 == 2, not 10.
I guess what I looking for is some basic strategy for working with binary numbers in C#.
By and large, those operations should be the same as in C.
The line you marked has nothing to do with bitwise anything. It's a straight subtraction of the number 1 (expressed as a hex double word (DWORD), or unsigned long integer (uint in C#).
Please i'm in a bit of fix. I have a number, say 9 and i would like to figure out how to write a program to calculate if an unknown number is maybe 21(ie 9+12) or 30(i.e 9+12) and so on
Use the % operator and check if the result is 0 (multiple) or not zero (not multiple)
http://msdn.microsoft.com/en-us/library/0w4e0fzs%28v=vs.80%29.aspx
The answer from Aleadam is correct but the link appears to have broken. Putting a little more context
The % Operator will find the mathematical remainder so if the numbers are multiples the remainder will be 0
% Operator is the following format
dividend % divisor
For example
SELECT 9%3 will return "0" because 9 is a multiple of 3
SELECT 7%3 will return "1" because 3 does not go into 7 and instead has a mathematical remainder of 1/3
SELECT 7 / 3 AS Integer, 7 % 3 AS Remainder;
I recently had a test question:
byte a,b,c;
a = 190;
b = 4;
c = (byte)(a & b);
What is the value of c?
I have never used a logical operand in this manner, what's going on here? Stepping through this, the answer is 4, but why?
Also, where would this come up in the real world? I would argue that using logical operands in this manner, with a cast, is just bad practice, but I could be wrong.
You are doing a bitwise AND in this case, not a logical AND, it is combining the bits of the two values of a & b and giving you a result that has only the bits set that are both set in a & b, in this case, just the 4s place bit.
190 = 10111110
& 4 = 00000100
-------------------
= 4 00000100
Edit: Interestingly, msdn itself makes the issue of whether to call it logical vs bitwise a bit muddy. On their description of the logical operators (& && | || etc) they say logical operators (bitwise and bool) but then on the description of & itself it indicates it performs a bitwise AND for integers and a logical AND for bools. It appears it is still considered a logical operator, but the action between integer types is a bitwise AND.
http://msdn.microsoft.com/en-us/library/sbf85k1c(v=vs.71).aspx
The logical AND operator, when applied to integers performs a bitwise AND operation. The result is 1 in each position in which a 1 appears in both of the operands.
0011
& 0101
------
0001
The decimal value 190 is equivalent to binary 10111110. Decimal 4 is binary 00000100.
Do a logical AND operation on the bits like this:
10111110
& 00000100
----------
00000100
So the result is 4.
Also, where would this come up in the real world? I would argue that using logical operands in this manner, with a cast, is just bad practice, but I could be wrong.
These operations are useful in several circumstances. The most common is when using Enum values as flags.
[Flags]
public enum MyFileOptions
{
None = 0,
Read = 1, // 2^0
Write = 2, // 2^1
Append = 4, // 2^2
}
If an Enum has values that are powers of two, then they can be combined into a single integer variable (with the Logical OR operator).
MyFileOptions openReadWrite = MyFileOptions.Read | MyFileOptions.Write;
In this variable, both bits are set, so it indicates that both the Read and Write options are selected.
The logical AND operator can be used to test values.
bool openForWriting = ((openReadWrite & MyFileOptions.Write) == MyFileOptions.Write);
NOTE
A lot of people are pointing out that this is actually a bitwise AND not a logical AND. I looked it up in the spec before I posted, and I was suprised to learn that both versions are referred to as "Logical AND" in the spec. This makes sense because it is performing the logical AND operation on each bit. So you are actually correct in the title of the question.
This is a bitwise AND, meaning the bits on both bytes are compared, and a 1 is returned if both bits are 1.
10111110 &
00000100
--------
00000100
One of the uses of a logical & on bytes is in networking and is called the Binary And Test. Basically, you logical and bytes by converting them to binary and doing a logical and on every bit.
In binary
4 == 100
190 == 10111110
& is and AND operation on the boolean operators, so it does de Binary and on 4 and 190 in byte format, so 10111110 AND 100 gives you 100, so the result is 4.
This is a bitwise AND, not a logical AND. Logical AND is a test of a condition ie:
if(a && b) doSomething();
A bitwise AND looks at the binary value of the two variables and combines them. Eg:
10101010 &
00001000
---------
00001000
This lines up the binary of the two numbers, like this:
10111110
00000100
--------
00000100
On each column, it checks to see if both numbers are 1. If it is, it will return 1 on the bottom. Otherwise, it will return 0 on the bottom.
I would argue that using logical operands in this manner, with a cast, is just bad practice, but I could be wrong.
There is no cast. & is defined on integers, and intended to be used this way. And just so everyone knows, this technically is a logical operator according to MSDN, which I find kind of crazy.
Hey, I'm self-learning about bitwise, and I saw somewhere in the internet that arithmetic shift (>>) by one halfs a number. I wanted to test it:
44 >> 1 returns 22, ok
22 >> 1 returns 11, ok
11 >> 1 returns 5, and not 5.5, why?
Another Example:
255 >> 1 returns 127
127 >> 1 returns 63 and not 63.5, why?
Thanks.
The bit shift operator doesn't actually divide by 2. Instead, it moves the bits of the number to the right by the number of positions given on the right hand side. For example:
00101100 = 44
00010110 = 44 >> 1 = 22
Notice how the bits in the second line are the same as the line above, merely
shifted one place to the right. Now look at the second example:
00001011 = 11
00000101 = 11 >> 1 = 5
This is exactly the same operation as before. However, the result of 5 is due to the fact that the last bit is shifted to the right and disappears, creating the result 5. Because of this behavior, the right-shift operator will generally be equivalent to dividing by two and then throwing away any remainder or decimal portion.
11 in binary is 1011
11 >> 1
means you shift your binary representation to the right by one step.
1011 >> 1 = 101
Then you have 101 in binary which is 1*1 + 0*2 + 1*4 = 5.
If you had done 11 >> 2 you would have as a result 10 in binary i.e. 2 (1*2 + 0*1).
Shifting by 1 to the right transforms sum(A_i*2^i) [i=0..n] in sum(A_(i+1)*2^i) [i=0..n-1]
that's why if your number is even (i.e. A_0 = 0) it is divided by two. (sorry for the customised LateX syntax... :))
Binary has no concept of decimal numbers. It's returning the truncated (int) value.
11 = 1011 in binary. Shift to the right and you have 101, which is 5 in decimal.
Bit shifting is the same as division or multiplication by 2^n. In integer arithmetics the result gets rounded towards zero to an integer. In floating-point arithmetics bit shifting is not permitted.
Internally bit shifting, well, shifts bits, and the rounding simply means bits that fall off an edge simply getting removed (not that it would actually calculate the precise value and then round it). The new bits that appear on the opposite edge are always zeroes for the right hand side and for positive values. For negative values, one bits are appended on the left hand side, so that the value stays negative (see how two's complement works) and the arithmetic definition that I used still holds true.
In most statically-typed languages, the return type of the operation is e.g. "int". This precludes a fractional result, much like integer division.
(There are better answers about what's 'under the hood', but you don't need to understand those to grok the basics of the type system.)