C# So how does this hexadecimal stuff work? - c#

I'm doing some entry level programming challenges at codefights.com and I came across the following question. The link is to a blog that has the answer, but it includes the question in it as well. If only it had an explanation...
https://codefightssolver.wordpress.com/2016/10/19/swap-adjacent-bits/
My concern is with the line of code (it is the only line of code) below.
return (((n & 0x2AAAAAAA) >> 1) | ((n & 0x15555555) << 1)) ;
Specifically, I'm struggling to find some decent info on how the "0x2AAAAAAA" and "0x15555555" work, so I have a few dumb questions. I know they represent binary values of 10101010... and 01010101... respectively.
1. I've messed around some and found out that the number of 5s and As corresponds loosely and as far as I can tell to bit size, but how?
2. Why As? Why 5s?
3. Why the 2 and the 1 before the As and 5s?
4. Anything else I should know about this? Does anyone know a cool blog post or website that explains some of this in more detail?

0x2AAAAAAA is 00101010101010101010101010101010 in 32 bits binary,
0x15555555 is ‭00010101010101010101010101010101‬ in 32 bits binary.
Note that the problem specifies Constraints: 0 ≤ n < 2^30. For this reason the highest two bits can be 00.
The two hex numbers have been "built" starting from their binary representation, that has a particular property (that we will see in the next paragraph).
Now... We can say that, given the constraint, x & 0x2AAAAAAA will return the even bits of x (if we count the bits as first, second, third... the second bit is even), while x & 0x15555555 will return the odd bits of x. By using << 1 and >> 1 you move them of one step. By using | (or) you re-merge them.

0x2AAAAAAA is used to get 30 bits, which is the constraint.
Constraints:
0 ≤ n < 2^30.
0x15555555 also represent 30 bits with bits opposite of other number.
I would start with binary number (101010101010101010101010101010) in the calculator and select hex using programmer calculator to show the number in hex.
you can also use 0b101010101010101010101010101010 too, if you like, depending on language.

Related

Increment forever and you get -2147483648?

For a clever and complicated reason that I don't really want to explain (because it involves making a timer in an extremely ugly and hacky way), I wrote some C# code sort of like this:
int i = 0;
while (i >= 0) i++; //Should increment forever
Console.Write(i);
I expected the program to hang forever or crash or something, but, to my surprise, after waiting for about 20 seconds or so, I get this ouput:
-2147483648
Well, programming has taught me many things, but I still cannot grasp why continually incrementing a number causes it to eventually be negative...what's going on here?
In C#, the built-in integers are represented by a sequence of bit values of a predefined length. For the basic int datatype that length is 32 bits. Since 32 bits can only represent 4,294,967,296 different possible values (since that is 2^32), clearly your code will not loop forever with continually increasing values.
Since int can hold both positive and negative numbers, the sign of the number must be encoded somehow. This is done with first bit. If the first bit is 1, then the number is negative.
Here are the int values laid out on a number-line in hexadecimal and decimal:
Hexadecimal Decimal
----------- -----------
0x80000000 -2147483648
0x80000001 -2147483647
0x80000002 -2147483646
... ...
0xFFFFFFFE -2
0xFFFFFFFF -1
0x00000000 0
0x00000001 1
0x00000002 2
... ...
0x7FFFFFFE 2147483646
0x7FFFFFFF 2147483647
As you can see from this chart, the bits that represent the smallest possible value are what you would get by adding one to the largest possible value, while ignoring the interpretation of the sign bit. When a signed number is added in this way, it is called "integer overflow". Whether or not an integer overflow is allowed or treated as an error is configurable with the checked and unchecked statements in C#. The default is unchecked, which is why no error occured, but you got that crazy small number in your program.
This representation is called 2's Complement.
The value is overflowing the positive range of 32 bit integer storage going to 0xFFFFFFFF which is -2147483648 in decimal. This means you overflow at 31 bit integers.
It's been pointed out else where that if you use an unsigned int you'll get different behaviour as the 32nd bit isn't being used to store the sign of of the number.
What you are experiencing is Integer Overflow.
In computer programming, an integer overflow occurs when an arithmetic operation attempts to create a numeric value that is larger than can be represented within the available storage space. For instance, adding 1 to the largest value that can be represented constitutes an integer overflow. The most common result in these cases is for the least significant representable bits of the result to be stored (the result is said to wrap).
int is a signed integer. Once past the max value, it starts from the min value (large negative) and marches towards 0.
Try again with uint and see what is different.
Try it like this:
int i = 0;
while (i >= 0)
checked{ i++; } //Should increment forever
Console.Write(i);
And explain the results
What the others have been saying. If you want something that can go on forever (and I wont remark on why you would need something of this sort), use the BigInteger class in the System.Numerics namespace (.NET 4+). You can do the comparison to an arbitrarily large number.
It has a lot to do with how positive numbers and negative numbers are really stored in memory (at bit level).
If you're interested, check this video: Programming Paradigms at 12:25 and onwards. Pretty interesting and you will understand why your code behaves the way it does.
This happens because when the variable "i" reaches the maximum int limit, the next value will be a negative one.
I hope this does not sound like smart-ass advice, because its well meant, and not meant to be snarky.
What you are asking is for us to describe that which is pretty fundamental behaviour for integer datatypes.
There is a reason why datatypes are covered in the 1st year of any computer science course, its really very fundamental to understanding how and where things can go wrong (you can probably already see how the behaviour above if unexpected causes unexpected behaviour i.e. a bug in your application).
My advice is get hold of the reading material for 1st year computer science + Knuth's seminal work "The art of computer pragramming" and for ~ $500 you will have everything you need to become a great programmer, much cheaper than a whole Uni course ;-)

When is a shift operator >> or << useful? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
When to use Shift operators << >> in C# ?
I've in programming a while and I've never used the shift operator. I could see how it could be helpful for calculating hash codes like in Tuple<T>, but other than that,
When and how is the shift operator useful in C#/.NET?
In general it's not used very often. But it's very useful when dealing with bit level operations. For example printing out the bits in a numeric value
public static string GetBits(int value) {
var builder = new StringBuilder();
for (int i = 0; i < 32; i++) {
var test = 1 << (31 - i);
var isSet = 0 != (test & value);
builder.Append(isSet ? '1' : '0');
}
return builder.ToString();
}
It's useful to write powers of two.
Quick: What's 227?
Answer: 1 << 27
Writing 1 << 27 is both easier and more understandable than 134217728.
I use it rather a lot in dealing with hardware. This isn't something you probably do in C# a lot, but the operator was inherited from C/C++ where it is a fairly common need.
Example 1:
I just got a longword from a little-endian machine, but I'm big endian. How do I convert it? Well, the obvious is call htonl() (you cheater). One of the manual ways to do it is with something like the following:
((source & 0x000000ff) << 24 ) |
((source & 0x0000ff00) << 8) |
((source & 0x00ff0000) >> 8) |
((source & 0xff000000) >> 24);
Example 2:
I have a DMA device that only allows longword accesses of up to 512K. So it requires me to put (for reasons only understood by hardware folks) the modulo 4 of the transfer size into the high order 18 bits of a DMA transfer control register. For the sake of arguement, the low-order bits are to be filled with various flags controlling the DMA operation. That would be accomplished like so:
dma_flags | ((length & 0xffffc) << 14);
These might not be the kind of things you do every day. But for those of us that regularly interface to hardware they are.
If you ever need to multiply without using * How to implement multiplication without using multiplication operator in .NET :)
Or write a Sudoku solver Sudoku validity check algorithm - how does this code works?
In practice, the only time I've seen it used in my (limited) experience was as an (arguably) confusing way to multiply (see first link) or in conjunction with setting BitFlags (the Sudoku solver above).
In .NET I rarely have to work at the bit level; but if you need to, being able to shift is important.
Bitwise operators are good for saving space, but nowadays, space is hardly an issue.
It's useful when multiplying by powers of 2
number<<power;
is number*2^power
And of course division by powers of 2:
number>>power;
Another place is flags in enums.
when you come across code like
Regex re = new Regex(".",RegexOptions.Multiline|RegexOptions.Singleline);
the ability to use RegexOptions.Multiline|RegexOptions.Singleline i.e multiple flags is enabled through the shifting and also this allows them to be unique.
Something like:
enum RegexOptions {
Multiline = (1 << 0),
Singleline = (1<<1)
};
Bit shifts are used when manipulating individual bits is desired. You'll see a lot of bit shifts in many encryption algorithms, for example.
In optimization, it can used in place of multiplication/division. A shift left is equal to multiplying by two. Shift right equals division. You probably don't see this done anymore, since this level of optimization is often unnecessary.
Other than that, I can't think of many reasons to use it. I've seen it used before, but rarely in cases where it was really required and usually a more readable method could have been used.
Whenever you need to multiply by 2 ;)
Really the only use I have is for interoperability code and bitfields:
http://www.codeproject.com/KB/cs/masksandflags.aspx

Get number of digits in an unsigned long integer c#

I'm trying to determine the number of digits in a c# ulong number, i'm trying to do so using some math logic rather than using ToString().Length. I have not benchmarked the 2 approaches but have seen other posts about using System.Math.Floor(System.Math.Log10(number)) + 1 to determine the number of digits.
Seems to work fine until i transition from 999999999999997 to 999999999999998 at which point, it i start getting an incorrect count.
Has anyone encountered this issue before ?
I have seen similar posts with a Java emphasis # Why log(1000)/log(10) isn't the same as log10(1000)? and also a post # How to get the separate digits of an int number? which indicates how i could possibly achieve the same using the % operator but with a lot more code
Here is the code i used to simulate this
Action<ulong> displayInfo = number =>
Console.WriteLine("{0,-20} {1,-20} {2,-20} {3,-20} {4,-20}",
number,
number.ToString().Length,
System.Math.Log10(number),
System.Math.Floor(System.Math.Log10(number)),
System.Math.Floor(System.Math.Log10(number)) + 1);
Array.ForEach(new ulong[] {
9U,
99U,
999U,
9999U,
99999U,
999999U,
9999999U,
99999999U,
999999999U,
9999999999U,
99999999999U,
999999999999U,
9999999999999U,
99999999999999U,
999999999999999U,
9999999999999999U,
99999999999999999U,
999999999999999999U,
9999999999999999999U}, displayInfo);
Array.ForEach(new ulong[] {
1U,
19U,
199U,
1999U,
19999U,
199999U,
1999999U,
19999999U,
199999999U,
1999999999U,
19999999999U,
199999999999U,
1999999999999U,
19999999999999U,
199999999999999U,
1999999999999999U,
19999999999999999U,
199999999999999999U,
1999999999999999999U
}, displayInfo);
Thanks in advance
Pat
log10 is going to involve floating point conversion - hence the rounding error. The error is pretty small for a double, but is a big deal for an exact integer!
Excluding the .ToString() method and a floating point method, then yes I think you are going to have to use an iterative method but I would use an integer divide rather than a modulo.
Integer divide by 10. Is the result>0? If so iterate around. If not, stop.
The number of digits is the number of iterations required.
Eg. 5 -> 0; 1 iteration = 1 digit.
1234 -> 123 -> 12 -> 1 -> 0; 4 iterations = 4 digits.
I would use ToString().Length unless you know this is going to be called millions of times.
"premature optimization is the root of all evil" - Donald Knuth
From the documentation:
By default, a Double value contains 15
decimal digits of precision, although
a maximum of 17 digits is maintained
internally.
I suspect that you're running into precision limits. Your value of 999,999,999,999,998 probably is at the limit of precision. And since the ulong has to be converted to double before calling Math.Log10, you see this error.
Other answers have posted why this happens.
Here is an example of a fairly quick way to determine the "length" of an integer (some cases excluded). This by itself is not very interesting -- but I include it here because using this method in conjunction with Log10 can get the accuracy "perfect" for the entire range of an unsigned long without requiring a second log invocation.
// the lookup would only be generated once
// and could be a hard-coded array literal
ulong[] lookup = Enumerable.Range(0, 20)
.Select((n) => (ulong)Math.Pow(10, n)).ToArray();
ulong x = 999;
int i = 0;
for (; i < lookup.Length; i++) {
if (lookup[i] > x) {
break;
}
}
// i is length of x "in a base-10 string"
// does not work with "0" or negative numbers
This lookup-table approach can be easily converted to any base. This method should be faster than the iterative divide-by-base approach but profiling is left as an exercise to the reader. (A direct if-then branch broken into "groups" is likely quicker yet, but that's way too much repetitive typing for my tastes.)
Happy coding.

Why arithmetic shift halfs a number only in SOME incidents?

Hey, I'm self-learning about bitwise, and I saw somewhere in the internet that arithmetic shift (>>) by one halfs a number. I wanted to test it:
44 >> 1 returns 22, ok
22 >> 1 returns 11, ok
11 >> 1 returns 5, and not 5.5, why?
Another Example:
255 >> 1 returns 127
127 >> 1 returns 63 and not 63.5, why?
Thanks.
The bit shift operator doesn't actually divide by 2. Instead, it moves the bits of the number to the right by the number of positions given on the right hand side. For example:
00101100 = 44
00010110 = 44 >> 1 = 22
Notice how the bits in the second line are the same as the line above, merely
shifted one place to the right. Now look at the second example:
00001011 = 11
00000101 = 11 >> 1 = 5
This is exactly the same operation as before. However, the result of 5 is due to the fact that the last bit is shifted to the right and disappears, creating the result 5. Because of this behavior, the right-shift operator will generally be equivalent to dividing by two and then throwing away any remainder or decimal portion.
11 in binary is 1011
11 >> 1
means you shift your binary representation to the right by one step.
1011 >> 1 = 101
Then you have 101 in binary which is 1*1 + 0*2 + 1*4 = 5.
If you had done 11 >> 2 you would have as a result 10 in binary i.e. 2 (1*2 + 0*1).
Shifting by 1 to the right transforms sum(A_i*2^i) [i=0..n] in sum(A_(i+1)*2^i) [i=0..n-1]
that's why if your number is even (i.e. A_0 = 0) it is divided by two. (sorry for the customised LateX syntax... :))
Binary has no concept of decimal numbers. It's returning the truncated (int) value.
11 = 1011 in binary. Shift to the right and you have 101, which is 5 in decimal.
Bit shifting is the same as division or multiplication by 2^n. In integer arithmetics the result gets rounded towards zero to an integer. In floating-point arithmetics bit shifting is not permitted.
Internally bit shifting, well, shifts bits, and the rounding simply means bits that fall off an edge simply getting removed (not that it would actually calculate the precise value and then round it). The new bits that appear on the opposite edge are always zeroes for the right hand side and for positive values. For negative values, one bits are appended on the left hand side, so that the value stays negative (see how two's complement works) and the arithmetic definition that I used still holds true.
In most statically-typed languages, the return type of the operation is e.g. "int". This precludes a fractional result, much like integer division.
(There are better answers about what's 'under the hood', but you don't need to understand those to grok the basics of the type system.)

C/C++ Date Solution/Conversion

I need to come up with a way to unpack a date into a readable format. unfortunately I don't completely understand the original process/code that was used.
Per information that was forwarded to me the date was packed using custom C/Python code as follows;
date = year << 20;
date |= month << 16;
date |= day << 11;
date |= hour << 6;
date |= minute;
For example, a recent packed date is 2107224749 which equates to Tuesday Sept. 22 2009 10:45am
I understand....or at least I am pretty sure....the << is shifting the bits but I am not sure what the "|" accomplishes.
Also, in order to unpack the code the notes read as follows;
year = (date & 0xfff00000) >> 20;
month = (date & 0x000f0000) >> 16;
day = (date & 0x0000f800) >> 11;
hour = (date & 0x000007c0) >> 6;
minute = (date & 0x0000003f);
Ultimately, what I need to do is perform the unpack and convert to readable format using either JavaScript or ASP but I need to better understand the process above in order to develop a solution.
Any help, hints, tips, pointers, ideas, etc. would be greatly appreciated.
The pipe (|) is bitwise or, it is used to combine the bits into a single value.
The extraction looks straight-forward, except I would recommend shifting first, and masking then. This keeps the constant used for the mask as small as possible, which is easier to manage (and can possibly be a tad more efficient, although for this case that hardly matters).
Looking at the masks used written in binary reveals how many bits are used for each field:
0xfff00000 has 12 bits set, so 12 bits are used for the year
0x000f0000 has 4 bits set, for the month
0x0000f800 has 5 bits set, for the day
0x000007c0 has 5 bits set, for the hour
0x0000003f has 6 bits set, for the minute
The idea is exactly what you said. Performing "<<" just shifts the bits to the left.
What the | (bitwise or) is accomplishing is basically adding more bits to the number, but without overwriting what was already there.
A demonstration of this principle might help.
Let's say we have a byte (8 bits), and we have two numbers that are each 4 bits, which we want to "put together" to make a byte. Assume the numbers are, in binary, 1010, and 1011. So we want to end up with the byte: 10101011.
Now, how do we do this? Assume we have a byte b, which is initialized to 0.
If we take the first number we want to add, 1010, and shift it by 4 bits, we get the number 10100000 (the shift adds bytes to the right of the number).
If we do: b = (1010 << 4), b will have the value 10100000.
But now, we want to add the 4 more bits (0011), without touching the previous bits. To do this, we can use |. This is because the | operator "ignores" anything in our number which is zero. So when we do:
10100000 (b's current value)
|
00001011 (the number we want to add)
We get:
10101011 (the first four numbers are copied from the first number,
the other four numbers copied from the second number).
Note: This answer came out a little long, I'm wikiing this, so, if anyone here has a better idea how to explain it, I'd appreciate your help.
These links might help:
http://www.gamedev.net/reference/articles/article1563.asp
http://compsci.ca/v3/viewtopic.php?t=9893
In the decode section & is bit wise and the 0xfff00000 is a hexadecimal bit mask. Basically each character in the bit mask represents 4 bits of the number. 0 being 0000 in binary and f being 1111 so if you look at the operation in binary you are anding 1111 1111 1111 0000 0000 ... with whatever is in date so basically you are getting the upper three nibbles(half bytes) and shifting them down so that 00A00000 gives you 10(A in hex) for the year.
Also note that |= is like += it is bit wise or then assignment rolled in to one.
Just to add some practical tips:
minute = value & ((1 << 6)-1);
hour = (value >> 6) & ((1<<5)-1); // 5 == 11-6 == bits reserved for hour
...
1 << 5 creates a bit at position 5 (i.e. 32=00100000b),
(1<<5)-1 cretaes a bit mask where the 5 lowest bits are set (i.e. 31 == 00011111b)
x & ((1<<5)-1) does a bitwise 'and' preserving only the bits set in the lowest five bits, extracting the original hour value.
Yes the << shifts bits and the | is the bitwise OR operator.

Categories