Hamming Weight of Int64 [duplicate] - c#

I have read through this SO question about 32-bits, but what about 64-bit numbers? Should I just mask the upper and lower 4 bytes, perform the count on the 32-bits and then add them together?

You can find 64 bit version here http://en.wikipedia.org/wiki/Hamming_weight
It is something like this
static long NumberOfSetBits(long i)
{
    i = i - ((i >> 1) & 0x5555555555555555);
    i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333);
    return (((i + (i >> 4)) & 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >> 56;
}
This is a 64-bit version of the code from here: How to count the number of set bits in a 32-bit integer?
Using Joshua's suggestion I would transform it into this:
static int NumberOfSetBits(ulong i)
{
    i = i - ((i >> 1) & 0x5555555555555555UL);
    i = (i & 0x3333333333333333UL) + ((i >> 2) & 0x3333333333333333UL);
    return (int)(unchecked(((i + (i >> 4)) & 0xF0F0F0F0F0F0F0FUL) * 0x101010101010101UL) >> 56);
}
EDIT: I found a bug while testing the 32-bit version and added the missing parentheses: in the last line, the sum must be computed before the bitwise &.
EDIT 2: Added a safer version for ulong.
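For readers who want to see what each line of the parallel count does, here is a minimal C sketch of the same four steps, written with explicit intermediate comments. The function name popcount64 is an assumption made for this illustration; it is not taken from the answers above.

```c
#include <stdint.h>

/* Hypothetical helper: counts set bits in a 64-bit value with the
   same parallel reduction as the C# version above. */
int popcount64(uint64_t x)
{
    /* Each 2-bit field now holds the number of set bits it contained (0..2). */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Add neighbouring 2-bit counts into 4-bit fields (0..4). */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Add neighbouring 4-bit counts into bytes (0..8). */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiplication sums all eight bytes into the top byte. */
    return (int)((x * 0x0101010101010101ULL) >> 56);
}
```

Using an unsigned type avoids any question about how the right shifts treat the sign bit, which is the same reason the ulong version above is the safer one.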

A fast way (and more portable than using non-standard compiler extensions):
int bitcount(unsigned long long n)
{
    int ret = 0;
    while (n != 0)
    {
        n &= (n - 1); /* clears the lowest set bit */
        ret++;
    }
    return ret;
}
Every time you do n &= (n - 1) you eliminate the lowest set bit in n. Thus this takes O(number of set bits) time.
This is faster than the O(log n) you would need if you tested every bit: not every bit is set unless the number is 0xFFFFFFFFFFFFFFFF, so usually you need far fewer iterations.

Standard answer in C#:
ulong val = 0xF0UL; // example input; use whatever value you need
byte count = 0;
while (val != 0) {
    if ((val & 0x1) == 0x1) count++;
    val >>= 1;
}
This increments count if the rightmost bit is set, then shifts val right by one bit, repeating until val is zero. It is a general algorithm that can be used for integers of any length.
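The idea raised in the question (count each 32-bit half separately and add the results) also works. A minimal C sketch, assuming a 32-bit counter is already available; here popcount32 is a hypothetical helper that uses the clear-lowest-set-bit loop, and popcount64_split is likewise a name invented for this example.

```c
#include <stdint.h>

/* Hypothetical 32-bit counter: clears one set bit per iteration. */
static int popcount32(uint32_t x)
{
    int count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Split the 64-bit value into two 32-bit halves and add the counts. */
int popcount64_split(uint64_t x)
{
    uint32_t lo = (uint32_t)(x & 0xFFFFFFFFu);
    uint32_t hi = (uint32_t)(x >> 32);
    return popcount32(lo) + popcount32(hi);
}
```

Any correct 32-bit counter could be substituted for popcount32; the split itself is just a mask for the low half and a shift for the high half.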

Related

Efficiently shift an integer to the left and wrap around [duplicate]

I'm trying to implement a function that performs a circular rotation of a byte to the left and to the right.
I wrote the same code for both operations. For example, if you are rotating left 1010 becomes 0101. Is this right?
unsigned char rotl(unsigned char c) {
    int w;
    unsigned char s = c;
    for (w = 7; w >= 0; w--) {
        int b = (int)getBit(c, w);
        if (b == 0) {
            s = clearBit(s, 7 - w);
        } else if (b == 1) {
            s = setBit(s, 7 - w);
        }
    }
    return s;
}
unsigned char getBit(unsigned char c, int n) {
    return c = (c & (1 << n)) >> n;
}
unsigned char setBit(unsigned char c, int n) {
    return c = c | (1 << n);
}
unsigned char clearBit(unsigned char c, int n) {
    return c = c &(~(1 << n));
}
There is no rotation operator in C, but if you write:
unsigned char rotl(unsigned char c)
{
    return (c << 1) | (c >> 7);
}
then, according to this: http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf (page 56), compilers will figure out what you want to do and perform the rotation in only one (very fast) instruction.
Reading the answers and comments so far, there seems to be some confusion about what you are trying to accomplish - this may be because of the words you use. In bit manipulation, there are several "standard" things you can do. I will summarize some of these to help clarify different concepts. In all that follows, I will use abcdefgh to denote 8 bits (could be ones or zeros) - and as they move around, the same letter will refer to the same bit (maybe in a different position); if a bit becomes "definitely" 0 or 1, I will denote it as such.
1) Bit shifting: This is essentially a "fast multiply or divide by a power of 2". The symbol used is << for "left shift" (multiply) or >> for right shift (divide). Thus
abcdefgh >> 2 = 00abcdef
(equivalent to "divide by four") and
abcdefgh << 3 = abcdefgh000
(equivalent to "multiply by eight" - and assuming there was "space" to shift the abc into; otherwise this might result in an overflow)
2) Bit masking: sometimes you want to set certain bits to zero. You do this by doing an AND operation with a number that has ones where you want to preserve a bit, and zeros where you want to clear a bit.
abcdefgh & 01011010 = 0b0de0g0
Or if you want to make sure certain bits are one, you use the OR operation:
abcdefgh | 01011010 = a1c11f1h
3) Circular shift: this is a bit trickier - there are instances where you want to "move bits around", with the ones that "fall off at one end" re-appearing at the other end. There is no symbol for this in C, and no "quick instruction" (although most processors have a built-in instruction which assembler code can take advantage of for FFT calculations and such). If you want to do a "left circular shift" by three positions:
circshift(abcdefgh, 3) = defghabc
(note: there is no circshift function in the standard C libraries, although it exists in other languages - e.g. Matlab). By the same token a "right shift" would be
circshift(abcdefgh, -2) = ghabcdef
4) Bit reversal: Sometimes you need to reverse the bits in a number. When reversing the bits, there is no "left" or "right" - reversed is reversed:
reverse(abcdefgh) = hgfedcba
Again, there isn't actually a "reverse" function in standard C libraries.
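The first two operations in the list above (shifting and masking) can be tried on concrete bytes. The following is a small C sketch; the helper names keep_bits, set_bits and shl8 are made up for this illustration, and the mask 0x5A is the same 01011010 pattern used in the examples.

```c
#include <stdint.h>

/* AND with a mask: bits where the mask is 0 are cleared, the rest kept. */
uint8_t keep_bits(uint8_t value, uint8_t mask)
{
    return value & mask;
}

/* OR with a mask: bits where the mask is 1 are forced to 1. */
uint8_t set_bits(uint8_t value, uint8_t mask)
{
    return value | mask;
}

/* Left shift confined to 8 bits: anything pushed past bit 7 is discarded
   by the cast, which is the "no space to shift into" case. Assumes n < 8. */
uint8_t shl8(uint8_t value, unsigned n)
{
    return (uint8_t)(value << n);
}
```

For example, shl8(0xE1, 3) takes 11100001 and leaves 00001000, because the three leading ones have nowhere to go in an 8-bit result.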
Now, let's take a look at some tricks for implementing these last two functions (circshift and reverse) in C. There are entire websites devoted to "clever ways to manipulate bits" - see for example this excellent one for a wonderful collection of "bit hacks", although some of these may be a little advanced...
unsigned char circshift(unsigned char x, int n) {
    return (x << n) | (x >> (8 - n));
}
This uses two tricks from the above: shifting bits, and using the OR operation to set bits to specific values. Let's look at how it works, for n = 3 (note - I am ignoring bits above the 8th bit since the return type of the function is unsigned char):
(abcdefgh << 3) = defgh000
(abcdefgh >> (8 - 3)) = 00000abc
Taking the bitwise OR of these two gives
defgh000 | 00000abc = defghabc
Which is exactly the result we wanted. Note that this function only works for n from 0 to 8: in C, shifting by a negative count is undefined behavior, so a << n is not the same as a >> (-n). To rotate right by k positions, rotate left by 8 - k instead.
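Since C leaves shifts by a negative count undefined, a circular shift that accepts negative counts (as in the circshift(abcdefgh, -2) notation used earlier) has to normalise the count first. A minimal sketch, assuming 8-bit values; the name circshift8 is chosen here for illustration only.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by n positions; a negative n rotates right. */
uint8_t circshift8(uint8_t x, int n)
{
    /* Map any count, including negative ones, onto 0..7. */
    unsigned k = (unsigned)(((n % 8) + 8) % 8);
    if (k == 0)
        return x;                       /* nothing to rotate */
    /* Both shift counts are now in 1..7, so neither shift is undefined. */
    return (uint8_t)((x << k) | (x >> (8 - k)));
}
```

With this normalisation, a right rotation by 2 becomes a left rotation by 6, which is the same permutation of the bits.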
Now let's look at the reverse function. There are "fast ways" and "slow ways" to do this. Your code above gave a "very slow" way - let me show you a "very fast" way, assuming that your compiler allows the use of 64 bit (long long) integers.
unsigned char reverse(unsigned char b) {
    return (b * 0x0202020202ULL & 0x010884422010ULL) % 1023;
}
You may ask yourself "what just happened"??? Let me show you:
b = abcdefgh
* 0x0000000202020202 = 00000000 00000000 0000000a bcdefgha bcdefgha bcdefgha bcdefgha bcdefgh0
& 0x0000010884422010 = 00000000 00000000 00000001 00001000 10000100 01000010 00100000 00010000
= 00000000 00000000 0000000a 0000f000 b0000g00 0c0000h0 00d00000 000e0000
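Note that we now have all the bits exactly once - they are just in a rather strange pattern. The modulo 1023 operation "collapses" the bits of interest on top of each other. This works because 1023 = 2^10 - 1, so 2^10 leaves a remainder of 1 and reducing modulo 1023 moves every bit from position p to position p mod 10. The surviving bits sit at positions 40 (a), 35 (f), 31 (b), 26 (g), 22 (c), 17 (h), 13 (d) and 4 (e), which land on positions 0, 5, 1, 6, 2, 7, 3 and 4 without colliding. The result is indeed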
hgfedcba
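One way to convince yourself that the multiply-and-modulo trick really reverses every byte is to compare it against a plain loop for all 256 inputs. The sketch below does that in C; reverse_mul, reverse_loop and reverse_methods_agree are names invented for this check.

```c
#include <stdint.h>

/* The multiply / mask / modulo reversal from above. */
uint8_t reverse_mul(uint8_t b)
{
    return (uint8_t)(((b * 0x0202020202ULL) & 0x010884422010ULL) % 1023);
}

/* Straightforward reference: move bits one at a time, lowest bit first. */
uint8_t reverse_loop(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1));
        b >>= 1;
    }
    return r;
}

/* Returns 1 if both methods agree on every possible byte, 0 otherwise. */
int reverse_methods_agree(void)
{
    for (int v = 0; v < 256; v++) {
        if (reverse_mul((uint8_t)v) != reverse_loop((uint8_t)v))
            return 0;
    }
    return 1;
}
```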
A slightly less obscure way to achieve the same thing (less efficient for a single byte, but it scales well to larger numbers) recognizes that if you swap adjacent bits, then adjacent bit pairs, then adjacent nibbles (4-bit groups), etc. - you end up with a complete bit reversal. In that case, a byte reversal becomes
unsigned char bytereverse(unsigned char b) {
    b = (b & 0x55) << 1 | (b & 0xAA) >> 1; // swap adjacent bits
    b = (b & 0x33) << 2 | (b & 0xCC) >> 2; // swap adjacent pairs
    b = (b & 0x0F) << 4 | (b & 0xF0) >> 4; // swap nibbles
    return b;
}
In this case the following happens to byte b = abcdefgh:
b & 0x55 = abcdefgh & 01010101 = 0b0d0f0h << 1 = b0d0f0h0
b & 0xAA = abcdefgh & 10101010 = a0c0e0g0 >> 1 = 0a0c0e0g
OR these two to get badcfehg
Next line:
b & 0x33 = badcfehg & 00110011 = 00dc00hg << 2 = dc00hg00
b & 0xCC = badcfehg & 11001100 = ba00fe00 >> 2 = 00ba00fe
OR these to get dcbahgfe
last line:
b & 0x0F = dcbahgfe & 00001111 = 0000hgfe << 4 = hgfe0000
b & 0xF0 = dcbahgfe & 11110000 = dcba0000 >> 4 = 0000dcba
OR these to get hgfedcba
Which is the reversed byte you were after. It should be easy to see how just a couple more lines (similar to the above) get you to a reversed integer (32 bits). As the size of the number increases, this trick becomes more and more efficient, comparatively.
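The "couple more lines" for a 32-bit value might look like the following C sketch: the same three swaps with wider masks, followed by a byte swap and a half-word swap. The function name reverse32 is an assumption for this example.

```c
#include <stdint.h>

/* Reverse all 32 bits by swapping progressively larger groups. */
uint32_t reverse32(uint32_t v)
{
    v = ((v & 0x55555555u) << 1) | ((v & 0xAAAAAAAAu) >> 1);  /* adjacent bits */
    v = ((v & 0x33333333u) << 2) | ((v & 0xCCCCCCCCu) >> 2);  /* adjacent pairs */
    v = ((v & 0x0F0F0F0Fu) << 4) | ((v & 0xF0F0F0F0u) >> 4);  /* nibbles */
    v = ((v & 0x00FF00FFu) << 8) | ((v & 0xFF00FF00u) >> 8);  /* bytes */
    v = (v << 16) | (v >> 16);                                /* 16-bit halves */
    return v;
}
```

Five steps cover 32 bits where three covered 8, so the number of steps grows with the logarithm of the width, which is why the approach becomes comparatively cheaper for larger numbers.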
I trust that the answer you were looking for is "somewhere" in the above. If nothing else I hope you have a clearer understanding of the possibilities of bit manipulation in C.
If, according to your comments, you want to rotate by exactly one bit, then one easy way to accomplish that would be this:
unsigned char rotl(unsigned char c)
{
    return((c << 1) | (c >> 7));
}
What your code does is reverse the bits, not rotate them. For instance, it would turn 10111001 into 10011101, not 01110011.

How to duplicate Delphi's swap function in c#

I am trying to write a swap function in C# to mimic the one in Delphi. According to the documentation the one in Delphi will do the following:
If the number is two bytes, bytes 1 and 2 are swapped
if the number is four bytes, bytes 1 and 4 are swapped, bytes 2 and 3 remain where they are
Below is the code I have.
int number = 17665024;
var hi = (byte)(number >> 24);
var lo = (byte)(number & 0xff);
return (number & 0x00FFFF00) + (lo & 0xFF000000) + (hi & 0x000000FF);
Some numbers seem to return what I expect, but most do not.
// Value in    // Expected    // Actual
17665024       887809         887809
5376           21             5376
-30720         136            16746751
3328           13             3328
It's probably a fairly obvious mistake to most, but I haven't dealt with bitwise shift operators much and I cannot seem to work out what I have done wrong.
Thanks in advance.
In C#, the data types short and int correspond to integral data types of 2 bytes and 4 bytes, respectively. The algorithm above applies to int (4 bytes).
This algorithm contains an error: (lo & 0xFF000000) will always return 0 because lo is a byte. What you probably intended was lo << 24, which shifts lo 24 bits to the left.
For an int data type, the proper function then becomes:
int SwapInt(int number)
{
    var hi = (byte)(number >> 24);
    var lo = (byte)(number & 0xff);
    return ((number & 0xffff00) | (lo << 24) | hi);
}
For a short data type, the middle term disappears and we are left with simply:
short SwapShort(short number)
{
    var hi = (byte)(number >> 8);
    var lo = (byte)(number & 0xff);
    return (short)((lo << 8) | hi);
}
Then SwapShort((short)5376) returns the expected value of 21. Note that SwapInt(5376) treats 5376 as a four-byte int and returns 5376 unchanged, because its first and fourth bytes are both zero. To treat integers that can be wholly expressed in two bytes as short, you can run:
int Swap(int n)
{
    if (n >= short.MinValue && n <= short.MaxValue)
        return SwapShort((short)n);
    else
        return SwapInt(n);
}
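The same two swaps can be written with unsigned types, which sidesteps the sign handling entirely. The C sketch below is only an illustration of the byte movements described in the question; swap16 and swap32_outer are hypothetical names, and the test values are the ones from the question's table.

```c
#include <stdint.h>

/* Two-byte case: exchange the two bytes. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Four-byte case as described in the question: exchange the lowest and
   highest bytes and leave the two middle bytes where they are. */
uint32_t swap32_outer(uint32_t v)
{
    uint32_t lo = v & 0xFFu;    /* byte 1 (least significant) */
    uint32_t hi = v >> 24;      /* byte 4 (most significant) */
    return (v & 0x00FFFF00u) | (lo << 24) | hi;
}
```

Here swap16(5376) gives 21 and swap32_outer(17665024) gives 887809, matching the expected column above.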

How do I properly loop through and print bits of an Int, Long, Float, or BigInteger?

I'm trying to debug some bit shifting operations and I need to visualize the bits as they exist before and after a Bit-Shifting operation.
I read from this answer that I may need to handle backfill from the shifting, but I'm not sure what that means.
I think that by asking this question (how do I print the bits in a int) I can figure out what the backfill is, and perhaps some other questions I have.
Here is my sample code so far.
static string GetBits(int num)
{
    StringBuilder sb = new StringBuilder();
    uint bits = (uint)num;
    while (bits!=0)
    {
        bits >>= 1;
        isBitSet = // somehow do an | operation on the first bit.
                   // I'm unsure if it's possible to handle different data types here
                   // or if unsafe code and a PTR is needed
        if (isBitSet)
            sb.Append("1");
        else
            sb.Append("0");
    }
}
Convert.ToString(56, 2).PadLeft(8, '0') returns "00111000".
This width is for a byte; it works for an int too if you pad to 32 instead of 8.
To test if the last bit is set you could use:
isBitSet = ((bits & 1) == 1);
But you should do so before shifting right (not after), otherwise you'll miss the first bit:
isBitSet = ((bits & 1) == 1);
bits = bits >> 1;
But a better option would be to use the static methods of the BitConverter class to get the actual bytes used to represent the number in memory into a byte array. The advantage (or disadvantage depending on your needs) of this method is that this reflects the endianness of the machine running the code.
byte[] bytes = BitConverter.GetBytes(num);
int bitPos = 0;
while (bitPos < 8 * bytes.Length)
{
    int byteIndex = bitPos / 8;
    int offset = bitPos % 8;
    bool isSet = (bytes[byteIndex] & (1 << offset)) != 0;
    // isSet is true if the bit at bitPos is set, false otherwise
    bitPos++;
}
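For comparison, here is a minimal C sketch that writes all 32 bits of a value into a string, most significant bit first, including leading zeros. The function name bits_to_string32 and the caller-supplied 33-byte buffer are assumptions of this example. Because the value is unsigned, each right shift fills the vacated high bits with zeros, which is the "backfill" behaviour that matters when visualising shifts.

```c
#include <stdint.h>

/* Writes the 32 bits of value into out as '0'/'1' characters, most
   significant bit first. out must have room for 33 bytes (32 + NUL). */
void bits_to_string32(uint32_t value, char out[33])
{
    for (int i = 0; i < 32; i++) {
        /* Character i shows bit (31 - i); shifting an unsigned value
           right never brings in ones from the left. */
        out[i] = ((value >> (31 - i)) & 1u) ? '1' : '0';
    }
    out[32] = '\0';
}
```

Unlike the while (bits != 0) loop in the question, this version always emits the full width, so values with leading zeros print at a fixed length.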

C# - Making one Int64 from two Int32s

Is there a function in c# that takes two 32 bit integers (int) and returns a single 64 bit one (long)?
Sounds like there should be a simple way to do this, but I couldn't find a solution.
Try the following
public long MakeLong(int left, int right) {
    // implicit conversion of left to a long
    long res = left;
    // shift the bits, creating an empty space on the right
    // ex: 0x000000000000CFFF becomes 0x0000CFFF00000000
    res = (res << 32);
    // combine the bits on the right with the previous value
    // ex: 0x0000CFFF00000000 | 0x000000000000ABCD becomes 0x0000CFFF0000ABCD
    res = res | (long)(uint)right; // uint first to prevent sign extension
    // return the combined result
    return res;
}
Just for clarity: while the accepted answer does appear to work correctly, the one-liners presented in the other answers do not appear to produce accurate results.
Here is a one liner that does work:
long correct = (long)left << 32 | (long)(uint)right;
Here is some code so you can test it for yourself:
long original = 1979205471486323557L;
int left = (int)(original >> 32);
int right = (int)(original & 0xffffffffL);
long correct = (long)left << 32 | (long)(uint)right;
long incorrect1 = (long)(((long)left << 32) | (long)right);
long incorrect2 = ((Int64)left << 32 | right);
long incorrect3 = (long)(left * uint.MaxValue) + right;
long incorrect4 = (long)(left * 0x100000000) + right;
Console.WriteLine(original == correct);
Console.WriteLine(original == incorrect1);
Console.WriteLine(original == incorrect2);
Console.WriteLine(original == incorrect3);
Console.WriteLine(original == incorrect4);
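The difference between the correct and incorrect one-liners comes down to how the low half is widened. The C sketch below shows both behaviours side by side; make_int64 and make_int64_naive are names invented for this comparison, and the final conversion back to a signed type assumes the usual two's complement representation.

```c
#include <stdint.h>

/* Correct: widen the low half as unsigned so its upper 32 bits are zero. */
int64_t make_int64(int32_t high, int32_t low)
{
    uint64_t bits = ((uint64_t)(uint32_t)high << 32) | (uint32_t)low;
    return (int64_t)bits;   /* assumes two's complement conversion */
}

/* Pitfall: widening low as a signed value sign-extends it, so a negative
   low half overwrites the high half with ones. */
int64_t make_int64_naive(int32_t high, int32_t low)
{
    uint64_t bits = ((uint64_t)(uint32_t)high << 32) | (uint64_t)(int64_t)low;
    return (int64_t)bits;   /* assumes two's complement conversion */
}
```

With high = 1 and low = -1, the first function yields 0x1FFFFFFFF while the second yields -1, because the sign-extended low half sets every bit of the result.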
Try
(long)(((long)i1 << 32) | (long)i2)
This shifts the first int left by 32 bits (the length of an int), then ORs in the second int, so you end up with the two ints concatenated together in a long.
Be careful with the sign bit. Here is a fast ulong solution that is also not portable from little endian to big endian:
var a = 123;
var b = -123;
unsafe
{
    ulong result = *(uint*)&a;
    result <<= 32;
    result |= *(uint*)&b;
}
This should do the trick
((Int64) a << 32 | b)
Where a and b are Int32. Although you might want to check what happens with the highest bits. Or just put it inside an "unchecked {...}" block.
You have to be careful with bit twiddling like this, though, because you'll have issues on little-endian/big-endian machines (e.g. Mono platforms aren't always little endian). Plus you have to deal with sign extension. The following arithmetic version is intended to do the same thing in a platform-agnostic way (note that the exact equivalent of a 32-bit shift is multiplication by 2^32, i.e. uint.MaxValue + 1):
return (long)( high * uint.MaxValue ) + low;
When jitted at runtime it will result in performance similar to the bit twiddling. That's one of the nice things about JIT-compiled languages.
There is a problem when i2 < 0: sign extension sets the high 32 bits (0xFFFFFFFF, 1xxx... in binary), so thecoop's answer was wrong.
Better would be something like (Int64)(((UInt64)i1 << 32) | (UInt32)i2)
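Or simply the C++ way, with pointers (this writes low to the lower address, so it assumes a little-endian machine):
public static unsafe UInt64 MakeLong(UInt32 low, UInt32 high)
{
    UInt64 retVal;
    UInt32* ptr = (UInt32*)&retVal;
    *ptr++ = low;
    *ptr = high;
    return retVal;
}
or inline, in an unsafe block:
UInt64 retVal;
unsafe
{
    UInt32* ptr = (UInt32*)&retVal;
    *ptr++ = low;
    *ptr = high;
}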
But the best solution I found is here ;-)
[StructLayout(LayoutKind.Explicit)]
[FieldOffset()]
https://stackoverflow.com/questions/12898591
(it works even without unsafe)
Anyway, FieldOffset applies to each field, so you have to specify the position of each half separately. Remember that negative numbers are stored in two's complement, so e.g. low < 0 and high > 0 will not make sense: for example, -1, 0 will probably give an Int64 of 4294967295.
