I am trying to write a swap function in C# to mimic the one in Delphi. According to the documentation, the Delphi version does the following:
If the number is two bytes, bytes 1 and 2 are swapped.
If the number is four bytes, bytes 1 and 4 are swapped and bytes 2 and 3 remain where they are.
Below is the code I have.
int number = 17665024;
var hi = (byte)(number >> 24);
var lo = (byte)(number & 0xff);
return (number & 0x00FFFF00) + (lo & 0xFF000000) + (hi & 0x000000FF);
Some numbers seem to return what I expect, but most do not.
// Value in    Expected    Actual
17665024       887809      887809
5376           21          5376
-30720         136         16746751
3328           13          3328
It's probably a fairly obvious mistake to most, but I haven't dealt with bitwise shift operators much and I cannot seem to work out what I have done wrong.
Thanks in advance.
In C#, the data types short and int correspond to integral data types of 2 bytes and 4 bytes, respectively. The algorithm above applies to int (4 bytes).
This algorithm contains an error: (lo & 0xFF000000) will always return 0 because lo is a byte. What you probably intended was lo << 24, which shifts lo 24 bits to the left.
For an int data type, the proper function then becomes:
int SwapInt(int number)
{
var hi = (byte)(number >> 24);
var lo = (byte)(number & 0xff);
return ((number & 0xffff00) | (lo << 24) | hi);
}
For a short data type, the middle term disappears and we are left with simply:
short SwapShort(short number)
{
var hi = (byte)(number >> 8);
var lo = (byte)(number & 0xff);
return (short)((lo << 8) | hi);
}
Then SwapShort((short)5376) returns the expected value of 21. Note that the literal 5376 defaults to the int type, so SwapInt(5376) returns 5376. To treat integers that can be wholly expressed in two bytes as short, you can use a wrapper:
int Swap(int n)
{
if (n >= short.MinValue && n <= short.MaxValue)
return SwapShort((short)n);
else
return SwapInt(n);
}
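As a quick sanity check, here is a hypothetical usage snippet (assuming the three methods above are in scope) that reproduces the expected column from the question:
// Hypothetical check against the values from the question:
Console.WriteLine(Swap(17665024)); // 887809
Console.WriteLine(Swap(5376));     // 21
Console.WriteLine(Swap(-30720));   // 136
Console.WriteLine(Swap(3328));     // 13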
I'm trying to implement a function that performs a circular rotation of a byte to the left and to the right.
I wrote the same code for both operations. For example, if you are rotating left, 1010 becomes 0101. Is this right?
unsigned char rotl(unsigned char c) {
int w;
unsigned char s = c;
for (w = 7; w >= 0; w--) {
int b = (int)getBit(c, w);
if (b == 0) {
s = clearBit(s, 7 - w);
} else if (b == 1) {
s = setBit(s, 7 - w);
}
}
return s;
}
unsigned char getBit(unsigned char c, int n) {
return c = (c & (1 << n)) >> n;
}
unsigned char setBit(unsigned char c, int n) {
return c = c | (1 << n);
}
unsigned char clearBit(unsigned char c, int n) {
return c = c &(~(1 << n));
}
There is no rotation operator in C, but if you write:
unsigned char rotl(unsigned char c)
{
return (c << 1) | (c >> 7);
}
then, according to this: http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf (page 56), compilers will figure out what you want to do and perform the rotation in only one (very fast) instruction.
Reading the answers and comments so far, there seems to be some confusion about what you are trying to accomplish - this may be because of the words you use. In bit manipulation, there are several "standard" things you can do. I will summarize some of these to help clarify the different concepts. In all that follows, I will use abcdefgh to denote 8 bits (each could be a one or a zero); as they move around, the same letter will refer to the same bit (possibly in a different position), and if a bit becomes definitely 0 or 1, I will write it as such.
1) Bit shifting: This is essentially a "fast multiply or divide by a power of 2". The symbol used is << for "left shift" (multiply) or >> for right shift (divide). Thus
abcdefgh >> 2 = 00abcdef
(equivalent to "divide by four") and
abcdefgh << 3 = abcdefgh000
(equivalent to "multiply by eight" - and assuming there was "space" to shift the abc into; otherwise this might result in an overflow)
2) Bit masking: sometimes you want to set certain bits to zero. You do this by doing an AND operation with a number that has ones where you want to preserve a bit, and zeros where you want to clear a bit.
abcdefgh & 01011010 = 0b0de0g0
Or if you want to make sure certain bits are one, you use the OR operation:
abcdefgh | 01011010 = a1c11f1h
3) Circular shift: this is a bit trickier - there are instances where you want to "move bits around", with the ones that "fall off at one end" re-appearing at the other end. There is no symbol for this in C, and no "quick instruction" (although most processors have a built-in instruction which assembler code can take advantage of for FFT calculations and such). If you want to do a "left circular shift" by three positions:
circshift(abcdefgh, 3) = defghabc
(note: there is no circshift function in the standard C libraries, although it exists in other languages - e.g. Matlab). By the same token a "right shift" would be
circshift(abcdefgh, -2) = ghabcdef
4) Bit reversal: Sometimes you need to reverse the bits in a number. When reversing the bits, there is no "left" or "right" - reversed is reversed:
reverse(abcdefgh) = hgfedcba
Again, there isn't actually a "reverse" function in standard C libraries.
Now, let's take a look at some tricks for implementing these last two functions (circshift and reverse) in C. There are entire websites devoted to "clever ways to manipulate bits" - see for example this excellent one for a wonderful collection of "bit hacks", although some of these may be a little advanced...
unsigned char circshift(unsigned char x, int n) {
return (x << n) | (x >> (8 - n));
}
This uses two tricks from the above: shifting bits, and using the OR operation to set bits to specific values. Let's look at how it works, for n = 3 (note - I am ignoring bits above the 8th bit since the return type of the function is unsigned char):
(abcdefgh << 3) = defgh000
(abcdefgh >> (8 - 3)) = 00000abc
Taking the bitwise OR of these two gives
defgh000 | 00000abc = defghabc
Which is exactly the result we wanted. Note also that, conceptually, a << n is the same as a >> (-n); in other words, right shifting by a negative amount corresponds to left shifting by a positive amount, and vice versa (though C itself does not accept negative shift counts, so write the shift in the direction you need).
Now let's look at the reverse function. There are "fast ways" and "slow ways" to do this. Your code above gave a "very slow" way - let me show you a "very fast" way, assuming that your compiler allows the use of 64 bit (long long) integers.
unsigned char reverse(unsigned char b) {
return (b * 0x0202020202ULL & 0x010884422010ULL) % 1023;
}
You may ask yourself "what just happened"??? Let me show you:
b = abcdefgh
* 0x0000000202020202 = 00000000 00000000 0000000a bcdefgha bcdefgha bcdefgha bcdefgha bcdefgh0
& 0x0000010884422010 = 00000000 00000000 00000001 00001000 10000100 01000010 00100000 00010000
= 00000000 00000000 0000000a 0000f000 b0000g00 0c0000h0 00d00000 000e0000
Note that we now have all the bits exactly once - they are just in a rather strange pattern. The modulo-1023 division "collapses" the bits of interest on top of each other - it looks like magic, but the trick is that 1023 = 2^10 - 1, so taking the remainder adds the value up in 10-bit chunks, and each masked bit occupies a distinct position within its chunk, so they land in exactly the reversed order. The result is indeed
hgfedcba
A slightly less obscure way to achieve the same thing (a bit slower for a single byte, but it scales well to larger word sizes) recognizes that if you swap adjacent bits, then adjacent bit pairs, then adjacent nibbles (4-bit groups), and so on, you end up with a complete bit reversal. In that case, a byte reversal becomes
unsigned char bytereverse(unsigned char b) {
b = (b & 0x55) << 1 | (b & 0xAA) >> 1; // swap adjacent bits
b = (b & 0x33) << 2 | (b & 0xCC) >> 2; // swap adjacent pairs
b = (b & 0x0F) << 4 | (b & 0xF0) >> 4; // swap nibbles
return b;
}
In this case the following happens to byte b = abcdefgh:
b & 0x55 = abcdefgh & 01010101 = 0b0d0f0h << 1 = b0d0f0h0
b & 0xAA = abcdefgh & 10101010 = a0c0e0g0 >> 1 = 0a0c0e0g
OR these two to get badcfehg
Next line:
b & 0x33 = badcfehg & 00110011 = 00dc00hg << 2 = dc00hg00
b & 0xCC = badcfehg & 11001100 = ba00fe00 >> 2 = 00ba00fe
OR these to get dcbahgfe
last line:
b & 0x0F = dcbahgfe & 00001111 = 0000hgfe << 4 = hgfe0000
b & 0xF0 = dcbahgfe & 11110000 = dcba0000 >> 4 = 0000dcba
OR these to get hgfedcba
Which is the reversed byte you were after. It should be easy to see how just a couple more lines (similar to the above) get you to a reversed integer (32 bits). As the size of the number increases, this trick becomes more and more efficient, comparatively.
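To make that concrete, here is a sketch of the 32-bit version built from the same swap-adjacent-groups idea (my illustration, shown in C#; in C it is the same code with unsigned int):
static uint Reverse32(uint x)
{
    x = (x & 0x55555555u) << 1 | (x & 0xAAAAAAAAu) >> 1;   // swap adjacent bits
    x = (x & 0x33333333u) << 2 | (x & 0xCCCCCCCCu) >> 2;   // swap adjacent pairs
    x = (x & 0x0F0F0F0Fu) << 4 | (x & 0xF0F0F0F0u) >> 4;   // swap nibbles
    x = (x & 0x00FF00FFu) << 8 | (x & 0xFF00FF00u) >> 8;   // swap bytes
    return x << 16 | x >> 16;                              // swap 16-bit halves
}
Each line doubles the size of the groups being swapped, exactly as in the byte version above.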
I trust that the answer you were looking for is "somewhere" in the above. If nothing else I hope you have a clearer understanding of the possibilities of bit manipulation in C.
If, as your comments suggest, you want to rotate by exactly one bit, then one easy way to accomplish that would be this:
unsigned char rotl(unsigned char c)
{
return((c << 1) | (c >> 7));
}
What your code does is reverse the bits, not rotate them. For instance, it would turn 10111001 into 10011101, not 01110011.
I have read through this SO question about 32-bits, but what about 64-bit numbers? Should I just mask the upper and lower 4 bytes, perform the count on the 32-bits and then add them together?
You can find a 64-bit version here: http://en.wikipedia.org/wiki/Hamming_weight
It is something like this:
static long NumberOfSetBits(long i)
{
i = i - ((i >> 1) & 0x5555555555555555);
i = (i & 0x3333333333333333) + ((i >> 2) & 0x3333333333333333);
return (((i + (i >> 4)) & 0xF0F0F0F0F0F0F0F) * 0x101010101010101) >> 56;
}
This is a 64-bit version of the code from here: How to count the number of set bits in a 32-bit integer?
Using Joshua's suggestion I would transform it into this:
static int NumberOfSetBits(ulong i)
{
i = i - ((i >> 1) & 0x5555555555555555UL);
i = (i & 0x3333333333333333UL) + ((i >> 2) & 0x3333333333333333UL);
return (int)(unchecked(((i + (i >> 4)) & 0xF0F0F0F0F0F0F0FUL) * 0x101010101010101UL) >> 56);
}
EDIT: I found a bug while testing the 32-bit version. I added the missing parentheses; the sum should be done before the bitwise & in the last line.
EDIT 2: Added a safer version for ulong.
A fast (and more portable than using non-standard compiler extensions) way:
int bitcount(long long n)
{
int ret=0;
while (n!=0)
{
n&=(n-1);
ret++;
}
return ret;
}
Every time you do n &= (n - 1) you eliminate the lowest set bit in n. Thus this takes O(number of set bits) time.
This is faster than the O(log n) you would need if you tested every bit: not every bit is set unless the number is 0xFFFFFFFFFFFFFFFF, so you usually need far fewer iterations.
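Since the question is about C#, a minimal C# adaptation of the same trick might look like this (my sketch, not part of the answer above):
static int BitCount(ulong n)
{
    int count = 0;
    while (n != 0)
    {
        n &= n - 1; // clears the lowest set bit
        count++;
    }
    return count;
}
// BitCount(0xF0FFUL) == 12, for example.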
Standard answer in C#:
ulong val = //whatever
byte count = 0;
while (val != 0) {
if ((val & 0x1) == 0x1) count++;
val >>= 1;
}
This shifts val right one bit, and increments count if the rightmost bit is set. This is a general algorithm that can be used for any length integer.
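Wrapped up as a reusable method, a minimal sketch of the same loop (the method name is mine) would be:
static int CountSetBits(ulong val)
{
    int count = 0;
    while (val != 0)
    {
        if ((val & 0x1) == 0x1) count++; // test the rightmost bit
        val >>= 1;                       // then shift it out
    }
    return count;
}
// CountSetBits(0xFF) == 8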
I want to take the first 4 bits of one byte and all the bits of another byte and append them to each other.
This is the result I need to achieve:
This is what I have now:
private void ParseLocation(int UpperLogicalLocation, int UnderLogicalLocation)
{
int LogicalLocation = UpperLogicalLocation & 0x0F; // Take bit 0-3
LogicalLocation += UnderLogicalLocation;
}
But this is not giving the right results.
int UpperLogicalLocation_Offset = 0x51;
int UnderLogicalLocation = 0x23;
int LogicalLocation = UpperLogicalLocation & 0x0F; // Take bit 0-3
LogicalLocation += UnderLogicalLocation;
Console.Write(LogicalLocation);
This should give 0x51 (01010001) + 0x23 (00100011), so the result I want to achieve is 0001 + 00100011 = 000100100011 (0x123).
You will need to left-shift the UpperLogicalLocation bits by 8 before combining the bits:
int UpperLogicalLocation = 0x51;
int UnderLogicalLocation = 0x23;
int LogicalLocation = (UpperLogicalLocation & 0x0F) << 8; // Take bit 0-3 and shift
LogicalLocation |= UnderLogicalLocation;
Console.WriteLine(LogicalLocation.ToString("x"));
Note that I also changed += to |= to better express what is happening.
The problem is that you're storing the upper bits into bits 0-3 of LogicalLocation instead of bits 8-11. You need to shift the bits into the right place. The following change should fix the problem:
int LogicalLocation = (UpperLogicalLocation & 0x0F) << 8;
Also note that the bits are more idiomatically combined using the bitwise OR operator. So your second line becomes:
LogicalLocation |= UnderLogicalLocation;
You can do this:
int LogicalLocation = (UpperLogicalLocation & 0x0F) << 8; // Take bit 0-3
LogicalLocation |= (UnderLogicalLocation & 0xFF);
...but be careful about endianness! Your documentation says UpperLogicalLocation should be stored in Byte 3 and the next 8 bits in Byte 4. To achieve this, the resulting int LogicalLocation needs to be split into these two bytes correctly.
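As an illustration, here is a minimal sketch of splitting the combined value back into those two bytes (variable names are mine; which byte your protocol calls Byte 3 and Byte 4 depends on its byte order):
int LogicalLocation = 0x123;
byte upperFourBits = (byte)((LogicalLocation >> 8) & 0x0F); // 0x01 - bits 8-11
byte lowerByte = (byte)(LogicalLocation & 0xFF);            // 0x23 - bits 0-7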
What is the best way to divide a 32-bit integer into four (unsigned) chars in C#?
Quick'n'dirty:
int value = 0x48454C4F;
Console.WriteLine(Encoding.ASCII.GetString(
BitConverter.GetBytes(value).Reverse().ToArray()
));
Converting the int to bytes, reversing the byte-array for the correct order and then getting the ASCII character representation from it.
EDIT: The Reverse method is an extension method from .NET 3.5, just for info. Reversing the byte order may also not be needed in your scenario.
Char? Maybe you are looking for this handy little helper function?
Byte[] b = BitConverter.GetBytes(i);
Char c = (Char)b[0];
[...]
It's not clear if this is really what you want, but:
int x = yourNumber();
char a = (char)(x & 0xff);
char b = (char)((x >> 8) & 0xff);
char c = (char)((x >> 16) & 0xff);
char d = (char)((x >> 24) & 0xff);
This assumes you want the bytes interpreted as the lowest range of Unicode characters.
I have tried it a few ways and clocked the time taken to convert 1000000 ints.
Built-in convert method, 325000 ticks:
Encoding.ASCII.GetChars(BitConverter.GetBytes(x));
Pointer conversion, 100000 ticks:
static unsafe char[] ToChars(int x)
{
byte* p = (byte*)&x;
char[] chars = new char[4];
chars[0] = (char)*p++;
chars[1] = (char)*p++;
chars[2] = (char)*p++;
chars[3] = (char)*p;
return chars;
}
Bitshifting, 77000 ticks:
public static char[] ToCharsBitShift(int x)
{
char[] chars = new char[4];
chars[0] = (char)(x & 0xFF);
chars[1] = (char)(x >> 8 & 0xFF);
chars[2] = (char)(x >> 16 & 0xFF);
chars[3] = (char)(x >> 24 & 0xFF);
return chars;
}
To get the 8-bit blocks:
int a = i & 255; // bin 11111111
int b = i & 65280; // bin 1111111100000000
To break the upper three bytes down to single-byte values, just divide them by the proper power of 256 and perform another logical AND to get your final byte.
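For example, a minimal sketch of that divide-and-mask approach (I use uint here to avoid sign issues with the top byte; the value is just for illustration):
uint i = 0x48454C4F;                    // "HELO" in ASCII
uint a = i & 255u;                      // lowest byte:  0x4F
uint b = (i & 65280u) / 256u;           // second byte:  0x4C
uint c = (i & 16711680u) / 65536u;      // third byte:   0x45
uint d = (i & 4278190080u) / 16777216u; // highest byte: 0x48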
Edit: Jason's solution with the bitshifts is much nicer of course.
.NET uses Unicode; a char is 2 bytes, not 1.
To convert binary data containing non-Unicode text, use the System.Text.Encoding class.
If you do want 4 bytes and not chars, then replace char with byte in Jason's answer.
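That substitution would look something like this (a sketch based on the bit-shifting version above; the method name is mine):
public static byte[] ToBytesBitShift(int x)
{
    byte[] bytes = new byte[4];
    bytes[0] = (byte)(x & 0xFF);       // lowest byte
    bytes[1] = (byte)(x >> 8 & 0xFF);
    bytes[2] = (byte)(x >> 16 & 0xFF);
    bytes[3] = (byte)(x >> 24 & 0xFF); // highest byte
    return bytes;
}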