How can I convert a ulong to a positive int? - C#

I have a piece of code that is
// Bernstein hash
// http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx
ulong result = (ulong)s[0];
for ( int i = 1; i < s.Length; ++i )
{
result = 33 * result + (ulong)s[i];
}
return (int)result % Buckets.Count;
and the problem is that it sometimes returns negative values. I know the reason is that (int)result can be negative. But I want to coerce it to be non-negative, since it's used as an index. Now I realize I could do
int k = (int)result % Buckets.Count;
k = k < 0 ? k*-1 : k;
return k;
but is there a better way?
On a deeper level, why is int used for the index of containers in C#? I come from a C++ background and we have size_t which is an unsigned integral type. That makes more sense to me.

Use
return (int)(result % (ulong)Buckets.Count);
As you sum up values, you eventually reach a number that cannot be expressed as a positive value in a signed 32-bit integer. The cast to int then yields a negative number, and the modulo operation also returns a negative result. If you do the modulo operation first, you get a small non-negative number, and the cast to int does no harm.
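A minimal sketch with made-up values (the 0x90000000 hash value and the bucket count of 97 are just illustrative) showing why the order of operations matters:
ulong result = 0x90000000;                      // larger than int.MaxValue
int buckets = 97;                               // stand-in for Buckets.Count
int castFirst = (int)result % buckets;          // cast first: (int)result is negative, so the remainder is negative
int modFirst = (int)(result % (ulong)buckets);  // modulo first: remainder computed in ulong, always in [0, buckets)
Console.WriteLine(castFirst);                   // a negative, unusable index
Console.WriteLine(modFirst);                    // a valid non-negative index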

While you can find a way to cast this to an int properly, I'm wondering why you don't just calculate it as an int from the beginning.
int result = (int)s[0]; // or, if s[0] is already an int, omit the cast
for ( int i = 1; i < s.Length; ++i )
{
result = 33 * result + (int)s[i];
}
return Math.Abs(result) % Buckets.Count;
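One edge case worth noting (my addition, not part of the answer): Math.Abs(int.MinValue) throws an OverflowException, so a wrap-safe variant of the last line could look like this sketch:
int k = result % Buckets.Count;          // may be negative when result is negative
return k < 0 ? k + Buckets.Count : k;    // shift negative remainders back into [0, Buckets.Count)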
As to why C# uses a signed int for indexes, it has to do with cross-language compatibility: the unsigned integer types are not CLS-compliant, so the framework's collection APIs use int so that every .NET language can consume them.

Shift bits of an integer only if the number of bits in its binary representation is greater than a given value

I need to shift the bits of an integer to the right, but only if the number of bits is greater than a certain number. For the example, let's take 10.
If the integer is 818, its binary representation is 1100110010, so in that case I do nothing.
If the integer is 1842, its binary representation is 11100110010, which is longer than 10 bits by one, so I need to shift one bit to the right (or set the bit at index 10 to 0, which gives the same result as far as I know; maybe I'm wrong).
What I've done so far is build an integer array of ones and zeros representing the int, but I'm sure there is a more elegant way of doing this:
int y = 818;
string s = Convert.ToString(y, 2);
int[] bits = s.PadLeft(8, '0')
.Select(c => int.Parse(c.ToString()))
.ToArray();
if (bits.Length > 10)
{
for (int i = 10; i < bits.Length; i++)
{
bits[i] = 0;
}
}
I also tried to do this:
if (bits.Length > 10) { y = y >> (bits.Length - 10); }
but for some reason I got 945 (1110110001) when the input was 1891 (11101100011).
There's no need to do this with strings. 2 to the power of 10 has 11 binary digits, so
if (y >= Math.Pow(2, 10))
{
y = y >> 1;
}
seems to do what you want.
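If the width can exceed 10 bits by more than one (the question also mentions shifting by bits.Length - 10), a small loop along these lines might be what's wanted. This is only a sketch, and KeepLowBits is an illustrative name, not an existing API:
static int KeepLowBits(int y, int maxBits = 10)
{
    while (y >= (1 << maxBits))   // the number has more than maxBits binary digits
        y >>= 1;                  // drop the lowest bit until it fits
    return y;
}
// KeepLowBits(818)  -> 818 (already fits in 10 bits)
// KeepLowBits(1842) -> 921 (one bit dropped)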

Is this a C# SqlDecimal math bug?

I'm trying to implement SQL Server vardecimal decompression. Values are stored as 3 decimal digits per 10 bits. But during implementation I found some strange math behavior. Here is a simple test I made:
private SqlDecimal Test() {
SqlDecimal mantissa = 0;
SqlDecimal sign = -1;
byte exponent = 0x20;
int numDigits = 0;
// -999999999999999999999999999999999.99999
for (int i = 0; i < 13; i++) {
int temp = 999;
//equal to mantissa = mantissa * 1000 + temp;
numDigits += 3;
int pwr = exponent - (numDigits - 1);
mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
}
return sign * mantissa;
}
The first 2 passes are fine; I get
999000000000000000000000000000000
999999000000000000000000000000000
but the third gives
999999998999999999999980020000000
Is this a bug in C# SqlDecimal math, or am I doing something wrong?
This is an issue with how you're constructing the value to add here:
mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
The problem starts when pwr is 24. You can see this very clearly here:
Console.WriteLine((SqlDecimal) Math.Pow(10, 24));
The output on my box is:
999999999999999980000000
Now I don't know exactly where that's coming from (a double only carries about 15-17 significant decimal digits, so 10^24 cannot be represented exactly) - but it's simplest to remove the floating point arithmetic entirely. While it may not be efficient, this is a simple way of avoiding the problem:
static SqlDecimal PowerOfTen(int power)
{
// Note: only works for non-negative power values at the moment!
// (To handle negative input, divide by 10 on each iteration instead.)
SqlDecimal result = 1;
for (int i = 0; i < power; i++)
{
result = result * 10;
}
return result;
}
If you then change the line to:
mantissa += temp * PowerOfTen(pwr);
then you'll get the results you expect - at least while pwr is greater than zero. It should be easy to fix PowerOfTen to handle negative values as well though.
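One possible way to extend it (my sketch, not part of the answer) is to divide instead of multiply when the power is negative:
static SqlDecimal PowerOfTen(int power)
{
    SqlDecimal result = 1;
    for (int i = 0; i < Math.Abs(power); i++)
    {
        // multiply for non-negative powers, divide for negative ones
        result = power >= 0 ? result * 10 : result / 10;
    }
    return result;
}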
Update
Amending the below method to just work with Parse and ToString should improve performance for larger numbers (which would be the general use case for these types):
public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
{
return SqlDecimal.Parse(bigint.ToString());
}
This trick also works for the double returned by the original Math.Pow call; so you could just do:
SqlDecimal.Parse(string.Format("{0:0}",Math.Pow(10,24)))
Original Answer
Obviously @JonSkeet's answer is best, as it only involves 24 iterations, vs potentially thousands in my attempt. However, here's an alternative solution, which may help in other scenarios where you need to convert large integers (i.e. System.Numerics.BigInteger) to SqlDecimal, or where performance is less of a concern.
Fiddle Example
//using System.Data.SqlTypes;
//using System.Numerics; //also needs an assembly reference to System.Numerics.dll
public static class BigIntegerExtensions
{
public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
{
SqlDecimal result = 0;
var longMax = (SqlDecimal)long.MaxValue; //cache the converted value to minimise conversions
var longMin = (SqlDecimal)long.MinValue;
while (bigint > long.MaxValue)
{
result += longMax;
bigint -= long.MaxValue;
}
while (bigint < long.MinValue)
{
result += longMin;
bigint -= long.MinValue;
}
return result + (SqlDecimal)(long)bigint;
}
}
For your above use case, you could use this like so (uses the BigInteger.Pow method):
mantissa += temp * BigInteger.Pow(10, pwr).ToSqlDecimal();

Convert 24-bit two's complement to int?

I have a C#/Mono app that reads data from an ADC. My read function returns the value as a ulong. The data is in two's complement and I need to translate it into an int. E.g.:
0x7FFFFF = + 8,388,607
0x7FFFFE = + 8,388,606
0x000000 = 0
0xFFFFFF = -1
0x800001 = - 8,388,607
0x800000 = - 8,388,608
How can I do this?
const int MODULO = 1 << 24;
const int MAX_VALUE = (1 << 23) - 1;
int transform(int value) {
if (value > MAX_VALUE) {
value -= MODULO;
}
return value;
}
Explanation: MAX_VALUE is the maximum positive value that can be stored in a 24-bit signed int, so if the value is less than or equal to that, it should be returned as is. Otherwise the value is the unsigned representation of a two's complement negative number. The way two's complement works, a negative number is stored as its value modulo MODULO, which effectively means that MODULO is added. So to convert it back to signed, we subtract MODULO.
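A quick usage sketch (the reading value and variable names are just examples, not from the question): mask the ulong from the ADC down to its low 24 bits and then apply transform:
ulong reading = 0xFFFFFF;                          // example raw ADC value
int value = transform((int)(reading & 0xFFFFFF));  // keep the low 24 bits, then sign-convert
Console.WriteLine(value);                          // prints -1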

Create random ints with minimum and maximum from Random.NextBytes()

Title pretty much says it all. I know I could use Random.Next(), of course, but I want to know if there's a way to turn unbounded random data into bounded data without statistical bias. (This means no (Random.Next() % (maximum - minimum)) + minimum.) Surely there is a method that doesn't introduce bias into the data it outputs?
If you assume that the bits are randomly distributed, I would suggest:
Generate enough bytes to get a number within the range (e.g. 1 byte to get a number in the range 0-100, 2 bytes to get a number in the range 0-30000 etc).
Use only enough bits from those bytes to cover the range you need. So for example, if you're generating numbers in the range 0-100, take the bottom 7 bits of the byte you've generated
Interpret the bits you've got as a number in the range [0, 2^n), where n is the number of bits.
Check whether the number is in your desired range. It should be, at least half the time (on average).
If so, use it. If not, repeat the above steps until a number is in the right range.
The use of just the required number of bits is key to making this efficient - you'll throw away up to half the number of bytes you generate, but no more than that, assuming a good distribution. (And if you are generating numbers in a nicely binary range, you won't need to throw anything away.)
Implementation left as an exercise to the reader :)
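For reference, here is a minimal sketch of those steps (the structure and the name NextIntUnbiased are mine, not from the answer; it assumes maxValue > minValue and returns a value in [minValue, maxValue)):
static int NextIntUnbiased(Random rnd, int minValue, int maxValue)
{
    long range = (long)maxValue - minValue;     // number of possible results
    int bits = 0;
    while ((1L << bits) < range)                // smallest n with 2^n >= range
        bits++;
    var buffer = new byte[8];
    while (true)
    {
        rnd.NextBytes(buffer);
        // keep only the bottom 'bits' bits, interpreted as a number in [0, 2^n)
        long candidate = BitConverter.ToInt64(buffer, 0) & ((1L << bits) - 1);
        if (candidate < range)                  // accept; otherwise throw it away and retry
            return (int)(candidate + minValue);
    }
}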
You could try something like:
public static int MyNextInt(Random rnd, int minValue, int maxValue)
{
var buffer = new byte[4];
rnd.NextBytes(buffer);
uint num = BitConverter.ToUInt32(buffer, 0);
// The +1 is to exclude the maxValue in the case that
// minValue == int.MinValue, maxValue == int.MaxValue
double dbl = num * 1.0 / ((long)uint.MaxValue + 1);
long range = (long)maxValue - minValue;
int result = (int)(dbl * range) + minValue;
return result;
}
Totally untested... I can't guarantee that the results are truly pseudo-random... but the idea of deriving a double (dbl) is the same one used by the Random class. The only difference is that I use uint.MaxValue as the base instead of int.MaxValue; this way I don't have to check for negative values from the buffer.
I propose a generator of random integers, based on NextBytes.
This method discards only 9.62% of bits on average over the word-size range for positive Int32s, due to the use of Int64 as the representation for bit manipulation.
Maximum bit loss occurs at a word size of 22 bits, where 20 of the 64 bits used in the byte-range conversion are lost. In that case the bit efficiency is 68.75%.
Also, about 25% of candidate values are lost because the unbounded range is clipped to the maximum value.
Be careful to use Take(N) on the returned IEnumerable, because it is otherwise an infinite generator.
I'm using a buffer of 512 long values, so it generates 4096 random bytes at once. If you just need a sequence of a few integers, change the buffer size from 512 to a more suitable value, down to 1.
public static class RandomExtensions
{
public static IEnumerable<int> GetRandomIntegers(this Random r, int max)
{
if (max < 1)
throw new ArgumentOutOfRangeException("max", max, "Must be a positive value.");
const int longWordsTotal = 512;
const int bufferSize = longWordsTotal * 8;
var buffer = new byte[bufferSize];
var wordSize = (int)Math.Log(max, 2) + 1;
while(true)
{
r.NextBytes(buffer);
for (var longWordIndex = 0; longWordIndex < longWordsTotal; longWordIndex++)
{
ulong longWord = BitConverter.ToUInt64(buffer, longWordIndex * 8); // byte offset: 8 bytes per 64-bit word
var lastStartBit = 64 - wordSize;
for (var startBit = 0; startBit <= lastStartBit; startBit += wordSize)
{
var mask = ((1UL << wordSize) - 1) << startBit;
var unboundValue = (int)((mask & longWord) >> startBit);
if (unboundValue <= max)
yield return unboundValue;
}
}
}
}
}
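A quick usage sketch (needs using System.Linq for Take; Take is essential because the sequence is infinite):
var rnd = new Random();
foreach (var value in rnd.GetRandomIntegers(100).Take(10))
{
    Console.WriteLine(value);   // ten values in the range [0, 100]
}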

Strange behavior of reverse loop in C# and C++

I just programmed a simple reverse loop like this:
for (unsigned int i = 50; i >= 0; i--)
printf("i = %d\n", i);
but it doesn't stop at 0 as expected; instead it goes far into negative values. Why?
See this ideone sample: http://ideone.com/kkixx8
(I tested it in C# and C++.)
You declared the int as unsigned. It will always be >= 0. The only reason you see negative values is that your printf call interprets the value as signed (%d) instead of unsigned (%u).
Although you did not ask for a solution, here are two common ways of fixing the problem:
// 1. The goes-to operator
for (unsigned int i = 51; i --> 0; )
printf("i = %d\n", i);
// 2. Waiting for overflow
for (unsigned int i = 50; i <= 50; i--)
printf("i = %d\n", i);
An unsigned int can never become negative.
In C# this code
for (uint i = 50; i >= 0; i--)
Console.WriteLine(i);
produces the following output:
50
...
7
6
5
4
3
2
1
0
4294967295
4294967294
4294967293
...
You are using an unsigned int. It can never be < 0. It just wraps around. You are seeing negative values because of the way you are formatting your output (interpreting it as a signed int).
The loop breaks when i would become less than zero. But i is unsigned, and it can never be less than zero.
In your for loop
for (unsigned int i = 50; i >= 0; i--)
printf("i = %d\n", i);
the value of i is decremented by 1 each iteration, and when i == 0 the decrement tries to assign
i--, i.e. i = -1
The -1 on the right-hand side is a signed integer (probably 32 bits in size) with the hexadecimal value 0xFFFFFFFF. The compiler generates code to move this signed integer into your unsigned integer i, which is also a 32-bit entity; it simply moves all 32 bits into i. i now has the value 0xFFFFFFFF, which is 4294967295 if interpreted as a positive number. But the printf format %d says the 32 bits are to be interpreted as a signed integer, so you get -1. If you had used %u it would have printed 4294967295.
