Put a number in a range in C#

This has probably been discussed widely, but I can't find a proper answer. Here is my problem: I want to map a number into a given range, but the number is arbitrary. I don't use
Random rand = new Random();
rand.Next(0,100);
The number comes from GetHashCode(), and I have to put it in the range [0, someArray.Length).
I tried:
int a = 12345;
int currentIndex = a.GetHashCode();
currentIndex % someArray.Length + someArray.Length
but it doesn't work. I would appreciate any help.

I'd go for (hash & 0x7FFFFFFF) % modulus. The masking clears the sign bit so the input is non-negative, and then the remainder operator % maps it into the target range.
Alternatives include:
result = hash % modulus;
if(result < 0)
result += modulus;
and
result = ((hash % modulus) + modulus) % modulus
What unfortunately doesn't work is
result = Math.Abs(hash) % modulus
because Math.Abs(int.MinValue) is int.MinValue and thus negative. To fix this approach one could cast to long:
result = (int)(Math.Abs((long)hash) % modulus);
All of these methods introduce a minor bias for some input ranges and modulus values, since unless the number of input values is an integral multiple of the modulus they can't be mapped to each output value with the same probability. In some contexts this can be a problem, but it's fine for hashtables.
If you mainly care about performance then the masking solution is preferable since & is cheap compared to % or branching.
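The variants above can be put side by side in a small sketch. The class name HashIndex and the method names are made up for illustration, and a positive modulus is assumed throughout.

```csharp
using System;

public static class HashIndex
{
    // Clear the sign bit, then take the remainder.
    public static int Mask(int hash, int modulus) => (hash & 0x7FFFFFFF) % modulus;

    // Take the remainder, then shift a negative result up by one modulus.
    public static int Adjust(int hash, int modulus)
    {
        int result = hash % modulus;
        return result < 0 ? result + modulus : result;
    }

    // Widen to long first so that Math.Abs never sees int.MinValue as an int.
    public static int AbsLong(int hash, int modulus) => (int)(Math.Abs((long)hash) % modulus);
}
```

For negative hashes the three methods generally return different indices (for example, -1 with modulus 10 gives 7, 9 and 1 respectively), so a hashtable should pick one and use it consistently.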

The proper way to handle negative values is to use double-modulus.
int currentIndex = ((a.GetHashCode() % someArray.Length) + someArray.Length) % someArray.Length;
Introduce some variables into the mix:
int len = someArray.Length;
int currentIndex = ((a.GetHashCode() % len) + len) % len;
The first modulus puts the value in the range -(len-1) to len-1. Adding len shifts that to 1 up to len*2-1, and the second modulus then puts the value in the range 0 to len-1, which is what you want.
This method will handle all valid values of a.GetHashCode(), no need to special-handle int.MinValue or int.MaxValue.
Note that this method ensures that if you add one to the input (which is a.GetHashCode() in this case, so it might not matter), you end up adding one to the output (which wraps around to 0 when it reaches the end). Methods that use Math.Abs or bitwise manipulation to ensure a positive value might not behave like that for negative numbers. It depends on what you want.

You should be able to use:
int currentIndex = (a.GetHashCode() & 0x7FFFFFFF) % someArray.Length;
Note that, depending on the array length and implementation of GetHashCode, this may not have a random distribution. This is especially true if you use an Int32 as in your sample code, as Int32.GetHashCode just returns the integer itself, so there's no need to call GetHashCode.

Related

Create random ints with minimum and maximum from Random.NextBytes()

The title pretty much says it all. I know I could use Random.Next(), of course, but I want to know if there's a way to turn unbounded random data into bounded data without statistical bias. (This means no RandomInt() % (maximum - minimum) + minimum.) Surely there is a method that doesn't introduce bias into the data it outputs?
If you assume that the bits are randomly distributed, I would suggest:
Generate enough bytes to get a number within the range (e.g. 1 byte to get a number in the range 0-100, 2 bytes to get a number in the range 0-30000 etc).
Use only enough bits from those bytes to cover the range you need. So for example, if you're generating numbers in the range 0-100, take the bottom 7 bits of the byte you've generated
Interpret the bits you've got as a number in the range [0, 2^n), where n is the number of bits.
Check whether the number is in your desired range. It should be at least half the time (on average)
If so, use it. If not, repeat the above steps until a number is in the right range.
The use of just the required number of bits is key to making this efficient - you'll throw away up to half the number of bytes you generate, but no more than that, assuming a good distribution. (And if you are generating numbers in a nicely binary range, you won't need to throw anything away.)
Implementation left as an exercise to the reader :)
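One possible reading of the steps above, as a minimal sketch rather than a reference implementation. The names UnbiasedRandom and Next are hypothetical, the range is taken as [min, max), and for simplicity the sketch always draws 4 bytes and masks off the unneeded bits instead of drawing only as many bytes as the range requires.

```csharp
using System;

public static class UnbiasedRandom
{
    // Returns a value in [min, max) by rejection sampling over raw random bytes.
    public static int Next(Random rnd, int min, int max)
    {
        if (min >= max) throw new ArgumentOutOfRangeException(nameof(max));
        uint range = (uint)((long)max - min);      // number of distinct outputs, at least 1
        // Smallest bit count such that 2^bits >= range.
        int bits = 0;
        while (bits < 32 && (1UL << bits) < range) bits++;
        uint mask = bits == 32 ? uint.MaxValue : (1u << bits) - 1;
        var buffer = new byte[4];
        while (true)
        {
            rnd.NextBytes(buffer);
            uint candidate = BitConverter.ToUInt32(buffer, 0) & mask;
            // Accept only candidates inside the range; otherwise draw again.
            if (candidate < range) return (int)(min + (long)candidate);
        }
    }
}
```

Because the mask keeps fewer than twice as many values as the range holds, each draw is accepted with probability above one half, assuming the underlying bits are uniformly distributed.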
You could try with something like:
public static int MyNextInt(Random rnd, int minValue, int maxValue)
{
    var buffer = new byte[4];
    rnd.NextBytes(buffer);
    uint num = BitConverter.ToUInt32(buffer, 0);
    // The +1 is to exclude the maxValue in the case that
    // minValue == int.MinValue, maxValue == int.MaxValue
    double dbl = num * 1.0 / ((long)uint.MaxValue + 1);
    long range = (long)maxValue - minValue;
    // Add in long arithmetic: the offset can exceed int.MaxValue for wide ranges
    int result = (int)((long)(dbl * range) + minValue);
    return result;
}
Totally untested... I can't guarantee that the results are truly pseudo-random... But the idea of creating a double (dbl) number is the same used by the Random class. Only I use the uint.MaxValue as the base instead of int.MaxValue. In this way I don't have to check for negative values of the buffer.
I propose a generator of random integers, based on NextBytes.
This method discards only 9.62% of bits on average over the word size range for positive Int32 values, thanks to the use of Int64 as the representation for bit manipulation.
Maximum bit loss occurs at word size of 22 bits, and it's 20 lost bits of 64 used in byte range conversion. In this case bit efficiency is 68.75%
Also, 25% of values are lost because of clipping the unbound range to maximum value.
Be careful to use Take(N) on the IEnumerable returned, because it's an infinite generator otherwise.
I'm using a buffer of 512 long values, so it generates 4096 random bytes at once. If you just need a sequence of few integers, change the buffer size from 512 to a more optimal value, down to 1.
public static class RandomExtensions
{
    public static IEnumerable<int> GetRandomIntegers(this Random r, int max)
    {
        if (max < 1)
            throw new ArgumentOutOfRangeException("max", max, "Must be a positive value.");
        const int longWordsTotal = 512;
        const int bufferSize = longWordsTotal * 8;
        var buffer = new byte[bufferSize];
        var wordSize = (int)Math.Log(max, 2) + 1;
        while (true)
        {
            r.NextBytes(buffer);
            for (var longWordIndex = 0; longWordIndex < longWordsTotal; longWordIndex++)
            {
                // ToUInt64 takes a byte offset, so step through the buffer 8 bytes at a time
                ulong longWord = BitConverter.ToUInt64(buffer, longWordIndex * 8);
                var lastStartBit = 64 - wordSize;
                for (var startBit = 0; startBit <= lastStartBit; startBit += wordSize)
                {
                    var mask = ((1UL << wordSize) - 1) << startBit;
                    var unboundValue = (int)((mask & longWord) >> startBit);
                    if (unboundValue <= max)
                        yield return unboundValue;
                }
            }
        }
    }
}

"Substring" a Numeric Value

In C#, what is the best way to "substring" (for lack of a better word) a long value?
I need to calculate a sum of account numbers for a trailer record but only need the 16 least significant digits.
I am able to do this by converting the value to a string but wondered if there is a better way.
long number = 1234567890123456789L;
const long _MAX_LENGTH = 9999999999999999L;
if (number > _MAX_LENGTH)
{
string strNumber = number.ToString();
number = Convert.ToInt64(strNumber.Substring(strNumber.Length - 16));
}
This will return the value 4567890123456789.
You could do:
long number = 1234567890123456789L;
long countSignificant = 16;
long leastSignificant = number % (long) Math.Pow(10, countSignificant);
How does this work? Well, if you divide by 10, you drop off the last digit, right? And the remainder will be that last digit? The same goes for 100, 1000 and Math.Pow(10, n).
Let's just look at the least significant digit, because we can do this in our head:
1234 divided by 10 is 123 remainder 4
In c#, that would be:
1234 / 10 == 123;
1234 % 10 == 4;
So, the next step is to figure out how to get the last n digits. It turns out that this is the same as taking the remainder after dividing by 10 to the power of n. Since C# doesn't have an exponent operator (like ** in Python), we use a library function:
Math.Pow(10, 4) == 10000.0; // oops: a double!
We need to cast that back to a long:
(long) Math.Pow(10, 4) == 10000;
I think now you have all the pieces to create a nice function of your own ;)
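For reference, one way those pieces might fit together. This is a sketch under assumptions: the names Digits and Last are invented here, the input is assumed to be non-negative, and the power of ten is built with an integer loop instead of Math.Pow so that no double is involved.

```csharp
using System;

public static class Digits
{
    // Returns the 'count' least significant decimal digits of a non-negative number.
    public static long Last(long number, int count)
    {
        if (number < 0) throw new ArgumentOutOfRangeException(nameof(number));
        // 10^18 is the largest power of ten that fits in a long.
        if (count < 1 || count > 18) throw new ArgumentOutOfRangeException(nameof(count));
        long divisor = 1;
        for (int i = 0; i < count; i++) divisor *= 10;   // 10^count in integer arithmetic
        return number % divisor;
    }
}
```

Digits.Last(1234567890123456789L, 16) gives 4567890123456789, the same value as the string-based version in the question.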
You could use modulo (the % operator in C#). For example:
123456 % 100000 = 23456
or
123456 % 1000 = 456
As a quick guide I keep remembering that you get as many digits as there are zeros in the divisor. Or, vice versa, the divisor needs as many zeros as you want to keep digits.
So in your case you'd need:
long number = 1234567890123456789L;
long divisor = 10000000000000000L;
long result = number % divisor;
Complete Code, use modulo operator:
long number = 1234567890123456789L;
const long _MAX_LENGTH = 9999999999999999L;
number = number % (_MAX_LENGTH + 1);
Console.WriteLine (number);
Live test: http://ideone.com/pKB6w
If the modulo approach does not feel natural yet, this subtraction-based version gives the same result:
long number = 1234567890123456789L;
const long _MAX_LENGTH = 9999999999999999L;
if (number > _MAX_LENGTH) {
long minus = number / (_MAX_LENGTH + 1) * (_MAX_LENGTH + 1);
number = number - minus;
}
Console.WriteLine(number);
Live test: http://ideone.com/oAkcy
Note:
The modulo approach is strongly recommended over subtraction: it has no corner case, so no if statement is needed.

Fastest way to sum digits in a number

Given a large number, e.g. 9223372036854775807 (Int64.MaxValue), what is the quickest way to sum the digits?
Currently I am ToStringing and reparsing each char into an int:
num.ToString().Sum(c => int.Parse(new String(new char[] { c })));
Which is surely horrifically inefficient. Any suggestions?
And finally, how would you make this work with BigInteger?
Thanks
Well, another option is:
int sum = 0;
while (value != 0)
{
int remainder;
value = Math.DivRem(value, 10, out remainder);
sum += remainder;
}
BigInteger has a DivRem method as well, so you could use the same approach.
Note that I've seen DivRem not be as fast as doing the same arithmetic "manually", so if you're really interested in speed, you might want to consider that.
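The "manual" arithmetic mentioned above might look like the following sketch for a long input. The names DigitSum and OfLong are hypothetical, and whether this actually beats DivRem would need measuring on the target runtime.

```csharp
using System;

public static class DigitSum
{
    // Sums the decimal digits of value using % and / directly.
    public static int OfLong(long value)
    {
        int sum = 0;
        while (value != 0)
        {
            // For negative inputs the remainder is negative, so take its absolute value.
            sum += (int)Math.Abs(value % 10);
            value /= 10;
        }
        return sum;
    }
}
```

The value itself is never negated, so long.MinValue is handled without overflow.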
Also consider a lookup table with (say) 1000 elements precomputed with the sums:
int sum = 0;
while (value != 0)
{
int remainder;
value = Math.DivRem(value, 1000, out remainder);
sum += lookupTable[remainder];
}
That would mean fewer iterations, but each iteration has an added array access...
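A sketch of how the lookup table itself could be precomputed. The names DigitSumTable, BuildTable and Sum are assumptions for illustration, and non-negative input is assumed.

```csharp
using System;

public static class DigitSumTable
{
    // Table[i] holds the digit sum of i for 0 <= i < 1000.
    private static readonly int[] Table = BuildTable();

    private static int[] BuildTable()
    {
        var table = new int[1000];
        for (int i = 1; i < 1000; i++)
            table[i] = table[i / 10] + i % 10;   // reuse the sum of the leading digits
        return table;
    }

    // Sums the digits three at a time via the table.
    public static int Sum(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        int sum = 0;
        while (value != 0)
        {
            sum += Table[(int)(value % 1000)];
            value /= 1000;
        }
        return sum;
    }
}
```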
Nobody has discussed the BigInteger version. For that I'd look at 10^1, 10^2, 10^4, 10^8 and so on until you find the last 10^(2^n) that is less than your value. Take your number div and mod 10^(2^n) to come up with 2 smaller values. Wash, rinse, and repeat recursively. (You should keep your iterated squares of 10 in an array, and in the recursive part pass along the information about the next power to use.)
With a BigInteger with k digits, dividing by 10 is O(k). Therefore finding the sum of the digits with the naive algorithm is O(k^2).
I don't know what C# uses internally, but the non-naive algorithms out there for multiplying or dividing a k-bit by a k-bit integer all work in time O(k^1.6) or better (most are much, much better, but have an overhead that makes them worse for "small big integers"). In that case preparing your initial list of powers and splitting once takes time O(k^1.6). This gives you 2 problems of size O((k/2)^1.6) = 2^(-0.6) O(k^1.6). At the next level you have 4 problems of size O((k/4)^1.6) for another 2^(-1.2) O(k^1.6) work. Add up all of the terms and the powers of 2 turn into a geometric series converging to a constant, so the total work is O(k^1.6).
This is a definite win, and the win will be very, very evident if you're working with numbers in the many thousands of digits.
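The recursive splitting described above could be sketched as follows. The class and method names are hypothetical, and the sketch makes no claim about how System.Numerics implements division internally, so the actual speedup is not guaranteed.

```csharp
using System.Collections.Generic;
using System.Numerics;

public static class BigDigitSum
{
    public static long Sum(BigInteger value)
    {
        value = BigInteger.Abs(value);
        // powers[i] = 10^(2^i); keep squaring while the next power is still <= value.
        var powers = new List<BigInteger> { 10 };
        while (powers[powers.Count - 1] * powers[powers.Count - 1] <= value)
            powers.Add(powers[powers.Count - 1] * powers[powers.Count - 1]);
        return Sum(value, powers, powers.Count - 1);
    }

    // Precondition: value < 10^(2^(level + 1)).
    private static long Sum(BigInteger value, List<BigInteger> powers, int level)
    {
        if (value.IsZero) return 0;
        if (level < 0) return (long)value;   // a single digit remains
        // Split into two halves that each fit the next smaller power.
        BigInteger high = BigInteger.DivRem(value, powers[level], out BigInteger low);
        return Sum(high, powers, level - 1) + Sum(low, powers, level - 1);
    }
}
```

The list of squared powers is built once and shared by every recursive call, as the answer suggests.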
Yes, it's probably somewhat inefficient. I'd probably just repeatedly divide by 10, adding together the remainders each time.
The first rule of performance optimization: don't divide when you can multiply instead. The following function takes four-digit numbers 0-9999 and does what you ask. The intermediate calculations need more than 16 bits. We multiply the number by 10/10000 and take the result as a Q16 fixed-point number. Digits are then extracted by multiplying by 10 and taking the integer part.
#define TEN_OVER_10000 ((1<<25)/1000 +1) // .001 Q25
int sum_digits(unsigned int n)
{
int c;
int sum = 0;
n = (n * TEN_OVER_10000)>>9; // n*10/10000 Q16
for (c=0;c<4;c++)
{
printf("Digit: %d\n", n>>16);
sum += n>>16;
n = (n & 0xffff) * 10; // next digit
}
return sum;
}
This can be extended to larger sizes but it's tricky. You need to ensure that the rounding in the fixed-point calculation always works correctly. I also did 4-digit numbers so the intermediate result of the fixed-point multiply would not overflow.
Int64 BigNumber = 9223372036854775807;
String BigNumberStr = BigNumber.ToString();
int Sum = 0;
foreach (Char c in BigNumberStr)
Sum += (byte)c;
// 48 is ascii value of zero
// remove in one step rather than in the loop
Sum -= 48 * BigNumberStr.Length;
Instead of int.parse, why not subtract '0' from each digit to get the actual value.
Remember, '9' - '0' = 9, so you should be able to do this in order k (length of the number). The subtraction is just one operation, so that should not slow things down.
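A compact version of the subtraction idea, as a sketch. The names CharDigitSum and Sum are made up for illustration; the filter skips the minus sign so that negative inputs also work.

```csharp
using System.Linq;

public static class CharDigitSum
{
    // Converts each digit character to its value by subtracting '0'.
    public static int Sum(long num) =>
        num.ToString().Where(c => c >= '0' && c <= '9').Sum(c => c - '0');
}
```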

how to loop through the digits of a binary number?

I have a binary number 1011011, how can I loop through all these binary digits one after the other ?
I know how to do this for decimal integers by using modulo and division.
int n = 0x5b; // 1011011
Really you should just do this; hexadecimal is in general a much better representation:
printf("%x", n); // this prints "5b"
To get it in binary, (with emphasis on easy understanding) try something like this:
printf("%s", "0b"); // common prefix to denote that binary follows
bool leading = true; // we're processing leading zeroes
// starting with the most significant bit to the least
for (int i = sizeof(n) * CHAR_BIT - 1; i >= 0; --i) {
    int bit = (n >> i) & 1;
    if (bit)
        leading = false; // once a 1 is seen, we are no longer reading leading zeroes
    if (!leading)
        printf("%d", bit);
}
if (leading) // all zero, so just print 0
    printf("0");
// at this point, for n = 0x5b, we'll have printed 0b1011011
You can use modulo and division by 2 exactly like you would in base 10. You can also use binary operators, but if you already know how to do that in base 10, it would be easier if you just used division and modulo
Expanding on Frédéric and Gabi's answers, all you need to do is realise that the rules in base 2 are no different to in base 10 - you just need to do your division and modulus with a divisor 2 instead of 10.
The next step is simply to use number >> 1 instead of number / 2 and number & 0x1 instead of number % 2 to improve performance. Mind you, with modern optimising compilers there's probably no difference...
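In C#, the shift-and-mask loop might be sketched like this. The names BitLoop and BitsLsbFirst are hypothetical, and an unsigned input is assumed so that the shift and the division agree.

```csharp
using System.Collections.Generic;

public static class BitLoop
{
    // Yields the bits of 'number' from least to most significant.
    public static IEnumerable<int> BitsLsbFirst(uint number)
    {
        do
        {
            yield return (int)(number & 1u);   // same as number % 2
            number >>= 1;                      // same as number / 2
        } while (number != 0);
    }
}
```

For 0x5b (binary 1011011) this yields 1, 1, 0, 1, 1, 0, 1, that is, the digits in reverse reading order.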
Use an AND with increasing powers of two...
In C, at least, you can do something like:
while (val != 0)
{
printf("%d", val&0x1);
val = val>>1;
}
To expand on @Marco's answer with an example:
uint value = 0x82fa9281;
for (int i = 0; i < 32; i++)
{
bool set = (value & 0x1) != 0;
value >>= 1;
Console.WriteLine("Bit set: {0}", set);
}
What this does is test the last bit, and then shift everything one bit.
If you're already starting with a string, you could just iterate through each of the characters in the string:
var values = "1011011".Reverse().ToArray();
for (var index = 0; index < values.Length; index++) {
    var isSet = values[index] == '1'; // Boolean.Parse only works on "true"/"false", not 0/1
    // do whatever
}
byte input = Convert.ToByte("1011011", 2);
BitArray arr = new BitArray(new[] { input });
foreach (bool value in arr)
{
// ...
}
You can simply loop through every bit. The following C like pseudocode allows you to set the bit number you want to check. (You might also want to google endianness)
for()
{
bitnumber = <your bit>
printf("%d",(val & 1<<bitnumber)?1:0);
}
The code basically writes 1 if the bit is set or 0 if not. We shift the value 1 (which in binary is 1 ;) ) left by the number of positions given in bitnumber and then we AND it with the value in val to see if it matches up. Simple as that!
So if bitnumber is 2 we simply do this
00000100 (the value 1 shifted left by 2)
AND
10110110 (we check it against whatever your value is)
=
00000100 = True! - Both values have bit 2 set!

Simple Pseudo-Random Algorithm

I need a pseudo-random generator which takes a number as input and returns another number which is reproducible and seems to be random.
Each input number should match to exactly one output number and vice versa
same input numbers always result in same output numbers
sequential input numbers that are close together (eg. 1 and 2) should produce completely different output numbers (eg. 1 => 9783526, 2 => 283)
It need not be perfect; it's just to create random but reproducible test data.
I use C#.
I wrote this funny piece of code some time ago which produced something random.
public static long Scramble(long number, long max)
{
// some random values
long[] scramblers = { 3, 5, 7, 31, 343, 2348, 89897 };
number += (max / 7) + 6;
number %= max;
// shuffle according to divisibility
foreach (long scrambler in scramblers)
{
if (scrambler >= max / 3) break;
number = ((number * scrambler) % max)
+ ((number * scrambler) / max);
}
return number % max;
}
I would like to have something better, more reliable, working with any size of number (no max argument).
Could this probably be solved using a CRC algorithm? Or some bit shuffling stuff.
I removed the Microsoft code from this answer. The GNU code file is a lot longer, but basically it contains this, from http://cs.uccs.edu/~cs591/bufferOverflow/glibc-2.2.4/stdlib/random_r.c:
int32_t val = state[0];
val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
state[0] = val;
*result = val;
for your purpose, the seed is state[0] so it would look more like
int getRand(int val)
{
return ((val * 1103515245) + 12345) & 0x7fffffff;
}
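Since the question is about C#, here is a sketch of the same one-step function ported to C#. The class name Scrambler is an assumption; the unchecked block makes the intended wrap-around on overflow explicit.

```csharp
public static class Scrambler
{
    // One step of the linear congruential generator quoted above,
    // using the input as the state.
    public static int GetRand(int val)
    {
        unchecked
        {
            // The multiplication may overflow; only the low 31 bits are kept.
            return ((val * 1103515245) + 12345) & 0x7fffffff;
        }
    }
}
```

Because the multiplier is odd, distinct inputs in the range 0 to 2^31 - 1 map to distinct outputs, which matches the one-to-one requirement for non-negative inputs. Consecutive inputs do differ by a constant stride modulo 2^31, so the result only looks random at a glance.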
You (maybe) can do this easily in C# using the Random class:
public int GetPseudoRandomNumber(int input)
{
Random random = new Random(input);
return random.Next();
}
Since you're explicitly seeding Random with the input, you will get the same output every time given the same input value.
A tausworthe generator is simple to implement and pretty fast. The following pseudocode implementation has full cycle (2**31 - 1, because zero is a fixed point):
def tausworthe(seed)
seed ^= seed >> 13
seed ^= seed << 18
return seed & 0x7fffffff
I don't know C#, but I'm assuming it has XOR (^) and bit shift (<<, >>) operators as in C.
Set an initial seed value, and invoke with seed = tausworthe(seed).
The first two rules suggest a fixed or input-seeded permutation of the input, but the third rule requires a further transform.
Is there any further restriction on what the outputs should be, to guide that transform? - e.g. is there an input set of output values to choose from?
If the only guide is "no max", I'd use the following...
Apply a hash algorithm to the whole input to get the first output item. A CRC might work, but for more "random" results, use a crypto hash algorithm such as MD5.
Use a next permutation algorithm (plenty of links on Google) on the input.
Repeat the hash-then-next-permutation until all required outputs are found.
The next permutation may be overkill though, you could probably just increment the first input (and maybe, on overflow, increment the second and so on) before redoing the hash.
For crypto-style hashing, you'll need a key - just derive something from the input before you start.
