How random is Random.Next()?

How random is Random.Next()? - c#

I have been doing some testing on the Random class and I have used the following code:
while (x++ <= 5000000)
{
y = rnd.Next(1, 5000000);
if (!data.Contains(y))
data.Add(y);
else
{
Console.WriteLine("Cycle {2}: Repetation found for number {0} after {1} iteration", y, x, i);
break;
}
}
I kept changing the rnd max limit (i.e. 5000000) and I changed the number of iterations and I got the following result:
1) if y = rnd.Next(1, 5000) : The average is between 80 to 110 iterations
2) if y = rnd.Next(1, 5000000) : The average is between 2000 to 4000 iterations
3) if y = rnd.Next(1, int.MaxValue) : The average is between 40,000 to 80,000 iterations.
Why am I getting these averages, i.e. out of 10 times I checked for each value, 80% of the time I get within this average range. I dont think we can call it near to being Random.
What can I do to get a fairly random number.

You are not testing for cycles. You are testing for how long it takes to get a random number you've had before. That's completely different. Your figures are spot on for testing how long it takes to get a random number you had before. Look in wikipedia under "the birthday paradox" for a chart of the probability of getting a collision after a certain number of iterations.
Coincidentally, last week I wrote a blog article about this exact subject. It'll go live on March 22nd; see my blog then for details.
If what you want to test for is the cycle length of a pseudo-random number generator then you need to look for not a number you've had before, but rather, a lengthy exact sequence of numbers that you've had before. There are a number of interesting ways to do that, but it's probably easier for me to just tell you: the cycle length of Random is a few billion, so you are unlikely to be able to write a program that discovers that fact. You'd have to store a lot of numbers.
However, cycle length is not the only measure of quality of a pseudo-random number generator. Remember, PRNGs are not random, they are predictable, and therefore you have to think very carefully about what your metric for "randomness" is.
Give us more details: why do you care how "random" Random is? What application are you using it for that you care? What aspects of randomness are important to you?

You are assuming that the randomness is better if numbers are not repeated. That is not true.
Real randomness doesn't have a memory. When you pick the next number, the chance to get the same number again is just as high as any other number in the range.
If you roll a dice and get a six, then roll the dice again, there is no less chance to get a six again. If you happen to get two sixes in a row, that doesn't mean that the dice is broken.
The randomness in the Random class it of course not perfect, but that is not what your test reveals. It simply shows a phenomenon that you get with every random number generator, even if actually creates real random numbers and not just pseudo-random numbers.

You are judging randomness by repeat pairs, which isn't the best test for randomness. The repeats you see are akin to the birthday paradox: http://en.wikipedia.org/wiki/Birthday_problem, where a repeat event can occur with a small sample size if you are not looking for a specific event.

Per the documentation at http://msdn.microsoft.com/en-us/library/system.random.aspx
To generate a cryptographically secure
random number suitable for creating a
random password, for example, use a
class derived from
System.Security.Cryptography..::.RandomNumberGenerator
such as
System.Security.Cryptography..::.RNGCryptoServiceProvider.

A computer can't generate a real random number.
if You need a real random number (David gave you the best option from dot net framework)
you need an external random source.

Related

Randomly generated values are not really random

I am making a simple game in C# and using Random class to generate a random number every 2 seconds, but it gives numbers that I can guess in advance because it increments by a certain number each time.
private int RandomNumber;
//This Timer function runs every 2 seconds
private void TmrGenerateRandomNum(object sender, EventArgs e)
{
Random r = new Random();
RandomNumber = r.Next(50, 200); //Trying to get random values between 50 and 200
Label1.Text = RandomNumber.ToString();
}
This is literally the only code I have, and I am displaying the value each time through Label1. The problem is, for example, if I first get 52, the next "randomly"-generated value is 71 or 72, and the next one is 91 or 92, and the next one is 111 or 112. As you can see, it is incrementing by 20ish. Then it suddenly gets a real random value again like, say, 163. But then the next generated number is 183, again incremented by 20. Since 183+20 exceeds the range, it gets a real random value again, say 83. But the next number generated is again 103, or 104, and then 123 or 124...
This is not "Random" because you can tell around what number the next one is going to be... and this is driving me crazy.
Do you know why it does this, and is there a fix for this?

Random numbers use a seedvalue to start the sequence. If you give nothing, it takes to current datetime.now value. if you supply the same value (like 3842) you will get the same sequence of random values guaranteed. Your series seemed to be very closely related so you see the same values. All this can be read in the documentation on msdn, see Instantiate pseudo random :
Instantiating the random number generator
You instantiate the random number generator by providing a seed value (a starting value for the pseudo-random number generation algorithm) to a Random class constructor. You can supply the seed value either explicitly or implicitly:
The Random(Int32) constructor uses an explicit seed value that you supply.
The Random() constructor uses the system clock to provide a seed value. This is the most common way of instantiating the random number generator.
If the same seed is used for separate Random objects, they will generate the same series of random numbers. This can be useful for creating a test suite that processes random values, or for replaying games that derive their data from random numbers. However, note that Random objects in processes running under different versions of the .NET Framework may return different series of random numbers even if they're instantiated with identical seed values.
Source: https://msdn.microsoft.com/en-us/library/system.random(v=vs.110).aspx#Instantiate
You find more information and even methods to "better" cryptographical random methods linked above the source I cited.

While true random numbers are relatively difficult to generate, you should be able to get much better results than you are. The problem you're likely having right now, as suggested by comments, is that you should not be "starting over" every time the function is called. As a test, you can try passing some value like 5 as a parameter to the Random constructor. Then you will notice your results are really not random. To fix this, as the comments suggest, you should move the construction of the Random object somewhere global so that the randomness can "build" on itself over time instead of always being reset. Once you confirm that you can get different values with repeated calls, then remove the parameter to the Random constructor so that you don't get the same sequence of random numbers every time you restart your program.

More of an in-depth explanation:
Basically, it's pretty much impossible to generate truly random values in a computer, all of the random values you get are pseudo random and are based off of values like:
User input, like how many keys were pressed in the last 2 minuts
Memory usage, how many KB of memory is being used at this very moment
The most common one: Using the current time in mili seconds to generate a "seed" that's used to generate a series of numbers that are mathematical as randomized as possible
This is where the problem comes from. For example, if you were to have the following code
Random r1 = new Random();
Random r2 = new Random();
Console.WriteLine(r1.next())
Console.WriteLine(r2.next())
Miraculously, you will always get the same number from r1 and r2 despite them being two separate instances. It's because the code block is execute in a time window of <1 ms, which gives these two random number generators the same seed, thus they will generate the same list of numbers.
In your case, the number generator is instantiated every 2 seconds, thus they get an incrementing seed. It's hard to say if that will result in an incrementing starting number, but it might be a start.
So instead, try initializing a single Random() and always call the .next() function on it, and you should be getting a much more random list of numbers

The best you can do is use a CPRNG or Cryptograph Psudo Random Number Generator, it does not need to be seeded by tyhe user and in generally continnualy updates with system events.
To generate a cryptographically secure random number use the RNGCryptoServiceProvider class or derive a class from System.Security.Cryptography.RandomNumberGenerator.
See MSDN Random Class.

Is 161803398 A 'Special' Number? Inside of Math.Random()

I suspect the answer is 'Because of Math', but I was hoping someone could give a little more insight at a basic level...
I was poking around in the BCL source code today, having a look at how some of the classes I've used before were actually implemented. I'd never thought about how to generate (pseudo) random numbers before, so I decided to see how it was done.
Full source here: http://referencesource.microsoft.com/#mscorlib/system/random.cs#29
private const int MSEED = 161803398;
This MSEED value is used every time a Random() class is seeded.
Anyway, I saw this 'magic number' - 161803398 - and I don't have the foggiest idea of why that number was selected. It's not a prime number or a power of 2. It's not 'half way' to a number that seemed more significant. I looked at it in binary and hex and well, it just looked like a number to me.
I tried searching for the number in Google, but I found nothing.

No, but it's based on Phi (the "golden ratio").
161803398 = 1.61803398 * 10^8 ≈ φ * 10^8
More about the golden ratio here.
And a really good read for the casual mathematician here.
And I found a research paper on random number generators that agrees with this assertion. (See page 53.)

This number is taken from golden ratio 1.61803398 * 10^8. Matt gave a nice answer what is this number, therefore I will just explain a little bit about an algorithm.
This is not a special number for this algorithm. The algorithm is Knuth's subtractive random number generator algorithm and the main points of it are:
store a circular list of 56 random numbers
initialization is process of filling the list, then randomize those values with a specific deterministic algorithm
two indices are kept which are 31 apart
new random number is the difference of the two values at the two indices
store new random number in the list
The generator is based on the following recursion: Xn = (Xn-55 - Xn-24) mod m, where n &geq; 0. This is a partial case of lagged Fibonacci generator: Xn = (Xn-j # Xn-k) mod m, where 0 < k < j and # is any binary operation (subtraction, addition, xor).
There are several implementations of this generator. Knuth offers an implementation in
FORTRAN in his book. I found the following code, with the following comment:
PARAMETER (MBIG=1000000000,MSEED=161803398,MZ=0,FAC=1.E-9)
According
to Knuth, any large MBIG, and any smaller (but still large) MSEED can
be substituted for the above values.
A little bit more can be found here Note, that this is not actually a research paper (as stated by Math), this is just a master degree thesis.
People in cryptography like to use irrational number (pi, e, sqrt(5)) because there is a conjecture that digits of such numbers appears with equal frequency and thus have high entropy. You can find this related question on security stackexchange to learn more about such numbers. Here is a quote:
"If the constants are chosen at random, then with high probability, no
attacker will be able to break it." But cryptographers, being a
paranoid lot, are skeptical when someone says, "Let's use this set of
constants. I picked them at random, I swear." So as a compromise,
they'll use constants like, say, the binary expansion of π. While we
no longer have the mathematical benefit of having chosen them at
random from some large pool of numbers, we can at least be more
confident there was no sabotage.

Probability with Random.Next()

I want to write a lottery draw program which needs to randomly choose 20000 numbers from 1-2000000 range. The code is as below:
Random r = New Random(seed); //seed is a 6 digits e.g 123456
int i=0;
while(true){
r.Next(2000000);
i++;
if(i>=20000)
break;
}
My questions are:
Can it make sure the same possibility of all the numbers from 1 to 2000000?
Is the upper bound 2000000 included in the r.Next()?
Any suggestion?

The .NET Random class does a fairly good job of generating random numbers. However be aware that if you seed it with the same number you'll get the same "random" numbers each time. If you don't want this behavior don't provide a seed.
If you're after much more random number generator than the built in .NET one then take a look at random.org. It's one of the best sites out there for getting true random numbers - I believe there's an API. Here's a quote from their site:
RANDOM.ORG offers true random numbers to anyone on the Internet. The
randomness comes from atmospheric noise, which for many purposes is
better than the pseudo-random number algorithms typically used in
computer programs. People use RANDOM.ORG for holding drawings,
lotteries and sweepstakes, to drive games and gambling sites, for
scientific applications and for art and music. The service has existed
since 1998 and was built by Dr Mads Haahr of the School of Computer
Science and Statistics at Trinity College, Dublin in Ireland. Today,
RANDOM.ORG is operated by Randomness and Integrity Services Ltd.
Finally Random.Next() is exlusive so the upper value you supply will never be called. You may need to adjust your code appropriately if you want 2000000 to be in there.

It includes the minValue but does not include the maxValue. Therefore if you want to generate numbers from 1 to 2000000 use:
r.Next(1,2000001)

I believe your question is implementation dependent.
The naïve method of generating a random integer in a range is to generate a random 32-bit word and then normalise it across your range.
The larger the range you're normalising the more the probabilities of each individual value fluctuate.
In your situation, you're normalising 4.3 billion inputs over 2 million outputs. This will mean that the probabilities of each number in your range will differ by up to about 1 in 2000 (or 0.05%). If this slight difference in probabilities is okay for you, then go ahead.

Upperbound included?
No, the upperbound is exclusive so you'll have to use 2000001 to include 2000000.
Any suggestion?
Let me take the liberty of suggesting not to use a while(true) / break. Simply put the condition of the if in your while statement:
Random r = New Random(seed); //seed is a 6 digits e.g 123456
int i=0;
while(i++ < 20000)
{
r.Next(1, 2000001);
}
I know this is nitpicking, but it is a suggestion... :)

How to create a probability by a given percentage?

I'm trying to create a percentage-based probability for a game. E.g. if an item has a 45% chance of a critical hit, that must mean it is 45 of 100 hits would be critical.
First, I tried to use a simple solution:
R = new Random();
int C = R.Next(1, 101);
if (C <= ProbabilityPercent) DoSomething()
But in 100 iterations with a chance of e.g. 48%, it gives 40-52 out of 100.
Same goes for 49, 50, 51.
So, there is no difference between these "percents".
The question is how to set a percentage of e.g. 50, and get strictly 50 of 100 with random?
It is a very important thing for probability of rare item finding with an opportunity to increase a chance to find with an item. So the buff of 1% would be sensinble, because now it is not.
Sorry for my bad English.

You need to think only in terms of uniform distribution over repeated rolls.
You can't look over 100 rolls, because forcing that to yield exactly 45 would not be random. Usually, such rolls should exhibit "lack of memory". For example, if you roll a dice looking for a 6, you have a 1-in-6 chance. If you roll it 5 times, and don't get a six - then: the chance of getting a 6 on the next roll is not 1. It is still 1 in 6. As such, you can only look at how well it met your expectation when amortized over a statistically large number of events... 100,000 say.
Basically: your current code is fine. If the user knows (because they've hit 55 times without a critical) that the next 45 hits must be critical, then it is no longer random and they can game the system.
Also; 45% chance of critical hit seems a bit high ;p

I'm trying to create a percentage-based probability for a game. E.g.
if an item has a 45% chance of a critical hit, that must mean it is 45
of 100 hits would be critical.
No, that's not true. You missunderstud completely the concept of Probability. You dont want a "percentage-based probability", you want a "percentage-based random distribution of 100 samples".
What you need is a "bag of events", 45 of them "crytical" and 55 of them "non crytical". Once you pick an event from the bag, you only have the remaining events to pick the next time.
You can model it this way:
Initialize two integer variables Crytical and NonCrytical so that they sum exactly 100 according to the desired percetnages.
Get a random value from 1 to Crytical+NonCrytical
If the random value is less than the value of Crytical, let you hit be crytical and:
Crytical = Crytical -1
Else, let your hit be non-crytical
NonCrytical = NonCrytical-1
End If
Repeat until Crytical+NonCrytical = 0

Since I am no expert in C# I will use a C++ function for ease but still applicable for any language.
rand() - random number generator.
//33.34%
double probability = 0.3334;
//divide to get a number between 0 and 1
double result = rand() / RAND_MAX;
if(result < probability)
//do something
I have used this method to create very large percolated grids and it works excellent for precision values.

The thing is, with Random you might want to initialize this class only once.
This is because Random uses the system time as a seed for generating random numbers.
If your loop is very fast it could happen that multiple Random-instances are using the same seed and thus generating the same numbers.
Check the generated numbers if you suspect this is happening.
Besides this, inherent to Randomness is that it won't give you exact results.
This means that even with a 50/50 chance it could happen that a sequence of 100 "heads" or "tails" gives "heads" 100 times.
The only thing you can do is create the Random-instance once and live with the results; otherwise you shouldn't use Random.

How is a random number generated at runtime?

Since computers cannot pick random numbers(can they?) how is this random number actually generated. For example in C# we say,
Random.Next()
What happens inside?

You may checkout this article. According to the documentation the specific implementation used in .NET is based on Donald E. Knuth's subtractive random number generator algorithm. For more information, see D. E. Knuth. "The Art of Computer Programming, volume 2: Seminumerical Algorithms". Addison-Wesley, Reading, MA, second edition, 1981.

Since computers cannot pick random numbers (can they?)
As others have noted, "Random" is actually pseudo-random. To answer your parenthetical question: yes, computers can pick truly random numbers. Doing so is much more expensive than the simple integer arithmetic of a pseudo-random number generator, and usually not required. However there are applications where you must have non-predictable true randomness: cryptography and online poker immediately come to mind. If either use a predictable source of pseudo-randomness then attackers can decrypt/forge messages much more easily, and cheaters can figure out who has what in their hands.
The .NET crypto classes have methods that give random numbers suitable for cryptography or games where money is on the line. As for how they work: the literature on crypto-strength randomness is extensive; consult any good university undergrad textbook on cryptography for details.
Specialty hardware also exists to get random bits. If you need random numbers that are drawn from atmospheric noise, see www.random.org.

Knuth covers the topic of randomness very well.
We don't really understand random well. How can something predictable be random? And yet pseudo-random sequences can appear to be perfectly random by statistical tests.
There are three categories of Random generators, amplifying on the comment above.
First, you have pseudo random number generators where if you know the current random number, it's easy to compute the next one. This makes it easy to reverse engineer other numbers if you find out a few.
Then, there are cryptographic algorithms that make this much harder. I believe they still are pseudo random sequences (contrary to what the comment above implies), but with the very important property that knowing a few numbers in the sequence does NOT make it obvious how to compute the rest. The way it works is that crypto routines tend to hash up the number, so that if one bit changes, every bit is equally likely to change as a result.
Consider a simple modulo generator (similar to some implementations in C rand() )
int rand() {
return seed = seed * m + a;
}
if m=0 and a=0, this is a lousy generator with period 1: 0, 0, 0, 0, ....
if m=1 and a=1, it's also not very random looking: 0, 1, 2, 3, 4, 5, 6, ...
But if you pick m and a to be prime numbers around 2^16, this will jump around nicely looking very random if you are casually inspecting. But because both numbers are odd, you would see that the low bit would toggle, ie the number is alternately odd and even. Not a great random number generator. And since there are only 2^32 values in a 32 bit number, by definition after 2^32 iterations at most, you will repeat the sequence again, making it obvious that the generator is NOT random.
If you think of the middle bits as nice and scrambled, while the lower ones aren't as random, then you can construct a better random number generator out of a few of these, with the various bits XORed together so that all the bits are covered well. Something like:
(rand1() >> 8) ^ rand2() ^ (rand3() > 5) ...
Still, every number is flipping in synch, which makes this predictable. And if you get two sequential values they are correlated, so that if you plot them you will get lines on your screen. Now imagine you have rules combining the generators, so that sequential values are not the next ones.
For example
v1 = rand1() >> 8 ^ rand2() ...
v2 = rand2() >> 8 ^ rand5() ..
and imagine that the seeds don't always advance. Now you're starting to make something that's much harder to predict based on reverse engineering, and the sequence is longer.
For example, if you compute rand1() every time, but only advance the seed in rand2() every 3rd time, a generator combining them might not repeat for far longer than the period of either one.
Now imagine that you pump your (fairly predictable) modulo-type random number generator through DES or some other encryption algorithm. That will scramble up the bits.
Obviously, there are better algorithms, but this gives you an idea. Numerical Recipes has a lot of algorithms implemented in code and explained. One very good trick: generate not one but a block of random values in a table. Then use an independent random number generator to pick one of the generated numbers, generate a new one and replace it. This breaks up any correlation between adjacent pairs of numbers.
The third category is actual hardware-based random number generators, for example based on atmospheric noise
http://www.random.org/randomness/
This is, according to current science, truly random. Perhaps someday we will discover that it obeys some underlying rule, but currently, we cannot predict these values, and they are "truly" random as far as we are concerned.
The boost library has excellent C++ implementations of Fibonacci generators, the reigning kings of pseudo-random sequences if you want to see some source code.

I'll just add an answer to the first part of the question (the "can they?" part).h
Computers can generate (well, generate may not be an entirely accurate word) random numbers (as in, not pseudo-random). Specifically, by using environmental randomness which is gotten through specialized hardware devices (that generates randomness based on noise, for e.g.) or by using environmental inputs (e.g. hard disk timings, user input event timings).
However, that has no bearing on the second question (which was how Random.Next() works).

The Random class is a pseudo-random number generator.
It is basically an extremely long but deterministic repeating sequence. The "randomness" comes from starting at different positions. Specifying where to start is done by choosing a seed for the random number generator and can for example be done by using the system time or by getting a random seed from another random source. The default Random constructor uses the system time as a seed.
The actual algorithm used to generate the sequence of numbers is documented in MSDN:
The current implementation of the Random class is based on Donald E. Knuth's subtractive random number generator algorithm. For more information, see D. E. Knuth. "The Art of Computer Programming, volume 2: Seminumerical Algorithms". Addison-Wesley, Reading, MA, second edition, 1981.

Computers use pseudorandom number generators. Essentially, they work by start with a seed number and iterating it through an algorithm each time a new pseudorandom number is required.
The process is of course entirely deterministic, so a given seed will generate exactly the same sequence of numbers every time it is used, but the numbers generated form a statistically uniform distribution (approximately), and this is fine, since in most scenarios all you need is stochastic randomness.
The usual practice is to use the current system time as a seed, though if more security is required, "entropy" may be gathered from a physical source such as disk latency in order to generate a seed that is more difficult to predict. In this case, you'd also want to use a cryptographically strong random number generator such as this.

I don't know much details but what I know is that a seed is used in order to generate the random numbers it is then based on some algorithm that uses that seed that a new number is obtained.
If you get random numbers based on the same seed they will be the same often.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.