I wrote this class to generate a random code, it shouldn't create any two repetitive number.
I want to know, in this code, how much it possible that we have collision?
public string MyRandom()
{
Random r = new Random();
int x = r.Next(1000);//Max range
PersianCalendar pc = new PersianCalendar();
string _myrandom = pc.GetMonth(DateTime.Now).ToString() + pc.GetDayOfMonth(DateTime.Now).ToString() + pc.GetHour(DateTime.Now).ToString() + pc.GetMinute(DateTime.Now).ToString() + pc.GetSecond(DateTime.Now).ToString() + pc.GetMilliseconds(DateTime.Now).ToString() + x.ToString();
return _myrandom;
}
If we do
List<string> s = new List<string>();
for(int i = 0; i < 100; i++)
{
s.Add(MyRandom());
}
Why?
When random class is created in a loop, you can get duplicate random numbers. More Info
Solution:
Create Random instance outside loop.
It is safer to use more stronger methods such as RNGCryptoServiceProvider which you can find several samples of its usage. Generally, pure random generation is not possible is very high concurrency loads, but reduce its probability to a reasonable amount.
Well first, because you are using the date in the way that you are, then there is only one possible millisecond per year where the _myrandom string could possibly be repeated.
Within that millisecond each year, the odds of hitting a random string you have already generated are down to the variable x, and related directly to how many other random strings you have made in that millisecond (and the previous milliseconds of each previous year). If, for example, you were to have made 10 random strings last year in that millisecond and 4 the year before, the chances of duplication this year are 14:999, which is a chance of about 1.4%.
Of course, if you make 1000 strings per millisecond, the chance of duplication is 100%.
In fact, the chances of duplication are slightly higher, due to the fact that the C# Random number generator doesn't produce a true random number. It's a pseudo-random number, so the chances of a collision are higher.
I see that you've written: "this generator is using for naming photos which uploaded in my application" in the comments above, which makes it unlikely that you'll be making 1000+ names per millisecond. Generally, whenever you are using a random number generator in this way, it's a good idea to just put in some error checking in 1 of two ways:
1. offensive or 2. defensive.
Offensive would be waiting for the error (two identical file names) to happen and then dealing with it (maybe running your random generator again until the error isn't thrown). Defensive would be to check all of the filenames in the file, or keep a list of them all, to check that your new filename isn't the same before you even try to save it. This is more watertight (because you aren't relying on an error to be thrown) but far more computationally expensive.
Related
This question already has answers here:
Random number generator only generating one random number
(15 answers)
Closed 3 years ago.
for (int i = 1; i < 10; i++)
{
number = randomNumber();
if (!straightLine.Contains(number))
{
straightLine.Add(number);
}
}
public static int randomNumber()
{
Random random = new Random();
return random.Next(1, 100);
}
when I debug it works fine but when I run the program it gets 1 random number and that's it. so the problem is that the random Number method is only called once (when I don't debug then it calls it every time)
what can I do?
germi's answer is slightly misleading as it implies that any repeated instantiation of a new Random will produce the same sequence of values. This isn't quite correct, because your debug code works as you expect.
The documentation for Random says that the Random number generator is seeded from the system clock if you dont pass a seed value. https://learn.microsoft.com/en-us/dotnet/api/system.random.-ctor?view=netframework-4.8
The reason it works in debug is that debugging code is very slow (you're literally taking hundreds of milliseconds to step through the code a line at a time) and the clock has time to change in between runs of the loop.
When your code is run at full speed it runs so quickly that the system clock simply won't have changed to a new milliseconds value, so your repeated making of a new Random will result in it being seeded with the same value from the clock
If you were to insert some delay in the code such as Thread.Sleep(1000) in the loop then it would work. If you were to run the loop a million times it would take long enough to work through that the clock would eventually change - a million iterations might take long enough for a small handful of values to come out of the Random
The recommendation for a solution is sound though; make one new Random somewhere outside of the loop and then repeatedly call it. You could also seed the Random with something that is unique each time (like the value of i), though you should bear in mind that providing a particular seed will guarantee that the same random number comes out of it when you call Next. In some contexts this is useful, because you might want a situation where you can provide a certain value and then see the same sequence of random numbers emerge. The main thing to be mindful of is that by default the Random starts it's sequence based on the clock; two different computers with different time settings can theoretically produce the same random numbers if their clocks are reading the same at the moment the Random is created.
Repeatedly instantiating a new Random instance will lead to the same number being generated, since the seed will be the same.
The solution is to have one instance of Random in the class that generates the values:
var random = new Random();
for (int i = 1; i < 10; i++)
{
number = random.Next(1,100);
if (!straightLine.Contains(number))
{
straightLine.Add(number);
}
}
Note that this behavior only exists in .NET Framework, .NET Core will produce different values even across multiple Random instances created in quick succession.
I am making a simple game in C# and using Random class to generate a random number every 2 seconds, but it gives numbers that I can guess in advance because it increments by a certain number each time.
private int RandomNumber;
//This Timer function runs every 2 seconds
private void TmrGenerateRandomNum(object sender, EventArgs e)
{
Random r = new Random();
RandomNumber = r.Next(50, 200); //Trying to get random values between 50 and 200
Label1.Text = RandomNumber.ToString();
}
This is literally the only code I have, and I am displaying the value each time through Label1. The problem is, for example, if I first get 52, the next "randomly"-generated value is 71 or 72, and the next one is 91 or 92, and the next one is 111 or 112. As you can see, it is incrementing by 20ish. Then it suddenly gets a real random value again like, say, 163. But then the next generated number is 183, again incremented by 20. Since 183+20 exceeds the range, it gets a real random value again, say 83. But the next number generated is again 103, or 104, and then 123 or 124...
This is not "Random" because you can tell around what number the next one is going to be... and this is driving me crazy.
Do you know why it does this, and is there a fix for this?
Random numbers use a seedvalue to start the sequence. If you give nothing, it takes to current datetime.now value. if you supply the same value (like 3842) you will get the same sequence of random values guaranteed. Your series seemed to be very closely related so you see the same values. All this can be read in the documentation on msdn, see Instantiate pseudo random :
Instantiating the random number generator
You instantiate the random number generator by providing a seed value (a starting value for the pseudo-random number generation algorithm) to a Random class constructor. You can supply the seed value either explicitly or implicitly:
The Random(Int32) constructor uses an explicit seed value that you supply.
The Random() constructor uses the system clock to provide a seed value. This is the most common way of instantiating the random number generator.
If the same seed is used for separate Random objects, they will generate the same series of random numbers. This can be useful for creating a test suite that processes random values, or for replaying games that derive their data from random numbers. However, note that Random objects in processes running under different versions of the .NET Framework may return different series of random numbers even if they're instantiated with identical seed values.
Source: https://msdn.microsoft.com/en-us/library/system.random(v=vs.110).aspx#Instantiate
You find more information and even methods to "better" cryptographical random methods linked above the source I cited.
While true random numbers are relatively difficult to generate, you should be able to get much better results than you are. The problem you're likely having right now, as suggested by comments, is that you should not be "starting over" every time the function is called. As a test, you can try passing some value like 5 as a parameter to the Random constructor. Then you will notice your results are really not random. To fix this, as the comments suggest, you should move the construction of the Random object somewhere global so that the randomness can "build" on itself over time instead of always being reset. Once you confirm that you can get different values with repeated calls, then remove the parameter to the Random constructor so that you don't get the same sequence of random numbers every time you restart your program.
More of an in-depth explanation:
Basically, it's pretty much impossible to generate truly random values in a computer, all of the random values you get are pseudo random and are based off of values like:
User input, like how many keys were pressed in the last 2 minuts
Memory usage, how many KB of memory is being used at this very moment
The most common one: Using the current time in mili seconds to generate a "seed" that's used to generate a series of numbers that are mathematical as randomized as possible
This is where the problem comes from. For example, if you were to have the following code
Random r1 = new Random();
Random r2 = new Random();
Console.WriteLine(r1.next())
Console.WriteLine(r2.next())
Miraculously, you will always get the same number from r1 and r2 despite them being two separate instances. It's because the code block is execute in a time window of <1 ms, which gives these two random number generators the same seed, thus they will generate the same list of numbers.
In your case, the number generator is instantiated every 2 seconds, thus they get an incrementing seed. It's hard to say if that will result in an incrementing starting number, but it might be a start.
So instead, try initializing a single Random() and always call the .next() function on it, and you should be getting a much more random list of numbers
The best you can do is use a CPRNG or Cryptograph Psudo Random Number Generator, it does not need to be seeded by tyhe user and in generally continnualy updates with system events.
To generate a cryptographically secure random number use the RNGCryptoServiceProvider class or derive a class from System.Security.Cryptography.RandomNumberGenerator.
See MSDN Random Class.
I'm writing a program which basically processes data and outputs many files. There is no way it will be producing more than 10-20 files each use. I just wanted to know if using this method to generate unique filenames is a good idea? is it possible that rand will choose, lets say x and then within 10 instances, choose x again. Is using random(); a good idea? Any inputs will be appreciated!
Random rand = new Random ();
int randNo = rand.Next(100000,999999)l
using (var write = new StreamWriter("C:\\test" + randNo + ".txt")
{
// Stuff
}
I just wanted to know if using this method to generate unique filenames is a good idea?
No. Uniqueness isn't a property of randomness. Random means that the resulting value is not in any way dependent upon previous state. Which means repeats are possible. You could get the same number many times in a row (though it's unlikely).
If you want values which are unique, use a GUID:
Guid.NewGuid();
As pointed out in the comments below, this solution isn't perfect. But I contend that it's good enough for the problem at hand. The idea is that Random is designed to be random, and Guid is designed to be unique. Mathematically, "random" and "unique" are non-trivial problems to solve.
Neither of these implementations is 100% perfect at what it does. But the point is to simply use the correct one of the two for the intended functionality.
Or, to use an analogy... If you want to hammer a nail into a piece of wood, is it 100% guaranteed that the hammer will succeed in doing that? No. There exists a non-zero chance that the hammer will shatter upon contacting the nail. But I'd still reach for the hammer rather than jury rigging something with the screwdriver.
No, this is not correct method to create temporary file names in .Net.
The right way is to use either Path.GetTempFileName (creates file immediatedly) or Path.GetRandomFileName (creates high quality random name).
Note that there is not much wrong with Random, Guid.NewGuid(), DateTime.Now to generate small number of file names as covered in other answers, but using functions that are expected to be used for particular purpose leads to code that is easier to read/prove correctness.
If you want to generate a unique value, there's a tool specifically designed for generating unqiue identifying values, a Globally Unique IDentifier (GUID).
var guid = Guid.NewGuid();
Leave the problem of figuring out the best way of creating such a unique value to others.
There is what is called the Birthday Paradox... If you generate some random numbers (any number > 1), the possibility of encountering a "collision" increases... If you generate sqrt(numberofpossiblevalues) values, the possibility of a collision is around 50%... so you have 799998 possible values... sqrt(799998) is 894... It is quite low... With 45-90 calls to your program you have a 50% chance of a collision.
Note that random being random, if you generate two random numbers, there is a non-zero possibility of a collision, and if you generate numberofpossiblevalues + 1 random numbers, the possibility of a collision is 1.
Now... Someone will tell you that Guid.NewGuid will generate always unique values. They are sellers of very good snake oil. As written in the MSDN, in the Guid.NewGuid page...
The chance that the value of the new Guid will be all zeros or equal to any other Guid is very low.
The chance isn't 0, it is very (very very I'll add) low! Here the Birthday Paradox activates... Now... Microsoft Guid have 122 bits of "random" part and 6 bits of "fixed" part, the 50% chance of a collision happens around 2.3x10^18 . It is a big number! The 1% chance of collision is after 3.27x10^17... still a big number!
Note that Microsoft generates these 122 bits with a strong random number generator: https://msdn.microsoft.com/en-us/library/bb417a2c-7a58-404f-84dd-6b494ecf0d13#id9
Windows uses the cryptographic PRNG from the Cryptographic API (CAPI) and the Cryptographic API Next Generation (CNG) for generation of Version 4 GUIDs.
So while the whole Guid generated by Guid.NewGuid isn't totally random (because 6 bits are fixed), it is still quite random.
I would think it would be a good idea to add in the date & time the file was created in the file name in order to make sure it is not duplicated. You could also add random numbers to this if you want to make it even more unique (in the case your 10 files are saved at the exact same time).
So the files name might be file06182015112300.txt (showing the month, day, year, hour, minute & seconds)
If you want to use files of that format, and you know you won't run out of unused numbers, it's safer to check that the random number you generate isn't already used as follows:
Random rand = new Random();
string filename = "";
do
{
int randNo = rand.Next(100000, 999999);
filename = "C:\\test" + randNo + ".txt";
} while (File.Exists(filename));
using (var write = new StreamWriter(filename))
{
//Stuff
}
i want to generate a sequence of unique random numbers in the range of 00000001 to 99999999.
So the first one might be 00001010, the second 40002928 etc.
The easy way is to generate a random number and store it in the database, and every next time do it again and check in the database if the number already exists and if so, generate a new one, check it again, etc.
But that doesn't look right, i could be regenerating a number maybe 100 times if the number of generated items gets large.
Is there a smarter way?
EDIT
as allways i forgot to say WHY i wanted this, and it will probably make things clearer and maybe get an alternative, and it is:
we want to generate an ordernumber for a booking, so we could just use 000001, 000002 etc. But we don't want to give the competitors a clue of how much orders are created (because it's not a high volume market, and we don't want them to know if we are on order 30 after 2 months or at order 100. So we want to have an order number which is random (yet unique)
You can use either an Linear Congruential Generator (LCG) or Linear Feedback Shift Register (LFSR). Google or wikipedia for more info.
Both can, with the right parameters, operate on a 'full-cycle' (or 'full period') basis so that they will generate a 'psuedo-random number' only once in a single period, and generate all numbers within the range. Both are 'weak' generators, so no good for cyptography, but perhaps 'good enough' for apparent randomness. You may have to constrain the period to work within your 'decimal' maximum as having 'binary' periods is necessary.
Update: I should add that it is not necessary to pre-calculate or pre-store previous values in any way, you only need to keep the previous seed-value (single int) and calculate 'on-demand' the next number in the sequence. Of course you can save a chain of pre-calculated numbers to your DB if desired, but it isn't necessary.
How about creating a set all of possible numbers and simply randomising the order? You could then just pick the next number from the tail.
Each number appears only once in the set, and when you want a new one it has already been generated, so the overhead is tiny at the point at which you want one. You could do this in memory or the database of your choice. You'll just need a sensible locking strategy for pulling the next available number.
You could build a table with all the possible numbers in it, give the record a 'used' field.
Select all records that have not been 'used'
Pick a random number (r) between 1 and record count
Take record number r
Get your 'random value' from the record
Set the 'used' flag and update the db.
That should be more efficient than picking random numbers, querying the database and repeat until not found as that's just begging for an eternity for the last few values.
Use Pseudo-random Number Generators.
For example - Linear Congruential Random Number Generator
(if increment and n are coprime, then code will generate all numbers from 0 to n-1):
int seed = 1, increment = 3;
int n = 10;
int x = seed;
for(int i = 0; i < n; i++)
{
x = (x + increment) % n;
Console.WriteLine(x);
}
Output:
4
7
0
3
6
9
2
5
8
1
Basic Random Number Generators
Mersenne Twister
Using this algorithm might be suitable, though it's memory consuming:
http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
Put the numbers in the array from 1 to 99999999 and do the shuffle.
For the extremely limited size of your numbers no you cannot expect uniqueness for any type of random generation.
You are generating a 32bit integer, whereas to reach uniqueness you need a much larger number in terms around 128bit which is the size GUIDs use which are guaranteed to always be globally unique.
In case you happen to have access to a library and you want to dig into and understand the issue well, take a look at
The Art of Computer Programming, Volume 2: Seminumerical Algorithms
by Donald E. Knuth. Chapter 3 is all about random numbers.
You could just place your numbers in a set. If the size of the set after generation of your N numbers is too small, generate some more.
Do some trial runs. How many numbers do you have to generate on average? Try to find out an optimal solution to the tradeoff "generate too many numbers" / "check too often for duplicates". This optimal is a number M, so that after generating M numbers, your set will likely hold N unique numbers.
Oh, and M can also be calculated: If you need an extra number (your set contains N-1), then the chance of a random number already being in the set is (N-1)/R, with R being the range. I'm going crosseyed here, so you'll have to figure this out yourself (but this kinda stuff is what makes programming fun, no?).
You could put a unique constraint on the column that contains the random number, then handle any constraint voilations by regenerating the number. I think this normally indexes the column as well so this would be faster.
You've tagged the question with C#, so I'm guessing you're using C# to generate the random number. Maybe think about getting the database to generate the random number in a stored proc, and return it.
You could try giving writing usernames by using a starting number and an incremental number. You start at a number (say, 12000), then, for each account created, the number goes up by the incremental value.
id = startValue + (totalNumberOfAccounts * inctrementalNumber)
If incrementalNumber is a prime value, you should be able to loop around the max account value and not hit another value. This creates the illusion of a random id, but should also have very little conflicts. In the case of a conflicts, you could add a number to increase when there's a conflict, so the above code becomes. We want to handle this case, since, if we encounter one account value that is identical, when we increment, we will bump into another conflict when we increment again.
id = startValue + (totalNumberOfAccounts * inctrementalNumber) + totalConflicts
By fallowing line we can get e.g. 6 non repetitive random numbers for range e.g. 1 to 100.
var randomNumbers = Enumerable.Range(1, 100)
.OrderBy(n => Guid.NewGuid())
.Take(6)
.OrderBy(n => n);
I've had to do something like this before (create a "random looking" number for part of a URL). What I did was create a list of keys randomly generated. Each time it needed a new number it simply randomly selected a number from keys.Count and XOR the key and the given sequence number, then outputted XORed value (in base 62) prefixed with the keys index (in base 62).
I also check the output to ensure it does not contain any naught words. If it does simply take the next key and have a second go.
Decrypting the number is equally simple (the first digit is the index to the key to use, a simple XOR and you are done).
I like andora's answer if you are generating new numbers and might have used it had I known. However if I was to do this again I would have simply used UUIDs. Most (if not every) platform has a method for generating them and the length is just not an issue for URLs.
You could try shuffling the set of possible values then using them sequentially.
I like Lazarus's solution, but if you want to avoid effectively pre-allocating the space for every possible number, just store the used numbers in the table, but build an "unused numbers" list in memory by adding all possible numbers to a collection then deleting every one that's present in the database. Then select one of the remaining numbers and use that, adding it to the list in the database, obviously.
But, like I say, I like Lazaru's solution - I think that's your best bet for most scenarios.
function getShuffledNumbers(count) {
var shuffledNumbers = new Array();
var choices = new Array();
for (var i = 0; i<count; i++) {
// choose a number between 1 and amount of numbers remaining
choices[i] = selectedNumber = Math.ceil(Math.random()*(99999999 - i));
// Now to figure out the number based on this selection, work backwards until
// you figure out which choice this number WOULD have been on the first step
for (var j = 0; j < i; j++) {
if (choices[i - 1 - j] >= selectedNumber) {
// This basically says "it was choice number (selectedNumber) on the last step,
// but if it's greater than or equal to this, it must have been choice number
// (selectedNumber + 1) on THIS step."
selectedNumber++;
}
}
shuffledNumbers[i] = selectedNumber;
}
return shuffledNumbers;
}
This is as fast a way I could think of and only uses memory as it needs, however if you run it all the way through it will use double as much memory because it has two arrays, choices and shuffledNumbers.
Running a linear congruential generator once to generate each number is apt to produce rather feeble results. Running it through a number of iterations which is relatively prime to your base (100,000,000 in this case) will improve it considerably. If before reporting each output from the generator, you run it through one or more additional permutation functions, the final output will still be a duplicate-free permutation of as many numbers as you want (up to 100,000,000) but if the proper functions are chosen the result can be cryptographically strong.
create and store ind db two shuffled versions(SHUFFLE_1 and SHUFFLE_2) of the interval [0..N), where N=10'000;
whenever a new order is created, you assign its id like this:
ORDER_FAKE_INDEX = N*SHUFFLE_1[ORDER_REAL_INDEX / N] + SHUFFLE_2[ORDER_REAL_INDEX % N]
I also came with same kind of problem but in C#. I finally solved it. Hope it works for you also.
Suppose I need random number between 0 and some MaxValue and having a Random type object say random.
int n=0;
while(n<MaxValue)
{
int i=0;
i=random.Next(n,MaxValue);
n++;
Write.Console(i.ToString());
}
the stupid way: build a table to record, store all the numble first, and them ,every time the numble used, and flag it as "used"
System.Random rnd = new System.Random();
IEnumerable<int> numbers = Enumerable.Range(0, 99999999).OrderBy(r => rnd.Next());
This gives a randomly shuffled collection of ints in your range. You can then iterate through the collection in order.
The nice part about this is that you're not actually creating the entire collection in memory.
See comments below - this will generate the entire collection in memory when you iterate to the first element.
You can genearate number like below if you are ok with consumption of memory.
import java.util.ArrayList;
import java.util.Collections;
public class UniqueRandomNumbers {
public static void main(String[] args) {
ArrayList<Integer> list = new ArrayList<Integer>();
for (int i=1; i<11; i++) {
list.add(i);
}
Collections.shuffle(list);
for (int i=0; i<11; i++) {
System.out.println(list.get(i));
}
}
}
I have been doing some testing on the Random class and I have used the following code:
while (x++ <= 5000000)
{
y = rnd.Next(1, 5000000);
if (!data.Contains(y))
data.Add(y);
else
{
Console.WriteLine("Cycle {2}: Repetation found for number {0} after {1} iteration", y, x, i);
break;
}
}
I kept changing the rnd max limit (i.e. 5000000) and I changed the number of iterations and I got the following result:
1) if y = rnd.Next(1, 5000) : The average is between 80 to 110 iterations
2) if y = rnd.Next(1, 5000000) : The average is between 2000 to 4000 iterations
3) if y = rnd.Next(1, int.MaxValue) : The average is between 40,000 to 80,000 iterations.
Why am I getting these averages, i.e. out of 10 times I checked for each value, 80% of the time I get within this average range. I dont think we can call it near to being Random.
What can I do to get a fairly random number.
You are not testing for cycles. You are testing for how long it takes to get a random number you've had before. That's completely different. Your figures are spot on for testing how long it takes to get a random number you had before. Look in wikipedia under "the birthday paradox" for a chart of the probability of getting a collision after a certain number of iterations.
Coincidentally, last week I wrote a blog article about this exact subject. It'll go live on March 22nd; see my blog then for details.
If what you want to test for is the cycle length of a pseudo-random number generator then you need to look for not a number you've had before, but rather, a lengthy exact sequence of numbers that you've had before. There are a number of interesting ways to do that, but it's probably easier for me to just tell you: the cycle length of Random is a few billion, so you are unlikely to be able to write a program that discovers that fact. You'd have to store a lot of numbers.
However, cycle length is not the only measure of quality of a pseudo-random number generator. Remember, PRNGs are not random, they are predictable, and therefore you have to think very carefully about what your metric for "randomness" is.
Give us more details: why do you care how "random" Random is? What application are you using it for that you care? What aspects of randomness are important to you?
You are assuming that the randomness is better if numbers are not repeated. That is not true.
Real randomness doesn't have a memory. When you pick the next number, the chance to get the same number again is just as high as any other number in the range.
If you roll a dice and get a six, then roll the dice again, there is no less chance to get a six again. If you happen to get two sixes in a row, that doesn't mean that the dice is broken.
The randomness in the Random class it of course not perfect, but that is not what your test reveals. It simply shows a phenomenon that you get with every random number generator, even if actually creates real random numbers and not just pseudo-random numbers.
You are judging randomness by repeat pairs, which isn't the best test for randomness. The repeats you see are akin to the birthday paradox: http://en.wikipedia.org/wiki/Birthday_problem, where a repeat event can occur with a small sample size if you are not looking for a specific event.
Per the documentation at http://msdn.microsoft.com/en-us/library/system.random.aspx
To generate a cryptographically secure
random number suitable for creating a
random password, for example, use a
class derived from
System.Security.Cryptography..::.RandomNumberGenerator
such as
System.Security.Cryptography..::.RNGCryptoServiceProvider.
A computer can't generate a real random number.
if You need a real random number (David gave you the best option from dot net framework)
you need an external random source.