What would be the easiest way to code a function in .NET to generate a GUID based on a seed so that I can have greater confidence about its uniqueness?
string GenerateSeededGuid(int seed) { /* code here */ }
Ideally, the seed would come from CryptGenRandom which describes its random number generation as follows:
The data produced by this function is cryptographically random. It is
far more random than the data generated by the typical random number
generator such as the one shipped with your C compiler.
This function is often used to generate random initialization vectors
and salt values.
Software random number generators work in fundamentally the same way.
They start with a random number, known as the seed, and then use an
algorithm to generate a pseudo-random sequence of bits based on it.
The most difficult part of this process is to get a seed that is truly
random. This is usually based on user input latency, or the jitter
from one or more hardware components.
With Microsoft CSPs, CryptGenRandom uses the same random number
generator used by other security components. This allows numerous
processes to contribute to a system-wide seed. CryptoAPI stores an
intermediate random seed with every user. To form the seed for the
random number generator, a calling application supplies bits it might
have—for instance, mouse or keyboard timing input—that are then
combined with both the stored seed and various system data and user
data such as the process ID and thread ID, the system clock, the
system time, the system counter, memory status, free disk clusters,
the hashed user environment block. This result is used to seed the
pseudorandom number generator (PRNG). [...] If an application has access to a good random source, it can
fill the pbBuffer buffer with some random data before calling
CryptGenRandom. The CSP then uses this data to further randomize its
internal seed. It is acceptable to omit the step of initializing the
pbBuffer buffer before calling CryptGenRandom.
tldr; use Guid.NewGuid instead of trying to invent another "more random" approach. (The only reason I can think of to create a UUIDvX from a seed is when a predictable, resettable, sequence is desired. However, a GUID might also not be the best approach2.)
By very definition of being a finite range - 128bits minus 6 versioning bits, so 122 bits of uniqueness for v4 - there are only so many (albeit supremely huge number! astronomically big!) "unique" identifiers.
Due to the Pigeonhole Principle there are only so many Pigeonholes. If Pigeons keep reproducing eventually there will not be enough Holes for each Pigeon. Due to the Birthday Paradox, assuming complete randomness, two Pigeons will try to fight for the same Pigeonholes before they are all filled up. Because there is no Master Pigeonhole List1 this cannot be prevented. Also, not all animals are Pigeons3.
While there are no guarantees as to which GUID generator will be used, .NET uses the underlying OS call, which is a GUIDv4 (aka Random UUID) generator since Windows 2k. As far as I know - or care, really - this is as good a random as it gets for such a purpose. It has been well vetted for over a decade and has not been replaced.
From Wikipedia:
.. only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs.
1 While there are still a finite set of Pigeonholes, UUIDv1 (aka MAC UUID) - assuming unique time-space - is guaranteed to generate deterministically unique numbers (with some "relatively small" theoretical maximum number of UUIDs generated per second on a given machine). Different broods of Pigeons living in different parallel dimensions - awesome!
2 Twitter uses Snowflakes in parallel dimensions in its own distributed Unique-ID scheme.
3 Rabbits like to live in Burrows, not Pigeonholes. The use of a GUID also acts as an implicit parallel partition. It is only when a duplicate GUID is used for the same purpose that collision-related problems can arise. Just think of how many duplicate auto-increment database primary keys there are!
All you really need to do in your GenerateSeededGuid method is to create a 128-bit random number and the convert it to a Guid. Something like:
public Guid GenerateSeededGuid(int seed)
{
var r = new Random(seed);
var guid = new byte[16];
r.NextBytes(guid);
return new Guid(guid);
}
This is a bit old, but no need for a random generator. But yes this is usefull for testing purpose, but not for general uses
public static Guid GenerateSeededGuid<T>(T value)
{
byte[] bytes = new byte[16];
BitConverter.GetBytes(value.GetHashCode()).CopyTo(bytes, 0);
return new Guid(bytes);
}
public static Guid SeededGuid(int seed, Random random = null)
{
random ??= new Random(seed);
return Guid.Parse(string.Format("{0:X4}{1:X4}-{2:X4}-{3:X4}-{4:X4}-{5:X4}{6:X4}{7:X4}",
random.Next(0, 0xffff), random.Next(0, 0xffff),
random.Next(0, 0xffff),
random.Next(0, 0xffff) | 0x4000,
random.Next(0, 0x3fff) | 0x8000,
random.Next(0, 0xffff), random.Next(0, 0xffff), random.Next(0, 0xffff)));
}
//Example 1
SeededGuid("Test".GetHashCode());
SeededGuid("Test".GetHashCode());
//Example 2
var random = new Random("Test".GetHashCode());
SeededGuid("Test".GetHashCode(), random);
SeededGuid("Test".GetHashCode(), random);
This method is based on php v4 uui https://www.php.net/manual/en/function.uniqid.php#94959
Related
It seems that there is no way to manually seed the RNGCryptoServiceProvider in C#. Is there really nothing simple I can do below to get repeatable randomBytes here for debugging?
RNGCryptoServiceProvider rngCsp = new RNGCryptoServiceProvider();
byte[] randomBytes = new byte[20];
rngCsp.GetBytes(randomBytes);
MessageBox.Show(Convert.ToBase64String(randomBytes));
I know that I could manually enter the 20 bytes, but this is a pain since I really need many more than 20. Also, I know that I could use the non-cryptography random number generator, but, in the end, I will need the best random generation.
By the way, I would guess some CPUs have true random generation built in where seeding is not physically possible, but I don't think my CPU has this capability. I wonder if anybody knows exactly what I could do with my CPU to reset the RNGCryptoServiceProvider environment and trick RNGCryptoServiceProvider into using a prior seed...I imagine I could set my clock back and reset some "user log bits" somewhere...I know this wouldn't be practical, but wonder if anybody has ever been successful at this (even though Microsoft's goal is probably to prevent this).
There is not a means to seed the RNGCryptoServiceProvider. One solution to generating deterministic values for debugging is to derive your own class that implements from System.Security.Cryptography.RandomNumberGenerator (the base class for RNGCryptoServiceProvider):
class DeterministicRandomGenerator : System.Security.Cryptography.RandomNumberGenerator
{
Random r = new Random(0);
public override void GetBytes(byte[] data)
{
r.NextBytes(data);
}
public override void GetNonZeroBytes(byte[] data)
{
// simple implementation
for (int i = 0; i < data.Length; i++)
data[i] = (byte)r.Next(1, 256);
}
}
Note the implementation uses the standard Random implementation with a seed of 0, ensuring deterministic results. Now you can use this class in place of RNGCryptoServiceProvider for debugging purposes:
RandomNumberGenerator rngCsp =
#if DEBUG
new DeterministicRandomGenerator(); // get deterministic values if debugging
#else
new RNGCryptoServiceProvider(); // otherwise, use CryptoRNG
#endif
byte[] randomBytes = new byte[20];
rngCsp.GetBytes(randomBytes);
MessageBox.Show(Convert.ToBase64String(randomBytes));
Edited to add
I wonder if anybody knows exactly what I could do with my CPU to reset the RNGCryptoServiceProvider environment and trick RNGCryptoServiceProvider into using a prior seed
Internally, RNGCryptoServiceProvider calls the Win32 CryptGenRandom function to fill the buffer with cryptographically-random values (Source and additional information). It is not based on a single seed. (Although the Win32 API allows the caller to provide a seed with supplemental random data, the .NET API does not expose this functionality. The purpose of a seed in this context is to provide additional entropy the application has access to, rather than to force a deterministic sequence.) The CryptGenRandom documentation states that:
To form the seed for the random number generator, a calling application supplies bits it might have—for instance, mouse or keyboard timing input—that are then combined with both the stored seed and various system data and user data such as the process ID and thread ID, the system clock, the system time, the system counter, memory status, free disk clusters, the hashed user environment block. This result is used to seed the pseudorandom number generator (PRNG). In Windows Vista with Service Pack 1 (SP1) and later, an implementation of the AES counter-mode based PRNG specified in NIST Special Publication 800-90 is used. In Windows Vista, Windows Storage Server 2003, and Windows XP, the PRNG specified in Federal Information Processing Standard (FIPS) 186-2 is used.
The result is that "resetting" the RNGCryptoServiceProvider to coerce it into repeating a former sequence is, by design, not a practical approach.
Also, I know that I could use the non-cryptography random number generator, but, in the end, I will need the best random generation.
Use a Mersenne Twister or similar as a seedable PRNG for testing purposes - only in those tests where a repeatable result is required - and the RCPCryptoServiceProvider for everything else.
Another alternative is to pre-generate a batch of randomBytes arrays and save those to disk for use in your testing phase. This gives the advantage of giving you repeatability while retaining the same random distribution and other characteristics of the RCP CSP. Also it's easier to forget to remove temporary code than it is to forget NOT to include extraneous files.
I'm writing a program which basically processes data and outputs many files. There is no way it will be producing more than 10-20 files each use. I just wanted to know if using this method to generate unique filenames is a good idea? is it possible that rand will choose, lets say x and then within 10 instances, choose x again. Is using random(); a good idea? Any inputs will be appreciated!
Random rand = new Random ();
int randNo = rand.Next(100000,999999)l
using (var write = new StreamWriter("C:\\test" + randNo + ".txt")
{
// Stuff
}
I just wanted to know if using this method to generate unique filenames is a good idea?
No. Uniqueness isn't a property of randomness. Random means that the resulting value is not in any way dependent upon previous state. Which means repeats are possible. You could get the same number many times in a row (though it's unlikely).
If you want values which are unique, use a GUID:
Guid.NewGuid();
As pointed out in the comments below, this solution isn't perfect. But I contend that it's good enough for the problem at hand. The idea is that Random is designed to be random, and Guid is designed to be unique. Mathematically, "random" and "unique" are non-trivial problems to solve.
Neither of these implementations is 100% perfect at what it does. But the point is to simply use the correct one of the two for the intended functionality.
Or, to use an analogy... If you want to hammer a nail into a piece of wood, is it 100% guaranteed that the hammer will succeed in doing that? No. There exists a non-zero chance that the hammer will shatter upon contacting the nail. But I'd still reach for the hammer rather than jury rigging something with the screwdriver.
No, this is not correct method to create temporary file names in .Net.
The right way is to use either Path.GetTempFileName (creates file immediatedly) or Path.GetRandomFileName (creates high quality random name).
Note that there is not much wrong with Random, Guid.NewGuid(), DateTime.Now to generate small number of file names as covered in other answers, but using functions that are expected to be used for particular purpose leads to code that is easier to read/prove correctness.
If you want to generate a unique value, there's a tool specifically designed for generating unqiue identifying values, a Globally Unique IDentifier (GUID).
var guid = Guid.NewGuid();
Leave the problem of figuring out the best way of creating such a unique value to others.
There is what is called the Birthday Paradox... If you generate some random numbers (any number > 1), the possibility of encountering a "collision" increases... If you generate sqrt(numberofpossiblevalues) values, the possibility of a collision is around 50%... so you have 799998 possible values... sqrt(799998) is 894... It is quite low... With 45-90 calls to your program you have a 50% chance of a collision.
Note that random being random, if you generate two random numbers, there is a non-zero possibility of a collision, and if you generate numberofpossiblevalues + 1 random numbers, the possibility of a collision is 1.
Now... Someone will tell you that Guid.NewGuid will generate always unique values. They are sellers of very good snake oil. As written in the MSDN, in the Guid.NewGuid page...
The chance that the value of the new Guid will be all zeros or equal to any other Guid is very low.
The chance isn't 0, it is very (very very I'll add) low! Here the Birthday Paradox activates... Now... Microsoft Guid have 122 bits of "random" part and 6 bits of "fixed" part, the 50% chance of a collision happens around 2.3x10^18 . It is a big number! The 1% chance of collision is after 3.27x10^17... still a big number!
Note that Microsoft generates these 122 bits with a strong random number generator: https://msdn.microsoft.com/en-us/library/bb417a2c-7a58-404f-84dd-6b494ecf0d13#id9
Windows uses the cryptographic PRNG from the Cryptographic API (CAPI) and the Cryptographic API Next Generation (CNG) for generation of Version 4 GUIDs.
So while the whole Guid generated by Guid.NewGuid isn't totally random (because 6 bits are fixed), it is still quite random.
I would think it would be a good idea to add in the date & time the file was created in the file name in order to make sure it is not duplicated. You could also add random numbers to this if you want to make it even more unique (in the case your 10 files are saved at the exact same time).
So the files name might be file06182015112300.txt (showing the month, day, year, hour, minute & seconds)
If you want to use files of that format, and you know you won't run out of unused numbers, it's safer to check that the random number you generate isn't already used as follows:
Random rand = new Random();
string filename = "";
do
{
int randNo = rand.Next(100000, 999999);
filename = "C:\\test" + randNo + ".txt";
} while (File.Exists(filename));
using (var write = new StreamWriter(filename))
{
//Stuff
}
My colleague and I are debating which of these methods to use for auto generating user ID's and post ID's for identification in the database:
One option uses a single instance of Random, and takes some useful parameters so it can be reused for all sorts of string-gen cases (i.e. from 4 digit numeric pins to 20 digit alphanumeric ids). Here's the code:
// This is created once for the lifetime of the server instance
class RandomStringGenerator
{
public const string ALPHANUMERIC_CAPS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
public const string ALPHA_CAPS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
public const string NUMERIC = "1234567890";
Random rand = new Random();
public string GetRandomString(int length, params char[] chars)
{
string s = "";
for (int i = 0; i < length; i++)
s += chars[rand.Next() % chars.Length];
return s;
}
}
and the other option is simply to use:
Guid.NewGuid();
see Guid.NewGuid on MSDN
We're both aware that Guid.NewGuid() would work for our needs, but I would rather use the custom method. It does the same thing but with more control.
My colleague thinks that because the custom method has been cooked up ourselves, it's more likely to generate collisions. I'll admit I'm not fully aware of the implementation of Random, but I presume it is just as random as Guid.NewGuid(). A typical usage of the custom method might be:
RandomStringGenerator stringGen = new RandomStringGenerator();
string id = stringGen.GetRandomString(20, RandomStringGenerator.ALPHANUMERIC_CAPS.ToCharArray());
Edit 1:
We are using Azure Tables which doesn't have an auto increment (or similar) feature for generating keys.
Some answers here just tell me to use NewGuid() "because that's what it's made for". I'm looking for a more in depth reason as to why the cooked up method may be more likely to generate collisions given the same degrees of freedom as a Guid.
Edit 2:
We were also using the cooked up method to generate post ID's which, unlike session tokens, need to look pretty for display in the url of our website (like http://mywebsite.com/14983336), so guids are not an option here, however collisions are still to be avoided.
I am looking for a more in depth reason as to why the cooked up method may be more likely to generate collisions given the same degrees of freedom as a Guid.
First, as others have noted, Random is not thread-safe; using it from multiple threads can cause it to corrupt its internal data structures so that it always produces the same sequence.
Second, Random is seeded based on the current time. Two instances of Random created within the same millisecond (recall that a millisecond is several million processor cycles on modern hardware) will have the same seed, and therefore will produce the same sequence.
Third, I lied. Random is not seeded based on the current time; it is seeded based on the amount of time the machine has been active. The seed is a 32 bit number, and since the granularity is in milliseconds, that's only a few weeks until it wraps around. But that's not the problem; the problem is: the time period in which you create that instance of Random is highly likely to be within a few minutes of the machine booting up. Every time you power-cycle a machine, or bring a new machine online in a cluster, there is a small window in which instances of Random are created, and the more that happens, the greater the odds are that you'll get a seed that you had before.
(UPDATE: Newer versions of the .NET framework have mitigated some of these problems; in those versions you no longer have every Random created within the same millisecond have the same seed. However there are still many problems with Random; always remember that it is only pseudo-random, not crypto-strength random. Random is actually very predictable, so if you are relying on unpredictability, it is not suitable.)
As other have said: if you want a primary key for your database then have the database generate you a primary key; let the database do its job. If you want a globally unique identifier then use a guid; that's what they're for.
And finally, if you are interested in learning more about the uses and abuses of guids then you might want to read my "guid guide" series; part one is here:
https://ericlippert.com/2012/04/24/guid-guide-part-one/
As written in other answers, my implementation had a few severe problems:
Thread safety: Random is not thread safe.
Predictability: the method couldn't be used for security critical identifiers like session tokens due to the nature of the Random class.
Collisions: Even though the method created 20 'random' numbers, the probability of a collision is not (number of possible chars)^20 due to the seed value only being 31 bits, and coming from a bad source. Given the same seed, any length of sequence will be the same.
Guid.NewGuid() would be fine, except we don't want to use ugly GUIDs in urls and .NETs NewGuid() algorithm is not known to be cryptographically secure for use in session tokens - it might give predictable results if a little information is known.
Here is the code we're using now, it is secure, flexible and as far as I know it's very unlikely to create collisions if given enough length and character choice:
class RandomStringGenerator
{
RNGCryptoServiceProvider rand = new RNGCryptoServiceProvider();
public string GetRandomString(int length, params char[] chars)
{
string s = "";
for (int i = 0; i < length; i++)
{
byte[] intBytes = new byte[4];
rand.GetBytes(intBytes);
uint randomInt = BitConverter.ToUInt32(intBytes, 0);
s += chars[randomInt % chars.Length];
}
return s;
}
}
"Auto generating user ids and post ids for identification in the database"...why not use a database sequence or identity to generate keys?
To me your question is really, "What is the best way to generate a primary key in my database?" If that is the case, you should use the conventional tool of the database which will either be a sequence or identity. These have benefits over generated strings.
Sequences/identity index better. There are numerous articles and blog posts that explain why GUIDs and so forth make poor indexes.
They are guaranteed to be unique within the table
They can be safely generated by concurrent inserts without collision
They are simple to implement
I guess my next question is, what reasons are you considering GUID's or generated strings? Will you be integrating across distributed databases? If not, you should ask yourself if you are solving a problem that doesn't exist.
Your custom method has two problems:
It uses a global instance of Random, but doesn't use locking. => Multi threaded access can corrupt its state. After which the output will suck even more than it already does.
It uses a predictable 31 bit seed. This has two consequences:
You can't use it for anything security related where unguessability is important
The small seed (31 bits) can reduce the quality of your numbers. For example if you create multiple instances of Random at the same time(since system startup) they'll probably create the same sequence of random numbers.
This means you cannot rely on the output of Random being unique, no matter how long it is.
I recommend using a CSPRNG (RNGCryptoServiceProvider) even if you don't need security. Its performance is still acceptable for most uses, and I'd trust the quality of its random numbers over Random. If you you want uniqueness, I recommend getting numbers with around 128 bits.
To generate random strings using RNGCryptoServiceProvider you can take a look at my answer to How can I generate random 8 character, alphanumeric strings in C#?.
Nowadays GUIDs returned by Guid.NewGuid() are version 4 GUIDs. They are generated from a PRNG, so they have pretty similar properties to generating a random 122 bit number (the remaining 6 bits are fixed). Its entropy source has much higher quality than what Random uses, but it's not guaranteed to be cryptographically secure.
But the generation algorithm can change at any time, so you can't rely on that. For example in the past the Windows GUID generation algorithm changed from v1 (based on MAC + timestamp) to v4 (random).
Use System.Guid as it:
...can be used across all computers and networks wherever a unique identifier is required.
Note that Random is a pseudo-random number generator. It is not truly random, nor unique. It has only 32-bits of value to work with, compared to the 128-bit GUID.
However, even GUIDs can have collisions (although the chances are really slim), so you should use the database's own features to give you a unique identifier (e.g. the autoincrement ID column). Also, you cannot easily turn a GUID into a 4 or 20 (alpha)numeric number.
Contrary to what some people have said in the comment, a GUID generated by Guid.NewGuid() is NOT dependent on any machine-specific identifier (only type 1 GUIDs are, Guid.NewGuid() returns a type 4 GUID, which is mostly random).
As long as you don't need cryptographic security, the Random class should be good enough, but if you want to be extra safe, use System.Security.Cryptography.RandomNumberGenerator. For the Guid approach, note that not all digits in a GUID are random. Quote from wikipedia:
In the canonical representation, xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, the most significant bits of N indicates the variant (depending on the variant; one, two or three bits are used). The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e. the hexadecimal N will always be 8, 9, A, or B).
In the variant covered by the UUID specification, there are five versions. For this variant, the four bits of M indicates the UUID version (i.e. the hexadecimal M will either be 1, 2, 3, 4, or 5).
Regarding your edit, here is one reason to prefer a GUID over a generated string:
The native storage for a GUID (uniqueidentifier) in SQL Server is 16 bytes. To store a equivalent-length varchar (string), where each "digit" in the id is stored as a character, would require somewhere between 32 and 38 bytes, depending on formatting.
Because of its storage, SQL Server is also able to index a uniqueidentifier column more efficiently than a varchar column as well.
Since computers cannot pick random numbers(can they?) how is this random number actually generated. For example in C# we say,
Random.Next()
What happens inside?
You may checkout this article. According to the documentation the specific implementation used in .NET is based on Donald E. Knuth's subtractive random number generator algorithm. For more information, see D. E. Knuth. "The Art of Computer Programming, volume 2: Seminumerical Algorithms". Addison-Wesley, Reading, MA, second edition, 1981.
Since computers cannot pick random numbers (can they?)
As others have noted, "Random" is actually pseudo-random. To answer your parenthetical question: yes, computers can pick truly random numbers. Doing so is much more expensive than the simple integer arithmetic of a pseudo-random number generator, and usually not required. However there are applications where you must have non-predictable true randomness: cryptography and online poker immediately come to mind. If either use a predictable source of pseudo-randomness then attackers can decrypt/forge messages much more easily, and cheaters can figure out who has what in their hands.
The .NET crypto classes have methods that give random numbers suitable for cryptography or games where money is on the line. As for how they work: the literature on crypto-strength randomness is extensive; consult any good university undergrad textbook on cryptography for details.
Specialty hardware also exists to get random bits. If you need random numbers that are drawn from atmospheric noise, see www.random.org.
Knuth covers the topic of randomness very well.
We don't really understand random well. How can something predictable be random? And yet pseudo-random sequences can appear to be perfectly random by statistical tests.
There are three categories of Random generators, amplifying on the comment above.
First, you have pseudo random number generators where if you know the current random number, it's easy to compute the next one. This makes it easy to reverse engineer other numbers if you find out a few.
Then, there are cryptographic algorithms that make this much harder. I believe they still are pseudo random sequences (contrary to what the comment above implies), but with the very important property that knowing a few numbers in the sequence does NOT make it obvious how to compute the rest. The way it works is that crypto routines tend to hash up the number, so that if one bit changes, every bit is equally likely to change as a result.
Consider a simple modulo generator (similar to some implementations in C rand() )
int rand() {
return seed = seed * m + a;
}
if m=0 and a=0, this is a lousy generator with period 1: 0, 0, 0, 0, ....
if m=1 and a=1, it's also not very random looking: 0, 1, 2, 3, 4, 5, 6, ...
But if you pick m and a to be prime numbers around 2^16, this will jump around nicely looking very random if you are casually inspecting. But because both numbers are odd, you would see that the low bit would toggle, ie the number is alternately odd and even. Not a great random number generator. And since there are only 2^32 values in a 32 bit number, by definition after 2^32 iterations at most, you will repeat the sequence again, making it obvious that the generator is NOT random.
If you think of the middle bits as nice and scrambled, while the lower ones aren't as random, then you can construct a better random number generator out of a few of these, with the various bits XORed together so that all the bits are covered well. Something like:
(rand1() >> 8) ^ rand2() ^ (rand3() > 5) ...
Still, every number is flipping in synch, which makes this predictable. And if you get two sequential values they are correlated, so that if you plot them you will get lines on your screen. Now imagine you have rules combining the generators, so that sequential values are not the next ones.
For example
v1 = rand1() >> 8 ^ rand2() ...
v2 = rand2() >> 8 ^ rand5() ..
and imagine that the seeds don't always advance. Now you're starting to make something that's much harder to predict based on reverse engineering, and the sequence is longer.
For example, if you compute rand1() every time, but only advance the seed in rand2() every 3rd time, a generator combining them might not repeat for far longer than the period of either one.
Now imagine that you pump your (fairly predictable) modulo-type random number generator through DES or some other encryption algorithm. That will scramble up the bits.
Obviously, there are better algorithms, but this gives you an idea. Numerical Recipes has a lot of algorithms implemented in code and explained. One very good trick: generate not one but a block of random values in a table. Then use an independent random number generator to pick one of the generated numbers, generate a new one and replace it. This breaks up any correlation between adjacent pairs of numbers.
The third category is actual hardware-based random number generators, for example based on atmospheric noise
http://www.random.org/randomness/
This is, according to current science, truly random. Perhaps someday we will discover that it obeys some underlying rule, but currently, we cannot predict these values, and they are "truly" random as far as we are concerned.
The boost library has excellent C++ implementations of Fibonacci generators, the reigning kings of pseudo-random sequences if you want to see some source code.
I'll just add an answer to the first part of the question (the "can they?" part).h
Computers can generate (well, generate may not be an entirely accurate word) random numbers (as in, not pseudo-random). Specifically, by using environmental randomness which is gotten through specialized hardware devices (that generates randomness based on noise, for e.g.) or by using environmental inputs (e.g. hard disk timings, user input event timings).
However, that has no bearing on the second question (which was how Random.Next() works).
The Random class is a pseudo-random number generator.
It is basically an extremely long but deterministic repeating sequence. The "randomness" comes from starting at different positions. Specifying where to start is done by choosing a seed for the random number generator and can for example be done by using the system time or by getting a random seed from another random source. The default Random constructor uses the system time as a seed.
The actual algorithm used to generate the sequence of numbers is documented in MSDN:
The current implementation of the Random class is based on Donald E. Knuth's subtractive random number generator algorithm. For more information, see D. E. Knuth. "The Art of Computer Programming, volume 2: Seminumerical Algorithms". Addison-Wesley, Reading, MA, second edition, 1981.
Computers use pseudorandom number generators. Essentially, they work by start with a seed number and iterating it through an algorithm each time a new pseudorandom number is required.
The process is of course entirely deterministic, so a given seed will generate exactly the same sequence of numbers every time it is used, but the numbers generated form a statistically uniform distribution (approximately), and this is fine, since in most scenarios all you need is stochastic randomness.
The usual practice is to use the current system time as a seed, though if more security is required, "entropy" may be gathered from a physical source such as disk latency in order to generate a seed that is more difficult to predict. In this case, you'd also want to use a cryptographically strong random number generator such as this.
I don't know much details but what I know is that a seed is used in order to generate the random numbers it is then based on some algorithm that uses that seed that a new number is obtained.
If you get random numbers based on the same seed they will be the same often.
I know that the Random class can generate pseudo-random numbers but is there a way to generate truly random numbers?
The answer here has two main sides to it. There are some quite important subtleties to which you should pay due attention...
The Easy Way (for simplicity & practicality)
The RNGCryptoServiceProvider, which is part of the Crypto API in the BCL, should do the job for you. It's still technically a pseudo-random number generated, but the quality of "randomness" is much higher - suitable for cryptographic purposes, as the name might suggest.
There are other crypographic APIs with high quality pseudo random generaters available too. Algorithms such as the Mersenne twister are quite popular.
Comparing this to the Random class in the BCL, it is significantly better. If you plot the numbers generated by Random on a graph, for example, you should be able to recognise patterns, which is a strong sign of weakness. This is largely due to the fact that the algorithm simply uses a seeded lookup table of fixed size.
The Hard Way (for high quality theoretical randomness)
To generate truly random numbers, you need to make use of some natural phenomenon, such as nuclear decay, microscopic temperature fluctuations (CPU temperature is a comparatively conveient source), to name a few. This however is much more difficult and requires additional hardware, of course. I suspect the practical solution (RNGCryptoServiceProvider or such) should do the job perfectly well for you.
Now, note that if you really do require truly random numbers, you could use a service such as Random.org, which generates numbers with very high randomness/entropy (based on atmospheric noise). Data is freely available for download. This may nonetheless be unnecessarily complicated for your situation, although it certainly gives you data suitable for scientific study and whatnot.
The choice is yours in the end, but at least you should now be able to make an informative decision, being aware of the various types and levels of RNGs.
short answer: It is not directly possible to generate TRULY RANDOM NUMBERS using only C# (i.e. using only a purely mathematical construction).
long(er) answer: Only by means of employing an external device capable of generating "randomness" such as a white noise generator or similar - and capturing the output of that device as a seed for a pseudo random number generator (PRG). That part could be accomplished using C#.
True random numbers can only be generated if there is a truly random physical input device that provides the seed for the random function.
Whether anything physical and truly random exists is still debated (and likely will be for a long time) by the science community.
Psuedo-random number generators are the next best thing and the best are very difficult to predict.
As John von Neumann joked, "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."
The thread is old and answered, but i thought I'd proceed anyway. It's for completeness and people should know some things about Random in c#.
As for truly random, the best you can ever hope to do is use a "secure Pseudo Random Generator" like salsa20 or RC4 (sort of, sometimes). They pass a barrage of tests where "efficient" adversaries try to distinguish them from random. This comes with certain costs and is probably unnecessary for most uses.
The random class in c# is pretty good most of the time, it has a statically distribution that looks random. However the default seed for random() is the system time. So if you take lots of randoms at the "same time" they are taken with the same seed and will be the same ("random" is completely deterministic, don't let it fool you). Similar system time seeds also may produce similar numbers because of random class's shortcomings.
The way to deal with this is to set you own seeds, like
Random random = new Random((int)DateTime.Now.Ticks & (0x0000FFFF + x));
where x is some value you increment if you've created a loop to get a bunch of random numbers, say.
Also with c# random extensionsto your new variable like NextDouble() can be helpful in manipulating the random numbers, in this case crow-baring them into interval (0,1) to become unif(0,1), which happens is a distribution you can plug into stat formulas to create all the distributions in statistics.
Take a look at using an algorithm like Yarrow or Fortuna with entropy accumulation. The point with these algorithms is that they keep track of entropy as a measure of theoretical information content available for predicting future numbers by knowing the past numbers and the algorithms used to produce them; and they use cryptographic techniques to fold new entropy sources into the number generator.
You'll still need an external source of random data (e.g. hardware source of random numbers), whether that's time of keystrokes, or mouse movement, or hard disk access times, or CPU temperature, or webcam data, or stock prices, or whatever -- but in any case, you keep mixing this information into the entropy pools, so that even if the truly random data is slow or low quality, it's enough to keep things going in an unpredictable fashion.
I was debating building a random number generator based off twitter or one of the other social networking sites. Basically use the api to pull recent posts and then use that to seed a high quality pseudo random number generator. It probably isn't any more effective than randomizing off the timer but seemed like fun. Besides it seems like the best use for most of the stuff people post to twitter.
I always liked this idea, for the retro 60s look:
Lavarand
There is no "true" random in computers, everything is based on something else. For some (feasible) ways to generate pseudorandom data, try something such as a pool of the HD temp, CPU temp, network usage (packets/second) and possibly hits/second to the webserver.
Just to clarify everyone saying that there is no True RNG available in C# or on your computer is mistaken. A multi-core processor is inherently a True RNG. Very simply by taking advantage of processor spin you can generate bools that have no discernible pattern. From there you can generate whatever number range you want by using the bools as bits and constructing the number by adding the bits together.
Yes this is magnitudes slower than a purely mathematical solution but a purely mathematical solution will always have a pattern.
public static bool GenerateBoolean()
{
var gen1 = 0;
var gen2 = 0;
Task.Run(() =>
{
while (gen1 < 1 || gen2 < 1)
Interlocked.Increment(ref gen1);
});
while (gen1 < 1 || gen2 < 1)
Interlocked.Increment(ref gen2);
return (gen1 + gen2) % 2 == 0;
}
There is no way to generate truly random numbers with a computer. True randomness requires an external source that monitors some natural phenomenon.
That said, if you don't have access to such a source of truly random numbers you could use a "poor man's" process like this:
Create a long array (10000 or more items?) of numbers
Populate the array with current time-seeded random numbers the standard way
When a random number is required, generate a random index into the array and return the number contained at that position
Create a new, current time-seeded random number at the array index to replace the number used
This two-step process should improve the randomness of your results somewhat without the need for external input.
Here's a sample library that implements the above-described algorithm in C++: http://www.boost.org/doc/libs/1_39_0/libs/random/random-generators.html
This code will return you a random number between min and max:
private static readonly Random random = new Random();
private static readonly object syncLock = new object();
public int RandomNumber(int min, int max)
{
lock (syncLock)
{ // synchronize
return random.Next(min, max);
}
}
Usage:
int randomNumber = RandomNumber(0, 10); // a random number between 1 and 10