Related
What is the best way to randomize an array of strings with .NET? My array contains about 500 strings and I'd like to create a new Array with the same strings but in a random order.
Please include a C# example in your answer.
The following implementation uses the Fisher-Yates algorithm AKA the Knuth Shuffle. It runs in O(n) time and shuffles in place, so is better performing than the 'sort by random' technique, although it is more lines of code. See here for some comparative performance measurements. I have used System.Random, which is fine for non-cryptographic purposes.*
static class RandomExtensions
{
public static void Shuffle<T> (this Random rng, T[] array)
{
int n = array.Length;
while (n > 1)
{
int k = rng.Next(n--);
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}
}
Usage:
var array = new int[] {1, 2, 3, 4};
var rng = new Random();
rng.Shuffle(array);
rng.Shuffle(array); // different order from first call to Shuffle
* For longer arrays, in order to make the (extremely large) number of permutations equally probable it would be necessary to run a pseudo-random number generator (PRNG) through many iterations for each swap to produce enough entropy. For a 500-element array only a very small fraction of the possible 500! permutations will be possible to obtain using a PRNG. Nevertheless, the Fisher-Yates algorithm is unbiased and therefore the shuffle will be as good as the RNG you use.
If you're on .NET 3.5, you can use the following IEnumerable coolness:
Random rnd=new Random();
string[] MyRandomArray = MyArray.OrderBy(x => rnd.Next()).ToArray();
Edit: and here's the corresponding VB.NET code:
Dim rnd As New System.Random
Dim MyRandomArray = MyArray.OrderBy(Function() rnd.Next()).ToArray()
Second edit, in response to remarks that System.Random "isn't threadsafe" and "only suitable for toy apps" due to returning a time-based sequence: as used in my example, Random() is perfectly thread-safe, unless you're allowing the routine in which you randomize the array to be re-entered, in which case you'll need something like lock (MyRandomArray) anyway in order not to corrupt your data, which will protect rnd as well.
Also, it should be well-understood that System.Random as a source of entropy isn't very strong. As noted in the MSDN documentation, you should use something derived from System.Security.Cryptography.RandomNumberGenerator if you're doing anything security-related. For example:
using System.Security.Cryptography;
...
RNGCryptoServiceProvider rnd = new RNGCryptoServiceProvider();
string[] MyRandomArray = MyArray.OrderBy(x => GetNextInt32(rnd)).ToArray();
...
static int GetNextInt32(RNGCryptoServiceProvider rnd)
{
byte[] randomInt = new byte[4];
rnd.GetBytes(randomInt);
return Convert.ToInt32(randomInt[0]);
}
You're looking for a shuffling algorithm, right?
Okay, there are two ways to do this: the clever-but-people-always-seem-to-misunderstand-it-and-get-it-wrong-so-maybe-its-not-that-clever-after-all way, and the dumb-as-rocks-but-who-cares-because-it-works way.
Dumb way
Create a duplicate of your first array, but tag each string should with a random number.
Sort the duplicate array with respect to the random number.
This algorithm works well, but make sure that your random number generator is unlikely to tag two strings with the same number. Because of the so-called Birthday Paradox, this happens more often than you might expect. Its time complexity is O(n log n).
Clever way
I'll describe this as a recursive algorithm:
To shuffle an array of size n (indices in the range [0..n-1]):
if n = 0
do nothing
if n > 0
(recursive step) shuffle the first n-1 elements of the array
choose a random index, x, in the range [0..n-1]
swap the element at index n-1 with the element at index x
The iterative equivalent is to walk an iterator through the array, swapping with random elements as you go along, but notice that you cannot swap with an element after the one that the iterator points to. This is a very common mistake, and leads to a biased shuffle.
Time complexity is O(n).
This algorithm is simple but not efficient, O(N2). All the "order by" algorithms are typically O(N log N). It probably doesn't make a difference below hundreds of thousands of elements but it would for large lists.
var stringlist = ... // add your values to stringlist
var r = new Random();
var res = new List<string>(stringlist.Count);
while (stringlist.Count >0)
{
var i = r.Next(stringlist.Count);
res.Add(stringlist[i]);
stringlist.RemoveAt(i);
}
The reason why it's O(N2) is subtle: List.RemoveAt() is a O(N) operation unless you remove in order from the end.
You can also make an extention method out of Matt Howells. Example.
namespace System
{
public static class MSSystemExtenstions
{
private static Random rng = new Random();
public static void Shuffle<T>(this T[] array)
{
rng = new Random();
int n = array.Length;
while (n > 1)
{
int k = rng.Next(n);
n--;
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}
}
}
Then you can just use it like:
string[] names = new string[] {
"Aaron Moline1",
"Aaron Moline2",
"Aaron Moline3",
"Aaron Moline4",
"Aaron Moline5",
"Aaron Moline6",
"Aaron Moline7",
"Aaron Moline8",
"Aaron Moline9",
};
names.Shuffle<string>();
Just thinking off the top of my head, you could do this:
public string[] Randomize(string[] input)
{
List<string> inputList = input.ToList();
string[] output = new string[input.Length];
Random randomizer = new Random();
int i = 0;
while (inputList.Count > 0)
{
int index = r.Next(inputList.Count);
output[i++] = inputList[index];
inputList.RemoveAt(index);
}
return (output);
}
Randomizing the array is intensive as you have to shift around a bunch of strings. Why not just randomly read from the array? In the worst case you could even create a wrapper class with a getNextString(). If you really do need to create a random array then you could do something like
for i = 0 -> i= array.length * 5
swap two strings in random places
The *5 is arbitrary.
public static void Shuffle(object[] arr)
{
Random rand = new Random();
for (int i = arr.Length - 1; i >= 1; i--)
{
int j = rand.Next(i + 1);
object tmp = arr[j];
arr[j] = arr[i];
arr[i] = tmp;
}
}
Generate an array of random floats or ints of the same length. Sort that array, and do corresponding swaps on your target array.
This yields a truly independent sort.
Ok, this is clearly a bump from my side (apologizes...), but I often use a quite general and cryptographically strong method.
public static class EnumerableExtensions
{
static readonly RNGCryptoServiceProvider RngCryptoServiceProvider = new RNGCryptoServiceProvider();
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> enumerable)
{
var randomIntegerBuffer = new byte[4];
Func<int> rand = () =>
{
RngCryptoServiceProvider.GetBytes(randomIntegerBuffer);
return BitConverter.ToInt32(randomIntegerBuffer, 0);
};
return from item in enumerable
let rec = new {item, rnd = rand()}
orderby rec.rnd
select rec.item;
}
}
Shuffle() is an extension on any IEnumerable so getting, say, numbers from 0 to 1000 in random order in a list can be done with
Enumerable.Range(0,1000).Shuffle().ToList()
This method also wont give any surprises when it comes to sorting, since the sort value is generated and remembered exactly once per element in the sequence.
Random r = new Random();
List<string> list = new List(originalArray);
List<string> randomStrings = new List();
while(list.Count > 0)
{
int i = r.Random(list.Count);
randomStrings.Add(list[i]);
list.RemoveAt(i);
}
Jacco, your solution ising a custom IComparer isn't safe. The Sort routines require the comparer to conform to several requirements in order to function properly. First among them is consistency. If the comparer is called on the same pair of objects, it must always return the same result. (the comparison must also be transitive).
Failure to meet these requirements can cause any number of problems in the sorting routine including the possibility of an infinite loop.
Regarding the solutions that associate a random numeric value with each entry and then sort by that value, these are lead to an inherent bias in the output because any time two entries are assigned the same numeric value, the randomness of the output will be compromised. (In a "stable" sort routine, whichever is first in the input will be first in the output. Array.Sort doesn't happen to be stable, but there is still a bias based on the partitioning done by the Quicksort algorithm).
You need to do some thinking about what level of randomness you require. If you are running a poker site where you need cryptographic levels of randomness to protect against a determined attacker you have very different requirements from someone who just wants to randomize a song playlist.
For song-list shuffling, there's no problem using a seeded PRNG (like System.Random). For a poker site, it's not even an option and you need to think about the problem a lot harder than anyone is going to do for you on stackoverflow. (using a cryptographic RNG is only the beginning, you need to ensure that your algorithm doesn't introduce a bias, that you have sufficient sources of entropy, and that you don't expose any internal state that would compromise subsequent randomness).
This post has already been pretty well answered - use a Durstenfeld implementation of the Fisher-Yates shuffle for a fast and unbiased result. There have even been some implementations posted, though I note some are actually incorrect.
I wrote a couple of posts a while back about implementing full and partial shuffles using this technique, and (this second link is where I'm hoping to add value) also a follow-up post about how to check whether your implementation is unbiased, which can be used to check any shuffle algorithm. You can see at the end of the second post the effect of a simple mistake in the random number selection can make.
You don't need complicated algorithms.
Just one simple line:
Random random = new Random();
array.ToList().Sort((x, y) => random.Next(-1, 1)).ToArray();
Note that we need to convert the Array to a List first, if you don't use List in the first place.
Also, mind that this is not efficient for very large arrays! Otherwise it's clean & simple.
This is a complete working Console solution based on the example provided in here:
class Program
{
static string[] words1 = new string[] { "brown", "jumped", "the", "fox", "quick" };
static void Main()
{
var result = Shuffle(words1);
foreach (var i in result)
{
Console.Write(i + " ");
}
Console.ReadKey();
}
static string[] Shuffle(string[] wordArray) {
Random random = new Random();
for (int i = wordArray.Length - 1; i > 0; i--)
{
int swapIndex = random.Next(i + 1);
string temp = wordArray[i];
wordArray[i] = wordArray[swapIndex];
wordArray[swapIndex] = temp;
}
return wordArray;
}
}
int[] numbers = {0,1,2,3,4,5,6,7,8,9};
List<int> numList = new List<int>();
numList.AddRange(numbers);
Console.WriteLine("Original Order");
for (int i = 0; i < numList.Count; i++)
{
Console.Write(String.Format("{0} ",numList[i]));
}
Random random = new Random();
Console.WriteLine("\n\nRandom Order");
for (int i = 0; i < numList.Capacity; i++)
{
int randomIndex = random.Next(numList.Count);
Console.Write(String.Format("{0} ", numList[randomIndex]));
numList.RemoveAt(randomIndex);
}
Console.ReadLine();
Could be:
Random random = new();
string RandomWord()
{
const string CHARS = "abcdefghijklmnoprstuvwxyz";
int n = random.Next(CHARS.Length);
return string.Join("", CHARS.OrderBy(x => random.Next()).ToArray())[0..n];
}
Here's a simple way using OLINQ:
// Input array
List<String> lst = new List<string>();
for (int i = 0; i < 500; i += 1) lst.Add(i.ToString());
// Output array
List<String> lstRandom = new List<string>();
// Randomize
Random rnd = new Random();
lstRandom.AddRange(from s in lst orderby rnd.Next(100) select s);
private ArrayList ShuffleArrayList(ArrayList source)
{
ArrayList sortedList = new ArrayList();
Random generator = new Random();
while (source.Count > 0)
{
int position = generator.Next(source.Count);
sortedList.Add(source[position]);
source.RemoveAt(position);
}
return sortedList;
}
This question already has answers here:
Random number generator only generating one random number
(15 answers)
Closed 7 years ago.
Consider this method:
private static int GenerateRandomNumber(int seed, int max)
{
return new Random(seed).Next(max);
}
On my machine, executing this loop yields the same number through 1500 iterations:
for (int i = 0; i < 1501; i++)
{
int random = GenerateRandomNumber(100000000, 999999999);
Console.WriteLine(random.ToString());
Console.ReadKey();
}
I get 145156561, for every single iteration.
I don't have a pressing issue, I was just curious about this behavior because .Next(max) says "Returns a Non Negative random number less than the specified maximum. Perhaps I am not understanding something basic.
You're always seeding a new instance with the same seed, and then grabbing the first max. By using a Seed, you're guaranteeing the same results.
If you want to have a static, random number generation that does different results, you should rework this a bit. However, since Random is not threadsafe, it requires some synchronization when used statically. Something like:
private static Random random;
private static object syncObj = new object();
private static void InitRandomNumber(int seed)
{
random = new Random(seed);
}
private static int GenerateRandomNumber(int max)
{
lock(syncObj)
{
if (random == null)
random = new Random(); // Or exception...
return random.Next(max);
}
}
Dilbert has encountered the same problem back in 2001:
http://dilbert.com/strips/comic/2001-10-25/
Coincidence?
I don't think so.
And random.org agrees : http://www.random.org/analysis/
The problem is that you are creating a new Random instance with the same seed number each time. You should create a single Random instance (store it in a static if necessary) and simply call the next method on that same instance.
Random number generation is not truly random, see this Wikipedia entry for more details.
Salam to All,
Well it drove me crazy as well. The answer is simple. Change the seed before you generate random.
Example:
I want to generate random number between 1 to 10
Random rnd = new Random(DateTime.Now.Second);
int random_number = rnd.Next(10);
Put it inside a loop and run it three times. It will give out random numbers below 10.
Pseudo-random number generator usually work by choosing a seed, and then generating a deterministic sequence based on that seed. Choosing the same seed every time, you generate the same sequence.
There are "only" 2^32 different random sequences in .NET.
Not sure how the internals work.. check wiki for it, but it's very simple.
public class MathCalculations
{
private Random rnd = new Random();
public Int32 getRandom(Int32 iMin, Int32 iMax)
{
return rnd.Next(iMin, iMax);
}
}
public class Main
{
MathCalculations mathCalculations = new MathCalculations();
for (int i = 0; i < 6; i++)
{
getRandom(0,1000);
}
}
will generate Number1, Number2, Number3, Number4, Number5, Number6 (1 seed, 1 sequence of many numbers, random*not really, but approx.*)
if you however do this:
public class MathCalculations
{
public Int32 getRandom(Int32 iMin, Int32 iMax)
{
Random rnd = new Random();
return rnd.Next(iMin, iMax);
}
}
public class Main
{
MathCalculations mathCalculations = new MathCalculations();
for (int i = 0; i < 6; i++)
{
getRandom(0,1000);
}
}
You will now get Number1, Number1, Number1, Number1, Number1, Number1 (1 seed, 6 equal sequences of many numbers, always pick the same starting number from each equal sequence).. At some point Number1 will be different, because the seed changes over time.. but you need to wait some time for this, nonetheless, you never pick number2 from the sequence.
The reason is, each time you generate a new sequence with the same seed, hence the sequence is the same over and over again, and each time your random generated will pick the first number in it's sequence, which, with the same seed, is of course always the same.
Not sure if this is technically correct by the underlying methods of the random generator, but that's how it behaves.
In the event that anyone is looking for a "quick and dirty" "solution" (and I use that term with caution) then this will suffice for most.
int secondsSinceMidnight = Convert.ToInt32(DateTime.Now.Subtract(DateTime.Today).TotalSeconds);
Random rand = new Random(secondsSinceMidnight);
var usuallyRandomId = rand.Next();
Please note my use of usually random. I agree that the item marked as the answer is a more correct way of doing this.
I want to produce a selection of random numbers in a range between 1 and 30, ideally this number will alter (to a max of +10) but will always have a range of 1 - 30. I want to be able to do this several times when needed. The numbers that I want to produce within the range will vary in numbers eg maybe I will want only 2 or 5 numbers.
I thought I should produce a static class and use the same random instance with a method that accepts an integer which indicated the total number of numbers I require within the range? Obviously the numbers should be no identical numbers returned from the method for each call. But if one method call produces the same numbers as a previous call then thats fine, but ideally they should differ.
I'm not sure how I would code this or if I'm completely wrong?
Sample code:
public static class getMyNumbers
{
private static Random random = new Random();
public static int[] getThese(int i)
{
int[] wanted = new int[i];
// a loop to generate the numbers???
// this bit I'm not sure about the syntax
// new to c#
return wanted
}
}
Thanks
You just need to loop putting the numbers into the array:
public static int[] getThese(int i)
{
int[] wanted = new int[i];
for (int j = 0; j < i; ++j)
wanted[j] = random.Next(31);
return wanted;
}
Note that the parameter to random.Next() is an exclusive upper range, so passing 31 will generate random numbers between 0 and 30.
As an aside, note that it's customary to use n for a count and i for a loop variable, so it would probably be better to name your variables thus:
public static int[] GetThese(int n)
{
int[] result = new int[n];
for (int i = 0; i < n; ++i)
result[i] = random.Next(31);
return result;
}
Just because everybody loves Linq:
private static Random random = new Random();
public IEnumerable<int> GetRandomInts(int Amount, int Max = 30)
{
return Enumerable.Range(0, Amount).Select(a => random.Next(Max)+1);
}
For no duplicates...
return Enumerable.Range(1, Max).OrderBy(a => random.Next()).Take(Amount);
I need to create a list of numbers from a range (for example from x to y) in a random order so that every order has an equal chance.
I need this for a music player I write in C#, to create play lists in a random order.
Any ideas?
Thanks.
EDIT: I'm not interested in changing the original list, just pick up random indexes from a range in a random order so that every order has an equal chance.
Here's what I've wrriten so far:
public static IEnumerable<int> RandomIndexes(int count)
{
if (count > 0)
{
int[] indexes = new int[count];
int indexesCountMinus1 = count - 1;
for (int i = 0; i < count; i++)
{
indexes[i] = i;
}
Random random = new Random();
while (indexesCountMinus1 > 0)
{
int currIndex = random.Next(0, indexesCountMinus1 + 1);
yield return indexes[currIndex];
indexes[currIndex] = indexes[indexesCountMinus1];
indexesCountMinus1--;
}
yield return indexes[0];
}
}
It's working, but the only problem of this is that I need to allocate an array in the memory in the size of count. I'm looking for something that dose not require memory allocation.
Thanks.
This can actually be tricky if you're not careful (i.e., using a naïve shuffling algorithm). Take a look at the Fisher-Yates/Knuth shuffle algorithm for proper distribution of values.
Once you have the shuffling algorithm, the rest should be easy.
Here's more detail from Jeff Atwood.
Lastly, here's Jon Skeet's implementation and description.
EDIT
I don't believe that there's a solution that satisfies your two conflicting requirements (first, to be random with no repeats and second to not allocate any additional memory). I believe you may be prematurely optimizing your solution as the memory implications should be negligible, unless you're embedded. Or, perhaps I'm just not smart enough to come up with an answer.
With that, here's code that will create an array of evenly distributed random indexes using the Knuth-Fisher-Yates algorithm (with a slight modification). You can cache the resulting array, or perform any number of optimizations depending on the rest of your implementation.
private static int[] BuildShuffledIndexArray( int size ) {
int[] array = new int[size];
Random rand = new Random();
for ( int currentIndex = array.Length - 1; currentIndex > 0; currentIndex-- ) {
int nextIndex = rand.Next( currentIndex + 1 );
Swap( array, currentIndex, nextIndex );
}
return array;
}
private static void Swap( IList<int> array, int firstIndex, int secondIndex ) {
if ( array[firstIndex] == 0 ) {
array[firstIndex] = firstIndex;
}
if ( array[secondIndex] == 0 ) {
array[secondIndex] = secondIndex;
}
int temp = array[secondIndex];
array[secondIndex] = array[firstIndex];
array[firstIndex] = temp;
}
NOTE: You can use ushort instead of int to half the size in memory as long as you don't have more than 65,535 items in your playlist. You could always programmatically switch to int if the size exceeds ushort.MaxValue. If I, personally, added more than 65K items to a playlist, I wouldn't be shocked by increased memory utilization.
Remember, too, that this is a managed language. The VM will always reserve more memory than you are using to limit the number of times it needs to ask the OS for more RAM and to limit fragmentation.
EDIT
Okay, last try: we can look to tweak the performance/memory trade off: You could create your list of integers, then write it to disk. Then just keep a pointer to the offset in the file. Then every time you need a new number, you just have disk I/O to deal with. Perhaps you can find some balance here, and just read N-sized blocks of data into memory where N is some number you're comfortable with.
Seems like a lot of work for a shuffle algorithm, but if you're dead-set on conserving memory, then at least it's an option.
If you use a maximal linear feedback shift register, you will use O(1) of memory and roughly O(1) time. See here for a handy C implementation (two lines! woo-hoo!) and tables of feedback terms to use.
And here is a solution:
public class MaximalLFSR
{
private int GetFeedbackSize(uint v)
{
uint r = 0;
while ((v >>= 1) != 0)
{
r++;
}
if (r < 4)
r = 4;
return (int)r;
}
static uint[] _feedback = new uint[] {
0x9, 0x17, 0x30, 0x44, 0x8e,
0x108, 0x20d, 0x402, 0x829, 0x1013, 0x203d, 0x4001, 0x801f,
0x1002a, 0x2018b, 0x400e3, 0x801e1, 0x10011e, 0x2002cc, 0x400079, 0x80035e,
0x1000160, 0x20001e4, 0x4000203, 0x8000100, 0x10000235, 0x2000027d, 0x4000016f, 0x80000478
};
private uint GetFeedbackTerm(int bits)
{
if (bits < 4 || bits >= 28)
throw new ArgumentOutOfRangeException("bits");
return _feedback[bits];
}
public IEnumerable<int> RandomIndexes(int count)
{
if (count < 0)
throw new ArgumentOutOfRangeException("count");
int bitsForFeedback = GetFeedbackSize((uint)count);
Random r = new Random();
uint i = (uint)(r.Next(1, count - 1));
uint feedback = GetFeedbackTerm(bitsForFeedback);
int valuesReturned = 0;
while (valuesReturned < count)
{
if ((i & 1) != 0)
{
i = (i >> 1) ^ feedback;
}
else {
i = (i >> 1);
}
if (i <= count)
{
valuesReturned++;
yield return (int)(i-1);
}
}
}
}
Now, I selected the feedback terms (badly) at random from the link above. You could also implement a version that had multiple maximal terms and you select one of those at random, but you know what? This is pretty dang good for what you want.
Here is test code:
static void Main(string[] args)
{
while (true)
{
Console.Write("Enter a count: ");
string s = Console.ReadLine();
int count;
if (Int32.TryParse(s, out count))
{
MaximalLFSR lfsr = new MaximalLFSR();
foreach (int i in lfsr.RandomIndexes(count))
{
Console.Write(i + ", ");
}
}
Console.WriteLine("Done.");
}
}
Be aware that maximal LFSR's never generate 0. I've hacked around this by returning the i term - 1. This works well enough. Also, since you want to guarantee uniqueness, I ignore anything out of range - the LFSR only generates sequences up to powers of two, so in high ranges, it will generate wost case 2x-1 too many values. These will get skipped - that will still be faster than FYK.
Personally, for a music player, I wouldn't generate a shuffled list, and then play that, then generate another shuffled list when that runs out, but do something more like:
IEnumerable<Song> GetSongOrder(List<Song> allSongs)
{
var playOrder = new List<Song>();
while (true)
{
// this step assigns an integer weight to each song,
// corresponding to how likely it is to be played next.
// in a better implementation, this would look at the total number of
// songs as well, and provide a smoother ramp up/down.
var weights = allSongs.Select(x => playOrder.LastIndexOf(x) > playOrder.Length - 10 ? 50 : 1);
int position = random.Next(weights.Sum());
foreach (int i in Enumerable.Range(allSongs.Length))
{
position -= weights[i];
if (position < 0)
{
var song = allSongs[i];
playOrder.Add(song);
yield return song;
break;
}
}
// trim playOrder to prevent infinite memory here as well.
if (playOrder.Length > allSongs.Length * 10)
playOrder = playOrder.Skip(allSongs.Length * 8).ToList();
}
}
This would make songs picked in order, as long as they haven't been recently played. This provides "smoother" transitions from the end of one shuffle to the next, because the first song of the next shuffle could be the same song as the last shuffle with 1/(total songs) probability, whereas this algorithm has a lower (and configurable) chance of hearing one of the last x songs again.
Unless you shuffle the original song list (which you said you don't want to do), you are going to have to allocate some additional memory to accomplish what you're after.
If you generate the random permutation of song indices beforehand (as you are doing), you obviously have to allocate some non-trivial amount of memory to store it, either encoded or as a list.
If the user doesn't need to be able to see the list, you could generate the random song order on the fly: After each song, pick another random song from the pool of unplayed songs. You still have to keep track of which songs have already been played, but you can use a bitfield for that. If you have 10000 songs, you just need 10000 bits (1250 bytes), each one representing whether the song has been played yet.
I don't know your exact limitations, but I have to wonder if the memory required to store a playlist is significant compared to the amount required for playing audio.
There are a number of methods of generating permutations without needing to store the state. See this question.
I think you should stick to your current solution (the one in your edit).
To do a re-order with no repetitions & not making your code behave unreliable, you have to track what you have already used / like by keeping unused indexes or indirectly by swapping from the original list.
I suggest to check it in the context of the working application i.e. if its of any significance vs. the memory used by other pieces of the system.
From a logical standpoint, it is possible. Given a list of n songs, there are n! permutations; if you assign each permutation a number from 1 to n! (or 0 to n!-1 :-D) and pick one of those numbers at random, you can then store the number of the permutation that you are currently using, along with the original list and the index of the current song within the permutation.
For example, if you have a list of songs {1, 2, 3}, your permutations are:
0: {1, 2, 3}
1: {1, 3, 2}
2: {2, 1, 3}
3: {2, 3, 1}
4: {3, 1, 2}
5: {3, 2, 1}
So the only data I need to track is the original list ({1, 2, 3}), the current song index (e.g. 1) and the index of the permutation (e.g. 3). Then, if I want to find the next song to play, I know it's third (2, but zero-based) song of permutation 3, e.g. Song 1.
However, this method relies on you having an efficient means of determining the ith song of the jth permutation, which until I've had chance to think (or someone with a stronger mathematical background than I can interject) is equivalent to "then a miracle happens". But the principle is there.
If memory was really a concern after a certain number of records and it's safe to say that if that memory boundary is reached, there's enough items in the list to not matter if there are some repeats, just as long as the same song was not repeated twice, I would use a combination method.
Case 1: If count < max memory constraint, generate the playlist ahead of time and use Knuth shuffle (see Jon Skeet's implementation, mentioned in other answers).
Case 2: If count >= max memory constraint, the song to be played will be determined at run time (I'd do it as soon as the song starts playing so the next song is already generated by the time the current song ends). Save the last [max memory constraint, or some token value] number of songs played, generate a random number (R) between 1 and song count, and if R = one of X last songs played, generate a new R until it is not in the list. Play that song.
Your max memory constraints will always be upheld, although performance can suffer in case 2 if you've played a lot of songs/get repeat random numbers frequently by chance.
you could use a trick we do in sql server to order sets in random like this with the use of guid. the values are always distributed equaly random.
private IEnumerable<int> RandomIndexes(int startIndexInclusive, int endIndexInclusive)
{
if (endIndexInclusive < startIndexInclusive)
throw new Exception("endIndex must be equal or higher than startIndex");
List<int> originalList = new List<int>(endIndexInclusive - startIndexInclusive);
for (int i = startIndexInclusive; i <= endIndexInclusive; i++)
originalList.Add(i);
return from i in originalList
orderby Guid.NewGuid()
select i;
}
You're going to have to allocate some memory, but it doesn't have to be a lot. You can reduce the memory footprint (the degree by which I'm unsure, as I don't know that much about the guts of C#) by using a bool array instead of int. Best case scenario this will only use (count / 8) bytes of memory, which isn't too bad (but I doubt C# actually represents bools as single bits).
public static IEnumerable<int> RandomIndexes(int count) {
Random rand = new Random();
bool[] used = new bool[count];
int i;
for (int counter = 0; counter < count; counter++) {
while (used[i = rand.Next(count)]); //i = some random unused value
used[i] = true;
yield return i;
}
}
Hope that helps!
As many others have said you should implement THEN optimize, and only optimize the parts that need it (which you check on with a profiler). I offer a (hopefully) elegant method of getting the list you need, which doesn't really care so much about performance:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Test
{
class Program
{
static void Main(string[] a)
{
Random random = new Random();
List<int> list1 = new List<int>(); //source list
List<int> list2 = new List<int>();
list2 = random.SequenceWhile((i) =>
{
if (list2.Contains(i))
{
return false;
}
list2.Add(i);
return true;
},
() => list2.Count == list1.Count,
list1.Count).ToList();
}
}
public static class RandomExtensions
{
public static IEnumerable<int> SequenceWhile(
this Random random,
Func<int, bool> shouldSkip,
Func<bool> continuationCondition,
int maxValue)
{
int current = random.Next(maxValue);
while (continuationCondition())
{
if (!shouldSkip(current))
{
yield return current;
}
current = random.Next(maxValue);
}
}
}
}
It is pretty much impossible to do it without allocating extra memory. If you're worried about the amount of extra memory allocated, you could always pick a random subset and shuffle between those. You'll get repeats before every song is played, but with a sufficiently large subset I'll warrant few people will notice.
const int MaxItemsToShuffle = 20;
public static IEnumerable<int> RandomIndexes(int count)
{
Random random = new Random();
int indexCount = Math.Min(count, MaxItemsToShuffle);
int[] indexes = new int[indexCount];
if (count > MaxItemsToShuffle)
{
int cur = 0, subsetCount = MaxItemsToShuffle;
for (int i = 0; i < count; i += 1)
{
if (random.NextDouble() <= ((float)subsetCount / (float)(count - i + 1)))
{
indexes[cur] = i;
cur += 1;
subsetCount -= 1;
}
}
}
else
{
for (int i = 0; i < count; i += 1)
{
indexes[i] = i;
}
}
for (int i = indexCount; i > 0; i -= 1)
{
int curIndex = random.Next(0, i);
yield return indexes[curIndex];
indexes[curIndex] = indexes[i - 1];
}
}
What is the best way to randomize an array of strings with .NET? My array contains about 500 strings and I'd like to create a new Array with the same strings but in a random order.
Please include a C# example in your answer.
The following implementation uses the Fisher-Yates algorithm AKA the Knuth Shuffle. It runs in O(n) time and shuffles in place, so is better performing than the 'sort by random' technique, although it is more lines of code. See here for some comparative performance measurements. I have used System.Random, which is fine for non-cryptographic purposes.*
static class RandomExtensions
{
public static void Shuffle<T> (this Random rng, T[] array)
{
int n = array.Length;
while (n > 1)
{
int k = rng.Next(n--);
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}
}
Usage:
var array = new int[] {1, 2, 3, 4};
var rng = new Random();
rng.Shuffle(array);
rng.Shuffle(array); // different order from first call to Shuffle
* For longer arrays, in order to make the (extremely large) number of permutations equally probable it would be necessary to run a pseudo-random number generator (PRNG) through many iterations for each swap to produce enough entropy. For a 500-element array only a very small fraction of the possible 500! permutations will be possible to obtain using a PRNG. Nevertheless, the Fisher-Yates algorithm is unbiased and therefore the shuffle will be as good as the RNG you use.
If you're on .NET 3.5, you can use the following IEnumerable coolness:
Random rnd=new Random();
string[] MyRandomArray = MyArray.OrderBy(x => rnd.Next()).ToArray();
Edit: and here's the corresponding VB.NET code:
Dim rnd As New System.Random
Dim MyRandomArray = MyArray.OrderBy(Function() rnd.Next()).ToArray()
Second edit, in response to remarks that System.Random "isn't threadsafe" and "only suitable for toy apps" due to returning a time-based sequence: as used in my example, Random() is perfectly thread-safe, unless you're allowing the routine in which you randomize the array to be re-entered, in which case you'll need something like lock (MyRandomArray) anyway in order not to corrupt your data, which will protect rnd as well.
Also, it should be well-understood that System.Random as a source of entropy isn't very strong. As noted in the MSDN documentation, you should use something derived from System.Security.Cryptography.RandomNumberGenerator if you're doing anything security-related. For example:
using System.Security.Cryptography;
...
RNGCryptoServiceProvider rnd = new RNGCryptoServiceProvider();
string[] MyRandomArray = MyArray.OrderBy(x => GetNextInt32(rnd)).ToArray();
...
static int GetNextInt32(RNGCryptoServiceProvider rnd)
{
byte[] randomInt = new byte[4];
rnd.GetBytes(randomInt);
return Convert.ToInt32(randomInt[0]);
}
You're looking for a shuffling algorithm, right?
Okay, there are two ways to do this: the clever-but-people-always-seem-to-misunderstand-it-and-get-it-wrong-so-maybe-its-not-that-clever-after-all way, and the dumb-as-rocks-but-who-cares-because-it-works way.
Dumb way
Create a duplicate of your first array, but tag each string should with a random number.
Sort the duplicate array with respect to the random number.
This algorithm works well, but make sure that your random number generator is unlikely to tag two strings with the same number. Because of the so-called Birthday Paradox, this happens more often than you might expect. Its time complexity is O(n log n).
Clever way
I'll describe this as a recursive algorithm:
To shuffle an array of size n (indices in the range [0..n-1]):
if n = 0
do nothing
if n > 0
(recursive step) shuffle the first n-1 elements of the array
choose a random index, x, in the range [0..n-1]
swap the element at index n-1 with the element at index x
The iterative equivalent is to walk an iterator through the array, swapping with random elements as you go along, but notice that you cannot swap with an element after the one that the iterator points to. This is a very common mistake, and leads to a biased shuffle.
Time complexity is O(n).
This algorithm is simple but not efficient, O(N2). All the "order by" algorithms are typically O(N log N). It probably doesn't make a difference below hundreds of thousands of elements but it would for large lists.
var stringlist = ... // add your values to stringlist
var r = new Random();
var res = new List<string>(stringlist.Count);
while (stringlist.Count >0)
{
var i = r.Next(stringlist.Count);
res.Add(stringlist[i]);
stringlist.RemoveAt(i);
}
The reason why it's O(N2) is subtle: List.RemoveAt() is a O(N) operation unless you remove in order from the end.
You can also make an extention method out of Matt Howells. Example.
namespace System
{
public static class MSSystemExtenstions
{
private static Random rng = new Random();
public static void Shuffle<T>(this T[] array)
{
rng = new Random();
int n = array.Length;
while (n > 1)
{
int k = rng.Next(n);
n--;
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}
}
}
Then you can just use it like:
string[] names = new string[] {
"Aaron Moline1",
"Aaron Moline2",
"Aaron Moline3",
"Aaron Moline4",
"Aaron Moline5",
"Aaron Moline6",
"Aaron Moline7",
"Aaron Moline8",
"Aaron Moline9",
};
names.Shuffle<string>();
Just thinking off the top of my head, you could do this:
public string[] Randomize(string[] input)
{
List<string> inputList = input.ToList();
string[] output = new string[input.Length];
Random randomizer = new Random();
int i = 0;
while (inputList.Count > 0)
{
int index = r.Next(inputList.Count);
output[i++] = inputList[index];
inputList.RemoveAt(index);
}
return (output);
}
Randomizing the array is intensive as you have to shift around a bunch of strings. Why not just randomly read from the array? In the worst case you could even create a wrapper class with a getNextString(). If you really do need to create a random array then you could do something like
for i = 0 -> i= array.length * 5
swap two strings in random places
The *5 is arbitrary.
public static void Shuffle(object[] arr)
{
Random rand = new Random();
for (int i = arr.Length - 1; i >= 1; i--)
{
int j = rand.Next(i + 1);
object tmp = arr[j];
arr[j] = arr[i];
arr[i] = tmp;
}
}
Generate an array of random floats or ints of the same length. Sort that array, and do corresponding swaps on your target array.
This yields a truly independent sort.
Ok, this is clearly a bump from my side (apologizes...), but I often use a quite general and cryptographically strong method.
public static class EnumerableExtensions
{
static readonly RNGCryptoServiceProvider RngCryptoServiceProvider = new RNGCryptoServiceProvider();
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> enumerable)
{
var randomIntegerBuffer = new byte[4];
Func<int> rand = () =>
{
RngCryptoServiceProvider.GetBytes(randomIntegerBuffer);
return BitConverter.ToInt32(randomIntegerBuffer, 0);
};
return from item in enumerable
let rec = new {item, rnd = rand()}
orderby rec.rnd
select rec.item;
}
}
Shuffle() is an extension on any IEnumerable so getting, say, numbers from 0 to 1000 in random order in a list can be done with
Enumerable.Range(0,1000).Shuffle().ToList()
This method also wont give any surprises when it comes to sorting, since the sort value is generated and remembered exactly once per element in the sequence.
Random r = new Random();
List<string> list = new List(originalArray);
List<string> randomStrings = new List();
while(list.Count > 0)
{
int i = r.Random(list.Count);
randomStrings.Add(list[i]);
list.RemoveAt(i);
}
Jacco, your solution ising a custom IComparer isn't safe. The Sort routines require the comparer to conform to several requirements in order to function properly. First among them is consistency. If the comparer is called on the same pair of objects, it must always return the same result. (the comparison must also be transitive).
Failure to meet these requirements can cause any number of problems in the sorting routine including the possibility of an infinite loop.
Regarding the solutions that associate a random numeric value with each entry and then sort by that value, these are lead to an inherent bias in the output because any time two entries are assigned the same numeric value, the randomness of the output will be compromised. (In a "stable" sort routine, whichever is first in the input will be first in the output. Array.Sort doesn't happen to be stable, but there is still a bias based on the partitioning done by the Quicksort algorithm).
You need to do some thinking about what level of randomness you require. If you are running a poker site where you need cryptographic levels of randomness to protect against a determined attacker you have very different requirements from someone who just wants to randomize a song playlist.
For song-list shuffling, there's no problem using a seeded PRNG (like System.Random). For a poker site, it's not even an option and you need to think about the problem a lot harder than anyone is going to do for you on stackoverflow. (using a cryptographic RNG is only the beginning, you need to ensure that your algorithm doesn't introduce a bias, that you have sufficient sources of entropy, and that you don't expose any internal state that would compromise subsequent randomness).
This post has already been pretty well answered - use a Durstenfeld implementation of the Fisher-Yates shuffle for a fast and unbiased result. There have even been some implementations posted, though I note some are actually incorrect.
I wrote a couple of posts a while back about implementing full and partial shuffles using this technique, and (this second link is where I'm hoping to add value) also a follow-up post about how to check whether your implementation is unbiased, which can be used to check any shuffle algorithm. You can see at the end of the second post the effect of a simple mistake in the random number selection can make.
You don't need complicated algorithms.
Just one simple line:
Random random = new Random();
array.ToList().Sort((x, y) => random.Next(-1, 1)).ToArray();
Note that we need to convert the Array to a List first, if you don't use List in the first place.
Also, mind that this is not efficient for very large arrays! Otherwise it's clean & simple.
This is a complete working Console solution based on the example provided in here:
class Program
{
static string[] words1 = new string[] { "brown", "jumped", "the", "fox", "quick" };
static void Main()
{
var result = Shuffle(words1);
foreach (var i in result)
{
Console.Write(i + " ");
}
Console.ReadKey();
}
static string[] Shuffle(string[] wordArray) {
Random random = new Random();
for (int i = wordArray.Length - 1; i > 0; i--)
{
int swapIndex = random.Next(i + 1);
string temp = wordArray[i];
wordArray[i] = wordArray[swapIndex];
wordArray[swapIndex] = temp;
}
return wordArray;
}
}
int[] numbers = {0,1,2,3,4,5,6,7,8,9};
List<int> numList = new List<int>();
numList.AddRange(numbers);
Console.WriteLine("Original Order");
for (int i = 0; i < numList.Count; i++)
{
Console.Write(String.Format("{0} ",numList[i]));
}
Random random = new Random();
Console.WriteLine("\n\nRandom Order");
for (int i = 0; i < numList.Capacity; i++)
{
int randomIndex = random.Next(numList.Count);
Console.Write(String.Format("{0} ", numList[randomIndex]));
numList.RemoveAt(randomIndex);
}
Console.ReadLine();
Could be:
Random random = new();
string RandomWord()
{
const string CHARS = "abcdefghijklmnoprstuvwxyz";
int n = random.Next(CHARS.Length);
return string.Join("", CHARS.OrderBy(x => random.Next()).ToArray())[0..n];
}
Here's a simple way using OLINQ:
// Input array
List<String> lst = new List<string>();
for (int i = 0; i < 500; i += 1) lst.Add(i.ToString());
// Output array
List<String> lstRandom = new List<string>();
// Randomize
Random rnd = new Random();
lstRandom.AddRange(from s in lst orderby rnd.Next(100) select s);
private ArrayList ShuffleArrayList(ArrayList source)
{
ArrayList sortedList = new ArrayList();
Random generator = new Random();
while (source.Count > 0)
{
int position = generator.Next(source.Count);
sortedList.Add(source[position]);
source.RemoveAt(position);
}
return sortedList;
}