How do I use First() in LINQ but random? - c#

In a list like this:
var colors = new List<string>{"green", "red", "blue", "black","purple"};
I can get the first value like this:
var color = colors.First(c => c.StartsWidth("b")); //This will return the string with "blue"
Bot how do I do it, if I want want a random value matching the conditions? For example something like this:
Debug.log(colors.RandomFirst(c => c.StartsWidth("b"))) // Prints out black
Debug.log(colors.RandomFirst(c => c.StartsWidth("b"))) // Prints out black
Debug.log(colors.RandomFirst(c => c.StartsWidth("b"))) // Prints out blue
Debug.log(colors.RandomFirst(c => c.StartsWidth("b"))) // Prints out black
As in if there are multiple entries in the list matching the condition, i want to pull one of them randomly.
It has (I need it to be) to be an inline solution.
Thank you.

Random ordering then:
var rnd = new Random();
var color = colors.Where(c => c.StartsWith("b"))
.OrderBy(x => rnd.Next())
.First();
The above generates a random number for each element and sorts the results by that number.
You propbably won't notice a random results if you have only 2 elements matching your condition. But you can try the below sample (using the extension method below):
var colors = Enumerable.Range(0, 100).Select(i => "b" + i);
var rnd = new Random();
for (int i = 0; i < 5; i++)
{
Console.WriteLine(colors.RandomFirst(x => x.StartsWith("b"), rnd));
}
Output:
b23
b73
b27
b11
b8
You can create an extension method out of this called RandomFirst:
public static class MyExtensions
{
public static T RandomFirst<T>(this IEnumerable<T> source, Func<T, bool> predicate,
Random rnd)
{
return source.Where(predicate).OrderBy(i => rnd.Next()).First();
}
}
Usage:
var rnd = new Random();
var color1 = colors.RandomFirst(x => x.StartsWith("b"), rnd);
var color2 = colors.RandomFirst(x => x.StartsWith("b"), rnd);
var color3 = colors.RandomFirst(x => x.StartsWith("b"), rnd);
Optimization:
If you're worried about performance, you can try this optimized method (cuts the time to half for large lists):
public static T RandomFirstOptimized<T>(this IEnumerable<T> source,
Func<T, bool> predicate, Random rnd)
{
var matching = source.Where(predicate);
int matchCount = matching.Count();
if (matchCount == 0)
matching.First(); // force the exception;
return matching.ElementAt(rnd.Next(0, matchCount));
}

In case you have IList<T> you could also write a tiny extension method to pick a random element:
static class IListExtensions
{
private static Random _rnd = new Random();
public static void PickRandom<T>(this IList<T> items) =>
return items[_rnd.Next(items.Count)];
}
and use it like this:
var color = colors.Where(c => c.StartsWith("b")).ToList().PickRandom();

Another implementation is to extract all possible colors (sample) and take random one from them:
// Simplest, but not thread safe
private static Random random = new Random();
...
// All possible colors: [blue, black]
var sample = colors
.Where(c => c.StartsWidth("b"))
.ToArray();
var color = sample[random.Next(sample.Length)];

Simple way for short sequences if you don't mind iterating the sequence twice:
var randomItem = sequence.Skip(rng.Next(sequence.Count())).First();
For example (error handling elided for clarity):
var colors = new List<string> { "bronze", "green", "red", "blue", "black", "purple", "brown" };
var rng = new Random();
for (int i = 0; i < 10; ++i)
{
var sequence = colors.Where(c => c.StartsWith("b"));
var randomItem = sequence.Skip(rng.Next(sequence.Count())).First();
Console.WriteLine(randomItem);
}
This is an O(N) solution, but requires that the sequence is iterated once to get the count, then again to select a random item.
More complex solution using Reservoir Sampling suitable for long sequences
You can randomly select N items from a sequence of an unknown length in a single pass (O(N)) without resorting to expensive sorting, using a method known as Reservoir Sampling.
You would especially want to use Reservoir Sampling when:
The number of items to randomly choose from is large
The number of items to randomly choose from is unknown in advance
The number of items to randomly choose is small compared to the number of items to choose from
although you can use it for other situations too.
Here's a sample implementation:
/// <summary>
/// This uses Reservoir Sampling to select <paramref name="n"/> items from a sequence of items of unknown length.
/// The sequence must contain at least <paramref name="n"/> items.
/// </summary>
/// <typeparam name="T">The type of items in the sequence from which to randomly choose.</typeparam>
/// <param name="items">The sequence of items from which to randomly choose.</param>
/// <param name="n">The number of items to randomly choose<paramref name="items"/>.</param>
/// <param name="rng">A random number generator.</param>
/// <returns>The randomly chosen items.</returns>
public static T[] RandomlySelectedItems<T>(IEnumerable<T> items, int n, System.Random rng)
{
var result = new T[n];
int index = 0;
int count = 0;
foreach (var item in items)
{
if (index < n)
{
result[count++] = item;
}
else
{
int r = rng.Next(0, index + 1);
if (r < n)
result[r] = item;
}
++index;
}
if (index < n)
throw new ArgumentException("Input sequence too short");
return result;
}
For your case, you will need to pass n as 1, and you will receive an array of size 1.
You could use it like this (but note that this has no error checking, in the case that colors.Where(c => c.StartsWith("b") returns an empty sequence):
var colors = new List<string> { "green", "red", "blue", "black", "purple" };
var rng = new Random();
for (int i = 0; i < 10; ++i)
Console.WriteLine(RandomlySelectedItems(colors.Where(c => c.StartsWith("b")), 1, rng)[0]);
However, if you want to call this multiple times rather than just once, then you would be better off shuffling the array and accessing the first N items in the shuffled array. (It's hard to tell what your actual usage pattern will be from the question.)

I have created these two RandomOrDefault that are optimized to work on IList. One with predicate and one without it.
/// <summary>
/// Get a random element in the list
/// </summary>
public static TSource RandomOrDefault<TSource>(this IList<TSource> source)
{
if (source == null || source.Count == 0)
return default;
if (source.Count == 1)
return source[0];
var rand = new Random();
return source[rand.Next(source.Count)];
}
/// <summary>
/// Get a random element in the list that satisfies a condition
/// </summary>
public static TSource RandomOrDefault<TSource>(this IList<TSource> source, Func<TSource, bool> predicate)
{
if (source == null || source.Count == 0)
return default;
if (source.Count == 1)
{
var first = source[0];
if (predicate(first))
return first;
return default;
}
var matching = source.Where(predicate);
int matchCount = matching.Count();
if (matchCount == 0)
return default;
var rand = new Random();
return matching.ElementAt(rand.Next(matchCount));
}

Related

algorithm for selecting N random elements from a List<T> in C# [duplicate]

This question already has answers here:
Randomize a List<T>
(28 answers)
Closed 6 years ago.
I need a quick algorithm to select 4 random elements from a generic list. For example, I'd like to get 4 random elements from a List and then based on some calculations if elements found not valid then it should again select next 4 random elements from the list.
You could do it like this
public static class Extensions
{
public static Dictionary<int, T> GetRandomElements<T>(this IList<T> list, int quantity)
{
var result = new Dictionary<int, T>();
if (list == null)
return result;
Random rnd = new Random(DateTime.Now.Millisecond);
for (int i = 0; i < quantity; i++)
{
int idx = rnd.Next(0, list.Count);
result.Add(idx, list[idx]);
}
return result;
}
}
Then use the extension method like this:
List<string> list = new List<string>() { "a", "b", "c", "d", "e", "f", "g", "h" };
Dictionary<int, string> randomElements = list.GetRandomElements(3);
foreach (KeyValuePair<int, string> elem in randomElements)
{
Console.WriteLine($"index in original list: {elem.Key} value: {elem.Value}");
}
something like that:
using System;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(5);
int n = 4;
var rand = new Random();
var randomObjects = new List<int>();
for (int i = 0; i<n; i++)
{
var index = rand.Next(list.Count);
randomObjects.Add(list[index]);
}
}
}
You can store indexes in some list to get non-repeated indexes:
List<T> GetRandomElements<T>(List<T> allElements, int randomCount = 4)
{
if (allElements.Count < randomCount)
{
return allElements;
}
List<int> indexes = new List<int>();
// use HashSet if performance is very critical and you need a lot of indexes
//HashSet<int> indexes = new HashSet<int>();
List<T> elements = new List<T>();
Random random = new Random();
while (indexes.Count < randomCount)
{
int index = random.Next(allElements.Count);
if (!indexes.Contains(index))
{
indexes.Add(index);
elements.Add(allElements[index]);
}
}
return elements;
}
Then you can do some calculation and call this method:
void Main(String[] args)
{
do
{
List<int> elements = GetRandomelements(yourElements);
//do some calculations
} while (some condition); // while result is not right
}
Suppose that the length of the List is N. Now suppose that you will put these 4 numbers in another List called out. Then you can loop through the List and the probability of the element you are on being chosen is
(4 - (out.Count)) / (N - currentIndex)
funcion (list)
(
loop i=0 i < 4
index = (int) length(list)*random(0 -> 1)
element[i] = list[index]
return element
)
while(check == false)
(
elements = funcion (list)
Do some calculation which returns check == false /true
)
This is the pseudo code, but i think you should of come up with this yourself.
Hope it helps:)
All the answers up to now have one fundamental flaw; you are asking for an algorithm that will generate a random combination of n elements and this combination, following some logic rules, will be valid or not. If its not, a new combination should be produced. Obviously, this new combination should be one that has never been produced before. All the proposed algorithms do not enforce this. If for example out of 1000000 possible combinations, only one is valid, you might waste a whole lot of resources until that particular unique combination is produced.
So, how to solve this? Well, the answer is simple, create all possible unique solutions, and then simply produce them in a random order. Caveat: I will suppose that the input stream has no repeating elements, if it does, then some combinations will not be unique.
First of all, lets write ourselves a handy immutable stack:
class ImmutableStack<T> : IEnumerable<T>
{
public static readonly ImmutableStack<T> Empty = new ImmutableStack<T>();
private readonly T head;
private readonly ImmutableStack<T> tail;
public int Count { get; }
private ImmutableStack()
{
Count = 0;
}
private ImmutableStack(T head, ImmutableStack<T> tail)
{
this.head = head;
this.tail = tail;
Count = tail.Count + 1;
}
public T Peek()
{
if (this == Empty)
throw new InvalidOperationException("Can not peek a empty stack.");
return head;
}
public ImmutableStack<T> Pop()
{
if (this == Empty)
throw new InvalidOperationException("Can not pop a empty stack.");
return tail;
}
public ImmutableStack<T> Push(T item) => new ImmutableStack<T>(item, this);
public IEnumerator<T> GetEnumerator()
{
var current = this;
while (current != Empty)
{
yield return current.head;
current = current.tail;
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
This will make our life easier while producing all combinations by recursion. Next, let's get the signature of our main method right:
public static IEnumerable<IEnumerable<T>> GetAllPossibleCombinationsInRandomOrder<T>(
IEnumerable<T> data, int combinationLength)
Ok, that looks about right. Now let's implement this thing:
var allCombinations = GetAllPossibleCombinations(data, combinationLength).ToArray();
var rnd = new Random();
var producedIndexes = new HashSet<int>();
while (producedIndexes.Count < allCombinations.Length)
{
while (true)
{
var index = rnd.Next(allCombinations.Length);
if (!producedIndexes.Contains(index))
{
producedIndexes.Add(index);
yield return allCombinations[index];
break;
}
}
}
Ok, all we are doing here is producing random indexees, checking we haven't produced it yet (we use a HashSet<int> for this), and returning the combination at that index.
Simple, now we only need to take care of GetAllPossibleCombinations(data, combinationLength).
Thats easy, we'll use recursion. Our bail out condition is when our current combination is the specified length. Another caveat: I'm omitting argument validation throughout the whole code, things like checking for null or if the specified length is not bigger than the input length, etc. should be taken care of.
Just for the fun, I'll be using some minor C#7 syntax here: nested functions.
public static IEnumerable<IEnumerable<T>> GetAllPossibleCombinations<T>(
IEnumerable<T> stream, int length)
{
return getAllCombinations(stream, ImmutableStack<T>.Empty);
IEnumerable<IEnumerable<T>> getAllCombinations<T>(IEnumerable<T> currentData, ImmutableStack<T> combination)
{
if (combination.Count == length)
yield return combination;
foreach (var d in currentData)
{
var newCombination = combination.Push(d);
foreach (var c in getAllCombinations(currentData.Except(new[] { d }), newCombination))
{
yield return c;
}
}
}
}
And there we go, now we can use this:
var data = "abc";
var random = GetAllPossibleCombinationsInRandomOrder(data, 2);
foreach (var r in random)
{
Console.WriteLine(string.Join("", r));
}
And sure enough, the output is:
bc
cb
ab
ac
ba
ca

Find values that appear in all lists (or arrays or collections)

Given the following:
List<List<int>> lists = new List<List<int>>();
lists.Add(new List<int>() { 1,2,3,4,5,6,7 });
lists.Add(new List<int>() { 1,2 });
lists.Add(new List<int>() { 1,2,3,4 });
lists.Add(new List<int>() { 1,2,5,6,7 });
What is the best/fastest way of identifying which numbers appear in all lists?
You can use the .net 3.5 .Intersect() extension method:-
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
To do it for two lists one would use x.Intersect(y).
To do it for several we would want to do something like:
var intersection = lists.Aggregate((x, y) => x.Intersect(y));
But this won't work because the result of the lambda isn't List<int> and so it can't be fed back in. This might tempt us to try:
var intersection = lists.Aggregate((x, y) => x.Intersect(y).ToList());
But then this makes n-1 needless calls to ToList() which is relatively expensive. We can get around this with:
var intersection = lists.Aggregate(
(IEnumerable<int> x, IEnumerable<int> y) => x.Intersect(y));
Which applies the same logic, but in using explicit types in the lambda, we can feed the result of Intersect() back in without wasting time and memory creating a list each time, and so gives faster results.
If this came up a lot we can get further (slight) performance improvements by rolling our own rather than using Linq:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(!en.MoveNext()) return Enumerable.Empty<T>();
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
return set;
}
}
This assumes its for internal use only. If it could be called from another assembly it should have error checks, and perhaps should be defined so as to only perform the intersect operations on the first MoveNext() of the calling code:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
if(source == null)
throw new ArgumentNullException("source");
return IntersectAllIterator(source);
}
public static IEnumerable<T> IntersectAllIterator<T>(IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(en.MoveNext())
{
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
foreach(T item in set)
yield return item;
}
}
}
(In these final two versions there's an opportunity to short-circuit if we end up emptying the set, but it only pays off if this happens relatively often, otherwise it's a nett loss).
Conversely, if these aren't concerns, and if we know that we're only ever going to want to do this with lists, we can optimise a bit further with the use of Count and indices:
public static IEnumerable<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return Enumerable.Empty<T>();
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return set;
}
And further if we know we're always going to want the results in a list, and we know that either we won't mutate the list, or it won't matter if the input list got mutated, we can optimise for the case of there being zero or one lists (but this costs more if we might ever not need the output in a list):
public static List<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return new List<T>(0);
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return new List<T>(set);
}
Again though, as well as making the method less widely-applicable, this has risks in terms of how it could be used, so is only appropriate for internal code were you can know either that you won't change either the input or the output after the fact, or that this won't matter.
Linq already offers Intersect and you can exploit Aggregate as well:
var result = lists.Aggregate((a, b) => a.Intersect(b).ToList());
If you don't trust the Intersect method or you just prefer to see what's going on, here's a snippet of code that should do the trick:
// Output goes here
List<int> output = new List<int>();
// Make sure lists are sorted
for (int i = 0; i < lists.Count; ++i) lists[i].Sort();
// Maintain array of indices so we can step through all the lists in parallel
int[] index = new int[lists.Count];
while(index[0] < lists[0].Count)
{
// Search for each value in the first list
int value = lists[0][index[0]];
// No. lists that value appears in, we want this to equal lists.Count
int count = 1;
// Search all the other lists for the value
for (int i = 1; i < lists.Count; ++i)
{
while (index[i] < lists[i].Count)
{
// Stop if we've passed the spot where value would have been
if (lists[i][index[i]] > value) break;
// Stop if we find value
if (lists[i][index[i]] == value)
{
++count;
break;
}
++index[i];
}
// If we reach the end of any list there can't be any more matches so end the search now
if (index[i] >= lists[i].Count) goto done;
}
// Store the value if we found it in all the lists
if (count == lists.Count) output.Add(value);
// Skip multiple occurrances of the same value
while (index[0] < lists[0].Count && lists[0][index[0]] == value) ++index[0];
}
done:
Edit:
I got bored and did some benchmarks on this vs. Jon Hanna's version. His is consistently faster, typically by around 50%. Mine wins by about the same margin if you happen to have presorted lists, though. Also you can gain a further 20% or so with unsafe optimisations. Just thought I'd share that.
You can also get it with SelectMany and Distinct:
List<int> result = lists
.SelectMany(x => x.Where(e => lists.All(l => l.Contains(e))))
.Distinct().ToList();
Edit:
List<int> result2 = lists.First().Where(e => lists.Skip(1).All(l => l.Contains(e)))
.ToList();
Edit 2:
List<int> result3 = lists
.Select(l => l.OrderBy(n => n).Take(lists.Min(x => x.Count()))).First()
.TakeWhile((n, index) => lists.Select(l => l.OrderBy(x => x)).Skip(1).All(l => l.ElementAt(index) == n))
.ToList();

LINQ alternative to creating collection by looping over array of fields

I have an array which contains fields for a data structure in the following format;
[0] = Record 1 (Name Field)
[1] = Record 1 (ID Field)
[2] = Record 1 (Other Field)
[3] = Record 2 (Name Field)
[4] = Record 2 (ID Field)
[5] = Record 2 (Other Field)
etc.
I'm processing this into a collection as follows;
for (int i = 0; i < components.Length; i = i + 3)
{
results.Add(new MyObj
{
Name = components[i],
Id = components[i + 1],
Other = components[i + 2],
});
}
This works fine, but I was wondering if there is a nice way to achieve the same output with LINQ? There's no functional requirement here, I'm just curious if it can be done or not.
I did do some experimenting with grouping by an index (after ToList()'ing the array);
var groupings = components
.GroupBy(x => components.IndexOf(x) / 3)
.Select(g => g.ToArray())
.Select(a => new
{
Name = a[0],
Id = a[1],
Other = a[2]
});
This works, but I think it's a bit overkill for what I'm trying to do. Is there a simpler way to achieve the same output as the for loop?
Looks like a perfect candidate for Josh Einstein's IEnumerable.Batch extension. It slices an enumerable into chunks of a certain size and feeds them out as an enumeration of arrays:
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> self, int batchSize)
In the case of this question, you'd do something like this:
var results =
from batch in components.Batch(3)
select new MyObj { Name = batch[0], Id = batch[1], Other = batch[2] };
Update: 2 years on and the Batch extension I linked to seems to have disappeared. Since it was considered the answer to the question, and just in case someone else finds it useful, here's my current implementation of Batch:
public static partial class EnumExts
{
/// <summary>Split sequence into blocks of specified size.</summary>
/// <typeparam name="T">Type of items in sequence</typeparam>
/// <param name="sequence"><see cref="IEnumerable{T}"/> sequence to split</param>
/// <param name="batchLength">Number of items per returned array</param>
/// <returns>Arrays of <paramref name="batchLength"/> items, with last array smaller if sequence count is not a multiple of <paramref name="batchLength"/></returns>
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> sequence, int batchLength)
{
if (sequence == null)
throw new ArgumentNullException("sequence");
if (batchLength < 2)
throw new ArgumentException("Batch length must be at least 2", "batchLength");
using (var iter = sequence.GetEnumerator())
{
var bfr = new T[batchLength];
while (true)
{
for (int i = 0; i < batchLength; i++)
{
if (!iter.MoveNext())
{
if (i == 0)
yield break;
Array.Resize(ref bfr, i);
break;
}
bfr[i] = iter.Current;
}
yield return bfr;
bfr = new T[batchLength];
}
}
}
}
This operation is deferred, single enumeration and executes in linear time. It is relatively quick compared to a few other Batch implementations I've seen, even though it is allocating a new array for each result.
Which just goes to show: you never can tell until you profile, and you should always quote the code in case it disappears.
I would say stick with your for-loop. However, this should work with Linq:
List<MyObj> results = components
.Select((c ,i) => new{ Component = c, Index = i })
.GroupBy(x => x.Index / 3)
.Select(g => new MyObj{
Name = g.First().Component,
Id = g.ElementAt(1).Component,
Other = g.Last().Component
})
.ToList();
Maybe an iterator could be appropriate.
Declare a custom iterator:
static IEnumerable<Tuple<int, int, int>> ToPartitions(int count)
{
for (var i = 0; i < count; i += 3)
yield return new Tuple<int, int, int>(i, i + 1, i + 2);
}
Prepare the following LINQ:
var results = from partition in ToPartitions(components.Length)
select new {Name = components[partition.Item1], Id = components[partition.Item2], Other = components[partition.Item3]};
This method may give you an idea on how to make the code more expressive.
public static IEnumerable<MyObj> AsComponents<T>(this IEnumerable<T> serialized)
where T:class
{
using (var it = serialized.GetEnumerator())
{
Func<T> next = () => it.MoveNext() ? it.Current : null;
var obj = new MyObj
{
Name = next(),
Id = next(),
Other = next()
};
if (obj.Name == null)
yield break;
yield return obj;
}
}
As it stands, I dislike the way I detect the end of the input, but you might have domain specific information on how to do this better.

How to replace FirstOrDefault with something like RandomOrDefault to "balance" calls?

I have such code
senders.FirstOrDefault(sender => !sender.IsBusy);
This line is called pretty often.
The problem is that it always returns the first non-busy sender; the same first sender is returned pretty often, but the last one is returned very rarely. How to easily balance this?
Ideally, on every call I should return the most rarely used sender. I.e. between all non-busy senders select the one that was selected the least number of times during the last second.
Maybe something like:
public static T RandomOrDefault<T>(this IEnumerable<T> dataSet)
{
return dataSet.RandomOrDefault(y => true);
}
public static T RandomOrDefault<T>(this IEnumerable<T> dataSet, Func<T, bool> filter)
{
var elems = dataSet.Where(filter).ToList();
var count = elems.Count;
if (count == 0)
{
return default(T);
}
var random = new Random();
var index = random.Next(count - 1);
return elems[index];
}
then you can call it with:
senders.RandomOrDefault(sender => !sender.IsBusy);
If you want to get the least used one efficiently you will be probably good with the following non-Linq 'list rotation' solution, which is O(n) effiency and O(1) space unlike most of others:
// keep track of these
List<Sender> senders;
int nSelected = 0; // number of already selected senders
// ...
// solution
int total = senders.Count; // total number of senders
// looking for next non-busy sender
Sender s = null;
for (int i = 0; i < total; i++)
{
int ind = (i + nSelected) % total; // getting the one 'after' previous
if (!senders[ind].IsBusy)
{
s = senders[ind];
++nSelected;
break;
}
}
Of course this adds the must-be-indexable constraint on senders container.
You could easily reorder by a new Guid, like this:
senders.Where(sender => !sender.IsBusy).OrderBy(x => Guid.NewGuid()).FirstOrDefault();
You don't mess with random numbers, you don't have to identify a "range" for these numbers. It just plain works and it's pretty elegant, I think.
You could use the "Shuffle" extension method from this post before your FirstOrDefault
You could use Skip with a random number less than the total number of non-busy senders.
senders.Where(sender => !sender.IsBusy).Skip(randomNumber).FirstOrDefault();
Identifying a sensible limit for the random number might be a bit tricky though.
Keep a look-up of senders that you have used and the time when they were used.
var recentlyUsed = new Dictionary<Sender, DateTime>();
var sender = senders.FirstOrDefault(sender => !sender.IsBusy && (!recentlyUsed.ContainsKey(sender) || recentlyUsed[sender] < DateTime.Now.AddSeconds(-1)));
if (sender != null)
recentlyUsed[sender] = DateTime.Now;
Based on algorithm from the "Real world functional programming" book here's the O(n) implementation of extension method for taking random or default value from IEnumearble.
public static class SampleExtensions
{
// The Random class is instantiated as singleton
// because it would give bad random values
// if instantiated on every call to RandomOrDefault method
private static readonly Random RandomGenerator = new Random(unchecked((int)DateTime.Now.Ticks));
public static T RandomOrDefault<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
IEnumerable<T> filtered = source.Where(predicate);
int count = 0;
T selected = default(T);
foreach (T current in filtered)
{
if (RandomGenerator.Next(0, ++count) == 0)
{
selected = current;
}
}
return selected;
}
public static T RandomOrDefault<T>(this IEnumerable<T> source)
{
return RandomOrDefault(source, element => true);
}
}
Here's the code to ensure that this algorithm really gives the Uniform distribution:
[Test]
public void TestRandom()
{
IEnumerable<int> source = Enumerable.Range(1, 10);
Dictionary<int, int> result = source.ToDictionary(element => element, element => 0);
result[0] = 0;
const int Limit = 1000000;
for (int i = 0; i < Limit; i++)
{
result[source.RandomOrDefault()]++;
}
foreach (var pair in result)
{
Console.WriteLine("{0}: {1:F2}%", pair.Key, pair.Value * 100f / Limit);
}
Console.WriteLine(Enumerable.Empty<int>().RandomOrDefault());
}
The output of TestRandom method is:
1: 9,92%
2: 10,03%
3: 10,04%
4: 9,99%
5: 10,00%
6: 10,01%
7: 9,98%
8: 10,03%
9: 9,97%
10: 10,02%
0: 0,00%
0

Optimal LINQ query to get a random sub collection - Shuffle

Please suggest an easiest way to get a random shuffled collection of count 'n' from a collection having 'N' items. where n <= N
Further to mquander's answer and Dan Blanchard's comment, here's a LINQ-friendly extension method that performs a Fisher-Yates-Durstenfeld shuffle:
// take n random items from yourCollection
var randomItems = yourCollection.Shuffle().Take(n);
// ...
public static class EnumerableExtensions
{
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
return source.Shuffle(new Random());
}
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random rng)
{
if (source == null) throw new ArgumentNullException("source");
if (rng == null) throw new ArgumentNullException("rng");
return source.ShuffleIterator(rng);
}
private static IEnumerable<T> ShuffleIterator<T>(
this IEnumerable<T> source, Random rng)
{
var buffer = source.ToList();
for (int i = 0; i < buffer.Count; i++)
{
int j = rng.Next(i, buffer.Count);
yield return buffer[j];
buffer[j] = buffer[i];
}
}
}
Another option is to use OrderBy and to sort on a GUID value, which you can do so using:
var result = sequence.OrderBy(elem => Guid.NewGuid());
I did some empirical tests to convince myself that the above actually generates a random distribution (which it appears to do). You can see my results at Techniques for Randomly Reordering an Array.
This has some issues with "random bias" and I am sure it's not optimal, this is another possibility:
var r = new Random();
l.OrderBy(x => r.NextDouble()).Take(n);
Shuffle the collection into a random order and take the first n items from the result.
A bit less random, but efficient:
var rnd = new Random();
var toSkip = list.Count()-n;
if (toSkip > 0)
toSkip = rnd.Next(toSkip);
else
toSkip=0;
var randomlySelectedSequence = list.Skip(toSkip).Take(n);
Just needed to do that today. Here's my stab at it:
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> enumerable)
=> enumerable.Shuffle(new Random());
private static IEnumerable<T> Shuffle<T>(this IEnumerable<T> enumerable, Random random)
=> enumerable
.Select(t => (t, r: random.Next())) // zip with random value
.OrderBy(tr => tr.r)
.Select(tr => tr.t);
The main idea here is the "zip" but without using Zip since we only want to iterate the enumerable once. Inside the sort, each element of the original enumerable has the same "value".
I write this overrides method:
public static IEnumerable<T> Randomize<T>(this IEnumerable<T> items) where T : class
{
int max = items.Count();
var secuencia = Enumerable.Range(1, max).OrderBy(n => n * n * (new Random()).Next());
return ListOrder<T>(items, secuencia.ToArray());
}
private static IEnumerable<T> ListOrder<T>(IEnumerable<T> items, int[] secuencia) where T : class
{
List<T> newList = new List<T>();
int count = 0;
foreach (var seed in count > 0 ? secuencia.Skip(1) : secuencia.Skip(0))
{
newList.Add(items.ElementAt(seed - 1));
count++;
}
return newList.AsEnumerable<T>();
}
Then, I have my source list (all items)
var listSource = p.Session.QueryOver<Listado>(() => pl)
.Where(...);
Finally, I call "Randomize" and I get a random sub-collection of items, in my case, 5 items:
var SubCollection = Randomize(listSource.List()).Take(5).ToList();
Sorry for ugly code :-), but
var result =yourCollection.OrderBy(p => (p.GetHashCode().ToString() + Guid.NewGuid().ToString()).GetHashCode()).Take(n);

Categories