List<T> and IEnumerable difference - c#

While implementing this generic merge sort, as a kind of Code Kata, I stumbled on a difference between IEnumerable and List that I need help to figure out.
Here's the MergeSort
public class MergeSort<T>
{
public IEnumerable<T> Sort(IEnumerable<T> arr)
{
if (arr.Count() <= 1) return arr;
int middle = arr.Count() / 2;
var left = arr.Take(middle).ToList();
var right = arr.Skip(middle).ToList();
return Merge(Sort(left), Sort(right));
}
private static IEnumerable<T> Merge(IEnumerable<T> left, IEnumerable<T> right)
{
var arrSorted = new List<T>();
while (left.Count() > 0 && right.Count() > 0)
{
if (Comparer<T>.Default.Compare(left.First(), right.First()) < 0)
{
arrSorted.Add(left.First());
left=left.Skip(1);
}
else
{
arrSorted.Add(right.First());
right=right.Skip(1);
}
}
return arrSorted.Concat(left).Concat(right);
}
}
If I remove the .ToList() on the left and right variables it fails to sort correctly. Do you see why?
Example
var ints = new List<int> { 5, 8, 2, 1, 7 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
With .ToList()
[0]: 1
[1]: 2
[2]: 5
[3]: 7
[4]: 8
Without .ToList()
[0]: 1
[1]: 2
[2]: 5
[3]: 7
[4]: 2
Edit
It was my stupid test that got me.
I tested it like this:
var sortedInts = mergeSortInt.Sort(ints);
ints.Sort();
if (Enumerable.SequenceEqual(ints, sortedInts)) Console.WriteLine("ints sorts ok");
just changing the first row to
var sortedInts = mergeSortInt.Sort(ints).ToList();
removes the problem (and the lazy evaluation).
EDIT 2010-12-29
I thought I would figure out just how the lazy evaluation messes things up here but I just don't get it.
Remove the .ToList() in the Sort method above like this
var left = arr.Take(middle);
var right = arr.Skip(middle);
then try this
var ints = new List<int> { 5, 8, 2 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
ints.Sort();
if (Enumerable.SequenceEqual(ints, sortedInts)) Console.WriteLine("ints sorts ok");
When debugging, you can see that before ints.Sort(), sortedInts.ToList() returns
[0]: 2
[1]: 5
[2]: 8
but after ints.Sort() it returns
[0]: 2
[1]: 5
[2]: 5
What is really happening here?

Your function is correct - if you inspect the result of Merge, you'll see the result is sorted (example).
So where's the problem? Just as you've suspected, you're testing it wrong - when you call Sort on your original list you change all collections that derive from it!
Here's a snippet that demonstrates what you did:
List<int> numbers = new List<int> {5, 4};
IEnumerable<int> first = numbers.Take(1);
Console.WriteLine(first.Single()); //prints 5
numbers.Sort();
Console.WriteLine(first.Single()); //prints 4!
All collections you create are basically the same as first - in a way, they are lazy pointers to positions in ints. Obviously, when you call ToList, the problem is eliminated.
Your case is more complex than that. Your Sort is partly lazy, exactly as you suggest. First you create a list (arrSorted) and add integers to it. That part isn't lazy, and is the reason you see the first few elements sorted. Next, you add the remaining elements, but Concat is lazy. Recursion then complicates this further: in most cases, most elements of your IEnumerable are eager, because left and right are themselves made of a mostly eager part plus a lazy tail. You end up with a sorted List<int>, lazily concatenated to a lazy pointer, which should be just the last element (other elements were merged before).
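The "eager head plus lazy tail" effect described here can be reproduced in isolation. This is only a minimal sketch, not the asker's code: the class name LazyTailDemo, the method Build and the sample values are assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyTailDemo
{
    // Builds a sequence whose first element is copied eagerly into a list
    // and whose remaining elements are a deferred view over 'source'.
    public static IEnumerable<int> Build(List<int> source)
    {
        var head = new List<int> { source[0] };   // eager: a copy of the value
        IEnumerable<int> tail = source.Skip(1);   // lazy: re-reads 'source' on every enumeration
        return head.Concat(tail);                 // Concat is deferred as well
    }

    public static void Main()
    {
        var source = new List<int> { 5, 8 };
        IEnumerable<int> combined = Build(source);
        List<int> snapshot = combined.ToList();   // materialized now: 5, 8

        source[0] = -1;    // not visible in 'combined': the head was copied
        source[1] = 100;   // visible in 'combined': the tail still reads from 'source'

        Console.WriteLine(string.Join(", ", combined));  // 5, 100
        Console.WriteLine(string.Join(", ", snapshot));  // 5, 8
    }
}
```

The snapshot taken with ToList() is unaffected by later changes, which is the same reason the ToList() calls in the question hide the problem.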
Here's a call graph of your functions - red indicates a lazy collection, black a real number:
When you change the list, the new list is mostly intact, but the last element is lazy and points to the position of the largest element in the original list.
The result is mostly good, but its last element still points to the original list:
One last example: consider you're changing all elements in the original list. As you can see, most elements in the sorted collection remain the same, but the last is lazy and points to the new value:
var ints = new List<int> { 3,2,1 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
// sortedInts is { 1, 2, 3 }
for(int i=0;i<ints.Count;i++) ints[i] = -i * 10;
// sortedInts is { 1, 2, 0 }
Here's the same example on Ideone: http://ideone.com/FQVR7

Unable to reproduce - I've just tried this, and it works absolutely fine. Obviously it's rather inefficient in various ways, but removing the ToList calls didn't make it fail.
Here's my test code, with your MergeSort code as-is, but without the ToList() calls:
using System;
using System.Collections.Generic;
using System.Linq; // needed by the Count/Take/Skip calls in MergeSort
public static class Extensions
{
public static void Dump<T>(this IEnumerable<T> items, string name)
{
Console.WriteLine(name);
foreach (T item in items)
{
Console.Write(item);
Console.Write(" ");
}
Console.WriteLine();
}
}
class Test
{
static void Main()
{
var ints = new List<int> { 5, 8, 2, 1, 7 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
sortedInts.Dump("Sorted");
}
}
Output:
Sorted
1 2 5 7 8
Perhaps the problem was how you were testing your code?

I ran it with and without the list and it worked.
Anyway, in-place variants of merge sort can get by with O(1) extra space, which this implementation will not benefit from, since it allocates a new list in every Merge call.

The problem is that you sort the left side, then the right side, and merge them into one sequence. That does not mean that you get a completely sorted sequence.
First you have to merge and then you have to sort:
public IEnumerable<T> Sort(IEnumerable<T> arr)
{
if (arr.Count() <= 1) return arr;
int middle = arr.Count() / 2;
var left = arr.Take(middle).ToList();
var right = arr.Skip(middle).ToList();
// first merge and then sort
return Sort(Merge(left, right));
}

Related

C# why does binarysearch have to be made on sorted arrays and lists?

Is there any other method that does not require me to sort the list?
It kinda messes with my program in a way that I cannot sort the list for it to work as I want to.
A binary search works by repeatedly dividing the list of candidates in half using comparisons. Imagine the following set:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
We can also represent this as a binary tree, to make it easier to visualise:
Now, say we want to find the number 3. We can do it like so:
Is 3 smaller than 8? Yes. OK, now we're looking at everything between 1 and 7.
Is 3 smaller than 4? Yes. OK, now we're looking at everything between 1 and 3.
Is 3 smaller than 2? No. OK, now we're looking at 3.
We found it!
Now, if your list isn't sorted, how will we divide the list in half? The simple answer is: we can't. If we swap 3 and 15 in the example above, it would work like this:
Is 3 smaller than 8? Yes. OK, now we're looking at everything between 1 and 7.
Is 3 smaller than 4? Yes. OK, now we're looking at everything between 1 and 3 (except we swapped it with 15).
Is 3 smaller than 2? No. OK, now we're looking at 15.
Huh? There are no more items to check, but we didn't find it. I guess it's not in the list.
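The two walk-throughs above can be replayed in code. This is a minimal sketch under stated assumptions: BinarySearchDemo and Search are hypothetical names, and the search is a plain hand-written one rather than the framework's own BinarySearch methods.

```csharp
using System;

public static class BinarySearchDemo
{
    // Plain iterative binary search; returns the index of 'target' or -1.
    public static int Search(int[] items, int target)
    {
        int lo = 0, hi = items.Length - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (items[mid] == target) return mid;
            if (target < items[mid]) hi = mid - 1;  // discard the upper half
            else lo = mid + 1;                      // discard the lower half
        }
        return -1;
    }

    public static void Main()
    {
        int[] sorted = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };

        // Same data, but with 3 and 15 swapped so the array is no longer sorted.
        int[] swapped = (int[])sorted.Clone();
        swapped[2] = 15;
        swapped[14] = 3;

        Console.WriteLine(Search(sorted, 3));   // 2
        Console.WriteLine(Search(swapped, 3));  // -1, although 3 is in the array
    }
}
```

On the swapped array the search visits 8, 4, 2 and then 15, exactly as in the second walk-through, and gives up without ever looking at index 14.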
The solution is to use an appropriate data type instead. For fast lookups of key/value pairs, I'll use a Dictionary. For fast checks if something already exists, I'll use a HashSet. For general storage I'll use a List or an array.
Dictionary example:
var values = new Dictionary<int, string>();
values[1] = "hello";
values[2] = "goodbye";
var value2 = values[2]; // this lookup is fast because a Dictionary partitions its keys into buckets by hash code.
HashSet example:
var mySet = new HashSet<int>();
mySet.Add(1);
mySet.Add(2);
if (mySet.Contains(2)) // this lookup is fast for the same reason as a dictionary.
{
// do something
}
List example:
var list = new List<int>();
list.Add(1);
list.Add(2);
if (list.Contains(2)) // this isn't fast because it has to visit each item in the list, but it works OK for small sets or places where performance isn't so important
{
}
var idx2 = list.IndexOf(2);
If you have multiple values with the same key, you could store a list in a Dictionary like this:
var values = new Dictionary<int, List<string>>();
if (!values.ContainsKey(key))
{
values[key] = new List<string>();
}
values[key].Add("value1");
values[key].Add("value2");
There is no way to use binary search on unordered collections; a sorted collection is the core requirement of binary search. The key is that on every step you take the middle index m between l and r. Initially they are 0 and size - 1, and after every step one of them moves past the middle index. If x > arr[m] then l becomes m + 1, otherwise r becomes m - 1. Basically, on every step you keep half of the range you had and, of course, it remains sorted. This code is recursive; if you don't know what recursion is (which is very important in programming), you can review and learn here.
// C# implementation of recursive Binary Search
using System;
class GFG {
// Returns index of x if it is present in
// arr[l..r], else return -1
static int binarySearch(int[] arr, int l,
int r, int x)
{
if (r >= l) {
int mid = l + (r - l) / 2;
// If the element is present at the
// middle itself
if (arr[mid] == x)
return mid;
// If element is smaller than mid, then
// it can only be present in left subarray
if (arr[mid] > x)
return binarySearch(arr, l, mid - 1, x);
// Else the element can only be present
// in right subarray
return binarySearch(arr, mid + 1, r, x);
}
// We reach here when element is not present
// in array
return -1;
}
// Driver method to test above
public static void Main()
{
int[] arr = { 2, 3, 4, 10, 40 };
int n = arr.Length;
int x = 10;
int result = binarySearch(arr, 0, n - 1, x);
if (result == -1)
Console.WriteLine("Element not present");
else
Console.WriteLine("Element found at index "
+ result);
}
}
Output:
Element found at index 3
Sure there is.
var list = new List<int>();
list.Add(42);
list.Add(1);
list.Add(54);
var index = list.IndexOf(1); //TADA!!!!
EDIT: Ok, I hoped the irony was obvious. But strictly speaking, if your array is not sorted, you are pretty much stuck with the linear search, readily available by means of IndexOf() or IEnumerable.First().

Appropriate way to compare two lists and generate error message indicating any indexes where they are different with corresponding difference?

I have two lists of doubles that I need to compare for equality. There are obviously a million ways to do this, the simplest probably being list1.Equals(list2). However I want to have some sort of error message indicating precisely every list index and value for both lists wherever there is a difference. This error message would hopefully be something like
list1 and list2 are not equal.
list1 has value 0.1 at index 2, list2 has value 0.05 at index 2
etc. etc. for every difference
I also have a Utilities method already called AreEqual that basically just compares the values.
My first thought was evidently to loop through the lists and use AreEqual (I already know the lists are the same length)
for (int index = 0; index < list1.Count; index++)
{
check.AreEqual(list1[index], list2[index]);
}
but this doesn't help much for generating a useful error message unless in the case they're not equal I call some method to generate an error message like this
public string ErrorMessage(List<double> oldList, List<double> newList)
{
// build some error message here by taking the list difference
// and using IndexOf or whatnot
}
This seems super overkill, though. I can think of a million ways to do this but I can't determine what an appropriate way to do it is.
Is looping over the values and calling an error-message generating method reasonable?
Or is using something like
list3 = list1.Except(list2)
and then checking whether or not list3 is empty or not and correspondingly using IndexOf to get the differing values in both lists appropriate?
Or am I losing my mind and there's a much more straightforward way to do this?
You can use following LINQ query:
string sizeMsg = "";
if (list1.Count != list2.Count)
sizeMsg = String.Format("They have a different size, list1.Count:{0} list2.Count:{1}", list1.Count, list2.Count);
int count = Math.Min(list1.Count, list2.Count);
var differences = Enumerable.Range(0, count)
.Select(index => new { index, d1 = list1[index], d2 = list2[index] })
.Where(x => x.d1 != x.d2)
.Select(x => String.Format("list1 has value {0} at index {1}, list2 has value {2} at index {1}"
, x.d1, x.index, x.d2));
string differenceMessage = String.Join(Environment.NewLine, differences);
I think that using Linq here just makes it more complicated, when you can just do something like this:
public static IEnumerable<string> DifferenceErrors(List<double> list1, List<double> list2)
{
// I recommend defining a minimum difference below which you consider the values to be identical:
const double EPSILON = 0.00001;
for (int i = 0; i < list1.Count; ++i)
if (Math.Abs(list1[i] - list2[i]) >= EPSILON)
yield return $"At index {i}, list1 has value {list1[i]} and list2 has value {list2[i]}";
}
If you want to use C# prior to C#6 change the yield to this:
yield return string.Format("At index {0} list1 has value {1} and list2 has value {2}", i, list1[i], list2[i]);
For an equality test I would use this and check whether list3 is empty:
list3 = list1.Except(list2)
If list3 is not empty and the values are unique, we can loop through list3 and provide meaningful feedback.
This seems the easiest to me.
However, a small test in LINQPad (where 6 entries differ by index) shows a problem:
var list1 = new List<double>{1,2,3,4,7,8,9,10,11};
var list2 = new List<double>{1,2,3,5,6,7,8,19,20};
var list3 = list1.Except(list2).Dump();
var list4 = list2.Except(list1).Dump();
IEnumerable (4 items) 4 9 10 11
IEnumerable (4 items) 5 6 19 20
but the result reports only four differing entries in each direction.
If you care about order, you need a loop; if not, go with Except.
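For the order-sensitive case, a positional comparison can be sketched as follows. It is only an illustration: ListDiff, Describe and the tolerance parameter are hypothetical names, and the lists are assumed to have the same length, as stated in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListDiff
{
    // Returns one message per index at which the two lists differ by more
    // than 'tolerance'. Assumes list1 and list2 have the same length.
    public static List<string> Describe(
        IReadOnlyList<double> list1, IReadOnlyList<double> list2, double tolerance)
    {
        return Enumerable.Range(0, list1.Count)
            .Where(i => Math.Abs(list1[i] - list2[i]) > tolerance)
            .Select(i => string.Format(
                "index {0}: list1 has {1}, list2 has {2}", i, list1[i], list2[i]))
            .ToList();
    }

    public static void Main()
    {
        var list1 = new List<double> { 1, 2, 3, 4, 7, 8, 9, 10, 11 };
        var list2 = new List<double> { 1, 2, 3, 5, 6, 7, 8, 19, 20 };

        List<string> messages = Describe(list1, list2, 1e-9);
        Console.WriteLine(messages.Count);  // 6
        foreach (string message in messages)
            Console.WriteLine(message);
    }
}
```

Unlike Except, this reports all six positions that differ, including the ones where a value merely moved to another index.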

C# Similarities of two arrays

There must be an better way to do this, I'm sure...
// Simplified code
var a = new List<int>() { 1, 2, 3, 4, 5, 6 };
var b = new List<int>() { 2, 3, 5, 7, 11 };
var z = new List<int>();
for (int i = 0; i < a.Count; i++)
if (b.Contains(a[i]))
z.Add(a[i]);
// (z) contains all of the numbers that are in BOTH (a) and (b), i.e. { 2, 3, 5 }
I don't mind using the above technique, but I want something fast and efficient (I need to compare very large Lists<> multiple times), and this appears to be neither! Any thoughts?
Edit: As it makes a difference - I'm using .NET 4.0, the initial arrays are already sorted and don't contain duplicates.
You could use Enumerable.Intersect.
var z = a.Intersect(b);
which will probably be more efficient than your current solution.
Note that you left out one important piece of information: whether the lists happen to be ordered or not. If they are, then a couple of nested loops that pass over each input array exactly once may be faster, and a little more fun to write.
Edit
In response to your comment on ordering:
first stab at looping - it will need a little tweaking on your behalf but works for your initial data.
int j = 0;
foreach (var i in a)
{
int x = b[j];
while (x < i)
{
if (x == i)
{
z.Add(b[j]);
}
j++;
x = b[j];
}
}
this is where you need to add some unit tests ;)
Edit
Final point: it may well be that Linq can use SortedList to perform this intersection very efficiently; if performance is a concern, it is worth testing the various solutions. Don't forget to take the sorting into account if you load your data in an unordered manner.
One final edit: because there has been some to and fro on this, and people may be using the above without properly debugging it, I am posting a later version here:
int j = 0;
int b1 = b[j];
foreach (var a1 in a)
{
while (b1 <= a1)
{
if (b1 == a1)
z.Add(b[j]);
j++;
if (j >= b.Count)
break;
b1 = b[j];
}
}
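For comparison, the same idea can be written as a symmetric two-index walk that advances whichever side holds the smaller value. This is a sketch rather than a tuned implementation: SortedIntersect is a hypothetical name, and both inputs are assumed to be sorted ascending without duplicates, as the question states.

```csharp
using System;
using System.Collections.Generic;

public static class SortedIntersect
{
    // Intersects two ascending, duplicate-free lists in a single pass.
    public static List<int> Intersect(IList<int> a, IList<int> b)
    {
        var result = new List<int>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            if (a[i] < b[j]) i++;          // a[i] cannot be in b any more
            else if (a[i] > b[j]) j++;     // b[j] cannot be in a any more
            else
            {
                result.Add(a[i]);          // present in both lists
                i++;
                j++;
            }
        }
        return result;
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3, 4, 5, 6 };
        var b = new List<int> { 2, 3, 5, 7, 11 };
        Console.WriteLine(string.Join(", ", Intersect(a, b)));  // 2, 3, 5
    }
}
```

Each index only moves forward, so the loop performs at most a.Count + b.Count iterations and never reads past the end of either list.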
There's IEnumerable.Intersect, but since this is an extension method, I doubt it will be very efficient.
If you want efficiency, take one list and turn it into a Set, then go over the second list and see which elements are in the set. Note that I preallocate z, just to make sure you don't suffer from any reallocations.
var set = new HashSet<int>(a);
var z = new List<int>(Math.Min(set.Count, b.Count));
foreach(int i in b)
{
if(set.Contains(i))
z.Add(i);
}
This runs in O(N+M) expected time (N and M being the sizes of the two lists), assuming well-distributed hash codes.
Now, you could use set.IntersectWith(b), and I believe it will be just as efficient, but I'm not 100% sure.
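A minimal sketch of the IntersectWith variant follows; it makes no claim about relative speed. The wrapper name SetIntersect is hypothetical, and the result is sorted only to make the printed output predictable, since HashSet<T> does not promise any enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetIntersect
{
    // Copies 'a' into a HashSet and keeps only the values also found in 'b'.
    public static List<int> Intersect(IEnumerable<int> a, IEnumerable<int> b)
    {
        var set = new HashSet<int>(a);
        set.IntersectWith(b);                      // modifies 'set' in place
        return set.OrderBy(x => x).ToList();       // sorted for stable output
    }

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3, 4, 5, 6 };
        var b = new List<int> { 2, 3, 5, 7, 11 };
        Console.WriteLine(string.Join(", ", Intersect(a, b)));  // 2, 3, 5
    }
}
```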
The Intersect() method does just that. From MSDN:
Produces the set intersection of two sequences by using the default
equality comparer to compare values.
So in your case:
var z = a.Intersect(b);
Use SortedSet<T> in System.Collections.Generic namespace:
SortedSet<int> a = new SortedSet<int>() { 1, 2, 3, 4, 5, 6 };
SortedSet<int> b = new SortedSet<int>() { 2, 3, 5, 7, 11 };
a.IntersectWith(b); // a now contains { 2, 3, 5 }
But this assumes you have no duplicates!
The second collection does not need to be a SortedSet; it can be any collection (IEnumerable<T>). Internally, though, the method acts in such a way that if the second collection is also a SortedSet<T>, the operation is O(n).
If you can use LINQ, you could use the Enumerable.Intersect() extension method.

C# Collection - Order by an element (Rotate)

I have an IEnumerable<Point> collection. Lets say it contains 5 points (in reality it is more like 2000)
I want to order this collection so that a specific point in the collection becomes the first element, so it's basically chopping a collection at a specific point and rejoining the pieces.
So my list of 5 points:
{0,0}, {10,0}, {10,10}, {5,5}, {0,10}
Reordered with respect to element at index 3 would become:
{5,5}, {0,10}, {0,0}, {10,0}, {10,10}
What is the most computationally efficient way of resolving this problem, or is there an inbuilt method that already exists... If so I can't seem to find one!
var list = new[] { 1, 2, 3, 4, 5 };
var rotated = list.Skip(3).Concat(list.Take(3));
// rotated is now {4, 5, 1, 2, 3}
A simple array copy is O(n) in this case, which should be good enough for almost all real-world purposes. However, I will grant you that in certain cases - if this is a part deep inside a multi-level algorithm - this may be relevant. Also, do you simply need to iterate through this collection in an ordered fashion or create a copy?
Linked lists are very easy to reorganize like this, although accessing random elements will be more costly. Overall, the computational efficiency will also depend on how exactly you access this collection of items (and also, what sort of items they are - value types or reference types?).
The standard .NET linked list does not seem to support such manual manipulation but in general, if you have a linked list, you can easily move around sections of the list in the way you describe, just by assigning new "next" and "previous" pointers to the endpoints.
The collection library available here supports this functionality: http://www.itu.dk/research/c5/.
Specifically, you are looking for the LinkedList<T>.Slide() method, which you can use on the object returned by LinkedList<T>.View().
Version without enumerating list two times, but higher memory consumption because of the T[]:
public static IEnumerable<T> Rotate<T>(IEnumerable<T> source, int count)
{
int i = 0;
T[] temp = new T[count];
foreach (var item in source)
{
if (i < count)
{
temp[i] = item;
}
else
{
yield return item;
}
i++;
}
foreach (var item in temp)
{
yield return item;
}
}
[Test]
public void TestRotate()
{
var list = new[] { 1, 2, 3, 4, 5 };
var rotated = Rotate(list, 3);
Assert.That(rotated, Is.EqualTo(new[] { 4, 5, 1, 2, 3 }));
}
Note: Add argument checks.
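One way the argument checks might look is sketched below. The class name RotateChecked and the split into a checking wrapper plus a private iterator are assumptions, not part of the answer above. The split matters because the body of an iterator method does not run until enumeration starts, so checks placed inside it would throw late. This sketch also buffers the head in a List<T> instead of a fixed-size array, so a count larger than the sequence simply yields the elements in their original order.

```csharp
using System;
using System.Collections.Generic;

public static class RotateChecked
{
    // Validates eagerly, then hands over to the deferred iterator.
    public static IEnumerable<T> Rotate<T>(IEnumerable<T> source, int count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        return RotateIterator(source, count);
    }

    private static IEnumerable<T> RotateIterator<T>(IEnumerable<T> source, int count)
    {
        var head = new List<T>();
        foreach (var item in source)
        {
            if (head.Count < count) head.Add(item);  // hold back the first 'count' items
            else yield return item;                  // stream the rest immediately
        }
        foreach (var item in head)
            yield return item;                       // then emit the held-back items
    }

    public static void Main()
    {
        var rotated = Rotate(new[] { 1, 2, 3, 4, 5 }, 3);
        Console.WriteLine(string.Join(", ", rotated));  // 4, 5, 1, 2, 3
    }
}
```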
Another alternative to the Linq method shown by ulrichb would be to use the Queue class (a FIFO collection): dequeue up to your index, and enqueue the items you have taken out.
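The queue idea can be sketched like this. QueueRotate and RotateTo are hypothetical names, and the index is assumed to be no larger than the number of elements; otherwise Dequeue would throw on an empty queue or the rotation would simply wrap around.

```csharp
using System;
using System.Collections.Generic;

public static class QueueRotate
{
    // Moves the first 'index' items to the back, one at a time.
    public static Queue<T> RotateTo<T>(IEnumerable<T> source, int index)
    {
        var queue = new Queue<T>(source);
        for (int i = 0; i < index; i++)
            queue.Enqueue(queue.Dequeue());
        return queue;
    }

    public static void Main()
    {
        Queue<int> rotated = RotateTo(new[] { 1, 2, 3, 4, 5 }, 3);
        Console.WriteLine(string.Join(", ", rotated));  // 4, 5, 1, 2, 3
    }
}
```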
The naive implementation using linq would be:
IEnumerable<int> x = new[] { 1, 2, 3, 4 };
var tail = x.TakeWhile(i => i != 3);
var head = x.SkipWhile(i => i != 3);
var combined = head.Concat(tail); // is now 3, 4, 1, 2
What happens here is that you perform twice the comparisons needed to get to your first element in the combined sequence.
The solution is readable and compact but not very efficient.
The solutions described by the other contributors may be more efficient since they use special data structures as arrays or lists.
You can write a user-defined extension of List that does the rotation by using List.Reverse(). I took the basic idea from the C++ Standard Template Library, which basically uses Reverse in three steps: Reverse(first, mid), Reverse(mid, last), Reverse(first, last).
As far as I know, this is the most efficient and fastest way. I tested with 1 billion elements and the rotation Rotate(0, 50000, 800000) takes 0.00097 seconds. (By the way: adding 1 billion ints to the List already takes 7.3 seconds)
Here's the extension you can use:
public static class Extensions
{
public static void Rotate(this List<int> me, int first, int mid, int last)
{
//indexes are zero based!
if (first >= mid || mid >= last)
return;
me.Reverse(first, mid - first + 1);
me.Reverse(mid + 1, last - mid);
me.Reverse(first, last - first + 1);
}
}
The usage is like:
static void Main(string[] args)
{
List<int> iList = new List<int>{0,1,2,3,4,5};
Console.WriteLine("Before rotate:");
foreach (var item in iList)
{
Console.Write(item + " ");
}
Console.WriteLine();
int firstIndex = 0, midIndex = 2, lastIndex = 4;
iList.Rotate(firstIndex, midIndex, lastIndex);
Console.WriteLine($"After rotate {firstIndex}, {midIndex}, {lastIndex}:");
foreach (var item in iList)
{
Console.Write(item + " ");
}
Console.ReadKey();
}

How to merge 2 sorted listed into one shuffled list while keeping internal order in c#

I want to generate a shuffled merged list that will keep the internal order of the lists.
For example:
list A: 11 22 33
list B: 6 7 8
valid result: 11 22 6 33 7 8
invalid result: 22 11 7 6 33 8
Just randomly select a list (e.g. generate a random number between 0 and 1; if it is < 0.5 use list A, otherwise list B), then take the next element from that list and add it to your new list. Repeat until you have no elements left in either list.
Generate A.Length random integers in the interval [0, B.Length] (inclusive, so that elements of A can also land after the last element of B). Sort the random numbers, then iterate i from 0..A.Length, inserting A[i] at position r[i]+i in B. The +i is because you're shifting the original values in B to the right as you insert values from A.
This will be as random as your RNG.
None of the answers provided in this page work if you need the outputs to be uniformly distributed.
To illustrate my examples, assume we are merging two lists A=[1,2,3], B=[a,b,c]
In the approach mentioned in most answers (i.e. merging two lists a la mergesort, but choosing a list head randomly each time), the output [1 a 2 b 3 c] is far less likely than [1 2 3 a b c]. Intuitively, this happens because when you run out of elements in one list, the elements of the other list are appended at the end. Because of that, the probability of [1 2 3 a b c] is 0.5*0.5*0.5 = 0.5^3 = 0.125, but for [1 a 2 b 3 c] there are more random events, since a random head has to be picked 5 times instead of just 3, leaving us with a probability of 0.5^5 = 0.03125. An empirical evaluation also easily validates these results.
The answer suggested by @marcog is almost correct. However, there is an issue: the distribution of r is not uniform after sorting it. This happens because the original lists [0,1,2], [2,1,0] and [1,0,2] (like every other permutation of them) all get sorted into [0,1,2], making this sorted r more likely than, for example, [0,0,0], for which there is only one possibility.
There is a clever way of generating the list r in such a way that it is uniformly distributed, as seen in this Math StackExchange question: https://math.stackexchange.com/questions/3218854/randomly-generate-a-sorted-set-with-uniform-distribution
To summarize the answer to that question, you must sample |A| elements (uniformly at random, and without repetition) from the set {0,1,..|A|+|B|-1}, sort the result, and then subtract from each element its index in this new list. The result is the list r that can be used as a replacement in @marcog's answer.
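A sketch of this sampling idea is shown below. It is not code from any of the answers: UniformMerge, MergeUniform and the rng parameter are hypothetical names. Instead of computing r explicitly, it keeps the sampled positions themselves as the output slots for A, which is equivalent, because r[i] is just the i-th smallest sampled slot minus i.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniformMerge
{
    // Chooses a.Count of the a.Count + b.Count output slots uniformly at random
    // (without repetition) for the elements of 'a'; 'b' fills the other slots.
    public static List<T> MergeUniform<T>(IList<T> a, IList<T> b, Random rng)
    {
        int total = a.Count + b.Count;

        // Partial Fisher-Yates shuffle: afterwards the first a.Count entries
        // of 'slots' are a uniform sample without repetition from 0..total-1.
        int[] slots = Enumerable.Range(0, total).ToArray();
        for (int i = 0; i < a.Count; i++)
        {
            int j = rng.Next(i, total);
            int temp = slots[i];
            slots[i] = slots[j];
            slots[j] = temp;
        }
        var slotsForA = new HashSet<int>(slots.Take(a.Count));

        var result = new List<T>(total);
        int ai = 0, bi = 0;
        for (int slot = 0; slot < total; slot++)
            result.Add(slotsForA.Contains(slot) ? a[ai++] : b[bi++]);
        return result;
    }

    public static void Main()
    {
        var merged = MergeUniform(
            new List<int> { 11, 22, 33 }, new List<int> { 6, 7, 8 }, new Random());
        Console.WriteLine(string.Join(" ", merged));  // e.g. 11 22 6 33 7 8
    }
}
```

Every choice of slots is equally likely and each choice corresponds to exactly one order-preserving interleaving, so all interleavings come out with the same probability (as far as the random number generator allows).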
Original Answer:
static IEnumerable<T> MergeShuffle<T>(IEnumerable<T> lista, IEnumerable<T> listb)
{
var first = lista.GetEnumerator();
var second = listb.GetEnumerator();
var rand = new Random();
bool exhaustedA = false;
bool exhaustedB = false;
while (!(exhaustedA && exhaustedB))
{
bool found = false;
if (!exhaustedB && (exhaustedA || rand.Next(0, 2) == 0))
{
exhaustedB = !(found = second.MoveNext());
if (found)
yield return second.Current;
}
if (!found && !exhaustedA)
{
exhaustedA = !(found = first.MoveNext());
if (found)
yield return first.Current;
}
}
}
Second answer based on marcog's answer
static IEnumerable<T> MergeShuffle<T>(IEnumerable<T> lista, IEnumerable<T> listb)
{
int total = lista.Count() + listb.Count();
var random = new Random();
var indexes = Enumerable.Range(0, total) // the second argument of Range is a count, not an upper bound
.OrderBy(_=>random.NextDouble())
.Take(lista.Count())
.OrderBy(x=>x)
.ToList();
var first = lista.GetEnumerator();
var second = listb.GetEnumerator();
for (int i = 0; i < total; i++)
if (indexes.Contains(i))
{
first.MoveNext();
yield return first.Current;
}
else
{
second.MoveNext();
yield return second.Current;
}
}
Rather than generating a list of indices, this can be done by adjusting the probabilities based on the number of elements left in each list. On each iteration, A will have A_size elements remaining, and B will have B_size elements remaining. Choose a random number R from 1..(A_size + B_size). If R <= A_size, then use an element from A as the next element in the output. Otherwise use an element from B.
int A[] = {11, 22, 33}, A_pos = 0, A_remaining = 3;
int B[] = {6, 7, 8}, B_pos = 0, B_remaining = 3;
while (A_remaining || B_remaining) {
int r = rand() % (A_remaining + B_remaining);
if (r < A_remaining) {
printf("%d ", A[A_pos++]);
A_remaining--;
} else {
printf("%d ", B[B_pos++]);
B_remaining--;
}
}
printf("\n");
As a list gets smaller, the probability an element gets chosen from it will decrease.
This can be scaled to multiple lists. For example, given lists A, B, and C with sizes A_size, B_size, and C_size, choose R in 1..(A_size+B_size+C_size). If R <= A_size, use an element from A. Otherwise, if R <= A_size+B_size use an element from B. Otherwise C.
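The multi-list variant described above might look as follows in C#. This is a sketch under assumptions: WeightedMerge, Merge and the rng parameter are hypothetical names, and the random number is drawn from 0..remaining-1 rather than 1..remaining, which changes the comparisons to strict ones but not the probabilities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedMerge
{
    // Picks the next source list with probability proportional to the
    // number of elements it still has left.
    public static List<T> Merge<T>(Random rng, params List<T>[] lists)
    {
        int[] position = new int[lists.Length];      // next unread index per list
        int remaining = lists.Sum(list => list.Count);
        var result = new List<T>(remaining);

        while (remaining > 0)
        {
            int r = rng.Next(remaining);             // 0 <= r < remaining
            for (int k = 0; k < lists.Length; k++)
            {
                int left = lists[k].Count - position[k];
                if (r < left)
                {
                    result.Add(lists[k][position[k]++]);
                    break;
                }
                r -= left;                           // skip past this list's share
            }
            remaining--;
        }
        return result;
    }

    public static void Main()
    {
        var merged = Merge(new Random(),
            new List<int> { 11, 22, 33 },
            new List<int> { 6, 7, 8 },
            new List<int> { 100 });
        Console.WriteLine(string.Join(" ", merged));
    }
}
```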
Here is a solution that ensures a uniformly distributed output, and is easy to reason about. The idea is first to generate a list of tokens, where each token represents an element of a specific list, but not a specific element. For example, for two lists having 3 elements each, we generate this list of tokens: 0, 0, 0, 1, 1, 1. Then we shuffle the tokens. Finally we yield an element for each token, selecting the next element from the corresponding original list.
public static IEnumerable<T> MergeShufflePreservingOrder<T>(
params IEnumerable<T>[] sources)
{
var random = new Random();
var queues = sources
.Select(source => new Queue<T>(source))
.ToArray();
var tokens = queues
.SelectMany((queue, i) => Enumerable.Repeat(i, queue.Count))
.ToArray();
Shuffle(tokens);
return tokens.Select(token => queues[token].Dequeue());
void Shuffle(int[] array)
{
for (int i = 0; i < array.Length; i++)
{
int j = random.Next(i, array.Length);
if (i == j) continue;
if (array[i] == array[j]) continue;
var temp = array[i];
array[i] = array[j];
array[j] = temp;
}
}
}
Usage example:
var list1 = "ABCDEFGHIJKL".ToCharArray();
var list2 = "abcd".ToCharArray();
var list3 = "#".ToCharArray();
var merged = MergeShufflePreservingOrder(list1, list2, list3);
Console.WriteLine(String.Join("", merged));
Output:
ABCDaEFGHIb#cJKLd
This might be easier, assuming you have a list of three values in order that match 3 values in another table.
You can also sequence with the identity using identity (1,2)
Create TABLE #tmp1 (ID int identity(1,1),firstvalue char(2),secondvalue char(2))
Create TABLE #tmp2 (ID int identity(1,1),firstvalue char(2),secondvalue char(2))
Insert into #tmp1(firstvalue,secondvalue) Select firstvalue,null secondvalue from firsttable
Insert into #tmp2(firstvalue,secondvalue) Select null firstvalue,secondvalue from secondtable
Select a.firstvalue,b.secondvalue from #tmp1 a join #tmp2 b on a.id=b.id
DROP TABLE #tmp1
DROP TABLE #tmp2
