C# All Unique Combinations of List<string> [duplicate]

C# All Unique Combinations of List<string> [duplicate] - c#

This question already has answers here:
All Possible Combinations of a list of Values
(19 answers)
Closed 4 years ago.
This question has been asked many times, but every SO post I've seen wants a specific length in values, whereas I just want to know every unique combination regardless of length.
The code below only provides a list where there are exactly 3 entries of combinations (and they are not unique).
List<string> list = new List<string> { "003_PS", "003_DH", "003_HEAT" };
var perms = list.GetPermutations();
public static class Extensions
{
public static IEnumerable<IEnumerable<T>> GetPermutations<T>(this IEnumerable<T> items)
{
foreach (var item in items)
{
var itemAsEnumerable = Enumerable.Repeat(item, 1);
var subSet = items.Except(itemAsEnumerable);
if (!subSet.Any())
{
yield return itemAsEnumerable;
}
else
{
foreach (var sub in items.Except(itemAsEnumerable).GetPermutations())
{
yield return itemAsEnumerable.Union(sub);
}
}
}
}
}
/*
OUTPUT:
003_PS, 003_DH, 003_HEAT
003_PS, 003_HEAT, 003_DH
003_DH, 003_PS, 003_HEAT
003_DH, 003_HEAT, 003_PS
003_HEAT, 003_PS, 003_DH
003_HEAT, 003_DH, 003_PS
*/
What I'm looking for is this:
/*
OUTPUT:
003_PS, 003_DH, 003_HEAT
003_PS, 003_DH
003_PS, 003_HEAT
003_PS
003_DH, 003_HEAT
003_DH
003_HEAT
*/
The size is not limited to 3 items and each entry is unique.
What do I need to change in this function? I'm open to a LINQ solution
Any help is appreciated. Thanks!
EDIT: list above was not accurate to output
**EDIT #2:
Here's a Javascript version of what I'm looking for. But I don't know what the C# equivalent syntax is:
function getUniqueList(arr){
if (arr.length === 1) return [arr];
else {
subarr = getUniqueList(arr.slice(1));
return subarr.concat(subarr.map(e => e.concat(arr[0])), [[arr[0]]]);
}
}
.

OK, you have n items, and you want the 2n combinations of those items.
FYI, this is called the "power set" when you do it on sets; since sequences can contain duplicates and sets cannot, what you want is not exactly the power set, but it's close enough. The code I am presenting will be in a sense wrong if your sequence contains duplicates, since we'll say that the result for {10, 10} will be {}, {10}, {10}, {10, 10} but if you don't like that, well, remove duplicates before you start, and then you'll be computing the power set.
Computing the sequence of all combinations is straightforward. We reason as follows:
If there are zero items then there is only one combination: the combination that has zero items.
If there are n > 0 items then the combinations are all the combinations with the first element, and all the combinations without the first element.
Let's write the code:
using System;
using System.Collections.Generic;
using System.Linq;
static class Extensions
{
public static IEnumerable<T> Prepend<T>(
this IEnumerable<T> items,
T first)
{
yield return first;
foreach(T item in items)
yield return item;
}
public static IEnumerable<IEnumerable<T>> Combinations<T>(
this IEnumerable<T> items)
{
if (!items.Any())
yield return items;
else
{
var head = items.First();
var tail = items.Skip(1);
foreach(var sequence in tail.Combinations())
{
yield return sequence; // Without first
yield return sequence.Prepend(head);
}
}
}
}
public class Program
{
public static void Main()
{
var items = new [] { 10, 20, 30 };
foreach(var sequence in items.Combinations())
Console.WriteLine($"({string.Join(",", sequence)})");
}
}
And we get the output
()
(10)
(20)
(10,20)
(30)
(10,30)
(20,30)
(10,20,30)
Note that this code is not efficient if the original sequence is very long.
EXERCISE 1: Do you see why?
EXERCISE 2: Can you make this code efficient in both time and space for long sequences? (Hint: if you can make the head, tail and prepend operations more memory-efficient, that will go a long way.)
EXERCISE 3: You give the same algorithm in JavaScript; can you make the analogies between each part of the JS algorithm and the C# version?
Of course, we assume that the number of items in the original sequence is short. If it's even a hundred, you're going to be waiting a long time to enumerate all those trillions of combinations.
If you want to get all the combinations with a specific number of elements, see my series of articles on algorithms for that: https://ericlippert.com/2014/10/13/producing-combinations-part-one/

Related

Testing an infinite sequence with NUnit, but only a finite sequence must be checked

I have an Extension Method that extract the numbers in a list of strings (bbb1, da21ds, dsa231djsla90 ==> 1, 21, 23190):
public static int[] GetContainedNumbers(this string[] source)
{
if (source == null) throw new ArgumentNullException();
int[] result = new int[] {};
foreach (var s in source)
{
string onlyNumbers = new String(s.Where(Char.IsDigit).ToArray());
if (String.IsNullOrEmpty(onlyNumbers)) throw new ArgumentException();
int extractedNumber = Int32.Parse(onlyNumbers);
result = result.Concat(new int[] { extractedNumber }).ToArray();
}
return result;
}
I need to make a test with NUnit. The request is to make a test with an infinite sequence of strings (a1z, a2z, a3z, ...), but the output only needs to check the first 100 numbers.
Currently I have no idea what I'm supposed to do. My starting idea was to create a test like this:
int[] expectedResult = Enumerable.Range(0, 99).ToArray();
string[] source = new string[] {};
int n = 0;
while (true)
{
if (n == 99) break;
source = source.Concat(new string[] { "a" + n + "z" }).ToArray();
n++;
}
Assert.AreEqual(expectedResult, source.GetContainedNumbers());
But it doesn't really make sense since the array is finite and not infinite.
I don't know what it means to create an infinite list of things and how I should create it nor testing it.
I can edit the Extension Method if needed. Thanks in advance for any help and sorry if my english is quite broken.
The first task is to make the Extension method itself, which I have resolved above. It can be modified, but it must resolve this piece of code (which I can't edit):
foreach (var d in new[] { "1qui7", "q8u8o", "-1024", "0q0ua0" }.GetContainedNumbers())
{
Console.WriteLine("{0}, ", d);
var strings = new [ ] { "1qui7" , " q8u8o " , " −1024" , " 0 q0ua0 " };
}
Expected output: 17, 88, 1024, 0,

I believe the purpose of this exercise is to investigate lazy evaluation.
Currently your method accepts a string[] and returns an int[] - but there's nothing in the question that says it has to do that.
If instead you were to write an extension method accepting an IEnumerable<string> and returning an IEnumerable<int>, then you could accept an infinite sequence of elements and return a corresponding infinite sequence of outputs, that is lazily evaluated: when the caller requests the next output element, you in turn request the next input element, process it, and yield an output. C# makes all of this quite straightforward with iterator blocks.
So I would expect a method like this:
public static IEnumerable<int> GetContainedNumbers(IEnumerable<string> source)
{
foreach (string inputElement)
{
int outputElement = ...; // TODO: your logic here
yield return output;
}
}
Now to test this, you can either use iterator blocks again to generate genuinely infinite sequences, or you could test against an effectively infinite sequence using LINQ, e.g.
IEnumerable<string> veryLargeInput = Enumerable
.Range(0, int.MaxValue)
.Select(x => $"a{x}z");
The truly infinite version might be something like:
IEnumerable<string> infiniteSequence = GetInfiniteSequence();
...
private static IEnumerable<string> GetInfiniteSequence()
{
int value = 0;
while (true)
{
yield return $"x{value}z";
// Eventually this will loop round from int.MaxValue to
// int.MinValue. You could use BigInteger if you really wanted.
value++;
}
}
When testing the result, you'll want to take the first 100 elements of the output. For example:
var input = GetInfiniteSequence(); // Or the huge Enumerable.Range
var output = input.GetContainedNumbers(); // This is still infinite
var first100Elements = output.Take(100).ToList();
// Now make assertions against first100Elements

Why is this printing out the numbers from highest to lowest?

I'm currently taking a intermediate course on Udemy for C# and I'm trying to do one of the exercises. I've looked in the Q&A for students to see other peoples solutions, I've even copied and pasted other peoples solutions to see if theirs works and they do, I don't see any difference between mine and other peoples but for some reason my code prints out the numbers from highest to lowest and no where in the code should this happen. The idea of the exercise was to create a stack, we have 3 methods: Push(), Pop(), and Clear(). The push method adds objects to an ArrayList, the pop method removes the number from the top of the stack and returns the number. The clear method is self explanatory. Here's my code:
Stack Class:
public class Stack {
private ArrayList _arrayList = new ArrayList();
public void Push(object obj) {
if (obj is null) {
throw new InvalidOperationException();
}
else {
_arrayList.Add(obj);
}
}
public object Pop() {
if (_arrayList is null) {
throw new InvalidOperationException();
}
else {
var top = _arrayList.Count;
_arrayList.Remove(top);
return top;
}
}
public void Clear() {
for (int i = 0; i < _arrayList.Count; i++) {
_arrayList.Remove(i);
}
}
}
Program Class:
class Program {
static void Main(string[] args) {
var stack = new Stack();
stack.Push(5);
stack.Push(1);
stack.Push(2);
stack.Push(4);
stack.Push(3);
Console.WriteLine(stack.Pop());
Console.WriteLine(stack.Pop());
Console.WriteLine(stack.Pop());
Console.WriteLine(stack.Pop());
Console.WriteLine(stack.Pop());
}
}

var top = _arrayList.Count;
_arrayList.Remove(top);
return top;
You aren't printing the values, you're printing the number of elements stored.
Try changing your values to be something other than the first few positive integers to catch this kind of mistake more easily.
PS: There's been no reason to use ArrayList for over a decade. The generic collection classes such as List<int> are better in every way -- faster, less wasted memory, type safety.

var top = _arrayList.Count;
_arrayList.Remove(top);
return top;
top is assigned the value of the size of the list, and is never reassigned. Thus it looks like it prints highest to lowest because its just printing the size of the stack (which of course is 5, then 4, then 3, and so on). It only looks like you printed the contents of your list because you happened to push the same numbers. Some different test data would have made this bug more obvious.
I think what you actually wanted was
var top = _arrayList[_arrayList.Count -1];
Note that your code would fail miserably if there was a duplicate element in the list (due to removing based on the item value and not on the index). You also really shouldn't be using ArrayList at all; that's a .NET 1.0 class that's just an awful collection interface. Use a generic like List<T>.

You are returning
var top = _arrayList.Count;
Not the value in that position.
_arrayList(top-1);
You should be getting an index out of range error on that remove.
_arrayList.Remove(top);

How to check if random values are unique?

C # code:
I have 20 random numbers between 1-100 in an array and the program should check if every value is unique. Now i should use another method which returns true if there are only unique values in the array and false if there are not any unique values in the array. I would appreciate if someone could help me with this.

bool allUnique = array.Distinct().Count() == array.Count(); // or array.Length
or
var uniqueNumbers = new HashSet<int>(array);
bool allUnique = uniqueNumbers.Count == array.Count();

A small alternative to #TimSchmelters excellent answers that can run a bit more efficient:
public static bool AllUniq<T> (this IEnumerable<T> data) {
HashSet<T> hs = new HashSet<T>();
return data.All(hs.Add);
}
What this basically does is generating a for loop:
public static bool AllUniq<T> (this IEnumerable<T> data) {
HashSet<T> hs = new HashSet<T>();
foreach(T x in data) {
if(!hs.Add(x)) {
return false;
}
}
return true;
}
From the moment one hs.Add fails - this because the element already exists - the method returns false, if no such object can be found, it returns true.
The reason that this can work faster is that it will stop the process from the moment a duplicate is found whereas the previously discussed approaches first construct a collection of unique numbers and then compare the size. Now if you iterate over large amount of numbers, constructing the entire distinct list can be computationally intensive.
Furthermore note that there are more clever ways than generate-and-test to generate random distinct numbers. For instance interleave the generate and test procedure. Once a project I had to correct generated Sudoku's this way. The result was that one had to wait entire days before it came up with a puzzle.

Here's a non linq solution
for(int i=0; i< YourArray.Length;i++)
{
for(int x=i+1; x< YourArray.Length; x++)
{
if(YourArray[i] == YourArray[x])
{
Console.WriteLine("Found repeated value");
}
}
}

IEnumerable<IEnumerable<int>> - no duplicate IEnumerable<int>s

I'm trying to find a solution to this problem:
Given a IEnumerable< IEnumerable< int>> I need a method/algorithm that returns the input, but in case of several IEnmerable< int> with the same elements only one per coincidence/group is returned.
ex.
IEnumerable<IEnumerable<int>> seqs = new[]
{
new[]{2,3,4}, // #0
new[]{1,2,4}, // #1 - equals #3
new[]{3,1,4}, // #2
new[]{4,1,2} // #3 - equals #1
};
"foreach seq in seqs" .. yields {#0,#1,#2} or {#0,#2,#3}
Sould I go with ..
.. some clever IEqualityComparer
.. some clever LINQ combination I havent figured out - groupby, sequenceequal ..?
.. some seq->HashSet stuff
.. what not. Anything will help
I'll be able to solve it by good'n'old programming but inspiration is always appreciated.

Here's a slightly simpler version of digEmAll's answer:
var result = seqs.Select(x => new HashSet<int>(x))
.Distinct(HashSet<int>.CreateSetComparer());
Given that you want to treat the elements as sets, you should have them that way to start with, IMO.
Of course this won't help if you want to maintain order within the sequences that are returned, you just don't mind which of the equal sets is returned... the above code will return an IEnumerable<HashSet<int>> which will no longer have any ordering within each sequence. (The order in which the sets are returned isn't guaranteed either, although it would be odd for them not to be return in first-seen-first-returned basis.)
It feels unlikely that this wouldn't be enough, but if you could give more details of what you really need to achieve, that would make it easier to help.
As noted in comments, this will also assume that there are no duplicates within each original source array... or at least, that they're irrelevant, so you're happy to treat { 1 } and { 1, 1, 1, 1 } as equal.

Use the correct collection type for the job. What you really want is ISet<IEnumerable<int>> with an equality comparer that will ignore the ordering of the IEnumerables.

EDITED:
You can get what you want by building your own IEqualityComparer<IEnumerable<int>> e.g.:
public class MyEqualityComparer : IEqualityComparer<IEnumerable<int>>
{
public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
{
return x.OrderBy(el1 => el1).SequenceEqual(y.OrderBy(el2 => el2));
}
public int GetHashCode(IEnumerable<int> elements)
{
int hash = 0;
foreach (var el in elements)
{
hash = hash ^ el.GetHashCode();
}
return hash;
}
}
Usage:
var values = seqs.Distinct(new MyEqualityComparer()).ToList();
N.B.
this solution is slightly different from the one given by Jon Skeet.
His answer considers sublists as sets, so basically two lists like [1,2] and [1,1,1,2,2] are equal.
This solution don't, i.e. :
[1,2,1,1] is equal to [2,1,1,1] but not to [2,2,1,1], hence basically the two lists have to contain the same elements and in the same number of occurrences.

Some help understanding "yield"

In my everlasting quest to suck less I'm trying to understand the "yield" statement, but I keep encountering the same error.
The body of [someMethod] cannot be an iterator block because
'System.Collections.Generic.List< AClass>' is not an iterator interface type.
This is the code where I got stuck:
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
What am I doing wrong? Can't I use yield in an iterator? Then what's the point?
In this example it said that List<ProductMixHeader> is not an iterator interface type.
ProductMixHeader is a custom class, but I imagine List is an iterator interface type, no?
--Edit--
Thanks for all the quick answers.
I know this question isn't all that new and the same resources keep popping up.
It turned out I was thinking I could return List<AClass> as a return type, but since List<T> isn't lazy, it cannot. Changing my return type to IEnumerable<T> solved the problem :D
A somewhat related question (not worth opening a new thread): is it worth giving IEnumerable<T> as a return type if I'm sure that 99% of the cases I'm going to go .ToList() anyway? What will the performance implications be?

A method using yield return must be declared as returning one of the following two interfaces:
IEnumerable<SomethingAppropriate>
IEnumerator<SomethingApropriate>
(thanks Jon and Marc for pointing out IEnumerator)
Example:
public IEnumerable<AClass> YourMethod()
{
foreach (XElement header in headersXml.Root.Elements())
{
yield return (ParseHeader(header));
}
}
yield is a lazy producer of data, only producing another item after the first has been retrieved, whereas returning a list will return everything in one go.
So there is a difference, and you need to declare the method correctly.
For more information, read Jon's answer here, which contains some very useful links.

It's a tricky topic. In a nutshell, it's an easy way of implementing IEnumerable and its friends. The compiler builds you a state machine, transforming parameters and local variables into instance variables in a new class. Complicated stuff.
I have a few resources on this:
Chapter 6 of C# in Depth (free download from that page)
Iterators, iterator blocks and data pipelines (article)
Iterator block implementation details (article)

"yield" creates an iterator block - a compiler generated class that can implement either IEnumerable[<T>] or IEnumerator[<T>]. Jon Skeet has a very good (and free) discussion of this in chapter 6 of C# in Depth.
But basically - to use "yield" your method must return an IEnumerable[<T>] or IEnumerator[<T>]. In this case:
public IEnumerable<AClass> SomeMethod() {
// ...
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
}

List implements Ienumerable.
Here's an example that might shed some light on what you are trying to learn. I wrote this about 6 months
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (0 == integer)
return false;
if (3 > integer)
return true;
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimes()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
foreach (int i in primes.FindPrimes())
{
Console.WriteLine(i);
Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}

I highly recommend using Reflector to have a look at what yield actually does for you. You'll be able to see the full code of the class that the compiler generates for you when using yield, and I've found that people understand the concept much more quickly when they can see the low-level result (well, mid-level I guess).

To understand yield, you need to understand when to use IEnumerator and IEnumerable (because you have to use either of them). The following examples help you to understand the difference.
First, take a look at the following class, it implements two methods - one returning IEnumerator<int>, one returning IEnumerable<int>. I'll show you that there is a big difference in usage, although the code of the 2 methods is looking similar:
// 2 iterators, one as IEnumerator, one as IEnumerable
public class Iterator
{
public static IEnumerator<int> IterateOne(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
public static IEnumerable<int> IterateAll(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
}
Now, if you're using IterateOne you can do the following:
// 1. Using IEnumerator allows to get item by item
var i=Iterator.IterateOne(x => true); // iterate endless
// 1.a) get item by item
i.MoveNext(); Console.WriteLine(i.Current);
i.MoveNext(); Console.WriteLine(i.Current);
// 1.b) loop until 100
int j; while (i.MoveNext() && (j=i.Current)<=100) { Console.WriteLine(j); }
1.a) prints:
1
2
1.b) prints:
3
4
...
100
because it continues counting right after the 1.a) statements have been executed.
You can see that you can advance item by item using MoveNext().
In contrast, IterateAll allows you to use foreach and also LINQ statements for bigger comfort:
// 2. Using IEnumerable makes looping and LINQ easier
var k=Iterator.IterateAll(x => x<100); // limit iterator to 100
// 2.a) Use a foreach loop
foreach(var x in k){ Console.WriteLine(x); } // loop
// 2.b) LINQ: take 101..200 of endless iteration
var lst=Iterator.IterateAll(x=>true).Skip(100).Take(100).ToList(); // LINQ: take items
foreach(var x in lst){ Console.WriteLine(x); } // output list
2.a) prints:
1
2
...
99
2.b) prints:
101
102
...
200
Note: Since IEnumerator<T> and IEnumerable<T> are Generics, they can be used with any type. However, for simplicity I have used int in my examples for type T.
This means, you can use one of the return types IEnumerator<ProductMixHeader> or IEnumerable<ProductMixHeader> (the custom class you have mentioned in your question).
The type List<ProductMixHeader> does not implement any of these interfaces, which is the reason why you can't use it that way. But Example 2.b) is showing how you can create a list from it.
If you're creating a list by appending .ToList() then the implication is, that it will create a list of all elements in memory, while an IEnumerable allows lazy creation of its elements - in terms of performance, it means that elements are enumerated just in time - as late as possible, but as soon as you're using .ToList(), then all elements are created in memory. LINQ tries to optimize performance this way behind the scenes.
DotNetFiddle of all examples

#Ian P´s answer helped me a lot to understand yield and why it is used. One (major) use case for yield is in "foreach" loops after the "in" keyword not to return a fully completed list. Instead of returning a complete list at once, in each "foreach" loop only one item (the next item) is returned. So you will gain performance with yield in such cases.
I have rewritten #Ian P´s code for my better understanding to the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (0 == integer)
return false;
if (3 > integer)
return true;
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimesWithYield()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
public IEnumerable<int> FindPrimesWithoutYield()
{
var primes = new List<int>();
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
primes.Add(i);
}
}
return primes;
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
Console.WriteLine("Finding primes until 7 with yield...very fast...");
foreach (int i in primes.FindPrimesWithYield()) // FindPrimesWithYield DOES NOT iterate over all integers at once, it returns item by item
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.WriteLine("Finding primes until 7 without yield...be patient it will take lonkg time...");
foreach (int i in primes.FindPrimesWithoutYield()) // FindPrimesWithoutYield DOES iterate over all integers at once, it returns the complete list of primes at once
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}

What does the method you're using this in look like? I don't think this can be used in just a loop by itself.
For example...
public IEnumerable<string> GetValues() {
foreach(string value in someArray) {
if (value.StartsWith("A")) { yield return value; }
}
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# All Unique Combinations of List<string> [duplicate] - c#

Related

Testing an infinite sequence with NUnit, but only a finite sequence must be checked

Why is this printing out the numbers from highest to lowest?

How to check if random values are unique?

IEnumerable<IEnumerable<int>> - no duplicate IEnumerable<int>s

Some help understanding "yield"

Categories

Resources