How to get all subsets of an array? - c#

Given an array: [dog, cat, mouse]
what is the most elegant way to create:
[,,]
[,,mouse]
[,cat,]
[,cat,mouse]
[dog,,]
[dog,,mouse]
[dog,cat,]
[dog,cat,mouse]
I need this to work for any sized array.
This is essentially a binary counter, where array indices represent bits. This presumably lets me use some bitwise operation to count, but I can't see a nice way of translating this to array indices though.

Elegant? Why not Linq it.
public static IEnumerable<IEnumerable<T>> SubSetsOf<T>(IEnumerable<T> source)
{
if (!source.Any())
return Enumerable.Repeat(Enumerable.Empty<T>(), 1);
var element = source.Take(1);
var haveNots = SubSetsOf(source.Skip(1));
var haves = haveNots.Select(set => element.Concat(set));
return haves.Concat(haveNots);
}

string[] source = new string[] { "dog", "cat", "mouse" };
for (int i = 0; i < Math.Pow(2, source.Length); i++)
{
string[] combination = new string[source.Length];
for (int j = 0; j < source.Length; j++)
{
if ((i & (1 << (source.Length - j - 1))) != 0)
{
combination[j] = source[j];
}
}
Console.WriteLine("[{0}, {1}, {2}]", combination[0], combination[1], combination[2]);
}

You can use the BitArray class to easily access the bits in a number:
string[] animals = { "Dog", "Cat", "Mouse" };
List<string[]> result = new List<string[]>();
int cnt = 1 << animals.Length;
for (int i = 0; i < cnt; i++) {
string[] item = new string[animals.Length];
BitArray b = new BitArray(i);
for (int j = 0; j < item.Length; j++) {
item[j] = b[j] ? animals[j] : null;
}
result.Add(item);
}

static IEnumerable<IEnumerable<T>> GetSubsets<T>(IList<T> set)
{
var state = new BitArray(set.Count);
do
yield return Enumerable.Range(0, state.Count)
.Select(i => state[i] ? set[i] : default(T));
while (Increment(state));
}
static bool Increment(BitArray flags)
{
int x = flags.Count - 1;
while (x >= 0 && flags[x]) flags[x--] = false ;
if (x >= 0) flags[x] = true;
return x >= 0;
}
Usage:
foreach(var strings in GetSubsets(new[] { "dog", "cat", "mouse" }))
Console.WriteLine(string.Join(", ", strings.ToArray()));

Guffa's answer had the basic functionality that I was searching, however the line with
BitArray b = new BitArray(i);
did not work for me, it gave an ArgumentOutOfRangeException. Here's my slightly adjusted and working code:
string[] array = { "A", "B", "C","D" };
int count = 1 << array.Length; // 2^n
for (int i = 0; i < count; i++)
{
string[] items = new string[array.Length];
BitArray b = new BitArray(BitConverter.GetBytes(i));
for (int bit = 0; bit < array.Length; bit++) {
items[bit] = b[bit] ? array[bit] : "";
}
Console.WriteLine(String.Join("",items));
}

Here's a solution similar to David B's method, but perhaps more suitable if it's really a requirement that you get back sets with the original number of elements (even if empty):.
static public List<List<T>> GetSubsets<T>(IEnumerable<T> originalList)
{
if (originalList.Count() == 0)
return new List<List<T>>() { new List<T>() };
var setsFound = new List<List<T>>();
foreach (var list in GetSubsets(originalList.Skip(1)))
{
setsFound.Add(originalList.Take(1).Concat(list).ToList());
setsFound.Add(new List<T>() { default(T) }.Concat(list).ToList());
}
return setsFound;
}
If you pass in a list of three strings, you'll get back eight lists with three elements each (but some elements will be null).

Here's an easy-to-follow solution along the lines of your conception:
private static void Test()
{
string[] test = new string[3] { "dog", "cat", "mouse" };
foreach (var x in Subsets(test))
Console.WriteLine("[{0}]", string.Join(",", x));
}
public static IEnumerable<T[]> Subsets<T>(T[] source)
{
int max = 1 << source.Length;
for (int i = 0; i < max; i++)
{
T[] combination = new T[source.Length];
for (int j = 0; j < source.Length; j++)
{
int tailIndex = source.Length - j - 1;
combination[tailIndex] =
((i & (1 << j)) != 0) ? source[tailIndex] : default(T);
}
yield return combination;
}
}

This is a small change to Mehrdad's solution above:
static IEnumerable<T[]> GetSubsets<T>(T[] set) {
bool[] state = new bool[set.Length+1];
for (int x; !state[set.Length]; state[x] = true ) {
yield return Enumerable.Range(0, state.Length)
.Where(i => state[i])
.Select(i => set[i])
.ToArray();
for (x = 0; state[x]; state[x++] = false);
}
}
or with pointers
static IEnumerable<T[]> GetSubsets<T>(T[] set) {
bool[] state = new bool[set.Length+1];
for (bool *x; !state[set.Length]; *x = true ) {
yield return Enumerable.Range(0, state.Length)
.Where(i => state[i])
.Select(i => set[i])
.ToArray();
for (x = state; *x; *x++ = false);
}
}

I'm not very familiar with C# but I'm sure there's something like:
// input: Array A
foreach S in AllSubsetsOf1ToN(A.Length):
print (S.toArray().map(lambda x |> A[x]));
Ok, I've been told the answer above won't work. If you value elegance over efficiency, I would try recursion, in my crappy pseudocode:
Array_Of_Sets subsets(Array a)
{
if (a.length == 0)
return [new Set();] // emptyset
return subsets(a[1:]) + subsets(a[1:]) . map(lambda x |> x.add a[0])
}

Here is a variant of mqp's answer, that uses as state a BigInteger instead of an int, to avoid overflow for collections containing more than 30 elements:
using System.Numerics;
public static IEnumerable<IEnumerable<T>> GetSubsets<T>(IList<T> source)
{
BigInteger combinations = BigInteger.One << source.Count;
for (BigInteger i = 0; i < combinations; i++)
{
yield return Enumerable.Range(0, source.Count)
.Select(j => (i & (BigInteger.One << j)) != 0 ? source[j] : default);
}
}

Easy to understand version (with descriptions)
I assumed that source = {1,2,3,4}
public static IEnumerable<IEnumerable<T>> GetSubSets<T>(IEnumerable<T> source)
{
var result = new List<IEnumerable<T>>() { new List<T>() }; // empty cluster added
for (int i = 0; i < source.Count(); i++)
{
var elem = source.Skip(i).Take(1);
// for elem = 2
// and currently result = [ [],[1] ]
var matchUps = result.Select(x => x.Concat(elem));
//then matchUps => [ [2],[1,2] ]
result = result.Concat(matchUps).ToList();
// matchUps and result concat operation
// finally result = [ [],[1],[2],[1,2] ]
}
return result;
}

The way this is written, it is more of a Product (Cartesian product) rather than a list of all subsets.
You have three sets: (Empty,"dog"), (Empty,"cat"),(Empty,"mouse").
There are several posts on general solutions for products. As noted though, since you really just have 2 choices for each axis a single bit can represent the presence or not of the item.
So the total set of sets is all numbers from 0 to 2^N-1. If N < 31 an int will work.

Related

Find the missing integer in Codility

I need to "Find the minimal positive integer not occurring in a given sequence. "
A[0] = 1
A[1] = 3
A[2] = 6
A[3] = 4
A[4] = 1
A[5] = 2, the function should return 5.
Assume that:
N is an integer within the range [1..100,000];
each element of array A is an integer within the range [−2,147,483,648..2,147,483,647].
I wrote the code in codility, but for many cases it did not worked and the performance test gives 0 %. Please help me out, where I am wrong.
class Solution {
public int solution(int[] A) {
if(A.Length ==0) return -1;
int value = A[0];
int min = A.Min();
int max = A.Max();
for (int j = min+1; j < max; j++)
{
if (!A.Contains(j))
{
value = j;
if(value > 0)
{
break;
}
}
}
if(value > 0)
{
return value;
}
else return 1;
}
}
The codility gives error with all except the example, positive and negative only values.
Edit: Added detail to answer your actual question more directly.
"Please help me out, where I am wrong."
In terms of correctness: Consider A = {7,2,5,6,3}. The correct output, given the contents of A, is 1, but our algorithm would fail to detect this since A.Min() would return 2 and we would start looping from 3 onward. In this case, we would return 4 instead; since it's the next missing value.
Same goes for something like A = {14,15,13}. The minimal missing positive integer here is again 1 and, since all the values from 13-15 are present, the value variable will retain its initial value of value=A[0] which would be 14.
In terms of performance: Consider what A.Min(), A.Max() and A.Contains() are doing behind the scenes; each one of these is looping through A in its entirety and in the case of Contains, we are calling it repeatedly for every value between the Min() and the lowest positive integer we can find. This will take us far beyond the specified O(N) performance that Codility is looking for.
By contrast, here's the simplest version I can think of that should score 100% on Codility. Notice that we only loop through A once and that we take advantage of a Dictionary which lets us use ContainsKey; a much faster method that does not require looping through the whole collection to find a value.
using System;
using System.Collections.Generic;
class Solution {
public int solution(int[] A) {
// the minimum possible answer is 1
int result = 1;
// let's keep track of what we find
Dictionary<int,bool> found = new Dictionary<int,bool>();
// loop through the given array
for(int i=0;i<A.Length;i++) {
// if we have a positive integer that we haven't found before
if(A[i] > 0 && !found.ContainsKey(A[i])) {
// record the fact that we found it
found.Add(A[i], true);
}
}
// crawl through what we found starting at 1
while(found.ContainsKey(result)) {
// look for the next number
result++;
}
// return the smallest positive number that we couldn't find.
return result;
}
}
The simplest solution that scored perfect score was:
public int solution(int[] A)
{
int flag = 1;
A = A.OrderBy(x => x).ToArray();
for (int i = 0; i < A.Length; i++)
{
if (A[i] <= 0)
continue;
else if (A[i] == flag)
{
flag++;
}
}
return flag;
}
Fastest C# solution so far for [-1,000,000...1,000,000].
public int solution(int[] array)
{
HashSet<int> found = new HashSet<int>();
for (int i = 0; i < array.Length; i++)
{
if (array[i] > 0)
{
found.Add(array[i]);
}
}
int result = 1;
while (found.Contains(result))
{
result++;
}
return result;
}
A tiny version of another 100% with C#
using System.Linq;
class Solution
{
public int solution(int[] A)
{
// write your code in C# 6.0 with .NET 4.5 (Mono)
var i = 0;
return A.Where(a => a > 0).Distinct().OrderBy(a => a).Any(a => a != (i = i + 1)) ? i : i + 1;
}
}
A simple solution that scored 100% with C#
int Solution(int[] A)
{
var A2 = Enumerable.Range(1, A.Length + 1);
return A2.Except(A).First();
}
public class Solution {
public int solution( int[] A ) {
return Arrays.stream( A )
.filter( n -> n > 0 )
.sorted()
.reduce( 0, ( a, b ) -> ( ( b - a ) > 1 ) ? a : b ) + 1;
}
}
It seemed easiest to just filter out the negative numbers. Then sort the stream. And then reduce it to come to an answer. It's a bit of a functional approach, but it got a 100/100 test score.
Got an 100% score with this solution:
https://app.codility.com/demo/results/trainingUFKJSB-T8P/
public int MissingInteger(int[] A)
{
A = A.Where(a => a > 0).Distinct().OrderBy(c => c).ToArray();
if (A.Length== 0)
{
return 1;
}
for (int i = 0; i < A.Length; i++)
{
//Console.WriteLine(i + "=>" + A[i]);
if (i + 1 != A[i])
{
return i + 1;
}
}
return A.Max() + 1;
}
JavaScript solution using Hash Table with O(n) time complexity.
function solution(A) {
let hashTable = {}
for (let item of A) {
hashTable[item] = true
}
let answer = 1
while(true) {
if(!hashTable[answer]) {
return answer
}
answer++
}
}
The Simplest solution for C# would be:
int value = 1;
int min = A.Min();
int max = A.Max();
if (A.Length == 0) return value = 1;
if (min < 0 && max < 0) return value = 1;
List<int> range = Enumerable.Range(1, max).ToList();
List<int> current = A.ToList();
List<int> valid = range.Except(current).ToList();
if (valid.Count() == 0)
{
max++;
return value = max;
}
else
{
return value = valid.Min();
}
Considering that the array should start from 1 or if it needs to start from the minimum value than the Enumerable.range should start from Min
MissingInteger solution in C
int solution(int A[], int N) {
int i=0,r[N];
memset(r,0,(sizeof(r)));
for(i=0;i<N;i++)
{
if(( A[i] > 0) && (A[i] <= N)) r[A[i]-1]=A[i];
}
for(i=0;i<N;i++)
{
if( r[i] != (i+1)) return (i+1);
}
return (N+1);
}
My solution for it:
public static int solution()
{
var A = new[] { -1000000, 1000000 }; // You can try with different integers
A = A.OrderBy(i => i).ToArray(); // We sort the array first
if (A.Length == 1) // if there is only one item in the array
{
if (A[0]<0 || A[0] > 1)
return 1;
if (A[0] == 1)
return 2;
}
else // if there are more than one item in the array
{
for (var i = 0; i < A.Length - 1; i++)
{
if (A[i] >= 1000000) continue; // if it's bigger than 1M
if (A[i] < 0 || (A[i] + 1) >= (A[i + 1])) continue; //if it's smaller than 0, if the next integer is bigger or equal to next integer in the sequence continue searching.
if (1 < A[0]) return 1;
return A[i] + 1;
}
}
if (1 < A[0] || A[A.Length - 1] + 1 == 0 || A[A.Length - 1] + 1 > 1000000)
return 1;
return A[A.Length-1] +1;
}
class Solution {
public int solution(int[] A) {
int size=A.length;
int small,big,temp;
for (int i=0;i<size;i++){
for(int j=0;j<size;j++){
if(A[i]<A[j]){
temp=A[j];
A[j]=A[i];
A[i]=temp;
}
}
}
int z=1;
for(int i=0;i<size;i++){
if(z==A[i]){
z++;
}
//System.out.println(a[i]);
}
return z;
}
enter code here
}
In C# you can solve the problem by making use of built in library functions. How ever the performance is low for very large integers
public int solution(int[] A)
{
var numbers = Enumerable.Range(1, Math.Abs(A.Max())+1).ToArray();
return numbers.Except(A).ToArray()[0];
}
Let me know if you find a better solution performance wise
C# - MissingInteger
Find the smallest missing integer between 1 - 1000.000.
Assumptions of the OP take place
TaskScore/Correctness/Performance: 100%
using System;
using System.Linq;
namespace TestConsole
{
class Program
{
static void Main(string[] args)
{
var A = new int[] { -122, -5, 1, 2, 3, 4, 5, 6, 7 }; // 8
var B = new int[] { 1, 3, 6, 4, 1, 2 }; // 5
var C = new int[] { -1, -3 }; // 1
var D = new int[] { -3 }; // 1
var E = new int[] { 1 }; // 2
var F = new int[] { 1000000 }; // 1
var x = new int[][] { A, B, C, D, E, F };
x.ToList().ForEach((arr) =>
{
var s = new Solution();
Console.WriteLine(s.solution(arr));
});
Console.ReadLine();
}
}
// ANSWER/SOLUTION
class Solution
{
public int solution(int[] A)
{
// clean up array for negatives and duplicates, do sort
A = A.Where(entry => entry > 0).Distinct().OrderBy(it => it).ToArray();
int lowest = 1, aLength = A.Length, highestIndex = aLength - 1;
for (int i = 0; i < aLength; i++)
{
var currInt = A[i];
if (currInt > lowest) return lowest;
if (i == highestIndex) return ++lowest;
lowest++;
}
return 1;
}
}
}
Got 100% - C# Efficient Solution
public int solution (int [] A){
int len = A.Length;
HashSet<int> realSet = new HashSet<int>();
HashSet<int> perfectSet = new HashSet<int>();
int i = 0;
while ( i < len)
{
realSet.Add(A[i]); //convert array to set to get rid of duplicates, order int's
perfectSet.Add(i + 1); //create perfect set so can find missing int
i++;
}
perfectSet.Add(i + 1);
if (realSet.All(item => item < 0))
return 1;
int notContains =
perfectSet.Except(realSet).Where(item=>item!=0).FirstOrDefault();
return notContains;
}
class Solution {
public int solution(int[] a) {
int smallestPositive = 1;
while(a.Contains(smallestPositive)) {
smallestPositive++;
}
return smallestPositive;
}
}
Well, this is a new winner now. At least on C# and my laptop. It's 1.5-2 times faster than the previous champion and 3-10 times faster, than most of the other solutions. The feature (or a bug?) of this solution is that it uses only basic data types. Also 100/100 on Codility.
public int Solution(int[] A)
{
bool[] B = new bool[(A.Length + 1)];
for (int i = 0; i < A.Length; i++)
{
if ((A[i] > 0) && (A[i] <= A.Length))
B[A[i]] = true;
}
for (int i = 1; i < B.Length; i++)
{
if (!B[i])
return i;
}
return A.Length + 1;
}
Simple C++ solution. No additional memory need, time execution order O(N*log(N)):
int solution(vector<int> &A) {
sort (A.begin(), A.end());
int prev = 0; // the biggest integer greater than 0 found until now
for( auto it = std::begin(A); it != std::end(A); it++ ) {
if( *it > prev+1 ) break;// gap found
if( *it > 0 ) prev = *it; // ignore integers smaller than 1
}
return prev+1;
}
int[] A = {1, 3, 6, 4, 1, 2};
Set<Integer> integers = new TreeSet<>();
for (int i = 0; i < A.length; i++) {
if (A[i] > 0) {
integers.add(A[i]);
}
}
Integer[] arr = integers.toArray(new Integer[0]);
final int[] result = {Integer.MAX_VALUE};
final int[] prev = {0};
final int[] curr2 = {1};
integers.stream().forEach(integer -> {
if (prev[0] + curr2[0] == integer) {
prev[0] = integer;
} else {
result[0] = prev[0] + curr2[0];
}
});
if (Integer.MAX_VALUE == result[0]) result[0] = arr[arr.length-1] + 1;
System.out.println(result[0]);
I was surprised but this was a good lesson. LINQ IS SLOW. my answer below got me 11%
public int solution (int [] A){
if (Array.FindAll(A, x => x >= 0).Length == 0) {
return 1;
} else {
var lowestValue = A.Where(x => Array.IndexOf(A, (x+1)) == -1).Min();
return lowestValue + 1;
}
}
I think I kinda look at this a bit differently but gets a 100% evaluation. Also, I used no library:
public static int Solution(int[] A)
{
var arrPos = new int[1_000_001];
for (int i = 0; i < A.Length; i++)
{
if (A[i] >= 0)
arrPos[A[i]] = 1;
}
for (int i = 1; i < arrPos.Length; i++)
{
if (arrPos[i] == 0)
return i;
}
return 1;
}
public int solution(int[] A) {
// write your code in Java SE 8
Set<Integer> elements = new TreeSet<Integer>();
long lookFor = 1;
for (int i = 0; i < A.length; i++) {
elements.add(A[i]);
}
for (Integer integer : elements) {
if (integer == lookFor)
lookFor += 1;
}
return (int) lookFor;
}
I tried to use recursion in C# instead of sorting, because I thought it would show more coding skill to do it that way, but on the scaling tests it didn't preform well on large performance tests. Suppose it's best to just do the easy way.
class Solution {
public int lowest=1;
public int solution(int[] A) {
// write your code in C# 6.0 with .NET 4.5 (Mono)
if (A.Length < 1)
return 1;
for (int i=0; i < A.Length; i++){
if (A[i]==lowest){
lowest++;
solution(A);
}
}
return lowest;
}
}
Here is my solution in javascript
function solution(A) {
// write your code in JavaScript (Node.js 8.9.4)
let result = 1;
let haveFound = {}
let len = A.length
for (let i=0;i<len;i++) {
haveFound[`${A[i]}`] = true
}
while(haveFound[`${result}`]) {
result++
}
return result
}
class Solution {
public int solution(int[] A) {
var sortedList = A.Where(x => x > 0).Distinct().OrderBy(x => x).ToArray();
var output = 1;
for (int i = 0; i < sortedList.Length; i++)
{
if (sortedList[i] != output)
{
return output;
}
output++;
}
return output;
}
}
You should just use a HashSet as its look up time is also constant instead of a dictionary. The code is less and cleaner.
public int solution (int [] A){
int answer = 1;
var set = new HashSet<int>(A);
while (set.Contains(answer)){
answer++;
}
return answer;
}
This snippet should work correctly.
using System;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
int result = 1;
List<int> lst = new List<int>();
lst.Add(1);
lst.Add(2);
lst.Add(3);
lst.Add(18);
lst.Add(4);
lst.Add(1000);
lst.Add(-1);
lst.Add(-1000);
lst.Sort();
foreach(int curVal in lst)
{
if(curVal <=0)
result=1;
else if(!lst.Contains(curVal+1))
{
result = curVal + 1 ;
}
Console.WriteLine(result);
}
}
}

c# list permutations with limited length

I have a list of Offers, from which I want to create "chains" (e.g. permutations) with limited chain lengths.
I've gotten as far as creating the permutations using the Kw.Combinatorics project.
However, the default behavior creates permutations in the length of the list count. I'm not sure how to limit the chain lengths to 'n'.
Here's my current code:
private static List<List<Offers>> GetPerms(List<Offers> list, int chainLength)
{
List<List<Offers>> response = new List<List<Offers>>();
foreach (var row in new Permutation(list.Count).GetRows())
{
List<Offers> innerList = new List<Offers>();
foreach (var mix in Permutation.Permute(row, list))
{
innerList.Add(mix);
}
response.Add(innerList);
innerList = new List<Offers>();
}
return response;
}
Implemented by:
List<List<AdServer.Offers>> lst = GetPerms(offers, 2);
I'm not locked in KWCombinatorics if someone has a better solution to offer.
Here's another implementation which I think should be faster than the accepted answer (and it's definitely less code).
public static IEnumerable<IEnumerable<T>> GetVariationsWithoutDuplicates<T>(IList<T> items, int length)
{
if (length == 0 || !items.Any()) return new List<List<T>> { new List<T>() };
return from item in items.Distinct()
from permutation in GetVariationsWithoutDuplicates(items.Where(i => !EqualityComparer<T>.Default.Equals(i, item)).ToList(), length - 1)
select Prepend(item, permutation);
}
public static IEnumerable<IEnumerable<T>> GetVariations<T>(IList<T> items, int length)
{
if (length == 0 || !items.Any()) return new List<List<T>> { new List<T>() };
return from item in items
from permutation in GetVariations(Remove(item, items).ToList(), length - 1)
select Prepend(item, permutation);
}
public static IEnumerable<T> Prepend<T>(T first, IEnumerable<T> rest)
{
yield return first;
foreach (var item in rest) yield return item;
}
public static IEnumerable<T> Remove<T>(T item, IEnumerable<T> from)
{
var isRemoved = false;
foreach (var i in from)
{
if (!EqualityComparer<T>.Default.Equals(item, i) || isRemoved) yield return i;
else isRemoved = true;
}
}
On my 3.1 GHz Core 2 Duo, I tested with this:
public static void Test(Func<IList<int>, int, IEnumerable<IEnumerable<int>>> getVariations)
{
var max = 11;
var timer = System.Diagnostics.Stopwatch.StartNew();
for (int i = 1; i < max; ++i)
for (int j = 1; j < i; ++j)
getVariations(MakeList(i), j).Count();
timer.Stop();
Console.WriteLine("{0,40}{1} ms", getVariations.Method.Name, timer.ElapsedMilliseconds);
}
// Make a list that repeats to guarantee we have duplicates
public static IList<int> MakeList(int size)
{
return Enumerable.Range(0, size/2).Concat(Enumerable.Range(0, size - size/2)).ToList();
}
Unoptimized
GetVariations 11894 ms
GetVariationsWithoutDuplicates 9 ms
OtherAnswerGetVariations 22485 ms
OtherAnswerGetVariationsWithDuplicates 243415 ms
With compiler optimizations
GetVariations 9667 ms
GetVariationsWithoutDuplicates 8 ms
OtherAnswerGetVariations 19739 ms
OtherAnswerGetVariationsWithDuplicates 228802 ms
You're not looking for a permutation, but for a variation. Here is a possible algorithm. I prefer iterator methods for functions that can potentially return very many elements. This way, the caller can decide if he really needs all elements:
IEnumerable<IList<T>> GetVariations<T>(IList<T> offers, int length)
{
var startIndices = new int[length];
var variationElements = new HashSet<T>(); //for duplicate detection
while (startIndices[0] < offers.Count)
{
var variation = new List<T>(length);
var valid = true;
for (int i = 0; i < length; ++i)
{
var element = offers[startIndices[i]];
if (variationElements.Contains(element))
{
valid = false;
break;
}
variation.Add(element);
variationElements.Add(element);
}
if (valid)
yield return variation;
//Count up the indices
startIndices[length - 1]++;
for (int i = length - 1; i > 0; --i)
{
if (startIndices[i] >= offers.Count)
{
startIndices[i] = 0;
startIndices[i - 1]++;
}
else
break;
}
variationElements.Clear();
}
}
The idea for this algorithm is to use a number in offers.Count base. For three offers, all digits are in the range 0-2. We then basically increment this number step by step and return the offers that reside at the specified indices. If you want to allow duplicates, you can remove the check and the HashSet<T>.
Update
Here is an optimized variant that does the duplicate check on the index level. In my tests it is a lot faster than the previous variant:
IEnumerable<IList<T>> GetVariations<T>(IList<T> offers, int length)
{
var startIndices = new int[length];
for (int i = 0; i < length; ++i)
startIndices[i] = i;
var indices = new HashSet<int>(); // for duplicate check
while (startIndices[0] < offers.Count)
{
var variation = new List<T>(length);
for (int i = 0; i < length; ++i)
{
variation.Add(offers[startIndices[i]]);
}
yield return variation;
//Count up the indices
AddOne(startIndices, length - 1, offers.Count - 1);
//duplicate check
var check = true;
while (check)
{
indices.Clear();
for (int i = 0; i <= length; ++i)
{
if (i == length)
{
check = false;
break;
}
if (indices.Contains(startIndices[i]))
{
var unchangedUpTo = AddOne(startIndices, i, offers.Count - 1);
indices.Clear();
for (int j = 0; j <= unchangedUpTo; ++j )
{
indices.Add(startIndices[j]);
}
int nextIndex = 0;
for(int j = unchangedUpTo + 1; j < length; ++j)
{
while (indices.Contains(nextIndex))
nextIndex++;
startIndices[j] = nextIndex++;
}
break;
}
indices.Add(startIndices[i]);
}
}
}
}
int AddOne(int[] indices, int position, int maxElement)
{
//returns the index of the last element that has not been changed
indices[position]++;
for (int i = position; i > 0; --i)
{
if (indices[i] > maxElement)
{
indices[i] = 0;
indices[i - 1]++;
}
else
return i;
}
return 0;
}
If I got you correct here is what you need
this will create permutations based on the specified chain limit
public static List<List<T>> GetPerms<T>(List<T> list, int chainLimit)
{
if (list.Count() == 1)
return new List<List<T>> { list };
return list
.Select((outer, outerIndex) =>
GetPerms(list.Where((inner, innerIndex) => innerIndex != outerIndex).ToList(), chainLimit)
.Select(perms => (new List<T> { outer }).Union(perms).Take(chainLimit)))
.SelectMany<IEnumerable<IEnumerable<T>>, List<T>>(sub => sub.Select<IEnumerable<T>, List<T>>(s => s.ToList()))
.Distinct(new PermComparer<T>()).ToList();
}
class PermComparer<T> : IEqualityComparer<List<T>>
{
public bool Equals(List<T> x, List<T> y)
{
return x.SequenceEqual(y);
}
public int GetHashCode(List<T> obj)
{
return (int)obj.Average(o => o.GetHashCode());
}
}
and you'll call it like this
List<List<AdServer.Offers>> lst = GetPerms<AdServer.Offers>(offers, 2);
I made this function is pretty generic so you may use it for other purpose too
eg
List<string> list = new List<string>(new[] { "apple", "banana", "orange", "cherry" });
List<List<string>> perms = GetPerms<string>(list, 2);
result

Get all subsets of a collection

I am trying to create a method that will return all subsets of a set.
For example if I have the collection 10,20,30 I will like to get the following output
return new List<List<int>>()
{
new List<int>(){10},
new List<int>(){20},
new List<int>(){30},
new List<int>(){10,20},
new List<int>(){10,30},
new List<int>(){20,30},
//new List<int>(){20,10}, that substet already exists
// new List<int>(){30,20}, that subset already exists
new List<int>(){10,20,30}
};
because the collection can also be a collection of strings for instance I want to create a generic method. This is what I have worked out based on this solution.
static void Main(string[] args)
{
Foo<int>(new int[] { 10, 20, 30});
}
static List<List<T>> Foo<T>(T[] set)
{
// Init list
List<List<T>> subsets = new List<List<T>>();
// Loop over individual elements
for (int i = 1; i < set.Length; i++)
{
subsets.Add(new List<T>(){set[i - 1]});
List<List<T>> newSubsets = new List<List<T>>();
// Loop over existing subsets
for (int j = 0; j < subsets.Count; j++)
{
var tempList = new List<T>();
tempList.Add(subsets[j][0]);
tempList.Add(subsets[i][0]);
var newSubset = tempList;
newSubsets.Add(newSubset);
}
subsets.AddRange(newSubsets);
}
// Add in the last element
//subsets.Add(set[set.Length - 1]);
//subsets.Sort();
//Console.WriteLine(string.Join(Environment.NewLine, subsets));
return null;
}
Edit
Sorry that is wrong I still get duplicates...
static List<List<T>> GetSubsets<T>(IEnumerable<T> Set)
{
var set = Set.ToList<T>();
// Init list
List<List<T>> subsets = new List<List<T>>();
subsets.Add(new List<T>()); // add the empty set
// Loop over individual elements
for (int i = 1; i < set.Count; i++)
{
subsets.Add(new List<T>(){set[i - 1]});
List<List<T>> newSubsets = new List<List<T>>();
// Loop over existing subsets
for (int j = 0; j < subsets.Count; j++)
{
var newSubset = new List<T>();
foreach(var temp in subsets[j])
newSubset.Add(temp);
newSubset.Add(set[i]);
newSubsets.Add(newSubset);
}
subsets.AddRange(newSubsets);
}
// Add in the last element
subsets.Add(new List<T>(){set[set.Count - 1]});
//subsets.Sort();
return subsets;
}
Then I could call that method as:
This is a basic algorithm which i used the below technique to make a single player scrabble word solver (the newspaper ones).
Let your set have n elements. Increment an integer starting from 0 to 2^n. For each generater number bitmask each position of the integer. If the i th position of the integer is 1 then select the i th element of the set. For each generated integer from 0 to 2^n doing the above bitmasting and selection will get you all the subsets.
Here is a post: http://phoxis.org/2009/10/13/allcombgen/
Here is an adaptation of the code provided by Marvin Mendes in this answer but refactored into a single method with an iterator block.
public static IEnumerable<IEnumerable<T>> Subsets<T>(IEnumerable<T> source)
{
List<T> list = source.ToList();
int length = list.Count;
int max = (int)Math.Pow(2, list.Count);
for (int count = 0; count < max; count++)
{
List<T> subset = new List<T>();
uint rs = 0;
while (rs < length)
{
if ((count & (1u << (int)rs)) > 0)
{
subset.Add(list[(int)rs]);
}
rs++;
}
yield return subset;
}
}
I know that this question is a little old but i was looking for a answer and dont find any good here, so i want to share this solution that is an adaptation found in this blog: http://praseedp.blogspot.com.br/2010/02/subset-generation-in-c.html
I Only transform the class into a generic class:
public class SubSet<T>
{
private IList<T> _list;
private int _length;
private int _max;
private int _count;
public SubSet(IList<T> list)
{
if (list== null)
throw new ArgumentNullException("lista");
_list = list;
_length = _list.Count;
_count = 0;
_max = (int)Math.Pow(2, _length);
}
public IList<T> Next()
{
if (_count == _max)
{
return null;
}
uint rs = 0;
IList<T> l = new List<T>();
while (rs < _length)
{
if ((_count & (1u << (int)rs)) > 0)
{
l.Add(_list[(int)rs]);
}
rs++;
}
_count++;
return l;
}
}
To use this code you can do like something that:
List<string> lst = new List<string>();
lst.AddRange(new string[] {"A", "B", "C" });
SubSet<string> subs = new SubSet<string>(lst);
IList<string> l = subs.Next();
while (l != null)
{
DoSomething(l);
l = subs.Next();
}
Just remember: this code still be an O(2^n) and if you pass something like 20 elements in the list you will get 2^20= 1048576 subsets!
Edit:
As Servy sugest i add an implementation with interator block to use with Linq an foreach, the new class is like this:
private class SubSet<T> : IEnumerable<IEnumerable<T>>
{
private IList<T> _list;
private int _length;
private int _max;
private int _count;
public SubSet(IEnumerable<T> list)
{
if (list == null)
throw new ArgumentNullException("list");
_list = new List<T>(list);
_length = _list.Count;
_count = 0;
_max = (int)Math.Pow(2, _length);
}
public int Count
{
get { return _max; }
}
private IList<T> Next()
{
if (_count == _max)
{
return null;
}
uint rs = 0;
IList<T> l = new List<T>();
while (rs < _length)
{
if ((_count & (1u << (int)rs)) > 0)
{
l.Add(_list[(int)rs]);
}
rs++;
}
_count++;
return l;
}
public IEnumerator<IEnumerable<T>> GetEnumerator()
{
IList<T> subset;
while ((subset = Next()) != null)
{
yield return subset;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
and you now can use it like this:
List<string> lst = new List<string>();
lst.AddRange(new string[] {"A", "B", "C" });
SubSet<string> subs = new SubSet<string>(lst);
foreach(IList<string> l in subs)
{
DoSomething(l);
}
Thanks Servy for the advice.
It Doesn't give duplicate value;
Don't add a value of the int array at the start of the subsets
Correct program is as follows:
class Program
{
static HashSet<List<int>> SubsetMaker(int[] a, int sum)
{
var set = a.ToList<int>();
HashSet<List<int>> subsets = new HashSet<List<int>>();
subsets.Add(new List<int>());
for (int i =0;i<set.Count;i++)
{
//subsets.Add(new List<int>() { set[i]});
HashSet<List<int>> newSubsets = new HashSet<List<int>>();
for (int j = 0; j < subsets.Count; j++)
{
var newSubset = new List<int>();
foreach (var temp in subsets.ElementAt(j))
{
newSubset.Add(temp);
}
newSubset.Add(set[i]);
newSubsets.Add(newSubset);
}
Console.WriteLine("New Subset");
foreach (var t in newSubsets)
{
var temp = string.Join<int>(",", t);
temp = "{" + temp + "}";
Console.WriteLine(temp);
}
Console.ReadLine();
subsets.UnionWith(newSubsets);
}
//subsets.Add(new List<int>() { set[set.Count - 1] });
//subsets=subsets.;
return subsets;
}
static void Main(string[] args)
{
int[] b = new int[] { 1,2,3 };
int suma = 6;
var test = SubsetMaker(b, suma);
Console.WriteLine("Printing final set...");
foreach (var t in test)
{
var temp = string.Join<int>(",", t);
temp = "{" + temp + "}";
Console.WriteLine(temp);
}
Console.ReadLine();
}
}
You don't want to return a set of lists, you want to use java's set type. Set already does part of what you are looking for by holding only one unique element of each type. So you can't add 20 twice for instance. It is an unordered type, so what you might do is write a combinatoric function that creates a bunch of sets and then return a list that includes alist of those.
Get all subsets of a collection of a specific subsetlength:
public static IEnumerable<IEnumerable<T>> GetPermutations<T>(IEnumerable<T> list, int length) where T : IComparable
{
if (length == 1) return list.Select(t => new T[] { t });
return GetPermutations(list, length - 1).SelectMany(t => list.Where(e => t.All(g => g.CompareTo(e) != 0)), (t1, t2) => t1.Concat(new T[] { t2 }));
}
public static IEnumerable<IEnumerable<T>> GetOrderedSubSets<T>(IEnumerable<T> list, int length) where T : IComparable
{
if (length == 1) return list.Select(t => new T[] { t });
return GetOrderedSubSets(list, length - 1).SelectMany(t => list.Where(e => t.All(g => g.CompareTo(e) == -1)), (t1, t2) => t1.Concat(new T[] { t2 }));
}
Testcode:
List<int> set = new List<int> { 1, 2, 3 };
foreach (var x in GetPermutations(set, 3))
{
Console.WriteLine(string.Join(", ", x));
}
Console.WriteLine();
foreach (var x in GetPermutations(set, 2))
{
Console.WriteLine(string.Join(", ", x));
}
Console.WriteLine();
foreach (var x in GetOrderedSubSets(set, 2))
{
Console.WriteLine(string.Join(", ", x));
}
Test results:
1, 2, 3
1, 3, 2
2, 1, 3
2, 3, 1
3, 1, 2
3, 2, 1
1, 2
1, 3
2, 1
2, 3
3, 1
3, 2
1, 2
1, 3
2, 3
A simple algorithm based upon recursion:
private static List<List<int>> GetPowerList(List<int> a)
{
int n = a.Count;
var sublists = new List<List<int>>() { new List<int>() };
for (int i = 0; i < n; i++)
{
for (int j = i; j < n; j++)
{
var first = a[i];
var last = a[j];
if ((j - i) > 1)
{
sublists.AddRange(GetPowerList(a
.GetRange(i + 1, j - i - 1))
.Select(l => l
.Prepend(first)
.Append(last).ToList()));
}
else sublists.Add(a.GetRange(i,j - i + 1));
}
}
return sublists;
}

Check for missing number in sequence

I have an List<int> which contains 1,2,4,7,9 for example.
I have a range from 0 to 10.
Is there a way to determine what numbers are missing in that sequence?
I thought LINQ might provide an option but I can't see one
In the real world my List could contain 100,000 items so performance is key
var list = new List<int>(new[] { 1, 2, 4, 7, 9 });
var result = Enumerable.Range(0, 10).Except(list);
Turn the range you want to check into a HashSet:
public IEnumerable<int> FindMissing(IEnumerable<int> values)
{
HashSet<int> myRange = new HashSet<int>(Enumerable.Range(0,10));
myRange.ExceptWith(values);
return myRange;
}
Will return the values that aren't in values.
Using Unity i have tested two solutions on set of million integers. Looks like using Dictionary and two "for" loops gives better result than Enumerable.Except
FindMissing1 Total time: 0.1420 (Enumerable.Except)
FindMissing2 Total time: 0.0621 (Dictionary and two for loops)
public static class ArrayExtension
{
public static T[] FindMissing1<T>(T[] range, T[] values)
{
List<T> result = Enumerable.Except<T>(range, values).ToList<T>();
return result.ToArray<T>();
}
public static T[] FindMissing2<T>(T[] range, T[] values)
{
List<T> result = new List<T>();
Dictionary<T, T> hash = new Dictionary<T, T>(values.Length);
for (int i = 0; i < values.Length; i++)
hash.Add(values[i], values[i]);
for (int i = 0; i < range.Length; i++)
{
if (!hash.ContainsKey(range[i]))
result.Add(range[i]);
}
return result.ToArray<T>();
}
}
public class ArrayManipulationTest : MonoBehaviour
{
void Start()
{
int rangeLength = 1000000;
int[] range = Enumerable.Range(0, rangeLength).ToArray();
int[] values = new int[rangeLength / 5];
int[] missing;
float start;
float duration;
for (int i = 0; i < rangeLength / 5; i ++)
values[i] = i * 5;
start = Time.realtimeSinceStartup;
missing = ArrayExtension.FindMissing1<int>(range, values);
duration = Time.realtimeSinceStartup - start;
Debug.Log($"FindMissing1 Total time: {duration:0.0000}");
start = Time.realtimeSinceStartup;
missing = ArrayExtension.FindMissing2<int>(range, values);
duration = Time.realtimeSinceStartup - start;
Debug.Log($"FindMissing2 Total time: {duration:0.0000}");
}
}
List<int> selectedNumbers = new List<int>(){8, 5, 3, 12, 2};
int firstNumber = selectedNumbers.OrderBy(i => i).First();
int lastNumber = selectedNumbers.OrderBy(i => i).Last();
List<int> allNumbers = Enumerable.Range(firstNumber, lastNumber - firstNumber + 1).ToList();
List<int> missingNumbers = allNumbers.Except(selectedNumbers).ToList();
foreach (int i in missingNumbers)
{
Response.Write(i);
}
LINQ's Except method would be the most readable. Whether it performs adequately for you or not would be a matter for testing.
E.g.
range.Except(listOfValues);
Edit
Here's the program I used for my mini-benchmark, for others to plug away with:
static void Main()
{
var a = Enumerable.Range(0, 1000000);
var b = new List<int>();
for (int i = 0; i < 1000000; i += 10)
{
b.Add(i);
}
Stopwatch sw = new Stopwatch();
sw.Start();
var c = a.Except(b).ToList();
sw.Stop();
Console.WriteLine("Milliseconds {0}", sw.ElapsedMilliseconds );
sw.Reset();
Console.ReadLine();
}
An alternative method which works in general for any two IEnunumerable<T> where T :IComparable. If the IEnumerables are both sorted, this works in O(1) memory (i.e. there is no creating another ICollection and subtracting, etc.) and in O(n) time.
The use of IEnumerable<IComparable> and GetEnumerator makes this a little less readable, but far more general.
Implementation
/// <summary>
/// <para>For two sorted IEnumerable<T> (superset and subset),</para>
/// <para>returns the values in superset which are not in subset.</para>
/// </summary>
public static IEnumerable<T> CompareSortedEnumerables<T>(IEnumerable<T> superset, IEnumerable<T> subset)
where T : IComparable
{
IEnumerator<T> supersetEnumerator = superset.GetEnumerator();
IEnumerator<T> subsetEnumerator = subset.GetEnumerator();
bool itemsRemainingInSubset = subsetEnumerator.MoveNext();
// handle the case when the first item in subset is less than the first item in superset
T firstInSuperset = superset.First();
while ( itemsRemainingInSubset && supersetEnumerator.Current.CompareTo(subsetEnumerator.Current) >= 0 )
itemsRemainingInSubset = subsetEnumerator.MoveNext();
while ( supersetEnumerator.MoveNext() )
{
int comparison = supersetEnumerator.Current.CompareTo(subsetEnumerator.Current);
if ( !itemsRemainingInSubset || comparison < 0 )
{
yield return supersetEnumerator.Current;
}
else if ( comparison >= 0 )
{
while ( itemsRemainingInSubset && supersetEnumerator.Current.CompareTo(subsetEnumerator.Current) >= 0 )
itemsRemainingInSubset = subsetEnumerator.MoveNext();
}
}
}
Usage
var values = Enumerable.Range(0, 11);
var list = new List<int> { 1, 2, 4, 7, 9 };
var notIncluded = CompareSortedEnumerables(values, list);
If the range is predictable I suggest the following solution:
public static void Main()
{
//set up the expected range
var expectedRange = Enumerable.Range(0, 10);
//set up the current list
var currentList = new List<int> {1, 2, 4, 7, 9};
//get the missing items
var missingItems = expectedRange.Except(currentList);
//print the missing items
foreach (int missingItem in missingItems)
{
Console.WriteLine(missingItem);
}
Console.ReadLine();
}
Regards,
y00daa
This does not use LINQ but it works in linear time.
I assume that input list is sorted.
This takes O(list.Count).
private static IEnumerable<int> get_miss(List<int> list,int length)
{
var miss = new List<int>();
int i =0;
for ( i = 0; i < list.Count - 1; i++)
{
foreach (var item in
Enumerable.Range(list[i] + 1, list[i + 1] - list[i] - 1))
{
yield return item;
}
}
foreach (var item in Enumerable.Range(list[i]+1,length-list[i]))
{
yield return item;
}
}
This should take O(n) where n is length of full range.
static void Main()
{
List<int> identifiers = new List<int>() { 1, 2, 4, 7, 9 };
Stopwatch sw = new Stopwatch();
sw.Start();
List<int> miss = GetMiss(identifiers,150000);
sw.Stop();
Console.WriteLine("{0}",sw.ElapsedMilliseconds);
}
private static List<int> GetMiss(List<int> identifiers,int length)
{
List<int> miss = new List<int>();
int j = 0;
for (int i = 0; i < length; i++)
{
if (i < identifiers[j])
miss.Add(i);
else if (i == identifiers[j])
j++;
if (j == identifiers.Count)
{
miss.AddRange(Enumerable.Range(i + 1, length - i));
break;
}
}
return miss;
}
Ok, really, create a new list which parallels the initial list and run the method Except over it...
I have created a fully linq answer using the Aggregate method instead to find the missings:
var list = new List<int>(new[] { 1, 2, 4, 7, 9 }); // Assumes list is ordered at this point
list.Insert(0, 0); // No error checking, just put in the lowest and highest possibles.
list.Add(10); // For real world processing, put in check and if not represented then add it/them.
var missing = new List<int>(); // Hold any missing values found.
list.Aggregate ((seed, aggr) => // Seed is the previous #, aggr is the current number.
{
var diff = (aggr - seed) -1; // A difference between them indicates missing.
if (diff > 0) // Missing found...put in the missing range.
missing.AddRange(Enumerable.Range((aggr - diff), diff));
return aggr;
});
The missing list has this after the above code has been executed:
3, 5, 6, 8
for a List L a general solution (works in all programming languages) would be simply
L.Count()*(L.Count()+1)/2 - L.Sum();
which returns the expected sum of series minus the actual series.
for a List of size n the missing number is:
n(n+1)/2 - (sum of list numbers)
this method here returns the number of missing elements ,sort the set , add all elements from range 0 to range max , then remove the original elements , then you will have the missing set
int makeArrayConsecutive(int[] statues)
{
Array.Sort(statues);
HashSet<int> set = new HashSet<int>();
for(int i = statues[0]; i< statues[statues.Length -1]; i++)
{
set.Add(i);
}
for (int i = 0; i < statues.Length; i++)
{
set.Remove(statues[i]);
}
var x = set.Count;
return x;
// return set ; // use this if you need the actual elements + change the method return type
}
Create an array of num items
const int numItems = 1000;
bool found[numItems] = new bool[numItems];
List<int> list;
PopulateList(list);
list.ForEach( i => found[i] = true );
// now iterate found for the numbers found
for(int count = 0; i < numItems; ++numItems){
Console.WriteList("Item {0} is {1}", count, found[count] ? "there" : "not there");
}
This method does not use LINQ and works in general for any two IEnunumerable<T> where T :IComparable
public static IEnumerable<T> FindMissing<T>(IEnumerable<T> superset, IEnumerable<T> subset) where T : IComparable
{
bool include = true;
foreach (var i in superset)
{
foreach (var j in subset)
{
include = i.CompareTo(j) == 0;
if (include)
break;
}
if (!include)
yield return i;
}
}
int sum = 0,missingNumber;
int[] arr = { 1,2,3,4,5,6,7,8,9};
for (int i = 0; i < arr.Length; i++)
{
sum += arr[i];
}
Console.WriteLine("The sum from 1 to 10 is 55");
Console.WriteLine("Sum is :" +sum);
missingNumber = 55 - sum;
Console.WriteLine("Missing Number is :-"+missingNumber);
Console.ReadLine();

How do I remove duplicates from a C# array?

I have been working with a string[] array in C# that gets returned from a function call. I could possibly cast to a Generic collection, but I was wondering if there was a better way to do it, possibly by using a temp array.
What is the best way to remove duplicates from a C# array?
You could possibly use a LINQ query to do this:
int[] s = { 1, 2, 3, 3, 4};
int[] q = s.Distinct().ToArray();
Here is the HashSet<string> approach:
public static string[] RemoveDuplicates(string[] s)
{
HashSet<string> set = new HashSet<string>(s);
string[] result = new string[set.Count];
set.CopyTo(result);
return result;
}
Unfortunately this solution also requires .NET framework 3.5 or later as HashSet was not added until that version. You could also use array.Distinct(), which is a feature of LINQ.
The following tested and working code will remove duplicates from an array. You must include the System.Collections namespace.
string[] sArray = {"a", "b", "b", "c", "c", "d", "e", "f", "f"};
var sList = new ArrayList();
for (int i = 0; i < sArray.Length; i++) {
if (sList.Contains(sArray[i]) == false) {
sList.Add(sArray[i]);
}
}
var sNew = sList.ToArray();
for (int i = 0; i < sNew.Length; i++) {
Console.Write(sNew[i]);
}
You could wrap this up into a function if you wanted to.
If you needed to sort it, then you could implement a sort that also removes duplicates.
Kills two birds with one stone, then.
This might depend on how much you want to engineer the solution - if the array is never going to be that big and you don't care about sorting the list you might want to try something similar to the following:
public string[] RemoveDuplicates(string[] myList) {
System.Collections.ArrayList newList = new System.Collections.ArrayList();
foreach (string str in myList)
if (!newList.Contains(str))
newList.Add(str);
return (string[])newList.ToArray(typeof(string));
}
List<String> myStringList = new List<string>();
foreach (string s in myStringArray)
{
if (!myStringList.Contains(s))
{
myStringList.Add(s);
}
}
This is O(n^2), which won't matter for a short list which is going to be stuffed into a combo, but could be rapidly be a problem on a big collection.
-- This is Interview Question asked every time. Now i done its coding.
static void Main(string[] args)
{
int[] array = new int[] { 4, 8, 4, 1, 1, 4, 8 };
int numDups = 0, prevIndex = 0;
for (int i = 0; i < array.Length; i++)
{
bool foundDup = false;
for (int j = 0; j < i; j++)
{
if (array[i] == array[j])
{
foundDup = true;
numDups++; // Increment means Count for Duplicate found in array.
break;
}
}
if (foundDup == false)
{
array[prevIndex] = array[i];
prevIndex++;
}
}
// Just Duplicate records replce by zero.
for (int k = 1; k <= numDups; k++)
{
array[array.Length - k] = '\0';
}
Console.WriteLine("Console program for Remove duplicates from array.");
Console.Read();
}
Here is a O(n*n) approach that uses O(1) space.
void removeDuplicates(char* strIn)
{
int numDups = 0, prevIndex = 0;
if(NULL != strIn && *strIn != '\0')
{
int len = strlen(strIn);
for(int i = 0; i < len; i++)
{
bool foundDup = false;
for(int j = 0; j < i; j++)
{
if(strIn[j] == strIn[i])
{
foundDup = true;
numDups++;
break;
}
}
if(foundDup == false)
{
strIn[prevIndex] = strIn[i];
prevIndex++;
}
}
strIn[len-numDups] = '\0';
}
}
The hash/linq approaches above are what you would generally use in real life. However in interviews they usually want to put some constraints e.g. constant space which rules out hash or no internal api - which rules out using LINQ.
protected void Page_Load(object sender, EventArgs e)
{
string a = "a;b;c;d;e;v";
string[] b = a.Split(';');
string[] c = b.Distinct().ToArray();
if (b.Length != c.Length)
{
for (int i = 0; i < b.Length; i++)
{
try
{
if (b[i].ToString() != c[i].ToString())
{
Response.Write("Found duplicate " + b[i].ToString());
return;
}
}
catch (Exception ex)
{
Response.Write("Found duplicate " + b[i].ToString());
return;
}
}
}
else
{
Response.Write("No duplicate ");
}
}
Add all the strings to a dictionary and get the Keys property afterwards. This will produce each unique string, but not necessarily in the same order your original input had them in.
If you require the end result to have the same order as the original input, when you consider the first occurance of each string, use the following algorithm instead:
Have a list (final output) and a dictionary (to check for duplicates)
For each string in the input, check if it exists in the dictionary already
If not, add it both to the dictionary and to the list
At the end, the list contains the first occurance of each unique string.
Make sure you consider things like culture and such when constructing your dictionary, to make sure you handle duplicates with accented letters correctly.
The following piece of code attempts to remove duplicates from an ArrayList though this is not an optimal solution. I was asked this question during an interview to remove duplicates through recursion, and without using a second/temp arraylist:
private void RemoveDuplicate()
{
ArrayList dataArray = new ArrayList(5);
dataArray.Add("1");
dataArray.Add("1");
dataArray.Add("6");
dataArray.Add("6");
dataArray.Add("6");
dataArray.Add("3");
dataArray.Add("6");
dataArray.Add("4");
dataArray.Add("5");
dataArray.Add("4");
dataArray.Add("1");
dataArray.Sort();
GetDistinctArrayList(dataArray, 0);
}
private void GetDistinctArrayList(ArrayList arr, int idx)
{
int count = 0;
if (idx >= arr.Count) return;
string val = arr[idx].ToString();
foreach (String s in arr)
{
if (s.Equals(arr[idx]))
{
count++;
}
}
if (count > 1)
{
arr.Remove(val);
GetDistinctArrayList(arr, idx);
}
else
{
idx += 1;
GetDistinctArrayList(arr, idx);
}
}
Simple solution:
using System.Linq;
...
public static int[] Distinct(int[] handles)
{
return handles.ToList().Distinct().ToArray();
}
Maybe hashset which do not store duplicate elements and silently ignore requests to add
duplicates.
static void Main()
{
string textWithDuplicates = "aaabbcccggg";
Console.WriteLine(textWithDuplicates.Count());
var letters = new HashSet<char>(textWithDuplicates);
Console.WriteLine(letters.Count());
foreach (char c in letters) Console.Write(c);
Console.WriteLine("");
int[] array = new int[] { 12, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2 };
Console.WriteLine(array.Count());
var distinctArray = new HashSet<int>(array);
Console.WriteLine(distinctArray.Count());
foreach (int i in distinctArray) Console.Write(i + ",");
}
NOTE : NOT tested!
string[] test(string[] myStringArray)
{
List<String> myStringList = new List<string>();
foreach (string s in myStringArray)
{
if (!myStringList.Contains(s))
{
myStringList.Add(s);
}
}
return myStringList.ToString();
}
Might do what you need...
EDIT Argh!!! beaten to it by rob by under a minute!
Tested the below & it works. What's cool is that it does a culture sensitive search too
class RemoveDuplicatesInString
{
public static String RemoveDups(String origString)
{
String outString = null;
int readIndex = 0;
CompareInfo ci = CultureInfo.CurrentCulture.CompareInfo;
if(String.IsNullOrEmpty(origString))
{
return outString;
}
foreach (var ch in origString)
{
if (readIndex == 0)
{
outString = String.Concat(ch);
readIndex++;
continue;
}
if (ci.IndexOf(origString, ch.ToString().ToLower(), 0, readIndex) == -1)
{
//Unique char as this char wasn't found earlier.
outString = String.Concat(outString, ch);
}
readIndex++;
}
return outString;
}
static void Main(string[] args)
{
String inputString = "aAbcefc";
String outputString;
outputString = RemoveDups(inputString);
Console.WriteLine(outputString);
}
}
--AptSenSDET
This code 100% remove duplicate values from an array[as I used a[i]].....You can convert it in any OO language..... :)
for(int i=0;i<size;i++)
{
for(int j=i+1;j<size;j++)
{
if(a[i] == a[j])
{
for(int k=j;k<size;k++)
{
a[k]=a[k+1];
}
j--;
size--;
}
}
}
Generic Extension method :
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
{
if (source == null)
throw new ArgumentNullException(nameof(source));
HashSet<TSource> set = new HashSet<TSource>(comparer);
foreach (TSource item in source)
{
if (set.Add(item))
{
yield return item;
}
}
}
you can using This code when work with an ArrayList
ArrayList arrayList;
//Add some Members :)
arrayList.Add("ali");
arrayList.Add("hadi");
arrayList.Add("ali");
//Remove duplicates from array
for (int i = 0; i < arrayList.Count; i++)
{
for (int j = i + 1; j < arrayList.Count ; j++)
if (arrayList[i].ToString() == arrayList[j].ToString())
arrayList.Remove(arrayList[j]);
Below is an simple logic in java you traverse elements of array twice and if you see any same element you assign zero to it plus you don't touch the index of element you are comparing.
import java.util.*;
class removeDuplicate{
int [] y ;
public removeDuplicate(int[] array){
y=array;
for(int b=0;b<y.length;b++){
int temp = y[b];
for(int v=0;v<y.length;v++){
if( b!=v && temp==y[v]){
y[v]=0;
}
}
}
}
public static int RemoveDuplicates(ref int[] array)
{
int size = array.Length;
// if 0 or 1, return 0 or 1:
if (size < 2) {
return size;
}
int current = 0;
for (int candidate = 1; candidate < size; ++candidate) {
if (array[current] != array[candidate]) {
array[++current] = array[candidate];
}
}
// index to count conversion:
return ++current;
}
The best way? Hard to say, the HashSet approach looks fast,
but (depending on the data) using a sort algorithm (CountSort ?)
can be much faster.
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main()
{
Random r = new Random(0); int[] a, b = new int[1000000];
for (int i = b.Length - 1; i >= 0; i--) b[i] = r.Next(b.Length);
a = new int[b.Length]; Array.Copy(b, a, b.Length);
a = dedup0(a); Console.WriteLine(a.Length);
a = new int[b.Length]; Array.Copy(b, a, b.Length);
var w = System.Diagnostics.Stopwatch.StartNew();
a = dedup0(a); Console.WriteLine(w.Elapsed); Console.Read();
}
static int[] dedup0(int[] a) // 48 ms
{
return new HashSet<int>(a).ToArray();
}
static int[] dedup1(int[] a) // 68 ms
{
Array.Sort(a); int i = 0, j = 1, k = a.Length; if (k < 2) return a;
while (j < k) if (a[i] == a[j]) j++; else a[++i] = a[j++];
Array.Resize(ref a, i + 1); return a;
}
static int[] dedup2(int[] a) // 8 ms
{
var b = new byte[a.Length]; int c = 0;
for (int i = 0; i < a.Length; i++)
if (b[a[i]] == 0) { b[a[i]] = 1; c++; }
a = new int[c];
for (int j = 0, i = 0; i < b.Length; i++) if (b[i] > 0) a[j++] = i;
return a;
}
}
Almost branch free. How? Debug mode, Step Into (F11) with a small array: {1,3,1,1,0}
static int[] dedupf(int[] a) // 4 ms
{
if (a.Length < 2) return a;
var b = new byte[a.Length]; int c = 0, bi, ai, i, j;
for (i = 0; i < a.Length; i++)
{ ai = a[i]; bi = 1 ^ b[ai]; b[ai] |= (byte)bi; c += bi; }
a = new int[c]; i = 0; while (b[i] == 0) i++; a[0] = i++;
for (j = 0; i < b.Length; i++) a[j += bi = b[i]] += bi * i; return a;
}
A solution with two nested loops might take some time,
especially for larger arrays.
static int[] dedup(int[] a)
{
int i, j, k = a.Length - 1;
for (i = 0; i < k; i++)
for (j = i + 1; j <= k; j++) if (a[i] == a[j]) a[j--] = a[k--];
Array.Resize(ref a, k + 1); return a;
}
private static string[] distinct(string[] inputArray)
{
bool alreadyExists;
string[] outputArray = new string[] {};
for (int i = 0; i < inputArray.Length; i++)
{
alreadyExists = false;
for (int j = 0; j < outputArray.Length; j++)
{
if (inputArray[i] == outputArray[j])
alreadyExists = true;
}
if (alreadyExists==false)
{
Array.Resize<string>(ref outputArray, outputArray.Length + 1);
outputArray[outputArray.Length-1] = inputArray[i];
}
}
return outputArray;
}
int size = a.Length;
for (int i = 0; i < size; i++)
{
for (int j = i + 1; j < size; j++)
{
if (a[i] == a[j])
{
for (int k = j; k < size; k++)
{
if (k != size - 1)
{
int temp = a[k];
a[k] = a[k + 1];
a[k + 1] = temp;
}
}
j--;
size--;
}
}
}
So I was doing an interview session and got the same question to sort and distinct
static void Sort()
{
try
{
int[] number = new int[Convert.ToInt32(Console.ReadLine())];
for (int i = 0; i < number.Length; i++)
{
number[i] = Convert.ToInt32(Console.ReadLine());
}
Array.Sort(number);
int[] num = number.Distinct().ToArray();
for (int i = 0; i < num.Length; i++)
{
Console.WriteLine(num[i]);
}
}
catch (Exception ex)
{
Console.WriteLine(ex);
}
Console.Read();
}
using System;
using System.Collections.Generic;
using System.Linq;
namespace Rextester
{
public class Program
{
public static void Main(string[] args)
{
List<int> listofint1 = new List<int> { 4, 8, 4, 1, 1, 4, 8 };
List<int> updatedlist= removeduplicate(listofint1);
foreach(int num in updatedlist)
Console.WriteLine(num);
}
public static List<int> removeduplicate(List<int> listofint)
{
List<int> listofintwithoutduplicate= new List<int>();
foreach(var num in listofint)
{
if(!listofintwithoutduplicate.Any(p=>p==num))
{
listofintwithoutduplicate.Add(num);
}
}
return listofintwithoutduplicate;
}
}
}
strINvalues = "1,1,2,2,3,3,4,4";
strINvalues = string.Join(",", strINvalues .Split(',').Distinct().ToArray());
Debug.Writeline(strINvalues);
Kkk Not sure if this is witchcraft or just beautiful code
1 strINvalues .Split(',').Distinct().ToArray()
2 string.Join(",", XXX);
1 Splitting the array and using Distinct [LINQ] to remove duplicates
2 Joining it back without the duplicates.
Sorry I never read the text on StackOverFlow just the code. it make more sense than the text ;)
Removing duplicate and ignore case sensitive using Distinct & StringComparer.InvariantCultureIgnoreCase
string[] array = new string[] { "A", "a", "b", "B", "a", "C", "c", "C", "A", "1" };
var r = array.Distinct(StringComparer.InvariantCultureIgnoreCase).ToList();
Console.WriteLine(r.Count); // return 4 items
Find answer below.
class Program
{
static void Main(string[] args)
{
var nums = new int[] { 1, 4, 3, 3, 3, 5, 5, 7, 7, 7, 7, 9, 9, 9 };
var result = removeDuplicates(nums);
foreach (var item in result)
{
Console.WriteLine(item);
}
}
static int[] removeDuplicates(int[] nums)
{
nums = nums.ToList().OrderBy(c => c).ToArray();
int j = 1;
int i = 0;
int stop = 0;
while (j < nums.Length)
{
if (nums[i] != nums[j])
{
nums[i + 1] = nums[j];
stop = i + 2;
i++;
}
j++;
}
nums = nums.Take(stop).ToArray();
return nums;
}
}
Just a bit of contribution based on a test i just solved, maybe helpful and open to improvement by other top contributors here.
Here are the things i did:
I used OrderBy which allows me order or sort the items from smallest to the highest using LINQ
I then convert it to back to an array and then re-assign it back to the primary datasource
So i then initialize j which is my right hand side of the array to be 1 and i which is my left hand side of the array to be 0, i also initialize where i would i to stop to be 0.
I used a while loop to increment through the array by going from one position to the other left to right, for each increment the stop position is the current value of i + 2 which i will use later to truncate the duplicates from the array.
I then increment by moving from left to right from the if statement and from right to right outside of the if statement until i iterate through the entire values of the array.
I then pick from the first element to the stop position which becomes the last i index plus 2. that way i am able to remove all the duplicate items from the int array. which is then reassigned.

Categories