Delete duplicates in a List of int arrays


I have a List of int arrays like this:
List<int[]> intArrList = new List<int[]>();
intArrList.Add(new int[3] { 0, 0, 0 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 1, 2, 5 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 12, 22, 54 });
intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
intArrList.Add(new int[4] { 0, 0, 0, 0 });
How would you remove duplicates? By duplicate I mean an element of the list that has the same length and the same numbers as another element.
In the example I would remove one { 20, 30, 10, 4, 6 } element because it appears twice.
I was thinking of sorting the list by element size and then comparing each element against the rest, but I am not sure how to do that.
Another question: would a different structure, such as a hash, be better? If so, how would I use it?

Use GroupBy:
var result = intArrList.GroupBy(c => String.Join(",", c))
.Select(c => c.First().ToList()).ToList();
The result:
{0, 0, 0}
{20, 30, 10, 4, 6}
{1, 2, 5}
{12, 22, 54}
{1, 2, 6, 7, 8}
{0, 0, 0, 0}
EDIT: If you want {1,2,3,4} to be considered equal to {2,3,4,1}, sort each array with OrderBy when building the key:
var result = intArrList.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToList()).ToList();
EDIT2: To help understand how the LINQ GroupBy solution works, consider the following method (it mirrors the order-insensitive variant):
public List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
{
string key = string.Join(",", item.OrderBy(c=>c));
if (!dic.ContainsKey(key))
{
dic.Add(key, item);
}
}
return dic.Values.ToList();
}

You can define your own implementation of IEqualityComparer<int[]> and use it together with Enumerable.Distinct:
class MyComparer : IEqualityComparer<int[]>
{
public int GetHashCode(int[] instance) { return 0; } // TODO: better HashCode for arrays
public bool Equals(int[] instance, int[] other)
{
if (other == null || instance == null || instance.Length != other.Length) return false;
return instance.SequenceEqual(other);
}
}
Now write this to get only distinct values for your list:
var result = intArrList.Distinct(new MyComparer());
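However, if different permutations of the same values should also count as duplicates, implement Equals this way (note that it only compares which values occur, so { 0, 0, 0 } and { 0, 0, 0, 0 } are treated as equal too):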
public bool Equals(int[] instance, int[] other)
{
if (ReferenceEquals(instance, other)) return true; // this will return true when both arrays are NULL
if (other == null || instance == null) return false;
return instance.All(x => other.Contains(x)) && other.All(x => instance.Contains(x));
}
EDIT: For a better GetHashCode implementation, have a look at this post, as also suggested in @Mick's answer.
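As one possible way to fill in the GetHashCode TODO, here is a minimal sketch that uses the System.HashCode struct. It assumes .NET Core 2.1 or later (or .NET Standard 2.1), and SequenceComparer is a hypothetical name, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer name; System.HashCode needs .NET Core 2.1+ / .NET Standard 2.1.
public sealed class SequenceComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true;   // same instance, or both null
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);                // same length, same values, same order
    }

    public int GetHashCode(int[] obj)
    {
        if (obj == null) return 0;
        var hash = new HashCode();
        foreach (int value in obj)
            hash.Add(value);                      // order-sensitive, consistent with Equals
        return hash.ToHashCode();
    }
}
```

With a hash like this, Distinct only has to call Equals on arrays that land in the same bucket, instead of comparing every array with every other one as happens when GetHashCode returns a constant. Usage would be `intArrList.Distinct(new SequenceComparer())`.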

Well, lifting code from here and here. A generic GetHashCode would make this more widely usable; however, I believe the int-specific implementation below is the most robust:
class Program
{
static void Main(string[] args)
{
List<int[]> intArrList = new List<int[]>();
intArrList.Add(new int[3] { 0, 0, 0 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 1, 2, 5 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 12, 22, 54 });
intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
intArrList.Add(new int[4] { 0, 0, 0, 0 });
var test = intArrList.Distinct(new IntArrayEqualityComparer());
Console.WriteLine(test.Count());
Console.WriteLine(intArrList.Count());
}
public class IntArrayEqualityComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return ArraysEqual(x, y);
}
public int GetHashCode(int[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i]);
}
return hc;
}
static bool ArraysEqual<T>(T[] a1, T[] a2)
{
if (ReferenceEquals(a1, a2))
return true;
if (a1 == null || a2 == null)
return false;
if (a1.Length != a2.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < a1.Length; i++)
{
if (!comparer.Equals(a1[i], a2[i])) return false;
}
return true;
}
}
}
Edit: a generic implementation of IEqualityComparer for arrays of any type:
public class ArrayEqualityComparer<T> : IEqualityComparer<T[]>
{
public bool Equals(T[] x, T[] y)
{
if (ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
if (x.Length != y.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < x.Length; i++)
{
if (!comparer.Equals(x[i], y[i])) return false;
}
return true;
}
public int GetHashCode(T[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i].GetHashCode());
}
return hc;
}
}
Edit2: If the ordering of the integers within the arrays doesn't matter, I would sort each array before calling Distinct:
var test = intArrList.Select(a => a.OrderBy(e => e).ToArray()).Distinct(new IntArrayEqualityComparer()).ToList();

// Compare each array with every later one and remove the later copy.
// Iterating j backwards keeps RemoveAt from shifting indices that are still to be visited.
for (int i = 0; i < intArrList.Count; i++)
{
    for (int j = intArrList.Count - 1; j > i; j--)
    {
        // SequenceEqual checks both the length and the values in order (needs System.Linq).
        if (intArrList[i].SequenceEqual(intArrList[j]))
            intArrList.RemoveAt(j);
    }
}
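As an alternative to comparing every pair with nested loops, the following sketch removes duplicates from the list in place in a single pass. InPlaceDedup and RemoveDuplicates are hypothetical names. The sketch also assumes that List<T>.RemoveAll evaluates the predicate once per element in list order, which matches the current .NET implementation.

```csharp
using System.Collections.Generic;

public static class InPlaceDedup
{
    // Removes later duplicates from the list itself; the first occurrence is kept.
    public static void RemoveDuplicates(List<int[]> list)
    {
        var seen = new HashSet<string>();
        // HashSet.Add returns false when the key is already present,
        // so the predicate is true exactly for arrays that were seen before.
        list.RemoveAll(arr => !seen.Add(string.Join(",", arr)));
    }
}
```

Calling `InPlaceDedup.RemoveDuplicates(intArrList)` modifies intArrList directly, so no copies of the list are needed.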

Perf comparison of @S.Akbari's and @Mick's solutions using BenchmarkDotNet
EDIT:
SAkbari_FindDistinctWithoutLinq has a redundant call to ContainsKey, so I added an improved and faster version: SAkbari_FindDistinctWithoutLinq2
| Method                           |     Mean |     Error |    StdDev |
|--------------------------------- |---------:|----------:|----------:|
| SAkbari_FindDistinctWithoutLinq  | 4.021 us | 0.0723 us | 0.0676 us |
| SAkbari_FindDistinctWithoutLinq2 | 3.930 us | 0.0529 us | 0.0495 us |
| SAkbari_FindDistinctLinq         | 5.597 us | 0.0264 us | 0.0234 us |
| Mick_UsingGetHashCode            | 6.339 us | 0.0265 us | 0.0248 us |
BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.248)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=3515625 Hz, Resolution=284.4444 ns, Timer=TSC
.NET Core SDK=2.1.100
[Host] : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT
DefaultJob : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT
Benchmark:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp1
{
public class Program
{
List<int[]> intArrList = new List<int[]>
{
new int[] { 0, 0, 0 },
new int[] { 20, 30, 10, 4, 6 }, //this
new int[] { 1, 2, 5 },
new int[] { 20, 30, 10, 4, 6 }, //this
new int[] { 12, 22, 54 },
new int[] { 1, 2, 6, 7, 8 },
new int[] { 0, 0, 0, 0 }
};
[Benchmark]
public List<int[]> SAkbari_FindDistinctWithoutLinq() => FindDistinctWithoutLinq(intArrList);
[Benchmark]
public List<int[]> SAkbari_FindDistinctWithoutLinq2() => FindDistinctWithoutLinq2(intArrList);
[Benchmark]
public List<int[]> SAkbari_FindDistinctLinq() => FindDistinctLinq(intArrList);
[Benchmark]
public List<int[]> Mick_UsingGetHashCode() => UsingGetHashCode(intArrList);
static void Main(string[] args)
{
var summary = BenchmarkRunner.Run<Program>();
}
public static List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
{
string key = string.Join(",", item.OrderBy(c => c));
if (!dic.ContainsKey(key))
{
dic.Add(key, item);
}
}
return dic.Values.ToList();
}
public static List<int[]> FindDistinctWithoutLinq2(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
dic.TryAdd(string.Join(",", item.OrderBy(c => c)), item);
return dic.Values.ToList();
}
public static List<int[]> FindDistinctLinq(List<int[]> lst)
{
return lst.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToArray()).ToList();
}
public static List<int[]> UsingGetHashCode(List<int[]> lst)
{
return lst.Select(a => a.OrderBy(e => e).ToArray()).Distinct(new IntArrayEqualityComparer()).ToList();
}
}
public class IntArrayEqualityComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return ArraysEqual(x, y);
}
public int GetHashCode(int[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i]);
}
return hc;
}
static bool ArraysEqual<T>(T[] a1, T[] a2)
{
if (ReferenceEquals(a1, a2))
return true;
if (a1 == null || a2 == null)
return false;
if (a1.Length != a2.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < a1.Length; i++)
{
if (!comparer.Equals(a1[i], a2[i])) return false;
}
return true;
}
}
}

Input list:
List<List<int>> initList = new List<List<int>>();
initList.Add(new List<int>{ 0, 0, 0 });
initList.Add(new List<int>{ 20, 30, 10, 4, 6 }); //this
initList.Add(new List<int> { 1, 2, 5 });
initList.Add(new List<int> { 20, 30, 10, 4, 6 }); //this
initList.Add(new List<int> { 12, 22, 54 });
initList.Add(new List<int> { 1, 2, 6, 7, 8 });
initList.Add(new List<int> { 0, 0, 0, 0 });
You can create a result list and, before adding each element, check whether an equivalent list has already been added. I compare the list counts and call p.Except(item).Any() in both directions to check whether the two lists contain the same values.
List<List<int>> returnList = new List<List<int>>();
foreach (var item in initList)
{
if (returnList.Where(p => !p.Except(item).Any() && !item.Except(p).Any()
&& p.Count() == item.Count() ).Count() == 0)
returnList.Add(item);
}
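One thing to be aware of with the Except-based check: Except works on distinct values, so two lists with the same count and the same set of values compare as equal even when the values occur a different number of times. The sketch below contrasts it with a sorted comparison. The class and method names are hypothetical, chosen only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListComparison
{
    // The Except-based check: same count and same set of distinct values.
    public static bool SameByExcept(List<int> a, List<int> b) =>
        a.Count == b.Count && !a.Except(b).Any() && !b.Except(a).Any();

    // Order-insensitive check that also respects how often each value occurs.
    public static bool SameAsMultiset(List<int> a, List<int> b) =>
        a.Count == b.Count && a.OrderBy(x => x).SequenceEqual(b.OrderBy(x => x));
}
```

For { 1, 1, 2 } and { 1, 2, 2 }, SameByExcept reports a match while SameAsMultiset does not. Which behavior is wanted depends on how "duplicate" is defined for the data.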

You can use a HashSet.
A HashSet is a collection that guarantees uniqueness, and it lets you compare collections with set operations such as Intersect and Union.
Pros: no duplicates, easy to manipulate groups of data, efficient lookups.
Cons: you can't access an item by index; for example, list[0] doesn't work for a HashSet. You can only enumerate the items, e.g. with foreach.
Here is an example:
using System;
using System.Collections.Generic;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
HashSet<HashSet<int>> intArrList = new HashSet<HashSet<int>>(new HashSetIntComparer());
intArrList.Add(new HashSet<int>(3) { 0, 0, 0 });
intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new HashSet<int>(3) { 1, 2, 5 });
intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new HashSet<int>(3) { 12, 22, 54 });
intArrList.Add(new HashSet<int>(5) { 1, 2, 6, 7, 8 });
intArrList.Add(new HashSet<int>(4) { 0, 0, 0, 0 });
// Checking the output
foreach (var item in intArrList)
{
foreach (var subHasSet in item)
{
Console.Write("{0} ", subHasSet);
}
Console.WriteLine();
}
Console.Read();
}
private class HashSetIntComparer : IEqualityComparer<HashSet<int>>
{
public bool Equals(HashSet<int> x, HashSet<int> y)
{
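// SetEquals doesn't set anything. It is a method that compares the contents of two sets.
// Such a poor name from .NET
return x.SetEquals(y);
}
public int GetHashCode(HashSet<int> obj)
{
// XOR of the elements does not depend on enumeration order,
// so two sets that are SetEquals always get the same hash code.
int hash = 0;
foreach (int value in obj) hash ^= value;
return hash;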
}
}
}
}
Output:
0
20 30 10 4 6
1 2 5
12 22 54
1 2 6 7 8
Note: Since 0 is repeated several times, the HashSet stores 0 only
once. If you need to differentiate between 0 0 0 0 and 0 0 0, you can
replace HashSet<HashSet<int>> with HashSet<List<int>> and implement
a comparer for List<int> instead.
You can use this link to learn how to compare lists:
https://social.msdn.microsoft.com/Forums/en-US/2ff3016c-bd61-4fec-8f8c-7b6c070123fa/c-compare-two-lists-of-objects?forum=csharplanguage
If you want to learn more about Collections and DataTypes this course is a perfect place to learn it:
https://app.pluralsight.com/player?course=csharp-collections&author=simon-robinson&name=csharp-collections-fundamentals-m9-sets&clip=1&mode=live
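If set semantics are acceptable, the custom comparer can probably be avoided altogether: HashSet<T> has a static CreateSetComparer method that returns a comparer based on set contents. A minimal sketch follows; SetOfSets and Build are hypothetical names.

```csharp
using System.Collections.Generic;

public static class SetOfSets
{
    public static HashSet<HashSet<int>> Build(IEnumerable<int[]> arrays)
    {
        // CreateSetComparer supplies Equals and GetHashCode based on set contents.
        var result = new HashSet<HashSet<int>>(HashSet<int>.CreateSetComparer());
        foreach (int[] array in arrays)
            result.Add(new HashSet<int>(array)); // repeated values inside one array collapse
        return result;
    }
}
```

As with the example above, { 0, 0, 0 } and { 0, 0, 0, 0 } both become the set { 0 }, so only one of them survives.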

Using MoreLINQ this can be very simple with DistinctBy.
var result = intArrList.DistinctBy(x => string.Join(",", x));
Similar to the GroupBy answer, if you want the distinction to be irrespective of order, just order the elements in the join.
var result = intArrList.DistinctBy(x => string.Join(",", x.OrderBy(y => y)));
EDIT: This is how it's implemented
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
return _(); IEnumerable<TSource> _()
{
var knownKeys = new HashSet<TKey>(comparer);
foreach (var element in source)
{
if (knownKeys.Add(keySelector(element)))
yield return element;
}
}
}
So if you don't need MoreLINQ for anything else, you can just use a method like this:
private static IEnumerable<int[]> GetUniqueArrays(IEnumerable<int[]> source)
{
var knownKeys = new HashSet<string>();
foreach (var element in source)
{
if (knownKeys.Add(string.Join(",", element)))
yield return element;
}
}
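On newer frameworks a similar operator ships with LINQ itself. The sketch below assumes .NET 6 or later, where Enumerable.DistinctBy is part of System.Linq. BuiltInDistinctBy and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BuiltInDistinctBy
{
    // Assumes .NET 6 or later, where Enumerable.DistinctBy is available.
    public static List<int[]> Unique(IEnumerable<int[]> source) =>
        source.DistinctBy(arr => string.Join(",", arr)).ToList();

    // Sorting inside the key makes the comparison ignore element order.
    public static List<int[]> UniqueIgnoringOrder(IEnumerable<int[]> source) =>
        source.DistinctBy(arr => string.Join(",", arr.OrderBy(v => v))).ToList();
}
```

If MoreLINQ is imported in the same file, the two DistinctBy extension methods may conflict, so it is probably best to use one or the other.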

Related

How to Zip two Lists of different size to create a new list that is same as the size of the longest amongst the original lists?

I have two C# Lists of different sizes e.g.
List<int> list1 = new List<int>{1,2,3,4,5,6,7};
List<int> list2 = new List<int>{4,5,6,7,8,9};
I want to use the LINQ Zip method to combine these two into a list of tuples that has the size of list1. Here is the resulting list I am looking for:
{(1,4), (2,5), (3,6), (4,7), (5,8), (6,9), (7,0)} //this is of type List<(int,int)>
Since the last item of list1 does not have a counterpart in list2, I fill the last item of the resulting list with a default value (0 here, as it will never appear in either of the original lists).
Is there a way I can use the LINQ Zip method alone to achieve this?

You can use Concat to make them both the same size, and then zip it:
var zipped = list1.Concat(Enumerable.Repeat(0,Math.Max(list2.Count-list1.Count,0)))
.Zip(list2.Concat(Enumerable.Repeat(0,Math.Max(list1.Count-list2.Count,0))),
(a,b)=>(a,b));
Or create an extension method:
public static class ZipExtension{
public static IEnumerable<TResult> Zip<TFirst,TSecond,TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst,TSecond,TResult> func,
TFirst padder1,
TSecond padder2)
{
var firstExp = first.Concat(
Enumerable.Repeat(
padder1,
Math.Max(second.Count()-first.Count(),0)
)
);
var secExp = second.Concat(
Enumerable.Repeat(
padder2,
Math.Max(first.Count()-second.Count(),0)
)
);
return firstExp.Zip(secExp, (a,b) => func(a,b));
}
}
So you can use like this:
//last 2 arguments are the padder values for list1 and list2
var zipped = list1.Zip(list2, (a,b) => (a,b), 0, 0);
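The extension method above calls Count() on both inputs, which enumerates a lazy sequence more than once. As a hedged alternative, this sketch walks both sequences in parallel with their enumerators and pads whichever one ends first. ZipPadded and ZipPadExtensions are hypothetical names, not a framework API.

```csharp
using System;
using System.Collections.Generic;

public static class ZipPadExtensions
{
    // Hypothetical name; pads the shorter sequence with the given values.
    public static IEnumerable<TResult> ZipPadded<TFirst, TSecond, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> selector,
        TFirst padFirst,
        TSecond padSecond)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            bool has1 = true, has2 = true;
            while (true)
            {
                // Once a sequence is exhausted, MoveNext is not called on it again.
                has1 = has1 && e1.MoveNext();
                has2 = has2 && e2.MoveNext();
                if (!has1 && !has2) yield break;
                yield return selector(has1 ? e1.Current : padFirst,
                                      has2 ? e2.Current : padSecond);
            }
        }
    }
}
```

It would be called the same way as the padded Zip overload: `list1.ZipPadded(list2, (a, b) => (a, b), 0, 0)`. Each input is enumerated only once, and results are produced lazily.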

There is a useful and popular library, MoreLINQ. Install it and use ZipLongest:
using MoreLinq;
var result = list1.ZipLongest(list2, (x, y) => (x, y));

Try this, using the Zip function:
static void Main(string[] args)
{
List<int> firstList = new List<int>() { 1, 2, 3, 4, 5, 6, 0, 34, 56, 23 };
List<int> secondList = new List<int>() { 4, 5, 6, 7, 8, 9, 1 };
int a = firstList.Count;
int b = secondList.Count;
for (int k = 0; k < Math.Abs(a - b); k++) // Math.Abs so the shorter list is padded in either case
{
if(a>b)
secondList.Add(0);
else
firstList.Add(0);
}
var zipArray = firstList.Zip(secondList, (c, d) => c + " " + d);
foreach(var item in zipArray)
{
Console.WriteLine(item);
}
Console.Read();
}
Or you can use the ZipLongest function after installing the MoreLinq NuGet package:
static void Main(string[] args)
{
List<int> firstList = new List<int>() { 1, 2, 3, 4, 5, 6, 0, 34, 56, 23 };
List<int> secondList = new List<int>() { 4, 5, 6, 7, 8, 9, 1 };
var zipArray = firstList.ZipLongest(secondList, (c, d) => (c,d));
foreach (var item in zipArray)
{
Console.WriteLine(item);
}
Console.Read();
}

Try this code:
static void Main(string[] args)
{
List<int> firstList=new List<int>() { 1, 2, 3, 4, 5, 6,0,34,56,23};
List<int> secondList=new List<int>() { 4, 5, 6, 7, 8, 9,1};
int a = firstList.Count;
int b = secondList.Count;
if (a > b)
{
for(int k=0;k<(a-b);k++)
secondList.Add(0);
}
else
{
for (int k = 0; k < (b-a); k++)
firstList.Add(0);
}
// Both lists now have the same length, so one index walks them in parallel.
for (int i = 0; i < firstList.Count; i++)
{
Console.Write($"({firstList[i]},{secondList[i]})");
}
Console.Read();
}

How to subdivide a list of doubles into n chunks having first element of next series be the last of previous series

I have a list like:
1.-10
2.-11
3.-12
4.-13
5.-14
6.-15
7.-16
8.-17
9.-18
10.-19
11.-20
I want to split the list into chunks of n elements, where the first element of each chunk is the last element of the previous one. For instance, n=4 would result in 3 lists:
first list
1.-10
2.-11
3.-12
4.-13
second list
1.-13
2.-14
3.-15
4.-16
third list
1.-16
2.-17
3.-18
4.-19
The remainder is an incomplete list, so it is discarded:
1.-19
2.-20
This is what I am doing:
public static void Main()
{
var list = new List<double>()
{
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};
var subLists = SplitList(list, 4);
}
public static List<List<T>> SplitList<T>(IList<T> source, int chunkSize)
{
var chunks = new List<List<T>>();
List<T> chunk = null;
var total = source.Count;
var discarded = total % chunkSize;
for (var i = 0; i < total - discarded; i++)
{
if (i % chunkSize == 0)
{
chunk = new List<T>(chunkSize);
chunks.Add(chunk);
}
chunk?.Add(source[i]);
}
return chunks;
}
But it produces:
1.-10
2.-11
3.-12
4.-13
1.-14
2.-15
3.-16
4.-17

Use the LINQ Skip and Take methods, advancing by chunkSize - 1 so that consecutive chunks share one element:
public static void Main()
{
var list = new List<double>() { 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 };
List<List<double>> chunks = SplitList(list, 4);
}
public static List<List<T>> SplitList<T>(IList<T> source, int chunkSize)
{
List<List<T>> chunks = new List<List<T>>();
for (int i = 0; i < source.Count; i += (chunkSize - 1))
{
var subList = source.Skip(i).Take(chunkSize).ToList();
if (subList.Count == chunkSize)
{
chunks.Add(subList);
}
}
return chunks;
}
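Since the input is already a list, the same stepping idea can also be sketched with List<T>.GetRange, which copies a range directly instead of skipping from the start on every iteration. OverlappingChunks and Split are hypothetical names. The guard for chunkSize below 2 is an added assumption, because a step of chunkSize - 1 would otherwise not advance.

```csharp
using System;
using System.Collections.Generic;

public static class OverlappingChunks
{
    public static List<List<T>> Split<T>(List<T> source, int chunkSize)
    {
        // A step of chunkSize - 1 must be positive, otherwise the loop would never advance.
        if (chunkSize < 2) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<List<T>>();
        // Stop as soon as a full chunk no longer fits; the incomplete tail is discarded.
        for (int start = 0; start + chunkSize <= source.Count; start += chunkSize - 1)
            chunks.Add(source.GetRange(start, chunkSize));
        return chunks;
    }
}
```

For the list 10 to 20 and a chunk size of 4, the chunks start at indices 0, 3 and 6, which gives the three lists from the question.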

Based on this answer (Split List into Sublists with LINQ), you can use LINQ for that task. Note that this version produces non-overlapping chunks and keeps a trailing incomplete chunk:
using System.Collections.Generic;
using System.Linq;
namespace SplitExample
{
public class Program
{
public static void Main()
{
var list = new List<double>()
{
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};
var subLists = Split<double>(list, 3);
}
public static List<List<T>> Split<T>(List<T> source, int chunkSize)
{
return source
.Select((x, i) => new { Index = i, Value = x })
.GroupBy(x => x.Index / chunkSize)
.Select(x => x.Select(v => v.Value).ToList())
.ToList();
}
}
}

Split a list into multiple lists at increasing sequence broken

I have a List of int and I want to split it into multiple lists, starting a new list whenever a lower or equal number is found. The numbers are not in sorted order.
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
I want the result to be the following lists:
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Currently, I'm using the following LINQ to do this, but it isn't working:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
var res = data.Where((p, i) =>
{
int count = 0;
resultLists.Add(new List<int>());
if (p < data[(i + 1) >= data.Count ? i - 1 : i + 1])
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
}
return true;
}).ToList();

I'd just go for something simple:
public static IEnumerable<List<int>> SplitWhenNotIncreasing(List<int> numbers)
{
for (int i = 1, start = 0; i <= numbers.Count; ++i)
{
if (i != numbers.Count && numbers[i] > numbers[i - 1])
continue;
yield return numbers.GetRange(start, i - start);
start = i;
}
}
Which you'd use like so:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in SplitWhenNotIncreasing(data))
Console.WriteLine(string.Join(", ", subset));
If you really did need to work with IEnumerable<T>, then the simplest way I can think of is like this:
public sealed class IncreasingSubsetFinder<T> where T: IComparable<T>
{
public static IEnumerable<IEnumerable<T>> Find(IEnumerable<T> numbers)
{
return new IncreasingSubsetFinder<T>().find(numbers.GetEnumerator());
}
IEnumerable<IEnumerable<T>> find(IEnumerator<T> iter)
{
if (!iter.MoveNext())
yield break;
while (!done)
yield return increasingSubset(iter);
}
IEnumerable<T> increasingSubset(IEnumerator<T> iter)
{
while (!done)
{
T prev = iter.Current;
yield return prev;
if ((done = !iter.MoveNext()) || iter.Current.CompareTo(prev) <= 0)
yield break;
}
}
bool done;
}
Which you would call like this:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in IncreasingSubsetFinder<int>.Find(data))
Console.WriteLine(string.Join(", ", subset));

This is not a typical LINQ operation, so as usual in such cases (when one insists on using LINQ) I would suggest using Aggregate method:
var result = data.Aggregate(new List<List<int>>(), (r, n) =>
{
if (r.Count == 0 || n <= r.Last().Last()) r.Add(new List<int>());
r.Last().Add(n);
return r;
});

You can use the index to get the previous item and compute a group id by comparing the values. Then group on the group ids and extract the values:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
int groupId = 0;
var groups = data.Select
( (item, index)
=> new
{ Item = item
, Group = index > 0 && item <= data[index - 1] ? ++groupId : groupId
}
);
List<List<int>> list = groups.GroupBy(g => g.Group)
.Select(x => x.Select(y => y.Item).ToList())
.ToList();

I really like Matthew Watson's solution. If however you do not want to rely on List<T>, here is my simple generic approach enumerating the enumerable once at most and still retaining the capability for lazy evaluation.
public static IEnumerable<IEnumerable<T>> AscendingSubsets<T>(this IEnumerable<T> superset) where T :IComparable<T>
{
var supersetEnumerator = superset.GetEnumerator();
if (!supersetEnumerator.MoveNext())
{
yield break;
}
T oldItem = supersetEnumerator.Current;
List<T> subset = new List<T>() { oldItem };
while (supersetEnumerator.MoveNext())
{
T currentItem = supersetEnumerator.Current;
if (currentItem.CompareTo(oldItem) > 0)
{
subset.Add(currentItem);
}
else
{
yield return subset;
subset = new List<T>() { currentItem };
}
oldItem = supersetEnumerator.Current;
}
yield return subset;
}
Edit: Simplified the solution further to only use one enumerator.

I have modified your code, and now it works fine:
List<int> data = new List<int> { 1, 2, 1, 2, 3,3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
int last = 0;
int count = 0;
var res = data.Where((p, i) =>
{
if (i > 0)
{
if (p > last)
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
}
else
{
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
last = p;
return true;
}).ToList();

For things like this, I'm generally not a fan of solutions that use GroupBy or other methods that materialize the results. The reason is that you never know how long the input sequence will be, and materializations of these sub-sequences can be very costly.
I prefer to stream the results as they are pulled. This allows implementations of IEnumerable<T> that stream results to continue streaming through your transformation of that stream.
Note, this solution won't work if you break out of iterating through the sub-sequence and want to continue to the next sequence; if this is an issue, then one of the solutions that materialize the sub-sequences would probably be better.
However, for forward-only iterations of the entire sequence (which is the most typical use case), this will work just fine.
First, let's set up some helpers for our test classes:
private static IEnumerable<T> CreateEnumerable<T>(IEnumerable<T> enumerable)
{
// Validate parameters.
if (enumerable == null) throw new ArgumentNullException("enumerable");
// Cycle through and yield.
foreach (T t in enumerable)
yield return t;
}
private static void EnumerateAndPrintResults<T>(IEnumerable<T> data,
[CallerMemberName] string name = "") where T : IComparable<T>
{
// Write the name.
Debug.WriteLine("Case: " + name);
// Cycle through the chunks.
foreach (IEnumerable<T> chunk in data.
ChunkWhenNextSequenceElementIsNotGreater())
{
// Print opening brackets.
Debug.Write("{ ");
// Is this the first iteration?
bool firstIteration = true;
// Print the items.
foreach (T t in chunk)
{
// If not the first iteration, write a comma.
if (!firstIteration)
{
// Write the comma.
Debug.Write(", ");
}
// Write the item.
Debug.Write(t);
// Flip the flag.
firstIteration = false;
}
// Write the closing bracket.
Debug.WriteLine(" }");
}
}
CreateEnumerable is used for creating a streaming implementation, and EnumerateAndPrintResults will take the sequence, call ChunkWhenNextSequenceElementIsNotGreater (this is coming up and does the work) and output the results.
Here's the implementation. Note, I've chosen to implement them as extension methods on IEnumerable<T>; this is the first benefit, as it doesn't require a materialized sequence (technically, none of the other solutions do either, but it's better to explicitly state it like this).
First, the entry points:
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source)
where T : IComparable<T>
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
// Call the overload.
return source.
ChunkWhenNextSequenceElementIsNotGreater(
Comparer<T>.Default.Compare);
}
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source,
Comparison<T> comparer)
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
if (comparer == null) throw new ArgumentNullException("comparer");
// Call the implementation.
return source.
ChunkWhenNextSequenceElementIsNotGreaterImplementation(
comparer);
}
Note that this works on anything that implements IComparable<T> or where you provide a Comparison<T> delegate; this allows for any type and any kind of rules you want for performing the comparison.
Here's the implementation:
private static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreaterImplementation<T>(
this IEnumerable<T> source, Comparison<T> comparer)
{
// Validate parameters.
Debug.Assert(source != null);
Debug.Assert(comparer != null);
// Get the enumerator.
using (IEnumerator<T> enumerator = source.GetEnumerator())
{
// Move to the first element. If one can't, then get out.
if (!enumerator.MoveNext()) yield break;
// While true.
while (true)
{
// The new enumerator.
var chunkEnumerator = new
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>(
enumerator, comparer);
// Yield.
yield return chunkEnumerator;
// If the last move next returned false, then get out.
if (!chunkEnumerator.LastMoveNext) yield break;
}
}
}
Of note: this uses another class ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> to handle enumerating the sub-sequences. This class will iterate each of the items from the IEnumerator<T> that is obtained from the original IEnumerable<T>.GetEnumerator() call, but store the results of the last call to IEnumerator<T>.MoveNext().
This sub-sequence generator is stored, and the value of the last call to MoveNext is checked to see if the end of the sequence has or hasn't been hit. If it has, then it simply breaks, otherwise, it moves to the next chunk.
Here's the implementation of ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>:
internal class
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> :
IEnumerable<T>
{
#region Constructor.
internal ChunkWhenNextSequenceElementIsNotGreaterEnumerable(
IEnumerator<T> enumerator, Comparison<T> comparer)
{
// Validate parameters.
if (enumerator == null)
throw new ArgumentNullException("enumerator");
if (comparer == null)
throw new ArgumentNullException("comparer");
// Assign values.
_enumerator = enumerator;
_comparer = comparer;
}
#endregion
#region Instance state.
private readonly IEnumerator<T> _enumerator;
private readonly Comparison<T> _comparer;
internal bool LastMoveNext { get; private set; }
#endregion
#region IEnumerable implementation.
public IEnumerator<T> GetEnumerator()
{
// The assumption is that a call to MoveNext
// that returned true has already
// occurred. Store as the previous value.
T previous = _enumerator.Current;
// Yield it.
yield return previous;
// While can move to the next item, and the previous
// item is strictly less than the current item.
while ((LastMoveNext = _enumerator.MoveNext()) &&
_comparer(previous, _enumerator.Current) < 0)
{
// Yield.
yield return _enumerator.Current;
// Store the previous.
previous = _enumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
#endregion
}
Here's the test for the original condition in the question, along with the output:
[TestMethod]
public void TestStackOverflowCondition()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestStackOverflowCondition
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's the same input, but streamed as an enumerable:
[TestMethod]
public void TestStackOverflowConditionEnumerable()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(CreateEnumerable(data));
}
Output:
Case: TestStackOverflowConditionEnumerable
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's a test with non-sequential elements:
[TestMethod]
public void TestNonSequentialElements()
{
var data = new List<int> {
1, 3, 5, 7, 6, 8, 10, 2, 5, 8, 11, 11, 13
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialElements
{ 1, 3, 5, 7 }
{ 6, 8, 10 }
{ 2, 5, 8, 11 }
{ 11, 13 }
Finally, here's a test with characters instead of numbers:
[TestMethod]
public void TestNonSequentialCharacters()
{
var data = new List<char> {
'1', '3', '5', '7', '6', '8', 'a', '2', '5', '8', 'b', 'c', 'a'
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialCharacters
{ 1, 3, 5, 7 }
{ 6, 8, a }
{ 2, 5, 8, b, c }
{ a }

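You can do it with LINQ, using the index to calculate the group key. Note that the key only stays constant while a run increases by exactly 1, as in the sample data: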
var result = data.Select((n, i) => new { N = n, G = (i > 0 && n > data[i - 1] ? data[i - 1] + 1 : n) - i })
.GroupBy(a => a.G)
.Select(g => g.Select(n => n.N).ToArray())
.ToArray();

This is my simple loop approach using yield:
static IEnumerable<IList<int>> Split(IList<int> data)
{
if (data.Count == 0) yield break;
List<int> curr = new List<int>();
curr.Add(data[0]);
int last = data[0];
for (int i = 1; i < data.Count; i++)
{
if (data[i] <= last)
{
yield return curr;
curr = new List<int>();
}
curr.Add(data[i]);
last = data[i];
}
yield return curr;
}

I use a dictionary to get the 5 different lists, as below:
static void Main(string[] args)
{
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
Dictionary<int, List<int>> listDict = new Dictionary<int, List<int>>();
int listCnt = 1;
//as initial value get first value from list
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[0]);
for (int i = 1; i < data.Count; i++)
{
if (data[i] > listDict[listCnt].Last())
{
listDict[listCnt].Add(data[i]);
}
else
{
//increase list count and add a new list to dictionary
listCnt++;
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[i]);
}
}
//to use new lists
foreach (var dic in listDict)
{
Console.WriteLine( $"List {dic.Key} : " + string.Join(",", dic.Value.Select(x => x.ToString()).ToArray()));
}
}
Output :
List 1 : 1,2
List 2 : 1,2,3
List 3 : 3
List 4 : 1,2,3,4
List 5 : 1,2,3,4,5,6

Split array with LINQ

Assuming I have a list
var listOfInt = new List<int> {1, 2, 3, 4, 7, 8, 12, 13, 14}
How can I use LINQ to obtain a list of lists as follows:
{{1, 2, 3, 4}, {7, 8}, {12, 13, 14}}
So, I have to take the consecutive values and group them into lists.

You can create an extension method (I omitted the source null check here) which iterates the source and creates groups of consecutive items. If the next item in the source is not consecutive, the current group is yielded:
public static IEnumerable<List<int>> ToConsecutiveGroups(
this IEnumerable<int> source)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
int current = iterator.Current;
List<int> group = new List<int> { current };
while (iterator.MoveNext())
{
int next = iterator.Current;
if (next < current || current + 1 < next)
{
yield return group;
group = new List<int>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13, 14 };
var groups = listOfInt.ToConsecutiveGroups();
Result:
[
[ 1, 2, 3, 4 ],
[ 7, 8 ],
[ 12, 13, 14 ]
]
UPDATE: Here is a generic version of this extension method, which accepts a predicate that decides whether two values should be considered consecutive:
public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
this IEnumerable<T> source, Func<T,T, bool> isConsecutive)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
T current = iterator.Current;
List<T> group = new List<T> { current };
while (iterator.MoveNext())
{
T next = iterator.Current;
if (!isConsecutive(current, next))
{
yield return group;
group = new List<T>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var result = listOfInt.ToConsecutiveGroups((x,y) => (x == y) || (x == y - 1));
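Because the predicate is supplied by the caller, the same method can group values other than int. The following is a minimal sketch, not part of the original answer: it restates the generic method so that it compiles on its own, and the class name ConsecutiveGroupsDemo and the helper LetterRuns are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConsecutiveGroupsDemo
{
    // Restatement of the generic extension method so this sketch is self-contained.
    public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
        this IEnumerable<T> source, Func<T, T, bool> isConsecutive)
    {
        using (var iterator = source.GetEnumerator())
        {
            if (!iterator.MoveNext())
                yield break;

            T current = iterator.Current;
            var group = new List<T> { current };
            while (iterator.MoveNext())
            {
                T next = iterator.Current;
                if (!isConsecutive(current, next))
                {
                    // The run is broken: emit the finished group and start a new one.
                    yield return group;
                    group = new List<T>();
                }
                current = next;
                group.Add(current);
            }
            yield return group;
        }
    }

    // Hypothetical helper: splits a string into runs of alphabetically adjacent characters.
    public static List<string> LetterRuns(string text) =>
        text.ToConsecutiveGroups((x, y) => y - x == 1)
            .Select(g => new string(g.ToArray()))
            .ToList();
}
```

Under these assumptions, ConsecutiveGroupsDemo.LetterRuns("abcxyzmn") returns "abc", "xyz" and "mn".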

This works for both sorted and unsorted lists:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13 };
int index = 0;
var result = listOfInt.Zip(listOfInt
.Concat(listOfInt.Reverse<int>().Take(1))
.Skip(1),
(v1, v2) =>
new
{
V = v1,
G = (v2 - v1) != 1 ? index++ : index
})
.GroupBy(x => x.G, x => x.V, (k, l) => l.ToList())
.ToList();
The external index variable numbers the consecutive groups: it is incremented whenever the difference between a value and the next one is not 1. You can then simply GroupBy on that index.
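To see what the Zip step produces before grouping, the pairs can be materialized directly. This is a small sketch for illustration only: the class name ZipIndexDemo and the method Pairs are hypothetical, and value tuples are used in place of the anonymous type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipIndexDemo
{
    // Returns the (value, group index) pairs that the Zip step yields, without the GroupBy.
    public static List<(int V, int G)> Pairs(List<int> listOfInt)
    {
        int index = 0;
        return listOfInt
            // Pair every value with its successor; the last value is paired with itself.
            .Zip(listOfInt.Concat(listOfInt.Reverse<int>().Take(1)).Skip(1),
                 // index++ returns the old index, so the value that ends a run stays in that run.
                 (v1, v2) => (V: v1, G: (v2 - v1) != 1 ? index++ : index))
            .ToList(); // materialize once, because the lambda mutates index
    }
}
```

For { 1, 2, 3, 4, 7, 8, 12, 13 } the pairs are (1, 0), (2, 0), (3, 0), (4, 0), (7, 1), (8, 1), (12, 2), (13, 2), so grouping on G gives { 1, 2, 3, 4 }, { 7, 8 } and { 12, 13 }. Since the query captures and mutates index, enumerating it a second time without resetting index would produce different group numbers.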

Assuming your input is in order, the following will work:
var grouped = input.Select((n, i) => new { n, d = n - i }).GroupBy(p => p.d, p => p.n);
It won't work if your input is e.g. { 1, 2, 3, 999, 5, 6, 7 }.
You'd get { { 1, 2, 3, 5, 6, 7 }, { 999 } }.
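The reason this works is that within a run of consecutive ascending values, the value minus its position stays constant, so n - i can serve as the group key. A minimal sketch that wraps the query in a method (the names IndexOffsetDemo and GroupByOffset are made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexOffsetDemo
{
    // Groups values by (value - position); consecutive ascending runs share the same key.
    public static List<List<int>> GroupByOffset(IEnumerable<int> input) =>
        input.Select((n, i) => new { n, d = n - i })
             .GroupBy(p => p.d, p => p.n)
             .Select(g => g.ToList())
             .ToList();
}
```

For { 1, 2, 3, 4, 7, 8, 12, 13, 14 } the keys are 1, 1, 1, 1, 3, 3, 6, 6, 6, which gives three groups. For { 1, 2, 3, 999, 5, 6, 7 } the keys are 1, 1, 1, 996, 1, 1, 1, which is why 5, 6, 7 end up in the same group as 1, 2, 3.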

This works:
var results =
listOfInt
.Skip(1)
.Aggregate(
new List<List<int>>(new [] { listOfInt.Take(1).ToList() }),
(a, x) =>
{
if (a.Last().Last() + 1 == x)
{
a.Last().Add(x);
}
else
{
a.Add(new List<int>(new [] { x }));
}
return a;
});
With the list from the question, this yields {{1, 2, 3, 4}, {7, 8}, {12, 13, 14}}.

Merging 2 collections

How can I combine 2 collections so that the resulting collection contains the values of both collections in alternating order?
Example:
Col A= [1,2,3,4]
Col B= [5,6,7,8]
Result Col C=[1,5,2,6,3,7,4,8]

There are lots of ways you could do this, depending on the types of the input and the required type of the output. There's no library method that I'm aware of, however; you'd have to "roll your own".
One possibility would be a LINQ-style iterator method, assuming that all we know about the input collections is that they implement IEnumerable<T>:
static IEnumerable<T> Interleave<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
bool bEmpty = false;
using (var bEnumerator = b.GetEnumerator())
{
foreach (var elementA in a)
{
yield return elementA;
if (!bEmpty && bEnumerator.MoveNext())
yield return bEnumerator.Current;
else
bEmpty = true;
}
if (!bEmpty)
while (bEnumerator.MoveNext())
yield return bEnumerator.Current;
}
}

int[] a = { 1, 2, 3, 4 };
int[] b = { 5, 6, 7, 8 };
int[] result = a.SelectMany((n, index) => new[] { n, b[index] }).ToArray();
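If a and b do not have the same length, be careful with b[index]: it throws an IndexOutOfRangeException when a is longer than b. One option is to substitute a default value, e.g. index >= b.Length ? 0 : b[index]. Elements of b beyond the length of a are ignored either way.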

If the collections do not necessarily have the same length, consider an extension method:
public static IEnumerable<T> AlternateMerge<T>(this IEnumerable<T> source,
IEnumerable<T> other)
{
using(var sourceEnumerator = source.GetEnumerator())
using(var otherEnumerator = other.GetEnumerator())
{
bool haveItemsSource = true;
bool haveItemsOther = true;
while (haveItemsSource || haveItemsOther)
{
haveItemsSource = sourceEnumerator.MoveNext();
haveItemsOther = otherEnumerator.MoveNext();
if (haveItemsSource)
yield return sourceEnumerator.Current;
if (haveItemsOther)
yield return otherEnumerator.Current;
}
}
}
Usage:
List<int> A = new List<int> { 1, 2, 3 };
List<int> B = new List<int> { 5, 6, 7, 8 };
var mergedList = A.AlternateMerge(B).ToList(); // 1, 5, 2, 6, 3, 7, 8

Assuming that both collections are of equal length:
Debug.Assert(a.Count == b.Count);
for (int i = 0; i < a.Count; i++)
{
c.Add(a[i]);
c.Add(b[i]);
}
Debug.Assert(c.Count == (a.Count + b.Count));

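Use LINQ's Union extension method, for example (note that Union returns the distinct values of both collections, not an alternating sequence):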
var colA = new List<int> { 1, 2, 3, 4 };
var colB = new List<int> { 1, 5, 2, 6, 3, 7, 4, 8};
var result = colA.Union( colB); // 1, 2, 3, 4, 5, 6, 7, 8
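For comparison, here is a short sketch that contrasts Union with an alternating merge built from Zip and SelectMany. It is not taken from any of the answers; InterleaveDemo, UnionOf and InterleaveEqualLength are hypothetical names, and the interleaving variant assumes both inputs have the same length because Zip stops at the shorter one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterleaveDemo
{
    // Set union: distinct values in order of first occurrence, no alternation.
    public static List<int> UnionOf(IEnumerable<int> a, IEnumerable<int> b) =>
        a.Union(b).ToList();

    // Alternating merge: pair the elements up, then flatten each pair in order.
    public static List<T> InterleaveEqualLength<T>(IEnumerable<T> a, IEnumerable<T> b) =>
        a.Zip(b, (x, y) => new[] { x, y })
         .SelectMany(pair => pair)
         .ToList();
}
```

With colA = { 1, 2, 3, 4 } and colB = { 5, 6, 7, 8 }, UnionOf returns 1, 2, 3, 4, 5, 6, 7, 8, while InterleaveEqualLength returns 1, 5, 2, 6, 3, 7, 4, 8. For inputs of different lengths, an enumerator-based method such as AlternateMerge is the safer choice.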

