Split Array By Values In Sequence [duplicate]

This question already has answers here:
LINQ to find series of consecutive numbers
(6 answers)
Closed 5 years ago.
Is there an easy (linq?) way to split an int array into new arrays based off unbroken numerical sequences? For example given this pseudo code:
[Fact]
public void ArraySpike()
{
var source = new[] {1, 2, 3, 7, 8, 9, 12, 13, 24};
var results = SplitArray(source);
// == on arrays compares references, so compare the contents with Assert.Equal instead
Assert.Equal(new[] {1, 2, 3}, results[0]);
Assert.Equal(new[] {7, 8, 9}, results[1]);
Assert.Equal(new[] {12, 13}, results[2]);
Assert.Equal(new[] {24}, results[3]);
}
public int[][] SplitArray(int[] source)
{
return source.???
}

This can be done with the LINQ extension method Aggregate. My seeding is not very elegant, but that is easy enough to change. The results variable will contain the list of sequences; they are of type List<int> rather than arrays because a list can easily grow inside the function, whereas an array is always of fixed size.
This also assumes the source is already ordered and unique. If that is not the case, add .OrderBy(x => x).Distinct() first.
var source = new[] { 1, 2, 3, 7, 8, 9, 12, 13, 24 };
var results = new List<List<int>>{new List<int>()};
var temp = source.Aggregate(results[0], (b, c) =>
{
if (b.Count > 0 && b.Last() != c - 1)
{
b = new List<int>();
results.Add(b);
}
b.Add(c);
return b;
});
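If the int[][] signature from the question is required, the same idea can be written as a plain loop. This is only a sketch under the same assumption (the source is already ordered and unique); the class name SequenceSplitter is made up for illustration.

```csharp
using System.Collections.Generic;

public static class SequenceSplitter
{
    // Splits an ascending, duplicate-free array into runs of consecutive integers.
    public static int[][] SplitArray(int[] source)
    {
        var runs = new List<int[]>();
        var current = new List<int>();
        foreach (int value in source)
        {
            // Start a new run when the value does not directly follow the previous one.
            if (current.Count > 0 && current[current.Count - 1] != value - 1)
            {
                runs.Add(current.ToArray());
                current = new List<int>();
            }
            current.Add(value);
        }
        // Flush the last run; an empty source produces no runs at all.
        if (current.Count > 0)
        {
            runs.Add(current.ToArray());
        }
        return runs.ToArray();
    }
}
```

Calling SequenceSplitter.SplitArray(source) on the array from the question should give the four arrays the test expects.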

I dug up this extension method from my personal collection:
public static IEnumerable<IEnumerable<T>> GroupConnected<T>(this IEnumerable<T> list, Func<T,T,bool> connectionCondition)
{
if (list == null)
{
yield break;
}
using (var enumerator = list.GetEnumerator())
{
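T prev = default(T);
// Track explicitly whether a previous element exists: comparing prev with default(T)
// would misfire for a legitimate 0 and would throw for null reference types.
bool hasPrev = false;
var temp = new List<T>();
while (enumerator.MoveNext())
{
T curr = enumerator.Current;
if (hasPrev && !connectionCondition(prev, curr))
{
yield return temp;
temp = new List<T>();
}
temp.Add(curr);
prev = curr;
hasPrev = true;
}
// Do not yield an empty group for an empty source.
if (hasPrev)
{
yield return temp;
}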
}
}
It solves the problem in a more general sense: split up a sequence in subsequences of elements that are "connected" somehow. It traverses the sequence and collects each element in a temporary list until the next item isn't "connected". It then returns the temporary list and begins a new one.
Your array elements are connected when they have a difference of 1:
var results = source.GroupConnected((a,b) => b - a == 1);

Related

First found duplicate element using LINQ (Not contiguous)

How do I print the first duplicate element from an array?
var arr = new int[]{ 3, 2, 5, 1, 5, 4, 2, 15 };
Currently this method prints 2 instead of 5.
public int FirstDuplicate(int[] arr)
{
var firstDup = arr
.GroupBy(x => x)
.Where(grp => grp.Count() == 2)
.Select(grp => grp.Key)
.FirstOrDefault();
if (firstDup > 0) return firstDup;
return -1;
}

You can write an extension method that returns all duplicates from an IEnumerable<T>, like
public static class EnumerableExtensions
{
public static IEnumerable<T> Duplicates<T>( this IEnumerable<T> source )
{
var hashset = new HashSet<T>();
foreach ( var item in source )
{
if ( hashset.Contains(item) )
yield return item;
else
hashset.Add(item);
}
}
}
and then use it
var arr = new int[]{ 3, 2, 5, 5, 4, 2, 15 };
var firstDuplicate = arr.Duplicates().First();
see .net fiddle example

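This worked for me. It compares each value with the element before it (via the index overload of Where), applies Distinct(), and takes the first element of the resulting array. Note that it only detects duplicates that are adjacent in the array.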
var arr = new int[] { 3, 2, 5, 5, 4, 2, 15 };
var adjacentDuplicate = arr.Skip(1) // Skip first
.Where((value,index) => value == arr[index])
.Distinct()
.ToArray(); // Convert to array
if (adjacentDuplicate.Any())
{
Console.WriteLine(adjacentDuplicate[0]); // Print first duplicate
}
else
{
// No duplicates found.
}

Based on Sir Rufo's answer, I would write two extensions:
public static IEnumerable<T> Duplicates<T>(this IEnumerable<T> source)
{
var hashset = new HashSet<T>();
foreach (var item in source)
{
if (!hashset.Add(item))
{
yield return item;
}
}
}
public static IEnumerable<T?> AsNullable<T>(this IEnumerable<T> source) where T : struct
{
return source.Select(x => (T?)x);
}
You can use it like
var duplicate = arr
.Duplicates()
.AsNullable()
.FirstOrDefault();
AsNullable converts int into int? without hard-coding the type. When the result is null, there is no duplicate. You can use it in other situations too, such as calculating the Max of a potentially empty sequence of non-nullable values (you can define it for IQueryable as well). The advantage of this extension is that when you use it, you know for sure that null is not a valid value in the source, so you will not shoot yourself in the foot when null suddenly becomes a possible value.
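For the int case in the question, the HashSet idea from the answers above can also be folded into a single method that stops at the first repeat. This is a minimal sketch, not taken from any of the answers; the class name DuplicateFinder is an assumption.

```csharp
using System.Collections.Generic;

public static class DuplicateFinder
{
    // Returns the first value that repeats an earlier one, or null if every value is distinct.
    public static int? FirstDuplicate(IEnumerable<int> source)
    {
        var seen = new HashSet<int>();
        foreach (int item in source)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (!seen.Add(item))
            {
                return item;
            }
        }
        return null;
    }
}
```

For { 3, 2, 5, 1, 5, 4, 2, 15 } this returns 5, because the second 5 is reached before the second 2. The GroupBy version in the question returns 2 because groups are produced in order of first appearance of each key, and 2 appears before 5.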

Move list elements meeting condition to the top of the list

I want to move specific number to the top of this list.
int numberToBeMovedOnTop = 4;
List<int> lst = new List<int>(){1, 2, 3, 4, 5, 5, 4, 7, 9, 4, 2, 1};
List<int> lstOdd = lst.FindAll(l => l == numberToBeMovedOnTop);
lstOdd.AddRange(lst.FindAll(l => l != numberToBeMovedOnTop));
Where numberToBeMovedOnTop is a variable.
This gives me the desired result, but is there a better solution? I could iterate the list once and swap the first occurrence of numberToBeMovedOnTop with the first element, the second occurrence with the second element, and so on. But can this be done with some built-in C# function without iterating the list twice?

You could use LINQ:
List<int> lstOdd = lst.OrderByDescending(i => i == numberToBeMovedOnTop).ToList();
Why OrderByDescending? Because the comparison returns a bool and true is higher than false. You could also use:
List<int> lstOdd = lst.OrderBy(i => i == numberToBeMovedOnTop ? 0 : 1).ToList();
Note that this works because OrderBy and OrderByDescending are performing a stable sort. That means that the original order remains for all equal items.
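To make the stability point visible, here is a small sketch that attaches a label to each value so that equal keys can be told apart. The tuple shape and the names StableSortDemo and LabelsWithTargetFirst are made up for illustration; value tuples need C# 7 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StableSortDemo
{
    // Returns the labels ordered so that items whose Value equals target come first.
    // Because OrderByDescending is stable, items with the same key keep their original order.
    public static List<string> LabelsWithTargetFirst((string Label, int Value)[] items, int target)
    {
        return items
            .OrderByDescending(item => item.Value == target) // true sorts before false
            .Select(item => item.Label)
            .ToList();
    }
}
```

With the items ("a", 4), ("b", 1), ("c", 4), ("d", 2) and target 4, the labels come out as a, c, b, d: the two matches stay in their original order, and so do the two non-matches.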
For what it's worth, here is an extension method that works with any type and predicate and is a little bit more efficient:
public static List<T> PrependAll<T>(this List<T> list, Func<T, bool> predicate)
{
var returnList = new List<T>();
var listNonMatch = new List<T>();
foreach (T item in list)
{
if (predicate(item))
returnList.Add(item);
else
listNonMatch.Add(item);
}
returnList.AddRange(listNonMatch);
return returnList;
}
Usage: List<int> lstOdd = lst.PrependAll(i => i == numberToBeMovedOnTop);

Aside from using linq, it might be just as efficient/understandable to do this without linq
var listToAdd = new List<int>();
var listOdd = new List<int>();
for(int i = 0; i < lst.Count; i++)
{
if(lst[i] == numberToBeMovedOnTop)
{
listToAdd.Add(numberToBeMovedOnTop);
}
else
{
listOdd.Add(lst[i]);
}
}
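listToAdd.AddRange(listOdd); // matches first, then everything else
Collect the matching items in one list and the rest in another, then append the rest after the matches. listToAdd then holds the final order.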

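Group by the predicate, then concatenate the groups?
var nums = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// Put the group of matches (key == true) first. SelectMany keeps duplicates, whereas Union would drop them.
var changed = nums.GroupBy(x => x % 2 == 0)
.OrderByDescending(g => g.Key)
.SelectMany(g => g)
.ToList();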

Display match for only 3 consecutive numbers

How can I display only the numbers that appear exactly 3 times in a row? For example, in my code below I only want it to return 4, as that appears 3 times.
9 appears 4 times, so I do not want that, and 7 appears twice, so I do not want that either.
The code I currently have displays 9.
int[] intArray = { 9, 9, 9, 9, 6, 4, 4, 4, 7, 7 };
var adjacentDuplicateNumbers = intArray
.Skip(1)
.Where((value, index) => value == intArray[index])
.Distinct();
var enumerable = adjacentDuplicateNumbers as int[] ?? adjacentDuplicateNumbers.ToArray();
if (enumerable.Any())
{
Console.WriteLine("{0} is a consecutive number and is repeated 3 times.", enumerable.First());
}
else
{
Console.WriteLine("no consecutive number found.");
}

Using the extension method of this post: LINQ to find series of consecutive numbers
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(this IEnumerable<T> seq, Func<T, T, bool> condition)
{
T prev = seq.First();
List<T> list = new List<T>() { prev };
foreach (T item in seq.Skip(1))
{
if (condition(prev, item) == false)
{
yield return list;
list = new List<T>();
}
list.Add(item);
prev = item;
}
yield return list;
}
Usage:
var res = intArray.GroupWhile((a, b) => a == b).
Where(x => x.Count() == 3).Select(x => x.First());

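Sometimes a simple for loop is enough (and should be faster than LINQ). This version counts the length of each run and keeps only the values whose run is exactly 3 long:
int[] intArray = { 9, 9, 9, 9, 6, 4, 4, 4, 7, 7 };
var result = new List<int>();
int runLength = 0;
for(int i = 0; i < intArray.Length; i++)
{
runLength++;
// A run ends at the last element or when the next element differs.
bool runEnds = i == intArray.Length - 1 || intArray[i + 1] != intArray[i];
if(runEnds)
{
if(runLength == 3)
{
result.Add(intArray[i]);
}
runLength = 0;
}
}
// result contains only 4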

var results = intArray.Distinct()
.ToDictionary(k => k, v => intArray.Count(x => x == v))
.Where(x => x.Value == 3)
.Select(x => x.Key);
Take the distinct elements in the array. Use these as keys in a dictionary that maps each key to the number of its occurrences in the original array. Use Where to select only the pairs that match the required count (3). Use Select to return the resulting keys, in this example only 4. Note that this counts total occurrences in the array, not consecutive runs.

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
This does not include the pair, or 2-combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal. I think it computes the intersect of both pair A, B and pair B, A, which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.

// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
Possibly the fastest code (but less readable), complexity O(N):
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = sequences.Length;
for (int i = 0; i < length; i++)
{
subHash.Clear();
int subLength = sequences[i].Length;
for (int j = 0; j < subLength; j++)
{
int n = sequences[i][j];
// HashSet.Add returns false for a value already seen in this sub-array,
// so each value is counted at most once per sequence
if (subHash.Add(n))
{
int counter;
if (dic.TryGetValue(n, out counter))
{
// already seen in an earlier sequence
dic[n] = counter + 1;
}
else
{
// first occurrence
dic[n] = 1;
}
}
}
}
// values that occur in at least two sequences
var repeated = dic.Where(p => p.Value > 1).Select(p => p.Key);

This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
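A variation on the loop above, sketched as a complete extension method with a configurable threshold. The name InAtLeast and the minCount parameter are assumptions made for this example, and it assumes the sequences contain no null elements (a Dictionary cannot use null as a key).

```csharp
using System.Collections.Generic;

public static class SequenceSetExtensions
{
    // Returns the distinct values that occur in at least minCount of the given sequences.
    public static IEnumerable<T> InAtLeast<T>(this IEnumerable<IEnumerable<T>> source, int minCount = 2)
    {
        var counts = new Dictionary<T, int>();
        foreach (var sequence in source)
        {
            // A per-sequence set ensures each value is counted at most once per sequence.
            foreach (var element in new HashSet<T>(sequence))
            {
                counts.TryGetValue(element, out int count); // count stays 0 when the key is missing
                counts[element] = count + 1;
            }
        }

        var result = new List<T>();
        foreach (var pair in counts)
        {
            if (pair.Value >= minCount)
            {
                result.Add(pair.Key);
            }
        }
        return result;
    }
}
```

For the example input from the question, sequences.InAtLeast() should contain 2, 4, 5, 6 and 7; the order of the result is not guaranteed. Each element is visited once, so the work grows linearly with the total number of elements.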

You can skip pairs of sequences that have already been intersected; this will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How does it work?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 has already been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 have already been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the code below:
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();

You can try this approach. It might be more efficient, and it also lets you specify the minimum intersection count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some words of explanation:
It enumerates all unique items across all sequences. Since Distinct uses a set internally, this should be pretty efficient, and it helps when there are many duplicates across the sequences.
The inner loop checks, for every sequence, whether it contains the unique item. For that it uses Enumerable.Contains, which stops as soon as a match is found (so duplicates are no issue).
If the intersection count reaches the minimum intersection count, the item is yielded and the next (unique) item is checked.

That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure there are no duplicates within each sequence. Then you flatten all sequences into a single sequence and look for values that occur more than once.

Selecting unique elements from a List in C#

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?

var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }

var nums = new int{ 0...4,4,5};
var distinct = nums.Distinct();
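Make sure you're using LINQ and .NET Framework 3.5 or later. Note that Distinct() keeps one copy of every value (0, 1, 2, 3, 4, 5); it does not drop the repeated values entirely.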

With lambda..
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);

C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}

Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type, you can pass a custom comparer to Distinct(), such as Distinct(new OrderComparer()), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}
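A sketch of how such a comparer might be used. The Order class here is hypothetical (only OrderID appears in the answer; Customer is added so that two different objects can share an ID), and the null checks in Equals are an addition so the comparer does not throw on null entries. Note that Distinct takes a comparer instance.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Order type; only OrderID comes from the original answer.
public class Order
{
    public int OrderID { get; set; }
    public string Customer { get; set; }
}

public class OrderComparer : IEqualityComparer<Order>
{
    public bool Equals(Order o1, Order o2)
    {
        if (ReferenceEquals(o1, o2)) return true;     // same object, or both null
        if (o1 is null || o2 is null) return false;   // exactly one is null
        return o1.OrderID == o2.OrderID;
    }

    public int GetHashCode(Order obj)
    {
        return obj.OrderID.GetHashCode();
    }
}

public static class OrderQueries
{
    // Returns one order per OrderID.
    public static List<Order> DistinctById(IEnumerable<Order> orders)
    {
        return orders.Distinct(new OrderComparer()).ToList();
    }
}
```

Two Order objects with the same OrderID but different Customer values are treated as equal by this comparer, so only one of them ends up in the result.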

If LINQ isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary<int, int>, where the first int is the number and the second int is the number of occurrences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary, selecting only those elements where the number of occurrences is 1.

I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}

There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5
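The output above is the set of distinct values, whereas the question asks for the values that occur exactly once ({0, 1, 3, 5}). A HashSet can still do the job if a second set records the values that repeat. The following is a sketch with made-up names (UniqueFinder, OccurringOnce), not code from the answer.

```csharp
using System.Collections.Generic;

public static class UniqueFinder
{
    // Returns the values that occur exactly once, in order of first appearance.
    public static List<int> OccurringOnce(IEnumerable<int> source)
    {
        var seen = new HashSet<int>();
        var repeated = new HashSet<int>();
        var order = new List<int>();
        foreach (int value in source)
        {
            if (seen.Add(value))
            {
                order.Add(value);    // first time this value appears
            }
            else
            {
                repeated.Add(value); // any later occurrence marks it as repeated
            }
        }
        // Drop every value that was seen more than once.
        order.RemoveAll(repeated.Contains);
        return order;
    }
}
```

For { 0, 1, 2, 2, 2, 3, 4, 4, 5 } this returns 0, 1, 3, 5. Unlike the sorted-input loop further down, it does not require the numbers to be sorted.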

Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]

In .NET 2.0, I'm pretty sure about this solution:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}
