Split a list into multiple lists at increasing sequence broken - c#

I've a List of int and I want to create multiple List after splitting the original list when a lower or same number is found. Numbers are not in sorted order.
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
I want the result to be as following lists:
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Currently, I'm using following linq to do this but not helping me out:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
var res = data.Where((p, i) =>
{
int count = 0;
resultLists.Add(new List<int>());
if (p < data[(i + 1) >= data.Count ? i - 1 : i + 1])
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
}
return true;
}).ToList();

I'd just go for something simple:
public static IEnumerable<List<int>> SplitWhenNotIncreasing(List<int> numbers)
{
for (int i = 1, start = 0; i <= numbers.Count; ++i)
{
if (i != numbers.Count && numbers[i] > numbers[i - 1])
continue;
yield return numbers.GetRange(start, i - start);
start = i;
}
}
Which you'd use like so:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in SplitWhenNotIncreasing(data))
Console.WriteLine(string.Join(", ", subset));
If you really did need to work with IEnumerable<T>, then the simplest way I can think of is like this:
public sealed class IncreasingSubsetFinder<T> where T: IComparable<T>
{
public static IEnumerable<IEnumerable<T>> Find(IEnumerable<T> numbers)
{
return new IncreasingSubsetFinder<T>().find(numbers.GetEnumerator());
}
IEnumerable<IEnumerable<T>> find(IEnumerator<T> iter)
{
if (!iter.MoveNext())
yield break;
while (!done)
yield return increasingSubset(iter);
}
IEnumerable<T> increasingSubset(IEnumerator<T> iter)
{
while (!done)
{
T prev = iter.Current;
yield return prev;
if ((done = !iter.MoveNext()) || iter.Current.CompareTo(prev) <= 0)
yield break;
}
}
bool done;
}
Which you would call like this:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in IncreasingSubsetFinder<int>.Find(data))
Console.WriteLine(string.Join(", ", subset));

This is not a typical LINQ operation, so as usual in such cases (when one insists on using LINQ) I would suggest using Aggregate method:
var result = data.Aggregate(new List<List<int>>(), (r, n) =>
{
if (r.Count == 0 || n <= r.Last().Last()) r.Add(new List<int>());
r.Last().Add(n);
return r;
});

You can use the index to get the previous item and calculate the group id out of comparing the values. Then group on the group ids and get the values out:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
int groupId = 0;
var groups = data.Select
( (item, index)
=> new
{ Item = item
, Group = index > 0 && item <= data[index - 1] ? ++groupId : groupId
}
);
List<List<int>> list = groups.GroupBy(g => g.Group)
.Select(x => x.Select(y => y.Item).ToList())
.ToList();

I really like Matthew Watson's solution. If however you do not want to rely on List<T>, here is my simple generic approach enumerating the enumerable once at most and still retaining the capability for lazy evaluation.
public static IEnumerable<IEnumerable<T>> AscendingSubsets<T>(this IEnumerable<T> superset) where T :IComparable<T>
{
var supersetEnumerator = superset.GetEnumerator();
if (!supersetEnumerator.MoveNext())
{
yield break;
}
T oldItem = supersetEnumerator.Current;
List<T> subset = new List<T>() { oldItem };
while (supersetEnumerator.MoveNext())
{
T currentItem = supersetEnumerator.Current;
if (currentItem.CompareTo(oldItem) > 0)
{
subset.Add(currentItem);
}
else
{
yield return subset;
subset = new List<T>() { currentItem };
}
oldItem = supersetEnumerator.Current;
}
yield return subset;
}
Edit: Simplified the solution further to only use one enumerator.

I have modified your code, and now working fine:
List<int> data = new List<int> { 1, 2, 1, 2, 3,3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
int last = 0;
int count = 0;
var res = data.Where((p, i) =>
{
if (i > 0)
{
if (p > last && p!=last)
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
}
else
{
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
last = p;
return true;
}).ToList();

For things like this, I'm generally not a fan of solutions that use GroupBy or other methods that materialize the results. The reason is that you never know how long the input sequence will be, and materializations of these sub-sequences can be very costly.
I prefer to stream the results as they are pulled. This allows implementations of IEnumerable<T> that stream results to continue streaming through your transformation of that stream.
Note, this solution won't work if you break out of iterating through the sub-sequence and want to continue to the next sequence; if this is an issue, then one of the solutions that materialize the sub-sequences would probably be better.
However, for forward-only iterations of the entire sequence (which is the most typical use case), this will work just fine.
First, let's set up some helpers for our test classes:
private static IEnumerable<T> CreateEnumerable<T>(IEnumerable<T> enumerable)
{
// Validate parameters.
if (enumerable == null) throw new ArgumentNullException("enumerable");
// Cycle through and yield.
foreach (T t in enumerable)
yield return t;
}
private static void EnumerateAndPrintResults<T>(IEnumerable<T> data,
[CallerMemberName] string name = "") where T : IComparable<T>
{
// Write the name.
Debug.WriteLine("Case: " + name);
// Cycle through the chunks.
foreach (IEnumerable<T> chunk in data.
ChunkWhenNextSequenceElementIsNotGreater())
{
// Print opening brackets.
Debug.Write("{ ");
// Is this the first iteration?
bool firstIteration = true;
// Print the items.
foreach (T t in chunk)
{
// If not the first iteration, write a comma.
if (!firstIteration)
{
// Write the comma.
Debug.Write(", ");
}
// Write the item.
Debug.Write(t);
// Flip the flag.
firstIteration = false;
}
// Write the closing bracket.
Debug.WriteLine(" }");
}
}
CreateEnumerable is used for creating a streaming implementation, and EnumerateAndPrintResults will take the sequence, call ChunkWhenNextSequenceElementIsNotGreater (this is coming up and does the work) and output the results.
Here's the implementation. Note, I've chosen to implement them as extension methods on IEnumerable<T>; this is the first benefit, as it doesn't require a materialized sequence (technically, none of the other solutions do either, but it's better to explicitly state it like this).
First, the entry points:
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source)
where T : IComparable<T>
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
// Call the overload.
return source.
ChunkWhenNextSequenceElementIsNotGreater(
Comparer<T>.Default.Compare);
}
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source,
Comparison<T> comparer)
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
if (comparer == null) throw new ArgumentNullException("comparer");
// Call the implementation.
return source.
ChunkWhenNextSequenceElementIsNotGreaterImplementation(
comparer);
}
Note that this works on anything that implements IComparable<T> or where you provide a Comparison<T> delegate; this allows for any type and any kind of rules you want for performing the comparison.
Here's the implementation:
private static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreaterImplementation<T>(
this IEnumerable<T> source, Comparison<T> comparer)
{
// Validate parameters.
Debug.Assert(source != null);
Debug.Assert(comparer != null);
// Get the enumerator.
using (IEnumerator<T> enumerator = source.GetEnumerator())
{
// Move to the first element. If one can't, then get out.
if (!enumerator.MoveNext()) yield break;
// While true.
while (true)
{
// The new enumerator.
var chunkEnumerator = new
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>(
enumerator, comparer);
// Yield.
yield return chunkEnumerator;
// If the last move next returned false, then get out.
if (!chunkEnumerator.LastMoveNext) yield break;
}
}
}
Of note: this uses another class ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> to handle enumerating the sub-sequences. This class will iterate each of the items from the IEnumerator<T> that is obtained from the original IEnumerable<T>.GetEnumerator() call, but store the results of the last call to IEnumerator<T>.MoveNext().
This sub-sequence generator is stored, and the value of the last call to MoveNext is checked to see if the end of the sequence has or hasn't been hit. If it has, then it simply breaks, otherwise, it moves to the next chunk.
Here's the implementation of ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>:
internal class
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> :
IEnumerable<T>
{
#region Constructor.
internal ChunkWhenNextSequenceElementIsNotGreaterEnumerable(
IEnumerator<T> enumerator, Comparison<T> comparer)
{
// Validate parameters.
if (enumerator == null)
throw new ArgumentNullException("enumerator");
if (comparer == null)
throw new ArgumentNullException("comparer");
// Assign values.
_enumerator = enumerator;
_comparer = comparer;
}
#endregion
#region Instance state.
private readonly IEnumerator<T> _enumerator;
private readonly Comparison<T> _comparer;
internal bool LastMoveNext { get; private set; }
#endregion
#region IEnumerable implementation.
public IEnumerator<T> GetEnumerator()
{
// The assumption is that a call to MoveNext
// that returned true has already
// occured. Store as the previous value.
T previous = _enumerator.Current;
// Yield it.
yield return previous;
// While can move to the next item, and the previous
// item is less than or equal to the current item.
while ((LastMoveNext = _enumerator.MoveNext()) &&
_comparer(previous, _enumerator.Current) < 0)
{
// Yield.
yield return _enumerator.Current;
// Store the previous.
previous = _enumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
#endregion
}
Here's the test for the original condition in the question, along with the output:
[TestMethod]
public void TestStackOverflowCondition()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestStackOverflowCondition
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's the same input, but streamed as an enumerable:
[TestMethod]
public void TestStackOverflowConditionEnumerable()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(CreateEnumerable(data));
}
Output:
Case: TestStackOverflowConditionEnumerable
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's a test with non-sequential elements:
[TestMethod]
public void TestNonSequentialElements()
{
var data = new List<int> {
1, 3, 5, 7, 6, 8, 10, 2, 5, 8, 11, 11, 13
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialElements
{ 1, 3, 5, 7 }
{ 6, 8, 10 }
{ 2, 5, 8, 11 }
{ 11, 13 }
Finally, here's a test with characters instead of numbers:
[TestMethod]
public void TestNonSequentialCharacters()
{
var data = new List<char> {
'1', '3', '5', '7', '6', '8', 'a', '2', '5', '8', 'b', 'c', 'a'
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialCharacters
{ 1, 3, 5, 7 }
{ 6, 8, a }
{ 2, 5, 8, b, c }
{ a }

You can do it with Linq using the index to calculate the group:
var result = data.Select((n, i) => new { N = n, G = (i > 0 && n > data[i - 1] ? data[i - 1] + 1 : n) - i })
.GroupBy(a => a.G)
.Select(g => g.Select(n => n.N).ToArray())
.ToArray();

This is my simple loop approach using some yields :
static IEnumerable<IList<int>> Split(IList<int> data)
{
if (data.Count == 0) yield break;
List<int> curr = new List<int>();
curr.Add(data[0]);
int last = data[0];
for (int i = 1; i < data.Count; i++)
{
if (data[i] <= last)
{
yield return curr;
curr = new List<int>();
}
curr.Add(data[i]);
last = data[i];
}
yield return curr;
}

I use a dictionary to get 5 different list as below;
static void Main(string[] args)
{
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
Dictionary<int, List<int>> listDict = new Dictionary<int, List<int>>();
int listCnt = 1;
//as initial value get first value from list
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[0]);
for (int i = 1; i < data.Count; i++)
{
if (data[i] > listDict[listCnt].Last())
{
listDict[listCnt].Add(data[i]);
}
else
{
//increase list count and add a new list to dictionary
listCnt++;
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[i]);
}
}
//to use new lists
foreach (var dic in listDict)
{
Console.WriteLine( $"List {dic.Key} : " + string.Join(",", dic.Value.Select(x => x.ToString()).ToArray()));
}
}
Output :
List 1 : 1,2
List 2 : 1,2,3
List 3 : 3
List 4 : 1,2,3,4
List 5 : 1,2,3,4,5,6

Related

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
The does not include the pair, or 2 combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal, I think it includes intersects of pair A, B and pair B, A which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
May be fastest code (but not readable), complexity O(N):
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = array.Length;
for (int i = 0; i < length; i++)
{
subHash.Clear();
int subLength = array[i].Length;
for (int j = 0; j < subLength; j++)
{
int n = array[i][j];
if (!subHash.Contains(n))
{
int counter;
if (dic.TryGetValue(n, out counter))
{
// duplicate
dic[n] = counter + 1;
}
else
{
// first occurance
dic[n] = 1;
}
}
else
{
// exclude duplucate in sub array
subHash.Add(n);
}
}
}
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
You can skip already Intesected sequences, this way will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How it works?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 already has been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 already has been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach, it might be more efficient and also allows to specify the minimum intersection-count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct is using a set this should be pretty efficient. That can help to speed up in case of many duplicates in all sequences.
The inner loop just looks into every sequence if the unique item is contained. Thefore it uses Enumerable.Contains which stops execution as soon as one item was found(so duplicates are no issue).
If the intersection-count reaches the minum intersection count this item is yielded and the next (unique) item is checked.
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure, there are no duplicates in each sequence. Then you join all sequences to a single sequence and look for duplicates as e.g. here.

Split list by element

I have list of 1 and 0 like this:
var list = new List<int>{1,1,1,0,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1}
between two items, can be only one zero.
How to split that list into sublists by 0?
Other words: if I have string like this: string myString = "111011011110111111011101" then it is easy to split it by 0 into few strings.
But how to do it with list? This example shoudl produce these sublists:
1,1,1
1,1
1,1,1,1
1,1,1,1,1,1
1,1,1
1
so is there better way then casting each element into string, joining them and doing what I show what can be done with string ?
You can solve your problem by transforming the input sequence into a sequence of sequences just like the LINQ GroupBy does. However, in your case you are grouping on a change in the input sequence. There is perhaps the possibility of combining existing LINQ operators like GroupBy, Zip and Skip into something that does what you want but I think it is easier (and performs better) to create an iterator block that looks at pairs of items in the input sequence:
static class EnumerableExtensions {
public static IEnumerable<IEnumerable<T>> GroupOnChange<T>(
this IEnumerable<T> source,
Func<T, T, Boolean> changePredicate
) {
if (source == null)
throw new ArgumentNullException("source");
if (changePredicate == null)
throw new ArgumentNullException("changePredicate");
using (var enumerator = source.GetEnumerator()) {
if (!enumerator.MoveNext())
yield break;
var firstValue = enumerator.Current;
var currentGroup = new List<T>();
currentGroup.Add(firstValue);
while (enumerator.MoveNext()) {
var secondValue = enumerator.Current;
var change = changePredicate(firstValue, secondValue);
if (change) {
yield return currentGroup;
currentGroup = new List<T>();
}
currentGroup.Add(secondValue);
firstValue = secondValue;
}
yield return currentGroup;
}
}
}
GroupOnChange will take the items in the input sequence and group them into a sequence of sequences. A new group is started when changePredicate is true.
You can use GroupOnChange to split your input sequence exactly as you want to. You then have to remove the groups that have zero as a value by using Where.
var groups = items
.GroupOnChange((first, second) => first != second)
.Where(group => group.First() != 0);
You can also use this approach if the input are class instances and you want to group by a property of that class. You then have to modify the predicate accordingly to compare the properties. (I know you need this because you asked a now deleted question that was slightly more complicated where the input sequence was not simply numbers but classes with a number property.)
You could write an extension method like this:
public static class Extensions
{
public static IEnumerable<IEnumerable<TSource>> Split<TSource>(this IEnumerable<TSource> source, TSource splitOn, IEqualityComparer<TSource> comparer = null)
{
if (source == null)
throw new ArgumentNullException("source");
return SplitIterator(source, splitOn, comparer);
}
private static IEnumerable<IEnumerable<TSource>> SplitIterator<TSource>(this IEnumerable<TSource> source, TSource splitOn, IEqualityComparer<TSource> comparer)
{
comparer = comparer ?? EqualityComparer<TSource>.Default;
var current = new List<TSource>();
foreach (var item in source)
{
if (comparer.Equals(item, splitOn))
{
if (current.Count > 0)
{
yield return current;
current = new List<TSource>();
}
}
else
{
current.Add(item);
}
}
if (current.Count > 0)
yield return current;
}
}
And use it like this:
var list = new List<int>{1,1,1,0,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1};
var result = list.Split(0);
int c = 0;
var list = new List<int>{1,1,1,0,1,1,0,1,1,1,1,0,1,1,1,1,1,1,0,1,1,1,0,1};
var res = list
// split in groups and set their numbers
// c is a captured variable
.Select(x=>new {Item = x, Subgroup = x==1 ? c : c++})
// remove zeros
.Where(x=>x.Item!=0)
// create groups
.GroupBy(x=>x.Subgroup)
// convert to format List<List<int>>
.Select(gr=>gr.Select(w=>w.Item).ToList())
.ToList();
You can just group by the index of the next zero:
static void Main(string[] args)
{
var list = new List<int> { 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1 };
var result = list.Select((e, i) => new { Element = e, Index = i })
.Where(e => e.Element == 1)
.GroupBy(e => list.IndexOf(0, e.Index));
}
Maybe something simpler:
IEnumerable<IEnumerable<int>> Split(IEnumerable<int> items)
{
List<int> result = new List<int>();
foreach (int item in items)
if (item == 0 && result.Any())
{
yield return result;
result = new List<int>();
}
else
result.Add(item);
if (result.Any())
yield return result;
}
The code below splits by iterating over the list and storing sub-sequences of '1' in a list of lists.
using System;
using System.Collections.Generic;
namespace SplitList
{
class Program
{
static void Main(string[] args)
{
var list = new List<int> { 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1 };
List<List<int>> splitSequences = new List<List<int>>();
List<int> curSequence = new List<int>();
foreach (var item in list)
{
if (item == 1) {
curSequence.Add(item);
} else {
//Only push the current sequence onto the list of sequences if it is not empty,
//which could happen if multiple zeroes are encountered in a row
if (curSequence.Count > 0) {
splitSequences.Add(curSequence);
curSequence = new List<int>();
}
}
}
//push any final list
if (curSequence.Count > 0)
{
splitSequences.Add(curSequence);
}
foreach (var seq in splitSequences) {
String line = String.Join(",", seq);
Console.WriteLine(line);
}
Console.WriteLine("");
Console.WriteLine("Press any key to exit");
var discard = Console.ReadKey();
}
}
}
I did like ASh's solution most. I've used it with a slight change. There is no captured variable in "my" variant:
var list = new List<int> { 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1 };
var res = list
// Step 1: associate every value with an index
.Select((x, i) => (Value: x, Index: i))
// Step 2: remove separator values
.Where(x => x.Value != 0)
// Step 3: adjacent items will have the same result
// subtracting real position from the index marked during the 1st step
.Select((tuple, realIndex) => (tuple.Value, GroupId: tuple.Index - realIndex))
// Step 4: group by groupId
.GroupBy(tuple => tuple.GroupId)
// Step 5: convert to List<List<int>>
.Select(group => group.Select(tuple => tuple.Value).ToList())
.ToList();
More on step 3:
Let's say, we have 5 items with indices:
[ 0 1 2 3 4 ]
{ 1, 1, 0, 1, 1 }
After filtration from the step 2, list of numbers looks next:
[ 0 1 3 4 ]
{ 1, 1, 1, 1 }
What does step 3:
real indices (ri): [ 0 1 2 3 ]
indices from the 1st step (i): [ 0 1 3 4 ]
numbers: { 1, 1, 1, 1 }
i - ri: [ 0, 0, 1, 1 ]
And step 4 just groups by result of the subtraction i - ri, so called GroupID.

Split array with LINQ

Assuming I have a list
var listOfInt = new List<int> {1, 2, 3, 4, 7, 8, 12, 13, 14}
How can I use LINQ to obtain a list of lists as follows:
{{1, 2, 3, 4}, {7, 8}, {12, 13, 14}}
So, i have to take the consecutive values and group them into lists.
You can create extension method (I omitted source check here) which will iterate source and create groups of consecutive items. If next item in source is not consecutive, then current group is yielded:
public static IEnumerable<List<int>> ToConsecutiveGroups(
this IEnumerable<int> source)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
int current = iterator.Current;
List<int> group = new List<int> { current };
while (iterator.MoveNext())
{
int next = iterator.Current;
if (next < current || current + 1 < next)
{
yield return group;
group = new List<int>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13, 14 };
var groups = listOfInt.ToConsecutiveGroups();
Result:
[
[ 1, 2, 3, 4 ],
[ 7, 8 ],
[ 12, 13, 14 ]
]
UPDATE: Here is generic version of this extension method, which accepts predicate for verifying if two values should be considered consecutive:
public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
this IEnumerable<T> source, Func<T,T, bool> isConsequtive)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
T current = iterator.Current;
List<T> group = new List<T> { current };
while (iterator.MoveNext())
{
T next = iterator.Current;
if (!isConsequtive(current, next))
{
yield return group;
group = new List<T>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var result = listOfInt.ToConsecutiveGroups((x,y) => (x == y) || (x == y - 1));
This works for both sorted and unsorted lists:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13 };
int index = 0;
var result = listOfInt.Zip(listOfInt
.Concat(listOfInt.Reverse<int>().Take(1))
.Skip(1),
(v1, v2) =>
new
{
V = v1,
G = (v2 - v1) != 1 ? index++ : index
})
.GroupBy(x => x.G, x => x.V, (k, l) => l.ToList())
.ToList();
External index is building an index of consecutive groups that have value difference of 1. Then you can simply GroupBy with respect to this index.
To clarify solution, here is how this collection looks without grouping (GroupBy commented):
Assuming your input is in order, the following will work:
var grouped = input.Select((n, i) => new { n, d = n - i }).GroupBy(p => p.d, p => p.n);
It won't work if your input is e.g. { 1, 2, 3, 999, 5, 6, 7 }.
You'd get { { 1, 2, 3, 5, 6, 7 }, { 999 } }.
This works:
var results =
listOfInt
.Skip(1)
.Aggregate(
new List<List<int>>(new [] { listOfInt.Take(1).ToList() }),
(a, x) =>
{
if (a.Last().Last() + 1 == x)
{
a.Last().Add(x);
}
else
{
a.Add(new List<int>(new [] { x }));
}
return a;
});
I get this result:

How to select the last set of concatenated sequence from multiple similar sequence

I have an int array, it's a concatenated array from multiple similar arrays all starting at 1.
1, 2, 3, 4
1, 2
1, 2, 3
1, 2
int[] list = { 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 };
What I am trying to achieve is to get the "last set" of the result which is {1, 2}.
Attempt:
int[] list = { 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 };
List<int> lastSet = new List<int>();
var totalSets = list.Count(x => x == 1);
int encounter = 0;
foreach (var i in list)
{
if (i == 1)
encounter += 1;
if (encounter == totalSets)
lastSet.Add(i);
}
lastSet.ToList().ForEach(x => Console.WriteLine(x));
Is there a better way to achieve this using LINQ, perhaps SkipWhile, GroupBy, Aggregate?
If you can either make your list be an actual List<int> or if it doesn't bother you to create a copy of the list via .ToList(), you can do this:
var list = new[]{ 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 }.ToList();
var lastSet = list.Skip(list.LastIndexOf(1)).ToList();
Otherwise, Aggregate can work, but it's a little ugly:
var lastSet = list.Aggregate(new List<int>{1}, (seed, i) => {
if(i == 1) {seed.Clear(); }
seed.Add(i);
return seed;
})
Update
As dtb points out, you can use Array.LastIndexOf rather than creating a List:
var list = new[]{ 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 };
var lastSet = list.Skip(Array.LastIndexOf(list, 1)).ToList();
Works for any IEnumerable (but is slower than direct List versions)
var sub = list.Reverse<int>()
.TakeWhile(i => i != 1)
.Concat(new[]{1})
.Reverse<int>();
Run a ToArray() on the result if you like.
Using the GroupAdjacent Extension Method below, you can split the list into the sequences beginning with 1 and then take the last sequence:
var result = list.GroupAdjacent((g, x) => x != 1)
.Last()
.ToList();
with
public static IEnumerable<IEnumerable<T>> GroupAdjacent<T>(
this IEnumerable<T> source, Func<IEnumerable<T>, T, bool> adjacent)
{
var g = new List<T>();
foreach (var x in source)
{
if (g.Count != 0 && !adjacent(g, x))
{
yield return g;
g = new List<T>();
}
g.Add(x);
}
yield return g;
}
LINQ is overrated:
int[] list = { 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 };
int pos = Array.LastIndexOf(list, 1);
int[] result = new int[list.Length - pos];
Array.Copy(list, pos, result, 0, result.Length);
// result == { 1, 2 }
Now with 100% more readable:
int[] list = { 1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2 };
int[] result = list.Slice(list.LastIndexOf(1));
// result == { 1, 2 }
where
static int LastIndexOf<T>(this T[] array, T value)
{
return Array.LastIndexOf<T>(array, value);
}
static T[] Slice<T>(this T[] array, int offset)
{
return Slice(array, offset, array.Length - offset);
}
static T[] Slice<T>(this T[] array, int offset, int length)
{
T[] result = new T[length];
Array.Copy(array, offset, result, 0, length);
return result;
}

Merging 2 collections

How to combine 2 collections in such a way that the resultant collection contains the values alternatively from both the collections
Example :-
Col A= [1,2,3,4]
Col B= [5,6,7,8]
Result Col C=[1,5,2,6,3,7,4,8]
There are lots of ways you could do this, depending on the types of the input and the required type of the output. There's no library method that I'm aware of, however; you'd have to "roll your own".
One possibility would be a linq-style iterator method, assuming that all we know about the input collections is that they implement IEnumerable<T>:
static IEnumerable<T> Interleave(this IEnumerable<T> a, IEnumerable<T> b)
{
bool bEmpty = false;
using (var enumeratorB b.GetEnumerator())
{
foreach (var elementA in a)
{
yield return elementA;
if (!bEmpty && bEnumerator.MoveNext())
yield return bEnumerator.Current;
else
bEmpty = true;
}
if (!bEmpty)
while (bEnumerator.MoveNext())
yield return bEnumerator.Current;
}
}
int[] a = { 1, 2, 3, 4 };
int[] b = { 5, 6, 7, 8 };
int[] result = a.SelectMany((n, index) => new[] { n, b[index] }).ToArray();
If collection a and b haven't the same length, you need to be careful to use b[index], maybe you need : index >= b.Length ? 0 : b[index]
If the collections do not necessarily have the same length, consider an extension method:
public static IEnumerable<T> AlternateMerge<T>(this IEnumerable<T> source,
IEnumerable<T> other)
{
using(var sourceEnumerator = source.GetEnumerator())
using(var otherEnumerator = other.GetEnumerator())
{
bool haveItemsSource = true;
bool haveItemsOther = true;
while (haveItemsSource || haveItemsOther)
{
haveItemsSource = sourceEnumerator.MoveNext();
haveItemsOther = otherEnumerator.MoveNext();
if (haveItemsSource)
yield return sourceEnumerator.Current;
if (haveItemsOther)
yield return otherEnumerator.Current;
}
}
}
And use :
List<int> A = new List<int> { 1, 2, 3 };
List<int> B = new List<int> { 5, 6, 7, 8 };
var mergedList = A.AlternateMerge(B).ToList();
Assuming that both collections are of equal length:
Debug.Assert(a.Count == b.Count);
for (int i = 0; i < a.Count; i++)
{
c.Add(a[i]);
c.Add(b[i]);
}
Debug.Assert(c.Count == (a.Count + b.Count));
Use Linq's Union extension such as:
var colA = new List<int> { 1, 2, 3, 4 };
var colB = new List<int> { 1, 5, 2, 6, 3, 7, 4, 8};
var result = colA.Union( colB); // 1, 2, 3, 4, 5, 6, 7, 8

Categories