Split array with LINQ - c#

Assuming I have a list
var listOfInt = new List<int> {1, 2, 3, 4, 7, 8, 12, 13, 14};
How can I use LINQ to obtain a list of lists as follows:
{{1, 2, 3, 4}, {7, 8}, {12, 13, 14}}
So, I have to take the consecutive values and group them into lists.

You can create an extension method (I omitted the source check here) that iterates the source and builds groups of consecutive items. If the next item in the source is not consecutive, the current group is yielded:
public static IEnumerable<List<int>> ToConsecutiveGroups(
this IEnumerable<int> source)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
int current = iterator.Current;
List<int> group = new List<int> { current };
while (iterator.MoveNext())
{
int next = iterator.Current;
if (next < current || current + 1 < next)
{
yield return group;
group = new List<int>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13, 14 };
var groups = listOfInt.ToConsecutiveGroups();
Result:
[
[ 1, 2, 3, 4 ],
[ 7, 8 ],
[ 12, 13, 14 ]
]
UPDATE: Here is a generic version of this extension method, which accepts a predicate for checking whether two values should be considered consecutive:
public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
this IEnumerable<T> source, Func<T,T, bool> isConsecutive)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
else
{
T current = iterator.Current;
List<T> group = new List<T> { current };
while (iterator.MoveNext())
{
T next = iterator.Current;
if (!isConsecutive(current, next))
{
yield return group;
group = new List<T>();
}
current = next;
group.Add(current);
}
if (group.Any())
yield return group;
}
}
}
Usage is simple:
var result = listOfInt.ToConsecutiveGroups((x,y) => (x == y) || (x == y - 1));
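Because the predicate decides what "consecutive" means, the same method can group other types too. Here is a minimal sketch, assuming dates should be grouped by consecutive calendar days. The class name ConsecutiveGroupsDemo and the helper GroupByConsecutiveDays are made up for illustration, and the method is restated in a compact foreach form only so the snippet compiles on its own:

```csharp
using System;
using System.Collections.Generic;

public static class ConsecutiveGroupsDemo
{
    // Compact restatement of the generic method above (same behaviour,
    // written with foreach) so that this sketch is self-contained.
    public static IEnumerable<List<T>> ToConsecutiveGroups<T>(
        this IEnumerable<T> source, Func<T, T, bool> isConsecutive)
    {
        List<T> group = null;
        T previous = default(T);
        foreach (T item in source)
        {
            // Close the current group when the predicate rejects the pair.
            if (group != null && !isConsecutive(previous, item))
            {
                yield return group;
                group = null;
            }
            if (group == null) group = new List<T>();
            group.Add(item);
            previous = item;
        }
        if (group != null) yield return group;
    }

    // Hypothetical helper: two dates are "consecutive" when the second
    // falls on the calendar day right after the first.
    public static List<List<DateTime>> GroupByConsecutiveDays(IEnumerable<DateTime> days)
    {
        return new List<List<DateTime>>(
            days.ToConsecutiveGroups((x, y) => y.Date == x.Date.AddDays(1)));
    }
}
```

For 1 Jan, 2 Jan and 5 Jan of the same year this should give two groups: the first two days together, and 5 Jan on its own.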

This works for both sorted and unsorted lists:
var listOfInt = new List<int> { 1, 2, 3, 4, 7, 8, 12, 13 };
int index = 0;
var result = listOfInt.Zip(listOfInt
.Concat(listOfInt.Reverse<int>().Take(1))
.Skip(1),
(v1, v2) =>
new
{
V = v1,
G = (v2 - v1) != 1 ? index++ : index
})
.GroupBy(x => x.G, x => x.V, (k, l) => l.ToList())
.ToList();
The external index numbers the groups: it is incremented each time the difference between a value and its successor is not 1. You can then simply GroupBy with respect to this index.
To clarify the solution, here is how the collection looks without grouping (GroupBy commented out), written as V/G pairs: 1/0, 2/0, 3/0, 4/0, 7/1, 8/1, 12/2, 13/2.

Assuming your input is in order, the following will work:
var grouped = input.Select((n, i) => new { n, d = n - i }).GroupBy(p => p.d, p => p.n);
It won't work if your input is e.g. { 1, 2, 3, 999, 5, 6, 7 }.
You'd get { { 1, 2, 3, 5, 6, 7 }, { 999 } }.
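To see why the n - i key works, here is a small sketch, assuming the input is strictly increasing (sorted, no duplicates). Inside a run of consecutive values the value and the index both grow by 1, so their difference stays constant. At every gap the difference jumps. The class and method names are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexDifferenceDemo
{
    // Assumption: 'sorted' is strictly increasing. Then n - i is constant
    // within a run of consecutive values and larger after every gap.
    public static List<List<int>> GroupConsecutive(IEnumerable<int> sorted)
    {
        return sorted
            .Select((n, i) => new { Value = n, Key = n - i })
            .GroupBy(p => p.Key, p => p.Value)
            .Select(g => g.ToList())
            .ToList();
    }
}
```

For { 1, 2, 3, 4, 7, 8, 12, 13, 14 } the keys are 1, 1, 1, 1, 3, 3, 7, 7, 7, which gives the three expected groups. For { 1, 2, 3, 999, 5, 6, 7 } the keys are 1, 1, 1, 996, 1, 1, 1, which is exactly why 5, 6, 7 end up in the first group.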

This works:
var results =
listOfInt
.Skip(1)
.Aggregate(
new List<List<int>>(new [] { listOfInt.Take(1).ToList() }),
(a, x) =>
{
if (a.Last().Last() + 1 == x)
{
a.Last().Add(x);
}
else
{
a.Add(new List<int>(new [] { x }));
}
return a;
});
I get this result (with listOfInt from the question):
{ { 1, 2, 3, 4 }, { 7, 8 }, { 12, 13, 14 } }

Related

Split a list into multiple lists at increasing sequence broken

I have a List of int and I want to split it into multiple lists whenever a lower or equal number is found. The numbers are not in sorted order.
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
I want the result to be as following lists:
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Currently, I'm using the following LINQ to do this, but it isn't working:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
var res = data.Where((p, i) =>
{
int count = 0;
resultLists.Add(new List<int>());
if (p < data[(i + 1) >= data.Count ? i - 1 : i + 1])
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
}
return true;
}).ToList();
I'd just go for something simple:
public static IEnumerable<List<int>> SplitWhenNotIncreasing(List<int> numbers)
{
for (int i = 1, start = 0; i <= numbers.Count; ++i)
{
if (i != numbers.Count && numbers[i] > numbers[i - 1])
continue;
yield return numbers.GetRange(start, i - start);
start = i;
}
}
Which you'd use like so:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in SplitWhenNotIncreasing(data))
Console.WriteLine(string.Join(", ", subset));
If you really did need to work with IEnumerable<T>, then the simplest way I can think of is like this:
public sealed class IncreasingSubsetFinder<T> where T: IComparable<T>
{
public static IEnumerable<IEnumerable<T>> Find(IEnumerable<T> numbers)
{
return new IncreasingSubsetFinder<T>().find(numbers.GetEnumerator());
}
IEnumerable<IEnumerable<T>> find(IEnumerator<T> iter)
{
if (!iter.MoveNext())
yield break;
while (!done)
yield return increasingSubset(iter);
}
IEnumerable<T> increasingSubset(IEnumerator<T> iter)
{
while (!done)
{
T prev = iter.Current;
yield return prev;
if ((done = !iter.MoveNext()) || iter.Current.CompareTo(prev) <= 0)
yield break;
}
}
bool done;
}
Which you would call like this:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
foreach (var subset in IncreasingSubsetFinder<int>.Find(data))
Console.WriteLine(string.Join(", ", subset));
This is not a typical LINQ operation, so as usual in such cases (when one insists on using LINQ) I would suggest using the Aggregate method:
var result = data.Aggregate(new List<List<int>>(), (r, n) =>
{
if (r.Count == 0 || n <= r.Last().Last()) r.Add(new List<int>());
r.Last().Add(n);
return r;
});
You can use the index to get the previous item and calculate the group id out of comparing the values. Then group on the group ids and get the values out:
List<int> data = new List<int> { 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
int groupId = 0;
var groups = data.Select
( (item, index)
=> new
{ Item = item
, Group = index > 0 && item <= data[index - 1] ? ++groupId : groupId
}
);
List<List<int>> list = groups.GroupBy(g => g.Group)
.Select(x => x.Select(y => y.Item).ToList())
.ToList();
I really like Matthew Watson's solution. If however you do not want to rely on List<T>, here is my simple generic approach enumerating the enumerable once at most and still retaining the capability for lazy evaluation.
public static IEnumerable<IEnumerable<T>> AscendingSubsets<T>(this IEnumerable<T> superset) where T :IComparable<T>
{
var supersetEnumerator = superset.GetEnumerator();
if (!supersetEnumerator.MoveNext())
{
yield break;
}
T oldItem = supersetEnumerator.Current;
List<T> subset = new List<T>() { oldItem };
while (supersetEnumerator.MoveNext())
{
T currentItem = supersetEnumerator.Current;
if (currentItem.CompareTo(oldItem) > 0)
{
subset.Add(currentItem);
}
else
{
yield return subset;
subset = new List<T>() { currentItem };
}
oldItem = supersetEnumerator.Current;
}
yield return subset;
}
Edit: Simplified the solution further to only use one enumerator.
I have modified your code, and it now works fine:
List<int> data = new List<int> { 1, 2, 1, 2, 3,3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
List<List<int>> resultLists = new List<List<int>>();
int last = 0;
int count = 0;
var res = data.Where((p, i) =>
{
if (i > 0)
{
if (p > last && p!=last)
{
resultLists[count].Add(p);
}
else
{
count++;
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
}
else
{
resultLists.Add(new List<int>());
resultLists[count].Add(p);
}
last = p;
return true;
}).ToList();
For things like this, I'm generally not a fan of solutions that use GroupBy or other methods that materialize the results. The reason is that you never know how long the input sequence will be, and materializations of these sub-sequences can be very costly.
I prefer to stream the results as they are pulled. This allows implementations of IEnumerable<T> that stream results to continue streaming through your transformation of that stream.
Note, this solution won't work if you break out of iterating through the sub-sequence and want to continue to the next sequence; if this is an issue, then one of the solutions that materialize the sub-sequences would probably be better.
However, for forward-only iterations of the entire sequence (which is the most typical use case), this will work just fine.
First, let's set up some helpers for our test classes:
private static IEnumerable<T> CreateEnumerable<T>(IEnumerable<T> enumerable)
{
// Validate parameters.
if (enumerable == null) throw new ArgumentNullException("enumerable");
// Cycle through and yield.
foreach (T t in enumerable)
yield return t;
}
private static void EnumerateAndPrintResults<T>(IEnumerable<T> data,
[CallerMemberName] string name = "") where T : IComparable<T>
{
// Write the name.
Debug.WriteLine("Case: " + name);
// Cycle through the chunks.
foreach (IEnumerable<T> chunk in data.
ChunkWhenNextSequenceElementIsNotGreater())
{
// Print opening brackets.
Debug.Write("{ ");
// Is this the first iteration?
bool firstIteration = true;
// Print the items.
foreach (T t in chunk)
{
// If not the first iteration, write a comma.
if (!firstIteration)
{
// Write the comma.
Debug.Write(", ");
}
// Write the item.
Debug.Write(t);
// Flip the flag.
firstIteration = false;
}
// Write the closing bracket.
Debug.WriteLine(" }");
}
}
CreateEnumerable is used for creating a streaming implementation, and EnumerateAndPrintResults will take the sequence, call ChunkWhenNextSequenceElementIsNotGreater (this is coming up and does the work) and output the results.
Here's the implementation. Note, I've chosen to implement them as extension methods on IEnumerable<T>; this is the first benefit, as it doesn't require a materialized sequence (technically, none of the other solutions do either, but it's better to explicitly state it like this).
First, the entry points:
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source)
where T : IComparable<T>
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
// Call the overload.
return source.
ChunkWhenNextSequenceElementIsNotGreater(
Comparer<T>.Default.Compare);
}
public static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreater<T>(
this IEnumerable<T> source,
Comparison<T> comparer)
{
// Validate parameters.
if (source == null) throw new ArgumentNullException("source");
if (comparer == null) throw new ArgumentNullException("comparer");
// Call the implementation.
return source.
ChunkWhenNextSequenceElementIsNotGreaterImplementation(
comparer);
}
Note that this works on anything that implements IComparable<T> or where you provide a Comparison<T> delegate; this allows for any type and any kind of rules you want for performing the comparison.
Here's the implementation:
private static IEnumerable<IEnumerable<T>>
ChunkWhenNextSequenceElementIsNotGreaterImplementation<T>(
this IEnumerable<T> source, Comparison<T> comparer)
{
// Validate parameters.
Debug.Assert(source != null);
Debug.Assert(comparer != null);
// Get the enumerator.
using (IEnumerator<T> enumerator = source.GetEnumerator())
{
// Move to the first element. If one can't, then get out.
if (!enumerator.MoveNext()) yield break;
// While true.
while (true)
{
// The new enumerator.
var chunkEnumerator = new
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>(
enumerator, comparer);
// Yield.
yield return chunkEnumerator;
// If the last move next returned false, then get out.
if (!chunkEnumerator.LastMoveNext) yield break;
}
}
}
Of note: this uses another class ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> to handle enumerating the sub-sequences. This class will iterate each of the items from the IEnumerator<T> that is obtained from the original IEnumerable<T>.GetEnumerator() call, but store the results of the last call to IEnumerator<T>.MoveNext().
This sub-sequence generator is stored, and the value of the last call to MoveNext is checked to see if the end of the sequence has or hasn't been hit. If it has, then it simply breaks, otherwise, it moves to the next chunk.
Here's the implementation of ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T>:
internal class
ChunkWhenNextSequenceElementIsNotGreaterEnumerable<T> :
IEnumerable<T>
{
#region Constructor.
internal ChunkWhenNextSequenceElementIsNotGreaterEnumerable(
IEnumerator<T> enumerator, Comparison<T> comparer)
{
// Validate parameters.
if (enumerator == null)
throw new ArgumentNullException("enumerator");
if (comparer == null)
throw new ArgumentNullException("comparer");
// Assign values.
_enumerator = enumerator;
_comparer = comparer;
}
#endregion
#region Instance state.
private readonly IEnumerator<T> _enumerator;
private readonly Comparison<T> _comparer;
internal bool LastMoveNext { get; private set; }
#endregion
#region IEnumerable implementation.
public IEnumerator<T> GetEnumerator()
{
// The assumption is that a call to MoveNext
// that returned true has already
// occurred. Store as the previous value.
T previous = _enumerator.Current;
// Yield it.
yield return previous;
// While can move to the next item, and the previous
// item is strictly less than the current item.
while ((LastMoveNext = _enumerator.MoveNext()) &&
_comparer(previous, _enumerator.Current) < 0)
{
// Yield.
yield return _enumerator.Current;
// Store the previous.
previous = _enumerator.Current;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
#endregion
}
Here's the test for the original condition in the question, along with the output:
[TestMethod]
public void TestStackOverflowCondition()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestStackOverflowCondition
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's the same input, but streamed as an enumerable:
[TestMethod]
public void TestStackOverflowConditionEnumerable()
{
var data = new List<int> {
1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6
};
EnumerateAndPrintResults(CreateEnumerable(data));
}
Output:
Case: TestStackOverflowConditionEnumerable
{ 1, 2 }
{ 1, 2, 3 }
{ 3 }
{ 1, 2, 3, 4 }
{ 1, 2, 3, 4, 5, 6 }
Here's a test with non-sequential elements:
[TestMethod]
public void TestNonSequentialElements()
{
var data = new List<int> {
1, 3, 5, 7, 6, 8, 10, 2, 5, 8, 11, 11, 13
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialElements
{ 1, 3, 5, 7 }
{ 6, 8, 10 }
{ 2, 5, 8, 11 }
{ 11, 13 }
Finally, here's a test with characters instead of numbers:
[TestMethod]
public void TestNonSequentialCharacters()
{
var data = new List<char> {
'1', '3', '5', '7', '6', '8', 'a', '2', '5', '8', 'b', 'c', 'a'
};
EnumerateAndPrintResults(data);
}
Output:
Case: TestNonSequentialCharacters
{ 1, 3, 5, 7 }
{ 6, 8, a }
{ 2, 5, 8, b, c }
{ a }
You can do it with Linq using the index to calculate the group:
var result = data.Select((n, i) => new { N = n, G = (i > 0 && n > data[i - 1] ? data[i - 1] + 1 : n) - i })
.GroupBy(a => a.G)
.Select(g => g.Select(n => n.N).ToArray())
.ToArray();
This is my simple loop approach using some yields :
static IEnumerable<IList<int>> Split(IList<int> data)
{
if (data.Count == 0) yield break;
List<int> curr = new List<int>();
curr.Add(data[0]);
int last = data[0];
for (int i = 1; i < data.Count; i++)
{
if (data[i] <= last)
{
yield return curr;
curr = new List<int>();
}
curr.Add(data[i]);
last = data[i];
}
yield return curr;
}
I use a dictionary to get the 5 different lists, as below:
static void Main(string[] args)
{
List<int> data = new List<int> { 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6 };
Dictionary<int, List<int>> listDict = new Dictionary<int, List<int>>();
int listCnt = 1;
//as initial value get first value from list
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[0]);
for (int i = 1; i < data.Count; i++)
{
if (data[i] > listDict[listCnt].Last())
{
listDict[listCnt].Add(data[i]);
}
else
{
//increase list count and add a new list to dictionary
listCnt++;
listDict.Add(listCnt, new List<int>());
listDict[listCnt].Add(data[i]);
}
}
//to use new lists
foreach (var dic in listDict)
{
Console.WriteLine( $"List {dic.Key} : " + string.Join(",", dic.Value.Select(x => x.ToString()).ToArray()));
}
}
Output :
List 1 : 1,2
List 2 : 1,2,3
List 3 : 3
List 4 : 1,2,3,4
List 5 : 1,2,3,4,5,6

The union of the intersects of the 2 set combinations of a sequence of sequences

How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
This does not include the pair, or 2-combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal. I think it includes the intersects of both pair A, B and pair B, A, which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
May be the fastest code (but not readable), complexity O(N):
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = sequences.Length;
for (int i = 0; i < length; i++)
{
    subHash.Clear();
    int subLength = sequences[i].Length;
    for (int j = 0; j < subLength; j++)
    {
        int n = sequences[i][j];
        // HashSet<T>.Add returns false for a duplicate within this sub-array,
        // so each value is counted at most once per sequence
        if (subHash.Add(n))
        {
            int counter;
            if (dic.TryGetValue(n, out counter))
            {
                // already seen in an earlier sequence
                dic[n] = counter + 1;
            }
            else
            {
                // first occurrence
                dic[n] = 1;
            }
        }
    }
}
// values that were counted in two or more sequences
var result = dic.Where(p => p.Value > 1).Select(p => p.Key);
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
You can skip already-intersected sequences; this way it will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How does it work?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 has already been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 have already been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach, it might be more efficient and also allows to specify the minimum intersection-count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct uses a set, this should be pretty efficient. That can help speed things up when there are many duplicates across the sequences.
The inner loop just checks every sequence for the unique item. Therefore it uses Enumerable.Contains, which stops as soon as the item is found (so duplicates are no issue).
If the intersection count reaches the minimum intersection count, the item is yielded and the next (unique) item is checked.
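As a usage sketch of the minimum-count parameter, the snippet below runs the method on the sample input from the question. The method body is restated from the answer above only so the snippet compiles on its own. The wrapper class and the Sample helper are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class UnionOfIntersectsDemo
{
    // Restated from the answer above so that this sketch is self-contained.
    public static IEnumerable<T> UnionOfIntersects<T>(
        this IEnumerable<IEnumerable<T>> source,
        int minIntersectionCount,
        IEqualityComparer<T> comparer = null)
    {
        if (comparer == null) comparer = EqualityComparer<T>.Default;
        foreach (T item in source.SelectMany(s => s).Distinct(comparer))
        {
            int containedInHowManySequences = 0;
            foreach (IEnumerable<T> seq in source)
            {
                if (seq.Contains(item, comparer)) containedInHowManySequences++;
                if (containedInHowManySequences == minIntersectionCount)
                {
                    yield return item;
                    break;
                }
            }
        }
    }

    // Hypothetical helper: applies the method to the question's sample input.
    public static List<int> Sample(int minIntersectionCount)
    {
        var sequences = new List<List<int>>
        {
            new List<int> { 1, 1, 2, 3, 4, 5, 7 },
            new List<int> { 5, 6, 7 },
            new List<int> { 2, 6, 7, 9 },
            new List<int> { 4 }
        };
        return sequences.UnionOfIntersects(minIntersectionCount).ToList();
    }
}
```

With a minimum count of 2 this should yield 2, 4, 5, 7, 6 (in order of first appearance). With a minimum count of 3 only 7 qualifies, since it is the only value present in three of the sequences.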
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure there are no duplicates within each sequence. Then you join all sequences into a single sequence and look for duplicates.

LINQ: Separating single list to multiple lists

I have a single array with these entries:
{1, 1, 2, 2, 3,3,3, 4}
and I want to transform them to (3 lists in this case):
{1,2,3,4}
{1,2,3}
{3}
Is there any way to do this with LINQ or SQL? I guess there's a mathematical term for this operation, which I don't know unfortunately...
Or do I have to do it with loops?
=======
EDIT: I can't really describe the logic, so here are more examples. It more or less loops over the array multiple times and takes every number once (but each number only once per round) until there are no numbers left.
{1, 1, 2, 2, 3,3,3, 4, 5}
would be
{1,2,3,4,5}
{1,2,3}
{3}
or
{1, 1, 2, 2,2, 3,3,3, 4, 5}
would be
{1,2,3,4,5}
{1,2,3}
{2,3}
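One way to read the rounds described above: the r-th list contains every value that occurs at least r times in the input. Here is a minimal sketch under that reading. The class and method names are made up for illustration, and values are kept in order of first appearance:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RoundSplitDemo
{
    // Round r (1-based) contains every value that occurs at least r times.
    public static List<List<int>> SplitIntoRounds(IEnumerable<int> items)
    {
        // Count how often each value occurs; GroupBy keeps first-seen order.
        var counts = items
            .GroupBy(i => i)
            .Select(g => new { Value = g.Key, Count = g.Count() })
            .ToList();

        // The most frequent value determines how many rounds are needed.
        int rounds = counts.Count == 0 ? 0 : counts.Max(c => c.Count);

        return Enumerable.Range(1, rounds)
            .Select(r => counts.Where(c => c.Count >= r)
                               .Select(c => c.Value)
                               .ToList())
            .ToList();
    }
}
```

For { 1, 1, 2, 2, 3, 3, 3, 4 } the counts are 2, 2, 3 and 1, so this should produce {1,2,3,4}, {1,2,3} and {3}.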
private IEnumerable<List<int>> FooSplit(IEnumerable<int> items)
{
List<int> source = new List<int>(items);
while (source.Any())
{
var result = source.Distinct().ToList();
yield return result;
result.ForEach(item => source.Remove(item));
}
}
Usage:
int[] items = { 1, 1, 2, 2, 3, 3, 3, 4 };
foreach(var subList in FooSplit(items))
{
// here you have your three sublists
}
Here is another solution, which is less readable but it will have better performance:
private IEnumerable<IEnumerable<int>> FooSplit(IEnumerable<int> items)
{
var groups = items.GroupBy(i => i).Select(g => g.ToList()).ToList();
while (groups.Count > 0)
{
// Materialize now so the removals run exactly once per round.
yield return groups.Select( g =>
{ var i = g[0]; g.RemoveAt(g.Count - 1); return i; }).ToList();
groups.RemoveAll(g => g.Count == 0);
}
}
this does the job:
static void Main(string[] args)
{
int[] numbers = {1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5};
List<int> nums = new List<int>(numbers.Length);
nums.AddRange(numbers);
while (nums.Count > 0)
{
int[] n = nums.Distinct().ToArray();
for (int i = 0; i < n.Count(); i++)
{
Console.Write("{0}\t", n[i]);
nums.Remove(n[i]);
}
Console.WriteLine();
}
Console.Read();
}
Here's an alternative console app:
class Program
{
class Freq
{
public int Num { get; set; }
public int Count { get; set; }
}
static void Main(string[] args)
{
var nums = new[] { 1, 1, 2, 2, 3, 3, 3, 4 };
var groups = nums.GroupBy(i => i).Select(g => new Freq { Num = g.Key, Count = g.Count() }).ToList();
while (groups.Any(g => g.Count > 0))
{
var list = groups.Where(g => g.Count > 0).Select(g => g.Num).ToList();
list.ForEach(li => groups.First(g => g.Num == li).Count--);
Console.WriteLine(String.Join(",", list));
}
Console.ReadKey();
}
}

Merging 2 collections

How can I combine 2 collections in such a way that the resulting collection contains values alternately from both collections?
Example :-
Col A= [1,2,3,4]
Col B= [5,6,7,8]
Result Col C=[1,5,2,6,3,7,4,8]
There are lots of ways you could do this, depending on the types of the input and the required type of the output. There's no library method that I'm aware of, however; you'd have to "roll your own".
One possibility would be a linq-style iterator method, assuming that all we know about the input collections is that they implement IEnumerable<T>:
static IEnumerable<T> Interleave<T>(this IEnumerable<T> a, IEnumerable<T> b)
{
    bool bEmpty = false;
    using (var bEnumerator = b.GetEnumerator())
    {
        foreach (var elementA in a)
        {
            yield return elementA;
            if (!bEmpty && bEnumerator.MoveNext())
                yield return bEnumerator.Current;
            else
                bEmpty = true;
        }
        // a is exhausted; emit whatever is left of b
        if (!bEmpty)
            while (bEnumerator.MoveNext())
                yield return bEnumerator.Current;
    }
}
int[] a = { 1, 2, 3, 4 };
int[] b = { 5, 6, 7, 8 };
int[] result = a.SelectMany((n, index) => new[] { n, b[index] }).ToArray();
If collections a and b don't have the same length, you need to be careful with b[index]; you may need something like: index >= b.Length ? 0 : b[index]
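If dropping the surplus elements is acceptable, a variant without indexing is to pair the elements with Enumerable.Zip and flatten the pairs. This is only a sketch. The class and method names are made up for illustration, and note that Zip stops at the end of the shorter input:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipInterleaveDemo
{
    // Pairs elements positionally and flattens each pair. Zip stops at the
    // end of the shorter input, so surplus elements of the longer one are dropped.
    public static List<int> Interleave(IEnumerable<int> a, IEnumerable<int> b)
    {
        return a.Zip(b, (x, y) => new[] { x, y })
                .SelectMany(pair => pair)
                .ToList();
    }
}
```

For { 1, 2, 3, 4 } and { 5, 6, 7, 8 } this gives 1, 5, 2, 6, 3, 7, 4, 8. No index check is needed because b is never accessed by position.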
If the collections do not necessarily have the same length, consider an extension method:
public static IEnumerable<T> AlternateMerge<T>(this IEnumerable<T> source,
IEnumerable<T> other)
{
using(var sourceEnumerator = source.GetEnumerator())
using(var otherEnumerator = other.GetEnumerator())
{
bool haveItemsSource = true;
bool haveItemsOther = true;
while (haveItemsSource || haveItemsOther)
{
haveItemsSource = sourceEnumerator.MoveNext();
haveItemsOther = otherEnumerator.MoveNext();
if (haveItemsSource)
yield return sourceEnumerator.Current;
if (haveItemsOther)
yield return otherEnumerator.Current;
}
}
}
Usage:
List<int> A = new List<int> { 1, 2, 3 };
List<int> B = new List<int> { 5, 6, 7, 8 };
var mergedList = A.AlternateMerge(B).ToList();
Assuming that both collections are of equal length:
Debug.Assert(a.Count == b.Count);
for (int i = 0; i < a.Count; i++)
{
c.Add(a[i]);
c.Add(b[i]);
}
Debug.Assert(c.Count == (a.Count + b.Count));
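Use LINQ's Union extension, for example (note that Union yields the distinct values of both collections in first-seen order rather than strictly alternating them):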
var colA = new List<int> { 1, 2, 3, 4 };
var colB = new List<int> { 1, 5, 2, 6, 3, 7, 4, 8};
var result = colA.Union( colB); // 1, 2, 3, 4, 5, 6, 7, 8

Selecting unique elements from a List in C#

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }
var nums = new int{ 0...4,4,5};
var distinct = nums.Distinct();
Make sure you're using LINQ and .NET Framework 3.5.
With lambda..
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);
C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}
Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type, you can create a custom comparer for Distinct(), such as Distinct(new OrderComparer()), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}
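To show how such a comparer plugs into Distinct, here is a minimal sketch. The Order type with its Customer property, the null-safe OrderIdComparer variant and the DistinctById helper are all assumptions made for illustration. Only the OrderID comparison comes from the answer above:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Order type; only OrderID takes part in the comparison.
public class Order
{
    public int OrderID { get; set; }
    public string Customer { get; set; } = "";
}

// Variant of the comparer above that also tolerates null references.
public class OrderIdComparer : IEqualityComparer<Order>
{
    public bool Equals(Order o1, Order o2)
    {
        if (ReferenceEquals(o1, o2)) return true;
        if (o1 == null || o2 == null) return false;
        return o1.OrderID == o2.OrderID;
    }

    public int GetHashCode(Order obj)
    {
        return obj == null ? 0 : obj.OrderID.GetHashCode();
    }
}

public static class DistinctOrdersDemo
{
    // Distinct keeps the first order seen for each OrderID.
    public static List<Order> DistinctById(IEnumerable<Order> orders)
    {
        return orders.Distinct(new OrderIdComparer()).ToList();
    }
}
```

GetHashCode has to agree with Equals (equal orders must return the same hash code), because Distinct uses a hash-based set internally.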
If LINQ isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary<int, int>, where the first int is the number and the second int is the number of occurrences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary, selecting only those elements where the number of occurrences is 1.
I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}
There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5
Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]
In .NET 2.0, I'm pretty sure about this solution:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}
