Use Linq to break a list by special values? - c#

I'm trying to use Linq to convert IEnumerable<int> to IEnumerable<List<int>> - the input stream will be separated by special value 0.
IEnumerable<List<int>> Parse(IEnumerable<int> l)
{
l.Select(x => {
.....; //?
return new List<int>();
});
}
var l = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
Parse(l) // returns {{1,3,5}, {3, 4}, {1,4}}
How to implement it using Linq instead of imperative looping?
Or is Linq not good for this requirement because the logic depends on the order of the input stream?

Simple loop would be good option.
Alternatives:
Enumerable.Aggregate and start new list on 0
Write own extension similar to Create batches in linq or Use LINQ to group a sequence of numbers with no gaps
Aggregate sample
var result = list.Aggregate(new List<List<int>>(),
(sum,current) => {
if(current == 0)
sum.Add(new List<int>());
else
sum.Last().Add(current);
return sum;
});
Note: this is only sample of the approach working for given very friendly input like {0,1,2,0,3,4}.
One can even make aggregation into immutable lists but that will look insane with basic .Net types.

Here's an answer that lazily enumerates the source enumerable, but eagerly enumerates the contents of each returned list between zeroes. It properly throws upon null input or upon being given a list that does not start with a zero (though allowing an empty list through--that's really an implementation detail you have to decide on). It does not return an extra and empty list at the end like at least one other answer's possible suggestions does.
public static IEnumerable<List<int>> Parse(this IEnumerable<int> source, int splitValue = 0) {
if (source == null) {
throw new ArgumentNullException(nameof (source));
}
using (var enumerator = source.GetEnumerator()) {
if (!enumerator.MoveNext()) {
return Enumerable.Empty<List<int>>();
}
if (enumerator.Current != splitValue) {
throw new ArgumentException(nameof (source), $"Source enumerable must begin with a {splitValue}.");
}
return ParseImpl(enumerator, splitValue);
}
}
private static IEnumerable<List<int>> ParseImpl(IEnumerator<int> enumerator, int splitValue) {
var list = new List<int>();
while (enumerator.MoveNext()) {
if (enumerator.Current == splitValue) {
yield return list;
list = new List<int>();
}
else {
list.Add(enumerator.Current);
}
}
if (list.Any()) {
yield return list;
}
}
This could easily be adapted to be generic instead of int, just change Parse to Parse<T>, change int to T everywhere, and use a.Equals(b) or !a.Equals(b) instead of a == b or a != b.

You could create an extension method like this:
public static IEnumerable<IEnumerable<T>> SplitBy<T>(this IEnumerable<T> source, T value)
{
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
var list = new List<T> { };
//In case the source doesn't start with 0
if (!e.Current.Equals(value))
{
list.Add(e.Current);
}
while (e.MoveNext())
{
if ( !e.Current.Equals(value))
{
list.Add(e.Current);
}
else
{
yield return list;
list = new List<T> { };
}
}
//In case the source doesn't end with 0
if (list.Count>0)
{
yield return list;
}
}
}
}
Then, you can do the following:
var l = new List<int> { 0, 1, 3, 5, 0, 3, 4, 0, 1, 4, 0 };
var result = l.SplitBy(0);

You could use GroupBy with a counter.
var list = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
int counter = 0;
var result = list.GroupBy(x => x==0 ? counter++ : counter)
.Select(g => g.TakeWhile(x => x!=0).ToList())
.Where(l => l.Any());

Edited to fix possibility of zeroes within numbers
Here is a semi-LINQ solution:
var l = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
string
.Join(",", l.Select(x => x == 0 ? "|" : x.ToString()))
.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
This is probably not preferable to using a loop due to performance and other reasons, but it should work.

Related

Split a list of objects into sub-lists of contiguous elements using LINQ?

I have a simple class Item:
public class Item
{
public int Start { get; set;}
public int Stop { get; set;}
}
Given a List<Item> I want to split this into multiple sublists of contiguous elements. e.g. a method
List<Item[]> GetContiguousSequences(Item[] items)
Each element of the returned list should be an array of Item such that list[i].Stop == list[i+1].Start for each element
e.g.
{[1,10], [10,11], [11,20], [25,30], [31,40], [40,45], [45,100]}
=>
{{[1,10], [10,11], [11,20]}, {[25,30]}, {[31,40],[40,45],[45,100]}}
Here is a simple (and not guaranteed bug-free) implementation that simply walks the input data looking for discontinuities:
List<Item[]> GetContiguousSequences(Item []items)
{
var ret = new List<Item[]>();
var i1 = 0;
for(var i2=1;i2<items.Length;++i2)
{
//discontinuity
if(items[i2-1].Stop != items[i2].Start)
{
var num = i2 - i1;
ret.Add(items.Skip(i1).Take(num).ToArray());
i1 = i2;
}
}
//end of array
ret.Add(items.Skip(i1).Take(items.Length-i1).ToArray());
return ret;
}
It's not the most intuitive implementation and I wonder if there is a way to have a neater LINQ-based approach. I was looking at Take and TakeWhile thinking to find the indices where discontinuities occur but couldn't see an easy way to do this.
Is there a simple way to use IEnumerable LINQ algorithms to do this in a more descriptive (not necessarily performant) way?
I set of a simple test-case here: https://dotnetfiddle.net/wrIa2J
I'm really not sure this is much better than your original, but for the purpose of another solution the general process is
Use Select to project a list working out a grouping
Use GroupBy to group by the above
Use Select again to project the grouped items to an array of Item
Use ToList to project the result to a list
public static List<Item[]> GetContiguousSequences2(Item []items)
{
var currIdx = 1;
return items.Select( (item,index) => new {
item = item,
index = index == 0 || items[index-1].Stop == item.Start ? currIdx : ++currIdx
})
.GroupBy(x => x.index, x => x.item)
.Select(x => x.ToArray())
.ToList();
}
Live example: https://dotnetfiddle.net/mBfHru
Another way is to do an aggregation using Aggregate. This means maintaining a final Result list and a Curr list where you can aggregate your sequences, adding them to the Result list as you find discontinuities. This method looks a little closer to your original
public static List<Item[]> GetContiguousSequences3(Item []items)
{
var res = items.Aggregate(new {Result = new List<Item[]>(), Curr = new List<Item>()}, (agg, item) => {
if(!agg.Curr.Any() || agg.Curr.Last().Stop == item.Start) {
agg.Curr.Add(item);
} else {
agg.Result.Add(agg.Curr.ToArray());
agg.Curr.Clear();
agg.Curr.Add(item);
}
return agg;
});
res.Result.Add(res.Curr.ToArray()); // Remember to add the last group
return res.Result;
}
Live example: https://dotnetfiddle.net/HL0VyJ
You can implement ContiguousSplit as a corutine: let's loop over source and either add item into current range or return it and start a new one.
private static IEnumerable<Item[]> ContiguousSplit(IEnumerable<Item> source) {
List<Item> current = new List<Item>();
foreach (var item in source) {
if (current.Count > 0 && current[current.Count - 1].Stop != item.Start) {
yield return current.ToArray();
current.Clear();
}
current.Add(item);
}
if (current.Count > 0)
yield return current.ToArray();
}
then if you want materialization
List<Item[]> GetContiguousSequences(Item []items) => ContiguousSplit(items).ToList();
Your solution is okay. I don't think that LINQ adds any simplification or clarity in this situation. Here is a fast solution that I find intuitive:
static List<Item[]> GetContiguousSequences(Item[] items)
{
var result = new List<Item[]>();
int start = 0;
while (start < items.Length) {
int end = start + 1;
while (end < items.Length && items[end].Start == items[end - 1].Stop) {
end++;
}
int len = end - start;
var a = new Item[len];
Array.Copy(items, start, a, 0, len);
result.Add(a);
start = end;
}
return result;
}

How to check if contents of a List<String> is alphabetical [duplicate]

I am doing some unit tests and I want to know if there's any way to test if a list is ordered by a property of the objects it contains.
Right now I am doing it this way but I don't like it, I want a better way. Can somebody help me please?
// (fill the list)
List<StudyFeedItem> studyFeeds =
Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
StudyFeedItem previous = studyFeeds.First();
foreach (StudyFeedItem item in studyFeeds)
{
if (item != previous)
{
Assert.IsTrue(previous.Date > item.Date);
}
previous = item;
}
If you are using MSTest, you may want to take a look at CollectionAssert.AreEqual.
Enumerable.SequenceEqual may be another useful API to use in an assertion.
In both cases you should prepare a list that holds the expected list in the expected order, and then compare that list to the result.
Here's an example:
var studyFeeds = Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
var expectedList = studyFeeds.OrderByDescending(x => x.Date);
Assert.IsTrue(expectedList.SequenceEqual(studyFeeds));
A .NET 4.0 way would be to use the Enumerable.Zip method to zip the list with itself offset by one, which pairs each item with the subsequent item in the list. You can then check that the condition holds true for each pair, e.g.
var ordered = studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => new { a, b })
.All(p => p.a.Date < p.b.Date);
If you're on an earlier version of the framework you can write your own Zip method without too much trouble, something like the following (argument validation and disposal of the enumerators if applicable is left to the reader):
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> selector)
{
var e1 = first.GetEnumerator();
var e2 = second.GetEnumerator();
while (e1.MoveNext() & e2.MoveNext()) // one & is important
yield return selector(e1.Current, e2.Current);
}
Nunit 2.5 introduced CollectionOrderedContraint and a nice syntax for verifying the order of a collection:
Assert.That(collection, Is.Ordered.By("PropertyName"));
No need to manually order and compare.
If your unit testing framework has helper methods to assert equality of collections, you should be able do something like this (NUnit flavored):
var sorted = studyFeeds.OrderBy(s => s.Date);
CollectionAssert.AreEqual(sorted.ToList(), studyFeeds.ToList());
The assert method works with any IEnumerable, but when both collections are of type IList or "array of something", the error message thrown when the assert fails will contain the index of the first out-of-place element.
The solutions posted involving sorting the list are expensive - determining if a list IS sorted can be done in O(N). Here's an extension method which will check:
public static bool IsOrdered<T>(this IList<T> list, IComparer<T> comparer = null)
{
if (comparer == null)
{
comparer = Comparer<T>.Default;
}
if (list.Count > 1)
{
for (int i = 1; i < list.Count; i++)
{
if (comparer.Compare(list[i - 1], list[i]) > 0)
{
return false;
}
}
}
return true;
}
A corresponding IsOrderedDescending could be implemented easily by changing > 0 to < 0.
Greg Beech answer, although excellent, can be simplified further by performing the test in the Zip itself. So instead of:
var ordered = studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => new { a, b })
.All(p => p.a.Date <= p.b.Date);
You can simply do:
var ordered = !studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => a.Date <= b.Date)
.Contains(false);
Which saves you one lambda expression and one anonymous type.
(In my opinion removing the anonymous type also makes it easier to read.)
if(studyFeeds.Length < 2)
return;
for(int i = 1; i < studyFeeds.Length;i++)
Assert.IsTrue(studyFeeds[i-1].Date > studyFeeds[i].Date);
for isn't dead just quite yet!
How about:
var list = items.ToList();
for(int i = 1; i < list.Count; i++) {
Assert.IsTrue(yourComparer.Compare(list[i - 1], list[i]) <= 0);
}
where yourComparer is an instance of YourComparer which implements IComparer<YourBusinessObject>. This ensures that every element is less than the next element in the enumeration.
Linq based answer is:
You can use SequenceEqual method to check if the original and ordered one is same or not.
var isOrderedAscending = lJobsList.SequenceEqual(lJobsList.OrderBy(x => x));
var isOrderedDescending = lJobsList.SequenceEqual(lJobsList.OrderByDescending(x => x));
Don't forget to import System.Linq namespace.
Additionally:
I am repeating that this answer is Linq based, you can achieve more efficiency by creating your custom extension method.
Also, if somebody still wants to use Linq and check if the sequence both is ordered in ascending or descending order, then you can achieve a little bit more efficiency like that:
var orderedSequence = lJobsList.OrderBy(x => x)
.ToList();
var reversedOrderSequence = orderedSequence.AsEnumerable()
.Reverse();
if (lJobsList.SequenceEqual(orderedSequence))
{
// Ordered in ascending
}
else (lJobsList.SequenceEqual(reversedOrderSequence))
{
// Ordered in descending
}
You could use an extension method like this:
public static System.ComponentModel.ListSortDirection? SortDirection<T>(this IEnumerable<T> items, Comparer<T> comparer = null)
{
if (items == null) throw new ArgumentNullException("items");
if (comparer == null) comparer = Comparer<T>.Default;
bool ascendingOrder = true; bool descendingOrder = true;
using (var e = items.GetEnumerator())
{
if (e.MoveNext())
{
T last = e.Current; // first item
while (e.MoveNext())
{
int diff = comparer.Compare(last, e.Current);
if (diff > 0)
ascendingOrder = false;
else if (diff < 0)
descendingOrder = false;
if (!ascendingOrder && !descendingOrder)
break;
last = e.Current;
}
}
}
if (ascendingOrder)
return System.ComponentModel.ListSortDirection.Ascending;
else if (descendingOrder)
return System.ComponentModel.ListSortDirection.Descending;
else
return null;
}
It enables to check if the sequence is sorted and also determines the direction:
var items = new[] { 3, 2, 1, 1, 0 };
var sort = items.SortDirection();
Console.WriteLine("Is sorted? {0}, Direction: {1}", sort.HasValue, sort);
//Is sorted? True, Direction: Descending
Here's how I do it with Linq and I comparable, might not be the best but works for me and it's test framework independent.
So the call looks like this:
myList.IsOrderedBy(a => a.StartDate)
This works for anything that implements IComparable, so numbers strings and anything that inherit from IComparable:
public static bool IsOrderedBy<T, TProperty>(this List<T> list, Expression<Func<T, TProperty>> propertyExpression) where TProperty : IComparable<TProperty>
{
var member = (MemberExpression) propertyExpression.Body;
var propertyInfo = (PropertyInfo) member.Member;
IComparable<TProperty> previousValue = null;
for (int i = 0; i < list.Count(); i++)
{
var currentValue = (TProperty)propertyInfo.GetValue(list[i], null);
if (previousValue == null)
{
previousValue = currentValue;
continue;
}
if(previousValue.CompareTo(currentValue) > 0) return false;
previousValue = currentValue;
}
return true;
}
Hope this helps, took me ages to work this one out.
Checking a sequence can have four different outcomes. Same means that all elements in the sequence are the same (or the sequence is empty):
enum Sort {
Unsorted,
Same,
SortedAscending,
SortedDescending
}
Here is a way to check the sorting of a sequence:
Sort GetSort<T>(IEnumerable<T> source, IComparer<T> comparer = null) {
if (source == null)
throw new ArgumentNullException(nameof(source));
if (comparer == null)
comparer = Comparer<T>.Default;
using (var enumerator = source.GetEnumerator()) {
if (!enumerator.MoveNext())
return Sort.Same;
Sort? result = null;
var previousItem = enumerator.Current;
while (enumerator.MoveNext()) {
var nextItem = enumerator.Current;
var comparison = comparer.Compare(previousItem, nextItem);
if (comparison < 0) {
if (result == Sort.SortedDescending)
return Sort.Unsorted;
result = Sort.SortedAscending;
}
else if (comparison > 0) {
if (result == Sort.SortedAscending)
return Sort.Unsorted;
result = Sort.SortedDescending;
}
}
return result ?? Sort.Same;
}
}
I'm using the enumerator directly instead of a foreach loop because I need to examine the elements of the sequence as pairs. It makes the code more complex but is also more efficient.
Something LINQ-y would be to use a separate sorted query...
var sorted = from item in items
orderby item.Priority
select item;
Assert.IsTrue(items.SequenceEquals(sorted));
Type inference means you'd need a
where T : IHasPriority
However, if you have multiple items of the same priority, then for a unit test assertion you're probably best off just looping with the list index as Jason suggested.
One way or another you're going to have to walk the list and ensure that the items are in the order you want. Since the item comparison is custom, you could look into creating a generic method for this and passing in a comparison function - the same way that sorting the list uses comparison functions.
You can create an ordered and an unordered version of the list first:
var asc = jobs.OrderBy(x => x);
var desc = jobs.OrderByDescending(x => x);
Now compare the original list with both:
if (jobs.SequenceEqual(asc) || jobs.SequenceEquals(desc)) // ...
var studyFeeds = Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
var orderedFeeds = studyFeeds.OrderBy(f => f.Date);
for (int i = 0; i < studyFeeds.Count; i++)
{
Assert.AreEqual(orderedFeeds[i].Date, studyFeeds[i].Date);
}
What about something like this, without sorting the list
public static bool IsAscendingOrder<T>(this IEnumerable<T> seq) where T : IComparable
{
var seqArray = seq as T[] ?? seq.ToArray();
return !seqArray.Where((e, i) =>
i < seqArray.Count() - 1 &&
e.CompareTo(seqArray.ElementAt(i + 1)) >= 0).Any();
}
Microsoft.VisualStudio.TestTools.UnitTesting.CollectionAssert.AreEqual(
mylist.OrderBy((a) => a.SomeProperty).ToList(),
mylist,
"Not sorted.");
Here's a more lightweight generic version. To test for descending order, change the >= 0 comparison to <= 0.
public static bool IsAscendingOrder<T>(this IEnumerable<T> seq) where T : IComparable<T>
{
var predecessor = default(T);
var hasPredecessor = false;
foreach(var x in seq)
{
if (hasPredecessor && predecessor.CompareTo(x) >= 0) return false;
predecessor = x;
hasPredecessor = true;
}
return true;
}
Tests:
new int[] { }.IsAscendingOrder() returns true
new int[] { 1 }.IsAscendingOrder() returns true
new int[] { 1,2 }.IsAscendingOrder() returns true
new int[] { 1,2,0 }.IsAscendingOrder() returns false
While AnorZaken's and Greg Beech's answers are very nice, as they don't require using an extension method, it can be good to avoid Zip() sometimes, as some enumerables can be expensive to enumerate in this way.
A solution can be found in Aggregate()
double[] score1 = new double[] { 12.2, 13.3, 5, 17.2, 2.2, 4.5 };
double[] score2 = new double[] { 2.2, 4.5, 5, 12.2, 13.3, 17.2 };
bool isordered1 = score1.Aggregate(double.MinValue,(accum,elem)=>elem>=accum?elem:double.MaxValue) < double.MaxValue;
bool isordered2 = score2.Aggregate(double.MinValue,(accum,elem)=>elem>=accum?elem:double.MaxValue) < double.MaxValue;
Console.WriteLine ("isordered1 {0}",isordered1);
Console.WriteLine ("isordered2 {0}",isordered2);
One thing a little ugly about the above solution, is the double less-than comparisons. Floating comparisons like this make me queasy as it is almost like a floating point equality comparison. But it seems to work for double here. Integer values would be fine, also.
The floating point comparison can be avoided by using nullable types, but then the code becomes a bit harder to read.
double[] score3 = new double[] { 12.2, 13.3, 5, 17.2, 2.2, 4.5 };
double[] score4 = new double[] { 2.2, 4.5, 5, 12.2, 13.3, 17.2 };
bool isordered3 = score3.Aggregate((double?)double.MinValue,(accum,elem)=>(elem>(accum??(double?)double.MaxValue).Value)?(double?)elem:(double?)null) !=null;
bool isordered4 = score4.Aggregate((double?)double.MinValue,(accum,elem)=>(elem>(accum??(double?)double.MaxValue).Value)?(double?)elem:(double?)null) !=null;
Console.WriteLine ("isordered3 {0}",isordered3);
Console.WriteLine ("isordered4 {0}",isordered4);
You can use lambda in extension:
public static bool IsAscending<T>(this IEnumerable<T> self, Func<T, T, int> compareTo) {
var list = self as IList<T> ?? self.ToList();
if (list.Count < 2) {
return true;
}
T a = list[0];
for (int i = 1; i < list.Count; i++) {
T b = list[i];
if (compareTo(a, b) > 0) {
return false;
}
a = b;
}
return true;
}
Using:
bool result1 = Enumerable.Range(2, 10).IsAscending((a, b) => a.CompareTo(b));
more:
var lst = new List<(int, string)> { (1, "b"), (2, "a"), (3, "s1"), (3, "s") };
bool result2 = lst.IsAscending((a, b) => {
var cmp = a.Item1.CompareTo(b.Item1);
if (cmp != 0) {
return cmp;
} else {
return a.Item2.CompareTo(b.Item2);
}
});
var expectedList = resultA.ToArray();
var actualList = resultB.ToArray();
var i = 0;
foreach (var item in expectedList)
{
Assert.True(expectedList[i].id == actualList[i].id);
i++;
}

Find values that appear in all lists (or arrays or collections)

Given the following:
List<List<int>> lists = new List<List<int>>();
lists.Add(new List<int>() { 1,2,3,4,5,6,7 });
lists.Add(new List<int>() { 1,2 });
lists.Add(new List<int>() { 1,2,3,4 });
lists.Add(new List<int>() { 1,2,5,6,7 });
What is the best/fastest way of identifying which numbers appear in all lists?
You can use the .net 3.5 .Intersect() extension method:-
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
To do it for two lists one would use x.Intersect(y).
To do it for several we would want to do something like:
var intersection = lists.Aggregate((x, y) => x.Intersect(y));
But this won't work because the result of the lambda isn't List<int> and so it can't be fed back in. This might tempt us to try:
var intersection = lists.Aggregate((x, y) => x.Intersect(y).ToList());
But then this makes n-1 needless calls to ToList() which is relatively expensive. We can get around this with:
var intersection = lists.Aggregate(
(IEnumerable<int> x, IEnumerable<int> y) => x.Intersect(y));
Which applies the same logic, but in using explicit types in the lambda, we can feed the result of Intersect() back in without wasting time and memory creating a list each time, and so gives faster results.
If this came up a lot we can get further (slight) performance improvements by rolling our own rather than using Linq:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(!en.MoveNext()) return Enumerable.Empty<T>();
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
return set;
}
}
This assumes its for internal use only. If it could be called from another assembly it should have error checks, and perhaps should be defined so as to only perform the intersect operations on the first MoveNext() of the calling code:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
if(source == null)
throw new ArgumentNullException("source");
return IntersectAllIterator(source);
}
public static IEnumerable<T> IntersectAllIterator<T>(IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(en.MoveNext())
{
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
foreach(T item in set)
yield return item;
}
}
}
(In these final two versions there's an opportunity to short-circuit if we end up emptying the set, but it only pays off if this happens relatively often, otherwise it's a nett loss).
Conversely, if these aren't concerns, and if we know that we're only ever going to want to do this with lists, we can optimise a bit further with the use of Count and indices:
public static IEnumerable<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return Enumerable.Empty<T>();
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return set;
}
And further if we know we're always going to want the results in a list, and we know that either we won't mutate the list, or it won't matter if the input list got mutated, we can optimise for the case of there being zero or one lists (but this costs more if we might ever not need the output in a list):
public static List<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return new List<T>(0);
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return new List<T>(set);
}
Again though, as well as making the method less widely-applicable, this has risks in terms of how it could be used, so is only appropriate for internal code were you can know either that you won't change either the input or the output after the fact, or that this won't matter.
Linq already offers Intersect and you can exploit Aggregate as well:
var result = lists.Aggregate((a, b) => a.Intersect(b).ToList());
If you don't trust the Intersect method or you just prefer to see what's going on, here's a snippet of code that should do the trick:
// Output goes here
List<int> output = new List<int>();
// Make sure lists are sorted
for (int i = 0; i < lists.Count; ++i) lists[i].Sort();
// Maintain array of indices so we can step through all the lists in parallel
int[] index = new int[lists.Count];
while(index[0] < lists[0].Count)
{
// Search for each value in the first list
int value = lists[0][index[0]];
// No. lists that value appears in, we want this to equal lists.Count
int count = 1;
// Search all the other lists for the value
for (int i = 1; i < lists.Count; ++i)
{
while (index[i] < lists[i].Count)
{
// Stop if we've passed the spot where value would have been
if (lists[i][index[i]] > value) break;
// Stop if we find value
if (lists[i][index[i]] == value)
{
++count;
break;
}
++index[i];
}
// If we reach the end of any list there can't be any more matches so end the search now
if (index[i] >= lists[i].Count) goto done;
}
// Store the value if we found it in all the lists
if (count == lists.Count) output.Add(value);
// Skip multiple occurrances of the same value
while (index[0] < lists[0].Count && lists[0][index[0]] == value) ++index[0];
}
done:
Edit:
I got bored and did some benchmarks on this vs. Jon Hanna's version. His is consistently faster, typically by around 50%. Mine wins by about the same margin if you happen to have presorted lists, though. Also you can gain a further 20% or so with unsafe optimisations. Just thought I'd share that.
You can also get it with SelectMany and Distinct:
List<int> result = lists
.SelectMany(x => x.Where(e => lists.All(l => l.Contains(e))))
.Distinct().ToList();
Edit:
List<int> result2 = lists.First().Where(e => lists.Skip(1).All(l => l.Contains(e)))
.ToList();
Edit 2:
List<int> result3 = lists
.Select(l => l.OrderBy(n => n).Take(lists.Min(x => x.Count()))).First()
.TakeWhile((n, index) => lists.Select(l => l.OrderBy(x => x)).Skip(1).All(l => l.ElementAt(index) == n))
.ToList();

Interleaved merge with LINQ?

I'm currently experimenting a bit with LINQ. Let's say I have two collections of identical length:
var first = new string[] { "1", "2", "3" };
var second = new string[] { "a", "b", "c" };
I would like to merge those two collections into one, but in an interleaved fashion. The resulting sequence should thus be:
"1", "a", "2", "b", "3", "c"
What I've come up with so far is a combination of Zip, an anonymous type and SelectMany:
var result = first.Zip( second, ( f, s ) => new { F = f, S = s } )
.SelectMany( fs => new string[] { fs.F, fs.S } );
Does anybody know of an alternate/simpler way to achieve such an interleaved merge with LINQ?
The example you provided can by made simpler by dispensing with the anonymous type:
var result = first.Zip(second, (f, s) => new[] { f, s })
.SelectMany(f => f);
Warning: this will skip trailing elements if the enumerations have different lengths. If you'd rather substitute in nulls to pad out the shorter collection, use Andrew Shepherd's answer below.
You could write your own Interleave extension method, like in this example.
internal static IEnumerable<T> InterleaveEnumerationsOfEqualLength<T>(
this IEnumerable<T> first,
IEnumerable<T> second)
{
using (IEnumerator<T>
enumerator1 = first.GetEnumerator(),
enumerator2 = second.GetEnumerator())
{
while (enumerator1.MoveNext() && enumerator2.MoveNext())
{
yield return enumerator1.Current;
yield return enumerator2.Current;
}
}
}
The given implementation in the accepted answer has an inconsistency:
The resulting sequence will always contain all elements of the first sequence (because of the outer while loop), but if the second sequence contains more elements, than those elements will not be appended.
From an Interleave method I would expect that the resulting sequence contains
only 'pairs' (length of resulting sequence: min(length_1, length_2) * 2)), or that
the remaining elements of the longer sequence are always appended (length of resulting sequence: length_1 + length_2).
The following implementation follows the second approach.
Note the single | in the or-comparison which avoids short-circuit evaluation.
public static IEnumerable<T> Interleave<T> (
this IEnumerable<T> first, IEnumerable<T> second)
{
using (var enumerator1 = first.GetEnumerator())
using (var enumerator2 = second.GetEnumerator())
{
bool firstHasMore;
bool secondHasMore;
while ((firstHasMore = enumerator1.MoveNext())
| (secondHasMore = enumerator2.MoveNext()))
{
if (firstHasMore)
yield return enumerator1.Current;
if (secondHasMore)
yield return enumerator2.Current;
}
}
}
You can just loop and select the array depending on the index:
var result =
Enumerable.Range(0, first.Length * 2)
.Select(i => (i % 2 == 0 ? first : second)[i / 2]);
var result = first.SelectMany( ( f, i ) => new List<string> { f, second[ i ] } );
This is a modified version of the answer from #Douglas. This allows for a dynamic number of IEnumerables to interleave, and goes through all elements, not stopping at the end of the shortest IEnumerable.
public static IEnumerable<T> Interleave<T>(params IEnumerable<T>[] enumerables)
{
var enumerators = enumerables.Select(e => e.GetEnumerator()).ToList();
while (enumerators.Any())
{
enumerators.RemoveAll(e => {
var ended = !e.MoveNext();
if (ended) e.Dispose();
return ended;
});
foreach (var enumerator in enumerators)
yield return enumerator.Current;
}
}

Simple sort verification for unit testing an ORDER BY? [duplicate]

I am doing some unit tests and I want to know if there's any way to test if a list is ordered by a property of the objects it contains.
Right now I am doing it this way but I don't like it, I want a better way. Can somebody help me please?
// (fill the list)
List<StudyFeedItem> studyFeeds =
Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
StudyFeedItem previous = studyFeeds.First();
foreach (StudyFeedItem item in studyFeeds)
{
if (item != previous)
{
Assert.IsTrue(previous.Date > item.Date);
}
previous = item;
}
If you are using MSTest, you may want to take a look at CollectionAssert.AreEqual.
Enumerable.SequenceEqual may be another useful API to use in an assertion.
In both cases you should prepare a list that holds the expected list in the expected order, and then compare that list to the result.
Here's an example:
var studyFeeds = Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
var expectedList = studyFeeds.OrderByDescending(x => x.Date);
Assert.IsTrue(expectedList.SequenceEqual(studyFeeds));
A .NET 4.0 way would be to use the Enumerable.Zip method to zip the list with itself offset by one, which pairs each item with the subsequent item in the list. You can then check that the condition holds true for each pair, e.g.
var ordered = studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => new { a, b })
.All(p => p.a.Date < p.b.Date);
If you're on an earlier version of the framework you can write your own Zip method without too much trouble, something like the following (argument validation and disposal of the enumerators if applicable is left to the reader):
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> selector)
{
var e1 = first.GetEnumerator();
var e2 = second.GetEnumerator();
while (e1.MoveNext() & e2.MoveNext()) // one & is important
yield return selector(e1.Current, e2.Current);
}
Nunit 2.5 introduced CollectionOrderedContraint and a nice syntax for verifying the order of a collection:
Assert.That(collection, Is.Ordered.By("PropertyName"));
No need to manually order and compare.
If your unit testing framework has helper methods to assert equality of collections, you should be able do something like this (NUnit flavored):
var sorted = studyFeeds.OrderBy(s => s.Date);
CollectionAssert.AreEqual(sorted.ToList(), studyFeeds.ToList());
The assert method works with any IEnumerable, but when both collections are of type IList or "array of something", the error message thrown when the assert fails will contain the index of the first out-of-place element.
The solutions posted involving sorting the list are expensive - determining if a list IS sorted can be done in O(N). Here's an extension method which will check:
public static bool IsOrdered<T>(this IList<T> list, IComparer<T> comparer = null)
{
if (comparer == null)
{
comparer = Comparer<T>.Default;
}
if (list.Count > 1)
{
for (int i = 1; i < list.Count; i++)
{
if (comparer.Compare(list[i - 1], list[i]) > 0)
{
return false;
}
}
}
return true;
}
A corresponding IsOrderedDescending could be implemented easily by changing > 0 to < 0.
Greg Beech answer, although excellent, can be simplified further by performing the test in the Zip itself. So instead of:
var ordered = studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => new { a, b })
.All(p => p.a.Date <= p.b.Date);
You can simply do:
var ordered = !studyFeeds.Zip(studyFeeds.Skip(1), (a, b) => a.Date <= b.Date)
.Contains(false);
Which saves you one lambda expression and one anonymous type.
(In my opinion removing the anonymous type also makes it easier to read.)
if(studyFeeds.Length < 2)
return;
for(int i = 1; i < studyFeeds.Length;i++)
Assert.IsTrue(studyFeeds[i-1].Date > studyFeeds[i].Date);
for isn't dead just quite yet!
How about:
var list = items.ToList();
for(int i = 1; i < list.Count; i++) {
Assert.IsTrue(yourComparer.Compare(list[i - 1], list[i]) <= 0);
}
where yourComparer is an instance of YourComparer which implements IComparer<YourBusinessObject>. This ensures that every element is less than the next element in the enumeration.
Linq based answer is:
You can use SequenceEqual method to check if the original and ordered one is same or not.
var isOrderedAscending = lJobsList.SequenceEqual(lJobsList.OrderBy(x => x));
var isOrderedDescending = lJobsList.SequenceEqual(lJobsList.OrderByDescending(x => x));
Don't forget to import System.Linq namespace.
Additionally:
I am repeating that this answer is Linq based, you can achieve more efficiency by creating your custom extension method.
Also, if somebody still wants to use Linq and check if the sequence both is ordered in ascending or descending order, then you can achieve a little bit more efficiency like that:
var orderedSequence = lJobsList.OrderBy(x => x)
.ToList();
var reversedOrderSequence = orderedSequence.AsEnumerable()
.Reverse();
if (lJobsList.SequenceEqual(orderedSequence))
{
// Ordered in ascending
}
else (lJobsList.SequenceEqual(reversedOrderSequence))
{
// Ordered in descending
}
You could use an extension method like this:
public static System.ComponentModel.ListSortDirection? SortDirection<T>(this IEnumerable<T> items, Comparer<T> comparer = null)
{
if (items == null) throw new ArgumentNullException("items");
if (comparer == null) comparer = Comparer<T>.Default;
bool ascendingOrder = true; bool descendingOrder = true;
using (var e = items.GetEnumerator())
{
if (e.MoveNext())
{
T last = e.Current; // first item
while (e.MoveNext())
{
int diff = comparer.Compare(last, e.Current);
if (diff > 0)
ascendingOrder = false;
else if (diff < 0)
descendingOrder = false;
if (!ascendingOrder && !descendingOrder)
break;
last = e.Current;
}
}
}
if (ascendingOrder)
return System.ComponentModel.ListSortDirection.Ascending;
else if (descendingOrder)
return System.ComponentModel.ListSortDirection.Descending;
else
return null;
}
It enables to check if the sequence is sorted and also determines the direction:
var items = new[] { 3, 2, 1, 1, 0 };
var sort = items.SortDirection();
Console.WriteLine("Is sorted? {0}, Direction: {1}", sort.HasValue, sort);
//Is sorted? True, Direction: Descending
Here's how I do it with Linq and I comparable, might not be the best but works for me and it's test framework independent.
So the call looks like this:
myList.IsOrderedBy(a => a.StartDate)
This works for anything that implements IComparable, so numbers strings and anything that inherit from IComparable:
public static bool IsOrderedBy<T, TProperty>(this List<T> list, Expression<Func<T, TProperty>> propertyExpression) where TProperty : IComparable<TProperty>
{
var member = (MemberExpression) propertyExpression.Body;
var propertyInfo = (PropertyInfo) member.Member;
IComparable<TProperty> previousValue = null;
for (int i = 0; i < list.Count(); i++)
{
var currentValue = (TProperty)propertyInfo.GetValue(list[i], null);
if (previousValue == null)
{
previousValue = currentValue;
continue;
}
if(previousValue.CompareTo(currentValue) > 0) return false;
previousValue = currentValue;
}
return true;
}
Hope this helps, took me ages to work this one out.
Checking a sequence can have four different outcomes. Same means that all elements in the sequence are the same (or the sequence is empty):
enum Sort {
Unsorted,
Same,
SortedAscending,
SortedDescending
}
Here is a way to check the sorting of a sequence:
Sort GetSort<T>(IEnumerable<T> source, IComparer<T> comparer = null) {
if (source == null)
throw new ArgumentNullException(nameof(source));
if (comparer == null)
comparer = Comparer<T>.Default;
using (var enumerator = source.GetEnumerator()) {
if (!enumerator.MoveNext())
return Sort.Same;
Sort? result = null;
var previousItem = enumerator.Current;
while (enumerator.MoveNext()) {
var nextItem = enumerator.Current;
var comparison = comparer.Compare(previousItem, nextItem);
if (comparison < 0) {
if (result == Sort.SortedDescending)
return Sort.Unsorted;
result = Sort.SortedAscending;
}
else if (comparison > 0) {
if (result == Sort.SortedAscending)
return Sort.Unsorted;
result = Sort.SortedDescending;
}
}
return result ?? Sort.Same;
}
}
I'm using the enumerator directly instead of a foreach loop because I need to examine the elements of the sequence as pairs. It makes the code more complex but is also more efficient.
Something LINQ-y would be to use a separate sorted query...
var sorted = from item in items
orderby item.Priority
select item;
Assert.IsTrue(items.SequenceEquals(sorted));
Type inference means you'd need a
where T : IHasPriority
However, if you have multiple items of the same priority, then for a unit test assertion you're probably best off just looping with the list index as Jason suggested.
One way or another you're going to have to walk the list and ensure that the items are in the order you want. Since the item comparison is custom, you could look into creating a generic method for this and passing in a comparison function - the same way that sorting the list uses comparison functions.
You can create an ordered and an unordered version of the list first:
var asc = jobs.OrderBy(x => x);
var desc = jobs.OrderByDescending(x => x);
Now compare the original list with both:
if (jobs.SequenceEqual(asc) || jobs.SequenceEquals(desc)) // ...
var studyFeeds = Feeds.GetStudyFeeds(2120, DateTime.Today.AddDays(-200), 20);
var orderedFeeds = studyFeeds.OrderBy(f => f.Date);
for (int i = 0; i < studyFeeds.Count; i++)
{
Assert.AreEqual(orderedFeeds[i].Date, studyFeeds[i].Date);
}
What about something like this, without sorting the list
public static bool IsAscendingOrder<T>(this IEnumerable<T> seq) where T : IComparable
{
var seqArray = seq as T[] ?? seq.ToArray();
return !seqArray.Where((e, i) =>
i < seqArray.Count() - 1 &&
e.CompareTo(seqArray.ElementAt(i + 1)) >= 0).Any();
}
Microsoft.VisualStudio.TestTools.UnitTesting.CollectionAssert.AreEqual(
mylist.OrderBy((a) => a.SomeProperty).ToList(),
mylist,
"Not sorted.");
Here's a more lightweight generic version. To test for descending order, change the >= 0 comparison to <= 0.
public static bool IsAscendingOrder<T>(this IEnumerable<T> seq) where T : IComparable<T>
{
var predecessor = default(T);
var hasPredecessor = false;
foreach(var x in seq)
{
if (hasPredecessor && predecessor.CompareTo(x) >= 0) return false;
predecessor = x;
hasPredecessor = true;
}
return true;
}
Tests:
new int[] { }.IsAscendingOrder() returns true
new int[] { 1 }.IsAscendingOrder() returns true
new int[] { 1,2 }.IsAscendingOrder() returns true
new int[] { 1,2,0 }.IsAscendingOrder() returns false
While AnorZaken's and Greg Beech's answers are very nice, as they don't require using an extension method, it can be good to avoid Zip() sometimes, as some enumerables can be expensive to enumerate in this way.
A solution can be found in Aggregate()
double[] score1 = new double[] { 12.2, 13.3, 5, 17.2, 2.2, 4.5 };
double[] score2 = new double[] { 2.2, 4.5, 5, 12.2, 13.3, 17.2 };
bool isordered1 = score1.Aggregate(double.MinValue,(accum,elem)=>elem>=accum?elem:double.MaxValue) < double.MaxValue;
bool isordered2 = score2.Aggregate(double.MinValue,(accum,elem)=>elem>=accum?elem:double.MaxValue) < double.MaxValue;
Console.WriteLine ("isordered1 {0}",isordered1);
Console.WriteLine ("isordered2 {0}",isordered2);
One thing a little ugly about the above solution, is the double less-than comparisons. Floating comparisons like this make me queasy as it is almost like a floating point equality comparison. But it seems to work for double here. Integer values would be fine, also.
The floating point comparison can be avoided by using nullable types, but then the code becomes a bit harder to read.
double[] score3 = new double[] { 12.2, 13.3, 5, 17.2, 2.2, 4.5 };
double[] score4 = new double[] { 2.2, 4.5, 5, 12.2, 13.3, 17.2 };
bool isordered3 = score3.Aggregate((double?)double.MinValue,(accum,elem)=>(elem>(accum??(double?)double.MaxValue).Value)?(double?)elem:(double?)null) !=null;
bool isordered4 = score4.Aggregate((double?)double.MinValue,(accum,elem)=>(elem>(accum??(double?)double.MaxValue).Value)?(double?)elem:(double?)null) !=null;
Console.WriteLine ("isordered3 {0}",isordered3);
Console.WriteLine ("isordered4 {0}",isordered4);
You can use lambda in extension:
public static bool IsAscending<T>(this IEnumerable<T> self, Func<T, T, int> compareTo) {
var list = self as IList<T> ?? self.ToList();
if (list.Count < 2) {
return true;
}
T a = list[0];
for (int i = 1; i < list.Count; i++) {
T b = list[i];
if (compareTo(a, b) > 0) {
return false;
}
a = b;
}
return true;
}
Using:
bool result1 = Enumerable.Range(2, 10).IsAscending((a, b) => a.CompareTo(b));
more:
var lst = new List<(int, string)> { (1, "b"), (2, "a"), (3, "s1"), (3, "s") };
bool result2 = lst.IsAscending((a, b) => {
var cmp = a.Item1.CompareTo(b.Item1);
if (cmp != 0) {
return cmp;
} else {
return a.Item2.CompareTo(b.Item2);
}
});
var expectedList = resultA.ToArray();
var actualList = resultB.ToArray();
var i = 0;
foreach (var item in expectedList)
{
Assert.True(expectedList[i].id == actualList[i].id);
i++;
}

Categories