How to flatten tree via LINQ? - c#

So I have simple tree:
class MyNode
{
public MyNode Parent;
public IEnumerable<MyNode> Elements;
int group = 1;
}
I have an IEnumerable<MyNode>. I want to get all MyNode objects (including the inner nodes reachable through Elements) as one flat list, filtered with Where group == 1. How can I do this via LINQ?

You can flatten a tree like this:
IEnumerable<MyNode> Flatten(IEnumerable<MyNode> e) =>
e.SelectMany(c => Flatten(c.Elements)).Concat(e);
You can then filter by group using Where(...).
To earn some "points for style", convert Flatten to an extension method in a static class.
public static IEnumerable<MyNode> Flatten(this IEnumerable<MyNode> e) =>
e.SelectMany(c => c.Elements.Flatten()).Concat(e);
To earn more points for "even better style", convert Flatten to a generic extension method that takes a tree and a function that produces descendants from a node:
public static IEnumerable<T> Flatten<T>(
this IEnumerable<T> e
, Func<T,IEnumerable<T>> f
) => e.SelectMany(c => f(c).Flatten(f)).Concat(e);
Call this function like this:
IEnumerable<MyNode> tree = ....
var res = tree.Flatten(node => node.Elements);
If you would prefer flattening in pre-order rather than in post-order, switch around the sides of the Concat(...).
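To tie this back to the question, here is a minimal sketch of the generic Flatten followed by the Where filter. It rests on a few assumptions that are not in the original post: group is made public so it can be queried, Elements defaults to an empty sequence instead of null, and the Name field and the FlattenDemo class exist only to make the result visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyNode
{
    public MyNode Parent;
    // Assumption: never null, so the recursive Flatten needs no null check.
    public IEnumerable<MyNode> Elements = Enumerable.Empty<MyNode>();
    // Assumption: public here (it is private in the question) so Where can see it.
    public int group = 1;
    // Hypothetical field, used only to show which nodes survive the filter.
    public string Name;
}

public static class TreeExtensions
{
    // Post-order: all descendants of every node first, then the nodes themselves.
    public static IEnumerable<T> Flatten<T>(
        this IEnumerable<T> e,
        Func<T, IEnumerable<T>> f) =>
        e.SelectMany(c => f(c).Flatten(f)).Concat(e);
}

public static class FlattenDemo
{
    public static List<string> Run()
    {
        var tree = new[]
        {
            new MyNode
            {
                Name = "a", group = 1,
                Elements = new[]
                {
                    new MyNode { Name = "a1", group = 2 },
                    new MyNode { Name = "a2", group = 1 }
                }
            },
            new MyNode { Name = "b", group = 2 }
        };

        // Flatten yields a1, a2, a, b; the filter keeps the group 1 nodes.
        return tree.Flatten(node => node.Elements)
                   .Where(node => node.group == 1)
                   .Select(node => node.Name)
                   .ToList();
    }
}
```

With this sample tree, FlattenDemo.Run() returns a2 followed by a, because the children are emitted before their parent.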

The problem with the accepted answer is that it is inefficient if the tree is deep. If the tree is very deep then it blows the stack. You can solve the problem by using an explicit stack:
public static IEnumerable<MyNode> Traverse(this MyNode root)
{
var stack = new Stack<MyNode>();
stack.Push(root);
while(stack.Count > 0)
{
var current = stack.Pop();
yield return current;
foreach(var child in current.Elements)
stack.Push(child);
}
}
Assuming n nodes in a tree of height h and a branching factor considerably less than n, this method is O(1) in stack space, O(h) in heap space and O(n) in time. The other algorithm given is O(h) in stack, O(1) in heap and O(nh) in time. If the branching factor is small compared to n then h is between O(lg n) and O(n), which illustrates that the naïve algorithm can use a dangerous amount of stack and a large amount of time if h is close to n.
Now that we have a traversal, your query is straightforward:
root.Traverse().Where(item=>item.group == 1);
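As a rough illustration of the deep-tree case, the sketch below builds a degenerate tree (a single chain of 100,000 nodes) and walks it with the explicit-stack Traverse. BuildChain, CountGroupOne and the public group field are assumptions made for the example; they are not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyNode
{
    // Assumptions: Elements is never null, and group is public so it can be queried.
    public IEnumerable<MyNode> Elements = Enumerable.Empty<MyNode>();
    public int group = 1;
}

public static class DeepTreeDemo
{
    // Same explicit-stack traversal as above.
    public static IEnumerable<MyNode> Traverse(this MyNode root)
    {
        var stack = new Stack<MyNode>();
        stack.Push(root);
        while (stack.Count > 0)
        {
            var current = stack.Pop();
            yield return current;
            foreach (var child in current.Elements)
                stack.Push(child);
        }
    }

    // Hypothetical helper: builds a chain with `depth` nodes, so h equals n.
    public static MyNode BuildChain(int depth)
    {
        var node = new MyNode(); // the single leaf at the bottom
        for (int i = 1; i < depth; i++)
            node = new MyNode { Elements = new[] { node } };
        return node;
    }

    // Traverses the whole chain and counts the nodes in group 1.
    public static int CountGroupOne(int depth) =>
        BuildChain(depth).Traverse().Count(item => item.group == 1);
}
```

DeepTreeDemo.CountGroupOne(100000) visits every node while the call stack stays flat; only the heap-allocated Stack<MyNode> grows.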

Just for completeness, here is the combination of the answers from dasblinkenlight and Eric Lippert. Unit tested and everything. :-)
public static IEnumerable<T> Flatten<T>(
this IEnumerable<T> items,
Func<T, IEnumerable<T>> getChildren)
{
var stack = new Stack<T>();
foreach(var item in items)
stack.Push(item);
while(stack.Count > 0)
{
var current = stack.Pop();
yield return current;
var children = getChildren(current);
if (children == null) continue;
foreach (var child in children)
stack.Push(child);
}
}

Update:
For people interested in the level of nesting (depth): one of the good things about the explicit enumerator stack implementation is that at any moment (and in particular when yielding an element) stack.Count represents the depth currently being processed. Taking this into account and using C# 7.0 value tuples, we can simply change the method declaration as follows:
public static IEnumerable<(T Item, int Level)> ExpandWithLevel<T>(
this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
and the yield statement:
yield return (item, stack.Count);
Then we can implement the original method by applying a simple Select to the above:
public static IEnumerable<T> Expand<T>(
this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector) =>
source.ExpandWithLevel(elementSelector).Select(e => e.Item);
Original:
Surprisingly no one (even Eric) showed the "natural" iterative port of a recursive pre-order depth-first traversal (DFT), so here it is:
public static IEnumerable<T> Expand<T>(
this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
{
var stack = new Stack<IEnumerator<T>>();
var e = source.GetEnumerator();
try
{
while (true)
{
while (e.MoveNext())
{
var item = e.Current;
yield return item;
var elements = elementSelector(item);
if (elements == null) continue;
stack.Push(e);
e = elements.GetEnumerator();
}
if (stack.Count == 0) break;
e.Dispose();
e = stack.Pop();
}
}
finally
{
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}
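For reference, applying the two changes from the update (the new declaration and the new yield statement) to the method above gives the sketch below. The sample tree in LevelDemo is a hypothetical example, not part of the answer. Root items get level 0, because nothing has been pushed yet when they are yielded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LevelExtensions
{
    public static IEnumerable<(T Item, int Level)> ExpandWithLevel<T>(
        this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
    {
        var stack = new Stack<IEnumerator<T>>();
        var e = source.GetEnumerator();
        try
        {
            while (true)
            {
                while (e.MoveNext())
                {
                    var item = e.Current;
                    // stack.Count is the number of ancestors of the current item.
                    yield return (item, stack.Count);
                    var elements = elementSelector(item);
                    if (elements == null) continue;
                    stack.Push(e);
                    e = elements.GetEnumerator();
                }
                if (stack.Count == 0) break;
                e.Dispose();
                e = stack.Pop();
            }
        }
        finally
        {
            e.Dispose();
            while (stack.Count != 0) stack.Pop().Dispose();
        }
    }
}

public static class LevelDemo
{
    // Hypothetical sample tree: 1 -> (11 -> (111), 12), and a childless 2.
    static IEnumerable<int> Children(int n)
    {
        switch (n)
        {
            case 1: return new[] { 11, 12 };
            case 11: return new[] { 111 };
            default: return null;
        }
    }

    // Formats every item as "item@level".
    public static string Run() =>
        string.Join(" ", new[] { 1, 2 }
            .ExpandWithLevel(n => Children(n))
            .Select(x => x.Item + "@" + x.Level));
}
```

LevelDemo.Run() produces "1@0 11@1 111@2 12@1 2@0", which is the pre-order sequence with each item's depth attached.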

I found some small issues with the answers given here:
What if the initial list of items is null?
What if there is a null value in the list of children?
Building on the previous answers, I came up with the following:
public static class IEnumerableExtensions
{
public static IEnumerable<T> Flatten<T>(
this IEnumerable<T> items,
Func<T, IEnumerable<T>> getChildren)
{
if (items == null)
yield break;
var stack = new Stack<T>(items);
while (stack.Count > 0)
{
var current = stack.Pop();
yield return current;
if (current == null) continue;
var children = getChildren(current);
if (children == null) continue;
foreach (var child in children)
stack.Push(child);
}
}
}
And the unit tests:
[TestClass]
public class IEnumerableExtensionsTests
{
[TestMethod]
public void NullList()
{
IEnumerable<Test> items = null;
var flattened = items.Flatten(i => i.Children);
Assert.AreEqual(0, flattened.Count());
}
[TestMethod]
public void EmptyList()
{
var items = new Test[0];
var flattened = items.Flatten(i => i.Children);
Assert.AreEqual(0, flattened.Count());
}
[TestMethod]
public void OneItem()
{
var items = new[] { new Test() };
var flattened = items.Flatten(i => i.Children);
Assert.AreEqual(1, flattened.Count());
}
[TestMethod]
public void OneItemWithChild()
{
var items = new[] { new Test { Id = 1, Children = new[] { new Test { Id = 2 } } } };
var flattened = items.Flatten(i => i.Children);
Assert.AreEqual(2, flattened.Count());
Assert.IsTrue(flattened.Any(i => i.Id == 1));
Assert.IsTrue(flattened.Any(i => i.Id == 2));
}
[TestMethod]
public void OneItemWithNullChild()
{
var items = new[] { new Test { Id = 1, Children = new Test[] { null } } };
var flattened = items.Flatten(i => i.Children);
Assert.AreEqual(2, flattened.Count());
Assert.IsTrue(flattened.Any(i => i.Id == 1));
Assert.IsTrue(flattened.Any(i => i == null));
}
class Test
{
public int Id { get; set; }
public IEnumerable<Test> Children { get; set; }
}
}

Most of the answers presented here are producing depth-first or zig-zag sequences. For example starting with the tree below:
          1                         2
         / \                       / \
        /   \                     /   \
       /     \                   /     \
      /       \                 /       \
    11         12             21         22
   /  \       /  \           /  \       /  \
  /    \     /    \         /    \     /    \
111    112 121    122     211    212 221    222
Sergey Kalinichenko's answer produces this flattened sequence:
111, 112, 121, 122, 11, 12, 211, 212, 221, 222, 21, 22, 1, 2
Konamiman's answer (that generalizes Eric Lippert's answer) produces this flattened sequence:
2, 22, 222, 221, 21, 212, 211, 1, 12, 122, 121, 11, 112, 111
Ivan Stoev's answer produces this flattened sequence:
1, 11, 111, 112, 12, 121, 122, 2, 21, 211, 212, 22, 221, 222
If you are interested in a breadth-first sequence like this:
1, 2, 11, 12, 21, 22, 111, 112, 121, 122, 211, 212, 221, 222
...then this is the solution for you:
public static IEnumerable<T> Flatten<T>(this IEnumerable<T> source,
Func<T, IEnumerable<T>> childrenSelector)
{
var queue = new Queue<T>(source);
while (queue.Count > 0)
{
var current = queue.Dequeue();
yield return current;
var children = childrenSelector(current);
if (children == null) continue;
foreach (var child in children) queue.Enqueue(child);
}
}
The difference in the implementation is basically using a Queue instead of a Stack. No actual sorting is taking place.
Caution: this implementation is far from optimal regarding memory-efficiency, since a large percentage of the total number of elements will end up being stored in the internal queue during the enumeration. Stack-based tree-traversals are much more efficient regarding memory usage than Queue-based implementations.
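To get a feeling for the size of that difference, here is a small measurement sketch. It is an illustration under assumptions, not a benchmark: the tree is an implicit complete binary tree whose nodes are the integers 1..max (node n has children 2n and 2n+1), and PeakQueue and PeakStack are hypothetical helpers that only record the largest number of pending nodes held at once.

```csharp
using System;
using System.Collections.Generic;

public static class MemoryDemo
{
    // Children of node n in an implicit complete binary tree with nodes 1..max.
    static IEnumerable<int> Children(int n, int max)
    {
        if (2 * n <= max) yield return 2 * n;
        if (2 * n + 1 <= max) yield return 2 * n + 1;
    }

    // Breadth-first: largest number of nodes waiting in the queue.
    public static int PeakQueue(int max)
    {
        var queue = new Queue<int>();
        queue.Enqueue(1);
        int peak = queue.Count;
        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            foreach (var child in Children(current, max)) queue.Enqueue(child);
            peak = Math.Max(peak, queue.Count);
        }
        return peak;
    }

    // Depth-first with a stack of nodes: largest number of nodes waiting on the stack.
    public static int PeakStack(int max)
    {
        var stack = new Stack<int>();
        stack.Push(1);
        int peak = stack.Count;
        while (stack.Count > 0)
        {
            var current = stack.Pop();
            foreach (var child in Children(current, max)) stack.Push(child);
            peak = Math.Max(peak, stack.Count);
        }
        return peak;
    }
}
```

For a tree of 1023 nodes (10 full levels), PeakQueue(1023) is 512, i.e. the whole bottom level, while PeakStack(1023) is 10, one pending node per level. Trees with a different shape will give different ratios.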

In case anyone else finds this, but also needs to know the level after they've flattened the tree, this expands on Konamiman's combination of dasblinkenlight and Eric Lippert's solutions:
public static IEnumerable<Tuple<T, int>> FlattenWithLevel<T>(
this IEnumerable<T> items,
Func<T, IEnumerable<T>> getChilds)
{
var stack = new Stack<Tuple<T, int>>();
foreach (var item in items)
stack.Push(new Tuple<T, int>(item, 1));
while (stack.Count > 0)
{
var current = stack.Pop();
yield return current;
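var children = getChilds(current.Item1);
// Leaf nodes may return null instead of an empty sequence.
if (children == null) continue;
foreach (var child in children)
stack.Push(new Tuple<T, int>(child, current.Item2 + 1));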
}
}

A completely different option is to use a proper OO design,
e.g. ask MyNode itself to return all of its nodes, flattened.
Like this:
class MyNode
{
public MyNode Parent;
public IEnumerable<MyNode> Elements;
int group = 1;
public IEnumerable<MyNode> GetAllNodes()
{
// Start with this node itself; without it the result would always be empty.
var self = new[] { this };
if (Elements == null)
{
return self;
}
return self.Concat(Elements.SelectMany(e => e.GetAllNodes()));
}
}
Now you could ask the top level MyNode to get all the nodes.
var flatten = topNode.GetAllNodes();
If you can't edit the class, then this isn't an option. Otherwise, I think this could be preferable to a separate (recursive) LINQ method.
This still uses LINQ, so I think this answer is applicable here ;)

Combining Dave's and Ivan Stoev's answer in case you need the level of nesting and the list flattened "in order" and not reversed like in the answer given by Konamiman.
public static class HierarchicalEnumerableUtils
{
private static IEnumerable<Tuple<T, int>> ToLeveled<T>(this IEnumerable<T> source, int level)
{
if (source == null)
{
return null;
}
else
{
return source.Select(item => new Tuple<T, int>(item, level));
}
}
public static IEnumerable<Tuple<T, int>> FlattenWithLevel<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> elementSelector)
{
var stack = new Stack<IEnumerator<Tuple<T, int>>>();
var leveledSource = source.ToLeveled(0);
var e = leveledSource.GetEnumerator();
try
{
while (true)
{
while (e.MoveNext())
{
var item = e.Current;
yield return item;
var elements = elementSelector(item.Item1).ToLeveled(item.Item2 + 1);
if (elements == null) continue;
stack.Push(e);
e = elements.GetEnumerator();
}
if (stack.Count == 0) break;
e.Dispose();
e = stack.Pop();
}
}
finally
{
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}
}

Here is a ready-to-use implementation that uses a Queue and returns the flattened tree with each node first, followed by its children.
public static IEnumerable<T> Flatten<T>(this IEnumerable<T> items,
Func<T,IEnumerable<T>> getChildren)
{
if (items == null)
yield break;
var queue = new Queue<T>();
foreach (var item in items) {
if (item == null)
continue;
queue.Enqueue(item);
while (queue.Count > 0) {
var current = queue.Dequeue();
yield return current;
if (current == null)
continue;
var children = getChildren(current);
if (children == null)
continue;
foreach (var child in children)
queue.Enqueue(child);
}
}
}

void Main()
{
var allNodes = GetTreeNodes().Flatten(x => x.Elements);
allNodes.Dump();
}
public static class ExtensionMethods
{
public static IEnumerable<T> Flatten<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> childrenSelector = null)
{
if (source == null)
{
return new List<T>();
}
var list = source;
if (childrenSelector != null)
{
foreach (var item in source)
{
list = list.Concat(childrenSelector(item).Flatten(childrenSelector));
}
}
return list;
}
}
IEnumerable<MyNode> GetTreeNodes() {
return new[] {
new MyNode { Elements = new[] { new MyNode() }},
new MyNode { Elements = new[] { new MyNode(), new MyNode(), new MyNode() }}
};
}
class MyNode
{
public MyNode Parent;
public IEnumerable<MyNode> Elements;
int group = 1;
}

Building on Konamiman's answer, and the comment that the ordering is unexpected, here's a version with an explicit sort param:
public static IEnumerable<T> TraverseAndFlatten<T, V>(this IEnumerable<T> items, Func<T, IEnumerable<T>> nested, Func<T, V> orderBy)
{
var stack = new Stack<T>();
foreach (var item in items.OrderBy(orderBy))
stack.Push(item);
while (stack.Count > 0)
{
var current = stack.Pop();
yield return current;
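var children = nested(current);
// Check for null before sorting; OrderBy throws on a null source.
if (children == null) continue;
// Note: the stack reverses the push order, so siblings are yielded in descending key order.
foreach (var child in children.OrderBy(orderBy))
stack.Push(child);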
}
}
And a sample usage:
var flattened = doc.TraverseAndFlatten(x => x.DependentDocuments, y => y.Document.DocDated).ToList();

Below is Ivan Stoev's code with the additional feature of reporting the index of every object along the path. E.g. search for "Item_120":
Item_0--Item_00
Item_01
Item_1--Item_10
Item_11
Item_12--Item_120
would return the item and an int array [1,2,0]. Obviously, nesting level is also available, as length of the array.
public static IEnumerable<(T, int[])> Expand<T>(this IEnumerable<T> source, Func<T, IEnumerable<T>> getChildren) {
var stack = new Stack<IEnumerator<T>>();
var e = source.GetEnumerator();
List<int> indexes = new List<int>() { -1 };
try {
while (true) {
while (e.MoveNext()) {
var item = e.Current;
indexes[stack.Count]++;
yield return (item, indexes.Take(stack.Count + 1).ToArray());
var elements = getChildren(item);
if (elements == null) continue;
stack.Push(e);
e = elements.GetEnumerator();
if (indexes.Count == stack.Count)
indexes.Add(-1);
}
if (stack.Count == 0) break;
e.Dispose();
indexes[stack.Count] = -1;
e = stack.Pop();
}
} finally {
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}

Every once in a while I try to scratch at this problem and devise my own solution that supports arbitrarily deep structures (no recursion), performs a depth-first (pre-order) traversal, and doesn't abuse too many LINQ queries or preemptively execute recursion on the children. After digging around in the .NET source and trying many solutions, I've finally come up with this solution. It ended up being very close to Ivan Stoev's answer (which I only saw just now); however, mine doesn't use infinite loops or have unusual code flow.
public static IEnumerable<T> Traverse<T>(
this IEnumerable<T> source,
Func<T, IEnumerable<T>> fnRecurse)
{
if (source != null)
{
Stack<IEnumerator<T>> enumerators = new Stack<IEnumerator<T>>();
try
{
enumerators.Push(source.GetEnumerator());
while (enumerators.Count > 0)
{
var top = enumerators.Peek();
while (top.MoveNext())
{
yield return top.Current;
var children = fnRecurse(top.Current);
if (children != null)
{
top = children.GetEnumerator();
enumerators.Push(top);
}
}
enumerators.Pop().Dispose();
}
}
finally
{
while (enumerators.Count > 0)
enumerators.Pop().Dispose();
}
}
}

Based on a previous answer, here is a pre-order flatten:
public static IEnumerable<T> Flatten<T>(
this IEnumerable<T> e
, Func<T, IEnumerable<T>> f
) => e.Concat(e.SelectMany(c => f(c).Flatten(f)));

Related

Find object index in binding list? [duplicate]

Given a datasource like that:
var c = new Car[]
{
new Car{ Color="Blue", Price=28000},
new Car{ Color="Red", Price=54000},
new Car{ Color="Pink", Price=9999},
// ..
};
How can I find the index of the first car satisfying a certain condition with LINQ?
EDIT:
I could think of something like this but it looks horrible:
int firstItem = someItems.Select((item, index) => new
{
ItemName = item.Color,
Position = index
}).Where(i => i.ItemName == "purple")
.First()
.Position;
Will it be the best to solve this with a plain old loop?
myCars.Select((v, i) => new {car = v, index = i}).First(myCondition).index;
or the slightly shorter
myCars.Select((car, index) => new {car, index}).First(myCondition).index;
or, shorter still, with C# 7.0 value tuples
myCars.Select((car, index) => (car, index)).First(myCondition).index;
Simply do:
int index = List.FindIndex(your condition);
E.g.
int index = cars.FindIndex(c => c.ID == 150);
An IEnumerable is not an ordered set.
Although most IEnumerables are ordered, some (such as Dictionary or HashSet) are not.
Therefore, LINQ does not have an IndexOf method.
However, you can write one yourself:
///<summary>Finds the index of the first item matching an expression in an enumerable.</summary>
///<param name="items">The enumerable to search.</param>
///<param name="predicate">The expression to test the items against.</param>
///<returns>The index of the first matching item, or -1 if no items match.</returns>
public static int FindIndex<T>(this IEnumerable<T> items, Func<T, bool> predicate) {
if (items == null) throw new ArgumentNullException("items");
if (predicate == null) throw new ArgumentNullException("predicate");
int retVal = 0;
foreach (var item in items) {
if (predicate(item)) return retVal;
retVal++;
}
return -1;
}
///<summary>Finds the index of the first occurrence of an item in an enumerable.</summary>
///<param name="items">The enumerable to search.</param>
///<param name="item">The item to find.</param>
///<returns>The index of the first matching item, or -1 if the item was not found.</returns>
public static int IndexOf<T>(this IEnumerable<T> items, T item) { return items.FindIndex(i => EqualityComparer<T>.Default.Equals(item, i)); }
myCars.TakeWhile(car => !myCondition(car)).Count();
It works! Think about it. The index of the first matching item equals the number of (non-matching) items before it.
Story time
I too dislike the horrible standard solution you already suggested in your question. Like the accepted answer I went for a plain old loop although with a slight modification:
public static int FindIndex<T>(this IEnumerable<T> items, Predicate<T> predicate) {
int index = 0;
foreach (var item in items) {
if (predicate(item)) break;
index++;
}
return index;
}
Note that it will return the number of items instead of -1 when there is no match. But let's ignore this minor annoyance for now. In fact the horrible standard solution crashes in that case and I consider returning an index that is out-of-bounds superior.
What happens now is that ReSharper tells me "Loop can be converted into LINQ-expression". While most of the time that feature worsens readability, this time the result was awe-inspiring. So kudos to JetBrains.
Analysis
Pros
Concise
Combinable with other LINQ
Avoids newing anonymous objects
Only evaluates the enumerable until the predicate matches for the first time
Therefore I consider it optimal in time and space while remaining readable.
Cons
Not quite obvious at first
Does not return -1 when there is no match
Of course you can always hide it behind an extension method. And what to do best when there is no match heavily depends on the context.
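One way to hide the TakeWhile trick behind an extension method, and to get -1 back when nothing matches, is sketched below. The name IndexOfFirst is made up for this example, and the found flag relies on a side effect inside the lambda, which is a matter of taste.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeWhileIndexExtensions
{
    // Hypothetical name; returns -1 when no item satisfies the predicate.
    public static int IndexOfFirst<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        // found records the outcome of the last predicate call, so after the
        // query has run it tells us whether TakeWhile stopped on a match.
        bool found = false;
        int skipped = source.TakeWhile(item => !(found = predicate(item))).Count();
        return found ? skipped : -1;
    }
}
```

For example, new[] { "a", "b", "c" }.IndexOfFirst(s => s == "c") gives 2, and a predicate that never matches gives -1. The source is still enumerated only once and only up to the first match.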
I will make my contribution here... why? Just because :p It's a different implementation, based on the Any LINQ extension and a delegate. Here it is:
public static class Extensions
{
public static int IndexOf<T>(
this IEnumerable<T> list,
Predicate<T> condition) {
int i = -1;
return list.Any(x => { i++; return condition(x); }) ? i : -1;
}
}
void Main()
{
TestGetsFirstItem();
TestGetsLastItem();
TestGetsMinusOneOnNotFound();
TestGetsMiddleItem();
TestGetsMinusOneOnEmptyList();
}
void TestGetsFirstItem()
{
// Arrange
var list = new string[] { "a", "b", "c", "d" };
// Act
int index = list.IndexOf(item => item.Equals("a"));
// Assert
if(index != 0)
{
throw new Exception("Index should be 0 but is: " + index);
}
"Test Successful".Dump();
}
void TestGetsLastItem()
{
// Arrange
var list = new string[] { "a", "b", "c", "d" };
// Act
int index = list.IndexOf(item => item.Equals("d"));
// Assert
if(index != 3)
{
throw new Exception("Index should be 3 but is: " + index);
}
"Test Successful".Dump();
}
void TestGetsMinusOneOnNotFound()
{
// Arrange
var list = new string[] { "a", "b", "c", "d" };
// Act
int index = list.IndexOf(item => item.Equals("e"));
// Assert
if(index != -1)
{
throw new Exception("Index should be -1 but is: " + index);
}
"Test Successful".Dump();
}
void TestGetsMinusOneOnEmptyList()
{
// Arrange
var list = new string[] { };
// Act
int index = list.IndexOf(item => item.Equals("e"));
// Assert
if(index != -1)
{
throw new Exception("Index should be -1 but is: " + index);
}
"Test Successful".Dump();
}
void TestGetsMiddleItem()
{
// Arrange
var list = new string[] { "a", "b", "c", "d", "e" };
// Act
int index = list.IndexOf(item => item.Equals("c"));
// Assert
if(index != 2)
{
throw new Exception("Index should be 2 but is: " + index);
}
"Test Successful".Dump();
}
Here is a little extension I just put together.
public static class PositionsExtension
{
    // Returns the index of the first match, or -1 if nothing matches.
    public static Int32 Position<TSource>(this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        return Positions<TSource>(source, predicate).DefaultIfEmpty(-1).First();
    }
    // Returns the indices of all items that satisfy the predicate.
    public static IEnumerable<Int32> Positions<TSource>(this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }
        if (predicate == null)
        {
            throw new ArgumentNullException(nameof(predicate));
        }
        // Dictionaries have no defined order, so a position is meaningless for them.
        if (source is System.Collections.IDictionary)
        {
            throw new NotSupportedException("Dictionaries aren't supported");
        }
        // Pair every item with its index and keep the indices where the predicate holds.
        return source.Select((item, index) => new { Item = item, Index = index })
            .Where(it => predicate(it.Item))
            .Select(it => it.Index);
    }
}
Then you can call it like this.
IEnumerable<Int32> indicesWhereConditionIsMet =
ListItems.Positions(item => item == this);
Int32 firstWelcomeMessage = ListItems.Position(msg =>
msg.WelcomeMessage.Contains("Hello"));
Here's an implementation of the highest-voted answer that returns -1 when the item is not found:
public static int FindIndex<T>(this IEnumerable<T> items, Func<T, bool> predicate)
{
var itemsWithIndices = items.Select((item, index) => new { Item = item, Index = index });
var matchingIndices =
from itemWithIndex in itemsWithIndices
where predicate(itemWithIndex.Item)
select (int?)itemWithIndex.Index;
return matchingIndices.FirstOrDefault() ?? -1;
}

c# faster n-ary cartesian product for

I have searched for a way to compute the Cartesian product of multiple lists, and I have used the popular answer that combines Aggregate and SelectMany. The trouble is that my example runs very slowly: I have 4 lists with 3K entries each, and I need to enumerate every possible combination.
Does anyone know a faster way in C#?
I made a fiddle here, which currently runs out of memory.
The following is the code from the fiddle:
public static void Main()
{
var sources = new[]
{
Enumerable.Range(1, 3000),
Enumerable.Range(1, 3000),
Enumerable.Range(1, 3000),
Enumerable.Range(1, 3000),
};
var sw = new System.Diagnostics.Stopwatch();
sw.Start();
Console.Write("linq way...");
foreach(var l in NCartesian(sources))
{
// just enumerate
}
Console.WriteLine("{0}ms", sw.ElapsedMilliseconds);
}
public static IEnumerable<IEnumerable<T>> NCartesian<T>(
IEnumerable<IEnumerable<T>> sequences)
{
if (sequences == null)
{
return null;
}
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>()
};
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) => accumulator.SelectMany(
accseq => sequence,
(accseq, item) => accseq.Concat(new[] { item })));
}
I made one which has less memory usage than the above, still slow though:
public static IEnumerable<IEnumerable<T>> NCartesian<T>(
IEnumerable<IEnumerable<T>> sequences)
{
if (sequences == null)
{
throw new ArgumentNullException(nameof(sequences));
}
var enumerators = new List<IEnumerator<T>>();
foreach (IEnumerator<T> enumerator in sequences
.Select(s => s.GetEnumerator()))
{
enumerator.MoveNext(); // move to the first position
enumerators.Add(enumerator);
}
bool done = false;
while (!done)
{
IList<T> result = enumerators.Select(e => e.Current).ToList();
yield return result;
for (int idx = enumerators.Count - 1; idx >= 0; idx--)
{
bool hasNext = enumerators[idx].MoveNext();
if (hasNext)
{
break;
}
if (idx == 0)
{
// the first enumerator is done
done = true;
break;
}
enumerators[idx].Reset();
enumerators[idx].MoveNext();
}
}
}
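One possible direction, offered as an unmeasured sketch rather than a definitive answer: copy each input sequence into an array once and advance an array of indices like an odometer. This also avoids calling Reset(), which enumerators produced by iterator blocks do not support (they throw NotSupportedException). The names CartesianSketch and NCartesianIndexed are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CartesianSketch
{
    public static IEnumerable<T[]> NCartesianIndexed<T>(
        IEnumerable<IEnumerable<T>> sequences)
    {
        if (sequences == null)
        {
            throw new ArgumentNullException(nameof(sequences));
        }

        // Materialize every sequence once so items can be read by index.
        T[][] pools = sequences.Select(s => s.ToArray()).ToArray();

        // An empty input sequence makes the whole product empty.
        if (pools.Any(p => p.Length == 0)) yield break;

        var indices = new int[pools.Length]; // all positions start at 0
        while (true)
        {
            var result = new T[pools.Length];
            for (int i = 0; i < pools.Length; i++)
                result[i] = pools[i][indices[i]];
            yield return result;

            // Advance like an odometer: bump the last position and carry
            // to the left whenever a position wraps around.
            int pos = pools.Length - 1;
            while (pos >= 0 && ++indices[pos] == pools[pos].Length)
            {
                indices[pos] = 0;
                pos--;
            }
            if (pos < 0) yield break; // the first position wrapped: done
        }
    }
}
```

Calling CartesianSketch.NCartesianIndexed<int>(new[] { new[] { 1, 2 }, new[] { 3, 4 } }) yields [1,3], [1,4], [2,3], [2,4]. Note that four lists of 3000 entries produce 3000^4 = 8.1 × 10^13 combinations, so any approach that enumerates all of them will take a long time; this sketch only reduces the per-item overhead.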

How to select last value from each run of similar items?

I have a list. I'd like to take the last value from each run of similar elements.
What do I mean? Let me give a simple example. Given the list of words
['golf', 'hip', 'hop', 'hotel', 'grass', 'world', 'wee']
And the similarity function 'starting with the same letter', the function would return the shorter list
['golf', 'hotel', 'grass', 'wee']
Why? The original list has a 1-run of G words, a 3-run of H words, a 1-run of G words, and a 2-run of W words. The function returns the last word from each run.
How can I do this?
Hypothetical C# syntax (in reality I'm working with customer objects but I wanted to share something you could run and test yourself)
> var words = new List<string>{"golf", "hip", "hop", "hotel", "grass", "world", "wee"};
> words.LastDistinct(x => x[0])
["golf", "hotel", "grass", "wee"]
Edit: I tried .GroupBy(x => x[0]).Select(g => g.Last()) but that gives ['grass',
'hotel', 'wee'] which is not what I want. Read the example carefully.
Edit. Another example.
['apples', 'armies', 'black', 'beer', 'bastion', 'cat', 'cart', 'able', 'art', 'bark']
Here there are 5 runs (a run of A's, a run of B's, a run of C's, a new run of A's, a new run of B's). The last word from each run would be:
['armies', 'bastion', 'cart', 'art', 'bark']
The important thing to understand is that each run is independent. Don't mix-up the run of A's at the start with the run of A's near the end.
There's nothing too complicated with just doing it the old-fashioned way:
Func<string, object> groupingFunction = s => s.Substring(0, 1);
IEnumerable<string> input = new List<string>() {"golf", "hip", "..." };
var output = new List<string>();
if (!input.Any())
{
return output;
}
var lastItem = input.First();
var lastKey = groupingFunction(lastItem);
foreach (var currentItem in input.Skip(1))
{
var currentKey = groupingFunction(currentItem);
if (!currentKey.Equals(lastKey))
{
output.Add(lastItem);
}
lastKey = currentKey;
lastItem = currentItem;
}
output.Add(lastItem);
You could also turn this into a generic extension method as Tim Schmelter has done; I have already taken a couple steps to generalize the code on purpose (using object as the key type and IEnumerable<T> as the input type).
You could use this extension that can group by adjacent/consecutive elements:
public static IEnumerable<IGrouping<TKey, TSource>> GroupAdjacent<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
TKey last = default(TKey);
bool haveLast = false;
List<TSource> list = new List<TSource>();
foreach (TSource s in source)
{
TKey k = keySelector(s);
if (haveLast)
{
if (!k.Equals(last))
{
yield return new GroupOfAdjacent<TSource, TKey>(list, last);
list = new List<TSource>();
list.Add(s);
last = k;
}
else
{
list.Add(s);
last = k;
}
}
else
{
list.Add(s);
last = k;
haveLast = true;
}
}
if (haveLast)
yield return new GroupOfAdjacent<TSource, TKey>(list, last);
}
public class GroupOfAdjacent<TSource, TKey> : IEnumerable<TSource>, IGrouping<TKey, TSource>
{
public TKey Key { get; set; }
private List<TSource> GroupList { get; set; }
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return ((System.Collections.Generic.IEnumerable<TSource>)this).GetEnumerator();
}
System.Collections.Generic.IEnumerator<TSource> System.Collections.Generic.IEnumerable<TSource>.GetEnumerator()
{
foreach (var s in GroupList)
yield return s;
}
public GroupOfAdjacent(List<TSource> source, TKey key)
{
GroupList = source;
Key = key;
}
}
Then it's easy:
var words = new List<string>{"golf", "hip", "hop", "hotel", "grass", "world", "wee"};
IEnumerable<string> lastWordOfConsecutiveFirstCharGroups = words
.GroupAdjacent(str => str[0])
.Select(g => g.Last());
Output:
string.Join(",", lastWordOfConsecutiveFirstCharGroups); // golf,hotel,grass,wee
Your other sample:
words=new List<string>{"apples", "armies", "black", "beer", "bastion", "cat", "cart", "able", "art", "bark"};
lastWordOfConsecutiveFirstCharGroups = words
.GroupAdjacent(str => str[0])
.Select(g => g.Last());
Output:
string.Join(",", lastWordOfConsecutiveFirstCharGroups); // armies,bastion,cart,art,bark
Demonstration
Try this algorithm:
var words = new List<string> { "golf", "hip", "hop", "hotel", "grass", "world", "wee" };
var newList = new List<string>();
int i = 0;
while (i < words.Count)
{
    // Advance j to the last word of the run that starts at i.
    var j = i;
    while (j < words.Count - 1 && words[j][0] == words[j + 1][0])
    {
        j++;
    }
    newList.Add(words[j]);
    // The next run starts right after the end of this one.
    i = j + 1;
}
You can use the following extension method to split your sequence into groups (i.e. subsequences) by some condition:
public static IEnumerable<IEnumerable<T>> Split<T, TKey>(
this IEnumerable<T> source, Func<T, TKey> keySelector)
{
var group = new List<T>();
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
else
{
TKey currentKey = keySelector(iterator.Current);
var keyComparer = Comparer<TKey>.Default;
group.Add(iterator.Current);
while (iterator.MoveNext())
{
var key = keySelector(iterator.Current);
if (keyComparer.Compare(currentKey, key) != 0)
{
yield return group;
currentKey = key;
group = new List<T>();
}
group.Add(iterator.Current);
}
}
}
if (group.Any())
yield return group;
}
And getting your expected results looks like:
string[] words = { "golf", "hip", "hop", "hotel", "grass", "world", "wee" };
var result = words.Split(w => w[0])
.Select(g => g.Last());
Result:
golf
hotel
grass
wee
Because your input is a List<>, I think this should work for you with acceptable performance, and it is very concise:
var result = words.Where((x, i) => i == words.Count - 1 ||
words[i][0] != words[i + 1][0]);
You can append ToList() on the result to get a List<string> if you want.
I went with
/// <summary>
/// Given a list, return the last value from each run of similar items.
/// </summary>
public static IEnumerable<T> WithoutDuplicates<T>(this IEnumerable<T> source, Func<T, T, bool> similar)
{
Contract.Requires(source != null);
Contract.Requires(similar != null);
Contract.Ensures(Contract.Result<IEnumerable<T>>().Count() <= source.Count(), "Result should be at most as long as original list");
T last = default(T);
bool first = true;
foreach (var item in source)
{
if (!first && !similar(item, last))
yield return last;
last = item;
first = false;
}
if (!first)
yield return last;
}
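A usage sketch for the two examples from the question follows. To keep the block self-contained it repeats the method without the Code Contracts calls; RunExtensions, RunDemo and LastOfRuns are hypothetical names introduced for the example.

```csharp
using System;
using System.Collections.Generic;

public static class RunExtensions
{
    // Same logic as above, minus the Contract.* calls.
    public static IEnumerable<T> WithoutDuplicates<T>(
        this IEnumerable<T> source, Func<T, T, bool> similar)
    {
        T last = default(T);
        bool first = true;
        foreach (var item in source)
        {
            // A new run starts here, so the previous item closed the old run.
            if (!first && !similar(item, last))
                yield return last;
            last = item;
            first = false;
        }
        // The final item always closes the final run.
        if (!first)
            yield return last;
    }
}

public static class RunDemo
{
    // Similarity function from the question: same first letter.
    public static string LastOfRuns(params string[] words) =>
        string.Join(",", words.WithoutDuplicates((a, b) => a[0] == b[0]));
}
```

RunDemo.LastOfRuns("golf", "hip", "hop", "hotel", "grass", "world", "wee") gives golf,hotel,grass,wee, and the second example list gives armies,bastion,cart,art,bark, so the two runs of A words stay independent.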

finding the count in two lists

I have two lists which I am getting from database as follow:
List<myobject1> frstList = ClientManager.Get_FirstList( PostCode.Text, PhoneNumber.Text);
List<myobject2> secondList = new List<myobject2>();
foreach (var c in frstList )
{
secondList.Add( ClaimManager.GetSecondList(c.ID));
}
Now my lists contain data like this:
frstList: id = 1, id = 2
secondList: id = 1 parentid = 1, id = 2 parentid = 1, and id = 3 parentid = 2
I want to count the children of each parent and return the parent that has the most. In the example above it should return id = 1 from frstList, and id = 1 and id = 2 from secondList.
I tried this, but it is not working:
var numbers = (from c in frstList where c.Parent.ID == secondList.Select(cl=> cl.ID) select c).Count();
Can someone please help me do this, either in LINQ or with a normal foreach?
Thanks
Looking at the question it appears that what you want is to determine which of the parent nodes has the most children, and you want the output to be that parent node along with all of its child nodes.
The query is fairly straightforward:
var largestGroup = secondList.GroupBy(item => item.ParentID)
.MaxBy(group => group.Count());
var mostFrequentParent = largestGroup.Key;
var childrenOfMostFrequentParent = largestGroup.AsEnumerable();
We'll just need this helper function, MaxBy:
public static TSource MaxBy<TSource, TKey>(this IEnumerable<TSource> source
, Func<TSource, TKey> selector
, IComparer<TKey> comparer = null)
{
if (comparer == null)
{
comparer = Comparer<TKey>.Default;
}
using (IEnumerator<TSource> iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
throw new ArgumentException("Source was empty");
}
TSource maxItem = iterator.Current;
TKey maxValue = selector(maxItem);
while (iterator.MoveNext())
{
TKey nextValue = selector(iterator.Current);
if (comparer.Compare(nextValue, maxValue) > 0)
{
maxValue = nextValue;
maxItem = iterator.Current;
}
}
return maxItem;
}
}
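To make the grouping concrete, here is a minimal sketch run against the sample data from the question. The Child record is a hypothetical stand-in for myobject2, and the sketch assumes .NET 6 or later, where Enumerable.MaxBy is built in, so the helper above is not needed there.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for myobject2; only ID and ParentID matter here.
public record Child(int ID, int ParentID);

public static class LargestGroupDemo
{
    // Returns the parent ID with the most children, plus the IDs of those children.
    public static (int ParentId, List<int> ChildIds) LargestGroup(IEnumerable<Child> children)
    {
        var largest = children
            .GroupBy(c => c.ParentID)
            .MaxBy(g => g.Count()); // built-in MaxBy, assumed .NET 6+

        // The built-in MaxBy returns null for an empty sequence of reference types.
        if (largest is null)
            throw new InvalidOperationException("No children to group.");

        return (largest.Key, largest.Select(c => c.ID).ToList());
    }

    public static void Main()
    {
        // Sample data from the question: ids 1 and 2 belong to parent 1, id 3 to parent 2.
        var secondList = new List<Child> { new(1, 1), new(2, 1), new(3, 2) };
        var (parentId, childIds) = LargestGroup(secondList);
        Console.WriteLine($"Parent {parentId} has children {string.Join(", ", childIds)}");
    }
}
```

If two parents tie for the largest count, this returns whichever group is encountered first.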

Get previous and next item in a IEnumerable using LINQ

I have an IEnumerable of a custom type. (That I've gotten from a SelectMany)
I also have an item (myItem) in that IEnumerable, and I want the items immediately before and after it.
Currently, I get the previous item like this:
var previousItem = myIEnumerable.Reverse().SkipWhile(
i => i.UniqueObjectID != myItem.UniqueObjectID).Skip(1).FirstOrDefault();
I can get the next item by simply omitting the .Reverse().
or, I could:
int index = myIEnumerable.ToList().FindIndex(
i => i.UniqueObjectID == myItem.UniqueObjectID);
and then use .ElementAt(index +/- 1) to get the previous or next item.
Which is better between the two options?
Is there an even better option available?
"Better" includes a combination of performance (memory and speed) and readability; with readability being my primary concern.
First off
"Better" includes a combination of performance (memory and speed)
In general you can't have both. The rule of thumb is that optimising for speed costs memory, and optimising for memory costs speed.
There is a better option, that performs well on both memory and speed fronts, and can be used in a readable manner (I'm not delighted with the function name, however, FindItemReturningPreviousItemFoundItemAndNextItem is a bit of a mouthful).
So, it looks like it's time for a custom find extension method, something like . . .
public static IEnumerable<T> FindSandwichedItem<T>(this IEnumerable<T> items, Predicate<T> matchFilling)
{
if (items == null)
throw new ArgumentNullException("items");
if (matchFilling == null)
throw new ArgumentNullException("matchFilling");
return FindSandwichedItemImpl(items, matchFilling);
}
private static IEnumerable<T> FindSandwichedItemImpl<T>(IEnumerable<T> items, Predicate<T> matchFilling)
{
using(var iter = items.GetEnumerator())
{
T previous = default(T);
while(iter.MoveNext())
{
if(matchFilling(iter.Current))
{
yield return previous;
yield return iter.Current;
if (iter.MoveNext())
yield return iter.Current;
else
yield return default(T);
yield break;
}
previous = iter.Current;
}
}
// If we get here nothing has been found so return three default values
yield return default(T); // Previous
yield return default(T); // Current
yield return default(T); // Next
}
You can cache the result of this to a list if you need to refer to the items more than once, but it returns the found item, preceded by the previous item, followed by the following item. e.g.
var sandwichedItems = myIEnumerable.FindSandwichedItem(item => item.objectId == "MyObjectId").ToList();
var previousItem = sandwichedItems[0];
var myItem = sandwichedItems[1];
var nextItem = sandwichedItems[2];
The defaults to return if it's the first or last item may need to change depending on your requirements.
Hope this helps.
For readability, I'd load the IEnumerable into a linked list:
var e = Enumerable.Range(0,100);
var itemIKnow = 50;
var linkedList = new LinkedList<int>(e);
var listNode = linkedList.Find(itemIKnow);
var next = listNode.Next.Value; //probably a good idea to check for null
var prev = listNode.Previous.Value; //ditto
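The null checks mentioned in the comments could look something like the sketch below. The class and method names are made up for illustration; the only library behaviour relied on is that LinkedList<T>.Find returns null when the value is absent, and that Next and Previous are null at the ends of the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkedListNeighbours
{
    // Returns the neighbours of the first node equal to `item`.
    // Either side is null when there is no such neighbour (or no such item).
    public static (int? Previous, int? Next) Around(IEnumerable<int> source, int item)
    {
        var linkedList = new LinkedList<int>(source);
        var node = linkedList.Find(item); // null when the item is not in the list
        // ?. short-circuits to null instead of throwing NullReferenceException.
        return (node?.Previous?.Value, node?.Next?.Value);
    }

    public static void Main()
    {
        var e = Enumerable.Range(0, 100);
        var (prev, next) = Around(e, 50);
        Console.WriteLine($"prev={prev}, next={next}");

        // First element: there is no previous node, so Previous is null.
        var edge = Around(e, 0);
        Console.WriteLine(edge.Previous.HasValue ? "has previous" : "no previous");
    }
}
```

Note that copying into a LinkedList still enumerates and stores the whole sequence, so this trades memory for readability.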
By creating an extension method for establishing context to the current element you can use a Linq query like this:
var result = myIEnumerable.WithContext()
.Single(i => i.Current.UniqueObjectID == myItem.UniqueObjectID);
var previous = result.Previous;
var next = result.Next;
The extension would be something like this:
public class ElementWithContext<T>
{
public T Previous { get; private set; }
public T Next { get; private set; }
public T Current { get; private set; }
public ElementWithContext(T current, T previous, T next)
{
Current = current;
Previous = previous;
Next = next;
}
}
public static class LinqExtensions
{
public static IEnumerable<ElementWithContext<T>>
WithContext<T>(this IEnumerable<T> source)
{
T previous = default(T);
T current = source.FirstOrDefault();
// Concat rather than Union: Union would silently drop duplicate elements.
foreach (T next in source.Concat(new[] { default(T) }).Skip(1))
{
yield return new ElementWithContext<T>(current, previous, next);
previous = current;
current = next;
}
}
}
You could cache the enumerable in a list
var myList = myIEnumerable.ToList();
iterate over it by index
for (int i = 0; i < myList.Count; i++)
then the current element is myList[i], the previous element is myList[i-1], and the next element is myList[i+1]
(Don't forget about the special cases of the first and last elements in the list.)
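One way to handle those special cases is sketched below. TryGetNeighbours is a hypothetical helper name, and returning default(T) at the edges is an assumption; you may prefer to wrap around or throw instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexNeighbours
{
    // Finds the first element satisfying `match` and reports its neighbours.
    // previous/next are left at default(T) when the match is the first/last element.
    public static bool TryGetNeighbours<T>(
        IEnumerable<T> source, Func<T, bool> match, out T previous, out T next)
    {
        var list = source.ToList(); // cache once so the sequence can be indexed
        previous = default(T);
        next = default(T);
        for (int i = 0; i < list.Count; i++)
        {
            if (!match(list[i])) continue;
            if (i > 0) previous = list[i - 1];          // first element has no previous
            if (i < list.Count - 1) next = list[i + 1]; // last element has no next
            return true;
        }
        return false; // no element matched
    }

    public static void Main()
    {
        var items = new[] { "nought", "one", "two", "three", "four" };
        if (TryGetNeighbours(items, s => s == "two", out var prev, out var next))
            Console.WriteLine($"{prev} < two < {next}");
    }
}
```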
You are really overcomplicating things:
Sometimes a plain for loop is the better tool, and I think it gives a clearer implementation of what you are trying to do:
var myList = myIEnumerable.ToList();
for (int i = 0; i < myList.Count; i++)
{
if (myList[i].UniqueObjectID == myItem.UniqueObjectID)
{
// Wrap around at both ends; adding Count keeps the index non-negative.
previousItem = myList[(i - 1 + myList.Count) % myList.Count];
nextItem = myList[(i + 1) % myList.Count];
break; // the ID is unique, so stop at the first match
}
}
Here is a LINQ extension method that returns the current item, along with the previous and the next. It yields ValueTuple<T, T, T> values to avoid allocations. The source is enumerated once.
/// <summary>
/// Projects each element of a sequence into a tuple that includes the previous
/// and the next element.
/// </summary>
public static IEnumerable<(T Previous, T Current, T Next)> WithPreviousAndNext<T>(
this IEnumerable<T> source, T firstPrevious = default, T lastNext = default)
{
ArgumentNullException.ThrowIfNull(source);
(T Previous, T Current, bool HasPrevious) queue = (default, firstPrevious, false);
foreach (var item in source)
{
if (queue.HasPrevious)
yield return (queue.Previous, queue.Current, item);
queue = (queue.Current, item, true);
}
if (queue.HasPrevious)
yield return (queue.Previous, queue.Current, lastNext);
}
Usage example:
var source = Enumerable.Range(1, 5);
Console.WriteLine($"Source: {String.Join(", ", source)}");
var result = source.WithPreviousAndNext(firstPrevious: -1, lastNext: -1);
Console.WriteLine($"Result: {String.Join(", ", result)}");
Output:
Source: 1, 2, 3, 4, 5
Result: (-1, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, -1)
To get the previous and the next of a specific item, you could use tuple deconstruction:
var (previous, current, next) = myIEnumerable
.WithPreviousAndNext()
.First(e => e.Current.UniqueObjectID == myItem.UniqueObjectID);
CPU
It depends entirely on where the object is in the sequence. If it is located near the end, I would expect the second approach to be faster by more than a factor of 2 (but only by a constant factor). If it is located near the beginning, the first will be faster because you don't traverse the whole list.
Memory
The first is iterating the sequence without saving the sequence so the memory hit will be very small. The second solution will take as much memory as the length of the list * references + objects + overhead.
I thought I would try to answer this using Zip from Linq.
string[] items = {"nought","one","two","three","four"};
var item = items[2];
var sandwiched =
items
.Zip( items.Skip(1), (previous,current) => new { previous, current } )
.Zip( items.Skip(2), (pair,next) => new { pair.previous, pair.current, next } )
.FirstOrDefault( triplet => triplet.current == item );
This will return an anonymous type {previous,current,next}.
Unfortunately this will only work for indexes 1,2 and 3.
string[] items = {"nought","one","two","three","four"};
var item = items[4];
var pad1 = Enumerable.Repeat( "", 1 );
var pad2 = Enumerable.Repeat( "", 2 );
var padded = pad1.Concat( items );
var next1 = items.Concat( pad1 );
var next2 = items.Skip(1).Concat( pad2 );
var sandwiched =
padded
.Zip( next1, (previous,current) => new { previous, current } )
.Zip( next2, (pair,next) => new { pair.previous, pair.current, next } )
.FirstOrDefault( triplet => triplet.current == item );
This version will work for all indexes.
Both versions use lazy evaluation courtesy of Linq.
Here are some extension methods as promised. The names are generic, so the methods are reusable with any type, and there are lookup overloads that find the item whose previous or next neighbour you need. I would benchmark the solutions and then see where you could squeeze cycles out.
public static class ExtensionMethods
{
public static T Previous<T>(this List<T> list, T item) {
var index = list.IndexOf(item) - 1;
return index > -1 ? list[index] : default(T);
}
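public static T Next<T>(this List<T> list, T item) {
// IndexOf returns -1 for a missing item; guard so we don't return list[0] by accident.
var found = list.IndexOf(item);
return found > -1 && found + 1 < list.Count ? list[found + 1] : default(T);
}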
public static T Previous<T>(this List<T> list, Func<T, Boolean> lookup) {
var item = list.SingleOrDefault(lookup);
var index = list.IndexOf(item) - 1;
return index > -1 ? list[index] : default(T);
}
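public static T Next<T>(this List<T> list, Func<T,Boolean> lookup) {
var item = list.SingleOrDefault(lookup);
// IndexOf returns -1 for a missing item; guard so we don't return list[0] by accident.
var found = list.IndexOf(item);
return found > -1 && found + 1 < list.Count ? list[found + 1] : default(T);
}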
public static T PreviousOrFirst<T>(this List<T> list, T item) {
if(list.Count() < 1)
throw new Exception("No array items!");
var previous = list.Previous(item);
return previous == null ? list.First() : previous;
}
public static T NextOrLast<T>(this List<T> list, T item) {
if(list.Count() < 1)
throw new Exception("No array items!");
var next = list.Next(item);
return next == null ? list.Last() : next;
}
public static T PreviousOrFirst<T>(this List<T> list, Func<T,Boolean> lookup) {
if(list.Count() < 1)
throw new Exception("No array items!");
var previous = list.Previous(lookup);
return previous == null ? list.First() : previous;
}
public static T NextOrLast<T>(this List<T> list, Func<T,Boolean> lookup) {
if(list.Count() < 1)
throw new Exception("No array items!");
var next = list.Next(lookup);
return next == null ? list.Last() : next;
}
}
And you can use them like this.
var previous = list.Previous(obj);
var next = list.Next(obj);
var previousWithLookup = list.Previous((o) => o.LookupProperty == otherObj.LookupProperty);
var nextWithLookup = list.Next((o) => o.LookupProperty == otherObj.LookupProperty);
var previousOrFirst = list.PreviousOrFirst(obj);
var nextOrLast = list.NextOrLast(obj);
var previousOrFirstWithLookup = list.PreviousOrFirst((o) => o.LookupProperty == otherObj.LookupProperty);
var nextOrLastWithLookup = list.NextOrLast((o) => o.LookupProperty == otherObj.LookupProperty);
I use the following technique:
var items = new[] { "Bob", "Jon", "Zac" };
var sandwiches = items
.Sandwich()
.ToList();
Which produces these three tuples:
(null, Bob, Jon), (Bob, Jon, Zac), (Jon, Zac, null)
Notice that there are nulls for the first Previous value, and the last Next value.
It uses the following extension method:
public static IEnumerable<(T Previous, T Current, T Next)> Sandwich<T>(this IEnumerable<T> source, T beforeFirst = default, T afterLast = default)
{
var sourceList = source.ToList();
T previous = beforeFirst;
T current = sourceList.FirstOrDefault();
foreach (var next in sourceList.Skip(1))
{
yield return (previous, current, next);
previous = current;
current = next;
}
yield return (previous, current, afterLast);
}
If you need it for every element in myIEnumerable, I'd just iterate through it while keeping references to the two previous elements. In the body of the loop I'd process the more recent of those two: the current element is its successor and the older one is its predecessor.
If you need it for only one element I'd choose your first approach.
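The every-element loop described above might be sketched as follows. ForEachWithNeighbours and its callback shape are invented for illustration, and default(T) stands in for the missing neighbour at either end, which is an assumption rather than a requirement.

```csharp
using System;
using System.Collections.Generic;

public static class NeighbourLoop
{
    // Calls process(previous, item, next) once per element, enumerating the source once.
    public static void ForEachWithNeighbours<T>(IEnumerable<T> source, Action<T, T, T> process)
    {
        T older = default(T);  // predecessor of the element being processed
        T middle = default(T); // element being processed
        int seen = 0;
        foreach (var current in source)
        {
            // `middle` now has a known successor, so it can be processed.
            if (seen > 0) process(older, middle, current);
            older = middle;
            middle = current;
            seen++;
        }
        // The last element has no successor.
        if (seen > 0) process(older, middle, default(T));
    }

    public static void Main()
    {
        ForEachWithNeighbours(new[] { "Bob", "Jon", "Zac" },
            (prev, item, next) => Console.WriteLine($"[{prev}] {item} [{next}]"));
    }
}
```

Processing lags one element behind the enumeration, which is why the final element is handled after the loop ends.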
