SelectMany to flatten a nested structure

I am parsing an XML structure and my classes look like the following:
class MyXml
{
//...
List<Node> Content { get; set; }
//...
}
class Node
{
// ...
public List<Node> Nodes { get; set; }
public string Type { get; set; }
//...
}
MyXml represents the XML file I am parsing, whose elements are all called <node>. Each node has a type attribute, which can have different values.
The type of the node is not connected to its depth. I can have any node type at any depth level.
I can parse the structure correctly, so I get a MyXml object whose content is a list of Nodes, where every node in the list can have subnodes and so on (I used recursion for that).
What I need to do is flatten this whole structure and extract only the nodes of a certain type.
I tried with:
var query = MyXml.Content.SelectMany(n => n.Nodes);
but it's taking only the nodes with a structure depth of 1. I would like to grab every node, regardless of depth, in the same collection and then filter what I need.

This is a naturally recursive problem. Using a recursive lambda, try something like:
Func<Node, IEnumerable<Node>> flattener = null;
flattener = n => new[] { n }
.Concat(n.Nodes == null
? Enumerable.Empty<Node>()
: n.Nodes.SelectMany(flattener));
Note that when you make a recursive Func like this, you must declare the Func separately first, and set it to null.
You could also flatten the list using an iterator-block method:
public static IEnumerable<Node> Flatten(Node node)
{
yield return node;
if (node.Nodes != null)
{
foreach(var child in node.Nodes)
foreach(var descendant in Flatten(child))
yield return descendant;
}
}
Either way, once the tree is flattened you can do simple Linq queries over the flattened list to find nodes:
flattener(node).Where(n => n.Type == myType);
Response adapted from: https://stackoverflow.com/a/17086572/1480391
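The flattener above works on a single node, while the question starts from a list of top-level nodes. A minimal sketch of how the two could be combined follows; the names NodeExtensions, Flatten and OfNodeType are made up for illustration and are not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Node
{
    public List<Node> Nodes { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper class, named here only for illustration.
public static class NodeExtensions
{
    // Yields the node itself, then all of its descendants, depth-first.
    public static IEnumerable<Node> Flatten(this Node node)
    {
        yield return node;
        if (node.Nodes == null) yield break;
        foreach (var descendant in node.Nodes.SelectMany(n => n.Flatten()))
            yield return descendant;
    }

    // Flattens every top-level node and keeps only nodes of the requested type.
    public static List<Node> OfNodeType(this IEnumerable<Node> roots, string type)
    {
        return roots.SelectMany(n => n.Flatten())
                    .Where(n => n.Type == type)
                    .ToList();
    }
}
```

With this in place, something like myXml.Content.OfNodeType("someType") should return the matching nodes from any depth, assuming Content is accessible from the calling code.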

You should implement a method Node.GetFlattened, which returns the node itself and then calls itself on all subnodes:
public IEnumerable<Node> GetFlattened()
{
yield return this;
foreach (var node in this.Nodes.SelectMany(n => n.GetFlattened()))
yield return node;
}
You can then call this method, and it recursively returns all nodes regardless of their depth. This is a depth-first traversal; if you want a breadth-first traversal, you will have to use another approach.
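For the breadth-first case, one common approach is to use a queue instead of recursion. The following is only a sketch under the assumption that Node looks like the class in the question; TreeTraversal and BreadthFirst are illustrative names.

```csharp
using System.Collections.Generic;

public class Node
{
    public List<Node> Nodes { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper class, named here only for illustration.
public static class TreeTraversal
{
    // Visits all roots first, then their children, then grandchildren, and so on.
    public static IEnumerable<Node> BreadthFirst(IEnumerable<Node> roots)
    {
        var queue = new Queue<Node>(roots);
        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            yield return current;

            if (current.Nodes == null) continue;   // leaf without a child list
            foreach (var child in current.Nodes)
                queue.Enqueue(child);
        }
    }
}
```

Because the pending nodes live in a heap-allocated queue, this version also does not grow the call stack on deep trees.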

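class MyXml
{
    // Collects every node in the tree, depth-first.
    public List<Node> AllNodes()
    {
        List<Node> allNodes = new List<Node>();
        foreach (var node in Content)
            AddNode(node, allNodes);
        return allNodes;
    }
    // Adds the node itself, then recurses into its children.
    public void AddNode(Node node, List<Node> nodes)
    {
        nodes.Add(node);
        if (node.Nodes == null) return;
        foreach (var childNode in node.Nodes)
            AddNode(childNode, nodes);
    }
    // Type is a string on the Node class shown in the question.
    public List<Node> AllNodesOfType(string nodeType)
    {
        return AllNodes().Where(n => n.Type == nodeType).ToList();
    }
}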
First flatten the list with a function and query on that.

Related

Is there a way to simplify this reverse-enumerator?

We have a case where we have a hierarchical tree structure and we need to get the 'branch' of any particular node. The structure is a one-way linked-list from the child to the parent, but we want to define the branch in the direction from parent to child.
Here's an over-simplified example of the implementation we came up with. Just wondering if there's a better/more effective way to achieve this given those constraints. Just feels verbose to me to do it this way.
#nullable enable
public class Node {
public Node(String name, Node? parent)
=> (Name, Parent) = (name, parent);
public string Name { get; set; }
public Node? Parent { get; init; }
IEnumerable<Node> GetBranch(){
static IEnumerable<Node> getBranchReversed(Node? node) {
while (node is not null) {
yield return node;
node = node.Parent;
}
}
return getBranchReversed(this).Reverse();
}
}
Only other way I can think of is to accumulate into a list where I insert into the first position, then just return the list (typing this from my memory so it may not compile...)
ReadOnlyCollection<Node> GetBranch(){
Node? node = this;
var branch = new List<Node>();
while (node is not null) {
branch.Insert(0, node);
node = node.Parent;
}
return branch.AsReadOnly();
}
Again, just wondering if there's any other way to achieve this.

To summarize the comments
Enumerable.Reverse does more or less the following:
var buffer = new List<T>(source);
for (int i = buffer.Count - 1; i >= 0; --i)
yield return buffer[i];
Note that it does this buffering every time the result is enumerated. It has to, since IEnumerable is lazy. If you want to keep the lazy behavior, this is about as good as you can do: whatever solution you use, you will need a buffer.
If you do not want the lazy behavior I would probably do the buffering myself:
var branch = new List<Node>();
while (node is not null) {
branch.Add(node);
node = node.Parent;
}
branch.Reverse();
return branch;
This should be better than your example since Insert will need to move items around for each insert causing quadratic scaling, while .Reverse is linear.
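Another way to get the same linear behavior is to buffer into a Stack<T>, which enumerates from the most recently pushed item downwards, so no explicit Reverse call is needed. This is a sketch rather than a recommendation; the Node class is a simplified copy of the one in the question.

```csharp
#nullable enable
using System.Collections.Generic;

public class Node
{
    public Node(string name, Node? parent)
    {
        Name = name;
        Parent = parent;
    }

    public string Name { get; }
    public Node? Parent { get; }

    // Returns the path from the root down to this node.
    public IReadOnlyCollection<Node> GetBranch()
    {
        var branch = new Stack<Node>();
        for (Node? node = this; node is not null; node = node.Parent)
            branch.Push(node);      // the root is pushed last, so it ends up on top

        return branch;              // Stack<T> enumerates from the top down
    }
}
```

Each Push is amortized constant time, so the whole walk stays linear in the depth of the node.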

Function which will return particular node from tree structure

I am writing a function that returns a particular node from a tree structure. When I search the tree using LINQ, it searches the first branch, and when it finally reaches a leaf it throws a NullReferenceException because a leaf doesn't have any children.
Here is my class,
public class Node
{
public int Id { get; set; }
public string Name { get; set; }
public string Content { get; set; }
public IEnumerable<Node> Children { get; set; }
public IEnumerable<Node> GetNodeAndDescendants() // Note that this method is lazy
{
return new[] { this }
.Concat(Children.SelectMany(child => child.GetNodeAndDescendants()));
}
}
This is how I am calling this function,
var foundNode = Location.GetNodeAndDescendants().FirstOrDefault(node => node.Name.Contains("string to search"));
OR
var foundNode = Location.GetNodeAndDescendants().FirstOrDefault(node => node.Id==123)
What would be the correct way to do this? Any sample code would be appreciated.

There is nothing wrong with writing your own function, but an implementation based on LINQ or a recursive iterator is not a good idea for performance reasons. But why depend on external libraries? That means a lot of code you don't need, implementing interfaces, modifying your classes, and so on. It's not hard to write a generic function for pre-order tree traversal and use it for any tree structure. Here is a modified version of my answer to How to flatten tree via LINQ? (nothing special, an ordinary iterative implementation):
public static class TreeHelper
{
public static IEnumerable<T> PreOrderTraversal<T>(T node, Func<T, IEnumerable<T>> childrenSelector)
{
var stack = new Stack<IEnumerator<T>>();
var e = Enumerable.Repeat(node, 1).GetEnumerator();
try
{
while (true)
{
while (e.MoveNext())
{
var item = e.Current;
yield return item;
var children = childrenSelector(item);
if (children == null) continue;
stack.Push(e);
e = children.GetEnumerator();
}
if (stack.Count == 0) break;
e.Dispose();
e = stack.Pop();
}
}
finally
{
e.Dispose();
while (stack.Count != 0) stack.Pop().Dispose();
}
}
}
and your function inside the class Node becomes:
public IEnumerable<Node> GetNodeAndDescendants() // Note that this method is lazy
{
return TreeHelper.PreOrderTraversal(this, node => node.Children);
}
Everything else stays the way you did it and should work w/o any problem.
EDIT: Looks like you need something like this:
public interface IContainer
{
// ...
}
public class CustomerNodeInstance : IContainer
{
// ...
}
public class ProductNodeInstance : IContainer
{
// ...
}
public class Node : IContainer
{
// ...
public IEnumerable<IContainer> Children { get; set; }
public IEnumerable<IContainer> GetNodeAndDescendants() // Note that this method is lazy
{
return TreeHelper.PreOrderTraversal<IContainer>(this, item => { var node = item as Node; return node != null ? node.Children : null; });
}
}
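If you would rather keep the original LINQ version from the question, the NullReferenceException on leaves can probably be avoided with a null guard on Children. The sketch below assumes that leaves simply have Children set to null, as the question describes; it keeps the recursive approach and so does not address the performance concerns mentioned above.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Node
{
    public int Id { get; set; }
    public string Name { get; set; }
    public IEnumerable<Node> Children { get; set; }

    // Lazy: yields this node first, then every descendant, depth-first.
    public IEnumerable<Node> GetNodeAndDescendants()
    {
        // Treat a missing child collection as an empty one.
        var children = Children ?? Enumerable.Empty<Node>();
        return new[] { this }
            .Concat(children.SelectMany(child => child.GetNodeAndDescendants()));
    }
}
```

When searching by name, a similar guard in the predicate (for example node.Name != null && node.Name.Contains("string to search")) avoids a second source of null references.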

If you don't mind taking a dependency on a third party solution I have a lightweight library that I have been working on that can do this and many other things with just about any tree. It is called Treenumerable. You can find it on GitHub here: https://github.com/jasonmcboyd/Treenumerable; and the latest version (1.2.0 at this time) on NuGet here: http://www.nuget.org/packages/Treenumerable. It has good test coverage and seems to be stable.
It does require that you create a helper class that implements an ITreeWalker interface with two methods: TryGetParent and GetChildren. As you might guess TryGetParent gets a node's parent so your Node class would have to be modified in a way that it is aware of its parent. I guess you could just throw a NotSupported exception in TryGetParent as that method is not necessary for any of the traversal operations. Anyway, regardless of which way you go the following code would do what you want:
ITreeWalker<Node> walker;
// Don't forget to instantiate 'walker'.
var foundNode =
walker
.PreOrderTraversal(Location)
.FirstOrDefault(node => node.Name.Contains("string to search"));
One difference worth mentioning between my implementation and yours is that my implementation does not rely on recursion. This means you don't have to worry about a deep tree throwing a StackOverflowException.

Iteration of an object, that may have parent, or child of it's own same type

I have a class that may have a parent, or a list of children, of its own type. The following code snippet should explain my scenario.
public abstract class X{
public virtual List<X> ChildItems { get; set; }
public virtual X ParentItem { get; set; }
}
I would like to know if there is a particularly efficient method to traverse the objects from an object of type X, checking if the object has a parent, or children starting from bottom up.
public static void SaveSetup(X obj) {
//logic here
}
Any help is appreciated.

What you are dealing with is a tree structure (or possibly many disconnected tree structures). A tree structure has a root element. Usually, a tree structure is traversed starting from the root. If you want to start from any element in the tree, I suggest you first get the root element and then traverse in the usual manner.
The easiest way to traverse a recursive structure is to use a recursive method, i.e., a method that calls itself.
public abstract class X
{
public virtual List<X> ChildItems { get; set; }
public virtual X ParentItem { get; set; }
// Method for traversing from top to bottom
public void Traverse(Action<X> action)
{
action(this);
foreach (X item in ChildItems) {
item.Traverse(action);
}
}
// Get the root (the top) of the tree starting at any item.
public X GetRootItem()
{
X root = this;
while (root.ParentItem != null) {
root = root.ParentItem;
}
return root;
}
}
Now you can save the setup with
X root = item.GetRootItem();
root.Traverse(SaveSetup);
Example with lambda expression. Prints every item of the tree assuming that ToString() has been overridden to return a meaningful string.
root.Traverse(x => Console.WriteLine(x));
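Since the question asks about working from the bottom up, a post-order variant that visits children before their parent may also be useful. This is only a sketch: TraverseBottomUp and the concrete Item subclass are hypothetical names added for illustration, and the null check assumes that leaves may have no child list at all.

```csharp
using System;
using System.Collections.Generic;

public abstract class X
{
    public virtual List<X> ChildItems { get; set; }
    public virtual X ParentItem { get; set; }

    // Visits all children first, then the item itself (post-order).
    public void TraverseBottomUp(Action<X> action)
    {
        if (ChildItems != null)
        {
            foreach (X item in ChildItems)
                item.TraverseBottomUp(action);
        }
        action(this);
    }
}

// Hypothetical concrete type, used only to make the sketch usable.
public class Item : X
{
    public string Name { get; set; }
    public override string ToString() { return Name; }
}
```

Calling root.TraverseBottomUp(SaveSetup) would then process every leaf before the items above it.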

Traverse from given object to root (ParentItem = null)
public static void SaveSetup(X obj) {
while (obj != null)
{
// logic here
obj = obj.ParentItem;
}
}

Building a tree using a list of objects

I have a list of objects with property id and parent_id.
I want to build a tree to link up those children and parents.
1 parent may have several children and there is an object which will be the ancestor of all objects.
What's the fastest algorithm to implement that?
I use C# as programming language, but other languages are also okay.

Something like that should do the trick :
public List<Node> MakeTreeFromFlatList(IEnumerable<Node> flatList)
{
var dic = flatList.ToDictionary(n => n.Id, n => n);
var rootNodes = new List<Node>();
foreach(var node in flatList)
{
if (node.ParentId.HasValue)
{
Node parent = dic[node.ParentId.Value];
node.Parent = parent;
parent.Children.Add(node);
}
else
{
rootNodes.Add(node);
}
}
return rootNodes;
}
(assuming that ParentId is a Nullable<int>, and is null for root nodes)
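For completeness, here is a self-contained sketch of the same idea together with a Node class it could run against. The Node shape is an assumption based on the properties the method uses, and treating a node whose parent is missing from the list as a root is a choice made here, not something stated in the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Node, inferred from the properties used above.
public class Node
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public Node Parent { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class TreeBuilder
{
    // Links every node to its parent in a single pass, O(n) overall.
    public static List<Node> Build(IEnumerable<Node> flatList)
    {
        var nodes = flatList.ToList();              // enumerate the source only once
        var byId = nodes.ToDictionary(n => n.Id);
        var roots = new List<Node>();

        foreach (var node in nodes)
        {
            if (node.ParentId.HasValue &&
                byId.TryGetValue(node.ParentId.Value, out var parent))
            {
                node.Parent = parent;
                parent.Children.Add(node);
            }
            else
            {
                // No parent id, or the parent is not in the list: treat as a root.
                roots.Add(node);
            }
        }
        return roots;
    }
}
```

The dictionary lookup is what keeps this linear; searching the list for each parent would make it quadratic.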

You could use a dictionary:
var dict = new Dictionary<Id, Node>();
foreach (var item in items)
{
dict[item.Id] = new Node(item);
}
foreach (var item in items)
{
dict[item.ParentId].AddChild(dict[item.Id]);
}

I much prefer this kind of structure. By maintaining a single list of items (you may want to use a dictionary or similar for speed) and passing it into the GetChildItems function, you gain flexibility and ease of sorting, adding, removing, saving to a db, etc.
You only really need the GetChildItems function when you are rendering the list to a view and you want the tree structure for easy editing, as you say. In this case you can have a view model with the full list and the item which is passed into each item view:
public class Item
{
public string Id { get; set; }
public string ParentId { get; set; }
public IEnumerable<Item> GetChildItems(List<Item> allItems)
{
return allItems.Where(i => i.ParentId == this.Id);
}
}
public class Tree
{
public List<Item> Items { get; set; }
public IEnumerable<Item> RootItems(List<Item> allItems)
{
return allItems.Where(i => i.ParentId == null);
}
}
Note: the class structure above is designed to mimic the traditional complex object pattern. These days you would probably just have GetChildItems(List<Item> allItems, Item parentItem) in the view model.
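The remark about using a dictionary for speed could look roughly like the sketch below, which groups the items by ParentId once using ToLookup so that each child query no longer scans the whole list. ItemTree is a hypothetical name, and Item is a copy of the class above without its GetChildItems method.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Id { get; set; }
    public string ParentId { get; set; }
}

// Hypothetical wrapper that indexes the flat list once.
public class ItemTree
{
    private readonly List<Item> _roots;
    private readonly ILookup<string, Item> _childrenByParentId;

    public ItemTree(IEnumerable<Item> allItems)
    {
        var items = allItems.ToList();
        _roots = items.Where(i => i.ParentId == null).ToList();
        _childrenByParentId = items
            .Where(i => i.ParentId != null)
            .ToLookup(i => i.ParentId);
    }

    public IEnumerable<Item> RootItems
    {
        get { return _roots; }
    }

    // A lookup returns an empty sequence for keys it does not contain,
    // so leaves need no special handling.
    public IEnumerable<Item> GetChildItems(Item parent)
    {
        return _childrenByParentId[parent.Id];
    }
}
```

The trade-off is that the index has to be rebuilt, or kept in sync, whenever the flat list changes.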

Wrapping my head around N parent->child associations

I'll try to explain this the best I can. I'm having quite a bit of difficulty trying to figure out this logic.
Basically, I have a collection that includes thousands of objects which are each made up of a Parent and a Child property.
So, roughly, this:
public class MyObject{
public string Parent { get; set; }
public string Child { get; set; }
}
What I'm trying to figure out is how to build this out into a plain TreeView control. I need to build the relationships but I can't figure out how to because they can be mixed. I can probably explain this better with what the tree should look like:
So if I have the following items inside of my collection:
0. Parent: "A", Child: "B"
1. Parent: "B", Child: "C"
2. Parent: "B", Child: "D"
I would want my tree to look like this:
-A
--B
---C
-A
--B
---D
-B
--C
-B
--D
How can I do this in C#? I would need it to support up to N relationships as we have some branches I would expect to reach about 50 nodes deep.

UPDATE
This problem actually turned out to be considerably more complex than I originally realized, given the requirement of repeating the entire tree for each path. I've simply deleted the old code as I don't want to add any further confusion.
I do want to keep it on record that using a recursive data structure makes this easier:
public class MyRecursiveObject
{
public MyRecursiveObject Parent { get; set; }
public string Name { get; set; }
public List<MyRecursiveObject> Children { get; set; }
}
You'll see very clearly why this is easier after reading the implementation code below:
private void PopulateTree(IEnumerable<MyObject> items)
{
var groupedItems =
from i in items
group i by i.Parent into g
select new { Name = g.Key, Children = g.Select(c => c.Child) };
var lookup = groupedItems.ToDictionary(i => i.Name, i => i.Children);
foreach (string parent in lookup.Keys)
{
if (lookup.ContainsKey(parent))
AddToTree(lookup, Enumerable.Empty<string>(), parent);
}
}
private void AddToTree(Dictionary<string, IEnumerable<string>> lookup,
IEnumerable<string> path, string name)
{
IEnumerable<string> children;
if (lookup.TryGetValue(name, out children))
{
IEnumerable<string> newPath = path.Concat(new string[] { name });
foreach (string child in children)
AddToTree(lookup, newPath, child);
}
else
{
TreeNode parentNode = null;
foreach (string item in path)
parentNode = AddTreeNode(parentNode, item);
AddTreeNode(parentNode, name);
}
}
private TreeNode AddTreeNode(TreeNode parent, string name)
{
TreeNode node = new TreeNode(name);
if (parent != null)
parent.Nodes.Add(node);
else
treeView1.Nodes.Add(node);
return node;
}
First of all, I realized that the dictionary will contain keys for intermediate nodes as well as just the root nodes, so we don't need two recursive calls in the recursive AddToTree method in order to get the "B" nodes as roots; the initial walk in the PopulateTree method already does it.
What we do need to guard against is adding leaf nodes in the initial walk; using the data structure in question, these are detectable by checking whether or not there is a key in the parent dictionary. With a recursive data structure, this would be way easier: Just check for Parent == null. But, a recursive structure is not what we have, so the code above is what we have to use.
The AddTreeNode is mostly a utility method, so we don't have to keep repeating this null-checking logic later.
The real ugliness is in the second, recursive AddToTree method. Because we are trying to create a unique copy of every single subtree, we can't simply add a tree node and then recurse with that node as the parent. "A" only has one child here, "B", but "B" has two children, "C" and "D". There needs to be two copies of "A", but there's no way to know about that when "A" is originally passed to the AddToTree method.
So what we actually have to do is not create any nodes until the final stage, and store a temporary path, for which I've chosen IEnumerable<string> because it's immutable and therefore impossible to mess up. When there are more children to add, this method simply adds to the path and recurses; when there are no more children, it walks the entire saved path and adds a node for each.
This is extremely inefficient because we are now creating a new enumerable on every invocation of AddToTree. For large numbers of nodes, it is likely to chew up a lot of memory. This works, but it would be a lot more efficient with a recursive data structure. Using the example structure at the top, you wouldn't have to save the path at all or create the dictionary; when no children are left, simply walk up the path in a while loop using the Parent reference.
Anyway, I guess that's academic because this isn't a recursive object, but I thought it was worth pointing out anyway as something to keep in mind for future designs. The code above will produce exactly the results you want, I've gone ahead and tested it on a real TreeView.
UPDATE 2 - So it turns out that the version above is pretty brutal with respect to memory/stack, most likely a result of creating all those IEnumerable<string> instances. Although it's not great design, we can remove that particular issue by changing to a mutable List<string>. The following snippet shows the differences:
private void PopulateTree(IEnumerable<MyObject> items)
{
// Snip lookup-generation code - same as before ...
List<string> path = new List<string>();
foreach (string parent in lookup.Keys)
{
if (lookup.ContainsKey(parent))
AddToTree(lookup, path, parent);
}
}
private void AddToTree(Dictionary<string, IEnumerable<string>> lookup,
List<string> path, string name)
{
IEnumerable<string> children;
if (lookup.TryGetValue(name, out children))
{
path.Add(name);
foreach (string child in children)
AddToTree(lookup, path, child);
// Remove the entry added above, which is always the last one.
path.RemoveAt(path.Count - 1);
}
// Snip "else" block - again, this part is the same as before ...
}
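To see the path-accumulation idea in isolation, here is a sketch that leaves out the TreeView and just collects each root-to-leaf path as a string. PathBuilder, BuildPaths and Walk are illustrative names, ToLookup stands in for the grouped dictionary, and the code assumes the parent/child pairs contain no cycles.

```csharp
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public string Parent { get; set; }
    public string Child { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class PathBuilder
{
    // Returns one path per leaf, starting once from every distinct Parent value.
    public static List<string> BuildPaths(IEnumerable<MyObject> items)
    {
        var lookup = items.ToLookup(i => i.Parent, i => i.Child);
        var results = new List<string>();
        var path = new List<string>();

        foreach (var group in lookup)
            Walk(lookup, path, group.Key, results);

        return results;
    }

    private static void Walk(ILookup<string, string> lookup, List<string> path,
                             string name, List<string> results)
    {
        path.Add(name);
        if (lookup.Contains(name))
        {
            // Not a leaf: keep extending the current path.
            foreach (string child in lookup[name])
                Walk(lookup, path, child, results);
        }
        else
        {
            // Leaf: the accumulated path is complete.
            results.Add(string.Join(" > ", path));
        }
        path.RemoveAt(path.Count - 1);   // backtrack
    }
}
```

For the three sample pairs in the question this should yield the paths A > B > C, A > B > D, B > C and B > D, which correspond to the four branches of the desired tree.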

Like Rubens, I tried both, but I think A Generic Tree Collection is a little better.
This tree collection has some nice built-in functionality for moving around the tree; go read the whole article.
Here is a sample using the collection from the link above:
static class Module1
{
public static void Main()
{
Common.ITree<myObj> myTree = default(Common.ITree<myObj>);
myObj a = new myObj("a");
myObj b = new myObj("b");
myObj c = new myObj("c");
myObj d = new myObj("d");
myTree = Common.NodeTree<myObj>.NewTree;
myTree.InsertChild(a).InsertChild(b).InsertChild(c).Parent.Parent.InsertNext(a).InsertChild(b).InsertChild(d).Parent.Parent.InsertNext(b).InsertChild(c).Parent.InsertNext(b).InsertChild(d);
Console.WriteLine(myTree.ToStringRecursive);
Console.ReadKey();
}
}
class myObj
{
public string text;
public myObj(string value)
{
text = value;
}
public override string ToString()
{
return text;
}
}
would be exactly what you just showed
-A
--B
---C
-A
--B
---D
-B
--C
-B
--D

If I understand this correctly, what you're trying to do is take one tree and transform it into another. The transformation essentially takes each non-leaf-node in the input tree and creates a node for it (and its descendants) in the output tree.
First off, you'll be happier if you design a data structure for your nodes that is genuinely recursive:
public class Node
{
    public Node Parent { get; private set; }
    public List<Node> Children { get; private set; }
    public bool HasChildren { get { return Children.Count > 0; } }
    public Node()
    {
        Children = new List<Node>();
    }
}
Your MyObject class represents parent/child relationships between string values. As long as you're able to implement a FindChildren() method that returns the child values for a given parent value, using this class to rationalize the parent/child relationships is straightforward (these members go inside the Node class):
public string Value { get; set; }
public static Node Create(string parentKey)
{
    Node n = new Node();
    n.Value = parentKey;
    foreach (string childKey in FindChildren(parentKey))
    {
        // Build the child subtree first, then link it in both directions.
        Node child = Node.Create(childKey);
        child.Parent = n;
        n.Children.Add(child);
    }
    return n;
}
It's simple to implement a property that returns a node's descendants:
public IEnumerable<Node> Descendants
{
get
{
foreach (Node child in Children)
{
yield return child;
foreach (Node descendant in child.Descendants)
{
yield return descendant;
}
}
}
}
To add a Node to a TreeView, you need two methods. (Note that these aren't methods of the Node class!) I've made them overloads, but an argument can be made for giving them different names:
public void AddNode(Node n, TreeView tv)
{
TreeNode tn = tv.Nodes.Add(n.Value);
tn.Tag = n;
foreach (Node child in n.Children)
{
AddNode(child, tn);
}
}
public void AddNode(Node n, TreeNode parent)
{
TreeNode tn = parent.Nodes.Add(n.Value);
tn.Tag = n;
foreach (Node child in n.Children)
{
AddNode(child, tn);
}
}
I'm setting the Tag on each TreeNode so that you can find your way back to the original Node.
So to initialize your TreeView from a list of top-level parent keys, you need a method like this:
public void PopulateTreeView(IEnumerable<string> parents, TreeView t)
{
foreach (string parentKey in parents)
{
Node n = Node.Create(parentKey);
AddNode(n, t);
foreach (Node descendant in n.Descendants)
{
if (descendant.HasChildren)
{
AddNode(descendant, t);
}
}
}
}
Edit:
I didn't quite understand how your MyObject class was working; I think I do now, and I've edited this accordingly.
