How to model a tree structure for use with strings

I want to make a tree structure that takes in strings and displays everything in the tree. Please note that the purpose here is not to make a binary search tree or anything related to binary trees; rather, it will be modelled on this basis: the first string entered is the "root", the second string is a parent, and the third is a child of the parent node. Please see the illustration. There can be any number of parent nodes.
Basically, I would like some ideas on how to approach this. I'm familiar with how a binary tree is coded and how it works, but this one seems quite different to implement.

In your case it is a simple tree in which each node holds some data and a collection of child nodes (any number of them). With this in mind, let's create a type called Node that acts as the building block of the tree, written as a base class that can be extended through inheritance to meet the needs of a specific kind of tree node.
Note: I am going to make it generic so that it can store any type, even though you only want to store strings.
public class Node<T>
{
// Private member-variables
private T data;//This member variable contains the data stored in the node of the type specified by the developer using this class.
private NodeList<T> neighbors = null; //of type `NodeList<T>`. This member variable represents the node's children.
public Node() {}
public Node(T data) : this(data, null) {}
public Node(T data, NodeList<T> neighbors)
{
this.data = data;
this.neighbors = neighbors;
}
public T Value
{
get
{
return data;
}
set
{
data = value;
}
}
protected NodeList<T> Neighbors
{
get
{
return neighbors;
}
set
{
neighbors = value;
}
}
}
The NodeList class contains a strongly typed collection of Node<T> instances. It derives from Collection<T> (in System.Collections.ObjectModel), which provides methods such as Add, Remove, and Clear. Two additions are worth noticing: a constructor that pre-fills the collection with a specified number of (null) entries, and a method that searches the collection for a node with a particular value.
public class NodeList<T> : Collection<Node<T>>
{
public NodeList() : base() { }
public NodeList(int initialSize)
{
// Add the specified number of items
for (int i = 0; i < initialSize; i++)
base.Items.Add(default(Node<T>));
}
public Node<T> FindByValue(T value)
{
// search the list for the value
foreach (Node<T> node in Items)
if (node.Value.Equals(value))
return node;
// if we reached here, we didn't find a matching node
return null;
}
}
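Finally, we join together everything discussed so far. Note that this example exposes only two children, Left and Right; a tree with an arbitrary number of children would work with the Neighbors collection directly.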
public class SpecialTree<T> : Node<T>
{
public SpecialTree() : base() {}
public SpecialTree(T data) : base(data, null) {}
public SpecialTree(T data, SpecialTree<T> left, SpecialTree<T> right)
{
base.Value = data;
NodeList<T> children = new NodeList<T>(2);
children[0] = left;
children[1] = right;
base.Neighbors = children;
}
public SpecialTree<T> Left
{
get
{
if (base.Neighbors == null)
return null;
else
return (SpecialTree<T>) base.Neighbors[0];
}
set
{
if (base.Neighbors == null)
base.Neighbors = new NodeList<T>(2);
base.Neighbors[0] = value;
}
}
public SpecialTree<T> Right
{
get
{
if (base.Neighbors == null)
return null;
else
return (SpecialTree<T>) base.Neighbors[1];
}
set
{
if (base.Neighbors == null)
base.Neighbors = new NodeList<T>(2);
base.Neighbors[1] = value;
}
}
}
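Since the class above ends up with a Left and a Right child, here is a minimal sketch of the n-ary shape the question describes (a root, any number of parents, each with any number of children), including a way to display the whole tree. The names StringTreeNode, AddChild, and Render are hypothetical and not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical n-ary node: one string plus any number of children.
public class StringTreeNode
{
    private readonly List<StringTreeNode> children = new List<StringTreeNode>();

    public StringTreeNode(string value)
    {
        Value = value;
    }

    public string Value { get; }

    public IReadOnlyList<StringTreeNode> Children
    {
        get { return children; }
    }

    // Adds a child holding the given string and returns the new child node.
    public StringTreeNode AddChild(string value)
    {
        var child = new StringTreeNode(value);
        children.Add(child);
        return child;
    }

    // Renders the whole subtree, one node per line, indented two spaces per level.
    public string Render()
    {
        var sb = new StringBuilder();
        Render(sb, 0);
        return sb.ToString();
    }

    private void Render(StringBuilder sb, int depth)
    {
        sb.Append(' ', depth * 2).Append(Value).Append('\n');
        foreach (StringTreeNode child in children)
            child.Render(sb, depth + 1);
    }
}
```

To use it, create the root with new StringTreeNode("root"), call AddChild on the root for each parent, call AddChild on a parent for each of its children, and pass root.Render() to Console.Write.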

There are no built-in classes in .NET for manipulating tree structures, and the simple reason is that there are too many variations.
I’d suggest you make your own class to represent a binary tree. Take a look at these threads for more details.
Why is there no Tree<T> class in .NET?
Tree data structure in C#

Related

How to access parent object in a tree structure

I am using MVVM and it is working all fine, except one thing, accessing parent model objects.
The goal is to access any model object's parent object directly, but I could not find a proper way to do that.
For example:
Grandparents
--- Parents
--- --- Children
--- --- --- Grandchildren
I have a reference to a Child, but I have to check some properties of Children and maybe Parents.
Currently the code is running through all higher level objects until there is a successful match in the Parent's Children's Grandchildren with my Grandchild object, and then it is possible to check the properties.
But this is kind of disgusting in terms of smart code and efficiency; regardless of how it is done, I do not want to run through all my data for a lucky match. This is the current implementation; some other parts are done using LINQ.
var someChild = calledChild;
foreach (Grandparent gParent in mainViewModel.SelectedEnvironment.GrandParents)
{
foreach (Parent parent in gParent.Parents)
{
foreach (Child child in parent.Children)
{
if (child.A == calledChild.A)
{
// Match
System.Diagnostics.Debug.WriteLine("CalledChilds grandparent is " + gParent.Name);
}
}
}
}
The model is set up in classes with definitions like this:
public class Parent : ObservableObject
{
public const string NamePropertyName = "Name";
private string _name;
public string Name
{
get
{
return _name;
}
set
{
if (_name == value)
{
return;
}
_name = value;
RaisePropertyChanged(NamePropertyName);
}
}
public const string ChildrenPropertyName = "Children";
private ObservableCollection<Child> _children;
public ObservableCollection<Child> Children
{
get
{
return _children;
}
set
{
if (_children == value)
{
return;
}
_children = value;
RaisePropertyChanged(ChildrenPropertyName);
}
}
}
The model is saved in a json file and parsed back to the model's root object type for usage.
I cannot just add a new "Parent" reference to the "Child" object, because it would end up in a loop, due to this concept's restrictions.
It would be great to get references instead of copies of the whole model branch.
Is there a way to access the parent objects directly?
Thank you all!

The easiest way is to store a direct reference to the parent node in each child node:
public class ParentNode
{
private ObservableCollection<ChildNode> _children;
public ParentNode()
{
_children = new ObservableCollection<ChildNode>();
Children = new ReadOnlyObservableCollection<ChildNode>(_children);
}
public ReadOnlyObservableCollection<ChildNode> Children { get; }
public void AddChild(ChildNode item)
{
if (item.Parent != null) throw new InvalidOperationException("Item is already added to another node");
item.Parent = this;
_children.Add(item);
}
public void RemoveChild(ChildNode item)
{
if (item.Parent != this) throw new InvalidOperationException("Item is not direct child of this node");
item.Parent = null;
_children.Remove(item);
}
}
public class ChildNode
{
public ParentNode Parent { get; internal set; }
}
Just be careful, because this introduces circular references: the parent references its children and vice versa. It is something of a violation of the DRY principle, because the shape of the tree is defined twice and the two can easily get out of sync (e.g. if you set ChildNode.Parent to something other than the actual parent).
There are ways to work around this, but I think you could start with this.
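Because the question's model is loaded from a JSON file, one way to get parent references without storing them in the file is to rebuild them once after loading. The following is only a sketch under assumptions: TreeItem, ParentLinker, Relink, and Ancestor are hypothetical names, the model is simplified to a single node type without change notification, and it assumes the serializer is configured to skip the Parent property (most JSON libraries offer an ignore attribute for this).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model: no change notification, just the tree shape.
public class TreeItem
{
    public string Name { get; set; }
    public List<TreeItem> Children { get; set; } = new List<TreeItem>();

    // Assumed to be excluded from serialization; rebuilt by ParentLinker.Relink.
    public TreeItem Parent { get; internal set; }
}

public static class ParentLinker
{
    // Walks the tree once and points every child back at its parent.
    public static void Relink(TreeItem root)
    {
        var pending = new Stack<TreeItem>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            TreeItem current = pending.Pop();
            foreach (TreeItem child in current.Children)
            {
                child.Parent = current;
                pending.Push(child);
            }
        }
    }

    // Returns the ancestor `levels` steps up, or null if the tree is not that deep.
    public static TreeItem Ancestor(TreeItem item, int levels)
    {
        TreeItem current = item;
        for (int i = 0; i < levels && current != null; i++)
            current = current.Parent;
        return current;
    }
}
```

After a single call to ParentLinker.Relink on the deserialized root, a grandparent lookup becomes ParentLinker.Ancestor(child, 2) instead of three nested loops.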

Inner class generic type same as outer type

I found a question here that almost answers my question, but I still don't fully understand.
Trying to write a Tree data structure, I did this:
public class Tree<T>
{
public TreeNode<T> root;
...
public class TreeNode<T>
{
List<TreeNode<T>> children;
T data;
public T Data { get { return data; } }
public TreeNode(T data)
{
this.data = data;
children = new List<TreeNode<T>>();
}
...
}
}
And, anyone who's worked with C# generics apparently knows that I got this compiler warning: Type parameter 'T' has the same name as the type parameter from outer type 'Tree<T>'
My intent was to create an inner class that would be forced to use the same type as the outer class, but I now understand that adding a type parameter actually allows the inner class to be more flexible. But, in my case, I want subclasses of Tree<T> to be able to use TreeNode, for example, like this:
public class IntTree : Tree<int>
{
...
private static IntTree fromNode(TreeNode<int> node)
{
IntTree t = new IntTree();
t.root = node;
return t;
}
}
(That method allows the subclass to implement ToString() recursively)
So my question is, if I take out the parameterization, like this:
public class Tree<T>
{
public TreeNode root;
...
public class TreeNode
{
List<TreeNode> children;
T data;
public T Data { get { return data; } }
public TreeNode(T data)
{
this.data = data;
children = new List<TreeNode>();
}
...
}
}
will the resulting subclass be forced to use integers when creating TreeNodes, and therefore never be able to break the intent I had?
Disclaimer: yes, I know I'm probably doing plenty of things wrong here. I'm still learning C#, coming from a mostly Java and Lisp background, with a little bit of plain C. So suggestions and explanations are welcome.

Yes, it will be forced to use the same type. Look at the declaration again:
public class Tree<T>
{
public class TreeNode
{
private T Data;
}
}
So the type of Data is determined when you instantiate a specific Tree:
var tree = new Tree<int>();
This way the type of Data is declared as int and can be no different.
Note that there is no non-generic TreeNode class. There is only a Tree<int>.TreeNode type:
Tree<int> intTree = new Tree<int>(); // add some nodes
Tree<int>.TreeNode intNode = intTree.Nodes[0]; // for example
Tree<string> stringTree = new Tree<string>(); // add some nodes
Tree<string>.TreeNode stringNode = stringTree.Nodes[0]; // for example
// ERROR: this won't compile as the types are incompatible
Tree<string>.TreeNode stringNode2 = intTree.Nodes[0];
A Tree<string>.TreeNode is a different type than Tree<int>.TreeNode.

The type parameter T declared on the outer class is already available in all of its nested declarations, so you can simply remove the <T> from the inner class:
public class Tree<T>
{
public TreeNode root;
//...
public class TreeNode
{
List<TreeNode> children;
T data;
public T Data { get { return data; } }
public TreeNode(T data)
{
this.data = data;
children = new List<TreeNode>();
}
//...
}
}
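To make the subclass case from the question concrete, here is a small sketch of a Tree<int> subclass that uses the nested node type recursively. The members FromNode, Sum, Add, and Children are hypothetical additions for illustration; the question's own code elides those parts.

```csharp
using System;
using System.Collections.Generic;

public class Tree<T>
{
    public TreeNode Root;

    // Non-generic nested class: it reuses the outer T.
    public class TreeNode
    {
        private readonly List<TreeNode> children = new List<TreeNode>();

        public TreeNode(T data)
        {
            Data = data;
        }

        public T Data { get; }

        public IReadOnlyList<TreeNode> Children
        {
            get { return children; }
        }

        public void Add(TreeNode child)
        {
            children.Add(child);
        }
    }
}

public class IntTree : Tree<int>
{
    // Inside IntTree, TreeNode means Tree<int>.TreeNode.
    public static IntTree FromNode(TreeNode node)
    {
        return new IntTree { Root = node };
    }

    // Sums the data of every node by wrapping each child in its own IntTree.
    public int Sum()
    {
        if (Root == null) return 0;
        int total = Root.Data;
        foreach (TreeNode child in Root.Children)
            total += FromNode(child).Sum();
        return total;
    }
}
```

Passing a Tree<string>.TreeNode to IntTree.FromNode does not compile, because it is a different closed type from Tree<int>.TreeNode.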

Linked-list and generic type

I am learning C# generic types, and I have gotten really confused by the linked-list example in the Generic module on MSDN website :
http://msdn.microsoft.com/en-us/library/0x6a29h6.aspx
I pasted the code here:
My confusion is about :
private Node next;
how should I understand this line of code?
I only can think it is a private field that is created with class name?
public Node Next
{
get { return next; }
set { next = value; }
}
I guess this is a property with class name as its type ?
private Node head;
Why does the nested class name appear where supposed to be the type of head ?
is this a private field of class GenericList<T> ?
// type parameter T in angle brackets
public class GenericList<T>
{
// The nested class is also generic on T.
private class Node
{
// T used in non-generic constructor.
public Node(T t)
{
next = null;
data = t;
}
private Node next;
public Node Next
{
get { return next; }
set { next = value; }
}
// T as private member data type.
private T data;
// T as return type of property.
public T Data
{
get { return data; }
set { data = value; }
}
}
private Node head;
// constructor
public GenericList()
{
head = null;
}
// T as method parameter type:
public void AddHead(T t)
{
Node n = new Node(t);
n.Next = head;
head = n;
}
public IEnumerator<T> GetEnumerator()
{
Node current = head;
while (current != null)
{
yield return current.Data;
current = current.Next;
}
}
}

private Node next;
how should I understand this line of code?
next is a private field of type Node
public Node Next
{
get { return next; }
set { next = value; }
}
I guess this is a property with class name as its type ?
It's a property named Next whose type is Node.
private Node head;
Why does the nested class name appear where supposed to be the type of head ?
is this a private field of class GenericList<T> ?
Because it's the type of head, and yes, it's a private field of the outer class.
There's nothing special here: a class can have fields, properties, etc. of the same type as itself.
Edit regarding the comment "how does it make the code work by creating a field that is the nested class type Node ? (...) plus field 'next' is defined as null whenever a Node instance is created"
All the "magic" lies in the AddHead method.
When you create an instance of GenericList<T> (say GenericList<int>), head is null at first,
so after this step we could say the list can be represented as []
Then you call AddHead(1), for example; it creates a Node with that value, then sets its Next to the current head, and finally makes the newly created Node the new head.
So after this step the list is: 1 -> [] (a head holding 1 that links to its next node, the null node, a.k.a. the empty list)
After that, if you call AddHead again with, say, 2, you'll end up with something like: 2 -> 1 -> []
And when the time comes to iterate, you simply loop while the current node isn't empty (i.e. null), read its stored value, and use the linked node (its Next) as the new current node.
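The trace above can be reproduced with a condensed version of the list. This is a sketch, not the MSDN listing itself: the Node class is shortened with auto-properties, and the ListDemo helper is a hypothetical name added only to show the result.

```csharp
using System;
using System.Collections.Generic;

public class GenericList<T>
{
    private class Node
    {
        public Node(T data) { Data = data; }
        public Node Next { get; set; }
        public T Data { get; }
    }

    private Node head; // null means the list is empty

    public void AddHead(T value)
    {
        Node n = new Node(value);
        n.Next = head; // the new node points at the old head
        head = n;      // and becomes the new head
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (Node current = head; current != null; current = current.Next)
            yield return current.Data;
    }
}

public static class ListDemo
{
    public static string Describe()
    {
        var list = new GenericList<int>(); // []
        list.AddHead(1);                   // 1 -> []
        list.AddHead(2);                   // 2 -> 1 -> []
        var seen = new List<int>();
        foreach (int value in list)        // foreach only needs a public GetEnumerator()
            seen.Add(value);
        return string.Join(" -> ", seen) + " -> []";
    }
}
```

ListDemo.Describe() returns "2 -> 1 -> []", matching the last step of the walkthrough.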

Searching over a templated tree

So I have 2 interfaces:
A node that can have children
public interface INode
{
IEnumerable<INode> Children { get; }
void AddChild(INode node);
}
And a derived "Data Node" that can have data associated with it
public interface IDataNode<DataType> : INode
{
DataType Data { get; }
IDataNode<DataType> FindNode(DataType dt);
}
Keep in mind that each node in the tree could have a different data type associated with it as its Data (because the INode.AddChild function just takes the base INode)
Here is the implementation of the IDataNode interface:
internal class DataNode<DataType> : IDataNode<DataType>
{
List<INode> m_Children;
DataNode(DataType dt)
{
Data = dt;
}
public IEnumerable<INode> Children
{
get { return m_Children; }
}
public void AddChild(INode node)
{
if (null == m_Children)
m_Children = new List<INode>();
m_Children.Add(node);
}
public DataType Data { get; private set; }
Question is how do I implement the FindNode function without knowing what kinds of DataType I will encounter in the tree?
public IDataNode<DataType> FindNode(DataType dt)
{
throw new NotImplementedException();
}
}
As you can imagine, something like this will not work:
public IDataNode<DataType> FindNode(DataType dt)
{
IDataNode<DataType> result = null;
foreach (var child in Children)
{
if (child is IDataNode<DataType>)
{
var datachild = child as IDataNode<DataType>;
if (datachild.Data.Equals(dt))
{
result = child as IDataNode<DataType>;
break;
}
}
else
{
// What??
}
// Need to recursively call FindNode on the child
// but can't because it could have a different
// DataType associated with it. Can't call FindNode
// on child because it is of type INode and not IDataNode
result = child.FindNode(dt); // can't do this!
if (null != result)
break;
}
return result;
}
Is my only option to do this when I know what kinds of DataType a particular tree I use will have? Maybe I am going about this in the wrong way, so any tips are appreciated. Thanks!

First of all, you need to put the FindNode method in INode. Otherwise, you cannot find a node of some type DataType... before having found a node of type DataType. Even if you have a reference to an object that you know is a DataNode<X>, this won't help you if someone tells you to find a DataNode<Y>.
There are now two roads you may take: if you want DataNode to be templated, then you need to know all possible types of data in the tree at compile time. If you know that, you can use a generic DataNode. If there's a chance that you may want to find a node with data of some type that will only become known to you at runtime (e.g. from the return value of some method that you do not control) then you cannot use generics.
I will illustrate the generic solution below.
public interface INode
{
IEnumerable<INode> Children { get; }
IDataNode<DataType> FindNode<DataType>(DataType value);
void AddChild(INode node);
}
public interface IDataNode<DataType> : INode
{
DataType Data { get; }
}
INode.FindNode could be implemented like this:
public IDataNode<DataType> FindNode<DataType> (DataType value) {
// If we are searching for ourselves, return this
var self = this as IDataNode<DataType>;
if (self != null && self.Data.Equals(value)) {
return self;
}
// Otherwise:
// 1. For each of our children, call FindNode on it. This will
// find the target node if it is our child, since each child
// will check if it is the node we look for, like we did above.
// 2. If our child is not the one we are looking for, FindNode will
// continue looking into its own children (depth-first search).
// 3. Return the first descendant that comes back and is not null.
// If no node is found, FirstOrDefault means we will return null.
return this.children.Select(c => c.FindNode(value))
.FirstOrDefault(found => found != null);
}
I have to say that the recursive LINQ implementation above is perhaps too clever and not very easy to understand. It could always be written with foreach to make it clearer.
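For reference, a foreach-based version of the same depth-first search might look like the sketch below. It assumes a concrete DataNode class with a private children list; the type parameter names TData and TSearch are my own, chosen so that the method's type parameter does not shadow the class's.

```csharp
using System;
using System.Collections.Generic;

public interface INode
{
    IEnumerable<INode> Children { get; }
    IDataNode<TData> FindNode<TData>(TData value);
    void AddChild(INode node);
}

public interface IDataNode<TData> : INode
{
    TData Data { get; }
}

public class DataNode<TData> : IDataNode<TData>
{
    private readonly List<INode> children = new List<INode>();

    public DataNode(TData data) { Data = data; }

    public TData Data { get; }
    public IEnumerable<INode> Children { get { return children; } }
    public void AddChild(INode node) { children.Add(node); }

    public IDataNode<TSearch> FindNode<TSearch>(TSearch value)
    {
        // Is this node itself the match? The cast yields null when the
        // node stores a different data type than the one searched for.
        var self = this as IDataNode<TSearch>;
        if (self != null && EqualityComparer<TSearch>.Default.Equals(self.Data, value))
            return self;

        // Otherwise search each subtree depth-first and stop at the first hit.
        foreach (INode child in children)
        {
            IDataNode<TSearch> found = child.FindNode(value);
            if (found != null)
                return found;
        }
        return null;
    }
}
```

Because FindNode lives on INode, the recursion passes through nodes of other data types on its way to a deeper match.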

Use a Generic Function:
public IDataNode<DataType> FindNode<DataType>(DataType dt)
{
IDataNode<DataType> result = null;
foreach (var child in Children)
{
if (child is IDataNode<DataType>)
{
var datachild = child as IDataNode<DataType>;
if (datachild.Data.Equals(dt))
{
result = child as IDataNode<DataType>;
break;
}
}
else
{
// it's not a DataType You're looking for, so ignore it!
}
}
return result;
}
Then you call it like this:
var resultsStr = tree.FindNode<string>("Hello");
var resultsInt = tree.FindNode<int>(5);
var resultsCust = tree.FindNode<MyCustomClass>(new MyCustomClass("something"));

How to code a truly generic tree using Generics

Let's say I have a Node class as follows:
class Node<T>
{
T data;
List<Node<T>> children;
internal Node(T data)
{
this.data = data;
}
List<Node<T>> Children
{
get
{
if (children == null)
children = new List<Node<T>>(1);
return children;
}
}
internal IEnumerable<Node<T>> GetChildren()
{
return children;
}
internal bool HasChildren
{
get
{
return children != null;
}
}
internal T Data
{
get
{
return data;
}
}
internal void AddChild(Node<T> child)
{
this.Children.Add(child);
}
internal void AddChild(T child)
{
this.Children.Add(new Node<T>(child));
}
}
The problem is that each and every node of the tree is confined to a single type. However, there are situations where the root node is of one type, which has children of another type which has children of a third type (example documents-->paragraphs-->lines-->words).
How do you define a generic tree for such cases?

If you want a strict hierarchy of types you could declare them like this:
class Node<T, TChild> {...}
Node<Document, Node<Paragraph, Node<Line, Word>>>
I did not claim it would be pretty. :)
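As a rough illustration of how the two-parameter declaration could be filled in, here is a sketch. The body of Node<T, TChild> is my assumption (the answer elides it), Paragraph and Word are hypothetical stand-in types, and the document level is represented by a plain title string for brevity.

```csharp
using System;
using System.Collections.Generic;

// Each node type names its own data type and the type of its children.
public class Node<T, TChild>
{
    public Node(T data) { Data = data; }
    public T Data { get; }
    public List<TChild> Children { get; } = new List<TChild>();
}

// Hypothetical stand-ins for the domain types mentioned in the question.
public class Paragraph { public int Number; }
public class Word { public string Text; }

public static class TypedTreeDemo
{
    public static Node<string, Node<Paragraph, Word>> Build()
    {
        var paragraph = new Node<Paragraph, Word>(new Paragraph { Number = 1 });
        paragraph.Children.Add(new Word { Text = "hello" });

        // The document level holds a title string here, for brevity.
        var document = new Node<string, Node<Paragraph, Word>>("My document");
        document.Children.Add(paragraph);
        return document;
    }
}
```

The compiler then rejects, for example, adding a Word directly under the document, at the cost of type names that grow with every level.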

How do you define a generic tree for such cases?
I wouldn't try to in the first place. If what I wanted to model was:
I have a list of documents
A document has a list of paragraphs
A paragraph has a list of words
then why do you need generic nodes at all? Make a class Paragraph that has a List<Word>, make a class Document that has a List<Paragraph>, and then make a List<Document> and you're done. Why do you need to artificially impose a generic tree structure? What benefit does that buy you?
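A minimal sketch of that plain-classes approach could look like this. The Text property and the WordCount helper are hypothetical additions to show that each level stays strongly typed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Word
{
    public Word(string text) { Text = text; }
    public string Text { get; }
}

public class Paragraph
{
    public List<Word> Words { get; } = new List<Word>();
}

public class Document
{
    public List<Paragraph> Paragraphs { get; } = new List<Paragraph>();

    // Each level is reached through an ordinary, strongly typed property.
    public int WordCount
    {
        get { return Paragraphs.Sum(p => p.Words.Count); }
    }
}
```

A List<Document> then serves as the top level, with no generic node type involved.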

Have all of your sub-objects implement a specific interface, e.g. IDocumentPart, then declare Node<IDocumentPart>.

I have been reluctant to offer the attached code example, feeling that I don't yet have a strong sense of the "norms" of Stack Overflow in terms of posting code that may be "speculative," and feeling that this particular frolic is some form of "mutant species" escaped from the laboratory on "The Island of Dr. Moreau" :) And I do think the answer by Eric Lippert above is right on.
So please take what follows with "a grain of salt" as just an experiment in "probing" .NET inheritance (it uses .NET Framework 3.5 facilities). My goal in writing this (a few months ago) was to experiment with an abstract class foundation for a Node structure that implemented an internal List<> of "itself," then implement strongly typed classes that inherited from the abstract class, and on that foundation build a generalized Tree data structure.
In fact I was surprised when I tested this, that it worked ! :)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
// experimental code : tested to a limited extent
// use only for educational purposes
namespace complexTree
{
// foundation abstract class template
public abstract class idioNode
{
// a collection of "itself" !
public List<idioNode> Nodes { private set; get; }
public idioNode Parent { get; set; }
public idioNode()
{
Nodes = new List<idioNode>();
}
public void Add(idioNode theNode)
{
Nodes.Add(theNode);
theNode.Parent = this;
}
}
// strongly typed Node of type String
public class idioString : idioNode
{
public string Value { get; set; }
public idioString(string strValue)
{
Value = strValue;
}
}
// strongly typed Node of type Int
public class idioInt : idioNode
{
public int Value { get; set; }
public idioInt(int intValue)
{
Value = intValue;
}
}
// strongly typed Node of a complex type
// note : this is just a "made-up" test case
// designed to "stress" this experiment
// it certainly doesn't model any "real world"
// use case
public class idioComplex : idioNode
{
public Dictionary<idioString, idioInt> Value { get; set; }
public idioComplex(idioInt theInt, idioString theString)
{
Value = new Dictionary<idioString, idioInt>();
Value.Add(theString, theInt);
}
public void Add(idioInt theInt, idioString theString)
{
Value.Add(theString, theInt);
theInt.Parent = this;
theString.Parent = this;
}
}
// special case the Tree's root nodes
// no particular reason for doing this
public class idioTreeRootNodes : List<idioNode>
{
public new void Add(idioNode theNode)
{
base.Add(theNode);
theNode.Parent = null;
}
}
// the Tree object
public class idioTree
{
public idioTreeRootNodes Nodes { get; set; }
public idioTree()
{
Nodes = new idioTreeRootNodes();
}
}
}
So, to the test : (call this code from some EventHandler on a WinForm) :
// make a new idioTree
idioTree testIdioTree = new idioTree();
// make a new idioNode of type String
idioString testIdioString = new idioString("a string");
// add the Node to the Tree
testIdioTree.Nodes.Add(testIdioString);
// make a new idioNode of type Int
idioInt testIdioInt = new idioInt(99);
// add to Tree
testIdioTree.Nodes.Add(testIdioInt);
// make another idioNode of type String
idioString testIdioString2 = new idioString("another string");
// add the new Node to the child Node collection of the Int type Node
testIdioInt.Nodes.Add(testIdioString2);
// validate inheritance can be verified at run-time
if (testIdioInt.Nodes[0] is idioString) MessageBox.Show("it's a string, idiot");
if (!(testIdioInt.Nodes[0] is idioInt)) MessageBox.Show("it's not an int, idiot");
// make a new "complex" idioNode
// creating a Key<>Value pair of the required types of idioNodes
idioComplex complexIdio = new idioComplex(new idioInt(88), new idioString("weirder"));
// add a child Node to the complex idioNode
complexIdio.Add(new idioInt(77), new idioString("too weird"));
// make another idioNode of type Int
idioInt idioInt2 = new idioInt(33);
// add the complex idioNode to the child Node collection of the new Int type idioNode
idioInt2.Nodes.Add(complexIdio);
// add the new Int type Node to the Tree
testIdioTree.Nodes.Add(idioInt2);
// validate you can verify the type of idioComplex at run-time
MessageBox.Show(" tree/2/0 is complex = " + (testIdioTree.Nodes[2].Nodes[0] is idioComplex).ToString());
If the "smell" of this code is as bad as the fruit that here in Thailand we call the "durian" : well, so be it :) An obvious possible "weirdness" in this experiment is that you could have references to the same Node in more than one place in the tree at the same time.
