Unit testing simple Tree structure manipulations

Given a very simple structure such as this:
public class TreeNode
{
public int ID { get; set; }
public List<TreeNode> Children { get; set; }
}
TreeNode may have other properties.
And when used in the following manner:
var tree = new List<TreeNode>(); //no root node
If I perform add/update/remove operations on the tree based on certain criteria (for example, removing a node based on one or more of the other properties mentioned above), I'd like to compare the tree graph before and after the changes and then verify some of the following via unit tests:
Tree remains unchanged
Specified nodes are removed
Specified nodes are added
Specified nodes are updated
The 3 above whilst also verifying that the rest of the tree is unchanged.
Ideally, I'd throw an exception listing the nodes that were not found, not expected, etc. However, at this stage I'd be happy with a true/false result from my check.
Are there any known patterns, algorithms, or existing projects that would help with this?
I am happy for pseudo-code or examples in other languages as long as they don't rely on features I can't replicate in .NET.
My tree is unlikely to get to more than 7 or 8 levels deep and no more than a hundred nodes in total as it will be test data so brute force looping is fine and performance isn't a consideration at this time.
I'm really looking for tips, tricks, advice, code on how to approach this.
TIA

When I wrote unit tests for tree structures, I simply built an ad-hoc tree with a known structure, executed operations on it, and verified that the changes were exactly the ones I expected. It is a very simple but usable method if you create good test cases.
Beyond my own experience, you could write a recursive comparison method for tree nodes that returns a list of the child nodes that differ. The basic idea is to maintain two equal trees, perform the operation on one of them, then check what changed.
If you don't have a UI that shows the tree, I'd also recommend visualizing it with http://www.graphviz.org/ . You can generate pictures of your tree before and after an operation to see how the whole structure changed (not usable for unit tests, but helpful anyway).
Finally, I suggest having a root node; it will simplify your recursive algorithms. If you have no root because of UI requirements or the like, you can modify that part to simply ignore the root.
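The recursive comparison idea could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes sibling IDs are unique, matches nodes by ID, ignores child order, and uses a hypothetical Name property to stand in for the "other properties" mentioned in the question. The names TreeDiff and Compare are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TreeNode
{
    public int ID { get; set; }
    public string Name { get; set; }   // hypothetical extra property, used to detect updates
    public List<TreeNode> Children { get; set; } = new List<TreeNode>();
}

public static class TreeDiff
{
    // Returns one message per difference; an empty list means the two forests match.
    // Assumes IDs are unique among siblings (ToDictionary throws otherwise).
    public static List<string> Compare(
        IEnumerable<TreeNode> before, IEnumerable<TreeNode> after, string path = "")
    {
        var diffs = new List<string>();
        var oldById = (before ?? Enumerable.Empty<TreeNode>()).ToDictionary(n => n.ID);
        var newById = (after ?? Enumerable.Empty<TreeNode>()).ToDictionary(n => n.ID);

        foreach (var id in oldById.Keys.Where(id => !newById.ContainsKey(id)))
            diffs.Add($"removed: {path}/{id}");
        foreach (var id in newById.Keys.Where(id => !oldById.ContainsKey(id)))
            diffs.Add($"added: {path}/{id}");

        // Nodes present on both sides: compare their data, then recurse into children.
        foreach (var pair in oldById.Where(p => newById.ContainsKey(p.Key)))
        {
            var oldNode = pair.Value;
            var newNode = newById[pair.Key];
            var here = $"{path}/{oldNode.ID}";
            if (oldNode.Name != newNode.Name)
                diffs.Add($"updated: {here}");
            diffs.AddRange(Compare(oldNode.Children, newNode.Children, here));
        }
        return diffs;
    }
}
```

A test can then assert that the returned list contains exactly the expected entries, which also verifies that the rest of the tree is unchanged. Note that a node moved to a different parent is reported as one removal plus one addition.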

You can also write a function that returns a string representation of the tree and simply compare two string representations instead of comparing two trees.
I did that earlier this week.
Example function (Swift):
public var description: String {
var s = "\(value)"
if !children.isEmpty {
s += " {" + children.map { "\($0.description)"}.joined(separator: ", ") + "}"
}
return s
}
You can test it like this
XCTAssert ( tree.description == "beverages {hot {tea {black, green, chai}, coffee, cocoa}, cold {soda {ginger ale, bitter lemon}, milk}}");
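Since the question is about .NET, the same idea might look like this in C#. This is only a sketch: it prints the ID where the Swift version prints value, and the names TreeText, DescribeNode and DescribeForest are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TreeNode
{
    public int ID { get; set; }
    public List<TreeNode> Children { get; set; } = new List<TreeNode>();
}

public static class TreeText
{
    // Renders a node as "ID {child, child, ...}", recursing into children.
    public static string DescribeNode(TreeNode node)
    {
        var text = node.ID.ToString();
        if (node.Children != null && node.Children.Count > 0)
        {
            text += " {" + string.Join(", ", node.Children.Select(c => DescribeNode(c))) + "}";
        }
        return text;
    }

    // The question's tree has no root node, so a list of top-level nodes is rendered too.
    public static string DescribeForest(IEnumerable<TreeNode> roots)
    {
        return string.Join(", ", roots.Select(r => DescribeNode(r)));
    }
}
```

Because the output depends on child order, two trees that differ only in the order of siblings produce different strings; sort the children first if order should not matter.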

Related

Roslyn - How can I replace multiple nodes with multiple nodes each?

Background:
Using Roslyn with C#, I am trying to expand auto-implemented properties, so that the accessor bodies can have code injected by later processing. I am using StackExchange.Precompilation as the compiler hook, so these syntax transformations occur in the build pipeline, not as part of an analyzer or refactoring.
I want to turn this:
[SpecialAttribute]
int AutoImplemented { get; set; }
into this:
[SpecialAttribute]
int AutoImplemented {
get { return _autoImplemented; }
set { _autoImplemented = value; }
}
private int _autoImplemented;
The problem:
I have been able to get simple transformations working, but I'm stuck on auto-properties, and a few others that are similar in some ways. The trouble I'm having is in using the SyntaxNodeExtensions.ReplaceNode and SyntaxNodeExtensions.ReplaceNodes extension methods correctly when replacing more than one node in a tree.
I am using a class extending CSharpSyntaxRewriter for the transformations. I'll just share the relevant members of that class here. This class visits each class and struct declaration, and then replaces any property declarations that are marked with SpecialAttribute.
private readonly SemanticModel model;
public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node) {
if (node == null) throw new ArgumentNullException(nameof(node));
node = VisitMembers(node);
return base.VisitClassDeclaration(node);
}
public override SyntaxNode VisitStructDeclaration(StructDeclarationSyntax node) {
if (node == null) throw new ArgumentNullException(nameof(node));
node = VisitMembers(node);
return base.VisitStructDeclaration(node);
}
private TNode VisitMembers<TNode>(TNode node)
where TNode : SyntaxNode {
IEnumerable<PropertyDeclarationSyntax> markedProperties =
node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Where(prop => prop.HasAttribute<SpecialAttribute>(model));
foreach (var prop in markedProperties) {
SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
//If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
//ReplaceNode appears to not be replacing anything
node = node.ReplaceNode(prop, expanded);
}
return node;
}
private SyntaxList<SyntaxNode> ExpandProperty(PropertyDeclarationSyntax node) {
//Generates list of new syntax elements from original.
//This method will produce correct output.
}
HasAttribute<TAttribute> is an extension method I defined for PropertyDeclarationSyntax that checks if that property has an attribute of the given type. This method works correctly.
I believe I am just not using ReplaceNode correctly. There are three related methods:
TRoot ReplaceNode<TRoot>(
TRoot root,
SyntaxNode oldNode,
SyntaxNode newNode);
TRoot ReplaceNode<TRoot>(
TRoot root,
SyntaxNode oldNode,
IEnumerable<SyntaxNode> newNodes);
TRoot ReplaceNodes<TRoot, TNode>(
TRoot root,
IEnumerable<TNode> nodes,
Func<TNode, TNode, SyntaxNode> computeReplacementNode);
I am using the second one, because I need to replace each property node with both field and property nodes. I need to do this with many nodes, but there is no overload of ReplaceNodes that allows one-to-many node replacement. The only way I found around having that overload was using a foreach loop, which seems very 'imperative' and against the functional feel of the Roslyn API.
Is there a better way to perform batch transformations like this?
Update:
I found a great blog series on Roslyn and dealing with its immutability. I haven't found the exact answer yet, but it looks like a good place to start.
https://joshvarty.wordpress.com/learn-roslyn-now/
Update:
So here is where I'm really confused. I know that the Roslyn API is all based on immutable data structures, and the problem here is in a subtlety of how the copying of structures is used to mimic mutability. I think the problem is that every time I replace a node in my tree, I then have a new tree, and so when I call ReplaceNode that tree supposedly doesn't contain my original node that I want to replace.
It is my understanding that the way trees are copied in Roslyn is that, when you replace a node in a tree you actually create a new tree that references all the same nodes of the original tree, except the node you replaced and all nodes directly above that one. The nodes below the replaced node may be removed if the replacement node no longer references them, or new references may be added, but all the old references still point to the same node instances as before. I am pretty sure this is exactly what Anders Hejlsberg describes in this interview on Roslyn (20 to 23 min in).
So shouldn't my new node instance still contain the same prop instances found in my original sequence?
Hacky solution for special cases:
I was finally able to get this particular problem of transforming property declarations to work by relying on property identifiers, which will not change in any tree transformations. However, I would still like a general solution for replacing multiple nodes with multiple nodes each. This solution is really working around the API not through it.
Here is the special case solution:
private TNode VisitMembers<TNode>(TNode node)
where TNode : SyntaxNode {
//Collect the names up front; they stay valid after the tree has been replaced.
List<string> markedPropertyNames =
node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Where(prop => prop.HasAttribute<SpecialAttribute>(model))
.Select(prop => prop.Identifier.ValueText)
.ToList();
foreach (var propName in markedPropertyNames) {
//Look the property up again in the current tree by name.
var oldProp = node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Single(p => p.Identifier.ValueText == propName);
SyntaxList<SyntaxNode> newProp = ExpandProperty(oldProp);
node = node.ReplaceNode(oldProp, newProp);
}
return node;
}
Another similar problem I am working on is modifying all return statements in a method to insert postcondition checks. That case obviously cannot rely on any kind of unique identifier the way a property declaration can.

When you do that:
foreach (var prop in markedProperties) {
SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
//If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
//ReplaceNode appears to not be replacing anything
node = node.ReplaceNode(prop, expanded);
}
After the first replacement, node (your class, for example) no longer contains the original property.
In Roslyn everything is immutable, so the first replacement should work, and then you have a new tree/node that the remaining prop instances do not belong to.
To make it work, consider one of the following:
Build the result in your rewriter class without changing the original tree, and replace everything at once when you are finished. In your case, that means replacing the class node at once. I think it's a good option when you want to replace a statement (I used it when I wrote code to convert LINQ query (comprehension) syntax to fluent syntax), but for a whole class it may not be optimal.
Use SyntaxAnnotation / TrackNodes to find a node after the tree has changed. With these you can change the tree as you want and still keep track of the old nodes in the new tree.
Use DocumentEditor; it lets you make multiple changes to a document and then returns a new Document.
If you need example for one of them, let me know.
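As an illustration of the TrackNodes option, here is a rough sketch of a one-to-many batch replacement. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; MultiReplace and ReplaceEach are hypothetical names, and the expand callback stands in for something like the question's ExpandProperty. It also assumes each target sits in a list (for example a class's member list), because the one-to-many ReplaceNode overload requires that.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;

public static class MultiReplace
{
    // Replaces each target node with the sequence of nodes returned by expand.
    // TrackNodes annotates the targets so they can be found again in every
    // new tree produced by the successive ReplaceNode calls.
    public static TRoot ReplaceEach<TRoot>(
        TRoot root,
        IEnumerable<SyntaxNode> targets,
        Func<SyntaxNode, IEnumerable<SyntaxNode>> expand)
        where TRoot : SyntaxNode
    {
        var originals = targets.ToList();
        var current = root.TrackNodes(originals);

        foreach (var original in originals)
        {
            // Find the counterpart of the original node in the current tree.
            var live = current.GetCurrentNode(original);
            if (live == null)
            {
                continue; // removed by an earlier replacement
            }
            current = current.ReplaceNode(live, expand(live));
        }
        return current;
    }
}
```

The loop is still imperative, but it no longer depends on identifiers being unique, so it should also apply to cases such as return statements, provided each one sits directly in a block.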

Extending the TreeView control for incremental filtering/searching

I'm trying to extend the winforms TreeView control to allow incremental filtering and searching similar to the Solution Explorer in VS2012/VS2013.
Ideally, I would like it to be capable of replacing the existing TreeView with minimal code change - as far as the consumer is concerned, the only difference would be a method void Filter(string). Because of this, I think it would make sense for the Nodes property to return the TreeNodeCollection with ALL nodes, even ones not showing because of an applied filter.
I have the code written to handle the filtering, and it actually works quite well except when I access base.Nodes, it returns my filtered nodes and not the full list.
The problem I have is, I'm unable to clone or create a new instance of TreeNodeCollection, because the constructor is marked as internal. So my ideal code would look something like this:
public class TreeViewEx : TreeView
{
// results in a compiler error:
private TreeNodeCollection _allNodes = new TreeNodeCollection();
public new TreeNodeCollection Nodes { get { return _allNodes; } }
public TreeNodeCollection FilteredNodes { get { return base.Nodes; } }
public void Filter(string searchString)
{
base.BeginUpdate();
base.Nodes.Clear();
foreach (TreeNode node in FilterInternal(_allNodes, searchString))
{
base.Nodes.Add(node);
}
base.EndUpdate();
}
}
So as you can see, I'm trying to decouple the nodes that are shown in the UI from the nodes that the consumer would access. Of course with TreeNodeCollection having an internal constructor only, I'm unable to create a new instance or clone it.
I considered these two options, but neither sound like good solutions:
Use reflection to instantiate the TreeNodeCollection object (due to the internal constructor) for the second list. This option seems like it would be more efficient than #2, but of course I'm creating an instance of an object I'm not supposed to.
Instantiate a second TreeView in memory and use the Nodes property from that to maintain my second list. This seems like it might be a lot of overhead.
I want the end result to still be a TreeNodeCollection so the TreeView can be used to replace our existing controls with minimal code and we do have several places using the Find method, which doesn't exist in List<TreeNode>.
Does anyone have any recommendations on how to handle this? What about performance/resource-wise with my two considerations?
Thank you
Update 1:
Per Pat's recommendation, I decided to take a step back and avoid messing with Nodes altogether. So now I've added a List<TreeNode> AllNodes property and have the Nodes just display the nodes that appear in the TreeView (the filtered list), so now it's a bit simpler.
My problem now is, how do I know when AllNodes has an item added to it so I can keep Nodes in sync? I've considered using a BindingList so I have the ListChanged event, but then I would need to have my TreeNode and node's children/grand-children/etc (AllNodes[0].Nodes) use a custom class that inherits from TreeNode and change the Nodes property, and TreeNode.Nodes isn't overridable. Is there another way? I could make a new property called NodeExs or something, but that seems very unintuitive and I could see another dev coming along later and pulling his hair out because the Nodes property is there but doesn't work.

With regard to your proposed solutions, #2 is out because a TreeNode cannot belong to more than one control. And while it might be possible to create an instance of TreeNodeCollection via reflection, it won't be very useful because it's designed to be coupled to a TreeView or another TreeNode. You won't be able to add/remove nodes from the collection.
Because of this, I think it would make sense for the Nodes property to
return the TreeNodeCollection with ALL nodes, even ones not showing
because of an applied filter.
I disagree. The TreeNodeCollection returned by the Nodes property is used by the framework and OS to render the control. You really don't want to hide this property or alter its functionality.
If a consumer needs to have access to _allNodes, create a List<TreeNode> AllNodes property or use a custom collection.

I've found that the TreeNodeCollection should only be used to read the listed nodes. Instead, I've used List<TreeNode> to hold the nodes. In my project, I created a List<TreeNode> for each level of the TreeView. I filled the lists at the same time as I filled the TreeView, at startup. In the end, I used AddRange() to combine them into a list of all the nodes. This way I had all the nodes listed and categorized.
It's easy and fast to create these kinds of lists. I also created a List<string> version of the all-nodes list, which I set up as an AutoCompleteCustomSource for my TextBox. This way I was able to use a TextBox with AutoComplete for searching the nodes.
I'd make different lists for the consumers and other categories. Then I'd add to the TreeView only the items that meet the given criteria. You can also use treeView.Nodes.Remove() to remove any node. You'd still have the actual node stored in the lists, and could add it back again later.
These are just some ideas.

Tree structure +2 children

I implemented a tree structure in c# where a node looks like the following
public class Node
{
public int ID{get;set;}
public string Name{get;set;}
public Node Parent {get;set;}
public IList<Node> Children{get;set;}
public IList<Object> Items{get;set;}
public IEnumerable<Node> Ancestors {get{return this.GetAncestors();}}
}
I want to improve my structure, but I am not sure what this kind of tree is called. It's not a binary tree, since the number of children varies and can be more than 2. I use recursion for almost every operation, from getting a node by Name, ID or reference to removing nodes. In my case, when a node is removed, I add both its Items and its Children to the Parent node.
I wrote it from scratch and I am sure someone has done it better, so could you please help me figure out the name of this tree structure so I can google it for improvements?

k-ary tree is probably the closest to what you're looking for. This typically refers to a tree where each node has at most k children (for some k, e.g. a binary tree is a 2-ary tree).
If you're looking for the case where the number of children per node is unbounded, I don't believe that has a specific name, it's just called a tree (although I imagine some resources might call that a k-ary tree as well).
An obvious place for improvement I see here is to use generics for your structure (you should replace IList<Object> with a generic data type, and rename Items to Data ... probably).
Without knowing what you want to do, I can't say whether IList<Object> is a good idea - an alternative might be to have a class with members with specific types instead, or IList<SomeOtherType>.
Having each node store a reference to its parent is not that typical, but if there's a need for it, it can be done.
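To make the generics suggestion concrete, a sketch of a generic node might look like the following. It is one possible shape, not the asker's actual design: Node<T>, AddChild and RemoveAndPromote are illustrative names, and the removal behaviour (handing items and children to the parent) follows the description in the question.

```csharp
using System;
using System.Collections.Generic;

public class Node<T>
{
    private readonly List<Node<T>> children = new List<Node<T>>();

    public Node(int id, string name)
    {
        ID = id;
        Name = name;
    }

    public int ID { get; }
    public string Name { get; }
    public Node<T> Parent { get; private set; }
    public IReadOnlyList<Node<T>> Children => children;
    public List<T> Items { get; } = new List<T>();

    // Keeps Parent and Children consistent by setting both in one place.
    public Node<T> AddChild(Node<T> child)
    {
        child.Parent = this;
        children.Add(child);
        return child;
    }

    // Walks up the Parent links, nearest ancestor first.
    public IEnumerable<Node<T>> Ancestors
    {
        get
        {
            for (var n = Parent; n != null; n = n.Parent)
            {
                yield return n;
            }
        }
    }

    // Removes this node and hands its items and children to its parent.
    public void RemoveAndPromote()
    {
        if (Parent == null) throw new InvalidOperationException("Cannot remove the root.");
        var parent = Parent;
        parent.children.Remove(this);
        parent.Items.AddRange(Items);
        foreach (var child in children)
        {
            child.Parent = parent;
            parent.children.Add(child);
        }
        children.Clear();
        Items.Clear();
        Parent = null;
    }
}
```

Exposing Children as a read-only list means the Parent reference can only change through AddChild and RemoveAndPromote, which avoids the two getting out of sync.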

In a few places these structures are also called n-ary trees. If you want examples, you can google tries and B-trees.
I think a trie comes closest to what you are trying to structure.

Localize node texts in treeview using resource files

For a project I need a tree view that allows the user to select a module, which is then displayed in a content area. The project relies heavily on localization, which is provided by the resource files.
Today I discovered that the texts assigned to preset tree view nodes are not contained in the resource files.
So the question is whether there is a way of doing this short of mapping the elements in code, i.e. assigning a name to each node, running over all nodes, and pulling the resources from the resource manager based on the node name.
This is what I am currently doing, however, it just doesn't "feel" right:
private void TranslateNodes(TreeNodeCollection treeNodeCollection) {
var rm = Resources.ResourceManager;
foreach (TreeNode node in treeNodeCollection) {
node.Text = rm.GetString(node.Name + "_Text");
this.TranslateNodes(node.Nodes);
}
}
Thanks!

Your approach looks OK to me, with one exception: it assumes that node.Name is unique throughout the entire tree view, which is not true in the general case.
You can use TreeNode.FullPath to uniquely identify a node within the tree view. Alternatively, your code can depend on the node's Tag value, but this depends heavily on the usage scenario.
And do not forget to wrap the work in the TreeView's BeginUpdate/EndUpdate calls.

No suitable solution was found except the one stated in the OP ... so closing the question seems appropriate.

Best way to get a list of differences between 2 of the same objects

I would like to generate a list of differences between 2 instances of the same object. Object in question:
public class Step
{
[DataMember]
public StepInstanceInfo InstanceInfo { get; set; }
[DataMember]
public Collection<string> AdHocRules { get; set; }
[DataMember]
public Collection<StepDoc> StepDocs
{...}
[DataMember]
public Collection<StepUsers> StepUsers
{...}
}
What I would like to do is find an intelligent way to return an object that lists the differences between the two instances (for example, let me know that 2 specific StepDocs were added, 1 specific StepUser was removed, and one rule was changed from "Go" to "Stop"). I have been looking into using an MD5 hash, but I can't find any good examples of traversing an object like this and returning a manifest of the specific differences (not just indicating that they are different).
Additional Background: the reason that I need to do this is the API that I am supporting allows clients to SaveStep(Step step)...this works great for persisting the Step object to the db using entities and repositories. I need to raise specific events (like this user was added, etc) from this SaveStep method, though, in order to alert another system (workflow engine) that a specific element in the step has changed.
Thank you.

You'll need a separate object, like StepDiff with collections for removed and added items. The easiest way to do something like this is to copy the collections from each of the old and new objects, so that StepDiff has collectionOldStepDocs and collectionNewStepDocs.
Grab the shorter collection and iterate through it and see if each StepDoc exists in the other collection. If so, delete the StepDoc reference from both collections. Then when you're finished iterating, collectionOldStepDocs contains stepDocs that were deleted and collectionNewStepDocs contains the stepDocs that were added.
From there you should be able to build your manifest in whatever way necessary.
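A sketch of such a diff object is shown below. Note that it swaps in LINQ's Except instead of the copy-and-delete loop described above; the result is the same set of added and removed items. CollectionDiff<T> and Between are hypothetical names, and for types such as StepDoc you would need to pass an IEqualityComparer<T> (or override Equals/GetHashCode) so that items are matched by a key rather than by reference.

```csharp
using System.Collections.Generic;
using System.Linq;

public class CollectionDiff<T>
{
    public List<T> Added { get; }
    public List<T> Removed { get; }

    private CollectionDiff(List<T> added, List<T> removed)
    {
        Added = added;
        Removed = removed;
    }

    // Items only in newItems are "added"; items only in oldItems are "removed".
    // Except is set-based, so duplicates within a collection are collapsed.
    public static CollectionDiff<T> Between(
        IEnumerable<T> oldItems,
        IEnumerable<T> newItems,
        IEqualityComparer<T> comparer = null)
    {
        var oldList = oldItems.ToList();
        var newList = newItems.ToList();
        return new CollectionDiff<T>(
            newList.Except(oldList, comparer).ToList(),
            oldList.Except(newList, comparer).ToList());
    }
}
```

For the AdHocRules example in the question, comparing the old rules { "Go", "Wait" } with the new rules { "Stop", "Wait" } yields "Stop" as added and "Go" as removed; pairing those two up as a "change" is a separate step that depends on how rules are identified.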

Implementing the IComparable interface in your object may provide you with the functionality you need. This will provide you a custom way to determine differences between objects without resorting to checksums which really won't help you track what the differences are in usable terms. Otherwise, there's no way to determine equality between two user objects in .NET that I know of. There are some decent examples of the usage of this interface in the help file for Visual Studio, or here. You might be able to glean some directives from the examples on clean ways to compare the properties and store the values in some usable manner for tracking purposes (perhaps a collection, or dictionary object?).
Hope this helps,
Greg
