Roslyn - How can I replace multiple nodes with multiple nodes each? - c#

Background:
Using Roslyn with C#, I am trying to expand auto-implemented properties, so that the accessor bodies can have code injected by later processing. I am using StackExchange.Precompilation as the compiler hook, so these syntax transformations occur in the build pipeline, not as part of an analyzer or refactoring.
I want to turn this:
[SpecialAttribute]
int AutoImplemented { get; set; }
into this:
[SpecialAttribute]
int AutoImplemented {
get { return _autoImplemented; }
set { _autoImplemented = value; }
}
private int _autoImplemented;
The problem:
I have been able to get simple transformations working, but I'm stuck on auto-properties, and a few others that are similar in some ways. The trouble I'm having is in using the SyntaxNodeExtensions.ReplaceNode and SyntaxNodeExtensions.ReplaceNodes extension methods correctly when replacing more than one node in a tree.
I am using a class extending CSharpSyntaxRewriter for the transformations. I'll just share the relevant members of that class here. This class visits each class and struct declaration, and then replaces any property declarations that are marked with SpecialAttribute.
private readonly SemanticModel model;
public override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node) {
if (node == null) throw new ArgumentNullException(nameof(node));
node = VisitMembers(node);
return base.VisitClassDeclaration(node);
}
public override SyntaxNode VisitStructDeclaration(StructDeclarationSyntax node) {
if (node == null) throw new ArgumentNullException(nameof(node));
node = VisitMembers(node);
return base.VisitStructDeclaration(node);
}
private TNode VisitMembers<TNode>(TNode node)
where TNode : SyntaxNode {
IEnumerable<PropertyDeclarationSyntax> markedProperties =
node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Where(prop => prop.HasAttribute<SpecialAttribute>(model));
foreach (var prop in markedProperties) {
SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
//If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
//ReplaceNode appears to not be replacing anything
node = node.ReplaceNode(prop, expanded);
}
return node;
}
private SyntaxList<SyntaxNode> ExpandProperty(PropertyDeclarationSyntax node) {
//Generates list of new syntax elements from original.
//This method will produce correct output.
}
HasAttribute<TAttribute> is an extension method I defined for PropertyDeclarationSyntax that checks if that property has an attribute of the given type. This method works correctly.
I believe I am just not using ReplaceNode correctly. There are three related methods:
TRoot ReplaceNode<TRoot>(
TRoot root,
SyntaxNode oldNode,
SyntaxNode newNode);
TRoot ReplaceNode<TRoot>(
TRoot root,
SyntaxNode oldNode,
IEnumerable<SyntaxNode> newNodes);
TRoot ReplaceNodes<TRoot, TNode>(
TRoot root,
IEnumerable<TNode> nodes,
Func<TNode, TNode, SyntaxNode> computeReplacementNode);
I am using the second one, because I need to replace each property node with both field and property nodes. I need to do this with many nodes, but there is no overload of ReplaceNodes that allows one-to-many node replacement. The only way I found around having that overload was using a foreach loop, which seems very 'imperative' and against the functional feel of the Roslyn API.
Is there a better way to perform batch transformations like this?
Update:
I found a great blog series on Roslyn and dealing with its immutability. I haven't found the exact answer yet, but it looks like a good place to start.
https://joshvarty.wordpress.com/learn-roslyn-now/
Update:
So here is where I'm really confused. I know that the Roslyn API is all based on immutable data structures, and the problem here is in a subtlety of how the copying of structures is used to mimic mutability. I think the problem is that every time I replace a node in my tree, I then have a new tree, and so when I call ReplaceNode that tree supposedly doesn't contain my original node that I want to replace.
It is my understanding that the way trees are copied in Roslyn is that, when you replace a node in a tree you actually create a new tree that references all the same nodes of the original tree, except the node you replaced and all nodes directly above that one. The nodes below the replaced node may be removed if the replacement node no longer references them, or new references may be added, but all the old references still point to the same node instances as before. I am pretty sure this is exactly what Anders Hejlsberg describes in this interview on Roslyn (20 to 23 min in).
So shouldn't my new node instance still contain the same prop instances found in my original sequence?
Hacky solution for special cases:
I was finally able to get this particular problem of transforming property declarations to work by relying on property identifiers, which will not change in any tree transformations. However, I would still like a general solution for replacing multiple nodes with multiple nodes each. This solution is really working around the API not through it.
Here is the special case solution:
private TNode VisitMembers<TNode>(TNode node)
where TNode : SyntaxNode {
IEnumerable<string> markedPropertyNames =
node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Where(prop => prop.HasAttribute<SpecialAttribute>(model))
.Select(prop => prop.Identifier.ValueText);
foreach (var name in markedPropertyNames) {
//Look the property up again by name, because 'node' is a new tree after each replacement.
var oldProp = node.DescendantNodes()
.OfType<PropertyDeclarationSyntax>()
.Single(p => p.Identifier.ValueText == name);
SyntaxList<SyntaxNode> newProp = ExpandProperty(oldProp);
node = node.ReplaceNode(oldProp, newProp);
}
return node;
}
Another similar problem I am working on is modifying all return statements in a method to insert postcondition checks. That case obviously cannot rely on any kind of unique identifier the way a property declaration can.

When you do this:
foreach (var prop in markedProperties) {
SyntaxList<SyntaxNode> expanded = ExpandProperty(prop);
//If I set a breakpoint here, I can see that 'expanded' will hold the correct value.
//ReplaceNode appears to not be replacing anything
node = node.ReplaceNode(prop, expanded);
}
After the first replacement, node (your class, for example) no longer contains the original property.
In Roslyn, everything is immutable, so the first replacement should work for you, and then you have a new tree/node. The remaining prop instances still belong to the old tree, so later calls cannot find them in the new one.
To make it work, you can consider one of the following:
Build the result in your rewriter class without changing the original tree, and when you are finished, replace everything at once. In your case, that means replacing the class node in one step. I think it is a good option when you want to replace a statement (I used it when I wrote code to convert LINQ query (comprehension) syntax to fluent syntax), but for a whole class it may not be optimal.
Use SyntaxAnnotation or TrackNodes to find a node after the tree has changed. With these options you can change the tree as you like and still keep track of the old nodes in the new tree.
Use DocumentEditor, which lets you make multiple changes to a document and then returns a new Document.
If you need an example of any of them, let me know.
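As a rough illustration of the TrackNodes option, the sketch below replaces several nodes with several nodes each. The helper name ReplaceEach and its expand parameter are hypothetical, not part of Roslyn; TrackNodes, GetCurrentNode and the one-to-many ReplaceNode overload are the extension methods on SyntaxNode. It assumes every target sits in a syntax list (such as a type's member list), which the one-to-many overload requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;

public static class MultiNodeReplacer
{
    // Hypothetical helper: replaces every node in 'targets' with the nodes
    // returned by 'expand'. 'expand' receives the ORIGINAL node, so any
    // semantic-model lookups against the original tree remain valid.
    public static TRoot ReplaceEach<TRoot, TNode>(
        TRoot root,
        IEnumerable<TNode> targets,
        Func<TNode, IEnumerable<SyntaxNode>> expand)
        where TRoot : SyntaxNode
        where TNode : SyntaxNode
    {
        // Materialize the targets once, against the original tree.
        List<TNode> originals = targets.ToList();

        // TrackNodes returns a new tree in which the targets are annotated,
        // so they can be located again after each edit.
        TRoot current = root.TrackNodes(originals);

        foreach (TNode original in originals)
        {
            TNode inCurrentTree = current.GetCurrentNode(original);
            if (inCurrentTree == null)
            {
                continue; // Removed by an earlier replacement (e.g. a nested target).
            }

            // Every ReplaceNode call yields a new tree; keep working on that one.
            current = current.ReplaceNode(inCurrentTree, expand(original));
        }

        return current;
    }
}
```

With the question's code, the loop in VisitMembers could then shrink to something like node = MultiNodeReplacer.ReplaceEach(node, markedProperties, p => ExpandProperty(p)); the lambda is needed because SyntaxList<SyntaxNode> is a struct and a method group would not convert to the delegate type.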

Related

Extending the TreeView control for incremental filtering/searching

I'm trying to extend the winforms TreeView control to allow incremental filtering and searching similar to the Solution Explorer in VS2012/VS2013.
Ideally, I would like it to be capable of replacing the existing TreeView with minimal code change - as far as the consumer is concerned, the only difference would be a method void Filter(string). Because of this, I think it would make sense for the Nodes property to return the TreeNodeCollection with ALL nodes, even ones not showing because of an applied filter.
I have the code written to handle the filtering, and it actually works quite well except when I access base.Nodes, it returns my filtered nodes and not the full list.
The problem I have is, I'm unable to clone or create a new instance of TreeNodeCollection, because the constructor is marked as internal. So my ideal code would look something like this:
public class TreeViewEx : TreeView
{
// results in a compiler error:
private TreeNodeCollection _allNodes = new TreeNodeCollection();
public new TreeNodeCollection Nodes { get { return _allNodes; } }
public TreeNodeCollection FilteredNodes { get { return base.Nodes; } }
public void Filter(string searchString)
{
base.BeginUpdate();
base.Nodes.Clear();
foreach (TreeNode node in FilterInternal(_allNodes, searchString))
{
base.Nodes.Add(node);
}
base.EndUpdate();
}
}
So as you can see, I'm trying to decouple the nodes that are shown in the UI from the nodes that the consumer would access. Of course with TreeNodeCollection having an internal constructor only, I'm unable to create a new instance or clone it.
I considered these two options, but neither sound like good solutions:
Use reflection to instantiate the TreeNodeCollection object (due to the internal constructor) for the second list. This option seems like it would be more efficient than #2, but of course I'm creating an instance of an object I'm not supposed to.
Instantiate a second TreeView in memory and use the Nodes property from that to maintain my second list. This seems like it might be a lot of overhead.
I want the end result to still be a TreeNodeCollection so the TreeView can be used to replace our existing controls with minimal code and we do have several places using the Find method, which doesn't exist in List<TreeNode>.
Does anyone have any recommendations on how to handle this? What about performance/resource-wise with my two considerations?
Thank you
Update 1:
Per Pat's recommendation, I decided to take a step back and avoid messing with Nodes altogether. So now I've added a List<TreeNode> AllNodes property and have the Nodes just display the nodes that appear in the TreeView (the filtered list), so now it's a bit simpler.
My problem now is, how do I know when AllNodes has an item added to it so I can keep Nodes in sync? I've considered using a BindingList so I have the ListChanged event, but then I would need to have my TreeNode and node's children/grand-children/etc (AllNodes[0].Nodes) use a custom class that inherits from TreeNode and change the Nodes property, and TreeNode.Nodes isn't overridable. Is there another way? I could make a new property called NodeExs or something, but that seems very unintuitive and I could see another dev coming along later and pulling his hair out because the Nodes property is there but doesn't work.
With regard to your proposed solutions, #2 is out because a TreeNode cannot belong to more than one control. And while it might be possible to create an instance of TreeNodeCollection via reflection, it won't be very useful because it's designed to be coupled to a TreeView or another TreeNode. You won't be able to add/remove nodes from the collection.
Because of this, I think it would make sense for the Nodes property to
return the TreeNodeCollection with ALL nodes, even ones not showing
because of an applied filter.
I disagree. The TreeNodeCollection returned by the Nodes property is used by the framework and OS to render the control. You really don't want to hide this property or alter its functionality.
If a consumer needs to have access to _allNodes, create a List<TreeNode> AllNodes property or use a custom collection.
I've found that the TreeNodeCollection should only be used to read the listed nodes. Instead, I've used List<TreeNode> to hold the nodes. In my project, I created a List<TreeNode> for each level of the TreeView. I filled the lists at startup, at the same time as I filled the TreeView. In the end, I used AddRange() to combine them into a list of all the nodes. This way I had all the nodes listed and categorized.
It's easy and fast to create these kinds of lists. I also created a List<string> version of the list of all nodes, which I set up as an AutoCompleteCustomSource for my TextBox. This way I was able to use a TextBox with AutoComplete for searching the nodes.
I'd make different lists for the consumers and other categories. Then I'd only add the items that meet the given criteria to the TreeView. You can also use treeView.Nodes.Remove() to remove any nodes. You'd still have the actual node stored in the lists, and could add it back again later.
These are just some ideas.
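One way the "separate list plus filtered view" idea could look is sketched below. The class name FilterableTreeView, the AllNodes property and the CopyIfMatch helper are illustrative assumptions, and the sketch shows copies of the matching nodes rather than the master nodes themselves (an assumption made here so that the master list is never mutated by filtering).

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

public class FilterableTreeView : TreeView
{
    // Master list owned by the consumer; base.Nodes only holds what is displayed.
    public List<TreeNode> AllNodes { get; } = new List<TreeNode>();

    public void Filter(string searchString)
    {
        BeginUpdate();
        try
        {
            Nodes.Clear();
            foreach (TreeNode node in AllNodes)
            {
                TreeNode copy = CopyIfMatch(node, searchString);
                if (copy != null) Nodes.Add(copy);
            }
        }
        finally
        {
            EndUpdate();
        }
    }

    // Returns a pruned copy of 'source' containing only matching nodes and the
    // ancestors needed to reach them, or null when nothing in the subtree matches.
    private static TreeNode CopyIfMatch(TreeNode source, string text)
    {
        var copy = new TreeNode(source.Text) { Name = source.Name, Tag = source.Tag };
        foreach (TreeNode child in source.Nodes)
        {
            TreeNode childCopy = CopyIfMatch(child, text);
            if (childCopy != null) copy.Nodes.Add(childCopy);
        }

        bool selfMatches = string.IsNullOrEmpty(text)
            || source.Text.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0;
        return (selfMatches || copy.Nodes.Count > 0) ? copy : null;
    }
}
```

Because the displayed nodes are copies, selection or edits in the UI would have to be mapped back to the master nodes, for example through Name or Tag; whether that trade-off is acceptable depends on the application.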

Tree structure +2 children

I implemented a tree structure in c# where a node looks like the following
public class Node
{
public int ID{get;set;}
public string Name{get;set;}
public Node Parent {get;set;}
public IList<Node> Children{get;set;}
public IList<Object> Items{get;set;}
public IEnumerable<Node> Ancestors {get{return this.GetAncestors();}}
}
I want to improve my structure, but I am not sure what this kind of tree is called. It's not a binary tree, since the child count varies and can be more than 2. I use recursion for almost every operation, from getting a node by Name, ID or reference to removing nodes. In my case, when a node is removed, I add both its Items and its Children to the Parent node.
I did it from scratch and I am sure someone has done it better, so could you please help me figure out the name of this tree structure so I can google it for improvements?
k-ary tree is probably the closest to what you're looking for. This typically refers to a tree where each node has at most k children (for some k, e.g. a binary tree is a 2-ary tree).
If you're looking for the case where the number of children per node is unbounded, I don't believe that has a specific name, it's just called a tree (although I imagine some resources might call that a k-ary tree as well).
An obvious place for improvement I see here is to use generics for your structure (you should replace IList<Object> with a generic data type, and rename Items to Data ... probably).
Without knowing what you want to do, I can't say whether IList<Object> is a good idea - an alternative might be to have a class with members with specific types instead, or IList<SomeOtherType>.
Having each node store a reference to its parent is not that typical, but if there's a need for it, it can be done.
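A minimal sketch of what a generic version of such a node might look like follows. The type name Node<T> and the members AddChild and RemoveAndPromoteChildren are made up for illustration; the removal method mirrors the behavior described in the question (children move up to the parent) under the assumption that the root itself is never removed.

```csharp
using System;
using System.Collections.Generic;

public class Node<T>
{
    private readonly List<Node<T>> children = new List<Node<T>>();

    public Node(T data) { Data = data; }

    public T Data { get; }
    public Node<T> Parent { get; private set; }
    public IReadOnlyList<Node<T>> Children => children;

    // Creates a child, wires up its parent link, and returns it.
    public Node<T> AddChild(T data)
    {
        var child = new Node<T>(data) { Parent = this };
        children.Add(child);
        return child;
    }

    // Walks the parent links from this node's parent up to the root.
    public IEnumerable<Node<T>> Ancestors
    {
        get
        {
            for (Node<T> n = Parent; n != null; n = n.Parent)
            {
                yield return n;
            }
        }
    }

    // Removes this node and re-attaches its children to its parent,
    // in the position this node used to occupy.
    public void RemoveAndPromoteChildren()
    {
        if (Parent == null)
        {
            throw new InvalidOperationException("Cannot remove the root.");
        }

        int index = Parent.children.IndexOf(this);
        Parent.children.RemoveAt(index);
        foreach (Node<T> child in children)
        {
            child.Parent = Parent;
        }
        Parent.children.InsertRange(index, children);
        children.Clear();
        Parent = null;
    }
}
```

Keeping the child list private and exposing it as IReadOnlyList is one way to make sure the Parent links cannot drift out of sync with the Children collections.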
There are a few places where these structures are also called n-ary trees. If you want examples, you can search for tries and B-trees.
I think a trie comes closest to what you are trying to structure.

Unit testing simple Tree structure manipulations

Given a very simple structure such as this:
public class TreeNode
{
public int ID { get; set; }
public List<TreeNode> Children { get; set; }
}
TreeNode may have other properties.
And when used in the following manner:
var tree = new List<TreeNode>(); //no root node
Suppose I perform add/update/remove operations on the tree based on certain criteria, for example, removal of a node based on one or more of the other properties I mentioned above. I'd like to compare the tree graph before and after the changes and then, via unit tests, verify some of the following:
Tree remains unchanged
Specified nodes are removed
Specified nodes are added
Specified nodes are updated
The three above, whilst also verifying that the rest of the tree is unchanged.
Ideally, I'd throw an exception listing the nodes that were not found, not expected, etc. However, at this stage I'd be happy with a true/false result from my check.
Are there any known patterns, algorithms or existing projects that would help with this?
I am happy for pseudo-code or examples in other languages as long as they don't rely on features I can't replicate in .NET.
My tree is unlikely to get to more than 7 or 8 levels deep and no more than a hundred nodes in total as it will be test data so brute force looping is fine and performance isn't a consideration at this time.
I'm really looking for tips, tricks, advice, code on how to approach this.
TIA
When I wrote unit tests for tree structures, I simply built an ad-hoc tree with a known structure, executed operations on it and verified that the changes were exactly the ones I expected. It is a very simple but usable method, provided you create good test cases.
Beyond my own experience, you could write a recursive comparison method for tree nodes that returns a list of the child nodes that differ. The basic idea is to maintain two equal trees, perform the operation on one of them, then check what changed.
If you don't have any UI that shows the tree, I'd also recommend visualizing it using http://www.graphviz.org/ . You can generate pictures of your tree before and after an operation, so you can see how the whole structure changed (not usable for unit tests, but helpful anyway).
One last thing: I suggest having a root node, as it will simplify your recursive algorithms. If you can't show a root because of UI requirements or similar, you can modify that part to simply ignore the root.
You can also write a function that returns a string representation of the tree and simply compare two string representations instead of comparing two trees.
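The recursive comparison idea could be sketched roughly as follows. TreeDiff.Compare and its message format are invented for illustration, and the sketch assumes that IDs are unique among siblings, that Children is never null (hence the initializer added to the question's class), and that sibling order does not matter. The "before" forest would be a deep copy taken prior to the operation, or a separately built expected tree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TreeNode
{
    public int ID { get; set; }
    public List<TreeNode> Children { get; set; } = new List<TreeNode>();
}

public static class TreeDiff
{
    // Returns one message per difference between the two forests.
    // An empty list means both forests contain the same IDs in the same places.
    public static List<string> Compare(
        IEnumerable<TreeNode> expected,
        IEnumerable<TreeNode> actual,
        string path = "")
    {
        var diffs = new List<string>();
        Dictionary<int, TreeNode> expectedById = expected.ToDictionary(n => n.ID);
        Dictionary<int, TreeNode> actualById = actual.ToDictionary(n => n.ID);

        // Nodes that should be at this level but are not.
        foreach (int id in expectedById.Keys.Except(actualById.Keys))
        {
            diffs.Add($"missing: {path}/{id}");
        }

        // Nodes that are at this level but should not be.
        foreach (int id in actualById.Keys.Except(expectedById.Keys))
        {
            diffs.Add($"unexpected: {path}/{id}");
        }

        // Nodes present on both sides: descend into their children.
        foreach (int id in expectedById.Keys.Intersect(actualById.Keys))
        {
            diffs.AddRange(Compare(
                expectedById[id].Children, actualById[id].Children, $"{path}/{id}"));
        }

        return diffs;
    }
}
```

A unit test can then assert that the returned list is empty for the "tree remains unchanged" case, or that it contains exactly the expected "missing" or "unexpected" entries for removals and additions. Checking updated nodes would need an extra comparison of the other properties at the point where both sides have the same ID.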
I did that earlier this week
example function (swift)
public var description: String {
var s = "\(value)"
if !children.isEmpty {
s += " {" + children.map { "\($0.description)"}.joined(separator: ", ") + "}"
}
return s
}
You can test it like this
XCTAssert ( tree.description == "beverages {hot {tea {black, green, chai}, coffee, cocoa}, cold {soda {ginger ale, bitter lemon}, milk}}");

Optimal way to get a single child element

We have C# code that walks various XML documents that we create. Often we need to get a known child element (it can be the only child or there could be other siblings). I have a function that given a parent and the child name will return the child element:
public static XmlElement GetChildElement(XmlElement parentElement, string childName)
{
return parentElement.GetElementsByTagName(childName).Cast<XmlElement>().FirstOrDefault();
}
This works fine but the other day I wondered if it could be done cleaner and easier with XPath or LINQ to XML. Most of the XPath examples I have found seem to want to know the entire structure of the document and I want a generic function that just knows about the parent and child. Linq to XML seems more promising but I haven't found an example matching what I am looking for.
Well LINQ to XML makes this very easy - you just use the XContainer.Element method:
XElement child = parent.Element(elementName);
This will give you the first element if there are any, or null otherwise.
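For instance, a LINQ to XML counterpart of the question's helper might look like the sketch below. The names ChildLookup and FirstChildValue are made up for illustration. XContainer.Element only looks at direct children, and the string name here refers to an element in no namespace; a namespaced element would need an XNamespace-qualified XName instead.

```csharp
using System;
using System.Xml.Linq;

public static class ChildLookup
{
    // Returns the text of the first DIRECT child called 'childName',
    // or null when the parent has no such child.
    public static string FirstChildValue(XElement parent, string childName)
    {
        XElement child = parent.Element(childName);
        return child?.Value;
    }
}
```

Given XElement.Parse("<order><item>first</item><note><item>nested</item></note></order>"), asking for "item" finds the element containing "first", while asking for "customer" yields null. The nested item under note is never considered, because it is a grandchild rather than a child.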
Given what you already have, you can just do this:
public static XmlElement GetChildElement(XmlElement parentElement, string childName)
{
return parentElement[childName];
}
This will return the first matching child element, or null if there is none. Heck, I'm not sure there's even much sense using a convenience method for this, but the above modification will work if you already have references to this method.
One thing to note here is that the code you provided doesn't return the first matching child element; it returns the first matching descendant element. If that is in fact what you want, you can do this:
public static XmlElement GetChildElement(XmlElement parentElement, string childName)
{
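// The leading "." keeps the search below parentElement; a bare "//" would start at the document root.
return parentElement.SelectSingleNode(".//" + childName) as XmlElement;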
}
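To make the child/descendant distinction concrete, here is a small sketch with both lookups side by side. The class and method names are illustrative only. It relies on the XmlNode string indexer, which returns the first child element with the given name, and on a relative XPath expression for the descendant case.

```csharp
using System;
using System.Xml;

public static class XmlLookup
{
    // First direct child element with the given name, or null.
    public static XmlElement DirectChild(XmlElement parent, string name)
    {
        return parent[name];
    }

    // First matching element anywhere below 'parent', or null.
    public static XmlElement FirstDescendant(XmlElement parent, string name)
    {
        // ".//" is evaluated relative to 'parent'. An expression starting
        // with "//" is absolute and would search the whole document instead.
        return parent.SelectSingleNode(".//" + name) as XmlElement;
    }
}
```

With the document <root><target>top</target><a><b><target>deep</target></b></a></root> and the a element as parent, DirectChild returns null because target is not an immediate child of a, whereas FirstDescendant returns the element containing "deep".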
XmlNode.SelectSingleNode is the method you are looking for if you can't use XElement:
var result = parentElement.SelectSingleNode(
string.Format("*[local-name()='{0}']", nameWithoutPrefix));
Note that my sample cheats with namespaces (it accepts any), so you should work out whether you need to support namespaces correctly in your case.

Querying using LINQ nested data structure and returning nested groups

Is there a way to do this but for a data structure that has an unknown level of nesting? In addition, though I believe this was the case in the other question also, every level has more than one entry (assume this, though it could have only one, or zero).
In addition, is there a good way to store such a data structure such that the parent of every object can be easily found? I was thinking of something like a jagged array, but that seems hard to generate at runtime, as I do not know how deep the nesting is. Something with a structure like a treeview would be ideal, but I don't want to implement a control if I will just be using it for data storage, and not for the visual part.
As a last resort, I was thinking of writing my own class to store the data, but don't want to do that if I don't have to.
Are you looking for an n-ary tree with parent references?
class Node
{
public Node Parent { get; }
public IEnumerable<Node> Children { get; }
}
IEnumerable<Node> Flatten(Node node)
{
yield return node;
foreach (var child in node.Children)
{
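// Flatten(child) is itself a sequence, so yield its elements one by one.
foreach (var descendant in Flatten(child))
{
yield return descendant;
}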
}
}
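As a sketch of how such a structure could be built and queried with LINQ, the variant below adds a constructor, an Add method that wires up the parent link, and a Depth property. The type name GroupNode and these members are assumptions made for illustration, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class GroupNode
{
    private readonly List<GroupNode> children = new List<GroupNode>();

    public GroupNode(string name) { Name = name; }

    public string Name { get; }
    public GroupNode Parent { get; private set; }
    public IEnumerable<GroupNode> Children => children;

    // Attaches 'child' below this node and returns this node,
    // so that Add calls can be chained while building a tree.
    public GroupNode Add(GroupNode child)
    {
        child.Parent = this;
        children.Add(child);
        return this;
    }

    // Number of parent links between this node and the root (0 for the root).
    public int Depth => Parent == null ? 0 : Parent.Depth + 1;

    // This node followed by all of its descendants, depth-first.
    public IEnumerable<GroupNode> Flatten()
    {
        return new[] { this }.Concat(children.SelectMany(c => c.Flatten()));
    }
}
```

Once flattened, ordinary LINQ operators apply regardless of how deep the nesting goes; for example, root.Flatten().GroupBy(n => n.Depth) groups the nodes by level, and following Parent from any node leads back to the root.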
