I need to find a path or paths down a complicated graph structure. The graph is built using something similar to this:
class Node
{
    public string Value { get; set; }
    public List<Node> Nodes { get; set; }

    public Node()
    {
        Nodes = new List<Node>();
    }
}
What makes this complicated is that the nodes can reference back to an earlier node. For example,
A -> C -> E -> A
What I need to do is get a list of stacks which represent paths through the nodes until I get to a node with a specific value. Since it's possible for some of the available paths to be very long, we can have a maximum number of nodes to try.
List<Stack<Node>> paths = FindPaths(string ValueToFind, int MaxNumberNodes);
Does anyone have a way to build this (or something similar)? I've done recursion in the past, but for some reason I'm having a total brain fart thinking this one through. My question mentioned a lambda expression, but using a lambda is not strictly required; I'd be grateful for any solution.
Side note: I lifted the class from aku's excellent answer to this recursion question. While his elegant solution shown below traverses the tree structure, it doesn't seem to allow enough flexibility to do what I need (for example, dismissing paths that are circular and tracking paths that are successful).
Action<Node> traverse = null;
traverse = (n) => { Console.WriteLine(n.Value); n.Nodes.ForEach(traverse);};
traverse(root); // where root is the tree structure
Edit:
Based on input from the comments and answers below I found an excellent solution over in CodeProject. It uses the A* path finding algorithm. Here is the link.
If your issue is related to pathfinding, you may want to google for "A star" or "A*".
It's a common and efficient pathfinding algorithm. See this article for an example directly related to your problem.
You may also want to look at Dijkstra's algorithm.
I'm not sure whether your intended output is all paths to the goal, the best path to the goal (by some metric, e.g. path length), or just any path to the goal.
Assuming the latter, I'd start with the recursive strategy, including tracking of visited nodes as outlined by Brann, and make these changes:
Add parameters to represent the goal being sought, the collection of successful paths, and the current path from the start.
When entering a node that matches the goal, add the current path (plus the current node) to the list of successful paths.
Extend the current path with the current node to create the path passed on any recursive calls.
Invoke the initial ExploreGraph call with an empty path and an empty list of successful paths.
Upon completion, your algorithm will have traversed the entire graph, and distinct paths to the goal will have been captured.
That's just a quick sketch, but you should be able to flesh it out for your specific needs.
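For illustration, here is a rough, untested sketch of that recipe using the Node class from the question. Treating the maximum-node count as a per-path length limit, and stopping a path once the goal is reached, are assumptions you may want to adjust:
public static List<Stack<Node>> FindPaths(Node root, string valueToFind, int maxNodes)
{
    var successfulPaths = new List<Stack<Node>>();
    ExploreGraph(root, valueToFind, maxNodes, new List<Node>(), successfulPaths);
    return successfulPaths;
}

private static void ExploreGraph(Node current, string valueToFind, int maxNodes,
                                 List<Node> currentPath, List<Stack<Node>> successfulPaths)
{
    if (currentPath.Contains(current))        // circular path: abandon this branch
        return;
    if (currentPath.Count >= maxNodes)        // path has grown too long: give up on it
        return;

    currentPath.Add(current);                 // extend the current path with this node

    if (current.Value == valueToFind)         // goal reached: capture a copy of the path
        successfulPaths.Add(new Stack<Node>(currentPath));
    else
        foreach (Node child in current.Nodes)
            ExploreGraph(child, valueToFind, maxNodes, currentPath, successfulPaths);

    currentPath.RemoveAt(currentPath.Count - 1);   // backtrack
}
Because only the current path is checked for repeats, the same node can still appear on several different successful paths; drop the else if you also want paths that continue through a goal node.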
I don't know exactly what you want to achieve, but this circular reference problem is usually solved by tagging already visited nodes.
Just use a Dictionary to keep track of the nodes which have already been visited so that you don't loop.
Example :
public void ExploreGraph(TreeNode tn, Dictionary<TreeNode, bool> visitedNodes)
{
    foreach (TreeNode childNode in tn.Nodes)
    {
        if (!visitedNodes.ContainsKey(childNode))
        {
            visitedNodes.Add(childNode, true);
            ExploreGraph(childNode, visitedNodes);
        }
    }
}
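For completeness, the initial call would then look something like this (assuming your entry node is named root):
ExploreGraph(root, new Dictionary<TreeNode, bool>());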
Related
I am trying to implement the A* algorithm in order to solve the following:
1. I have an initial state.
2. I can apply an "Action" to advance from one state to another state.
3. I want to reach a final state in the fewest actions.
4. Applying an action to a given state is simple (= fast).
5. The whole state is a complex object (= huge in memory and slow to clone).
The issue comes from point 5.
Indeed, when looking for the possible children of a current state, I cannot create a whole new state each time because it would be too costly (both in terms of memory and speed). As a result, I am working with a single state that I mutate to reflect the result of applying an action to a former state (I am able to roll back an action). I was thinking of implementing A* with something like the code below:
// _state: the single "singleton" state on which actions are applied and rolled back instead of cloning it
while (openNodes.Any())
{
    var currentNode = openNodes.DeQueue();
    currentNode.AdvanceFromStart(_state); // make _state such that all actions on the path from the root to currentNode are played
    if (IsFinal(_state))
        return;
    AddToVisitedStates(_state);
    foreach (var transition in currentNode.GetPossibleActions())
    {
        var childNode = new Node(initialState: _state, action: transition.Action);
        // here _state still reflects the situation from the currentNode point of view
        childNode.ApplyAction(_state);
        // now _state reflects the situation from the childNode point of view
        if (WasVisited(_state))
        {
            childNode.RollbackAction(_state);
            continue;
        }
        if (childNode.CostToReachNode == 0 ||
            currentNode.CostToReachNode + transition.Cost < childNode.CostToReachNode)
        {
            childNode.CostToReachNode = currentNode.CostToReachNode + transition.Cost;
            childNode.CostToReachFinal = childNode.CostToReachNode + HeuristicToReachFinalFromState(_state);
            openNodes.ReOrder(childNode);
        }
        if (!openNodes.Contains(childNode))
            openNodes.Add(childNode);
        childNode.RollbackAction(_state);
    }
    currentNode.RollbackToInitialState(_state); // restore _state to its initial setup
}
I am not a fan of this solution. Is there something in the A* algorithm that I am missing that would help? I have not finished the implementation yet; do you see any upcoming issues or points worth raising?
Maybe A* is not the right algorithm; I am open to any lead toward something different.
PS: if relevant, it is for a C# implementation.
You could make it look a lot more like normal A* by storing in each object, not the state, but the sequence of decisions taken from the initial state that led to it. When you want to deal with a state, look at the sequence of decisions that led to the current state, back up to the common ancestor with the state you need to go to, and then go down that set of recorded decisions. The cost of such a change is at most some constant factor times the depth of the decision tree; if the tree is heavily branched and balanced, it might not be that deep.
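As a loose sketch of that replay idea (State, IAction, Apply and Rollback below are placeholders standing in for your own state/action machinery, not an established API):
using System.Collections.Generic;

public sealed class State { /* the large, expensive-to-clone state */ }

public interface IAction
{
    void Apply(State state);
    void Rollback(State state);
}

public sealed class SearchNode
{
    public SearchNode Parent;   // null for the root
    public IAction Action;      // the action that produced this node from Parent
    public int Depth;           // root has Depth == 0
}

public static class StateReplayer
{
    // Mutates 'state', currently positioned at 'from', so that it reflects 'to',
    // by rolling back to the common ancestor and replaying decisions forward.
    public static void MoveState(State state, SearchNode from, SearchNode to)
    {
        var a = from;
        var b = to;
        var replay = new Stack<IAction>();

        // bring both nodes to the same depth
        while (a.Depth > b.Depth) { a.Action.Rollback(state); a = a.Parent; }
        while (b.Depth > a.Depth) { replay.Push(b.Action); b = b.Parent; }

        // climb in lockstep until the common ancestor is reached
        while (!ReferenceEquals(a, b))
        {
            a.Action.Rollback(state); a = a.Parent;
            replay.Push(b.Action);    b = b.Parent;
        }

        // replay the recorded decisions down to 'to'
        while (replay.Count > 0)
            replay.Pop().Apply(state);
    }
}
Each open node then stores only its parent link and the action that produced it, so the memory cost per node stays small.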
Another option would be some version of https://en.wikipedia.org/wiki/Iterative_deepening_depth-first_search or Limited Discrepancy Search, using the best answer found so far (from previous iterations) together with the A* heuristic to prune nodes that cannot possibly lead to a better answer. When you complete a pass in which (after pruning) the current limit on the discrepancy or depth has not actually stopped you from investigating every node you wanted to, you know you have found the best answer.
I am working with C# and now trying to improve an algorithm (different story there), and to do that I need to have this data structure:
As you can see from the diagram, it is a linked list, where each node can have zero or one "follower" (the node to its right). I am still thinking about whether more than one is necessary.
I could implement these linked lists myself, "raw", but I am thinking it would be much better to use one of the collections already available (such as List, etc.).
So far I am thinking of building a class "PairClass" which will have a "first element" and a "follower" (the left node and the right node). This could change if I decide to include more than one linked node (follower). Then I would use a List<PairClass>.
One final consideration is that it would be nice if the collection permits me to get the follower, given the first element, in an efficient manner.
Due to this last consideration, I am not sure whether List<PairClass> would be the best approach.
Can someone advise me on what to use in these cases? I am always open to learning and discussing better ways of doing things. Basically, I am asking for an efficient solution to the problem.
EDIT: (in response to the comments)
How do you identify each node, is there an ID? or will the index in a list suffice?
So far, I am content with using just simple integers. But I guess you are right; you just gave me an idea, and perhaps the solution I need is simpler than I thought!
What are your use cases? How often will you be adding or removing elements? Are you going to iterate over this collection?
I will be adding elements often. The "follower" would likely be replaced often too. The elements are not going to be removed. I am going to iterate over this collection in the end; the reason being that followers are going to be eliminated as elements of consideration and replaced by their first element.
(Side note) The reason I am doing this is that I need to modify an algorithm that is taking too much time. The algorithm performs too many scans of an image (which takes time), so I plan to build this structure to solve the problem; therefore speed is a consideration.
You really need to add more details; however, based on your description:
If you don't need to iterate over the list in order
If you have a key for each node
If you want fast lookups
You could use a Dictionary<Key,Node>
Node
public class Node
{
    // examples
    public string Id { get; set; }
    public Node Parent { get; set; }
    public Node Child { get; set; }
    public Node Sibling { get; set; }
}
Option 1
var nodes = new Dictionary<string, Node>();
// add like this
nodes.Add(node.Id, node);
// look up like this
node = nodes[key];
// access its relatives
var parent = node.Parent;
var child = node.Child;
var sibling = node.Sibling;
If you want to iterate over the list often
If the index is all you need to look up the node
Or if you want to query the list via Linq
Option 2
var list = new List<Node>();
// lookup via index
var node = list[index];
// lookup via Linq
node = list.FirstOrDefault(x => x.Id == someId);
If it is a single-follower scenario, I would suggest a dictionary of lists as a possible candidate: the dictionary makes the collection fast to access "vertically", and since each entry is a single chain of followers you can easily use a linked list for it.
If it is a multiple-follower scenario, I would suggest a dictionary of dictionaries, which makes the whole collection fast to access both vertically and horizontally.
Saruman gave a fairly good example of an implementation.
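As a rough illustration of the dictionary-of-lists idea, assuming nodes are identified by plain integers as mentioned in the edit above:
// key: the "first element"; value: its chain of followers (often zero or one)
var followers = new Dictionary<int, LinkedList<int>>();

// register a follower for node 3
if (!followers.TryGetValue(3, out var chain))
    followers[3] = chain = new LinkedList<int>();
chain.AddLast(7);

// efficient lookup of the follower(s) of node 3
foreach (int follower in followers[3])
    Console.WriteLine(follower);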
I'm using a diagram control which contains a Node with links to its parents as well as its children. I want to flatten the tree, but in order. What's the fastest and most efficient way to do this, preferably in C#?
Here's an example of a tree. I'm having difficulties coming up with a good description of the order, so I apologize. The order should list all nodes in order of their creation. For example, in the tree below, a valid order would be (Root, A, B, C, G, D, H, E, F). This order guarantees that no node is added to the list before its parent(s).
Your graph is not a tree, since a node can have multiple parents. I assume you have a directed acyclic graph (DAG). Your ordered list is called a topological ordering of that directed graph, which exists if and only if the graph is acyclic. Luckily for you, such an ordering can be produced in linear running time.
You can do this with a depth-first search starting from the root(s):
public static IEnumerable<Node> GetTopologicalGraphOrdering(IEnumerable<Node> roots)
{
    var list = new List<Node>();
    var visited = new HashSet<Node>();
    Action<Node> visit = null;
    visit = (n) =>
    {
        if (visited.Add(n))
        {
            foreach (Node child in n.Children)
            {
                visit(child);
            }
            list.Add(n);
        }
    };
    foreach (Node n in roots)
    {
        visit(n);
    }
    list.Reverse();
    return list;
}
(Untested notepad code)
Note that this naive implementation will cause a stack overflow for deep graphs. Switch to an explicit stack or an alternative algorithm, if that becomes a problem.
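For example, an explicit-stack variant might look like the following sketch, assuming the same Node.Children shape as above (also untested):
public static IEnumerable<Node> GetTopologicalGraphOrderingIterative(IEnumerable<Node> roots)
{
    var list = new List<Node>();
    var visited = new HashSet<Node>();
    var stack = new Stack<(Node node, IEnumerator<Node> children)>();

    foreach (Node root in roots)
    {
        if (!visited.Add(root))
            continue;
        stack.Push((root, root.Children.GetEnumerator()));
        while (stack.Count > 0)
        {
            var (node, children) = stack.Peek();
            if (children.MoveNext())
            {
                Node child = children.Current;
                if (visited.Add(child))
                    stack.Push((child, child.Children.GetEnumerator()));
            }
            else
            {
                stack.Pop();
                list.Add(node);   // post-order: added only after all its children
            }
        }
    }

    list.Reverse();
    return list;
}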
Read the wikipedia article Topological sorting for more details.
I want to make several modifications to a Roslyn syntax tree at once, all around the same area of code:
tree = tree.ReplaceNodes(oldNode, newNode).RemoveNode(toRemove);
however, only the first modification succeeds. It seems that the first change changes all the nodes around it, so the RemoveNode method no longer finds toRemove in the resulting tree. I really, really don't want to redo the work to recalculate toRemove in the new tree, and using a single SyntaxRewriter to perform all the work (overriding the DefaultVisit method) is ridiculously slow.
How can I do what I want?
Before I offer a few alternatives, your comment that a SyntaxRewriter is "ridiculously slow" is a bit surprising. When you say "slow", do you mean "it's a lot of code to write" or "it's performing terribly"? That is the fastest (execution-time-wise) way to do multiple replacements, and both ReplaceNodes and RemoveNode use a rewriter internally. If you were having performance problems, make sure that when you implement your DefaultVisit you only visit children when the nodes you're interested in are under the node it's called on. The simple trick is to compare spans and make sure the span of the node passed in intersects with the spans of the nodes you are processing.
Anyway, SyntaxAnnotations provide a useful way to locate nodes in trees after a modification. You can just create an instance of the type and attach it to a node with the WithAdditionalAnnotations extension method. You can locate the node again with the GetAnnotatedNodesOrTokens method.
So one way to approach your problem is to annotate your toRemove, and then when you call ReplaceNodes do two replacements in the same call -- one to do the oldNode -> newNode replacement and then one to do the toRemove -> toRemoveWithAnnotation replacement. Then find the annotated node in the resulting tree and call RemoveNode.
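As a loose sketch of that annotation approach (root, oldNode, newNode and toRemove are stand-ins for your own nodes, and the usual Microsoft.CodeAnalysis and System.Linq usings are assumed):
var marker = new SyntaxAnnotation();

// one ReplaceNodes call: swap oldNode for newNode and tag toRemove with the annotation
var annotatedRoot = root.ReplaceNodes(
    new[] { oldNode, toRemove },
    (original, rewritten) =>
        original == oldNode ? newNode :
        original == toRemove ? rewritten.WithAdditionalAnnotations(marker) :
        rewritten);

// the annotation survives the rewrite, so the node can be found again and removed
// (KeepNoTrivia is just one choice of SyntaxRemoveOptions)
var trackedNode = annotatedRoot.GetAnnotatedNodes(marker).Single();
var finalRoot = annotatedRoot.RemoveNode(trackedNode, SyntaxRemoveOptions.KeepNoTrivia);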
If you know that oldNode and toRemove aren't ancestors of each other (i.e. they're in unrelated parts of the tree), another option would be to reverse the ordering. Grab the parent node (call it oldNodeParent) of toRemove and call RemoveNode, meaning you get an updated parent node (call it oldNodeParentRewritten). Then, call ReplaceNodes doing two replacements: oldNode -> newNode and oldNodeParent -> oldNodeParentRewritten. No annotations needed.
I have an object graph wherein each child object contains a property that refers back to its parent. Are there any good strategies for ignoring the parent references in order to avoid infinite recursion? I have considered adding a special [Parent] attribute to these properties or using a special naming convention, but perhaps there is a better way.
If the loops can be generalised (you can have any number of elements making up the loop), you can keep track of objects you've seen already in a HashSet and stop if the object is already in the set when you visit it. Or add a flag to the objects which you set when you visit it (but you then have to go back & unset all the flags when you're done, and the graph can only be traversed by a single thread at a time).
Alternatively, if the loops will only be back to the parent, you can keep a reference to the parent and not loop on properties that refer back to it.
For simplicity, if you know the parent reference will have a certain name, you could just not loop on that property :)
What a coincidence; this is the topic of my blog this coming Monday. See it for more details. Until then, here's some code to give you an idea of how to do this:
static IEnumerable<T> Traversal<T>(
    T item,
    Func<T, IEnumerable<T>> children)
{
    var seen = new HashSet<T>();
    var stack = new Stack<T>();
    seen.Add(item);
    stack.Push(item);
    yield return item;
    while (stack.Count > 0)
    {
        T current = stack.Pop();
        foreach (T newItem in children(current))
        {
            if (!seen.Contains(newItem))
            {
                seen.Add(newItem);
                stack.Push(newItem);
                yield return newItem;
            }
        }
    }
}
The method takes two things: an item, and a relation that produces the set of everything that is adjacent to the item. It produces a depth-first traversal of the transitive and reflexive closure of the adjacency relation on the item. Let the number of items in the graph be n, and the maximum depth be 1 <= d <= n, assuming the branching factor is not bounded. This algorithm uses an explicit stack rather than recursion because (1) recursion in this case turns what should be an O(n) algorithm into O(nd), which is then something between O(n) and O(n^2), and (2) excessive recursion can blow the stack if d is more than a few hundred nodes.
Note that the peak memory usage of this algorithm is of course O(n + d) = O(n).
So, for example:
foreach(Node node in Traversal(myGraph.Root, n => n.Children))
Console.WriteLine(node.Name);
Make sense?
If you're doing a graph traversal, you can have a "visited" flag on each node. This ensures that you don't revisit a node and possibly get stuck in an infinite loop. I believe this is the standard way of performing a graph traversal.
This is a common problem, but the best approach depends on the scenario. An additional problem is that in many cases it isn't a problem visiting the same object twice - that doesn't imply recursion - for example, consider the tree:
A
=> B
   => C
=> D
   => C
This may be valid (think XmlSerializer, which would simply write the C instance out twice), so it is often necessary to push/pop objects on a stack to check for true recursion. The last time I implemented a "visitor", I kept a "depth" counter, and only enabled the stack checking beyond a certain threshold - that means that most trees simply end up doing some ++/--, but nothing more expensive. You can see the approach I took here.
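That depth-threshold idea might look something like this loose sketch; the class name, the threshold value and the GetChildren helper are hypothetical placeholders, not the approach from the linked post:
using System.Collections.Generic;

class CycleAwareVisitor
{
    private const int DepthThreshold = 25;              // hypothetical cut-off; tune as needed
    private readonly Stack<object> _path = new Stack<object>();
    private int _depth;

    public void Visit(object obj)
    {
        if (obj == null) return;

        _depth++;
        bool pushed = false;
        try
        {
            // only pay for the stack check once the traversal gets suspiciously deep
            if (_depth > DepthThreshold)
            {
                // true recursion: this instance is already on the current path
                // (Stack.Contains uses Equals; swap in a ReferenceEquals check if Equals is overridden)
                if (_path.Contains(obj))
                    return;
                _path.Push(obj);
                pushed = true;
            }

            foreach (object child in GetChildren(obj))   // GetChildren is a hypothetical helper
                Visit(child);
        }
        finally
        {
            if (pushed)
                _path.Pop();
            _depth--;
        }
    }

    private static IEnumerable<object> GetChildren(object obj)
    {
        yield break; // placeholder: enumerate child objects via reflection or known properties
    }
}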
I'm not exactly sure what you are trying to do here, but you could just maintain a hashtable of all previously visited nodes while you are doing your breadth-first search or depth-first search.
I published a post explaining in detail, with code examples, how to do object graph traversal by recursive reflection and also how to detect and avoid recursive references to prevent a stack overflow exception: https://doguarslan.wordpress.com/2016/10/03/object-graph-traversal-by-recursive-reflection/
In that example I did a depth-first traversal using recursive reflection, and I maintained a HashSet of visited nodes for reference types. One thing to be careful about is to initialize your HashSet with a custom equality comparer which uses the object reference for the hash calculation: basically the GetHashCode() implemented by the base object class itself, not any overridden version of GetHashCode(). If the types of the properties you traverse override GetHashCode, you may detect false hash collisions and think you have found a recursive reference, when in reality the overridden GetHashCode is just producing the same hash value via some heuristic and confusing the HashSet. All you need to detect is whether any parent and child anywhere in the object tree point to the same location in memory.
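For example, a minimal reference-equality comparer along those lines might look like this (on .NET 5 and later, the built-in ReferenceEqualityComparer.Instance serves the same purpose):
using System.Collections.Generic;
using System.Runtime.CompilerServices;

sealed class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) => ReferenceEquals(x, y);

    // hash on object identity, bypassing any overridden GetHashCode
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

// usage: a visited set keyed purely on object identity
var visited = new HashSet<object>(new ReferenceComparer<object>());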