Defensive Code to prevent Infinite Recursion in Parent/Child Hierarchy - c#

Given an object such as this:
public class Thing
{
public Thing() { this.children = new List<Thing>();}
public int Id {get; set;}
public string Name {get; set;}
public List<Thing> children{ get; set;}
public string ToString(int level = 0)
{
//Level is added purely to add a visual hierarchy
var sb = new StringBuilder();
sb.Append(new String('-',level));
sb.AppendLine($"id:{Id} Name:{Name}");
foreach(var child in children)
{
sb.Append(child.ToString(level + 1));
}
return sb.ToString();
}
}
and if used (abused!?) in such a way
public static void Main()
{
var root = new Thing{Id = 1,Name = "Thing1"};
var thing2 = new Thing{Id = 2,Name = "Thing2"};
var thing3 = new Thing{Id = 3,Name = "Thing3"};
root.children.Add(thing2);
thing2.children.Add(thing3);
thing3.children.Add(root); //problem is here
Console.WriteLine(root.ToString());
}
how can one defend against this kind of scenario?
As it stands, this code fails with a stack overflow (infinite recursion) or an out-of-memory error.
In an IIS website this was crashing the w3wp worker processes, and eventually Rapid-Fail Protection shut the app pool down.
The code above is indicative only to reproduce the problem. In the actual scenario, the structure is coming from a database with Id and ParentId.
Database table structure similar to
CREATE TABLE Thing(
Id INT NOT NULL PRIMARY KEY,
Name NVARCHAR(255) NOT NULL,
ParentThingId INT NULL -- References self
)
The issue is that the creation of the 'things' by users does not prevent an incestuous relationship (i.e. a parent could have children, who could have children, and so on, until one eventually points at the parent again). One could put a constraint on the db to prevent a thing from being its own parent (makes sense), but depending on depth this could get ugly, and there is some argument that a circular reference may be required (we are still debating this...).
So arguably the structures can be circular, but if you want to render this kind of structure on a web page, say as a <ul><li><a> kind of thing in a parent/child menu, how does one become proactive about dealing with this user-generated data issue in code?
.NET fiddle here

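One way would be to pass a collection of visited nodes along in the recursive call. If a node has been visited before, you are in a cycle.
public string ToString(int level = 0, HashSet<int> visited = null)
{
    // The outermost call creates the set and records the current node.
    visited = visited ?? new HashSet<int> { Id };
    var sb = new StringBuilder();
    sb.Append(new String('-', level));
    sb.AppendLine($"id:{Id} Name:{Name}");
    foreach(var child in children)
    {
        if(visited.Add(child.Id))
            sb.Append(child.ToString(level + 1, visited));
        else
        {
            //Handle the case when a cycle is detected.
        }
    }
    return sb.ToString();
}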
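A visited set shared across the whole traversal also flags a node that is legitimately reachable through two different parents. If that matters, one variation is to track only the ids on the current path from the root and remove each id again when backing out. The following is a minimal sketch under that assumption; the `Render` method name, the capitalised `Children` property, and the marker text are illustrative choices, not part of the original class.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Thing
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Thing> Children { get; } = new List<Thing>();

    // Entry point: the path initially contains only this node.
    public string Render() => Render(0, new HashSet<int> { Id });

    private string Render(int level, HashSet<int> path)
    {
        var sb = new StringBuilder();
        sb.Append(new string('-', level));
        sb.AppendLine($"id:{Id} Name:{Name}");
        foreach (var child in Children)
        {
            if (path.Add(child.Id))
            {
                sb.Append(child.Render(level + 1, path));
                // Backtrack: the child is no longer an ancestor of what follows.
                path.Remove(child.Id);
            }
            else
            {
                // The child is one of our own ancestors, so this edge closes a cycle.
                sb.Append(new string('-', level + 1));
                sb.AppendLine($"id:{child.Id} (cycle detected, not expanded)");
            }
        }
        return sb.ToString();
    }
}
```

With the `root -> thing2 -> thing3 -> root` data from the question, this renders the three things once and then a single marker line for the edge back to `root`, instead of recursing forever.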

You can unfold the tree structure by putting each element on a stack or queue and popping items off it while the collection still has items. Inside the while loop you put the children of each item on the queue.
If you care about the level of the item in the tree, you can use a helper object that stores it.
Edit:
While unfolding the tree you can put each item on a new list and use that list to check for circular references.
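The unfolding described above could look roughly like the sketch below. It assumes a simplified `Thing` with a `Children` list, uses a tuple as the "helper object" that carries the level, and uses a `HashSet<int>` of ids as the reference for circular problems; `ThingRenderer` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Thing
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Thing> Children { get; } = new List<Thing>();
}

public static class ThingRenderer
{
    public static string Render(Thing root)
    {
        var sb = new StringBuilder();
        // Ids that have already been queued for output.
        var visited = new HashSet<int> { root.Id };
        // Each entry carries the node together with its depth in the tree.
        var pending = new Stack<(Thing Node, int Level)>();
        pending.Push((root, 0));

        while (pending.Count > 0)
        {
            var (node, level) = pending.Pop();
            sb.Append(new string('-', level));
            sb.AppendLine($"id:{node.Id} Name:{node.Name}");

            // Push in reverse so the first child is popped, and printed, first.
            for (int i = node.Children.Count - 1; i >= 0; i--)
            {
                var child = node.Children[i];
                if (visited.Add(child.Id))
                {
                    pending.Push((child, level + 1));
                }
                // else: the child was seen before, so it is skipped here.
            }
        }
        return sb.ToString();
    }
}
```

Because there is no recursion, the depth of the data no longer affects the call stack, and a circular reference simply stops being followed.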

If you can a) eliminate the possibility of wanting circular references and b) guarantee that all children are already known when the parent is created, it's a great opportunity to make children an immutable collection that's only set via the constructor.
That gives you a class that, by structural recursion, you know cannot contain any loops, no matter how big the overall structure is. Something like:
public sealed class Thing
{
public Thing(IEnumerable<Thing> children) {
this._children = children.ToList().AsReadOnly();
}
private readonly ReadOnlyCollection<Thing> _children;
public int Id {get; set;}
public string Name {get; set;}
public IEnumerable<Thing> children {
get {
return _children;
}
}
public string ToString(int level = 0)
{
//Level is added purely to add a visual hierarchy
var sb = new StringBuilder();
sb.Append(new String('-',level));
sb.AppendLine($"id:{Id} Name:{Name}");
foreach(var child in children)
{
sb.Append(child.ToString(level + 1));
}
return sb.ToString();
}
}
Now, of course, those conditions I have stated above are quite big "if"s, so you need to consider whether it's a good fit for you.
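To illustrate why such a class cannot contain loops, here is a small usage sketch. It uses a variant of the class above in which `Id` and `Name` are also set through the constructor and the children are passed as a `params` array; both of those are assumptions made to keep the example short, and `Demo.BuildChain` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Thing
{
    public Thing(int id, string name, params Thing[] children)
    {
        Id = id;
        Name = name;
        // Copy the input so later changes to the caller's array have no effect.
        Children = children.ToList().AsReadOnly();
    }

    public int Id { get; }
    public string Name { get; }
    public IReadOnlyList<Thing> Children { get; }
}

public static class Demo
{
    public static Thing BuildChain()
    {
        // Children have to exist before their parent can be constructed,
        // so the structure can only be built bottom-up.
        var thing3 = new Thing(3, "Thing3");
        var thing2 = new Thing(2, "Thing2", thing3);
        var root = new Thing(1, "Thing1", thing2);
        // There is no way to hand 'root' to thing3 at this point:
        // thing3 was completed before root existed and exposes no Add method.
        return root;
    }
}
```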

Related

C# Cloning graph and updating circular references

Let me preface this by stating that I have seen similar posts to this, but none of the solutions have satisfied me and/or applied to C#.
I have a Graph class that consists of Node and Connection objects. The graph contains collections consisting of all of the child Node and Connection objects associated with it. In addition to this, each Node has a collection of Connection objects.
Please note: This is a simplified toy problem. You can view the actual (work-in-progress) production code here. In production, a Neuron is a Node and an Axon is a Connection.
public class Graph : IDeepCloneable<Graph>
{
// These are basically just Dictionary<int,object>s wrapped in an ICollection
public NodeCollection Nodes;
public ConnectionCollection Connections;
public Graph Clone()
{
return new Graph
{
Nodes = this.Nodes.Clone(),
Connections = this.Connections.Clone()
};
}
}
public class Node : IDeepCloneable<Node>
{
public int Id;
public NodeConnectionCollection Connections;
// NodeConnectionCollection is more or less the same as NodeCollection
// except that it stores Connection objects into '.Incoming' and '.Outgoing' properties
public Node Clone()
{
return new Node
{
Id = this.Id,
Connections = this.Connections.Clone()
};
}
}
public class Connection : IDeepCloneable<Connection>
{
public int Id;
public Node From;
public Node To;
public Connection Clone()
{
return new Connection
{
Id = this.Id,
From = this.From.Clone(),
To = this.To.Clone()
};
}
}
public class ConnectionCollection : ICollection<Connection>, IDeepCloneable<ConnectionCollection>
{
private Dictionary<int, Connection> idLookup;
private Dictionary<ProjectionKey, Connection> projectionLookup;
public int Count => idLookup.Count;
public bool IsReadOnly => false;
public void Add( Connection conn )
{
idLookup.Add( conn.Id, conn );
projectionLookup.Add( new ProjectionKey( conn.From.Id, conn.To.Id ), conn );
}
...
internal struct ProjectionKey
{
readonly int From;
readonly int To;
readonly int HashCode;
public ProjectionKey( int from, int to )
{
From = from;
To = to;
HashCode = ( 23 * 397 + from ) * 397 + to;
}
public override int GetHashCode() { return HashCode; }
}
}
public class NodeCollection : ICollection<Node>, IDeepCloneable<NodeCollection>
{
private Dictionary<int, Node> nodes;
private Dictionary<int, InputNode> inputNodes;
private Dictionary<int, InnerNode> innerNodes;
private Dictionary<int, OutputNode> outputNodes;
...
public Node this[ int id ]
{
get => nodes[ id ];
}
}
Each of these objects supports deep cloning, with the main idea being that consuming classes can call Clone() on child classes and work down the stack that way.
However, this is not viable in production. A call to Graph.Clone() will clone the NodeCollection and ConnectionCollection fields, which will clone each Node and Connection instance stored within them, which will each clone other referencing child elements.
A common solution seems to be storing the Ids of each child object and then rebuilding the references when all cloning is complete. However, as far as I am aware, this would require a reference to parent objects and tightly couple the data structure.
I am very puzzled at how to properly approach this. I require a reasonable amount of performance, as my application (a genetic algorithm) performs cloning constantly, but in this case I am more interested in finding a robust design pattern or implementation that will allow me to perform deep cloning of this graph structure while stashing a lot of the gruntwork behind the scenes.
Is there any design pattern that will allow me to clone this data structure as-is while updating circular references and maintaining its integrity?
My suggestion would be to change your approach to the problem from cloning to recreating. I've dealt with a similar problem, where I saved a graph that the user had created manually in the user interface, and then recreated it when the saved graph was imported. It sounds almost the same if you think about it.
So the solution I came up with was serializing the graph from a central control (considering you are modifying graphs with a heuristic, I assume you have central control over the graph). Even if you don't have central control over the graph, I believe it can be traversed in a way that collects all the information.
In its simplest form a graph is a collection of neighborhood information, which can be directed or undirected:
1 -> 2
1 -> 3
3 -> 2
So if you can come up with a way to generate a list like this, after just tweaking this simple list, you can create your new graph.
Or another approach would be to list your nodes with their neighbors like below,
1, [2,3]
3, [2]
In my opinion this would make recreating the graph even simpler.
Here is the file from the project where I applied this approach, if you are curious. I don't think it would serve as a reference for the answer or the question, though.
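Applied to the toy classes from the question, "recreating" could be done in two passes: first recreate every node by id, then replay the edge list against the new nodes. The sketch below is an assumption-laden simplification: it replaces `NodeCollection` and `ConnectionCollection` with plain dictionaries, and the `AddNode` and `Connect` helpers are hypothetical.

```csharp
using System.Collections.Generic;

public class Node
{
    public int Id;
    public List<Connection> Incoming = new List<Connection>();
    public List<Connection> Outgoing = new List<Connection>();
}

public class Connection
{
    public int Id;
    public Node From;
    public Node To;
}

public class Graph
{
    public Dictionary<int, Node> Nodes = new Dictionary<int, Node>();
    public Dictionary<int, Connection> Connections = new Dictionary<int, Connection>();

    public Node AddNode(int id)
    {
        var node = new Node { Id = id };
        Nodes.Add(id, node);
        return node;
    }

    // Wires a connection between two nodes that already exist in this graph.
    public Connection Connect(int connectionId, int fromId, int toId)
    {
        var conn = new Connection { Id = connectionId, From = Nodes[fromId], To = Nodes[toId] };
        Connections.Add(connectionId, conn);
        conn.From.Outgoing.Add(conn);
        conn.To.Incoming.Add(conn);
        return conn;
    }

    public Graph Clone()
    {
        var copy = new Graph();
        // Pass 1: recreate the nodes by id, with no connections yet.
        foreach (var id in Nodes.Keys)
        {
            copy.AddNode(id);
        }
        // Pass 2: replay the edge list (connection id, from id, to id)
        // so every reference points at a node owned by the copy.
        foreach (var conn in Connections.Values)
        {
            copy.Connect(conn.Id, conn.From.Id, conn.To.Id);
        }
        return copy;
    }
}
```

Only the graph knows how to clone itself here, so `Node` and `Connection` never call `Clone()` on each other and circular references cannot cause runaway recursion.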

Entity Framework lazy loaded collection sometimes null

I have 2 models, one of which has a child collection of the other:
[Table("ParentTable")]
public class Parent
{
[Key, Column("Parent")]
public string Id { get; set; }
[Column("ParentName")]
public string Name { get; set; }
public virtual ICollection<Widget> Widgets { get; set; }
}
[Table("WidgetTable")]
public class Widget
{
public string Year { get; set; }
[Column("Parent")]
public string ParentId { get; set; }
public string Comments { get; set; }
[Key, Column("ID_Widget")]
public int Id { get; set; }
[ForeignKey("ParentId"), JsonIgnore]
public virtual Parent Parent { get; set; }
}
This code works for > 99% of widgets:
var parent = _dbContext.Parents.FirstOrDefault(p => p.Id == parentId);
Usually, parent.Widgets is a collection with more than one item. In a couple of instances, however, parent.Widgets is null (not a collection with no items).
I have used Query Analyzer to trace both the query for the parent and the query for widgets belonging to that parent. Both return exactly the rows I expect; however, the model for one or two parent IDs results in a null value for the Widgets collection. What could cause a lazy-loaded collection to be null in some instances but not others?
This situation commonly comes up when a DbContext is kept alive across an Add, a SaveChanges, and then a retrieval.
For example:
var context = new MyDbContext(); // holding Parents.
var testParent = new Parent{Id = "Parent1", Name = "Parent 1"};
context.Parents.Add(testParent);
At this point if you were to do:
var result = context.Parents.FirstOrDefault(x=> x.Id == "Parent1");
you wouldn't get a parent. Selection comes from committed state. So...
context.SaveChanges();
var result = context.Parents.FirstOrDefault(x=> x.Id == "Parent1");
This will return a reference to the parent you inserted, since the context knows about this entity and holds a reference to the object you created. It doesn't go to data state. Since Widgets was defined as a plain get/set auto-property, the Widgets collection in this case will be null.
if you do this:
context.Dispose();
context = new MyDbContext();
var result = context.Parents.FirstOrDefault(x=> x.Id == "Parent1");
In this case the parent is not known by the new context, so it goes to data state. EF will return a proxy list for lazy loading the Widgets; since there are none, you get back an empty list, not null.
When dealing with collection classes in EF it's best to avoid auto-properties or initialize them in your constructor to avoid this behaviour; you'll typically want to assign Widgets after creating a Parent. Initializing a default member is better because you don't want to encourage ever using a setter on the collection property.
For example:
private readonly List<Widget> _widgets = new List<Widget>();
public virtual ICollection<Widget> Widgets
{
get { return _widgets; }
protected set { throw new InvalidOperationException("Do not set the Widget collection. Use Clear() and Add()"); }
}
Avoid performing a Set operation on a collection property as this will screw up in entity reference scenarios. For instance, if you wanted to sort your Widget collection by year and did something like:
parent.Widgets = parent.Widgets.OrderBy(x=> x.Year).ToList();
Seems innocent enough, but when the Widgets reference was an EF proxy, you've just blown it away. EF now cannot perform change tracking on the collection.
Initialize your collection and you should avoid surprises with null collection references. Also I would look at the lifetime of your DbContext. It's good to keep one initialized over the lifetime of a request or particular operation, but avoid keeping it alive longer than necessary. Context change tracking and the like consume resources, and you can find seemingly intermittent odd behaviour like this when a context spans operations.
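The difference between the two declarations can be seen without Entity Framework at all. The sketch below is plain C# with no EF reference; the two class names are hypothetical and only mirror the shape of the Parent model, so it shows what an object created with `new` looks like, not what EF does when it materializes an entity.

```csharp
using System.Collections.Generic;

public class Widget
{
    public int Id { get; set; }
    public string Year { get; set; }
}

// Same shape as the model in the question: nothing ever assigns Widgets.
public class ParentWithAutoProperty
{
    public string Id { get; set; }
    public virtual ICollection<Widget> Widgets { get; set; }
}

// The collection exists as soon as the object does, and outsiders cannot replace it.
public class ParentWithInitializedCollection
{
    public string Id { get; set; }
    public virtual ICollection<Widget> Widgets { get; protected set; } = new List<Widget>();
}
```

An instance of the first class created with `new` has `Widgets == null` until something assigns it, while the second always starts with an empty collection. To sort without replacing the collection, keep the sorted result in a local variable, for example `var sorted = parent.Widgets.OrderBy(w => w.Year).ToList();`.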

Find orphaned elements within a hierarchy

I am struggling to find a good solution for this. It's fairly straight forward to find the orphaned elements, but the trouble is storing them in such a way that they can easily be merged back into the hierarchy at a later point.
I have the following abstract class that has multiple implementations:
public abstract class FilterElement
{
public abstract string ID { get; }
public abstract IEnumerable<FilterElement> Children { get; set; }
public FilterElement Parent { get; set; }
}
I have two hierarchies of FilterElement - the "master" (i.e. the main structure), and the "filters". The filters point at elements in the master - however, if these master elements do not exist, I wish to create a third structure, the "orphans".
I'm struggling to do this. While it's easy to identify the orphaned elements, I don't know how to store them effectively. This is the current solution:
Note: "GetFlatKey" returns a unique key for the element based on its parents & children, and "RecursiveSelect" effectively flattens the hierarchy:
private IEnumerable<FilterElement> GetOrphanedFilterElements
(IEnumerable<FilterElement> filters,
IEnumerable<IFilterFileViewModel> visibleList)
{
var flattenedMasterList = visibleList.Cast<IFilterViewModel>()
.RecursiveSelect(f => f.Children)
.Select(x => x.GetFlatKey).ToList();
var orphanedFilterFiles = new List<FilterElement>();
foreach (var f in filters.RecursiveSelect(f => f.Children))
{
// Remove non orphaned files.
if (!flattenedMasterList.Contains(f.GetFlatKey))
{
orphanedFilterFiles.Add((f));
}
}
return orphanedFilterFiles;
}
The problem with this is that the elements in the orphanedFilterFiles list contain references to other elements - e.g. An orphan will have a parent, which may have non-orphaned Children. This makes it difficult to merge back into the final hierarchy, which is the main issue.
Can anyone help me find a better solution, or just tell me what I'm doing wrong?
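The `RecursiveSelect` helper is not shown in the question. For readers who want to reproduce the flattening step, a depth-first version might look like the sketch below; the implementation and the `HierarchyExtensions` class name are assumptions, not the asker's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HierarchyExtensions
{
    // Yields each item followed by all of its descendants, depth-first.
    public static IEnumerable<T> RecursiveSelect<T>(
        this IEnumerable<T> source,
        Func<T, IEnumerable<T>> childSelector)
    {
        foreach (var item in source)
        {
            yield return item;

            var children = childSelector(item);
            if (children == null)
            {
                continue; // treat a missing child collection as "no children"
            }

            foreach (var descendant in children.RecursiveSelect(childSelector))
            {
                yield return descendant;
            }
        }
    }
}
```

When the flattened master keys are only used for membership tests, as in `GetOrphanedFilterElements`, collecting them into a `HashSet` rather than a `List` avoids a linear scan per filter element.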

Objects containing list of same object type

Is there anything wrong with defining something like this:
class ObjectA
{
property a;
property b;
List <ObjectA> c;
...
}
No, and because the answer needs at least 30 characters, I'll add that this is a common pattern.
Since you included the oop tag, though, I'll add that this pattern gives a lot of control to the outside world. If c is a list of children, for example, you're giving everyone who has access to an instance of ObjectA the ability to add, delete, or replace its children.
A tighter approach would be to use some sort of read-only type (perhaps implementing IList<ObjectA>) to expose the children.
EDIT
Note that the following still allows others to modify your list:
class ObjectA
{
property a;
property b;
List <ObjectA> c;
...
public List<ObjectA> Children { get { return c; } }
}
The absence of a setter only prevents outsiders from replacing the list object.
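One way to close that gap, sketched here under the assumption that children should only be added through the owning object, is to keep the `List` private and expose a read-only wrapper. The `AddChild` method is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public class ObjectA
{
    private readonly List<ObjectA> _children = new List<ObjectA>();

    // Callers can enumerate and index the children. The wrapper returned by
    // AsReadOnly() rejects modification even if it is cast to ICollection<T>.
    public IReadOnlyList<ObjectA> Children => _children.AsReadOnly();

    // The only way to change the list is through methods the class controls.
    public void AddChild(ObjectA child)
    {
        if (child == null) throw new ArgumentNullException(nameof(child));
        _children.Add(child);
    }
}
```

Returning the wrapper rather than the list itself matters: if the property returned `_children` typed as `IReadOnlyList<ObjectA>`, a caller could still cast it back to `List<ObjectA>` and modify it.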
Nope. That's perfectly acceptable. Tree structures do this.
It is perfectly valid. For example, you would have to do something like this to build a tree data structure (parent node contains a list of child nodes).
I have to ask whether your question is about putting a List<> in there, or about putting a List<ObjectA> inside of ObjectA. The answer to both questions is "Yes"!
The thing to keep in mind is that by default, the access is private. If you want other classes to use this list, then you need to add a few things to your class...
class ObjectA
{
property a;
property b;
List <ObjectA> c;
// allow access, but not assignment
// you can still modify the list from outside, you just cant
// assign a new list from outside the class
public List<ObjectA> somePropertyName{ get { return this.c;}}
// same as above, only allow derived child classes to set the list
public List<ObjectA> somePropertyName{ get { return this.c;}
protected set { this.c = value;} }
// allow all access
public List<ObjectA> somePropertyName{ get { return this.c;}
set { this.c = value;} }
}
No. This is valid. Many structures use this graph-like pattern.
If you, for example, have a base collection class
namespace MiniGraphLibrary
{
public class GraphCollection
{
public Node Root { set; get; }
public Node FindChild(Node root)
{
throw new NotImplementedException();
}
public Node InsertNode(Node root, Node nodeToBeInserted)
{
throw new NotImplementedException();
}
}
}
Then you can have the node act like this:
namespace MiniGraphLibrary
{
public class Node
{
private string _info;
private List<Node> _children = new List<Node>();
public Node(Node parent, string info)
{
this._info = info;
this.Parent = parent;
}
public Node Parent { get; set; }
public void AddChild(Node node)
{
if (!this.DoesNodeContainChild(node))
{
node.Parent = this;
_children.Add(node);
}
}
public bool DoesNodeContainChild(Node child)
{
return _children.Contains(child);
}
}
}
Note that this is something I wrote in 2 minutes, and it is probably not production quality, but the 2 main things are that you have a parent node and many children. When you add a child node to a given node, you make sure that its parent node is set. Here I first check whether the child is already in the children list before connecting the two.
You could make some changes to the code, and make sure that a child is first removed from the children list of any parent it is already connected to. I have not done this here.
I have made this to illustrate how it could be used. And it is used in many places. For example, clustered indexes in MSSQL use some sort of tree-like representation. But I am NOT an expert on this subject, so correct me if I am wrong.
I have not implemented the two methods in the GraphCollection class. The downside of my little example is that if you are going to implement the Find method, then you have to go through the whole graph. You could make a binary tree where each node has only two children:
namespace MiniTreeLibrary
{
    public class SimpleNode
    {
        private string _info;
        private SimpleNode _left;
        private SimpleNode _right;
        public SimpleNode(SimpleNode parent, string info)
        {
            this._info = info;
            this.Parent = parent;
        }
        public SimpleNode Parent { get; private set; }
    }
}
I have omitted the insertion of the right and left children. Now with this binary tree you could do some pretty darn fast searching, if you wanted!! But that is another discussion.
There are many rules when it comes to trees and graphs, and my graph isn't even a real graph. But I have put these examples here so you can see that it is used a lot!! If you want to go deeper into linear and other data structures, then see this series of articles. Parts 3, 4 and 5 talk a lot more about trees and graphs.

Composite iteration failure (.net)

Below is a first attempt at using a Composite pattern.
It works in the sense that I can arbitrarily nest and get the correct results for the Duration property, which is the focus of the composition. BUT it has a coding problem: the iteration over the children needed to output the composite's ToString() fails:
System.InvalidOperationException : Collection was modified; enumeration operation may not execute.
There are a few extension methods for GetDescendents in this posting, including one that uses a stack to avoid the expense of recursion and nested iterators.
I would like to understand the pattern better first though, so I have a few questions here:
How can I change the existing iteration code to prevent this error? I know how to convert it to a Linq equivalent but I want to leave it as the loops until I understand what is wrong with it.
Is it typical in the Composite to provide a Count property, or somehow cache the count after an iteration?
In the general case where you don't need a specialized collection, would you typically have your Children property be IEnumerable, IList, or List?
Any good links for examples of working (non-trival) .net code would also be much appreciated.
Cheers,
Berryl
CODE
public interface IComponent {
void Adopt(IComponent node);
void Orphan(IComponent node);
TimeSpan Duration { get; }
IEnumerable<IComponent> Children { get; }
}
public class Allocation : Entity, IAllocationNode {
public void Adopt(IAllocationNode node) { throw new InvalidOperationException(_getExceptionMessage("Adopt", this, node)); }
public void Orphan(IAllocationNode node) { throw new InvalidOperationException(_getExceptionMessage("Orphan", this, node)); }
public IEnumerable<IAllocationNode> Allocations { get { return Enumerable.Empty<IAllocationNode>(); } }
public virtual TimeSpan Duration { get; set; }
}
class MyCompositeClass : IAllocationNode {
public MyCompositeClass() { _children = new List<IAllocationNode>(); }
public void Adopt(IAllocationNode node) { _children.Add(node); }
public void Orphan(IAllocationNode node) { _children.Remove(node); }
public TimeSpan Duration {
get {
return _children.Aggregate(TimeSpan.Zero, (current, child) => current + child.Duration);
}
}
public IEnumerable<IAllocationNode> Children {
get {
var result = _children;
foreach (var child in _children) {
var childOnes = child.Children;
foreach (var node in childOnes) {
result.Add(node);
}
}
return result;
}
}
private readonly IList<IAllocationNode> _children;
public override string ToString() {
var count = Children.Count();
var hours = Duration.TotalHours.ToString("F2");
return string.Format("{0} allocations for {1} hours", count, hours);
}
}
"How can I change the existing iteration code to prevent this error?"
The exception is occurring because the code in the Children property's getter is modifying a collection while iterating over it.
You appear to be under the impression that the code
var result = _children;
creates a copy of the list referred to by the _children field. It does not, it just copies the reference to the list (which is what the value of the field represents) to the variable.
An easy fix to copy the list over is to instead do:
var result = _children.ToList();
"I know how to convert it to a Linq equivalent."
The LINQ equivalent of your current code, which should work in a lazy manner, is:
return _children.Concat(_children.SelectMany(child => child.Children));
EDIT:
I was originally under the impression that your code was limiting the traversal-depth to two levels (children and grandchildren), but now I can see that this is not the case: there is indeed a recursive call to the property Children rather than just the value of the field _children. This naming is quite confusing because the property and the 'backing' field represent different things entirely. I strongly recommend that you rename the property to something more meaningful, such as Descendants.
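Following that renaming suggestion, the direct children and the full set of descendants can be kept apart so that nothing is added to `_children` during enumeration. This is a minimal sketch: the interface is trimmed to the two members needed here, and `Leaf`, `Composite`, and the `Descendants` extension method are illustrative names rather than the asker's types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAllocationNode
{
    TimeSpan Duration { get; }
    IEnumerable<IAllocationNode> Children { get; }
}

public class Leaf : IAllocationNode
{
    public TimeSpan Duration { get; set; }
    public IEnumerable<IAllocationNode> Children => Enumerable.Empty<IAllocationNode>();
}

public class Composite : IAllocationNode
{
    private readonly List<IAllocationNode> _children = new List<IAllocationNode>();

    public void Adopt(IAllocationNode node) => _children.Add(node);

    // Direct children only; the backing list is never modified by reading it.
    public IEnumerable<IAllocationNode> Children => _children;

    public TimeSpan Duration =>
        _children.Aggregate(TimeSpan.Zero, (sum, child) => sum + child.Duration);
}

public static class AllocationNodeExtensions
{
    // Lazily walks children, grandchildren, and so on, without copying or mutating any list.
    public static IEnumerable<IAllocationNode> Descendants(this IAllocationNode node)
    {
        foreach (var child in node.Children)
        {
            yield return child;
            foreach (var descendant in child.Descendants())
            {
                yield return descendant;
            }
        }
    }
}
```

`ToString()` could then report `this.Descendants().Count()` and calling it repeatedly would leave the composite unchanged.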
