Finding difference between two tree structures C#

I am in need of a function that will compare the difference between two different file structures and return that difference as a file structure.
I have a class, "Element" that has properties ID and Children, with ID being a string and Children being a collection of Element.
public class Element
{
    public string ID { get; }
    public IEnumerable<Element> Children { get; }
}
Now, let's say I have the following structures of Elements:
Structure A          Structure B
- Category 1         - Category 1
  - Child X            - Child X
  - Child Y            - Child Z
- Category 2
I would like to return a structure that tells me which elements are present in structure A but missing from structure B, which would look as follows:
Structure Diff
- Category 1
  - Child Y
- Category 2
Is there a simple way of doing this using LINQ, or a straightforward algorithm (assuming there can be many levels to the tree)?

Yes, there is. You can compare two enumerables of strings that contain the paths of the elements. For structure A:
Category 1\
Category 1\Child X
Category 1\Child Y
Category 2\
And for structure B:
Category 1\
Category 1\Child X
Category 1\Child Z
With these two enumerables you can call the Enumerable.Except method to keep the items from the first enumerable that are missing from the second.
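The path comparison described above could be sketched roughly as follows. This is only an illustration under assumptions: the Element class here is a stand-in with public, settable members (the question's class has none), PathDiff, Flatten and MissingPaths are made-up names, and paths are built without the trailing backslash shown above. IDs that themselves contain a backslash would make paths ambiguous.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Element class, with public settable members.
public class Element
{
    public string ID { get; set; }
    public IEnumerable<Element> Children { get; set; } = Enumerable.Empty<Element>();
}

public static class PathDiff
{
    // Yields one path per element, parents before children, e.g. "Category 1\Child X".
    public static IEnumerable<string> Flatten(IEnumerable<Element> elements, string prefix = "")
    {
        foreach (var element in elements ?? Enumerable.Empty<Element>())
        {
            var path = prefix + element.ID;
            yield return path;
            foreach (var childPath in Flatten(element.Children, path + "\\"))
                yield return childPath;
        }
    }

    // Paths that occur in tree a but not in tree b, in the order they appear in a.
    public static List<string> MissingPaths(IEnumerable<Element> a, IEnumerable<Element> b)
    {
        return Flatten(a).Except(Flatten(b)).ToList();
    }
}
```

Calling PathDiff.MissingPaths(structureA, structureB) returns a flat list of paths rather than a tree, so turning the result back into Element objects would still need a separate step.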

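Sample implementation to get you started (tested on only one case):
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Element for this sample: settable ID and an array of children.
internal class Element {
    public string ID { get; set; }
    public Element[] Children { get; set; }
}

internal class Program {
    private static void Main(string[] args) {
        var c1 = new Element[] {
            new Element() { ID = "Category 1", Children = new Element[] {
                new Element() { ID = "Child X" },
                new Element() { ID = "Child Y" }
            }},
            new Element() { ID = "Category 2" }
        };
        var c2 = new Element[] {
            new Element() { ID = "Category 1", Children = new Element[] {
                new Element() { ID = "Child X" },
                new Element() { ID = "Child Z" }
            }},
        };
        // Flat set of keys for every element of the second tree.
        var keys = new HashSet<string>(GetFlatKeys(c2));
        var result = FindDiff(c1, keys).ToArray();
        Print(result);
    }

    // Keeps the elements whose key is missing from keys, plus the ancestors needed
    // to reach them. Note that this prunes Children of the source tree in place.
    private static IEnumerable<Element> FindDiff(Element[] source, HashSet<string> keys, string key = null) {
        if (source == null)
            yield break;
        foreach (var parent in source) {
            // Build the key per element; appending to the shared key variable would
            // leak one sibling's ID into the next sibling's key.
            var current = key + "|" + parent.ID;
            parent.Children = FindDiff(parent.Children, keys, current).ToArray();
            if (!keys.Contains(current) || parent.Children.Length > 0) {
                yield return parent;
            }
        }
    }

    // Yields a key such as "|Category 1|Child X" for every element of the tree.
    private static IEnumerable<string> GetFlatKeys(IEnumerable<Element> source, string key = null) {
        if (source == null)
            yield break;
        foreach (var parent in source) {
            var current = key + "|" + parent.ID;
            yield return current;
            foreach (var c in GetFlatKeys(parent.Children, current))
                yield return c;
        }
    }

    // Console.WriteLine(result) would only print the array's type name, so print the tree.
    private static void Print(IEnumerable<Element> elements, string indent = "") {
        if (elements == null)
            return;
        foreach (var e in elements) {
            Console.WriteLine(indent + "- " + e.ID);
            Print(e.Children, indent + "  ");
        }
    }
}
As the other answer says, it is easier to first build a flat list of keys for every element in the second tree, then filter the elements of the first tree against that list.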

Related

Creating a tree from a collection List<T>

In order to create a tree, I use the following code.
var db = _context.GetContext();
var accounts = await db
    .Set<TradingAccount>()
    .ToListAsync(cancellationToken: token);
accounts.ForEach(
    account => account.Children = accounts.Where(
        child => child.ParentTradingAccountId == account.Id).ToList()
);
return accounts;
It works well (albeit not fast), but it does not create a completely correct tree. The same element can be both root and dependent. How can I exclude elements from the selection that have already been included in the tree?

The problem is that the code above adds dependent nodes as children, but does not remove them from the top-level list. Usually recursion can be used to create tree structures, like so:
private IEnumerable<TradingAccount> GetAccounts(IEnumerable<TradingAccount> allAccounts, int parentTradingAccountId)
{
    var accounts = allAccounts
        .Where(x => x.ParentTradingAccountId == parentTradingAccountId)
        .ToList();
    foreach (var acc in accounts)
    {
        // Get children of current node
        acc.Children = GetAccounts(allAccounts, acc.Id).ToList();
    }
    return accounts;
}
The function above retrieves all accounts for a specified parent id and calls itself again (that is why it is called a recursive function) to retrieve the children.
You can use the function in your code as follows (I assume that the root-level accounts have a parent id of 0):
var db = _context.GetContext();
var allAccounts = await db
    .Set<TradingAccount>()
    .ToListAsync(cancellationToken: token);
var accounts = GetAccounts(allAccounts, 0);
return accounts;
The call to GetAccounts gets all root-level accounts. Because the function calls itself for each account, it also retrieves the subtree below each root-level account.
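As a variation on the recursive function above, the accounts can be grouped by parent id once, so the full list is not filtered again for every node. The following is a minimal sketch under assumptions: the TradingAccount class shown is a hypothetical minimal shape (the real entity is not shown in the question), the parent id is assumed to be a non-nullable int, root accounts are assumed to have a parent id of 0, and AccountTree.Build is an invented name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal shape of the entity; the real TradingAccount is not shown in the question.
public class TradingAccount
{
    public int Id { get; set; }
    public int ParentTradingAccountId { get; set; }
    public List<TradingAccount> Children { get; set; } = new List<TradingAccount>();
}

public static class AccountTree
{
    // Returns only the root-level accounts; every account gets its Children filled in.
    public static List<TradingAccount> Build(IEnumerable<TradingAccount> allAccounts, int rootParentId = 0)
    {
        // Group once by parent id. Indexing a lookup with a missing key yields an empty sequence.
        var byParent = allAccounts.ToLookup(a => a.ParentTradingAccountId);

        // Attach the matching group to every account.
        foreach (var account in byParent.SelectMany(group => group))
            account.Children = byParent[account.Id].ToList();

        // Only accounts whose parent id equals rootParentId stay at the top level.
        return byParent[rootParentId].ToList();
    }
}
```

With the list loaded as above, var accounts = AccountTree.Build(allAccounts); would take the place of the GetAccounts call.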

I wrote an algorithm that builds a tree from a flat list, which is faster than filtering the same list over and over. Because the items come from a database, the parent ids should exist and circular references should not occur. This example cannot handle either case (an item whose parent never appears would keep the loop running forever), but it might give you a jump-start on a faster algorithm.
In a nutshell: loop until all items have been added. Use a while loop instead of a foreach, because you are not allowed to modify a collection while iterating over it with foreach. That could be solved by creating copies, but it would end up in many copy operations.
When an item is a child (its parentId is filled), I use the lookup dictionary to check whether its parent has already been added. If not, I skip it and check the next item (this makes it possible for the parent to be below the child in the list).
When a child is added to its parent, it is also added to the lookup, so that its own children can find it.
When an item is a root (no parentId), I add it to rootNodes and to the lookup.
Multiple root nodes are supported.
public static class TreeBuilder
{
    public static IEnumerable<Node> BuildTree(IEnumerable<Item> items)
    {
        var nodeLookup = new Dictionary<int, Node>();
        var rootNodes = new List<Node>();
        var itemCopy = items.ToList(); // we don't want to modify the original collection, make one working copy.
        int index = 0;
        while (true)
        {
            // when the item copy is empty, we're done.
            if (itemCopy.Count == 0)
                return rootNodes.ToArray();
            // don't go out of bounds.
            index = index % itemCopy.Count;
            // get the current item at that index.
            var current = itemCopy[index];
            // does it have a parent?
            if (current.ParentId.HasValue)
            {
                // yes, so it's a child
                // look if the parent is already found in the lookup.
                if (nodeLookup.TryGetValue(current.ParentId.Value, out var parentNode))
                {
                    // create a new node
                    var node = new Node { Id = current.Id };
                    // add it to the lookup
                    nodeLookup.Add(current.Id, node);
                    // add it as child node to the parent.
                    parentNode.ChildNodes.Add(node);
                    // remove it from the itemCopy (so we don't check it again)
                    itemCopy.RemoveAt(index);
                    // The index doesn't need to be increased, because the current item is removed.
                }
                else
                    // next item, the parent is not in the tree yet.
                    index++;
            }
            else
            {
                // root node
                var node = new Node { Id = current.Id };
                nodeLookup.Add(current.Id, node);
                rootNodes.Add(node);
                itemCopy.RemoveAt(index);
            }
        }
    }
}
My test setup:
private void button4_Click(object sender, EventArgs e)
{
    /*
     * 1
     * |
     * +- 2
     *    |
     *    +- 3
     *    |
     *    +- 5
     *       |
     *       +- 4
     */
    var items = new[]
    {
        new Item{ Id = 1, ParentId = null},
        new Item{ Id = 2, ParentId = 1},
        new Item{ Id = 3, ParentId = 2},
        new Item{ Id = 4, ParentId = 5},
        new Item{ Id = 5, ParentId = 2},
    };
    var tree = TreeBuilder.BuildTree(items);
    DisplayTree(tree);
}
private void DisplayTree(IEnumerable<Node> nodes, string indent = "")
{
    foreach (var node in nodes)
    {
        Trace.WriteLine($"{indent}{node.Id}");
        DisplayTree(node.ChildNodes, indent + " ");
    }
}
The classes I used are:
public class Node
{
    public int Id { get; set; }
    public List<Node> ChildNodes { get; } = new List<Node>();
}
public class Item
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
}
Which results in:
1
 2
  3
  5
   4

Modifying Observable Collection elements efficiently

Basically I have an Observable Collection in my class and a static integer keeping track of how many elements there are in the collection. Every element in the collection has a unique ID starting from 1 up to the total number of elements.
What I want to do is, take out an element with a random ID, and then change the IDs of the succeeding elements accordingly so the IDs run continuously from 1 to the total number of elements. So for example if I have 5 elements and I remove the element with ID number 3, then I need some code that will modify the ID property of element with ID 4 and change it to 3, and modify the ID property of element with ID 5 and change it to 4, so all the IDs are in order without gaps.
I thought of doing something like this:
var matches = MyObject.MyCollection.Where((myobject) => myobject.UniqueId.Equals(ID_value_of_removed_item_plus_one));
foreach (MyDataType CollectionItem in matches)
{
    MyDataType CollectionItemCopy = CollectionItem;
    CollectionItemCopy.UniqueId--;
    MyCollection.Remove(CollectionItem);
    MyCollection.Add(CollectionItemCopy);
}
But I can't help but imagine there's a more efficient way to go about doing this. I know the Observable Collection isn't a suitable choice for this kind of application but the thing is the elements are bound to a ListView so I can't use any other type of generic collection.

Is this what you want?
class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
}
var collection = new ObservableCollection<Entity>
{
    new Entity { Id = 1, Name = "Apple" },
    new Entity { Id = 2, Name = "Peach" },
    new Entity { Id = 3, Name = "Plum" },
    new Entity { Id = 4, Name = "Grape" },
    new Entity { Id = 5, Name = "Orange" },
};
collection.CollectionChanged += (sender, args) =>
{
    if (args.Action == NotifyCollectionChangedAction.Remove || args.Action == NotifyCollectionChangedAction.Replace)
    {
        for (var i = args.OldStartingIndex; i < collection.Count; i++)
        {
            collection[i].Id--;
        }
    }
};
collection.RemoveAt(2); // Grape.Id == 3, Orange.Id == 4

Just get all matches at once, then subtract one from each ID. There is no need to remove an item and add it again, since you would only be removing and adding the same object. At the end, remove the original item.
var matches = MyObject.MyCollection.Where(myobject => myobject.UniqueId >= ID_value_of_removed_item_plus_one);
foreach (MyDataType CollectionItem in matches)
{
    CollectionItem.UniqueId--;
}
MyCollection.Remove(itemToDelete);
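If the collection is always kept in ID order, the same renumbering can be done by position instead of by query. The sketch below rests on that assumption (the item with ID N sits at index N - 1); NumberedItem and Renumbering.RemoveAndRenumber are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical item type; only the id matters for this sketch.
public class NumberedItem
{
    public int UniqueId { get; set; }
}

public static class Renumbering
{
    // Removes the item with the given id and closes the gap in the ids that follow.
    // Assumes ids run 1..Count in collection order, so id N is stored at index N - 1.
    public static void RemoveAndRenumber(ObservableCollection<NumberedItem> items, int id)
    {
        int index = id - 1;
        if (index < 0 || index >= items.Count)
            return; // nothing to remove

        items.RemoveAt(index);

        // Every item after the removed one moves up by one position and one id.
        for (int i = index; i < items.Count; i++)
            items[i].UniqueId = i + 1;
    }
}
```

In a data-bound UI, the displayed IDs would presumably only refresh if the item type raises property change notifications (for example through INotifyPropertyChanged), which this sketch leaves out.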

Query for selecting and anonymous object with where clause

This is my code:
var tree = new
{
    id = "0",
    item = new List<object>()
};
foreach ()
{
    tree.item.Add(new
    {
        id = my_id,
        text = my_name,
        parent = my_par
    });
}
But I want to replace the code in the foreach with the following:
foreach ()
{
    tree.item.Where(x => x.id == 2).First().Add(new
    {
        id = my_id,
        text = my_name,
        parent = my_par
    });
}
How can I do this? I get an exception saying that the type doesn't contain a definition for id.
The problem here is the anonymous type.
I tried creating a new class with three properties (id, text and parent) and the syntax worked, but the tree's definition was invalid.
So the question is how to query an anonymous type without adding a new class to represent it.

If you want to do it without creating a new class, you can use dynamic for the filtering.
tree.item.Where(x => ((dynamic)x).id == 2).First()....
That will give you a single anonymous object, though, not a collection, so you cannot add anything to it.

One, this is really ugly. You should think about declaring a class for this (I assume you got a downvote from some purist for it ;)).
Two, you're attempting something that's impossible. Think about it: in your second loop, when you do tree.item.Where(x => x.id == 2).First(), you get x back, which is an object, and object doesn't have an Add method. To illustrate, take this example:
var tree = new
{
    id = "0",
    item = new List<object>
    {
        new
        {
            id = 2,
            text = "",
            parent = (object)null // a bare null cannot initialize an anonymous type property
        }
    }
};
Now when you do
var p = tree.item.Where(x => x.id == 2).First(); // even if that were compilable
you get this
new
{
    id = 2,
    text = "",
    parent = (object)null
}
back. Now how are you going to Add something to that? It really is an anonymous type with no such method on it.
I can only assume, but you might want this:
var treeCollection = new
{
    id = 0,
    item = new List<object> // adding a sample value
    {
        new // a sample set
        {
            id = 2,
            text = "",
            parent = (object)null // some object
        }
    }
}.Yield(); // an example to make it a collection. I assume it should be a collection
foreach (var tree in treeCollection)
{
    if (tree.id == 0)
        tree.item.Add(new
        {
            id = 1,
            text = "",
            parent = (object)null
        });
}
// extension methods must be declared in a non-generic static class
public static IEnumerable<T> Yield<T>(this T item)
{
    yield return item;
}
Or in one line:
treeCollection.Where(x => x.id == 0).First().item.Add(new
{
    id = 1,
    text = "",
    parent = (object)null
});
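For comparison, here is roughly what the declared-class route suggested above might look like. It goes against the question's wish to avoid a new class, and TreeNode, its members and TreeNodeDemo are hypothetical names chosen for this sketch only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical named replacement for the anonymous type.
public class TreeNode
{
    public int Id { get; set; }
    public string Text { get; set; }
    public List<TreeNode> Items { get; } = new List<TreeNode>();
}

public static class TreeNodeDemo
{
    public static TreeNode Build()
    {
        var root = new TreeNode { Id = 0, Text = "root" };
        root.Items.Add(new TreeNode { Id = 2, Text = "first" });

        // The list is strongly typed, so the lambda can read x.Id, and the
        // node that comes back has its own Items list to add to.
        root.Items.First(x => x.Id == 2).Items.Add(new TreeNode { Id = 3, Text = "child" });
        return root;
    }
}
```

Because every node carries its own Items list, the query-then-add pattern from the question compiles without dynamic or casts.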

How to create objects with retrieved Hierarchical result set?

I am using C#. My problem is that I don't know how to store a retrieved hierarchical result set in my object.
Here is my object:
public class CategoryItem
{
    public string Name { get; set; }
    public int CategoryID { get; set; }
    public int ParentID { get; set; }
    public List<CategoryItem> SubCategory = new List<CategoryItem>();
    public List<CategoryItem> GetSubCategory()
    {
        return SubCategory;
    }
    public void AddSubCategory(CategoryItem ci)
    {
        SubCategory.Add(ci);
    }
    public void RemoveSubCategory(CategoryItem ci)
    {
        for (int i = 0; i < SubCategory.Count; i++)
        {
            if (SubCategory.ElementAt(i).CategoryID == ci.CategoryID)
            {
                SubCategory.RemoveAt(i);
                break;
            }
        }
    }
}
Here is a sample of the data set retrieved from MSSQL Server:
ID   PrntID   Title
___  _______  _______
1    0        Node1
2    1        Node2
3    1        Node3
4    2        Node4
5    2        Node5
6    2        Node6
7    3        Node7
8    4        Node8
9    4        Node9
10   9        Node10
Tree view for easy reference:
Node 1
-Node 2
--Node 4
---Node 8
---Node 9
----Node 10
--Node 5
--Node 6
-Node 3
--Node 7
My problem is how to store this result in my CategoryItem objects. I don't have a clue; do I need to use iteration for this, especially when a node is two or more levels deep?
I want to store it like this:
List<CategoryItem> items = new List<CategoryItem>();
With this I can dig into every object in 'items' and access its sub-categories (children) using the GetSubCategory() method of my class. Is this possible?

If you know that in your DataSet a node will never appear before its parent, you can use this code. It keeps track of the items already read in a Dictionary, where you can look up the parents of newly read nodes. If you find the parent, you add the new item to its children; otherwise it's a first-level node.
public static List<CategoryItem> LoadFromDataSet(DataSet aDS)
{
    List<CategoryItem> result = new List<CategoryItem>();
    Dictionary<int, CategoryItem> alreadyRead = new Dictionary<int, CategoryItem>();
    foreach (DataRow aRow in aDS.Tables["YourTable"].Rows)
    {
        CategoryItem newItem = new CategoryItem();
        newItem.CategoryID = (int)aRow["ID"];
        newItem.ParentID = (int)aRow["PrntID"];
        newItem.Name = (string)aRow["Title"];
        alreadyRead[newItem.CategoryID] = newItem;
        CategoryItem aParent;
        if (alreadyRead.TryGetValue(newItem.ParentID, out aParent))
            aParent.AddSubCategory(newItem);
        else
            result.Add(newItem);
    }
    return result;
}
If my assumption isn't true (i.e. it is possible for a node to appear in the DataSet before its parent), you have to first read all the nodes (and put them in the Dictionary), then loop through the same Dictionary to build the result. Something like this:
public static List<CategoryItem> LoadFromDataSet(DataSet aDS)
{
    List<CategoryItem> result = new List<CategoryItem>();
    Dictionary<int, CategoryItem> alreadyRead = new Dictionary<int, CategoryItem>();
    foreach (DataRow aRow in aDS.Tables["YourTable"].Rows)
    {
        CategoryItem newItem = new CategoryItem();
        newItem.CategoryID = (int)aRow["ID"];
        newItem.ParentID = (int)aRow["PrntID"];
        newItem.Name = (string)aRow["Title"];
        alreadyRead[newItem.CategoryID] = newItem;
    }
    foreach (CategoryItem newItem in alreadyRead.Values)
    {
        CategoryItem aParent;
        if (alreadyRead.TryGetValue(newItem.ParentID, out aParent))
            aParent.AddSubCategory(newItem);
        else
            result.Add(newItem);
    }
    return result;
}

You have to write recursive code to achieve this.
// "collection" is assumed to be a List<DataRow> holding the rows of the result set.
// First of all, find the root-level parent id: the lowest PrntID value.
int baseParentId = collection.Min(row => Convert.ToInt32(row["PrntID"]));
// If you are sure that the root-level parent id is always zero, you can skip the line above
// and use 0 directly.
// Now start building your hierarchy
var roots = new List<CategoryItem>();
foreach (var selection in collection)
{
    // start from the root-level rows
    if (Convert.ToInt32(selection["PrntID"]) == baseParentId)
    {
        CategoryItem item = new CategoryItem();
        // set the item properties
        item.CategoryID = Convert.ToInt32(selection["ID"]);
        item.ParentID = baseParentId;
        item.Name = selection["Title"].ToString();
        // go recursive to bring in all children
        GetAllChildren(item, collection);
        roots.Add(item);
    }
}
private void GetAllChildren(CategoryItem parent, List<DataRow> collection)
{
    foreach (var selection in collection)
    {
        // find all children of that parent
        if (Convert.ToInt32(selection["PrntID"]) == parent.CategoryID)
        {
            CategoryItem child = new CategoryItem();
            // set properties
            child.CategoryID = Convert.ToInt32(selection["ID"]);
            child.ParentID = parent.CategoryID;
            child.Name = selection["Title"].ToString();
            // add the child to the parent
            parent.AddSubCategory(child);
            // go recursive and find all children of this node now
            GetAllChildren(child, collection);
        }
    }
}
Note: this is a sketch rather than tested code, but it should give you an idea of how to build a hierarchical data structure that has to be represented as objects.

Load your table into a DataTable, then first find the root row and create the root object:
DataRow[] rootRow = table.Select("PrntID = 0");
CategoryItem root = new CategoryItem() { CategoryID = Convert.ToInt32(rootRow[0]["ID"]), Name = rootRow[0]["Title"].ToString(), ParentID = Convert.ToInt32(rootRow[0]["PrntID"]) };
Then call the recursive method to add the sub-categories:
GetCategoryItem(root);
Change the method below as you wish.
public void GetCategoryItem(CategoryItem parent)
{
    DataRow[] childRows = table.Select("PrntID = " + parent.CategoryID);
    for (int i = 0; i < childRows.Length; i++)
    {
        CategoryItem child = new CategoryItem() { CategoryID = Convert.ToInt32(childRows[i]["ID"]), Name = childRows[i]["Title"].ToString(), ParentID = Convert.ToInt32(childRows[i]["PrntID"]) };
        GetCategoryItem(child);
        parent.SubCategory.Add(child);
    }
}

Using LINQ to find three or more matching records

First, I'll describe my table structure.
I have a table with two columns (ID and Root). This table is converted to a List of nodes, where the simple node structure is:
struct Node
{
    public int id;
    public int root;
}
I need to find all entries in this List where three or more entries share the same root.
Example:
struct TeleDBData
{
    public int ID;
    public int? RootID;
}
private void InitList()
{
    var eqList = new List<TeleDBData>();
    TeleDBData root = new TeleDBData();
    root.ID = 1;
    TeleDBData node1 = new TeleDBData();
    node1.ID = 2;
    node1.RootID = 1;
    TeleDBData node2 = new TeleDBData();
    node2.ID = 3;
    node2.RootID = 1;
    TeleDBData node3 = new TeleDBData();
    node3.ID = 4;
    node3.RootID = 1;
    TeleDBData node4 = new TeleDBData();
    node4.ID = 5;
    node4.RootID = 2;
    eqList.Add(root);
    eqList.Add(node1);
    eqList.Add(node2);
    eqList.Add(node3);
    eqList.Add(node4);
}
After running the query, it will return node1, node2 and node3.
How can I find them using LINQ?

You just need to GroupBy accordingly:
var groups = eqList.GroupBy(n => n.RootID).Where(g => g.Count() >= 3);
foreach (var g in groups) {
    Console.Out.WriteLine("There are {0} nodes which share RootId = {1}",
        g.Count(), g.Key);
    foreach (var node in g) {
        Console.Out.WriteLine(" node id = " + node.ID);
    }
}
Additional info:
In the code above, g is an IGrouping<int?, TeleDBData>, so by the documentation's definition it is a collection of TeleDBData items that share a common key (which is an int?). groups is an IEnumerable<IGrouping<int?, TeleDBData>>. All of this is standard behavior for the Enumerable.GroupBy method.
The two things you would want to do with an IGrouping<,> are to access its Key property to find the key and to enumerate it to process the grouped elements. We are doing both in the code above.
As for the n in the GroupBy lambda, it simply represents each of the items in eqList in turn, so its type is TeleDBData. I picked n as the parameter name as an abbreviation of "node".
