SortedSet with element duplication - can't remove element - c#

I'm working on an implementation of the A-star algorithm in C# in Unity.
I need to evaluate a collection of Node objects:
class Node
{
    public Cell cell;
    public Node previous;
    public int f;
    public int h;

    public Node(Cell cell, Node previous = null, int f = 0, int h = 0)
    {
        this.cell = cell;
        this.previous = previous;
        this.f = f;
        this.h = h;
    }
}
I have a SortedSet which lets me store several Node instances sorted by the h property. However, I need to be able to store two nodes with the same h value. So I've implemented a specific IComparer that sorts by the h property and reports equality only when two nodes represent the exact same cell.
class ByHCost : IComparer<Node>
{
    public int Compare(Node n1, Node n2)
    {
        int result = n1.h.CompareTo(n2.h);
        result = (result == 0) ? 1 : result;
        result = (n1.cell == n2.cell) ? 0 : result;
        return result;
    }
}
My problem: I have a hard time removing things from my SortedSet (I named it openSet). Here is an example:
At some point in the algorithm, I need to remove a node from the set based on some criteria (NB: I use the isCell127 variable to focus my debugging on a single cell).
int removedNodesNb = openSet.RemoveWhere((Node n) => {
    bool isSame = n.cell == candidateNode.cell;
    bool hasWorseCost = n.f > candidateNode.f;
    if (isCell127)
    {
        Debug.Log(isSame && hasWorseCost); // the predicate matches exactly once and Debug.Log prints true
    }
    return isSame && hasWorseCost;
});

if (isCell127)
{
    Debug.Log($"removed {removedNodesNb}"); // 0 nodes were removed
}
Here, the RemoveWhere method seems to find a match, but doesn't remove the node.
I tried another way:
Node worseNode = openSet.SingleOrDefault(n => {
    bool isSame = n.cell == candidateNode.cell;
    bool hasWorseCost = n.f > candidateNode.f;
    return isSame && hasWorseCost;
});

if (isCell127)
{
    Debug.Log($"does worseNode exists ? {worseNode != null}"); // Debug returns true, it does exist.
}

if (worseNode != null)
{
    if (isCell127)
    {
        Debug.Log($"openSet length {openSet.Count}"); // 10
    }

    openSet.Remove(worseNode);

    if (isCell127)
    {
        Debug.Log($"openSet length {openSet.Count}"); // 10 - It should have been 9.
    }
}
I think the problem is related to my pretty unusual IComparer, but I can't figure out exactly what the problem is.
Also, I would like to know whether there is a significant performance improvement in using a SortedSet instead of a manually sorted List, especially in the A-star use case.

If I write out your comparison, the cases are:
n1.h < n2.h, n1.cell == n2.cell -> final result = 0
n1.h > n2.h, n1.cell == n2.cell -> final result = 0
n1.h == n2.h, n1.cell != n2.cell -> final result = 1
n1.h < n2.h, n1.cell != n2.cell -> final result = -1
n1.h > n2.h, n1.cell != n2.cell -> final result = 1
When you have equality on the h value (case 3) you always return the same result, 1. That is not good: you need another test on the cell to settle the ordering, because this case gets confused with another case that returns the same result (case 5).
I would have to test with a sample, but I am pretty sure this breaks the sort.
Once you fix the comparison, I suggest using LINQ with a List; it performs best.
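For illustration only, a rough sketch of that approach, reusing the Node class from the question (startCell is a placeholder; candidateNode is the variable from the question):
using System.Collections.Generic;
using System.Linq;

var openSet = new List<Node>();
openSet.Add(new Node(startCell)); // duplicate h values are a non-issue in a List

// Pick the most promising node on demand: order by h, break ties on f.
Node current = openSet.OrderBy(n => n.h).ThenBy(n => n.f).First();

// Removal depends only on the predicate, not on any comparer.
openSet.RemoveAll(n => n.cell == candidateNode.cell && n.f > candidateNode.f);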

I'll answer my own question because I've ended up with a pretty complete answer.
Comparison
A comparison exposed through the IComparer interface needs to follow some rules. Like #frenchy said, my comparison was broken. Here are the mathematical fundamentals of a comparison that I had totally forgotten (I found them here):
1) A.CompareTo(A) must return zero.
2) If A.CompareTo(B) returns zero, then B.CompareTo(A) must return zero.
3) If A.CompareTo(B) returns zero and B.CompareTo(C) returns zero, then A.CompareTo(C) must return zero.
4) If A.CompareTo(B) returns a value other than zero, then B.CompareTo(A) must return a value of the opposite sign.
5) If A.CompareTo(B) returns a value x not equal to zero, and B.CompareTo(C) returns a value y of the same sign as x, then A.CompareTo(C) must return a value of the same sign as x and y.
6) By definition, any object compares greater than (or follows) null, and two null references compare equal to each other.
In my case, rule 4) - symmetry - was broken.
I needed to store multiple nodes with the same h value, but also to sort by that h value. So I needed to avoid reporting equality when the h values are the same.
Instead of returning a fixed default value when the h comparison yields 0 (which is what broke rule 4), I decided to refine the comparison so that it never yields 0, by giving each node instance a unique value. This implementation is probably not the best - there may be a better way to produce a unique value - but here is what I did.
private class Node
{
    private static int globalIncrement = 0;

    public Cell cell;
    public Node previous;
    public int f;
    public int h;
    public int uid;

    public Node(Cell cell, Node previous = null, int f = 0, int h = 0)
    {
        Node.globalIncrement++;
        this.cell = cell;
        this.previous = previous;
        this.f = f;
        this.h = h;
        this.uid = Node.globalIncrement;
    }
}

private class ByHCost : IComparer<Node>
{
    public int Compare(Node n1, Node n2)
    {
        if (n1.cell == n2.cell)
        {
            return 0;
        }

        int result = n1.h.CompareTo(n2.h);
        // Additional comparison that never yields 0. Depending on the use case and the
        // number of objects, another scheme for generating unique values might be better.
        result = (result == 0) ? n1.uid.CompareTo(n2.uid) : result;
        return result;
    }
}
RemoveWhere method
RemoveWhere uses a predicate to search the collection, so I assumed it didn't care about the comparison. But RemoveWhere internally uses the Remove method, which does care about the comparison. So even when RemoveWhere has found an element, if your comparison is inconsistent it will silently skip it. That's a pretty weird implementation, no?
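To illustrate the point with something self-contained: a minimal sketch on plain ints, not the A* code. The output comment is what I would expect from the current .NET implementation, where RemoveWhere collects matches and then calls Remove on each.
using System;
using System.Collections.Generic;

// A comparer that breaks the rules on purpose: Compare(a, a) returns 1 instead of 0,
// which is essentially what the original ByHCost did for equal h values.
class BrokenComparer : IComparer<int>
{
    public int Compare(int a, int b) => a == b ? 1 : a.CompareTo(b);
}

class Demo
{
    static void Main()
    {
        var set = new SortedSet<int>(new BrokenComparer()) { 1, 2, 3 };

        // The predicate sees the element while RemoveWhere walks the tree...
        int hits = 0;
        int removed = set.RemoveWhere(n => { if (n == 2) hits++; return n == 2; });

        // ...but the internal Remove re-locates the element through the comparer,
        // never gets a 0 result, and silently removes nothing.
        Console.WriteLine($"hits: {hits}, removed: {removed}, count: {set.Count}");
        // I would expect: hits: 1, removed: 0, count: 3
    }
}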

Related

How to determine if Binary Tree is BST

I am trying to figure out the logic to determine whether a binary tree is a BST. I want to use the inorder approach, and I don't want to use an extra array to store all incoming values, since we know an inorder traversal should come out in sorted order. I want to check the incoming value without having to store it in an array. Below is my attempt, which is not working.
public bool CheckBST(BstNode root)
{
    BstNode prev = new BstNode(Int32.MinValue);
    if (root == null)
        return true;
    if (root.left != null)
    {
        return CheckBST(root.left);
    }
    if (prev != null && prev.data >= root.data) // means data is not sorted hence NOT BST
        return false;
    prev = root;
    if (root.right != null)
    {
        return CheckBST(root.right);
    }
    return true;
}
Given a binary tree, the following determines whether it is a valid binary search tree (BST):
The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be binary search trees.
Consider an example tree that satisfies all three rules (image omitted): it is a BST.
Now consider another example: the root node's value is 5 but its right child's value is 4, which does not satisfy the conditions mentioned above, so that tree is not a BST.
Solution Code:
Given that the TreeNode is defined as
public class TreeNode
{
    public int Val { get; set; }
    public TreeNode Left { get; set; }
    public TreeNode Right { get; set; }

    public TreeNode(int x) { this.Val = x; }
}
The code to perform the validation is:
public bool IsValidBST(TreeNode root)
{
    return IsValidBST(root, int.MinValue, int.MaxValue);
}

private bool IsValidBST(TreeNode root, int minValue, int maxValue)
{
    if (root == null)
    {
        return true;
    }

    int nodeValue = root.Val;
    if (nodeValue < minValue || nodeValue > maxValue)
    {
        return false;
    }

    return IsValidBST(root.Left, minValue, nodeValue - 1) && IsValidBST(root.Right, nodeValue + 1, maxValue);
}
Now IsValidBST can be invoked with the root node:
bool isValidBST = IsValidBST(rootNode);
So usually in a BST there are three things in each node: the data and the two pointers, left and right. If any node has more than two pointers then it is not a BST. It is probably best to determine at the level of the node whether there are more pointers than there should be; you would be wasting time and resources by searching the whole tree.
Here is a good way to go about doing it https://www.geeksforgeeks.org/a-program-to-check-if-a-binary-tree-is-bst-or-not/
You can't initialize prev every time inside CheckBST. You can make prev global. Also, I have made prev an integer.
int prev = Int32.MinValue; // made this global and integer type

public bool CheckBST(BstNode root)
{
    if (root == null)
        return true;

    bool isLeftBST = CheckBST(root.left);
    if (isLeftBST == false) return false;

    if (prev != Int32.MinValue && prev >= root.data) // means data is not sorted hence NOT BST
        return false;

    prev = root.data; // mark the prev before traversing the right subtree

    return isLeftBST && CheckBST(root.right);
}
Ignore the syntax problems, if any; this is more of a pseudocode sketch. Of course there are other ways to solve this problem as well, like keeping track of the min and max values so far (as in #user1672994's answer).
If you could make CheckBST return the range (min, max) of the BST being checked, then the following recursive function shall do:
#include <climits>
#include <utility>
using std::pair;
using std::make_pair;

// Defines the return value that represents BST check failure.
const pair<int, int> kCheckFailed(INT_MAX, INT_MIN);

pair<int, int> CheckBST(const BstNode& curr)
{
    pair<int, int> left_ret(curr.value, curr.value);
    pair<int, int> right_ret(curr.value, curr.value);

    // Makes sure the left subtree, if any, is a BST, and its max
    // (`left_ret.second`) is no greater than `curr.value`.
    if (curr.left) {
        left_ret = CheckBST(*curr.left);
        if (left_ret == kCheckFailed || left_ret.second > curr.value)
            return kCheckFailed;
    }

    // Makes sure the right subtree, if any, is a BST, and its min
    // (`right_ret.first`) is not less than `curr.value`.
    if (curr.right) {
        right_ret = CheckBST(*curr.right);
        if (right_ret == kCheckFailed || right_ret.first < curr.value)
            return kCheckFailed;
    }

    // Returns the range by combining the min of the left subtree and the max of the right subtree.
    return make_pair(left_ret.first, right_ret.second);
}
Note that CheckBST takes a (sub)tree root by reference to ensure the node (curr) is always valid. However, curr.left or curr.right may still be NULL, in which case the corresponding min or max value is just curr.value, as initialized in left_ret and right_ret.
A recursive solution that visits each node once.
Uncomment the commented-out lines to see how it gets called.
For the first call pass isBST(root, null, null).
public bool isBST(Node root, Node l, Node r)
{
    // Console.WriteLine($"Processing: isBST({root?.data}, {l?.data}, {r?.data})");
    if (root == null) return true;
    if (l != null && root.data <= l.data) return false;
    if (r != null && root.data >= r.data) return false;
    // Console.WriteLine($"isBST({root?.left?.data}, {l}, {root?.data}) && isBST({root?.right?.data}, {root?.data}, {r?.data})");
    return isBST(root.left, l, root) && isBST(root.right, root, r);
}
You don't need prev.
Check recursively that max(left) is less than or equal to the root.
Check recursively that min(right) is greater than the root.
Check that the left subtree is a BST.
Check that the right subtree is a BST.
Of course, check for nulls where needed. A sketch of this idea follows below.
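A rough C# sketch of that idea (my own illustration, assuming BstNode exposes int data and BstNode left/right as in the question; watch the int.MinValue/int.MaxValue sentinels if those values can appear as real data):
using System;

static class BstCheck
{
    // Returns (isBst, min, max) for the subtree rooted at node.
    public static (bool ok, int min, int max) Check(BstNode node)
    {
        if (node == null)
            return (true, int.MaxValue, int.MinValue); // neutral bounds for an empty subtree

        var left = Check(node.left);
        var right = Check(node.right);

        // max(left) <= root and min(right) > root, and both subtrees are BSTs themselves.
        bool ok = left.ok && right.ok && left.max <= node.data && right.min > node.data;

        return (ok, Math.Min(node.data, left.min), Math.Max(node.data, right.max));
    }
}

// Usage: bool isBst = BstCheck.Check(root).ok;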

HashSet with complex equality

Consider the following class
public class X
{
    // Unique per set / never null
    public ulong A { get; set; }
    // Unique per set / never null
    public string B { get; set; }
    // Combination of C and D is unique per set / both never null
    public string C { get; set; }
    public string D { get; set; }

    public override bool Equals(object obj)
    {
        var x = (X)obj;
        if (A == x.A || B == x.B)
            return true;
        if (C + D == x.C + x.D)
            return true;
        return false;
    }

    public override int GetHashCode()
    {
        return 0;
    }
}
I can't think of a way to write a hash function that respects the uniqueness rules described in the comments on the properties above, the way the Equals function does. In that case, is my best bet returning 0 from GetHashCode, or am I missing something?
This is not really possible; it is a fundamental problem. (Strictly speaking it can be done, but it is a VERY hard problem to solve.)
Explanation
Just think about it in reverse: in which cases are your objects NOT equal? From the code I can see that they are equal by this expression:
return A == x.A || B == x.B || (C + D) == (x.C + x.D)
And the not-equal expression:
return A != x.A && B != x.B && (C + D) != (x.C + x.D)
So your hash would have to be the same for any values satisfying the equality expression, and consistent for any values satisfying the inequality expression. Those values can vary without bound.
The only real solution that satisfies both expressions is a constant value. But this solution is not optimal for performance, because it throws away every benefit of overriding GetHashCode.
Consider using the IEqualityComparer interface, and equality algorithms suited to the task you are solving.
I think the best solution for finding equal objects is indexing. Look, for example, at how databases are built and how they use bit indexing.
Why are hashes so cruel?
If it were possible, every database in the world would simply hash everything into a single hash table, and all fast-access problems would be solved.
For example, imagine your object not as an object with properties but as an entire object state (for example, 32 boolean properties can be represented as one integer).
The hash function calculates a hash based on this state, but in your case you explicitly declare that some states in that space are actually equal:
class X
{
    bool A;
    bool B;
}
Your state space is:
A     B
false false -> 0
false true  -> 1
true  false -> 2
true  true  -> 3
If you define equality like this:
bool Equal(X x) { return x.A == A || x.B == B; }
You basically define this state equality:
0 == 0
0 == 1
0 == 2
0 != 3
1 == 0
1 == 1
1 != 2
1 == 3
2 == 0
2 != 1
2 == 2
2 == 3
3 != 0
3 == 1
3 == 2
3 == 3
These sets should each share the same hash: {0,1,2} {0,1,3} {0,2,3} {1,2,3}
So all of your states end up having to share an equal hash, which shows that it is impossible to create a hash function better than a constant value.
In this case, I would say that the hash code that defines an object as unique (i.e. overriding GetHashCode) shouldn't be the one used for your specific HashSet.
In other words, you should consider two instances of your class equal if their properties are all equal (not if any of the properties match). But then, if you want to group them by a certain criteria, use a specific implementation of IEqualityComparer<X>.
Also, strongly consider making the class immutable.
Apart from that, the only hash code I believe will really work is a constant one. Anything trying to be smarter than that will fail:
// if any of the properties match, consider the instances equal
public class AnyPropertyEqualityComparer : IEqualityComparer<X>
{
    public bool Equals(X x, X y)
    {
        if (object.ReferenceEquals(x, y))
            return true;
        if (object.ReferenceEquals(y, null) ||
            object.ReferenceEquals(x, null))
            return false;
        return (x.A == y.A ||
                x.B == y.B ||
                (x.C + x.D) == (y.C + y.D));
    }

    public int GetHashCode(X x)
    {
        return 42;
    }
}
Since you will have to evaluate all properties in any case, a HashSet will not help much here and you might as well use a plain List<T> (inserting a list of items into such a "hash set" degrades to O(n²) anyway).
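For instance, a sketch of the List<T> approach, reusing the comparer above purely for its Equals logic (candidate stands in for whatever item you are about to add):
using System.Collections.Generic;
using System.Linq;

var items = new List<X>();
var comparer = new AnyPropertyEqualityComparer();

// The membership test is O(n), which is what the "hash" set degrades to anyway.
bool alreadyPresent = items.Any(existing => comparer.Equals(existing, candidate));
if (!alreadyPresent)
    items.Add(candidate);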
You could consider creating an anonymous type and then returning the hashcode from that:
public override int GetHashCode()
{
    // The C + D concatenation needs an explicit member name in the anonymous type
    return new { A, B, CD = C + D }.GetHashCode();
}
Make sure you create some automated tests to verify that objects with the same values return the same hashcode.
Bear in mind that once the hashcode is given out, you must continue to return that code and not a new one.
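A minimal sanity check along those lines might look like this (a sketch only; it assumes X keeps the property setters shown in the question):
// Two instances with identical property values must report the same hash code.
var first  = new X { A = 1, B = "b", C = "c", D = "d" };
var second = new X { A = 1, B = "b", C = "c", D = "d" };

System.Diagnostics.Debug.Assert(
    first.GetHashCode() == second.GetHashCode(),
    "Equal-valued objects must produce equal hash codes.");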

Why is my OrderBy running forever with this comparator?

I have a class,
public class NullsAreLast : IComparer<int?>
{
    public int Compare(int? x, int? y)
    {
        if (y == null)
            return -1;
        else if (x == null)
            return 1;
        else
            return (int)x - (int)y;
    }
}
which is self-explanatory on how it is supposed to work.
Whenever I run
arr.OrderBy(i => i, new NullsAreLast())
with at least two null values in arr it runs forever! Any idea why?
Keep in mind that a sorting algorithm may compare the same two values several times over the process of ordering the whole sequence. Because of this, it's very important to be aware of all three possible results: less than, greater than, and equal.
This is (mostly) fine for your integer comparison at the end (the subtraction operation). There are some weird/rare edge cases when working with floating point numbers instead of integers, and calling .CompareTo() is the preferred practice anyway, but subtraction is usually good enough in this case. However, the null checks here are a real problem.
Think about what happens as a list is nearly finished sorting. You have two null values that have both made their way to the front of the list; the algorithm just needs to verify they are in the correct position. Because both x and y are null, your function should return 0: they are equivalent (for this purpose, at least). Instead, the code always returns -1. The comparison will always report that x comes before y, so the algorithm will always believe it still needs to swap them. It swaps, and tries to do the same thing again. And again. And again. And again. It can never finish.
Try this instead:
public class NullsAreLast : IComparer<int?>
{
    public int Compare(int? x, int? y)
    {
        if (!y.HasValue)
        {
            if (!x.HasValue) return 0;
            return -1;
        }
        if (!x.HasValue) return 1;
        return x.Value.CompareTo(y.Value);
    }
}
The minus operation at the end of your Compare method isn't appropriate for comparison. You need to handle exactly three possibilities - x is bigger, y is bigger, or they are the same.
MSDN
Compares two objects and returns a value indicating whether one is
less than, equal to, or greater than the other.
With this code, suppose X was 1000 and Y was 15. Your result would be 985, which doesn't make sense here.
Given your code and method name, I'm going to guess what you meant is this:
public class NullsAreLast : IComparer<int?>
{
    public int Compare(int? x, int? y)
    {
        if (y == null)
            return x == null ? 0 : -1; // both null: equal; otherwise the non-null value sorts first
        else if (x == null)
            return 1;
        else
        {
            int diff = x.Value - y.Value;
            if (diff == 0) return 0;  // same
            if (diff < 0) return -1;  // y was bigger, x comes first
            return 1;                 // x was bigger, x comes later
        }
    }
}
You could even smash it into a horrible one-liner:
return y == null ? (x == null ? 0 : -1) : (x == null ? 1 : x.Value.CompareTo(y.Value));

C# nth-child logic test

I've been working on my own headless browser implementation and I feel like I am making a mess of my nth-child selector logic. Given an element and its 0-based position in its group of siblings, is there a simple, one-line expression to see whether that element belongs in the result set?
public bool Evaluate(HTMLElement element)
{
    if (element.parentNode == element.ownerDocument)
        return false;

    List<Element> children = element.Parent.Children
        .Where(e => e is Element)
        .Cast<Element>()
        .ToList();

    int index = children.IndexOf(element);
    bool result = (an + b test here);
    return result;
}
Currently I have a convoluted set of branching logic based on tests for 0 values for (a) and (b) and I suspect I am making it more complicated than it needs to be.
If I'm understanding correctly, you need to determine whether an n exists such that index = a*n + b for some fixed a, b.
bool result = (a == 0) ? b == index : (Math.Abs(index - b) % Math.Abs(a)) == 0;
If a is 0, then index must equal b. Otherwise, a must evenly divide the difference between index and b.
Naturally, if a negative value for a is not allowed you can skip the Math.Abs(a) call.
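A few hand-picked spot checks of that expression (illustrative only):
using System;

// Same test as above, wrapped in a helper for the examples.
bool Matches(int a, int b, int index) =>
    (a == 0) ? b == index : (Math.Abs(index - b) % Math.Abs(a)) == 0;

Console.WriteLine(Matches(2, 0, 4)); // "2n"   at position 4 -> True
Console.WriteLine(Matches(2, 1, 4)); // "2n+1" at position 4 -> False
Console.WriteLine(Matches(0, 3, 3)); // "0n+3" at position 3 -> True
One caveat to verify against the spec you are targeting: nth-child only matches for non-negative n, so depending on your parser you may also want to check that (index - b) and a have the same sign (or that index equals b).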

Array sorting by two parameters

I'm having a little difficulty with Array.Sort. I have a class with two fields: one is a random string, the other is a random number. If I sort by one parameter it works just fine, but I would like to sort by two parameters: first by the SUM of the numbers (from low to high), and THEN, if the sums are equal, by the random string given to them (from low to high).
Can you give some hints and tips on how I can "merge" these two kinds of sort?
Array.Sort(Phonebook, delegate(PBook user1, PBook user2)
    { return user1.Sum().CompareTo(user2.Sum()); });

Console.WriteLine("ORDER");
foreach (PBook user in Phonebook)
{
    Console.WriteLine(user.name);
}
That's how I order it with one parameter.
I think this is what you are after:
sourcearray.OrderBy(a=> a.sum).ThenBy(a => a.random)
Here is the general algorithm that you'll use for comparing multiple fields in a CompareTo method:
public int Compare(MyClass first, MyClass second)
{
    int firstComparison = first.FirstValue.CompareTo(second.FirstValue);
    if (firstComparison != 0)
    {
        return firstComparison;
    }
    else
    {
        return first.SecondValue.CompareTo(second.SecondValue);
    }
}
However, LINQ does make the syntax for doing this much easier, allowing you to only write:
Phonebook = Phonebook.OrderBy(book => book.Sum())
                     .ThenBy(book => book.OtherProperty)
                     .ToArray();
You can do this in-place by using a custom IComparer<PBook>. The following should order your array as per your original code, but if two sums are equal it should fall back on the random string (which I've called RandomString):
public class PBookComparer : IComparer<PBook>
{
    public int Compare(PBook x, PBook y)
    {
        // Sort null items to the top; you can drop this
        // if you don't care about null items.
        if (x == null)
            return y == null ? 0 : -1;
        else if (y == null)
            return 1;

        // Comparison of sums.
        var sumCompare = x.Sum().CompareTo(y.Sum());
        if (sumCompare != 0)
            return sumCompare;

        // Sums are the same; return comparison of strings.
        return String.Compare(x.RandomString, y.RandomString);
    }
}
You call this as
Array.Sort(Phonebook, new PBookComparer());
You could just do this inline but it gets a bit hard to follow:
Array.Sort(Phonebook, (x, y) => {
int sc = x.Sum().CompareTo(y.Sum());
return sc != 0 ? sc : string.Compare(x.RandomString, y.RandomString); });
... Actually, that isn't too bad, although I have dropped the null checks.
