IEqualityComparer and weird results - c#

Take a look at this class:
public class MemorialPoint:IMemorialPoint,IEqualityComparer<MemorialPoint>
{
private string _PointName;
private IPoint _PointLocation;
private MemorialPointType _PointType;
private DateTime _PointStartTime;
private DateTime _PointFinishTime;
private string _NeighborName;
private double _Rms;
private double _PointPdop;
private double _PointHdop;
private double _PointVdop;
// getters and setters omitted
public bool Equals(MemorialPoint x, MemorialPoint y)
{
if (x.PointName == y.PointName)
return true;
else if (x.PointName == y.PointName && x.PointLocation.X == y.PointLocation.X && x.PointLocation.Y == y.PointLocation.Y)
return true;
else
return false;
}
public int GetHashCode(MemorialPoint obj)
{
return (obj.PointLocation.X.ToString() + obj.PointLocation.Y.ToString() + obj.PointName).GetHashCode();
}
}
I also have a Vector class, which is merely two points and some other attributes. I don't want to have equal points in my Vector, so I came up with this method:
public void RecalculateVector(IMemorialPoint fromPoint, IMemorialPoint toPoint, int partIndex)
{
if (fromPoint.Equals(toPoint))
throw new ArgumentException(Messages.VectorWithEqualPoints);
this.FromPoint = fromPoint;
this.ToPoint = toPoint;
this.PartIndex = partIndex;
// the constructDifference method has a weird way of working:
// difference of Point1 and Point 2, so point2 > point1 is the direction
IVector3D vector = new Vector3DClass();
vector.ConstructDifference(toPoint.PointLocation, fromPoint.PointLocation);
this.Azimuth = MathUtilities.RadiansToDegrees(vector.Azimuth);
IPointCollection pointCollection = new PolylineClass();
pointCollection.AddPoint(fromPoint.PointLocation, ref _missing, ref _missing);
pointCollection.AddPoint(toPoint.PointLocation, ref _missing, ref _missing);
this._ResultingPolyline = pointCollection as IPolyline;
}
And this unit test, which should give me an exception:
[TestMethod]
[ExpectedException(typeof(ArgumentException), Messages.VectorWithEqualPoints)]
public void TestMemoriaVector_EqualPoints()
{
IPoint p1 = PointPolygonBuilder.BuildPoint(0, 0);
IPoint p2 = PointPolygonBuilder.BuildPoint(0, 0);
IMemorialPoint mPoint1 = new MemorialPoint("teste1", p1);
IMemorialPoint mPoint2 = new MemorialPoint("teste1", p2);
Console.WriteLine(mPoint1.GetHashCode().ToString());
Console.WriteLine(mPoint2.GetHashCode().ToString());
vector = new MemorialVector(mPoint1, mPoint1, 0);
}
When I use the same point, that is, mPoint1, as in the code, the exception is thrown. When I use mPoint2, even though their names and coordinates are the same, the exception is not thrown. I checked their hash codes, and they are in fact different. Based on the code I wrote in GetHashCode, I thought these two points would have the same hash code.
Can someone explain to me why this is not working as I thought it would? I'm not sure I explained this well, but.. I appreciate the help :D
George

You're implementing IEqualityComparer<T> within the type it's trying to compare - which is very odd. You should almost certainly just be implementing IEquatable<T> and overriding Equals(object) instead. That would definitely make your unit test work.
The difference between IEquatable<T> and IEqualityComparer<T> is that the former is implemented by a class to say, "I can compare myself with another instance of the same type." (It doesn't have to be the same type, but it usually is.) This is appropriate if there's a natural comparison - for example, the comparison chosen by string is ordinal equality - it's got to be exactly the same sequence of char values.
Now IEqualityComparer<T> is different - it can compare any two instances of a type. There can be multiple different implementations of this for a given type, so it doesn't matter whether or not a particular comparison is "the natural one" - it's just got to be the right one for your job. So for example, you could have a Shape class, and different equality comparers to compare shapes by colour, area or something like that.
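To make the IEquatable<T> suggestion concrete, here is a minimal sketch. It is not the asker's real class: plain double X and Y properties stand in for the ArcObjects IPoint, and the equality rule (same name and same coordinates) is an assumption about what was intended.

```csharp
using System;

// Simplified stand-in for the question's class: plain doubles replace IPoint.
public sealed class MemorialPoint : IEquatable<MemorialPoint>
{
    public string PointName { get; private set; }
    public double X { get; private set; }
    public double Y { get; private set; }

    public MemorialPoint(string pointName, double x, double y)
    {
        PointName = pointName;
        X = x;
        Y = y;
    }

    // Strongly typed equality: same name and same coordinates (assumed rule).
    public bool Equals(MemorialPoint other)
    {
        if (ReferenceEquals(other, null)) return false;
        return PointName == other.PointName && X == other.X && Y == other.Y;
    }

    // Route the untyped overload through the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as MemorialPoint);
    }

    // Must agree with Equals: equal points produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (PointName == null ? 0 : PointName.GetHashCode());
            hash = hash * 31 + X.GetHashCode();
            hash = hash * 31 + Y.GetHashCode();
            return hash;
        }
    }
}
```

With this in place, a call such as fromPoint.Equals(toPoint) goes through the overridden Equals(object) even when the variables are typed as an interface, which is the call the unit test depends on.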

You need to override Object.Equals as well.
Add this to your implementation:
// In MemorialPoint:
public override bool Equals(object obj)
{
if (obj == null || GetType() != obj.GetType())
return false;
MemorialPoint y = obj as MemorialPoint;
if (this.PointName == y.PointName)
return true;
else if (this.PointName == y.PointName && this.PointLocation.X == y.PointLocation.X && this.PointLocation.Y == y.PointLocation.Y)
return true;
else
return false;
}
I'd then rework your other implementation to use the first, plus add the appropriate null checks.
public bool Equals(MemorialPoint x, MemorialPoint y)
{
if (x == null)
return (y == null);
return x.Equals(y);
}

You also need to rethink your concept of "equality", since it's not currently meeting .NET framework requirements.
If at all possible, I recommend a re-design with a Repository of memorial point objects (possibly keyed by name), so that simple reference equality can be used.

You've put an arcobjects tag on this, so I just thought I'd mention IRelationalOperator.Equals. I've never tested to see if this method honors the cluster tolerance of the geometries' spatial references. This can be adjusted using ISpatialReferenceTolerance.XYTolerance.

Related

Looking if List<T> has <T> (no matter of attribute orders in <T>) in C#

I have List<Moves> listOfMoves
ListOfMoves.Add(new Moves()
{
int position1= number1,
int position2= number2,
});
Now I want to check if ListOfMoves contains for example Move(2,3), but also to check if it contains Move(3,2).
I tried if(ListOfMoves.Contains(new Move(2,3))) but this does not work properly.
Method List<T>.Contains(T item) internally uses method Object.Equals to check if objects are equal. Therefore if you want to use method List<T>.Contains(T item) with your type T to check if the specified item is contained in the List<T> then you need to override method Object.Equals in your type T.
When you override Object.Equals you should also override Object.GetHashCode. Here is a good explanation "Why is it important to override GetHashCode when Equals method is overridden?".
Here is how you should override Object.Equals in the Move class to fit your requirement:
class Move
{
public Move(int p1, int p2)
{
position1 = p1;
position2 = p2;
}
public int position1 { get; }
public int position2 { get; }
public override bool Equals(object obj)
{
if (obj == null)
return false;
if (ReferenceEquals(this, obj))
return true;
Move other = obj as Move;
if (other == null)
return false;
// Here we specify how to compare two Moves. Here we implement your
// requirement that two moves are considered equal regardless of the
// order of the properties.
return (position1 == other.position1 && position2 == other.position2) ||
(position1 == other.position2 && position2 == other.position1);
}
public override int GetHashCode()
{
// When implementing GetHashCode we have to follow the next rules:
// 1. If two objects are equal then their hash codes must be equal too.
// 2. Hash code must not change during the lifetime of the object.
// Therefore Move must be immutable. (Thanks to Enigmativity's useful tip).
return position1 + position2;
}
}
When you override Object.Equals you will be able to use condition ListOfMoves.Contains(new Move(2, 3)) to check if moves Move(2, 3) or Move(3, 2) are contained in the ListOfMoves.
Here is a complete sample that demonstrates overriding of Object.Equals.
For this you can use LINQ's Any function. If you want both combinations for the positions [ (2,3) or (3,2) ] you'll need to pass in two checks
ListOfMoves.Any(x =>
(x.position1 == 2 && x.position2 == 3)
|| (x.position1 == 3 && x.position2 == 2) )
Any returns a bool so you can wrap this line of code in an if statement or store the result for multiple uses
Potential improvement
If you're going to be doing a lot of these checks (and you're using at least c# version 7) you could consider some minor refactoring and use the built in tuples type: https://learn.microsoft.com/en-us/dotnet/csharp/tuples
Moves would become
public class Moves
{
public (int position1, int position2) positions { get; set; }
}
And the Any call would become
ListOfMoves.Any(x => x.positions == (2,3) || x.positions == (3,2))
Elsewhere in the code you can still access the underlying value of each position like so:
ListOfMoves[0].positions.position1
Obviously depends on what else is going on in your code so totally up to you!
This won't work as written because Contains compares the object itself (by reference, unless Equals is overridden). Compare the property values instead, using System.Linq:
ListOfMoves.Where(x => x.position1 == 2 && x.position2 == 3)
Note: your posted code shouldn't compile in the first place.
You said you need to get true if either Move(3,2) or Move(2,3) is in the list.
Then use Any() with a predicate that covers both orders:
if(ListOfMoves.Any(x => (x.position1 == 2 && x.position2 == 3) || (x.position1 == 3 && x.position2 == 2)))
{
// done something here
}

Validate object/struct without failing

Assume we have a huge list of numeric cartesian coordinates (5;3)(1;-9) etc. To represent a point in oop I created a struct/object (c#):
public struct Point
{
public int X, Y { get; }
public Point(int x, int y)
{
// Check if x,y falls within certain boundaries (ex. -1000, 1000)
}
}
I may be using the struct incorrectly. I guess normally you would not use a constructor, but this is not the point.
Suppose I want to add a list of 1000 points and there is no guarantee that these coordinates fall within boundaries. Simply if the point is not valid, move to the next one without failing and inform user about it. As for object, I would think that Point should be responsible for instantiation and validation by itself but I am not sure how to deal with it in this particular case. Checking x, y beforehand by the caller would be the simplest approach but it does not feel right because caller would have to handle logic that should reside in Point.
What would the most appropriate approach to validate and handle incorrect coordinates without failing and violating SRP?
You can't do this in the constructor; the constructor either runs successfully or it doesn't. If it doesn't, it's because an exception is raised, so so much for silently failing. You could catch exceptions, but that would basically mean you are using exceptions as a control flow mechanism, and that is a big no-no; don't do that!
One solution similar to what you are thinking is to use a static factory method:
public struct Point
{
public static bool TryCreatePoint(int x, int y, Bounds bounds, out Point point)
{
if (x and y are inside bounds)
{
point = new Point(x, y);
return true;
}
point = default(Point);
return false;
}
//...
}
And the code adding points to the list should act based upon creation success.
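As a rough, runnable sketch of that idea: the version below drops the Bounds parameter and hard-codes the -1000..1000 range from the question, and the PointLoader helper and its message format are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public struct Point
{
    // Assumed limits, taken from the question's example range.
    private const int Min = -1000;
    private const int Max = 1000;

    public int X { get; private set; }
    public int Y { get; private set; }

    // Private, so TryCreate is the only way to get a non-default Point.
    private Point(int x, int y) : this()
    {
        X = x;
        Y = y;
    }

    public static bool TryCreate(int x, int y, out Point point)
    {
        if (x >= Min && x <= Max && y >= Min && y <= Max)
        {
            point = new Point(x, y);
            return true;
        }
        point = default(Point);
        return false;
    }
}

// Hypothetical caller: keeps valid points, records a message for each rejected pair.
public static class PointLoader
{
    public static List<Point> Load(IEnumerable<int[]> raw, List<string> rejected)
    {
        var points = new List<Point>();
        foreach (int[] pair in raw)
        {
            Point p;
            if (Point.TryCreate(pair[0], pair[1], out p))
                points.Add(p);
            else
                rejected.Add("Skipped (" + pair[0] + ";" + pair[1] + ")");
        }
        return points;
    }
}
```

The validation rule stays inside Point, while the caller only decides what to do with a rejected pair, so nothing throws and the user can still be told which coordinates were skipped.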
Fun fact: if you are using C# 7 the code could look a lot cleaner:
public static (bool Successful, Point NewPoint) TryCreatePoint(int x, int y, Bounds bounds)
{
if (x and y are inside bounds)
return (true, new Point(x, y));
return (false, default(Point));
}
I can think of three options:
Have the constructor throw an exception that you catch. This is not really great if you are expecting a lot of failures.
Have an IsValid property on the struct that you can use to filter it out once created.
Have the thing loading the data take responsibility for validating the data as well. This would be my preferred option. You say "it does not feel right because caller would have to handle logic that should reside in Point" but I would argue that the responsibility for checking that loaded data is correct is with the thing loading the data, not the data type. You could also have it throw an ArgumentOutOfRangeException in the constructor if the inputs are not valid, now that you are no longer expecting anything invalid to be passed, as a belt and braces approach to things.
What you want to do is simply not possible: an instance of a class is either fully created or not at all. If the constructor has been called, the only way to not instantiate an instance is by throwing an exception.
So you have these two options:
Extract a method Validate that returns a bool and can be called from the caller of your class.
public struct Point
{
public int X { get; }
public int Y { get; }
public Point(int x, int y)
{
X = x;
Y = y;
}
// True when both coordinates fall inside the allowed range.
public bool Validate() { return -1000 <= X && X <= 1000 && -1000 <= Y && Y <= 1000; }
}
Of course you could do the same using a property.
Throw an exception in the constructor
public Point(int x, int y)
{
if(x > 1000) throw new ArgumentException("Value must be smaller than or equal to 1000");
// ...
}
However the best solution IMHO is to validate the input before you even think about creating a point, that is check the arguments passed to the constructor beforehand:
if(...)
p = new Point(x, y);
else
...
To be honest, Point shouldn't check boundaries, so the caller should do that. A point is valid in the range that their X and Y can operate (int.MinValue and int.MaxValue). So a -1000000,2000000 is a valid point. The problem is that this point isn't valid for YOUR application, so YOUR application (the caller), the one who is using point, should have that logic, not inside the point constructor.
Structs in C# are funny so I'll add another "funny" way to check:
struct Point
{
int _x;
public int X
{
get { return _x; }
set { _x = value; ForceValidate(); }
} // simple getter & setter for X
int _y;
public int Y
{
get { return _y; }
set { _y = value; ForceValidate(); }
} // simple getter & setter for Y
void ForceValidate()
{
const int MAX = 1000;
const int MIN = -1000;
if(this.X >= MIN && this.X <= MAX && this.Y >= MIN && this.Y <= MAX)
{
return;
}
this = default(Point); // Yes, you can reassign "this" in structs using C#
}
}

Comparing approximate values in c# 4.0?

First of all, please excuse any typo, English is not my native language.
Here's my question. I'm creating a class that represents approximate values as such:
public sealed class ApproximateValue
{
public double MaxValue { get; private set; }
public double MinValue { get; private set; }
public double Uncertainty { get; private set; }
public double Value { get; private set; }
public ApproximateValue(double value, double uncertainty)
{
if (uncertainty < 0) { throw new ArgumentOutOfRangeException("uncertainty", "Value must be positive or equal to 0."); }
this.Value = value;
this.Uncertainty = uncertainty;
this.MaxValue = this.Value + this.Uncertainty;
this.MinValue = this.Value - this.Uncertainty;
}
}
I want to use this class for uncertain measurements, like x = 8.31246 +/-0.0045 for example, and perform calculations on these values.
I want to overload operators in this class. I don't know how to implement the >, >=, <= and < operators... The first thing I thought of is something like this:
public static bool? operator >(ApproximateValue a, ApproximateValue b)
{
if (a == null || b == null) { return null; }
if (a.MinValue > b.MaxValue) { return true; }
else if (a.MaxValue < b.MinValue) { return false; }
else { return null; }
}
However, in the last case, I'm not satisfied with this 'null' as the accurate result is not 'null'. It may be 'true' or it may be 'false'.
Is there any object in .Net 4 that would help implementing this feature I am not aware of, or am I doing the correct way? I was also thinking about using an object instead of a boolean that would define in what circumstances the value is superior or not to another one rather than implementing comparison operators but I feel it's a bit too complex for what I'm trying to achieve...
I'd probably do something like this. I'd implement IComparable<ApproximateValue> and then define <, >, <=, and >= according to the result of CompareTo():
public int CompareTo(ApproximateValue other)
{
// if other is null, we are greater by default in .NET, so return 1.
if (other == null)
{
return 1;
}
// this is > other
if (MinValue > other.MaxValue)
{
return 1;
}
// this is < other
if (MaxValue < other.MinValue)
{
return -1;
}
// "same"-ish
return 0;
}
public static bool operator <(ApproximateValue left, ApproximateValue right)
{
return (left == null) ? (right != null) : left.CompareTo(right) < 0;
}
public static bool operator >(ApproximateValue left, ApproximateValue right)
{
return (right == null) ? (left != null) : right.CompareTo(left) < 0;
}
public static bool operator <=(ApproximateValue left, ApproximateValue right)
{
return (left == null) || left.CompareTo(right) <= 0;
}
public static bool operator >=(ApproximateValue left, ApproximateValue right)
{
return (right == null) || right.CompareTo(left) <= 0;
}
public static bool operator ==(ApproximateValue left, ApproximateValue right)
{
// ReferenceEquals avoids calling this overloaded == again, which would recurse forever.
return ReferenceEquals(left, null) ? ReferenceEquals(right, null) : left.CompareTo(right) == 0;
}
public static bool operator !=(ApproximateValue left, ApproximateValue right)
{
return !(left == right);
}
This is one of the rare cases where it may make more sense to define a value type (struct), which then eliminates the null case concern. You can also modify MinValue and MaxValue to be computed properties (just implement a get method that computes the result) rather than storing them upon construction.
On a side note, comparison of approximate values is itself an approximate operation, so you need to consider the use cases for your data type; are you only intending to use comparison to determine when the ranges are non-overlapping? It really depends on the meaning of your type. Is this intended to represent a data point from a normally distributed data set, where the uncertainty is some number of standard deviations for the sampling? If so, it might make more sense for a comparison operation to return a numeric probability (which couldn't be called through the comparison operator, of course.)
It looks to me like you also need to handle the case where a.MaxValue == b.MinValue. Your current implementation returns null there, which seems incorrect; it should return either true or false, depending on how you want the spec to work. I'm not aware of any built-in .NET functionality for this, so I believe you are going about it the correct way.
return a.Value - a.Uncertainty > b.Value + b.Uncertainty;
I wouldn't really mess with the semantics of >: I think bool? is a dangerous return type here. That said, given the uncertainty, you could return true, if a is more likely to be > b.
It seems to me that you're trying to implement some form of Ternary Logic because you want the result of applying the operators to be either True, False or Indeterminate. The problem with doing that is that you really cannot combine the built-in boolean values with your indeterminate value. So whilst you could do some limited form of comparison of two ApproximateValues I think that it's inappropriate to use bool as the result of these comparisons because that implies that the result of the comparisons can be freely combined with other expressions that result in bool values, but the possibility of an indeterminate value undermines that. For example, it makes no sense to do the following when the result of operation on the left of the OR is indeterminate.
ApproximateValue approx1 = ...;
ApproximateValue approx2 = ...;
bool someBool = ...;
bool result = approx1 > approx2 || someBool;
So, in my opinion, I don't think that it's a good idea to implement the comparisons as operators at all if you want to retain the indeterminacy. The solutions offered here eliminate the indeterminacy, which is fine, but not what was originally specified.
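If the indeterminate outcome needs to be kept, one option is a named method returning a three-valued result instead of an operator. This is only a sketch: the ComparisonResult enum and the CompareWith method are names invented here, and the class is a trimmed-down version of the one in the question.

```csharp
using System;

// Invented for illustration: an explicit three-valued comparison outcome.
public enum ComparisonResult { Less, Greater, Indeterminate }

public sealed class ApproximateValue
{
    public double Value { get; private set; }
    public double Uncertainty { get; private set; }
    public double MinValue { get { return Value - Uncertainty; } }
    public double MaxValue { get { return Value + Uncertainty; } }

    public ApproximateValue(double value, double uncertainty)
    {
        if (uncertainty < 0) throw new ArgumentOutOfRangeException("uncertainty");
        Value = value;
        Uncertainty = uncertainty;
    }

    // A named method rather than an operator, so the result cannot be
    // silently combined with ordinary bool expressions.
    public ComparisonResult CompareWith(ApproximateValue other)
    {
        if (other == null) throw new ArgumentNullException("other");
        if (MinValue > other.MaxValue) return ComparisonResult.Greater;
        if (MaxValue < other.MinValue) return ComparisonResult.Less;
        return ComparisonResult.Indeterminate; // the ranges overlap or touch
    }
}
```

Callers then have to switch on the result explicitly, which forces them to decide what an overlapping range means in their context.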

C# GetHashCode question

What would be the best way to override the GetHashCode function for the case, when
my objects are considered equal if there is at least ONE field match in them.
In the case of generic Equals method the example might look like this:
public bool Equals(Whatever other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
// Considering that the values can't be 'null' here.
return other.Id.Equals(Id) || Equals(other.Money, Money) ||
Equals(other.Code, Code);
}
Still, I'm confused about making a good GetHashCode implementation for this case.
How should this be done?
Thank you.
This is a terrible definition of Equals because it is not transitive.
Consider
x = { Id = 1, Money = 0.1, Code = "X" }
y = { Id = 1, Money = 0.2, Code = "Y" }
z = { Id = 3, Money = 0.2, Code = "Z" }
Then x == y and y == z but x != z.
Additionally, we can establish that the only reasonable implementation of GetHashCode is a constant map.
Suppose that x and y are distinct objects. Let z be the object
z = { Id = x.Id, Money = y.Money, Code = "Z" }
Then x == z and y == z so that x.GetHashCode() == z.GetHashCode() and y.GetHashCode() == z.GetHashCode() establishing that x.GetHashCode() == y.GetHashCode(). Since x and y were arbitrary we have established that GetHashCode is constant.
Thus, we have shown that the only possible implementation of GetHashCode is
private readonly int constant = 17;
public override int GetHashCode() {
return constant;
}
All of this put together makes it clear that you need to rethink the concept you are trying to model, and come up with a different definition of Equals.
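The non-transitivity argument can be checked directly. The sketch below reuses the x, y, z values from the example; the Matches method name and the decimal type for Money are assumptions, since the question does not show the field types.

```csharp
public sealed class Whatever
{
    public int Id { get; private set; }
    public decimal Money { get; private set; }
    public string Code { get; private set; }

    public Whatever(int id, decimal money, string code)
    {
        Id = id;
        Money = money;
        Code = code;
    }

    // The "at least one field matches" rule from the question,
    // under a name that does not pretend to be equality.
    public bool Matches(Whatever other)
    {
        if (ReferenceEquals(null, other)) return false;
        return Id == other.Id || Money == other.Money || Code == other.Code;
    }
}

public static class TransitivityDemo
{
    // True when x matches y and y matches z, yet x does not match z.
    public static bool IsBroken()
    {
        var x = new Whatever(1, 0.1m, "X");
        var y = new Whatever(1, 0.2m, "Y");
        var z = new Whatever(3, 0.2m, "Z");
        return x.Matches(y) && y.Matches(z) && !x.Matches(z);
    }
}
```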
I don't think you should be using Equals for this. People have a very explicit notion of what equals means, and if the Ids are different but the code or name are the same, I would not consider those "Equal". Maybe you need a different method like "IsCompatible".
If you want to be able to group them, you could use the extension method ToLookup() on a list of these objects, to use a predicate which would be your IsCompatible method. Then they would be grouped.
The golden rule is: if the objects compare equal, they must produce the same hash code.
Therefore a conforming (but let's say, undesirable) implementation would be
public override int GetHashCode()
{
return 0;
}
Frankly, if Id, Money and Code are independent of each other then I don't know if you can do any better. Putting objects of this type in a hash table is going to be painful.

IEqualityComparer for Value Objects

I have an immutable Value Object, IPathwayModule, whose value is defined by:
(int) Block;
(Entity) Module, identified by (string) ModuleId;
(enum) Status; and
(entity) Class, identified by (string) ClassId - which may be null.
Here's my current IEqualityComparer implementation which seems to work in a few unit tests. However, I don't think I understand what I'm doing well enough to know whether I am doing it right. A previous implementation would sometimes fail on repeated test runs.
private class StandardPathwayModuleComparer : IEqualityComparer<IPathwayModule>
{
public bool Equals(IPathwayModule x, IPathwayModule y)
{
int hx = GetHashCode(x);
int hy = GetHashCode(y);
return hx == hy;
}
public int GetHashCode(IPathwayModule obj)
{
int h;
if (obj.Class != null)
{
h = obj.Block.GetHashCode() + obj.Module.ModuleId.GetHashCode() + obj.Status.GetHashCode() + obj.Class.ClassId.GetHashCode();
}
else
{
h = obj.Block.GetHashCode() + obj.Module.ModuleId.GetHashCode() + obj.Status.GetHashCode() + "NOCLASS".GetHashCode();
}
return h;
}
}
IPathwayModule is definitely immutable and different instances with the same values should be equal and produce the same HashCode since they are used as items within HashSets.
I suppose my questions are:
Am I using the interface correctly in this case?
Are there cases where I might not see the desired behaviour?
Is there any way to improve the robustness, performance?
Are there any good practices that I am not following?
Don't implement Equals in terms of the hash function's results; it's too fragile. Rather, do a field value comparison for each of the fields. Something like:
return x != null && y != null && x.Name.Equals(y.Name) && x.Type.Equals(y.Type) ...
Also, the hash functions results aren't really amenable to addition. Try using the ^ operator instead.
return obj.Name.GetHashCode() ^ obj.Type.GetHashCode() ...
You don't need the null check in GetHashCode. If that value is null, you've got bigger problems, no use trying to recover from something over which you have no control...
The only big problem is the implementation of Equals. Hash codes are not unique, you can get the same hash code for objects which are different. You should compare each field of IPathwayModule individually.
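A tiny illustration of why equal hash codes cannot stand in for equality. The Pair class is made up for this example; it combines two int fields by addition, the same way the question's comparer combines its hash codes.

```csharp
// Made-up type for illustration only.
public sealed class Pair
{
    public int A { get; private set; }
    public int B { get; private set; }

    public Pair(int a, int b)
    {
        A = a;
        B = b;
    }

    // Additive hash, as in the question: the order of the fields is lost.
    public int AdditiveHash()
    {
        return unchecked(A + B);
    }

    // Field-wise comparison: the reliable equality test.
    public bool FieldsEqual(Pair other)
    {
        return other != null && A == other.A && B == other.B;
    }
}
```

new Pair(1, 2) and new Pair(2, 1) share the additive hash 3 but are not field-wise equal, so an Equals that only compares hashes would treat them as the same value.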
GetHashCode() can be improved a bit. You don't need to call GetHashCode() on an int. The int itself is a good hash code. The same for enum values. Your GetHashCode could be then implemented like this:
public int GetHashCode(IPathwayModule obj)
{
unchecked {
int h = obj.Block + obj.Module.ModuleId.GetHashCode() + (int) obj.Status;
if (obj.Class != null)
h += obj.Class.ClassId.GetHashCode();
return h;
}
}
The 'unchecked' block is necessary because there may be overflows in the arithmetic operations.
You shouldn't use GetHashCode() as the main way of comparison objects. Compare it field-wise.
There could be multiple objects with the same hash code (this is called 'hash code collisions').
Also, be careful when adding multiple integer values together, since the sum can overflow (and throws an OverflowException in a checked context). Use 'exclusive or' (^) to combine hash codes, or wrap the code in an 'unchecked' block.
You should implement better versions of Equals and GetHashCode.
For instance, the hash code of enums is simply their numerical value.
In other words, with these two enums:
public enum A { x, y, z }
public enum B { k, l, m }
Then with your implementation, the following value type:
public struct AB {
public A A;
public B B;
}
the following two values would be considered equal:
AB ab1 = new AB { A = A.x, B = B.m };
AB ab2 = new AB { A = A.z, B = B.k };
I'm assuming you don't want that.
Also, passing the value types as interfaces will box them, this could have performance concerns, although probably not much. You might consider making the IEqualityComparer implementation take your value types directly.
Assuming that two objects are equal because their hash code is equal is wrong. You need to compare all members individually
It is probably better to use ^ rather than + to combine the hash codes.
If I understand you well, you'd like to hear some comments on your code. Here're my remarks:
1. GetHashCode values should be XOR'ed together, not added. XOR (^) gives a better chance of preventing collisions.
2. You compare hash codes. That's good, but only do this if the underlying object overrides GetHashCode. If not, use the properties and their hash codes and combine them.
3. Hash codes are important: they make a quick comparison possible. But if hash codes are equal, the objects can still be different. This happens rarely, but you'll need to compare the fields of your objects if the hash codes are equal.
4. You say your value types are immutable, but you reference objects (.Class), which are not immutable.
5. Always optimize comparison by adding a reference comparison as the first test. References unequal, the objects are unequal, then the structs are unequal.
Point 5 depends on whether you want the objects that you reference in your value type to compare as not equal when they are not the same reference.
EDIT: you compare many strings. String comparison is optimized in C#. You can, as others suggested, better use == with them in your comparison. For GetHashCode, use XOR (^) as suggested by others as well.
Thanks to all who responded. I have aggregated the feedback from everyone who responded and my improved IEqualityComparer now looks like:
private class StandardPathwayModuleComparer : IEqualityComparer<IPathwayModule>
{
public bool Equals(IPathwayModule x, IPathwayModule y)
{
if (x == y) return true;
if (x == null || y == null) return false;
if ((x.Class == null) ^ (y.Class == null)) return false;
if (x.Class == null) //and implicitly y.Class == null
{
return x.Block.Equals(y.Block) && x.Status.Equals(y.Status) && x.Module.ModuleId.Equals(y.Module.ModuleId);
}
return x.Block.Equals(y.Block) && x.Status.Equals(y.Status) && x.Module.ModuleId.Equals(y.Module.ModuleId) && x.Class.ClassId.Equals(y.Class.ClassId);
}
public int GetHashCode(IPathwayModule obj)
{
unchecked {
int h = obj.Block ^ obj.Module.ModuleId.GetHashCode() ^ (int) obj.Status;
if (obj.Class != null)
{
h ^= obj.Class.ClassId.GetHashCode();
}
return h;
}
}
}
