LINQ Distinct with EqualityComparer<T>.Default: IEquatable<T> implementation ignored?

I have a class Foo with two fields where the Equals and GetHashCode methods have been overridden:
public class Foo
{
private readonly int _x;
private readonly int _y;
public Foo(int x, int y) { _x = x; _y = y; }
public override bool Equals(object obj) {
Foo other = obj as Foo;
return other != null && _y == other._y;
}
public override int GetHashCode() { return _y; }
}
If I create an array of Foo:s and count the number of Distinct values of this array:
var array = new[] { new Foo(1, 1), new Foo(1, 2), new Foo(2, 2), new Foo(3, 2) };
Console.WriteLine(array.Distinct().Count());
The number of distinct values is recognized as:
2
If I now make my class Foo implement IEquatable<Foo> using the following implementation:
public bool Equals(Foo other) { return _y == other._y; }
The number of distinct values is still:
2
But if I change the implementation to this:
public bool Equals(Foo other) { return _x == other._x; }
The computed number of distinct Foo:s is neither 3 (i.e. the number of distinct _x) nor 2 (number of distinct _y), but:
4
And if I comment out the Equals and GetHashCode overrides but keep the IEquatable<Foo> implementation, the answer is also 4.
According to the MSDN documentation, this Distinct overload should use the static property EqualityComparer<T>.Default to define the equality comparison, and:
The Default property checks whether type T implements the System.IEquatable<T>
interface and, if so, returns an EqualityComparer<T> that uses that
implementation. Otherwise, it returns an EqualityComparer<T> that uses the
overrides of Object.Equals and Object.GetHashCode provided by T.
But looking at the experiment above, this statement does not seem to hold. At best, the IEquatable<Foo> implementation supports the already provided Equals and GetHashCode overrides, and at worst it completely corrupts the equality comparison.
My questions:
Why does the independent implementation of IEquatable<T> corrupt the equality comparison?
Can it play a role independent of the Equals and GetHashCode overrides?
If not, why does EqualityComparer<T>.Default look for this implementation first?

Your GetHashCode method only depends on y. That means if your Equals method doesn't depend on y, you've broken the contract of equality... they're inconsistent.
Distinct() is going to expect that equal elements have the same hash code. In your case, the only equal elements by x value have different hash codes, therefore Equals won't even get called.
From the docs of IEquatable<T>.Equals:
If you implement Equals, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behavior is consistent with that of the IEquatable<T>.Equals method.
Your implementation of Equals(Foo) isn't consistent with either Equals(object) or GetHashCode.
EqualityComparer<T>.Default will still delegate to your GetHashCode method - it will just use your Equals(T) method in preference to your Equals(object) method.
So to answer your questions in order:
Why does the independent implementation of IEquatable<T> corrupt the equality comparison?
Because you've introduced an inconsistent implementation. It's not meant to be independent in terms of behaviour. It's just meant to be more efficient by avoiding a type check (and boxing, for value types).
Can it play a role independent of the Equals and GetHashCode overrides?
It should be consistent with Equals(object) for the sake of sanity, and it must be consistent with GetHashCode for the sake of correctness.
If not, why does EqualityComparer<T>.Default look for this implementation first?
To avoid runtime type checking and boxing/unboxing, primarily.
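To make the effect concrete, here is a minimal sketch that reproduces the counts from the question. BrokenFoo and ConsistentFoo are names made up for this illustration; the first mirrors the inconsistent variant (Equals(T) on x, GetHashCode on y), the second keeps all three members in agreement.

```csharp
using System;
using System.Linq;

// Inconsistent: Equals(BrokenFoo) compares X, but GetHashCode and Equals(object) use Y.
public sealed class BrokenFoo : IEquatable<BrokenFoo>
{
    public int X { get; }
    public int Y { get; }
    public BrokenFoo(int x, int y) { X = x; Y = y; }
    public bool Equals(BrokenFoo other) { return other != null && X == other.X; }
    public override bool Equals(object obj) { return obj is BrokenFoo other && Y == other.Y; }
    public override int GetHashCode() { return Y; }
}

// Consistent: every equality member is based on Y only.
public sealed class ConsistentFoo : IEquatable<ConsistentFoo>
{
    public int X { get; }
    public int Y { get; }
    public ConsistentFoo(int x, int y) { X = x; Y = y; }
    public bool Equals(ConsistentFoo other) { return other != null && Y == other.Y; }
    public override bool Equals(object obj) { return Equals(obj as ConsistentFoo); }
    public override int GetHashCode() { return Y; }
}

public static class DistinctDemo
{
    public static int CountBroken()
    {
        var array = new[] { new BrokenFoo(1, 1), new BrokenFoo(1, 2), new BrokenFoo(2, 2), new BrokenFoo(3, 2) };
        return array.Distinct().Count();
    }

    public static int CountConsistent()
    {
        var array = new[] { new ConsistentFoo(1, 1), new ConsistentFoo(1, 2), new ConsistentFoo(2, 2), new ConsistentFoo(3, 2) };
        return array.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountBroken());     // 4
        Console.WriteLine(CountConsistent()); // 2
    }
}
```

In the broken case the two elements with equal x land in different hash buckets and are never compared, while the three elements sharing a hash code all differ in x, hence 4.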

Related

Which IEqualityComparer is used in a Dictionary?

Lets say I instantiate a dictionary like this
var dictionary = new Dictionary<MyClass, SomeValue>();
And MyClass is my own class that implements an IEqualityComparer<>.
Now, when I do operations on the dictionary - such as Add, Contains, TryGetValue etc - does dictionary use the default EqualityComparer<T>.Default since I never passed one into the constructor or does it use the IEqualityComparer that MyClass implements?
Thanks
It will use the default equality comparer.
If an object is capable of comparing itself for equality with other objects then it should implement IEquatable<T>, not IEqualityComparer<T>. If a type implements IEquatable<T> then that will be used as the implementation of EqualityComparer<T>.Default; otherwise the object.Equals and object.GetHashCode methods are used.
An IEqualityComparer is designed to compare other objects for equality, not itself.
It will use EqualityComparer<T>.Default if you don't specify any equality comparer explicitly.
This default equality comparer will use the methods Equals and GetHashCode of your class.
Your key class should not implement IEqualityComparer, this interface should be implemented when you want to delegate equality comparisons to a different class. When you want the class itself to handle equality comparisons, just override Equals and GetHashCode (you can also implement IEquatable<T> but this is not strictly required).
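A small sketch of the difference, assuming a hypothetical key type Tag that overrides nothing and a separate hypothetical TagNameComparer (names are assumed to be non-null). The dictionary only consults the comparer when it is passed to the constructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: no Equals/GetHashCode overrides, so the default is reference equality.
public sealed class Tag
{
    public string Name { get; }
    public Tag(string name) { Name = name; }
}

// A separate comparer object that defines equality for Tag by name.
public sealed class TagNameComparer : IEqualityComparer<Tag>
{
    public bool Equals(Tag x, Tag y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.Ordinal);
    }

    public int GetHashCode(Tag obj) { return obj.Name.GetHashCode(); }
}

public static class DictionaryComparerDemo
{
    public static bool FoundWithDefaultComparer()
    {
        // No comparer passed: EqualityComparer<Tag>.Default is used.
        var dictionary = new Dictionary<Tag, int>();
        dictionary.Add(new Tag("a"), 1);
        return dictionary.ContainsKey(new Tag("a"));
    }

    public static bool FoundWithNameComparer()
    {
        // Comparer passed explicitly: lookups go through TagNameComparer.
        var dictionary = new Dictionary<Tag, int>(new TagNameComparer());
        dictionary.Add(new Tag("a"), 1);
        return dictionary.ContainsKey(new Tag("a"));
    }

    public static void Main()
    {
        Console.WriteLine(FoundWithDefaultComparer()); // False
        Console.WriteLine(FoundWithNameComparer());    // True
    }
}
```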
If you want to use IEquatable<T> you can do so without having to create a separate class but you do need to implement GetHashCode().
It will pair up GetHashCode() and bool Equals(T other) and you don't have to use the archaic Equals signature.
// tested with Dictionary<T>
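public class Animal : IEquatable<Animal>
{
// Fields assumed by the original snippet (their declarations were omitted there).
private readonly string species;
private readonly string breed;
private readonly string color;
public Animal(string species, string breed, string color)
{
this.species = species;
this.breed = breed;
this.color = color;
}
// Hashes a subset of the fields compared in Equals, so equal animals always hash alike.
public override int GetHashCode()
{
return (species + breed).GetHashCode();
}
public bool Equals(Animal other)
{
return other != null &&
(
this.species == other.species &&
this.breed == other.breed &&
this.color == other.color
);
}
}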

What is the proper way to implement Equation functions [duplicate]

I'm having some difficulty using Linq's .Except() method when comparing two collections of a custom object.
I've derived my class from Object and implemented overrides for Equals(), GetHashCode(), and the operators == and !=. I've also created a CompareTo() method.
In my two collections, as a debugging experiment, I took the first item from each list (which is a duplicate) and compared them as follows:
itemListA[0].Equals(itemListB[0]); // true
itemListA[0] == itemListB[0]; // true
itemListA[0].CompareTo(itemListB[0]); // 0
In all three cases, the result is as I wanted. However, when I use Linq's Except() method, the duplicate items are not removed:
List<myObject> newList = itemListA.Except(itemListB).ToList();
Learning about how Linq does comparisons, I've discovered various (conflicting?) methods that say I need to inherit from IEquatable<T> or IEqualityComparer<T> etc.
I'm confused because when I inherit from, for example, IEquatable<T>, I am required to provide a new Equals() method with a different signature from what I've already overridden. Do I need to have two such methods with different signatures, or should I no longer derive my class from Object?
My object definition (simplified) looks like this:
public class MyObject : Object
{
public string Name {get; set;}
public DateTime LastUpdate {get; set;}
public int CompareTo(MyObject other)
{
// ...
}
public override bool Equals(object obj)
{
// allows some tolerance on LastUpdate
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + Name.GetHashCode();
hash = hash * 23 + LastUpdate.GetHashCode();
return hash;
}
}
// Overrides for operators
}
I noticed that when I inherit from IEquatable<T> I can do so using IEquatable<MyObject> or IEquatable<object>; the requirements for the Equals() signature change when I use one or the other. What is the recommended way?
What I am trying to accomplish:
I want to be able to use Linq (Distinct/Except) as well as the standard equality operators (== and !=) without duplicating code. The comparison should allow two objects to be considered equal if their name is identical and the LastUpdate property is within a number of seconds (user-specified) tolerance.
Edit:
Showing GetHashCode() code.
It doesn't matter whether you override object.Equals and object.GetHashCode, implement IEquatable, or provide an IEqualityComparer. All of them can work, just in slightly different ways.
1) Overriding Equals and GetHashCode from object:
This is the base case, in a sense. It will generally work, assuming you're in a position to edit the type to ensure that the implementation of the two methods are as desired. There's nothing wrong with doing just this in many cases.
2) Implementing IEquatable
The key point here is that you can (and should) implement IEquatable<YourTypeHere>. The key difference between this and #1 is that you have strong typing for the Equals method, rather than just having it use object. This is both better for convenience to the programmer (added type safety) and also means that any value types won't be boxed, so this can improve performance for custom structs. If you do this you should pretty much always do it in addition to #1, not instead of. Having the Equals method here differ in functionality from object.Equals would be...bad. Don't do that.
3) Implementing IEqualityComparer
This is entirely different from the first two. The idea here is that the object isn't getting its own hash code, or seeing if it's equal to something else. The point of this approach is that an object doesn't know how to properly get its hash or see if it's equal to something else. Perhaps it's because you don't control the code of the type (i.e. a 3rd party library) and they didn't bother to override the behavior, or perhaps they did override it but you just want your own unique definition of "equality" in this particular context.
In this case you create an entirely separate "comparer" object that takes in two different objects and informs you of whether they are equal or not, or what the hash code of one object is. When using this solution it doesn't matter what the Equals or GetHashCode methods do in the type itself, you won't use it.
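As a rough sketch of option 3 applied to LINQ: LegacyItem stands in for a type you cannot edit and LegacyItemNameComparer is a made-up comparer that treats names as equal regardless of case (names are assumed to be non-null). Except and Distinct both have overloads that accept such a comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for a class you cannot edit.
public sealed class LegacyItem
{
    public string Name { get; }
    public LegacyItem(string name) { Name = name; }
}

// External definition of equality: names match, ignoring case.
public sealed class LegacyItemNameComparer : IEqualityComparer<LegacyItem>
{
    public bool Equals(LegacyItem x, LegacyItem y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return StringComparer.OrdinalIgnoreCase.Equals(x.Name, y.Name);
    }

    // Must agree with Equals: names that compare equal produce the same hash.
    public int GetHashCode(LegacyItem obj)
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

public static class ExceptDemo
{
    public static List<string> NamesOnlyInFirst()
    {
        var first = new[] { new LegacyItem("Alpha"), new LegacyItem("beta"), new LegacyItem("Gamma") };
        var second = new[] { new LegacyItem("ALPHA"), new LegacyItem("gamma") };
        return first.Except(second, new LegacyItemNameComparer())
                    .Select(item => item.Name)
                    .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", NamesOnlyInFirst())); // beta
    }
}
```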
Note that all of this is entirely unrelated to the == operator, which is its own beast.
The basic pattern I use for equality in an object is the following. Note that only 2 methods have actual logic specific to the object. The rest is just boilerplate code that feeds into these 2 methods.
class MyObject : IEquatable<MyObject> {
public bool Equals(MyObject other) {
if (Object.ReferenceEquals(other, null)) {
return false;
}
// Actual equality logic here
}
public override int GetHashCode() {
// Actual Hashcode logic here
}
public override bool Equals(Object obj) {
return Equals(obj as MyObject);
}
public static bool operator==(MyObject left, MyObject right) {
if (Object.ReferenceEquals(left, null)) {
return Object.ReferenceEquals(right, null);
}
return left.Equals(right);
}
public static bool operator!=(MyObject left, MyObject right) {
return !(left == right);
}
}
If you follow this pattern there is really no need to provide a custom IEqualityComparer<MyObject>. EqualityComparer<MyObject>.Default will be enough, as it relies on IEquatable<MyObject> to perform equality checks.
You cannot "allow some tolerance on LastUpdate" and then use a GetHashCode() implementation that uses the strict value of LastUpdate!
Suppose the this instance has LastUpdate at 23:13:13.933, and the obj instance has 23:13:13.932. Then these two might compare equal with your tolerance idea. But if so, their hash codes must be the same number. But that will not happen unless you're extremely extremely lucky, for the DateTime.GetHashCode() should not give the same hash for these two times.
Besides, your Equals method must be a transitive relation mathematically. And "approximately equal to" cannot be made transitive. Its transitive closure is the trivial relation that identifies everything.
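The transitivity problem is easy to see with three timestamps. This is only an illustration; ApproxEqual and the one-second tolerance are made up for the example and are not taken from the question's code.

```csharp
using System;

public static class ToleranceDemo
{
    // Hypothetical "approximately equal" check with a fixed tolerance.
    public static bool ApproxEqual(DateTime a, DateTime b, TimeSpan tolerance)
    {
        return (a - b).Duration() <= tolerance;
    }

    public static void Main()
    {
        var tolerance = TimeSpan.FromSeconds(1);
        var t0 = new DateTime(2020, 1, 1, 23, 13, 13);
        var t1 = t0.AddMilliseconds(800);
        var t2 = t0.AddMilliseconds(1600);

        Console.WriteLine(ApproxEqual(t0, t1, tolerance)); // True
        Console.WriteLine(ApproxEqual(t1, t2, tolerance)); // True
        Console.WriteLine(ApproxEqual(t0, t2, tolerance)); // False: not transitive
    }
}
```

Any hash function consistent with such an Equals would have to give t0, t1 and t2 the same hash code (and, by chaining, every other timestamp too), which is why a tolerance-based comparison fits better in an explicit helper than in Equals/GetHashCode.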

Equals and GetHashCode confusion

I am trying to implement an immutable Point class where two Point instances are considered equal if they have the same Coordinates. I am using Jon Skeet's implementation of a Coordinate value type.
For comparing equality of Points I have also inherited EqualityComparer<Point> and IEquatable<Point> and I have a unit test as below:
Point.cs:
public class Point : EqualityComparer<Point>, IEquatable<Point>
{
public Coordinate Coordinate { get; private set; }
// EqualityComparer<Point>, IEquatable<Point> methods and other methods
}
PointTests.cs:
[Fact]
public void PointReferencesToSamePortalAreNotEqual()
{
var point1 = new Point(22.0, 24.0);
var point2 = new Point(22.0, 24.0);
// Value equality should return true
Assert.Equal(point1, point2);
// Reference equality should return false
Assert.False(point1 == point2);
}
Now I am really confused by the 3 interface/abstract methods that I must implement. These are:
IEquatable<Point>.Equals(Point other)
EqualityComparer<Point>.Equals(Point x, Point y)
EqualityComparer<Point>.GetHashCode(Point obj)
And since I have implemented IEquatable<Point>.Equals, according to MSDN I must also override:
Object.Equals(object obj)
Object.GetHashCode()
Now I am really confused about all the Equals and GetHashCode methods that are required to satisfy my unit test (Reference equality should return false and value equality should return true for point1 and point2).
Can anyone explain a bit further about Equals and GetHashCode?
Because Coordinate already implements GetHashCode() and Equals(Coordinate) for you, it is actually quite easy: just use the underlying implementation.
public class Point : IEquatable<Point>
{
public Coordinate Coordinate { get; private set; }
public override int GetHashCode()
{
return Coordinate.GetHashCode();
}
public override bool Equals(object obj)
{
return this.Equals(obj as Point);
}
public bool Equals(Point point)
{
if(point == null)
return false;
return this.Coordinate.Equals(point.Coordinate);
}
}
Strictly speaking, the IEquatable<Point> is unnecessary here, as all it does is save you an extra cast. It is mainly useful for structs, where it prevents the boxing of the struct into the object passed to bool Equals(object).
Equals:
Used to check if two objects are equal. There are several checks for equality (by value, by reference), and you really want to have a look at the link to see how they work, and the pitfalls when you don't know who is overriding them how.
GetHashCode:
A hash code is a numeric value that is used to insert and identify an object in a hash-based collection such as the Dictionary class, the Hashtable class, or a type derived from the DictionaryBase class. The GetHashCode method provides this hash code for algorithms that need quick checks of object equality.
Suppose you have two huge objects with heaps of objects inside, and that comparing them might take a very long time. And then you have a collection of those objects, and you need to compare them all. As the definitions say, GetHashCode returns a simple number you can compare first: if the hash codes differ, the objects cannot be equal and the expensive comparison can be skipped. (Assuming you implemented them correctly, objects that are supposed to be "equal" will always have the same hash code. Two different objects will usually, but not necessarily, have different hash codes, so matching hash codes still have to be confirmed with Equals.)
And if you want Jon Skeet's opinion on something similar, look here.
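A quick sketch of how hash codes and Equals work together, using a made-up Cell type with a deliberately weak XOR hash. Two unequal cells collide on the hash code, and a HashSet still keeps them apart because it falls back to Equals whenever the hash codes match.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with a weak hash: (1,2) and (2,1) both hash to 3.
public sealed class Cell : IEquatable<Cell>
{
    public int Row { get; }
    public int Column { get; }
    public Cell(int row, int column) { Row = row; Column = column; }

    public bool Equals(Cell other)
    {
        return other != null && Row == other.Row && Column == other.Column;
    }

    public override bool Equals(object obj) { return Equals(obj as Cell); }

    public override int GetHashCode() { return Row ^ Column; }
}

public static class HashCollisionDemo
{
    public static int CountDistinctCells()
    {
        var set = new HashSet<Cell>
        {
            new Cell(1, 2),
            new Cell(2, 1), // same hash code as (1,2), but not equal
            new Cell(1, 2)  // equal to the first entry, so it is not added again
        };
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(new Cell(1, 2).GetHashCode() == new Cell(2, 1).GetHashCode()); // True
        Console.WriteLine(new Cell(1, 2).Equals(new Cell(2, 1)));                        // False
        Console.WriteLine(CountDistinctCells());                                         // 2
    }
}
```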

Should IEquatable<T>, IComparable<T> be implemented on non-sealed classes?

Anyone have any opinions on whether or not IEquatable<T> or IComparable<T> should generally require that T is sealed (if it's a class)?
This question occurred to me since I'm writing a set of base classes intended to aid in the implementation of immutable classes. Part of the functionality which the base class is intended to provide is automatic implementation of equality comparisons (using the class's fields together with attributes which can be applied to fields to control equality comparisons). It should be pretty nice when I'm finished - I'm using expression trees to dynamically create a compiled comparison function for each T, so the comparison function should be very close to the performance of a regular equality comparison function. (I'm using an immutable dictionary keyed on System.Type and double check locking to store the generated comparison functions in a manner that's reasonably performant)
One thing that has cropped up though, is what functions to use to check equality of the member fields. My initial intention was to check if each member field's type (which I'll call X) implements IEquatable<X>. However, after some thought, I don't think this is safe to use unless X is sealed. The reason being that if X is not sealed, I can't know for sure if X is appropriately delegating equality checks to a virtual method on X, thereby allowing a subtype to override the equality comparison.
This then brings up a more general question - if a type is not sealed, should it really implement these interfaces AT ALL?? I would think not, since I would argue that the interfaces contract is to compare between two X types, not two types which may or may not be X (though they must of course be X or a subtype).
What do you guys think? Should IEquatable<T> and IComparable<T> be avoided for unsealed classes? (Also makes me wonder if there is an fxcop rule for this)
My current thought is to have my generated comparison function only use IEquatable<T> on member fields whose T is sealed, and instead to use the virtual Object.Equals(Object obj) if T is unsealed even if T implements IEquatable<T>, since the field could potentially store subtypes of T and I doubt most implementations of IEquatable<T> are designed appropriately for inheritance.
I've been thinking about this question for a bit and after a bit of consideration I agree that implementing IEquatable<T> and IComparable<T> should only be done on sealed types.
I went back and forth for a bit but then I thought of the following test. Under what circumstances should the following ever return false? IMHO, 2 objects are either equal or they are not.
public void EqualitySanityCheck<T>(T left, T right) where T : IEquatable<T> {
var equals1 = left.Equals(right);
var equals2 = ((IEquatable<T>)left).Equals(right);
Assert.AreEqual(equals1,equals2);
}
The result of IEquatable<T> on a given object should have the same behavior as Object.Equals assuming the comparer is of the equivalent type. Implementing IEquatable<T> twice in an object hierarchy allows for, and implies, there are 2 different ways of expressing equality in your system. It's easy to contrive any number of scenarios where IEquatable<T> and Object.Equals would differ since there are multiple IEquatable<T> implementations but only a single Object.Equals. Hence the above would fail and create a bit of confusion in your code.
Some people may argue that implementing IEquatable<T> at a higher point in the object hierarchy is valid because you want to compare a subset of the objects properties. In that case you should favor an IEqualityComparer<T> which is specifically designed to compare those properties.
I would generally recommend against implementing IEquatable<T> on any non-sealed class, or implementing non-generic IComparable on most, but the same cannot be said for IComparable<T>. Two reasons:
There already exists a means of comparing objects which may or may not be the same type: Object.Equals. Since IEquatable<T> does not include GetHashCode, its behavior essentially has to match that of Object.Equals. The only reason for implementing IEquatable<T> in addition to Object.Equals is performance. IEquatable<T> offers a small performance improvement versus Object.Equals when applied to sealed class types, and a big improvement when applied to structure types. The only way an unsealed type's implementation of IEquatable<T>.Equals can ensure that its behavior matches that of a possibly-overridden Object.Equals, however, is to call Object.Equals. If IEquatable<T>.Equals has to call Object.Equals, any possible performance advantage vanishes.
It is sometimes possible, meaningful, and useful, for a base class to have a defined natural ordering involving only base-class properties, which will be consistent through all subclasses. When checking two objects for equality, the result shouldn't depend upon whether one regards the objects as being a base type or a derived type. When ranking objects, however, the result should often depend upon the type being used as the basis for comparison. Derived-class objects should implement IComparable<TheirOwnType> but should not override the base type's comparison method. It is entirely reasonable for two derived-class objects to compare as "unranked" when compared as the parent type, but for one to compare above the other when compared as the derived type.
Implementation of non-generic IComparable in inheritable classes is perhaps more questionable than implementation of IComparable<T>. Probably the best thing to do is allow a base-class to implement it if it's not expected that any child class will need some other ordering, but for child classes not to reimplement or override parent-class implementations.
Most Equals implementations I've seen check the types of the objects being compared, if they aren't the same then the method returns false.
This neatly avoids the problem of a sub-type being compared against its parent type, thereby negating the need for sealing a class.
An obvious example of this would be trying to compare a 2D point (A) with a 3D point (B): for a 2D point, the x and y values of a 3D point might be equal, but for a 3D point, the z value will most likely be different.
This means that A == B would be true, but B == A would be false. Most people like the Equals operators to be commutative, so checking types is clearly a good idea in this case.
But what if you subclass and you don't add any new properties? Well, that's a bit harder to answer, and possibly depends on your situation.
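The 2D/3D case can be sketched as follows. All four classes are hypothetical: the "Loose" pair omits the exact-type check, the "Strict" pair adds it via GetType().

```csharp
using System;

// Without an exact-type check: a 2D point accepts any Point2D-derived object.
public class LoosePoint2D
{
    public int X { get; }
    public int Y { get; }
    public LoosePoint2D(int x, int y) { X = x; Y = y; }
    public override bool Equals(object obj) { return obj is LoosePoint2D p && X == p.X && Y == p.Y; }
    public override int GetHashCode() { return X ^ Y; }
}

public class LoosePoint3D : LoosePoint2D
{
    public int Z { get; }
    public LoosePoint3D(int x, int y, int z) : base(x, y) { Z = z; }
    public override bool Equals(object obj) { return obj is LoosePoint3D p && base.Equals(p) && Z == p.Z; }
    public override int GetHashCode() { return base.GetHashCode() ^ Z; }
}

// With an exact-type check: objects of different runtime types are never equal.
public class StrictPoint2D
{
    public int X { get; }
    public int Y { get; }
    public StrictPoint2D(int x, int y) { X = x; Y = y; }
    public override bool Equals(object obj)
    {
        if (obj is null || obj.GetType() != GetType()) return false;
        var p = (StrictPoint2D)obj;
        return X == p.X && Y == p.Y;
    }
    public override int GetHashCode() { return X ^ Y; }
}

public class StrictPoint3D : StrictPoint2D
{
    public int Z { get; }
    public StrictPoint3D(int x, int y, int z) : base(x, y) { Z = z; }
    public override bool Equals(object obj)
    {
        // base.Equals has already verified that obj has this exact runtime type.
        return base.Equals(obj) && Z == ((StrictPoint3D)obj).Z;
    }
    public override int GetHashCode() { return base.GetHashCode() ^ Z; }
}

public static class SymmetryDemo
{
    public static void Main()
    {
        var a = new LoosePoint2D(1, 2);
        var b = new LoosePoint3D(1, 2, 3);
        Console.WriteLine(a.Equals(b)); // True
        Console.WriteLine(b.Equals(a)); // False: asymmetric

        var c = new StrictPoint2D(1, 2);
        var d = new StrictPoint3D(1, 2, 3);
        Console.WriteLine(c.Equals(d)); // False
        Console.WriteLine(d.Equals(c)); // False: symmetric again
    }
}
```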
I have stumbled over this topic today when reading
https://blog.mischel.com/2013/01/05/inheritance-and-iequatable-do-not-mix/
and I agree, that there are reasons not to implement IEquatable<T>, because chances exist that it will be done in a wrong way.
However, after reading the linked article I tested my own implementation which I use on various non-sealed, inherited classes, and I found that it's working correctly.
When implementing IEquatable<T>, I referred to this article:
http://www.loganfranken.com/blog/687/overriding-equals-in-c-part-1/
It gives a pretty good explanation what code to use in Equals(). It does not address inheritance though, so I tuned it myself. Here's the result.
And to answer the original question:
I don't say that it should be implemented on non-sealed classes, but I say that it definitely could be implemented without problems.
//============================================================================
class CBase : IEquatable<CBase>
{
private int m_iBaseValue = 0;
//--------------------------------------------------------------------------
public CBase (int i_iBaseValue)
{
m_iBaseValue = i_iBaseValue;
}
//--------------------------------------------------------------------------
public sealed override bool Equals (object i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return Equals_EXEC ((CBase)i_value);
}
//--------------------------------------------------------------------------
public bool Equals (CBase i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return Equals_EXEC (i_value);
}
//--------------------------------------------------------------------------
protected virtual bool Equals_EXEC (CBase i_oValue)
{
return i_oValue.m_iBaseValue == m_iBaseValue;
}
}
//============================================================================
class CDerived : CBase, IEquatable<CDerived>
{
public int m_iDerivedValue = 0;
//--------------------------------------------------------------------------
public CDerived (int i_iBaseValue,
int i_iDerivedValue)
: base (i_iBaseValue)
{
m_iDerivedValue = i_iDerivedValue;
}
//--------------------------------------------------------------------------
public bool Equals (CDerived i_value)
{
if (ReferenceEquals (null, i_value))
return false;
if (ReferenceEquals (this, i_value))
return true;
if (i_value.GetType () != GetType ())
return false;
return Equals_EXEC (i_value);
}
//--------------------------------------------------------------------------
protected override bool Equals_EXEC (CBase i_oValue)
{
CDerived oValue = i_oValue as CDerived;
return base.Equals_EXEC (i_oValue)
&& oValue.m_iDerivedValue == m_iDerivedValue;
}
}
Test:
private static void Main (string[] args)
{
// Test with Foo and Fooby for verification of the problem.
// definition of Foo and Fooby copied from
// https://blog.mischel.com/2013/01/05/inheritance-and-iequatable-do-not-mix/
// and not added in this post
var fooby1 = new Fooby (0, "hello");
var fooby2 = new Fooby (0, "goodbye");
Foo foo1 = fooby1;
Foo foo2 = fooby2;
// all false, as expected
bool bEqualFooby12a = fooby1.Equals (fooby2);
bool bEqualFooby12b = fooby2.Equals (fooby1);
bool bEqualFooby12c = object.Equals (fooby1, fooby2);
bool bEqualFooby12d = object.Equals (fooby2, fooby1);
// 2 true (wrong), 2 false
bool bEqualFoo12a = foo1.Equals (foo2); // unexpectedly "true": wrong result, because "wrong" overload is called!
bool bEqualFoo12b = foo2.Equals (foo1); // unexpectedly "true": wrong result, because "wrong" overload is called!
bool bEqualFoo12c = object.Equals (foo1, foo2);
bool bEqualFoo12d = object.Equals (foo2, foo1);
// own test
CBase oB = new CBase (1);
CDerived oD1 = new CDerived (1, 2);
CDerived oD2 = new CDerived (1, 2);
CDerived oD3 = new CDerived (1, 3);
CDerived oD4 = new CDerived (2, 2);
CBase oB1 = oD1;
CBase oB2 = oD2;
CBase oB3 = oD3;
CBase oB4 = oD4;
// all false, as expected
bool bEqualBD1a = object.Equals (oB, oD1);
bool bEqualBD1b = object.Equals (oD1, oB);
bool bEqualBD1c = oB.Equals (oD1);
bool bEqualBD1d = oD1.Equals (oB);
// all true, as expected
bool bEqualD12a = object.Equals (oD1, oD2);
bool bEqualD12b = object.Equals (oD2, oD1);
bool bEqualD12c = oD1.Equals (oD2);
bool bEqualD12d = oD2.Equals (oD1);
bool bEqualB12a = object.Equals (oB1, oB2);
bool bEqualB12b = object.Equals (oB2, oB1);
bool bEqualB12c = oB1.Equals (oB2);
bool bEqualB12d = oB2.Equals (oB1);
// all false, as expected
bool bEqualD13a = object.Equals (oD1, oD3);
bool bEqualD13b = object.Equals (oD3, oD1);
bool bEqualD13c = oD1.Equals (oD3);
bool bEqualD13d = oD3.Equals (oD1);
bool bEqualB13a = object.Equals (oB1, oB3);
bool bEqualB13b = object.Equals (oB3, oB1);
bool bEqualB13c = oB1.Equals (oB3);
bool bEqualB13d = oB3.Equals (oB1);
// all false, as expected
bool bEqualD14a = object.Equals (oD1, oD4);
bool bEqualD14b = object.Equals (oD4, oD1);
bool bEqualD14c = oD1.Equals (oD4);
bool bEqualD14d = oD4.Equals (oD1);
bool bEqualB14a = object.Equals (oB1, oB4);
bool bEqualB14b = object.Equals (oB4, oB1);
bool bEqualB14c = oB1.Equals (oB4);
bool bEqualB14d = oB4.Equals (oB1);
}

What is "Best Practice" For Comparing Two Instances of a Reference Type?

I came across this recently, up until now I have been happily overriding the equality operator (==) and/or Equals method in order to see if two references types actually contained the same data (i.e. two different instances that look the same).
I have been using this even more since I have been getting more in to automated testing (comparing reference/expected data against that returned).
While looking over some of the coding standards guidelines in MSDN I came across an article that advises against it. Now I understand why the article is saying this (because they are not the same instance) but it does not answer the question:
What is the best way to compare two reference types?
Should we implement IComparable? (I have also seen mention that this should be reserved for value types only).
Is there some interface I don't know about?
Should we just roll our own?!
Many Thanks ^_^
Update
Looks like I had mis-read some of the documentation (it's been a long day) and overriding Equals may be the way to go..
If you are implementing reference types, you should consider overriding the Equals method on a reference type if your type looks like a base type such as a Point, String, BigNumber, and so on. Most reference types should not overload the equality operator, even if they override Equals. However, if you are implementing a reference type that is intended to have value semantics, such as a complex number type, you should override the equality operator.
Implementing equality in .NET correctly, efficiently and without code duplication is hard. Specifically, for reference types with value semantics (i.e. immutable types that treat equivalence as equality), you should implement the System.IEquatable<T> interface, and you should implement all the different operations (Equals, GetHashCode and ==, !=).
As an example, here’s a class implementing value equality:
class Point : IEquatable<Point> {
public int X { get; }
public int Y { get; }
public Point(int x = 0, int y = 0) { X = x; Y = y; }
public bool Equals(Point other) {
if (other is null) return false;
return X.Equals(other.X) && Y.Equals(other.Y);
}
public override bool Equals(object obj) => Equals(obj as Point);
public static bool operator ==(Point lhs, Point rhs) => object.Equals(lhs, rhs);
public static bool operator !=(Point lhs, Point rhs) => ! (lhs == rhs);
public override int GetHashCode() => X.GetHashCode() ^ Y.GetHashCode();
}
The only movable parts in the above code are the second line in Equals(Point other) and the GetHashCode() method. The other code should remain unchanged.
For reference classes that do not represent immutable values, do not implement the operators == and !=. Instead, use their default meaning, which is to compare object identity.
The code intentionally equates even objects of a derived class type. Often, this might not be desirable because equality between the base class and derived classes is not well-defined. Unfortunately, .NET and the coding guidelines are not very clear here. The code that Resharper creates, posted in another answer, is susceptible to undesired behaviour in such cases because Equals(object x) and Equals(SecurableResourcePermission x) will treat this case differently.
In order to change this behaviour, an additional type check has to be inserted in the strongly-typed Equals method above:
public bool Equals(Point other) {
if (other is null) return false;
if (other.GetType() != GetType()) return false;
return X.Equals(other.X) && Y.Equals(other.Y);
}
It looks like you're coding in C#, which has a method called Equals that your class should implement, should you want to compare two objects using some other metric than "are these two pointers (because object handles are just that, pointers) to the same memory address?".
I grabbed some sample code from here:
class TwoDPoint : System.Object
{
public readonly int x, y;
public TwoDPoint(int x, int y) //constructor
{
this.x = x;
this.y = y;
}
public override bool Equals(System.Object obj)
{
// If parameter is null return false.
if (obj == null)
{
return false;
}
// If parameter cannot be cast to Point return false.
TwoDPoint p = obj as TwoDPoint;
if ((System.Object)p == null)
{
return false;
}
// Return true if the fields match:
return (x == p.x) && (y == p.y);
}
public bool Equals(TwoDPoint p)
{
// If parameter is null return false:
if ((object)p == null)
{
return false;
}
// Return true if the fields match:
return (x == p.x) && (y == p.y);
}
public override int GetHashCode()
{
return x ^ y;
}
}
Java has very similar mechanisms. The equals() method is part of the Object class, and your class overrides it if you want this type of functionality.
The reason overloading '==' can be a bad idea for objects is that, usually, you still want to be able to do the "are these the same pointer" comparisons. These are usually relied upon for, for instance, inserting an element into a list where no duplicates are allowed, and some of your framework stuff may not work if this operator is overloaded in a non-standard way.
Below I have summed up what you need to do when implementing IEquatable and provided the justification from the various MSDN documentation pages.
Summary
When testing for value equality is desired (such as when using objects in collections) you should implement the IEquatable interface, override Object.Equals, and GetHashCode for your class.
When testing for reference equality is desired you should use operator==, operator!= and Object.ReferenceEquals.
You should only override operator== and operator!= for ValueTypes and immutable reference types.
Justification
IEquatable
The System.IEquatable interface is used to compare two instances of an object for equality. The objects are compared based on the logic implemented in the class. The comparison results in a boolean value indicating whether the objects are equal. This is in contrast to the System.IComparable interface, which returns an integer indicating how the object values are ordered relative to each other.
The IEquatable interface declares a single method, Equals, which contains the implementation to perform the actual comparison and returns true if the object values are equal, or false if they are not. GetHashCode is not part of the interface; it is inherited from Object and should be overridden alongside Equals. Its hash values need not be unique, but objects that compare equal must return the same hash code. The hashing algorithm used is implementation-specific.
IEquatable.Equals Method
You should implement IEquatable for your objects to handle the possibility that they will be stored in an array or generic collection.
If you implement IEquatable you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behavior is consistent with that of the IEquatable.Equals method
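As a rough illustration of why that consistency matters, here is the Foo class from the question with all three members agreeing on _y. The sealed modifier and the expression-bodied members are my own choices, and the comment about hashing describes how the LINQ implementations I know of behave, not a documented guarantee.

```csharp
using System;
using System.Linq;

public sealed class Foo : IEquatable<Foo>
{
    private readonly int _x;
    private readonly int _y;

    public Foo(int x, int y) { _x = x; _y = y; }

    // All three members agree: equality is decided by _y alone.
    public bool Equals(Foo other) => other != null && _y == other._y;
    public override bool Equals(object obj) => Equals(obj as Foo);
    public override int GetHashCode() => _y;
}

public static class Program
{
    public static void Main()
    {
        var array = new[] { new Foo(1, 1), new Foo(1, 2), new Foo(2, 2), new Foo(3, 2) };

        // Distinct() buckets elements by GetHashCode and only calls Equals(Foo)
        // on elements whose hash codes match, so the two must not disagree.
        Console.WriteLine(array.Distinct().Count()); // 2
    }
}
```

If Equals(Foo) compared _x while GetHashCode still returned _y, elements that are "equal" by _x would usually land in different hash buckets and never be compared at all, which is how the question ends up with 4.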
Guidelines for Overriding Equals() and Operator == (C# Programming Guide)
x.Equals(x) returns true.
x.Equals(y) returns the same value as y.Equals(x)
if (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
Successive invocations of x.Equals(y) return the same value as long as the objects referenced by x and y are not modified.
x.Equals(null) returns false (for non-nullable value types only. For more information, see Nullable Types (C# Programming Guide).)
The new implementation of Equals should not throw exceptions.
It is recommended that any class that overrides Equals also override Object.GetHashCode.
It is recommended that in addition to implementing Equals(object), any class also implement Equals(type) for their own type, to enhance performance.
By default, the operator == tests for reference equality by determining whether two references indicate the same object. Therefore, reference types do not have to implement operator == in order to gain this functionality. When a type is immutable, that is, the data that is contained in the instance cannot be changed, overloading operator == to compare value equality instead of reference equality can be useful because, as immutable objects, they can be considered the same as long as they have the same value. It is not a good idea to override operator == in non-immutable types.
Overloaded operator == implementations should not throw exceptions.
Any type that overloads operator == should also overload operator !=.
== Operator (C# Reference)
For predefined value types, the equality operator (==) returns true if the values of its operands are equal, false otherwise.
For reference types other than string, == returns true if its two operands refer to the same object.
For the string type, == compares the values of the strings.
When testing for null using == comparisons within your operator== overloads, make sure you use the base object class operator (for example by casting the operand to object). If you don't, the overload calls itself and the infinite recursion ends in a stack overflow.
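A small sketch of that null-check pitfall. The Money type is hypothetical and exists only to carry the operators; the cast to object is what forces the built-in reference comparison.

```csharp
using System;

// Hypothetical immutable type, used only to show the operator pattern.
public sealed class Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public Money(decimal amount) { Amount = amount; }

    public bool Equals(Money other) => (object)other != null && Amount == other.Amount;
    public override bool Equals(object obj) => Equals(obj as Money);
    public override int GetHashCode() => Amount.GetHashCode();

    public static bool operator ==(Money left, Money right)
    {
        // Writing "left == null" here would call this same operator again
        // and recurse until the stack overflows. Casting to object selects
        // the built-in reference comparison instead.
        if ((object)left == null) return (object)right == null;
        return left.Equals(right);
    }

    public static bool operator !=(Money left, Money right) => !(left == right);
}

public static class Program
{
    public static void Main()
    {
        Money a = new Money(5m), b = new Money(5m), none = null;

        Console.WriteLine(a == b);       // True
        Console.WriteLine(a == none);    // False
        Console.WriteLine(none == null); // True
    }
}
```

Object.ReferenceEquals(left, null) does the same job as the cast and is arguably harder to misread.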
Object.Equals Method (Object)
If your programming language supports operator overloading and if you choose to overload the equality operator for a given type, that type must override the Equals method. Such implementations of the Equals method must return the same results as the equality operator
The following guidelines are for implementing a value type:
Consider overriding Equals to gain increased performance over that provided by the default implementation of Equals on ValueType.
If you override Equals and the language supports operator overloading, you must overload the equality operator for your value type.
The following guidelines are for implementing a reference type:
Consider overriding Equals on a reference type if the semantics of the type are based on the fact that the type represents some value(s).
Most reference types must not overload the equality operator, even if they override Equals. However, if you are implementing a reference type that is intended to have value semantics, such as a complex number type, you must override the equality operator.
Additional Gotchas
When overriding GetHashCode() make sure you test reference types for NULL before using them in the hash code.
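For instance, something along these lines. Person and its members are made up for the example, and 397 is just the multiplier ReSharper tends to generate, not a required constant.

```csharp
using System;

// Hypothetical type whose Name may legitimately be null.
public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age) { Name = name; Age = age; }

    public bool Equals(Person other) =>
        !ReferenceEquals(other, null) && Name == other.Name && Age == other.Age;

    public override bool Equals(object obj) => Equals(obj as Person);

    public override int GetHashCode()
    {
        unchecked
        {
            // Name.GetHashCode() would throw NullReferenceException for a
            // null Name, so fall back to 0 in that case.
            int hash = Name != null ? Name.GetHashCode() : 0;
            return (hash * 397) ^ Age;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Person(null, 30);
        var b = new Person(null, 30);

        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```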
I ran into a problem with interface-based programming and operator overloading described here: Operator Overloading with Interface-Based Programming in C#
That article just recommends against overriding the equality operator (for reference types), not against overriding Equals. You should override Equals within your object (reference or value) if equality checks will mean something more than reference checks. If you want an interface, you can also implement IEquatable (used by generic collections). If you do implement IEquatable, however, you should also override equals, as the IEquatable remarks section states:
If you implement IEquatable<T>, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behavior is consistent with that of the IEquatable<T>.Equals method. If you do override Object.Equals(Object), your overridden implementation is also called in calls to the static Equals(System.Object, System.Object) method on your class. This ensures that all invocations of the Equals method return consistent results.
In regards to whether you should implement Equals and/or the equality operator:
From Implementing the Equals Method
Most reference types should not overload the equality operator, even if they override Equals.
From Guidelines for Implementing Equals and the Equality Operator (==)
Override the Equals method whenever you implement the equality operator (==), and make them do the same thing.
This only says that you need to override Equals whenever you implement the equality operator. It does not say that you need to override the equality operator when you override Equals.
For complex objects that need specific comparisons, implementing IComparable and defining the comparison in the CompareTo method is a good approach.
For example we have "Vehicle" objects where the only difference may be the registration number and we use this to compare to ensure that the expected value returned in testing is the one we want.
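Roughly like this. The shape of Vehicle and the choice of an ordinal string comparison are my assumptions; the real class would carry more than the registration number.

```csharp
using System;

// Hypothetical Vehicle that is compared by registration number only.
public sealed class Vehicle : IComparable<Vehicle>
{
    public string RegistrationNumber { get; }

    public Vehicle(string registrationNumber) { RegistrationNumber = registrationNumber; }

    // Negative, zero or positive depending on relative order.
    // By convention any instance sorts after null.
    public int CompareTo(Vehicle other)
    {
        if (other is null) return 1;
        return string.CompareOrdinal(RegistrationNumber, other.RegistrationNumber);
    }
}

public static class Program
{
    public static void Main()
    {
        var expected = new Vehicle("AB12 CDE");
        var actual = new Vehicle("AB12 CDE");

        Console.WriteLine(expected.CompareTo(actual) == 0);                // True
        Console.WriteLine(expected.CompareTo(new Vehicle("ZZ99 ZZZ")) < 0); // True
    }
}
```

Note that CompareTo returning 0 is not picked up by Equals, Distinct or dictionary lookups, so if you also need value equality you still have to provide Equals and GetHashCode.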
I tend to use what ReSharper generates automatically. For example, it created this for one of my reference types:
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
return obj.GetType() == typeof(SecurableResourcePermission) && Equals((SecurableResourcePermission)obj);
}
public bool Equals(SecurableResourcePermission obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
return obj.ResourceUid == ResourceUid && Equals(obj.ActionCode, ActionCode) && Equals(obj.AllowDeny, AllowDeny);
}
public override int GetHashCode()
{
unchecked
{
int result = (int)ResourceUid;
result = (result * 397) ^ (ActionCode != null ? ActionCode.GetHashCode() : 0);
result = (result * 397) ^ AllowDeny.GetHashCode();
return result;
}
}
If you want to override == and still do ref checks, you can still use Object.ReferenceEquals.
Microsoft appears to have changed their tune, or at least there is conflicting info about not overloading the equality operator. According to this Microsoft article titled How to: Define Value Equality for a Type:
"The == and != operators can be used with classes even if the class does not overload them. However, the default behavior is to perform a reference equality check. In a class, if you overload the Equals method, you should overload the == and != operators, but it is not required."
According to Eric Lippert in his answer to a question I asked about Minimal code for equality in C# - he says:
"The danger you run into here is that you get an == operator defined for you that does reference equality by default. You could easily end up in a situation where an overloaded Equals method does value equality and == does reference equality, and then you accidentally use reference equality on not-reference-equal things that are value-equal. This is an error-prone practice that is hard to spot by human code review.
A couple years ago I worked on a static analysis algorithm to statistically detect this situation, and we found a defect rate of about two instances per million lines of code across all codebases we studied. When considering just codebases which had somewhere overridden Equals, the defect rate was obviously considerably higher!
Moreover, consider the costs vs the risks. If you already have implementations of IComparable then writing all the operators is trivial one-liners that will not have bugs and will never be changed. It's the cheapest code you're ever going to write. If given the choice between the fixed cost of writing and testing a dozen tiny methods vs the unbounded cost of finding and fixing a hard-to-see bug where reference equality is used instead of value equality, I know which one I would pick."
The .NET Framework will not ever use == or != with any type that you write. But, the danger is what would happen if someone else does. So, if the class is for a 3rd party, then I would always provide the == and != operators. If the class is only intended to be used internally by the group, I would still probably implement the == and != operators.
I would only implement the <, <=, >, and >= operators if IComparable was implemented. IComparable should only be implemented if the type needs to support ordering - like when sorting or being used in an ordered generic container like SortedSet.
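A sketch of how cheap those operators are once CompareTo exists. Priority is a hypothetical type and the private Compare helper is just one way to make the operators null-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical orderable type.
public sealed class Priority : IComparable<Priority>
{
    public int Level { get; }
    public Priority(int level) { Level = level; }

    public int CompareTo(Priority other) => other is null ? 1 : Level.CompareTo(other.Level);

    // Null-safe helper: null sorts before any instance.
    private static int Compare(Priority a, Priority b)
    {
        if (a is null) return b is null ? 0 : -1;
        return a.CompareTo(b);
    }

    // Each operator is a one-liner on top of the helper.
    public static bool operator <(Priority a, Priority b) => Compare(a, b) < 0;
    public static bool operator >(Priority a, Priority b) => Compare(a, b) > 0;
    public static bool operator <=(Priority a, Priority b) => Compare(a, b) <= 0;
    public static bool operator >=(Priority a, Priority b) => Compare(a, b) >= 0;
}

public static class Program
{
    public static void Main()
    {
        var low = new Priority(1);
        var high = new Priority(5);

        Console.WriteLine(low < high);  // True
        Console.WriteLine(low >= high); // False

        // SortedSet picks up IComparable<Priority> through the default comparer.
        var set = new SortedSet<Priority> { high, low };
        Console.WriteLine(set.Min.Level); // 1
    }
}
```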
If the group or company had a policy in place to not ever implement the == and != operators - then I would of course follow that policy. If such a policy were in place, then it would be wise to enforce it with a Q/A code analysis tool that flags any occurrence of the == and != operators when used with a reference type.
I believe getting something as simple as checking objects for equality correct is a bit tricky with .NET's design.
For Struct
1) Implement IEquatable<T>. It improves performance noticeably.
2) Since you're having your own Equals now, override GetHashCode, and to be consistent with various equality checking override object.Equals as well.
3) Overloading the == and != operators need not be done religiously, since the compiler will reject the code if you unintentionally compare one struct with another using == or != without them, but it's good to do so to be consistent with the Equals methods.
public struct Entity : IEquatable<Entity>
{
public bool Equals(Entity other)
{
throw new NotImplementedException("Your equality check here...");
}
public override bool Equals(object obj)
{
if (obj == null || !(obj is Entity))
return false;
return Equals((Entity)obj);
}
public static bool operator ==(Entity e1, Entity e2)
{
return e1.Equals(e2);
}
public static bool operator !=(Entity e1, Entity e2)
{
return !(e1 == e2);
}
public override int GetHashCode()
{
throw new NotImplementedException("Your lightweight hashing algorithm, consistent with Equals method, here...");
}
}
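Here is one way the struct template could look once filled in. The Id and Name fields are placeholders I made up, and the hash mixing is just a common pattern, not the only valid one.

```csharp
using System;
using System.Collections.Generic;

public struct Entity : IEquatable<Entity>
{
    // Placeholder fields for illustration.
    public readonly int Id;
    public readonly string Name;

    public Entity(int id, string name) { Id = id; Name = name; }

    // Typed Equals: no boxing, no casting.
    public bool Equals(Entity other) => Id == other.Id && Name == other.Name;

    // The "is" test already rejects null, so one check is enough.
    public override bool Equals(object obj) => obj is Entity && Equals((Entity)obj);

    public static bool operator ==(Entity e1, Entity e2) => e1.Equals(e2);
    public static bool operator !=(Entity e1, Entity e2) => !(e1 == e2);

    public override int GetHashCode()
    {
        unchecked
        {
            return (Id * 397) ^ (Name != null ? Name.GetHashCode() : 0);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Entity(1, "a");
        var b = new Entity(1, "a");

        Console.WriteLine(a == b);                              // True
        Console.WriteLine(a.Equals((object)b));                 // True
        Console.WriteLine(new HashSet<Entity> { a, b }.Count);  // 1
    }
}
```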
For Class
From MS:
Most reference types should not overload the equality operator, even if they override Equals.
To me == feels like value equality, more like syntactic sugar for the Equals method. Writing a == b is much more intuitive than writing a.Equals(b). We rarely need to check reference equality; at abstract levels, dealing with logical representations of physical objects, it is not something we would need to check. I think having different semantics for == and Equals can actually be confusing. I believe it should have been == for value equality and Equals (or a better name like IsSameAs) for reference equality in the first place. I would love to not take the MS guideline seriously here, not just because it isn't natural to me, but also because overloading == doesn't do any major harm: the framework never uses == itself, so the overload only matters when our own code uses it. That's unlike failing to override the non-generic Equals or GetHashCode, which can bite back. The only real benefit I gain from not overloading == and != is consistency with the design of the entire framework, over which I have no control. And that's indeed a big thing, so sadly I will stick to it.
With reference semantics (mutable objects)
1) Override Equals and GetHashCode.
2) Implementing IEquatable<T> isn't a must, but will be nice if you have one.
public class Entity : IEquatable<Entity>
{
public bool Equals(Entity other)
{
if (ReferenceEquals(this, other))
return true;
if (ReferenceEquals(null, other))
return false;
//if your below implementation will involve objects of derived classes, then do a
//GetType == other.GetType comparison
throw new NotImplementedException("Your equality check here...");
}
public override bool Equals(object obj)
{
return Equals(obj as Entity);
}
public override int GetHashCode()
{
throw new NotImplementedException("Your lightweight hashing algorithm, consistent with Equals method, here...");
}
}
With value semantics (immutable objects)
This is the tricky part. It can easily get messed up if you are not careful.
1) Override Equals and GetHashCode.
2) Overload == and != to match Equals. Make sure it works for nulls.
3) Implementing IEquatable<T> isn't a must, but will be nice if you have one.
public class Entity : IEquatable<Entity>
{
public bool Equals(Entity other)
{
if (ReferenceEquals(this, other))
return true;
if (ReferenceEquals(null, other))
return false;
//if your below implementation will involve objects of derived classes, then do a
//GetType == other.GetType comparison
throw new NotImplementedException("Your equality check here...");
}
public override bool Equals(object obj)
{
return Equals(obj as Entity);
}
public static bool operator ==(Entity e1, Entity e2)
{
if (ReferenceEquals(e1, null))
return ReferenceEquals(e2, null);
return e1.Equals(e2);
}
public static bool operator !=(Entity e1, Entity e2)
{
return !(e1 == e2);
}
public override int GetHashCode()
{
throw new NotImplementedException("Your lightweight hashing algorithm, consistent with Equals method, here...");
}
}
Take special care over how this should behave if your class can be inherited; in such cases you will have to determine whether a base class object can be equal to a derived class object. Ideally, if no objects of a derived class are used for equality checking, then a base class instance can be equal to a derived class instance, and in such cases there is no need to check Type equality in the generic Equals of the base class.
In general take care not to duplicate code. I could have made a generic abstract base class (IEqualizable<T> or so) as a template to allow re-use easier, but sadly in C# that stops me from deriving from additional classes.
None of the answers above considers polymorphism. Often you want derived references to use the derived Equals even when compared via a base reference. Please see the question, discussion and answers here: Equality and polymorphism
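One possible way to get that behaviour is to make the typed Equals virtual, sketched below. Animal and Dog are hypothetical, and this is only one approach; I am not claiming it is the one the linked discussion settles on.

```csharp
using System;

// Hypothetical base type with a virtual typed Equals.
public class Animal : IEquatable<Animal>
{
    public string Name { get; }
    public Animal(string name) { Name = name; }

    public virtual bool Equals(Animal other) =>
        !(other is null) && other.GetType() == GetType() && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as Animal);
    public override int GetHashCode() => Name != null ? Name.GetHashCode() : 0;
}

public class Dog : Animal
{
    public string Breed { get; }
    public Dog(string name, string breed) : base(name) { Breed = breed; }

    // base.Equals has already verified that the runtime types match,
    // so the cast to Dog is safe once it returns true.
    public override bool Equals(Animal other) =>
        base.Equals(other) && Breed == ((Dog)other).Breed;

    public override int GetHashCode() =>
        unchecked((base.GetHashCode() * 397) ^ (Breed?.GetHashCode() ?? 0));
}

public static class Program
{
    public static void Main()
    {
        // Both variables are typed as the base class.
        Animal a = new Dog("Rex", "Labrador");
        Animal b = new Dog("Rex", "Poodle");
        Animal c = new Dog("Rex", "Labrador");

        Console.WriteLine(a.Equals(b)); // False: Dog.Equals compares Breed
        Console.WriteLine(a.Equals(c)); // True
    }
}
```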
