Why Implement the IEquatable<T> Interface - c#

I have been reading articles and understand interfaces to an extent. However, if I wanted to write my own custom Equals method, it seems I can do this without implementing the IEquatable<T> interface. An example:
using System;
using System.Collections;
using System.ComponentModel;

namespace ProviderJSONConverter.Data.Components
{
    public class Address : IEquatable<Address>
    {
        public string address { get; set; }
        [DefaultValue("")]
        public string address_2 { get; set; }
        public string city { get; set; }
        public string state { get; set; }
        public string zip { get; set; }

        public bool Equals(Address other)
        {
            if (Object.ReferenceEquals(other, null)) return false;
            if (Object.ReferenceEquals(this, other)) return true;
            return (this.address.Equals(other.address)
                && this.address_2.Equals(other.address_2)
                && this.city.Equals(other.city)
                && this.state.Equals(other.state)
                && this.zip.Equals(other.zip));
        }
    }
}
Now if I don't implement the interface and leave : IEquatable<Address> out of the code, it seems the application operates exactly the same. Therefore, I am unclear as to why I should implement the interface. I can write my own custom Equals method without it, and the breakpoint will still hit the method and give back the same results.
Can anyone help explain this to me more? I am hung up on why I should include "IEquatable<Address>" in the class declaration when the Equals method gets called either way.

"Now if I don't implement the interface and leave : IEquatable<Address> out of the code, it seems the application operates exactly the same."
Well, that depends on what "the application" does. For example:
List<Address> addresses = new List<Address>
{
    new Address { ... }
};
int index = addresses.IndexOf(new Address { ... });
... that won't work (i.e. index will be -1) if you have neither overridden Equals(object) nor implemented IEquatable<T>. List<T>.IndexOf won't call your Equals overload.
Code that knows about your specific class will pick up the Equals overload - but any code (e.g. generic collections, all of LINQ to Objects etc) which just works with arbitrary objects won't pick it up.
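To make the difference concrete, here is a minimal sketch. PlainAddress, EquatableAddress and IndexOfDemo are hypothetical names invented for this illustration (they are not from the question), and each class is reduced to a single property to keep it short:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: has a typed Equals overload but does NOT implement IEquatable<T>.
public class PlainAddress
{
    public string City { get; set; }

    public bool Equals(PlainAddress other)
    {
        return !ReferenceEquals(other, null) && City == other.City;
    }
}

// Hypothetical type: same comparison logic, but advertised through IEquatable<T>.
public class EquatableAddress : IEquatable<EquatableAddress>
{
    public string City { get; set; }

    public bool Equals(EquatableAddress other)
    {
        return !ReferenceEquals(other, null) && City == other.City;
    }

    // Keep Equals(object) and GetHashCode consistent with the typed overload.
    public override bool Equals(object obj) => Equals(obj as EquatableAddress);

    public override int GetHashCode() => City == null ? 0 : City.GetHashCode();
}

public static class IndexOfDemo
{
    public static int FindPlain()
    {
        var list = new List<PlainAddress> { new PlainAddress { City = "Oslo" } };
        // The default comparer falls back to Object.Equals(object), i.e. reference
        // equality, so the typed overload above is never called.
        return list.IndexOf(new PlainAddress { City = "Oslo" }); // -1
    }

    public static int FindEquatable()
    {
        var list = new List<EquatableAddress> { new EquatableAddress { City = "Oslo" } };
        // The default comparer detects IEquatable<EquatableAddress> and uses it.
        return list.IndexOf(new EquatableAddress { City = "Oslo" }); // 0
    }
}
```

Code that is compiled against PlainAddress directly would still bind to the typed overload, which is why the asker's breakpoint is hit; only the generic code path misses it.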

The .NET framework has confusingly many possibilities for equality checking:
1. The virtual Object.Equals(object)
2. The overloadable equality and comparison operators (==, !=, <, >, <=, >=)
3. IEquatable<T>.Equals(T)
4. IComparable.CompareTo(object)
5. IComparable<T>.CompareTo(T)
6. IEqualityComparer.Equals(object, object)
7. IEqualityComparer<T>.Equals(T, T)
8. IComparer.Compare(object, object)
9. IComparer<T>.Compare(T, T)
And I did not mention the ReferenceEquals, the static Object.Equals(object, object) and the special cases (eg. string and floating-point comparison), just the cases where we can implement something.
Additionally, the default behavior of the first two points is different for structs and classes. So it is no wonder that a user can be confused about what to implement and how.
As a rule of thumb, you can follow this pattern:
Classes
By default, both the Equals(object) method and equality operators (==, !=) check reference equality.
If reference equality is not right for you, override the Equals method (and also GetHashCode; otherwise, your class will not be able to be used in hashed collections)
You can keep the original reference equality functionality for the == and != operators; this is common for classes. But if you overload them, they must be consistent with Equals.
If your instances can be ordered relative to each other (less than or greater than), implement the IComparable interface. When Equals reports equality, CompareTo must return 0 (again, consistency).
Basically that's it. Implementing the generic IEquatable<T> and IComparable<T> interfaces for classes is not a must: as there is no boxing, the performance gain would be minimal in the generic collections. But remember, if you implement them, keep the consistency.
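A minimal sketch of the class pattern described above, assuming a hypothetical Grade class whose only state is a Score (both names are made up for this example):

```csharp
using System;

// Hypothetical class, used only to illustrate the pattern.
public sealed class Grade : IComparable
{
    public int Score { get; }

    public Grade(int score) { Score = score; }

    // Value equality instead of the default reference equality.
    public override bool Equals(object obj)
    {
        return obj is Grade other && Score == other.Score;
    }

    // Must agree with Equals: equal objects produce equal hash codes.
    public override int GetHashCode() => Score.GetHashCode();

    // Consistent with Equals: returns 0 exactly when Equals returns true.
    public int CompareTo(object obj)
    {
        if (obj == null) return 1; // by convention, any instance sorts after null
        if (!(obj is Grade other))
            throw new ArgumentException("Object is not a Grade.", nameof(obj));
        return Score.CompareTo(other.Score);
    }
}
```

The == and != operators are deliberately left alone here, so they still compare references, as the list above allows.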
Structs
By default, the Equals(object) performs a value comparison for structs (checks the field values). Though normally this is the expected behavior in case of a value type, the base implementation does this by using reflection, which has a terrible performance. So do always override the Equals(object) in a public struct, even if you implement the same functionality as it originally had.
When the Equals(object) method is used for structs, boxing occurs, which has a performance cost (not as bad as the reflection in ValueType.Equals, but it matters). That's why the IEquatable<T> interface exists. You should implement it on structs if you want to use them in generic collections. Have I already mentioned keeping consistency?
By default, the == and != operators cannot be used for structs so you must overload them if you want to use them. Simply call the strongly-typed IEquatable<T>.Equals(T) implementation.
Similarly to classes, if less-or-greater is meaningful for your type, implement the IComparable interface. In case of structs, you should implement the IComparable<T> as well to make things performant (eg. Array.Sort, List<T>.BinarySearch, using the type as a key in a SortedList<TKey, TValue>, etc.). If you overloaded the ==, != operators, you should do it for <, >, <=, >=, too.
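The struct checklist above could be sketched like this. Millimeters is a hypothetical value type invented for the example; a real type would have its own fields and comparison rules:

```csharp
using System;

// Hypothetical value type, used only to illustrate the pattern.
public struct Millimeters : IEquatable<Millimeters>, IComparable<Millimeters>, IComparable
{
    public int Value { get; }

    public Millimeters(int value) { Value = value; }

    // Strongly typed: the argument is not boxed.
    public bool Equals(Millimeters other) => Value == other.Value;

    // Replaces the default ValueType.Equals and defers to the typed overload.
    public override bool Equals(object obj) => obj is Millimeters other && Equals(other);

    public override int GetHashCode() => Value;

    // Strongly typed ordering, consistent with Equals.
    public int CompareTo(Millimeters other) => Value.CompareTo(other.Value);

    public int CompareTo(object obj)
    {
        if (obj == null) return 1; // by convention, any value sorts after null
        if (!(obj is Millimeters other))
            throw new ArgumentException("Object is not a Millimeters value.", nameof(obj));
        return CompareTo(other);
    }

    // The operators simply forward to the strongly typed methods.
    public static bool operator ==(Millimeters left, Millimeters right) => left.Equals(right);
    public static bool operator !=(Millimeters left, Millimeters right) => !left.Equals(right);
    public static bool operator <(Millimeters left, Millimeters right) => left.CompareTo(right) < 0;
    public static bool operator >(Millimeters left, Millimeters right) => left.CompareTo(right) > 0;
    public static bool operator <=(Millimeters left, Millimeters right) => left.CompareTo(right) <= 0;
    public static bool operator >=(Millimeters left, Millimeters right) => left.CompareTo(right) >= 0;
}
```

Every entry point ends up in either Equals(Millimeters) or CompareTo(Millimeters), which is one simple way to keep all of them consistent.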
A little addendum:
If you must use a type whose comparison logic does not fit your needs, you can use the interfaces from 6. to 9. in the list (the IEqualityComparer and IComparer variants). This is where you can forget consistency (at least with respect to the type's own Equals) and implement a custom comparison that can be used in hash-based and sorted collections.
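As a rough illustration of that addendum, here is a sketch of an external comparer. Product, SkuComparer and SkuComparerDemo are hypothetical names; the point is only that the comparison rule lives outside the type and is handed to the collection:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that keeps the default reference equality.
public sealed class Product
{
    public string Sku { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two products are "equal" if their SKUs match, ignoring case.
public sealed class SkuComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Sku, y.Sku, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Product obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));
        // Must use the same rule as Equals, hence the case-insensitive hash.
        return obj.Sku == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Sku);
    }
}

public static class SkuComparerDemo
{
    public static int CountDistinct()
    {
        var set = new HashSet<Product>(new SkuComparer());
        set.Add(new Product { Sku = "ab-1", Name = "Widget" });
        set.Add(new Product { Sku = "AB-1", Name = "Widget (duplicate)" });
        return set.Count; // 1: the second Add is rejected as a duplicate
    }
}
```

Dictionary<TKey, TValue> accepts an IEqualityComparer<TKey> in its constructor in the same way.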

If you had overridden the Equals(object obj) method, then it would only be a matter of performance, as noted here: What's the difference between IEquatable and just overriding Object.Equals()?
But since you didn't override Equals(object obj) and only provided your own strongly typed Equals(Address obj) method, leaving out IEquatable<T> means that classes which rely on this interface to perform comparisons have no way of knowing that you have your own Equals method that should be used.
So, as Jon Skeet noted, the EqualityComparer<Address>.Default property used by List<Address>.IndexOf to compare addresses wouldn't be able to know it should use your Equals method.

The IEquatable<T> interface just adds an Equals method that takes whatever type we supply as the generic parameter. Overload resolution then takes care of the rest.
If we add IEquatable<Employee> to an Employee structure, it can be compared with another Employee without any type casting. We could achieve the same result with the default Equals method, which accepts an Object as its parameter,
but passing a struct as an Object involves boxing (and converting it back involves unboxing). Hence having IEquatable<Employee> will improve performance.
For example, assume we want to compare one Employee structure with another:
if (e1.Equals(e2))
{
    // do something
}
The example above will use the Equals overload that takes an Employee as its parameter, so no boxing or unboxing is required.
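struct Employee : IEquatable<Employee>
{
    public int Id { get; set; }

    public bool Equals(Employee other)
    {
        // no boxing or unboxing, direct compare
        return this.Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        if (obj is Employee)
        {
            // unboxing, then defer to the strongly typed overload
            return Equals((Employee)obj);
        }
        // anything that is not an Employee is never equal
        return false;
    }

    // keep GetHashCode consistent with Equals
    public override int GetHashCode()
    {
        return Id;
    }
}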
Some more examples:
The Int32 (int) structure implements IEquatable<int>
The Boolean (bool) structure implements IEquatable<bool>
The Single (float) structure implements IEquatable<float>
So if you call someInt.Equals(1), it doesn't call the Equals(object) method; it calls the Equals(int) method.

Related

How Does FirstOrDefault Test for Equality?

I have a reference type that implements the IEquatable<T> interface. I have a HashSet that contains a single object. I then create an object that, by IEquatable's standards, is exactly the same. But when I run
var equivalentEntry = _riskControlATMEntries[grouping.Key].FirstOrDefault(e => e == atmEntry);
on the object I get null.
On the other hand, when I do
var equivalentEntry = _riskControlATMEntries[grouping.Key].FirstOrDefault(e => e.Equals(atmEntry));
I get the object that is considered equal based on the IEquatable interface's implementation.
So why does a HashSet rely on public bool Equals(ReferenceType other) but FirstOrDefault does not? What equality is the == operator in FirstOrDefault(e => e == other) looking for?
FirstOrDefault doesn't compare items for equality at all. You provided a filtering delegate that uses the == operator to compare the two objects in one case and used the Equals method in the other.
The == operator does whatever the compile-time type of its operands defines it to do or, if that type doesn't define it, whatever the closest base type that does define it says. object is the base type that is always there, and it always has a definition if nothing better was defined: it compares objects by reference. Good design says that you should make sure the == operator for a class behaves exactly the same as the Equals method, but nothing in the language forces you to do this. Apparently this class doesn't ensure they're the same, and it's unsurprisingly causing you problems.
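A small sketch of the two situations, using two hypothetical classes (Entry and EntryWithOperators are made-up names, not the asker's types). The first only implements IEquatable<T>; the second also overloads == and != so that they agree with Equals:

```csharp
using System;

// Hypothetical class: value-based Equals, but == is still reference equality.
public sealed class Entry : IEquatable<Entry>
{
    public int Id { get; }
    public Entry(int id) { Id = id; }

    public bool Equals(Entry other) => !ReferenceEquals(other, null) && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as Entry);
    public override int GetHashCode() => Id;
}

// Hypothetical class: == and != are overloaded to match Equals.
public sealed class EntryWithOperators : IEquatable<EntryWithOperators>
{
    public int Id { get; }
    public EntryWithOperators(int id) { Id = id; }

    public bool Equals(EntryWithOperators other) => !ReferenceEquals(other, null) && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as EntryWithOperators);
    public override int GetHashCode() => Id;

    public static bool operator ==(EntryWithOperators left, EntryWithOperators right)
    {
        if (ReferenceEquals(left, right)) return true;  // same instance, or both null
        if (ReferenceEquals(left, null)) return false;  // only the left side is null
        return left.Equals(right);
    }

    public static bool operator !=(EntryWithOperators left, EntryWithOperators right)
    {
        return !(left == right);
    }
}
```

With these definitions, new Entry(1) == new Entry(1) is false while new Entry(1).Equals(new Entry(1)) is true, which matches what the question observed. For EntryWithOperators both expressions are true.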

System.Array.IndexOf allocates memory

I've been profiling my code and found that System.Array.IndexOf is allocating a fair bit of memory. I've been trying to find out how come this happens.
public struct LRItem
{
    public ProductionRule Rule { get; } // ProductionRule is a class
    public int Position { get; }
}
// ...
public List<LRItem> Items { get; } = new List<LRItem>();
// ...
public bool Add(LRItem item)
{
    if (Items.Contains(item)) return false;
    Items.Add(item);
    return true;
}
I'm assuming the IndexOf is called by Items.Contains because I don't think Items.Add has any business checking indices. I've tried looking at the reference source and .NET Core source but to no avail. Is this a bug in the VS profiler? Is this function actually allocating memory? Could I optimize my code somehow?
I know this is probably a bit late, but in case anyone else has the same question...
When List<T>.Contains(...) is called, it uses the EqualityComparer<T>.Default to compare the individual items to find what you've passed in[1]. The docs say this about EqualityComparer<T>.Default:
"The Default property checks whether type T implements the System.IEquatable<T> interface and, if so, returns an EqualityComparer<T> that uses that implementation. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T."
Since your LRItem does not implement IEquatable<T>, the default comparer falls back to using Object.Equals(object). And because LRItem is a struct, it ends up being boxed as an object so that it can be passed in to Object.Equals(...), which is where the allocations are coming from.
The easy fix for this is to take a hint from the docs and implement the IEquatable<T> interface:
public struct LRItem : IEquatable<LRItem>
{
    // ...
    public bool Equals(LRItem other)
    {
        // Implement this
        return true;
    }
}
This will now cause EqualityComparer<T>.Default to return a specialised comparer that does not need to box your LRItem structs, which avoids the allocation.
[1] I'm not sure if something's changed since this question was asked (or maybe it's a .net framework vs core difference or something) but List<T>.Contains() doesn't call Array.IndexOf() nowadays. Either way, both of them do defer to EqualityComparer<T>.Default, which means that this should still be relevant in either case.
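For completeness, here is one way the placeholder body could be filled in. This is only a sketch: the ProductionRule stub, the constructor, and the choice to compare rules with object.Equals (which is reference equality unless ProductionRule overrides Equals) are all assumptions, since the question does not show those details.

```csharp
using System;
using System.Collections.Generic;

// Stand-in stub; the real ProductionRule class is not shown in the question.
public class ProductionRule
{
}

public struct LRItem : IEquatable<LRItem>
{
    public ProductionRule Rule { get; }
    public int Position { get; }

    // Assumed constructor, added so the sketch can create items.
    public LRItem(ProductionRule rule, int position)
    {
        Rule = rule;
        Position = position;
    }

    // Called by EqualityComparer<LRItem>.Default without boxing the argument.
    public bool Equals(LRItem other)
    {
        return Position == other.Position && object.Equals(Rule, other.Rule);
    }

    public override bool Equals(object obj) => obj is LRItem other && Equals(other);

    // Kept consistent with Equals; unchecked so overflow never throws.
    public override int GetHashCode()
    {
        unchecked
        {
            return ((Rule != null ? Rule.GetHashCode() : 0) * 397) ^ Position;
        }
    }
}
```

With this in place, a List<LRItem>.Contains call goes through the typed Equals(LRItem) overload.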

GetHashCode method for Generic HashSet<T>

I'm trying to write a generic IEqualityComparer for a HashSet, such that two sets are equal if and only if their elements match.
So, Equals will look like:
public bool Equals(HashSet<T> A, HashSet<T> B)
{
    return (A.All(x => B.Contains(x)) && B.All(x => A.Contains(x)));
}
I am having much more trouble finding a good GetHashCode method. I am aware that
public int GetHashCode(HashSet<int> obj)
{
    return 1;
}
is always an option, but I'd like to have something better than that. Does anybody have an idea of how I could do that? Is it a bad idea to call ToString on each element, order and join the results, and take the hash code of the resulting string?
The IEqualityComparer<T> interface abstracts the set of operations required here:
Equals
GetHashCode
You can get the default comparer like all .NET classes do: using the EqualityComparer<T>.Default Property
However, it is my understanding that HashSet<> has the policy to use the Comparer associated with the object that you invoke an operation on, even if it takes another HashSet as an argument.
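One possible approach, sketched below, is to combine the element hash codes with an operation that does not depend on enumeration order, such as XOR. SetComparer<T> is a hypothetical name, and XOR is just one of several reasonable choices (an unchecked sum would also work); this is not presented as the only or best answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two sets are equal when they contain the same elements.
public sealed class SetComparer<T> : IEqualityComparer<HashSet<T>>
{
    public bool Equals(HashSet<T> a, HashSet<T> b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        // SetEquals works with the element comparer of the set it is invoked on (a).
        return a.SetEquals(b);
    }

    public int GetHashCode(HashSet<T> set)
    {
        if (set == null) return 0;
        int hash = 0;
        foreach (T item in set)
        {
            // XOR is commutative and associative, so enumeration order does not matter.
            hash ^= item == null ? 0 : set.Comparer.GetHashCode(item);
        }
        return hash;
    }
}
```

The framework also ships a ready-made version of this idea: the static HashSet<T>.CreateSetComparer() method returns an IEqualityComparer<HashSet<T>> that can be used directly.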

How does implementing an interface give us a strongly typed API?

In C# in Depth, Jon Skeet uses IEquatable<> to overload the Equals() operation.
public sealed class Pair<T1, T2> : IEquatable<Pair<T1, T2>>
{
    public bool Equals(Pair<T1, T2> other)
    {
        //...
    }
}
He says we do this "to give a strongly typed API that'll avoid unnecessary execution-time checks".
Which execution time checks are avoided? More importantly, how does implementing an interface achieve a strongly typed API?
I may have missed something in the book's context. I thought interfaces gave us code re-use via polymorphism. I also understand that they are good for programming to an abstraction instead of a concrete type. That's all I'm aware of.
The default Equals method takes an object as the parameter. Thus, when implementing this method, you have to make a runtime check in your code to ensure that this object is of type Pair (before you can compare those two):
public override bool Equals(Object obj) {
    // runtime type check here
    var otherPair = obj as Pair<T1, T2>;
    if (otherPair == null)
        return false;
    // comparison code here
    ...
}
The Equals method of IEquatable<T>, however, takes a Pair<T1,T2> as its parameter. Thus, you can avoid the type check in your implementation, making it more efficient:
public bool Equals(Pair<T1, T2> other)
{
    // comparison code here
    ...
}
Classes such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> are smart enough to use IEquatable<T>.Equals instead of object.Equals on their elements, if available (see MSDN).
In this case he's providing a strongly typed version of Object.Equals, which will replace code that might look like the following:
public override bool Equals(object other)
{
    // The following type check is not needed with IEquatable<Pair<T1, T2>>
    Pair<T1, T2> pair = other as Pair<T1, T2>;
    if (pair != null)
    {
        // <-- IEquatable<Pair<T1, T2>> implementation
    }
    else
    {
        return base.Equals(other);
    }
}
The IEquatable<T> interface provides a strongly typed implementation of the Equals method, as opposed to the Equals method in System.Object that receives a System.Object.
I think that when Jon says "strongly typed" he is talking about generics.
I haven't found a non-generic IEquatable interface, but IComparable<T> and IComparable both exist.
To be fair to Skeet (although I'm sure he will be along soon), he does devote time to discussing what "strong typing" means in section 2.2.1.
In the context of your question (page 85 in my edition), I think he means that the default Equals method (which takes an object as an argument) defers to the strongly typed Equals method that implements the interface.
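Putting the answers above together, a Pair<T1, T2> along these lines might look as follows. This is a sketch written for this thread, not the code from the book: the First and Second members, the constructor, and the hash-combining constants are assumptions.

```csharp
using System;
using System.Collections.Generic;

public sealed class Pair<T1, T2> : IEquatable<Pair<T1, T2>>
{
    public T1 First { get; }
    public T2 Second { get; }

    public Pair(T1 first, T2 second)
    {
        First = first;
        Second = second;
    }

    // Strongly typed: the compiler guarantees the argument is a Pair<T1, T2>,
    // so no execution-time type check is needed here.
    public bool Equals(Pair<T1, T2> other)
    {
        if (ReferenceEquals(other, null)) return false;
        return EqualityComparer<T1>.Default.Equals(First, other.First)
            && EqualityComparer<T2>.Default.Equals(Second, other.Second);
    }

    // Weakly typed: does the execution-time check once, then defers to the typed overload.
    public override bool Equals(object obj) => Equals(obj as Pair<T1, T2>);

    public override int GetHashCode()
    {
        unchecked
        {
            int first = First == null ? 0 : EqualityComparer<T1>.Default.GetHashCode(First);
            int second = Second == null ? 0 : EqualityComparer<T2>.Default.GetHashCode(Second);
            return first * 37 + second;
        }
    }
}
```

Callers that know the static type bind straight to Equals(Pair<T1, T2>) and skip the cast, while callers that only hold an object still get the same answer through the override.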

Equals method implementation helpers (C#)

Every time I write a data class, I usually spend a lot of time writing the IEquatable implementation.
The last class I wrote was something like:
public class Polygon
{
    public Point[] Vertices { get; set; }
}
Implementing IEquatable was exhausting. Surely C# 3.0/LINQ helps a lot, but the vertices can be shifted and/or in reverse order, and that adds a lot of complexity to the Equals method. After many unit tests and the corresponding implementation, I gave up and changed my application to accept only triangles, whose IEquatable implementation required only 11 unit tests to be fully covered.
Is there any tool or technique that helps with implementing Equals and GetHashCode?
I use ReSharper to generate equality members. It will optionally implement IEquatable<T> as well as overriding operators if you want that (which of course you never do, but it's cool anyway).
The implementation of Equals includes an override of Object.Equals(Object), as well as a strongly typed variant (which can avoid unnecessary type checking). The lesser typed version calls the strongly typed one after performing a type check. The strongly typed version performs a reference equality check (Object.ReferenceEquals(Object,Object)) and then compares the values of all fields (well, only those that you tell the generator to include).
As for GetHashCode, the hash codes of the fields are combined in a smart way (using unchecked to avoid overflow exceptions if you use the compiler's checked option). Each field's hash code (apart from the first one) is multiplied by a prime number before being combined. You can also specify which fields will never be null, and it'll drop the null checks for them.
Here's what you get for your Polygon class by pressing ALT+Insert then selecting "Generate Equality Members":
public class Polygon : IEquatable<Polygon>
{
    public Point[] Vertices { get; set; }

    public bool Equals(Polygon other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Equals(other.Vertices, Vertices);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != typeof (Polygon)) return false;
        return Equals((Polygon) obj);
    }

    public override int GetHashCode()
    {
        return (Vertices != null ? Vertices.GetHashCode() : 0);
    }
}
Some of the features I talked about above don't apply as there is only one field. Note too that it hasn't checked the contents of the array.
In general though, ReSharper pumps out a lot of excellent code in just a matter of seconds. And that feature is pretty low on my list of things that makes ReSharper such an amazing tool.
For comparing two arrays of items, I use the SequenceEqual extension method.
As for a generic Equals and GetHashCode, there's a technique based on serialization that might work for you.
Using MemoryStream and BinaryFormatter for reuseable GetHashCode and DeepCopy functions
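As a sketch of the SequenceEqual suggestion, the generated Polygon members could be adapted to compare array contents like this. The Point struct below is a hypothetical stand-in (the question does not say which Point type it uses), and this version only treats polygons as equal when their vertices appear in the same order; it does not handle the shifted or reversed vertex lists the question mentions.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for whatever Point type the question uses.
public struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }

    public bool Equals(Point other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is Point other && Equals(other);
    public override int GetHashCode() { unchecked { return (X * 397) ^ Y; } }
}

public class Polygon : IEquatable<Polygon>
{
    public Point[] Vertices { get; set; }

    public bool Equals(Polygon other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        // If either array is null, they are equal only when both are null.
        if (Vertices == null || other.Vertices == null) return Vertices == other.Vertices;
        // Element-by-element comparison instead of comparing array references.
        return Vertices.SequenceEqual(other.Vertices);
    }

    public override bool Equals(object obj) => Equals(obj as Polygon);

    public override int GetHashCode()
    {
        if (Vertices == null) return 0;
        unchecked
        {
            // Order-dependent combination, matching the order-dependent Equals above.
            int hash = 17;
            foreach (Point p in Vertices) hash = hash * 31 + p.GetHashCode();
            return hash;
        }
    }
}
```

One caveat with this design: because Vertices is mutable, changing it while the polygon is stored in a hash-based collection would change its hash code and make the entry hard to find again.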
