How to implement an override of GetHashCode() that matches the logic of an overridden Equals() - C#

I have some classes as below, and I have implemented the Equals(Object) method for almost all of them, but I don't know how to write GetHashCode(). Since I use these types as values in a Dictionary collection, I think I should override GetHashCode().
1. I don't know how to implement GetHashCode() so that it matches the logic of Equals(Object).
2. There are some derived classes. If I override GetHashCode() and Equals(Object) for the base class (Param), is it still necessary to override them for the child classes?
class Param
{
...
public Int16 id { get; set; }
public String name { get; set; }
...
public override bool Equals(object obj)
{
if ( obj is Param){
Param p = (Param)(obj);
if (id > 0 && p.id > 0)
return (id == p.id);
else if (name != String.Empty && p.name != String.Empty)
return (name.Equals(p.name));
else
return object.ReferenceEquals(this, obj);
}
return false;
}
}
class Item
{
public int it_code { get; set; }
public Dictionary<String, Param> paramAr { get; set; }
...
public override bool Equals(Object obj)
{
Item im = new Item();
if (obj is Item)
im = (Item)obj;
else
return false;
if (this.it_code != String.Empty && im.it_code != String.Empty)
if (this.it_code.Equals(im.it_code))
return true;
bool reParams = true;
foreach ( KeyValuePair<String,Param> kvp in paramAr ){
if (kvp.Value != im.paramAr[kvp.Key]) {
reParams = false;
break;
}
}
return reParams;
}
}
class Order
{
public String or_code { get; set; }
public List <Item> items { get; set; }
...
public override bool Equals( Object obj ){
Order o = new Order();
if (obj is Order)
o = (Order)obj;
else
return false;
if (this.or_code != String.Empty && o.or_code != String.Empty)
if (this.or_code.Equals(o.or_code))
return true;
bool flag = true;
foreach( Item i in items){
if (!o.items.Contains(i)) {
flag = false;
break;
}
}
return flag;
}
}
EDIT:
I get this warning:
Warning: 'Item' overrides Object.Equals(object o) but does not override Object.GetHashCode()

Firstly, as I think you understand, wherever you implement Equals you MUST also implement GetHashCode. The implementation of GetHashCode must reflect the behaviour of the Equals implementation but it doesn't usually use it.
See http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx - especially the "Notes to Implementers"
So if you take your example of the Param implementation of Equals, both id and name affect equality, so both of them must contribute to the GetHashCode implementation.
An example of how you could implement GetHashCode for Param would be along the lines of the following (note that you may need to make it resilient to a null name field):
public override int GetHashCode()
{
return id.GetHashCode() ^ name.GetHashCode();
}
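A minimal null-safe sketch of the idea above. It assumes a simplified Equals that always compares both id and name, which is not the asker's exact rule: the original Equals treats two objects as equal whenever both ids are positive and match, even if the names differ, and a hash that mixes in name would then break the contract. The class name ParamSketch is hypothetical.

```csharp
using System;

// Hypothetical simplified version of Param: equality is defined on both Id and Name.
public class ParamSketch
{
    public short Id { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as ParamSketch;
        if (other == null) return false;
        return Id == other.Id && string.Equals(Name, other.Name);
    }

    public override int GetHashCode()
    {
        // A null Name contributes 0 instead of throwing.
        int nameHash = Name != null ? Name.GetHashCode() : 0;
        return Id.GetHashCode() ^ nameHash;
    }
}
```

Because both methods read exactly the same members, two instances that compare equal always produce the same hash code.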
See Eric Lippert's blog post on guidelines for GetHashCode - http://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/
As for whether you need to re-implement GetHashCode in subclasses: yes, if you also override Equals. As per the first (and main) point, the two implementations must be consistent: if two items are considered equal by Equals, they must return the same value from GetHashCode.
Side note:
As a performance improvement on your code (to avoid multiple casts), instead of:
if ( obj is Param){
Param p = (Param)(obj);
use:
Param p = obj as Param;
if (p != null) ...

I prefer Josh Bloch's approach.
Here's the example for the Param class.
public override int GetHashCode()
{
unchecked // overflow is fine here, the value simply wraps around
{
int hash = 17;
hash = hash * 23 + id.GetHashCode();
hash = hash * 23 + (name != null ? name.GetHashCode() : 0);
return hash;
}
}
Also, check this link out: .net - best algorithm for GetHashCode
Properties used for the hashcode computation should be immutable as well.
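To see why immutability matters, here is a small sketch under stated assumptions: MutableKey and MutableKeyDemo are hypothetical names, and the hash is just the Id so that the behaviour is deterministic. It shows a key becoming unfindable once its hash changes while it sits inside a HashSet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash code depends on a mutable property.
public class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && Id == other.Id;
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

public static class MutableKeyDemo
{
    // Returns whether the set still finds the key after the key was mutated.
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };
        key.Id = 2;                 // the hash code changes while the key is stored
        return set.Contains(key);   // lookup uses the new hash, the entry was filed under the old one
    }
}
```

FoundAfterMutation() returns false even though the very same object is still in the set, which is the kind of bug that immutable hash inputs rule out.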

Related

List equality of a custom class

I have a class A, which holds a string property and overrides Equals for equality testing.
public class A
{
public string Prop { get; }
public A(string val)
{
Prop = val;
}
public override bool Equals(object obj)
{
return obj is A arg && (Prop == arg.Prop);
}
public override int GetHashCode()
{
return base.GetHashCode();
}
}
I also have a class B which has a List<A> as property:
public class B
{
public IReadOnlyList<A> Prop { get; }
public B(IReadOnlyList<A> val)
{
Prop = val;
}
public override bool Equals(object obj)
{
// ...
}
public override int GetHashCode()
{
return base.GetHashCode();
}
}
I want to be able to compare two instances of B for equality and order.
How can I write the Equals method in B by not rewriting the same code I wrote in A?
Is there a way to reuse the A Equals?
Update: My first version assumed B is derived from A.
A.Equals:
If A is not sealed, obj is A ... can return a false positive if different types are compared. So the corrected version:
public override bool Equals(object obj)
{
return obj is A other
&& this.Prop == other.Prop
&& this.GetType() == other.GetType(); // not needed if A is sealed
}
A.GetHashCode:
base.GetHashCode will return different hash codes for different but equal instances, which is wrong. Derive the hash code from the object's own properties instead. If Prop acts like an ID, simply return Prop.GetHashCode().
B.Equals:
public override bool Equals(object obj)
{
return obj is B other
&& this.Prop.SequenceEqual(other.Prop) // will re-use A.Equals
&& this.Prop.GetType() == other.Prop.GetType() // not needed if different IReadOnlyList types are ok
&& this.GetType() == other.GetType(); // not needed if B is sealed
}
B.GetHashCode:
You can aggregate the hash codes of A instances. Here I use a simple XOR but if the same items can often come in a different order you can come up with something more fancy.
return Prop.Aggregate(0, (h, i) => h ^ i.GetHashCode());
Implementing Equals for a list can be done by using the SequenceEqual method (from the System.Linq namespace), which ensures that each item in one list equals the item at the same index in the other list.
One thing you might consider changing, however is your implementation of GetHashCode. This method should return the same number if two items are equal (though it's not guaranteed that two items with the same hash code are equal). Using base.GetHashCode() does not meet this requirement, since the base is object in this case; according to the documentation, "hash codes for reference types are computed by calling the Object.GetHashCode method of the base class, which computes a hash code based on an object's reference", so objects only return the same HashCode if they refer to the exact same object.
The HashCode should be based on the same properties used to determine equality, so in this case we want to use Prop.GetHashCode() for class A, and we want to aggregate the hashcode for all the items in Prop for class B.
Here's one way the classes could be refactored:
public class A : IEquatable<A>
{
public string Prop { get; }
public A(string val)
{
Prop = val;
}
public bool Equals(A other)
{
if (other == null) return false;
return Prop == other.Prop;
}
public override bool Equals(object obj)
{
return Equals(obj as A);
}
public override int GetHashCode()
{
return Prop.GetHashCode();
}
}
public class B : IEquatable<B>
{
public IReadOnlyList<A> Prop { get; }
public B(IReadOnlyList<A> val)
{
Prop = val;
}
public bool Equals(B other)
{
if (other == null) return false;
if (ReferenceEquals(this, other)) return true;
if (Prop == null) return other.Prop == null;
return other.Prop != null && Prop.SequenceEqual(other.Prop);
}
public override bool Equals(object obj)
{
return Equals(obj as B);
}
public override int GetHashCode()
{
return Prop?.Aggregate(17,
(current, item) => current * 17 + (item?.GetHashCode() ?? 0))
?? 0;
}
}
Linq contains a useful method to compare collections: SequenceEqual
public override bool Equals(object obj)
{
if (!(obj is B other))
{
return false;
}
if (this.Prop == null || other.Prop == null)
{
return false;
}
return this.Prop.SequenceEqual(other.Prop);
}
Also, implement IEquatable<T> when you override Equals.
How about something like this:
public override bool Equals(object obj)
{
if(!(obj is B))
{
return false;
}
var b = obj as B;
if(b.Prop.Count != this.Prop.Count)
{
return false;
}
for(var i =0; i < Prop.Count; i++)
{
if (!Prop.ElementAt(i).Equals(b.Prop.ElementAt(i)))
{
return false;
}
}
return true;
}

Avoiding duplicates in a HashSet of custom types in C#

I have the following custom class deriving from Tuple:
public class CustomTuple : Tuple<List<string>, DateTime?>
{
public CustomTuple(IEnumerable<string> strings, DateTime? time)
: base(strings.OrderBy(x => x).ToList(), time)
{
}
}
and a HashSet<CustomTuple>. The problem is that when I add items to the set, they are not recognised as duplicates. i.e. this outputs 2, but it should output 1:
void Main()
{
HashSet<CustomTuple> set = new HashSet<CustomTuple>();
var a = new CustomTuple(new List<string>(), new DateTime?());
var b = new CustomTuple(new List<string>(), new DateTime?());
set.Add(a);
set.Add(b);
Console.Write(set.Count); // Outputs 2
}
How can I override the Equals and GetHashCode methods to cause this code to output a set count of 1?
You should override GetHashCode and Equals virtual methods defined in System.Object class.
Please remember that:
If two objects are logically "equal" then they MUST have the same hash code!
If two objects have the same hashcode, then it is not mandatory to have your objects equal.
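The two rules above can be illustrated with a small sketch. Pair is a hypothetical type, and XOR is chosen deliberately because it produces an easy-to-predict collision.

```csharp
using System;

// Hypothetical type used only to illustrate the two rules.
public sealed class Pair
{
    public int X { get; }
    public int Y { get; }

    public Pair(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj)
    {
        var other = obj as Pair;
        return other != null && X == other.X && Y == other.Y;
    }

    // XOR is symmetric, so (1, 2) and (2, 1) collide. That is legal, only slower.
    public override int GetHashCode()
    {
        return X ^ Y;
    }
}
```

new Pair(1, 2) and new Pair(2, 1) both hash to 3 but are not equal, so a HashSet<Pair> keeps both. Two instances of new Pair(1, 2) are equal and share a hash, so the set keeps only one.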
Also, I've noticed an architectural problem in your code:
List is a mutable type, but overriding Equals and GetHashCode usually makes your class behave logically like a value type. Having Item1 be a mutable type inside something that behaves like a value type is very dangerous. I suggest replacing your List with a ReadOnlyCollection. You would then need a method that checks whether two ReadOnlyCollections are equal.
For the GetHashCode() method, just compose a string from all the string items found in Item1, append the hash code of the DateTime, and finally call string's overridden GetHashCode() on the concatenated result. So normally you would have:
public override int GetHashCode () {
return (GetHashCodeForList (Item1) + (Item2 ?? DateTime.MinValue).GetHashCode ()).GetHashCode ();
}
And the GetHashCodeForList method would be something like this:
private string GetHashCodeForList (IEnumerable <string> lst) {
if (lst == null) return string.Empty;
StringBuilder sb = new StringBuilder ();
foreach (var item in lst) {
sb.Append (item);
}
return sb.ToString ();
}
Final note: You could cache the GetHashCode result, since it is relatively expensive to compute and your entire class would become immutable (if you replace List with a read-only collection).
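A rough sketch of the caching idea, assuming the contents really cannot change after construction. FrozenStrings is a hypothetical class, and it uses a multiply-and-add combination instead of the string concatenation described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical immutable wrapper: the input is copied once and never exposed for mutation.
public sealed class FrozenStrings
{
    private readonly string[] _items;
    private readonly int _hash;   // computed once in the constructor

    public FrozenStrings(IEnumerable<string> items)
    {
        _items = (items ?? Enumerable.Empty<string>()).ToArray();
        unchecked
        {
            int hash = 17;
            foreach (var s in _items)
                hash = hash * 23 + (s != null ? s.GetHashCode() : 0);
            _hash = hash;
        }
    }

    public override bool Equals(object obj)
    {
        var other = obj as FrozenStrings;
        return other != null && _items.SequenceEqual(other._items);
    }

    public override int GetHashCode()
    {
        return _hash;   // no iteration per call
    }
}
```

Caching is only safe because nothing can modify _items afterwards. With a mutable list the cached value could go stale.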
A HashSet<T> will first call GetHashCode, so you need to work on that first. For an implementation, see this answer: https://stackoverflow.com/a/263416/1250301
So a simple, naive, implementation might look like this:
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + this.Item2.GetHashCode();
foreach (var s in this.Item1)
{
hash = hash * 23 + s.GetHashCode();
}
return hash;
}
}
However, if your lists are long, then this might not be efficient enough. So you'll have to decide where to compromise depending on how tolerant you are of collisions.
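One possible compromise, sketched under assumptions: hash the element count plus only the first few elements. BoundedHash and its maxItems parameter are hypothetical, not part of any library. Lists that differ only beyond the limit will collide, and Equals then has to tell them apart.

```csharp
using System;
using System.Collections.Generic;

public static class BoundedHash
{
    // Hashes the count plus at most maxItems leading elements.
    // maxItems is a hypothetical tuning knob: higher means fewer collisions but more work.
    public static int Of(IReadOnlyList<string> items, int maxItems = 8)
    {
        if (items == null) return 0;
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + items.Count;
            int limit = Math.Min(items.Count, maxItems);
            for (int i = 0; i < limit; i++)
            {
                string s = items[i];
                hash = hash * 23 + (s != null ? s.GetHashCode() : 0);
            }
            return hash;
        }
    }
}
```

This stays consistent with an element-by-element Equals, because equal lists have the same count and the same leading elements.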
If the result of GetHashCode for two items are the same, then, and only then, will it call Equals. An implementation of Equals is going to need to compare the items in the list. Something like this:
public override bool Equals(object o1)
{
var o = o1 as CustomTuple;
if (o == null)
{
return false;
}
if (Item2 != o.Item2)
{
return false;
}
if (Item1.Count() != o.Item1.Count())
{
return false;
}
for (int i=0; i < Item1.Count(); i++)
{
if (Item1[i] != o.Item1[i])
{
return false;
}
}
return true;
}
Note that we check the date (Item2) first, because that's cheap. If the date isn't the same, we don't bother with anything else. Next we check the Count on both collections (Item1). If they don't match, there's no point iterating the collections. Then we loop through both collections and compare each item. Once we find one that doesn't match, we return false because there is no point continuing to look.
As pointed out in George's answer, you also have the problem that your list is mutable, which will cause problems with your HashSet, for example:
var a = new CustomTuple(new List<string>() {"foo"} , new DateTime?());
var b = new CustomTuple(new List<string>(), new DateTime?());
set.Add(a);
set.Add(b);
// Hashset now has two entries
b.Item1.Add("foo");
// Hashset still has two entries, but they are now identical.
To solve that, you need to force your IEnumerable<string> to be readonly. You could do something like:
public class CustomTuple : Tuple<IReadOnlyList<string>, DateTime?>
{
public CustomTuple(IEnumerable<string> strings, DateTime? time)
: base(strings.OrderBy(x => x).ToList().AsReadOnly(), time)
{
}
public override bool Equals(object o1)
{
// as above
}
public override int GetHashCode()
{
// as above
}
}
This is what I went for, which outputs 1 as desired:
private class CustomTuple : Tuple<List<string>, DateTime?>
{
public CustomTuple(IEnumerable<string> strings, DateTime? time)
: base(strings.OrderBy(x => x).ToList(), time)
{
}
public override bool Equals(object obj)
{
if (obj == null || GetType() != obj.GetType())
{
return false;
}
var that = (CustomTuple) obj;
if (Item1 == null && that.Item1 != null || Item1 != null && that.Item1 == null) return false;
if (Item2 == null && that.Item2 != null || Item2 != null && that.Item2 == null) return false;
if (!Item2.Equals(that.Item2)) return false;
if (that.Item1.Count != Item1.Count) return false;
for (int i = 0; i < Item1.Count; i++)
{
if (!Item1[i].Equals(that.Item1[i])) return false;
}
return true;
}
public override int GetHashCode()
{
int hash = 17;
hash = hash*23 + Item2.GetHashCode();
return Item1.Aggregate(hash, (current, s) => current*23 + s.GetHashCode());
}
}

Is there a way to reduce amount of boilerplate code in Equals and GetHashCode?

I frequently have to override Equals and GetHashCode methods for the purpose of unit testing. After this my classes begin to look like this:
public class TestItem
{
public bool BoolValue { get; set; }
public DateTime DateTimeValue { get; set; }
public double DoubleValue { get; set; }
public long LongValue { get; set; }
public string StringValue { get; set; }
public SomeEnumType EnumValue { get; set; }
public decimal? NullableDecimal { get; set; }
public override bool Equals(object obj)
{
var other = obj as TestItem;
if (other == null)
{
return false;
}
if (object.ReferenceEquals(this, other))
{
return true;
}
return this.BoolValue == other.BoolValue
&& this.DateTimeValue == other.DateTimeValue
&& this.DoubleValue == other.DoubleValue // that's not a good way, but it's ok for demo
&& this.EnumValue == other.EnumValue
&& this.LongValue == other.LongValue
&& this.StringValue == other.StringValue
&& this.NullableDecimal == other.NullableDecimal;
}
public override int GetHashCode()
{
return this.BoolValue.GetHashCode()
^ this.DateTimeValue.GetHashCode()
^ this.DoubleValue.GetHashCode()
^ this.EnumValue.GetHashCode()
^ this.LongValue.GetHashCode()
^ this.NullableDecimal.GetHashCode()
^ (this.StringValue != null ? this.StringValue.GetHashCode() : 0);
}
}
While it's not hard to do, over time it gets boring and error-prone to maintain the same list of fields in both Equals and GetHashCode. Is there any way to list the fields used for equality checking and hashing only once? Equals and GetHashCode should then be implemented in terms of this setup list.
In my imagination configuration and usage of such setup list may look like
public class TestItem
{
// same properties as before
private static readonly EqualityFieldsSetup<TestItem> Setup = new EqualityFieldsSetup<TestItem>()
.Add(o => o.BoolValue)
.Add(o => o.DateTimeValue)
// ... and so on
// or even .Add(o => o.SomeFunction())
public override bool Equals(object obj)
{
return Setup.Equals(this, obj);
}
public override int GetHashCode()
{
return Setup.GetHashCode(this);
}
}
There's a way to auto-implement hashCode and equals in Java, with Project Lombok for example. I wonder whether anything that reduces this boilerplate is readily available for C#.
I think it would be possible to implement pretty much the same thing as Lombok in C#, but I'm not feeling that ambitious at the moment.
I believe this is what you are after though (pretty much exactly as you have described it). This implementation does box all value types into objects, so it's not the most efficient implementation, but it should be good enough for your purpose of unit tests.
public class EqualityFieldsSetup<T>
where T : class
{
private List<Func<T, object>> _propertySelectors;
public EqualityFieldsSetup()
{
_propertySelectors = new List<Func<T, object>>();
}
public EqualityFieldsSetup<T> Add(Func<T, object> propertySelector)
{
_propertySelectors.Add(propertySelector);
return this;
}
public bool Equals(T objA, object other)
{
//If both are null, then they are equal
// (a condition I think you missed)
if (objA == null && other == null)
return true;
T objB = other as T;
if (objB == null)
return false;
if (object.ReferenceEquals(objA, objB))
return true;
foreach (Func<T, object> propertySelector in _propertySelectors)
{
object objAProperty = propertySelector.Invoke(objA);
object objBProperty = propertySelector.Invoke(objB);
//If both are null, then they are equal
// move on to the next property
if (objAProperty == null && objBProperty == null)
continue;
//Boxing requires the use of Equals() instead of '=='
if (objAProperty == null && objBProperty != null ||
!objAProperty.Equals(objBProperty))
return false;
}
return true;
}
public int GetHashCode(T obj)
{
int hashCode = 0;
foreach (Func<T, object> propertySelector in _propertySelectors)
{
object objProperty = propertySelector.Invoke(obj);
if (objProperty != null)
hashCode ^= objProperty.GetHashCode();
}
return hashCode;
}
}
I've done some research and found several components that were not quite what I wanted:
EqualityComparer (nuget) - does not seem to provide meaningful GetHashCode() by default and too heavyweight to my taste.
AnonymousComparer (nuget) - does not support GetHashCode() composition.
MemberwiseEqualityComparer - requires adding a custom attribute to exclude a member from comparison, so it's not possible to flexibly configure comparisons for existing types. Personally, I think using Emit for this task is overkill.
System.DataStructures.FuncComparer (nuget) - does not support composition.
And also a couple of related discussions:
How to quickly check if two data transfer objects have equal properties in C#?
Is there a better way to implement Equals for object with lots of fields?
So far, the idea of having an explicitly configured list of members seemed unique, so I implemented my own library: https://github.com/msugakov/YetAnotherEqualityComparer. It's better than the code suggested by TylerOhlsen in that it does not box extracted members and it uses EqualityComparer<T> to compare members.
Now the code looks like:
public class TestItem
{
private static readonly MemberEqualityComparer<TestItem> Comparer = new MemberEqualityComparer<TestItem>()
.Add(o => o.BoolValue)
.Add(o => o.DateTimeValue)
.Add(o => o.DoubleValue) // IEqualityComparer<double> can (and should) be specified here
.Add(o => o.EnumValue)
.Add(o => o.LongValue)
.Add(o => o.StringValue)
.Add(o => o.NullableDecimal);
// property list is the same
public override bool Equals(object obj)
{
return Comparer.Equals(this, obj);
}
public override int GetHashCode()
{
return Comparer.GetHashCode(this);
}
}
Also the MemberEqualityComparer implements IEqualityComparer<T> and follows its semantics: it can successfully compare default(T) which may be null for reference types and Nullables.
UPDATE: There are tools that can solve the same problem of creating member based IEqualityComparer<T> but also these can provide composite IComparer<T>!
Comparers (by Stephen Cleary) (nuget).
ComparerExtensions (nuget).
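Another way to name the members only once, without an extra library, is to route both methods through a value tuple. This is only a sketch: it assumes C# 7 tuple syntax with System.ValueTuple available, and TupleBackedItem is a hypothetical cut-down version of TestItem.

```csharp
using System;

// Hypothetical cut-down TestItem; the tuple lists the compared members exactly once.
public class TupleBackedItem
{
    public bool BoolValue { get; set; }
    public long LongValue { get; set; }
    public string StringValue { get; set; }
    public decimal? NullableDecimal { get; set; }

    // Single place that names the members taking part in equality and hashing.
    private (bool, long, string, decimal?) Key
    {
        get { return (BoolValue, LongValue, StringValue, NullableDecimal); }
    }

    public override bool Equals(object obj)
    {
        var other = obj as TupleBackedItem;
        return other != null && Key.Equals(other.Key);
    }

    public override int GetHashCode()
    {
        return Key.GetHashCode();
    }
}
```

The tuple compares and hashes its elements with their default equality comparers, so null strings and null nullables are handled without extra checks. The trade-off is that a tuple is built on every call.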

C# class definition HELP

This is a project question that I just can't seem to answer:
using System;
namespace ConsoleApplication2
{
internal class Equipment : IComparable
{
private readonly string type;
private readonly int serialNo;
private string colour;
public decimal cost;
public Equipment(string type, int serialNo)
{
this.type = type == null ? "" : type.Trim();
this.serialNo = serialNo;
}
public string Key
{
get { return type + ":" + serialNo; }
}
int IComparable.CompareTo(object obj)
{
return 0;
}
}
}
(a) Override the appropriate method to ensure that different instances of the class that represent the same equipment item will be considered the same in the system.
(b) Override the appropriate method to enable instances of this class to be stored (and found) by key in a hash table
You should override the Equals and GetHashCode methods for this purpose.
Override Equals() with appropriate comparison logic.
Override GetHashCode(); see GetHashCode Guidelines in C#.
You should read this before attempting such a task:
Why is it important to override GetHashCode when Equals method is overridden in C#?
Writing GetHashCode manually is not that easy. Anyhow, the code below was generated for this purpose by ReSharper. It's a complete solution (it should be placed inside your class definition, of course). But what would you say if you were asked why and how it works? It might be embarrassing.
So, apart from GetHashCode and Equals, which others have suggested you read about, you might also look up http://msdn.microsoft.com/en-us/library/system.object.referenceequals.aspx as well as http://msdn.microsoft.com/en-us/library/a569z7k8(v=VS.100).aspx
As for the mystery behind 397 in GetHashCode, have a look at this question here on StackOverflow: Why is '397' used for ReSharper GetHashCode override?
public bool Equals(Equipment other)
{
if (ReferenceEquals(null, other))
{
return false;
}
if (ReferenceEquals(this, other))
{
return true;
}
return Equals(other.colour, colour) && other.cost == cost && other.serialNo == serialNo && Equals(other.type, type);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj))
{
return false;
}
if (ReferenceEquals(this, obj))
{
return true;
}
if (obj.GetType() != typeof (Equipment))
{
return false;
}
return Equals((Equipment) obj);
}
// note: if "Override the appropriate method to enable instances of this class
// to be stored (and found) by key in a hash table" is supposed to mean that only type and
// serialNo should be taken into account (since they are used to generate
// the Key value) - just remove the lines with cost and colour
public override int GetHashCode()
{
unchecked
{
int result = (colour != null ? colour.GetHashCode() : 0);
result = (result*397) ^ cost.GetHashCode();
result = (result*397) ^ serialNo;
result = (result*397) ^ (type != null ? type.GetHashCode() : 0);
return result;
}
}
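If the exercise does mean that only type and serialNo identify an item, the trimmed version might look like the sketch below. EquipmentByKey is a hypothetical name, and ignoring cost and colour is an assumption about the intended meaning of "the same equipment item".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of Equipment where identity is type + serialNo only.
internal sealed class EquipmentByKey
{
    private readonly string type;
    private readonly int serialNo;
    public decimal Cost;   // deliberately ignored by Equals and GetHashCode

    public EquipmentByKey(string type, int serialNo)
    {
        this.type = type == null ? "" : type.Trim();   // never null after construction
        this.serialNo = serialNo;
    }

    public override bool Equals(object obj)
    {
        var other = obj as EquipmentByKey;
        return other != null && serialNo == other.serialNo && type == other.type;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (type.GetHashCode() * 397) ^ serialNo;
        }
    }
}
```

With this in place, a Dictionary<EquipmentByKey, string> finds an entry stored under ("drill", 7) when it is looked up with a second instance built from (" drill ", 7), whatever its Cost is.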

Override Equals and GetHashCode in class with one field

I have a class:
public abstract class AbstractDictionaryObject
{
public virtual int LangId { get; set; }
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
if (other.LangId != LangId)
{
return false;
}
return true;
}
public override int GetHashCode()
{
int hashCode = 0;
hashCode = 19 * hashCode + LangId.GetHashCode();
return hashCode;
}
}
And I have derived classes:
public class Derived1:AbstractDictionaryObject
{...}
public class Derived2:AbstractDictionaryObject
{...}
In the AbstractDictionaryObject is only one common field: LangId.
I think this is not enough to override the methods properly.
How can I identify objects?
For one thing you can simplify both your methods:
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
return other.LangId == LangId;
}
public override int GetHashCode()
{
return LangId;
}
But at that point it should be fine. If the two derived classes have other fields, they should override GetHashCode and Equals themselves, first calling base.Equals or base.GetHashCode and then applying their own logic.
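As a sketch of that pattern, here is a derived class with one extra field. DictionaryObjectBase mirrors the simplified base class above under a different name, and CountryEntry with its Code property is hypothetical.

```csharp
using System;

public abstract class DictionaryObjectBase
{
    public virtual int LangId { get; set; }

    public override bool Equals(object obj)
    {
        if (obj == null || obj.GetType() != GetType()) return false;
        return ((DictionaryObjectBase)obj).LangId == LangId;
    }

    public override int GetHashCode()
    {
        return LangId;
    }
}

// Hypothetical derived class with one extra field, Code.
public class CountryEntry : DictionaryObjectBase
{
    public string Code { get; set; }

    public override bool Equals(object obj)
    {
        // base.Equals has already checked null, the runtime type and LangId.
        return base.Equals(obj) && Code == ((CountryEntry)obj).Code;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return base.GetHashCode() * 31 + (Code != null ? Code.GetHashCode() : 0);
        }
    }
}
```

The cast in CountryEntry.Equals is safe because base.Equals only returns true when obj has exactly the same runtime type as this.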
Two instances of Derived1 with the same LangId will be equivalent as far as AbstractDictionaryObject is concerned, and so will two instances of Derived2 - but they will be different from each other as they have different types.
If you wanted to give them different hash codes you could change GetHashCode() to:
public override int GetHashCode()
{
int hash = 17;
hash = hash * 31 + GetType().GetHashCode();
hash = hash * 31 + LangId;
return hash;
}
However, hash codes for different objects don't have to be different... it just helps in performance. You may want to do this if you know you will have instances of different types with the same LangId, but otherwise I wouldn't bother.
