Issue with containskey and gethashcode - c#

I am currently trying to use the containskey method to check if a dictionary i have contains a certain key of a custom type. To do this i should override the gethashcode function, which i have, however the containskey method is still not working. There must be something i am not doing right but i havent figured out what exactly in the past 5 hours i have been trying this:
public class Parameter : IEquatable<Parameter>
{
public string Field { get; set; }
public string Content { get; set; }
public bool Equals(Parameter other)
{
if (other == null)
{
return false;
}
return Field.Equals(other.Field) && Content.Equals(other.Content);
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + Field.GetHashCode();
hash = hash * 23 + Content.GetHashCode();
return hash;
}
}
}
public class Trigger : IEquatable<Trigger>
{
public Dictionary<int, Parameter> Parameters { get; private set; }
private string Event { get; set; }
public bool Equals(Trigger item)
{
if (item == null)
{
return false;
}
return Event.Equals(item.Event) && Parameters.Equals(item.Parameters);
}
public override int GetHashCode()
{
unchecked
{
var hash = 17;
hash = hash * 23 + Parameters.GetHashCode();
hash = hash * 23 + Event.GetHashCode();
return hash;
}
}
}
For additional clarity: I have a Dictionary(Trigger, State) that i want to check the keys of so i assumed if i made sure all my sub-classes were equatable i could just use the containskey method, but apparently it does not.
Edit: What i have done now is implement Jon Skeet's Dictionary class and use this to do my checks on:
public override bool Equals(object o)
{
var item = o as Trigger;
if (item == null)
{
return false;
}
return Event.Equals(item.Event) && Dictionaries.Equals(Parameters, item.Parameters);
}
public override int GetHashCode()
{
var hash = 17;
hash = hash * 23 + Dictionaries.GetHashCode(Parameters);
hash = hash * 23 + Event.GetHashCode();
return hash;
}

Dictionary<,> doesn't itself override Equals and GetHashCode - so your Trigger implementations are broken. You would need to work out what equality you wanted and implement it yourself.
I have a sample implementation in protobuf-csharp-port which you might want to look at.
EDIT: Your change still isn't quite right. You should implement equality like this:
return Event.Equals(item.Event) &&
Dictionaries.Equals(Parameters, item.Parameters);
and implement GetHashCode as:
var hash = 17;
hash = hash * 23 + Dictionaries.GetHashCode(Parameters);
hash = hash * 23 + Event.GetHashCode();
return hash;

You should use the overloaded constructor:
Dictionary<TKey, TValue>(IDictionary<TKey, TValue>, IEqualityComparer<TKey>)
instead of using autoproperty, use a backup field.

Related

Linq distinct doesn't call Equals method

I have the following class
public class ModInfo : IEquatable<ModInfo>
{
public int ID { get; set; }
public string MD5 { get; set; }
public bool Equals(ModInfo other)
{
return other.MD5.Equals(MD5);
}
public override int GetHashCode()
{
return MD5.GetHashCode();
}
}
I load some data into a list of that class using a method like this:
public void ReloadEverything() {
var beforeSort = new List<ModInfo>();
// Bunch of loading from local sqlite database.
// not included since it's reload boring to look at
var modinfo = beforeSort.OrderBy(m => m.ID).AsEnumerable().Distinct().ToList();
}
Problem is the Distinct() call doesn't seem to do it's job. There are still objects which are equals each other.
Acording to this article: https://msdn.microsoft.com/en-us/library/vstudio/bb348436%28v=vs.100%29.aspx
that is how you are supposed to make distinct work, however it doesn't seem to be calling to Equals method on the ModInfo object.
What could be causing this to happen?
Example values:
modinfo[0]: id=2069, MD5 =0AAEBF5D2937BDF78CB65807C0DC047C
modinfo[1]: id=2208, MD5 = 0AAEBF5D2937BDF78CB65807C0DC047C
I don't care which value gets chosen, they are likely to be the same anyway since the md5 value is the same.
You also need to override Object.Equals, not just implement IEquatable.
If you add this to your class:
public override bool Equals(object other)
{
ModInfo mod = other as ModInfo;
if (mod != null)
return Equals(mod);
return false;
}
It should work.
See this article for more info: Implementing IEquatable Properly
EDIT: Okay, here's a slightly different implementation based on best practices with GetHashCode.
public class ModInfo : IEquatable<ModInfo>
{
public int ID { get; set; }
public string MD5 { get; set; }
public bool Equals(ModInfo other)
{
if (other == null) return false;
return (this.MD5.Equals(other.MD5));
}
public override int GetHashCode()
{
unchecked
{
int hash = 13;
hash = (hash * 7) + MD5.GetHashCode();
return hash;
}
}
public override bool Equals(object obj)
{
ModInfo other = obj as ModInfo;
if (other != null)
{
return Equals(other);
}
else
{
return false;
}
}
}
You can verify it:
ModInfo mod1 = new ModInfo {ID = 1, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};
ModInfo mod2 = new ModInfo {ID = 2, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};
// You should get true here
bool areEqual = mod1.Equals(mod2);
List<ModInfo> mods = new List<ModInfo> {mod1, mod2};
// You should get 1 result here
mods = mods.Distinct().ToList();
What's with those specific numbers in GetHashCode?
Add
public bool Equals(object other)
{
return this.Equals(other as ModInfo)
}
Also see here the recommendations how to implement the equality members: https://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx

using long (int64) as a hashCode and still use IEqualityComparer for concurrent Dictionary

I have a problem using a self made IEqualityComparer and GetHashCode in a concurrent dictionary.
The class below (simplified with used two properties) works perfect when I implement it like this:
ConcurrentDictionary<TwoUintsKeyInfo,Int64> hashCodePlusIandJDict = new ConcurrentDictionary<TwoUintsKeyInfo, Int64>();
.
public class TwoUintsKeyInfo
{
public uint IdOne { get; set; }
public uint IdTwo { get; set; }
#region Implemetation of the IEqualityComparer
public class EqualityComparerTwoUintsKeyInfo : IEqualityComparer<TwoUintsKeyInfo>
{
System.Reflection.PropertyInfo[] properties;
bool propertyArraySet=false;
public int GetHashCode(TwoUintsKeyInfo obj)
{
unchecked
{
if(!propertyArraySet)
{
properties = obj.GetType().GetProperties().OrderBy(x => x.Name).ToArray();
propertyArraySet = true;
}
decimal hash = 17;
int counter=0;
foreach(System.Reflection.PropertyInfo p in properties)
{
counter++;
var value = p.GetValue(obj);
decimal unique = (decimal)Math.Pow(Math.E, counter);
hash = hash + (value == null ? unique : value.GetHashCode() * unique);
}
return 2147483647M * .001M > hash ? (int)(hash * 1000) : (int)hash;
}
}
public bool Equals(TwoUintsKeyInfo x, TwoUintsKeyInfo y)
{
return GetHashCode(x) == GetHashCode(y);
}
}
#endregion Implemetation of the IEqualityComparer
}
Now I made almost the same class, but instead of the normal IEqualityComparer interface, I made a little change, so I could generate long / int64 hascodes (because when the class hold more and more properties, we encountered multiple values with the same hashcode)
So I wanted to reduce the changes of getting the same hascode. Therefore I wanted to use bigger numbers and if possible multiple by 10000 to get some of the decimals in on the action as well.
therefore I created this interface:
public interface IEqualityComparerInt64<in T>
{
bool Equals(T x, T y);
Int64 GetHashCode(T obj);
}
and altered the property class so it looks like this:
public class TwoUintsKeyInfoInt64
{
public uint IdOne { get; set; }
public uint IdTwo { get; set; }
#region Implemetation of the IEqualityComparer
public class EqualityComparerTwoUintsKeyInfoInt64 : IEqualityComparerInt64<TwoUintsKeyInfoInt64>
{
System.Reflection.PropertyInfo[] properties;
bool propertyArraySet=false;
decimal _upperThreshold,_lowerThreshold;
public EqualityComparerTwoUintsKeyInfoInt64()
{
_upperThreshold = long.MaxValue * .0001M;
_lowerThreshold = -long.MaxValue * .0001M;
}
public long GetHashCode(TwoUintsKeyInfoInt64 obj)
{
unchecked
{
if(!propertyArraySet)
{
properties = obj.GetType().GetProperties().OrderBy(x => x.Name).ToArray();
propertyArraySet = true;
}
decimal hash = 17;
int counter=0;
foreach(System.Reflection.PropertyInfo p in properties)
{
counter++;
var value = p.GetValue(obj);
decimal unique = (decimal)Math.Pow(Math.E, counter);
hash = hash + (value == null ? unique : value.GetHashCode() * unique);
}
return _upperThreshold > hash && _lowerThreshold < hash ? (long)(hash * 10000) : (long)hash;
}
}
public bool Equals(TwoUintsKeyInfoInt64 x, TwoUintsKeyInfoInt64 y)
{
return GetHashCode(x) == GetHashCode(y);
}
}
#endregion Implemetation of the IEqualityComparer
}
GetHashCode worked fine. So far no problem.
But...when I try to add a IEqualityComparer to the concurrentdictionary like this:
ConcurrentDictionary<TwoUintsKeyInfoInt64,Int64> hashCodePlusIandJDict = new ConcurrentDictionary<TwoUintsKeyInfoInt64, Int64>(new TwoUintsKeyInfoInt64.EqualityComparerOneUintAndTwoStringKeyInfo());
I get this error:
Error 3 Argument 1: cannot convert from
'HasCodeTestForUniqueResult.TwoUintsKeyInfoInt64.EqualityComparerOneUintAndTwoStringKeyInfo'
to
'System.Collections.Generic.IEqualityComparer' D:\Users\mldz\Documents\visual
studio
2012\HashCodeTestForUniqueResult\HashCodeTestForUniqueResult\Form1.cs 109 140 HashCodeTestForUniqueResult
I understand that there's a conflict between the int type of the default System.Collections.Generic.IEqualityComparer and my long / int64 result from my own GetHashCode generator. But is there any way to solve this and be able to use long HashCodes?
Kind regards,
Matthijs
P.S. the code above is just to test it and replicate the problem.
According to this you cannot use long hash codes, so the answer to the question is no.
But you can have unique combinations instead of unique values; the solution is to implement a partitioning system, meaning have a dictionary of dictionaries, like:
public class MyClass
{
Dictionary<uint, Dictionary<uint, Int64>> PartDict;
Int64 ReadValue(uint id1, uint id2)
{
return (PartDict[id1])[id2];
}
void AddValue(uint id1, uint id2, Int64 value)
{
Dictionary<uint, Int64> container;
if (!PartDict.TryGetValue(id1, out container))
{
container = new Dictionary<uint, Int64>();
PartDict.Add(id1, container);
}
container.Add(id2, value);
}
}
This way you will have a list of hash codes and each hash code will have again a list of hash codes, the combination being unique. Any reading and writing will be done in 2 steps though (to consider in case you want unique hash for performance).
Hope it helps.

Override Equals and GetHashCode in class with one field

I have a class:
public abstract class AbstractDictionaryObject
{
public virtual int LangId { get; set; }
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
if (other.LangId != LangId)
{
return false;
}
return true;
}
public override int GetHashCode()
{
int hashCode = 0;
hashCode = 19 * hashCode + LangId.GetHashCode();
return hashCode;
}
And I have derived classes:
public class Derived1:AbstractDictionaryObject
{...}
public class Derived2:AbstractDictionaryObject
{...}
In the AbstractDictionaryObject is only one common field: LangId.
I think this is not enough to overload methods (properly).
How can I identify objects?
For one thing you can simplify both your methods:
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
return other.LangId == LangId;
}
public override int GetHashCode()
{
return LangId;
}
But at that point it should be fine. If the two derived classes have other fields, they should override GetHashCode and Equals themselves, first calling base.Equals or base.GetHashCode and then applying their own logic.
Two instances of Derived1 with the same LangId will be equivalent as far as AbstractDictionaryObject is concerned, and so will two instances of Derived2 - but they will be different from each other as they have different types.
If you wanted to give them different hash codes you could change GetHashCode() to:
public override int GetHashCode()
{
int hash = 17;
hash = hash * 31 + GetType().GetHashCode();
hash = hash * 31 + LangId;
return hash;
}
However, hash codes for different objects don't have to be different... it just helps in performance. You may want to do this if you know you will have instances of different types with the same LangId, but otherwise I wouldn't bother.

C# - Generic HashCode implementation for classes

I'm looking at how build the best HashCode for a class and I see some algorithms. I saw this one : Hash Code implementation, seems to be that .NET classes HashCode methods are similar (see by reflecting the code).
So question is, why don't create the above static class in order to build a HashCode automatically, just by passing fields we consider as a "key".
// Old version, see edit
public static class HashCodeBuilder
{
public static int Hash(params object[] keys)
{
if (object.ReferenceEquals(keys, null))
{
return 0;
}
int num = 42;
checked
{
for (int i = 0, length = keys.Length; i < length; i++)
{
num += 37;
if (object.ReferenceEquals(keys[i], null))
{ }
else if (keys[i].GetType().IsArray)
{
foreach (var item in (IEnumerable)keys[i])
{
num += Hash(item);
}
}
else
{
num += keys[i].GetHashCode();
}
}
}
return num;
}
}
And use it as like this :
// Old version, see edit
public sealed class A : IEquatable<A>
{
public A()
{ }
public string Key1 { get; set; }
public string Key2 { get; set; }
public string Value { get; set; }
public override bool Equals(object obj)
{
return this.Equals(obj as A);
}
public bool Equals(A other)
{
if(object.ReferenceEquals(other, null))
? false
: Key1 == other.Key1 && Key2 == other.Key2;
}
public override int GetHashCode()
{
return HashCodeBuilder.Hash(Key1, Key2);
}
}
Will be much simpler that always is own method, no? I'm missing something?
EDIT
According all remarks, I got the following code :
public static class HashCodeBuilder
{
public static int Hash(params object[] args)
{
if (args == null)
{
return 0;
}
int num = 42;
unchecked
{
foreach(var item in args)
{
if (ReferenceEquals(item, null))
{ }
else if (item.GetType().IsArray)
{
foreach (var subItem in (IEnumerable)item)
{
num = num * 37 + Hash(subItem);
}
}
else
{
num = num * 37 + item.GetHashCode();
}
}
}
return num;
}
}
public sealed class A : IEquatable<A>
{
public A()
{ }
public string Key1 { get; set; }
public string Key2 { get; set; }
public string Value { get; set; }
public override bool Equals(object obj)
{
return this.Equals(obj as A);
}
public bool Equals(A other)
{
if(ReferenceEquals(other, null))
{
return false;
}
else if(ReferenceEquals(this, other))
{
return true;
}
return Key1 == other.Key1
&& Key2 == other.Key2;
}
public override int GetHashCode()
{
return HashCodeBuilder.Hash(Key1, Key2);
}
}
Your Equals method is broken - it's assuming that two objects with the same hash code are necessarily equal. That's simply not the case.
Your hash code method looked okay at a quick glance, but could actually do some with some work - see below. It means boxing any value type values and creating an array any time you call it, but other than that it's okay (as SLaks pointed out, there are some issues around the collection handling). You might want to consider writing some generic overloads which would avoid those performance penalties for common cases (1, 2, 3 or 4 arguments, perhaps). You might also want to use a foreach loop instead of a plain for loop, just to be idiomatic.
You could do the same sort of thing for equality, but it would be slightly harder and messier.
EDIT: For the hash code itself, you're only ever adding values. I suspect you were trying to do this sort of thing:
int hash = 17;
hash = hash * 31 + firstValue.GetHashCode();
hash = hash * 31 + secondValue.GetHashCode();
hash = hash * 31 + thirdValue.GetHashCode();
return hash;
But that multiplies the hash by 31, it doesn't add 31. Currently your hash code will always return the same for the same values, whether or not they're in the same order, which isn't ideal.
EDIT: It seems there's some confusion over what hash codes are used for. I suggest that anyone who isn't sure reads the documentation for Object.GetHashCode and then Eric Lippert's blog post about hashing and equality.
This is what I'm using:
public static class ObjectExtensions
{
/// <summary>
/// Simplifies correctly calculating hash codes based upon
/// Jon Skeet's answer here
/// http://stackoverflow.com/a/263416
/// </summary>
/// <param name="obj"></param>
/// <param name="memberThunks">Thunks that return all the members upon which
/// the hash code should depend.</param>
/// <returns></returns>
public static int CalculateHashCode(this object obj, params Func<object>[] memberThunks)
{
// Overflow is okay; just wrap around
unchecked
{
int hash = 5;
foreach (var member in memberThunks)
hash = hash * 29 + member().GetHashCode();
return hash;
}
}
}
Example usage:
public class Exhibit
{
public virtual Document Document { get; set; }
public virtual ExhibitType ExhibitType { get; set; }
#region System.Object
public override bool Equals(object obj)
{
return Equals(obj as Exhibit);
}
public bool Equals(Exhibit other)
{
return other != null &&
Document.Equals(other.Document) &&
ExhibitType.Equals(other.ExhibitType);
}
public override int GetHashCode()
{
return this.CalculateHashCode(
() => Document,
() => ExhibitType);
}
#endregion
}
Instead of calling keys[i].GetType().IsArray, you should try to cast it to IEnumerable (using the as keyword).
You can fix the Equals method without repeating the field list by registering a static list of fields, like I do here using a collection of delegates.
This also avoids the array allocation per-call.
Note, however, that my code doesn't handle collection properties.

How to remove duplicates from a List<T>?

I am following a previous post on stackoverflow about removing duplicates from a List<T> in C#.
If <T> is some user defined type like:
class Contact
{
public string firstname;
public string lastname;
public string phonenum;
}
The suggested (HashMap) doesn't remove duplicate. I think, I have to redefine some method for comparing two objects, isn't it?
A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.
I suspect by "duplicate" you mean "an object with equal field values to another object" - you need to override Equals/GetHashCode for that to work, and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.
Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:
list = list.Distinct().ToList();
But again, you'll need to provide an appropriate definition of equality, somehow or other.
Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and
made
the fields private, with public properties. Finally, I've sealed the class - immutable types should generally be sealed, and it makes equality easier to talk about.
using System;
using System.Collections.Generic;
public sealed class Contact : IEquatable<Contact>
{
private readonly string firstName;
public string FirstName { get { return firstName; } }
private readonly string lastName;
public string LastName { get { return lastName; } }
private readonly string phoneNumber;
public string PhoneNumber { get { return phoneNumber; } }
public Contact(string firstName, string lastName, string phoneNumber)
{
this.firstName = firstName;
this.lastName = lastName;
this.phoneNumber = phoneNumber;
}
public override bool Equals(object other)
{
return Equals(other as Contact);
}
public bool Equals(Contact other)
{
if (object.ReferenceEquals(other, null))
{
return false;
}
if (object.ReferenceEquals(other, this))
{
return true;
}
return FirstName == other.FirstName &&
LastName == other.LastName &&
PhoneNumber == other.PhoneNumber;
}
public override int GetHashCode()
{
// Note: *not* StringComparer; EqualityComparer<T>
// copes with null; StringComparer doesn't.
var comparer = EqualityComparer<string>.Default;
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + comparer.GetHashCode(FirstName);
hash = hash * 31 + comparer.GetHashCode(LastName);
hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
return hash;
}
}
}
EDIT: Okay, in response to requests for an explanation of the GetHashCode() implementation:
We want to combine the hash codes of the properties of this object
We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.
Two alternative ways of handling nullity, by the way:
public override int GetHashCode()
{
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + (FirstName ?? "").GetHashCode();
hash = hash * 31 + (LastName ?? "").GetHashCode();
hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
return hash;
}
}
or
public override int GetHashCode()
{
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
return hash;
}
}
class Contact {
public int Id { get; set; }
public string Name { get; set; }
public override string ToString()
{
return string.Format("{0}:{1}", Id, Name);
}
static private IEqualityComparer<Contact> comparer;
static public IEqualityComparer<Contact> Comparer {
get { return comparer ?? (comparer = new EqualityComparer()); }
}
class EqualityComparer : IEqualityComparer<Contact> {
bool IEqualityComparer<Contact>.Equals(Contact x, Contact y)
{
if (x == y)
return true;
if (x == null || y == null)
return false;
return x.Name == y.Name; // let's compare by Name
}
int IEqualityComparer<Contact>.GetHashCode(Contact c)
{
return c.Name.GetHashCode(); // let's compare by Name
}
}
}
class Program {
public static void Main()
{
var list = new List<Contact> {
new Contact { Id = 1, Name = "John" },
new Contact { Id = 2, Name = "Sylvia" },
new Contact { Id = 3, Name = "John" }
};
var distinctNames = list.Distinct(Contact.Comparer).ToList();
foreach (var contact in distinctNames)
Console.WriteLine(contact);
}
}
gives
1:John
2:Sylvia
For this task I don't necessarily thinks implementing IComparable is the obvious solution. You might want to sort and test for uniqueness in many different ways.
I would favor implementing a IEqualityComparer<Contact>:
sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
{
public bool Equals (Contact x, Contact y)
{
return x.firstname == y.firstname && x.lastname == y.lastname;
}
public int GetHashCode (Contact obj)
{
return obj.firstname.GetHashCode () ^ obj.lastname.GetHashCode ();
}
}
And then use System.Linq.Enumerable.Distinct (assuming you are using at least .NET 3.5)
var unique = contacts.Distinct (new ContactFirstNameLastNameComparer ()).ToArray ();
PS. Speaking of HashSet<> Note that HashSet<> takes an IEqualityComparer<> as a constructor parameter.

Categories