How to remove duplicates from a List<T>? - c#

I am following a previous post on stackoverflow about removing duplicates from a List<T> in C#.
If <T> is some user defined type like:
class Contact
{
public string firstname;
public string lastname;
public string phonenum;
}
The suggested (HashMap) doesn't remove duplicate. I think, I have to redefine some method for comparing two objects, isn't it?

A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.
I suspect by "duplicate" you mean "an object with equal field values to another object" - you need to override Equals/GetHashCode for that to work, and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.
Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:
list = list.Distinct().ToList();
But again, you'll need to provide an appropriate definition of equality, somehow or other.
Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and
made
the fields private, with public properties. Finally, I've sealed the class - immutable types should generally be sealed, and it makes equality easier to talk about.
using System;
using System.Collections.Generic;
public sealed class Contact : IEquatable<Contact>
{
private readonly string firstName;
public string FirstName { get { return firstName; } }
private readonly string lastName;
public string LastName { get { return lastName; } }
private readonly string phoneNumber;
public string PhoneNumber { get { return phoneNumber; } }
public Contact(string firstName, string lastName, string phoneNumber)
{
this.firstName = firstName;
this.lastName = lastName;
this.phoneNumber = phoneNumber;
}
public override bool Equals(object other)
{
return Equals(other as Contact);
}
public bool Equals(Contact other)
{
if (object.ReferenceEquals(other, null))
{
return false;
}
if (object.ReferenceEquals(other, this))
{
return true;
}
return FirstName == other.FirstName &&
LastName == other.LastName &&
PhoneNumber == other.PhoneNumber;
}
public override int GetHashCode()
{
// Note: *not* StringComparer; EqualityComparer<T>
// copes with null; StringComparer doesn't.
var comparer = EqualityComparer<string>.Default;
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + comparer.GetHashCode(FirstName);
hash = hash * 31 + comparer.GetHashCode(LastName);
hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
return hash;
}
}
}
EDIT: Okay, in response to requests for an explanation of the GetHashCode() implementation:
We want to combine the hash codes of the properties of this object
We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.
Two alternative ways of handling nullity, by the way:
public override int GetHashCode()
{
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + (FirstName ?? "").GetHashCode();
hash = hash * 31 + (LastName ?? "").GetHashCode();
hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
return hash;
}
}
or
public override int GetHashCode()
{
// Unchecked to allow overflow, which is fine
unchecked
{
int hash = 17;
hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
return hash;
}
}

class Contact {
public int Id { get; set; }
public string Name { get; set; }
public override string ToString()
{
return string.Format("{0}:{1}", Id, Name);
}
static private IEqualityComparer<Contact> comparer;
static public IEqualityComparer<Contact> Comparer {
get { return comparer ?? (comparer = new EqualityComparer()); }
}
class EqualityComparer : IEqualityComparer<Contact> {
bool IEqualityComparer<Contact>.Equals(Contact x, Contact y)
{
if (x == y)
return true;
if (x == null || y == null)
return false;
return x.Name == y.Name; // let's compare by Name
}
int IEqualityComparer<Contact>.GetHashCode(Contact c)
{
return c.Name.GetHashCode(); // let's compare by Name
}
}
}
class Program {
public static void Main()
{
var list = new List<Contact> {
new Contact { Id = 1, Name = "John" },
new Contact { Id = 2, Name = "Sylvia" },
new Contact { Id = 3, Name = "John" }
};
var distinctNames = list.Distinct(Contact.Comparer).ToList();
foreach (var contact in distinctNames)
Console.WriteLine(contact);
}
}
gives
1:John
2:Sylvia

For this task I don't necessarily thinks implementing IComparable is the obvious solution. You might want to sort and test for uniqueness in many different ways.
I would favor implementing a IEqualityComparer<Contact>:
sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
{
public bool Equals (Contact x, Contact y)
{
return x.firstname == y.firstname && x.lastname == y.lastname;
}
public int GetHashCode (Contact obj)
{
return obj.firstname.GetHashCode () ^ obj.lastname.GetHashCode ();
}
}
And then use System.Linq.Enumerable.Distinct (assuming you are using at least .NET 3.5)
var unique = contacts.Distinct (new ContactFirstNameLastNameComparer ()).ToArray ();
PS. Speaking of HashSet<> Note that HashSet<> takes an IEqualityComparer<> as a constructor parameter.

Related

LINQ distinct with IEquatable is not returning distinct result [duplicate]

This question already has answers here:
LINQ's Distinct() on a particular property
(23 answers)
Closed 19 days ago.
class Program
{
static void Main(string[] args)
{
List<Book> books = new List<Book>
{
new Book
{
Name="C# in Depth",
Authors = new List<Author>
{
new Author
{
FirstName = "Jon", LastName="Skeet"
},
new Author
{
FirstName = "Jon", LastName="Skeet"
},
}
},
new Book
{
Name="LINQ in Action",
Authors = new List<Author>
{
new Author
{
FirstName = "Fabrice", LastName="Marguerie"
},
new Author
{
FirstName = "Steve", LastName="Eichert"
},
new Author
{
FirstName = "Jim", LastName="Wooley"
},
}
},
};
var temp = books.SelectMany(book => book.Authors).Distinct();
foreach (var author in temp)
{
Console.WriteLine(author.FirstName + " " + author.LastName);
}
Console.Read();
}
}
public class Book
{
public string Name { get; set; }
public List<Author> Authors { get; set; }
}
public class Author
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object obj)
{
return true;
//if (obj.GetType() != typeof(Author)) return false;
//else return ((Author)obj).FirstName == this.FirstName && ((Author)obj).FirstName == this.LastName;
}
}
This is based on an example in "LINQ in Action". Listing 4.16.
This prints Jon Skeet twice. Why? I have even tried overriding Equals method in Author class. Still Distinct does not seem to work. What am I missing?
Edit:
I have added == and != operator overload too. Still no help.
public static bool operator ==(Author a, Author b)
{
return true;
}
public static bool operator !=(Author a, Author b)
{
return false;
}
LINQ Distinct is not that smart when it comes to custom objects.
All it does is look at your list and see that it has two different objects (it doesn't care that they have the same values for the member fields).
One workaround is to implement the IEquatable interface as shown here.
If you modify your Author class like so it should work.
public class Author : IEquatable<Author>
{
public string FirstName { get; set; }
public string LastName { get; set; }
public bool Equals(Author other)
{
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
public override int GetHashCode()
{
int hashFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
int hashLastName = LastName == null ? 0 : LastName.GetHashCode();
return hashFirstName ^ hashLastName;
}
}
Try it as DotNetFiddle
The Distinct() method checks reference equality for reference types. This means it is looking for literally the same object duplicated, not different objects which contain the same values.
There is an overload which takes an IEqualityComparer, so you can specify different logic for determining whether a given object equals another.
If you want Author to normally behave like a normal object (i.e. only reference equality), but for the purposes of Distinct check equality by name values, use an IEqualityComparer. If you always want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.
The two members on the IEqualityComparer interface are Equals and GetHashCode. Your logic for determining whether two Author objects are equal appears to be if the First and Last name strings are the same.
public class AuthorEquals : IEqualityComparer<Author>
{
public bool Equals(Author left, Author right)
{
if((object)left == null && (object)right == null)
{
return true;
}
if((object)left == null || (object)right == null)
{
return false;
}
return left.FirstName == right.FirstName && left.LastName == right.LastName;
}
public int GetHashCode(Author author)
{
return (author.FirstName + author.LastName).GetHashCode();
}
}
Another solution without implementing IEquatable, Equals and GetHashCode is to use the LINQs GroupBy method and to select the first item from the IGrouping.
var temp = books.SelectMany(book => book.Authors)
.GroupBy (y => y.FirstName + y.LastName )
.Select (y => y.First ());
foreach (var author in temp){
Console.WriteLine(author.FirstName + " " + author.LastName);
}
There is one more way to get distinct values from list of user defined data type:
YourList.GroupBy(i => i.Id).Select(i => i.FirstOrDefault()).ToList();
Surely, it will give distinct set of data
Distinct() performs the default equality comparison on objects in the enumerable. If you have not overridden Equals() and GetHashCode(), then it uses the default implementation on object, which compares references.
The simple solution is to add a correct implementation of Equals() and GetHashCode() to all classes which participate in the object graph you are comparing (ie Book and Author).
The IEqualityComparer interface is a convenience that allows you to implement Equals() and GetHashCode() in a separate class when you don't have access to the internals of the classes you need to compare, or if you are using a different method of comparison.
You've overriden Equals(), but make sure you also override GetHashCode()
The Above answers are wrong!!!
Distinct as stated on MSDN returns the default Equator which as stated The Default property checks whether type T implements the System.IEquatable interface and, if so, returns an EqualityComparer that uses that implementation. Otherwise, it returns an EqualityComparer that uses the overrides of Object.Equals and Object.GetHashCode provided by T
Which means as long as you overide Equals you are fine.
The reason you're code is not working is because you check firstname==lastname.
see https://msdn.microsoft.com/library/bb348436(v=vs.100).aspx and https://msdn.microsoft.com/en-us/library/ms224763(v=vs.100).aspx
You can achieve this several ways:
1. You may to implement the IEquatable interface as shown Enumerable.Distinct Method or you can see #skalb's answer at this post
2. If your object has not unique key, You can use GroupBy method for achive distinct object list, that you must group object's all properties and after select first object.
For example like as below and working for me:
var distinctList= list.GroupBy(x => new {
Name= x.Name,
Phone= x.Phone,
Email= x.Email,
Country= x.Country
}, y=> y)
.Select(x => x.First())
.ToList()
MyObject class is like as below:
public class MyClass{
public string Name{get;set;}
public string Phone{get;set;}
public string Email{get;set;}
public string Country{get;set;}
}
3. If your object's has unique key, you can only use the it in group by.
For example my object's unique key is Id.
var distinctList= list.GroupBy(x =>x.Id)
.Select(x => x.First())
.ToList()
You can use extension method on list which checks uniqueness based on computed Hash.
You can also change extension method to support IEnumerable.
Example:
public class Employee{
public string Name{get;set;}
public int Age{get;set;}
}
List<Employee> employees = new List<Employee>();
employees.Add(new Employee{Name="XYZ", Age=30});
employees.Add(new Employee{Name="XYZ", Age=30});
employees = employees.Unique(); //Gives list which contains unique objects.
Extension Method:
public static class LinqExtension
{
public static List<T> Unique<T>(this List<T> input)
{
HashSet<string> uniqueHashes = new HashSet<string>();
List<T> uniqueItems = new List<T>();
input.ForEach(x =>
{
string hashCode = ComputeHash(x);
if (uniqueHashes.Contains(hashCode))
{
return;
}
uniqueHashes.Add(hashCode);
uniqueItems.Add(x);
});
return uniqueItems;
}
private static string ComputeHash<T>(T entity)
{
System.Security.Cryptography.SHA1CryptoServiceProvider sh = new System.Security.Cryptography.SHA1CryptoServiceProvider();
string input = JsonConvert.SerializeObject(entity);
byte[] originalBytes = ASCIIEncoding.Default.GetBytes(input);
byte[] encodedBytes = sh.ComputeHash(originalBytes);
return BitConverter.ToString(encodedBytes).Replace("-", "");
}
The Equal operator in below code is incorrect.
Old
public bool Equals(Author other)
{
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
NEW
public override bool Equals(Object obj)
{
var other = obj as Author;
if (other is null)
{
return false;
}
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
Instead of
var temp = books.SelectMany(book => book.Authors).Distinct();
Do
var temp = books.SelectMany(book => book.Authors).DistinctBy(f => f.Property);

Distinct() How to find unique elements in list of objects

There is a very simple class:
public class LinkInformation
{
public LinkInformation(string link, string text, string group)
{
this.Link = link;
this.Text = text;
this.Group = group;
}
public string Link { get; set; }
public string Text { get; set; }
public string Group { get; set; }
public override string ToString()
{
return Link.PadRight(70) + Text.PadRight(40) + Group;
}
}
And I create a list of objects of this class, containing multiple duplicates.
So, I tried using Distinct() to get a list of unique values.
But it does not work, so I implemented
IComparable<LinkInformation>
int IComparable<LinkInformation>.CompareTo(LinkInformation other)
{
return this.ToString().CompareTo(other.ToString());
}
and then...
IEqualityComparer<LinkInformation>
public bool Equals(LinkInformation x, LinkInformation y)
{
return x.ToString().CompareTo(y.ToString()) == 0;
}
public int GetHashCode(LinkInformation obj)
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + obj.Link.GetHashCode();
hash = hash * 23 + obj.Text.GetHashCode();
hash = hash * 23 + obj.Group.GetHashCode();
return hash;
}
The code using the Distinct is:
static void Main(string[] args)
{
string[] filePath = { #"C:\temp\html\1.html",
#"C:\temp\html\2.html",
#"C:\temp\html\3.html",
#"C:\temp\html\4.html",
#"C:\temp\html\5.html"};
int index = 0;
foreach (var path in filePath)
{
var parser = new HtmlParser();
var list = parser.Parse(path);
var unique = list.Distinct();
foreach (var elem in unique)
{
var full = new FileInfo(path).Name;
var file = full.Substring(0, full.Length - 5);
Console.WriteLine((++index).ToString().PadRight(5) + file.PadRight(20) + elem);
}
}
Console.ReadKey();
}
What has to be done to get Distinct() working?
You need to actually pass the IEqualityComparer that you've created to Disctinct when you call it. It has two overloads, one accepting no parameters and one accepting an IEqualityComparer. If you don't provide a comparer the default is used, and the default comparer doesn't compare the objects as you want them to be compared.
If you want to return distinct elements from sequences of objects of some custom data type, you have to implement the IEquatable generic interface in the class.
here is a sample implementation:
public class Product : IEquatable<Product>
{
public string Name { get; set; }
public int Code { get; set; }
public bool Equals(Product other)
{
//Check whether the compared object is null.
if (Object.ReferenceEquals(other, null)) return false;
//Check whether the compared object references the same data.
if (Object.ReferenceEquals(this, other)) return true;
//Check whether the products' properties are equal.
return Code.Equals(other.Code) && Name.Equals(other.Name);
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public override int GetHashCode()
{
//Get hash code for the Name field if it is not null.
int hashProductName = Name == null ? 0 : Name.GetHashCode();
//Get hash code for the Code field.
int hashProductCode = Code.GetHashCode();
//Calculate the hash code for the product.
return hashProductName ^ hashProductCode;
}
}
And this is how you do the actual distinct:
Product[] products = { new Product { Name = "apple", Code = 9 },
new Product { Name = "orange", Code = 4 },
new Product { Name = "apple", Code = 9 },
new Product { Name = "lemon", Code = 12 } };
//Exclude duplicates.
IEnumerable<Product> noduplicates =
products.Distinct();
If you are happy with defining the "distinctness" by a single property, you can do
list
.GroupBy(x => x.Text)
.Select(x => x.First())
to get a list of "unique" items.
No need to mess around with IEqualityComparer et al.
Without using Distinct nor the comparer, how about:
list.GroupBy(x => x.ToString()).Select(x => x.First())
I know this solution is not the answer for the exact question, but I think is valid to be open for other solutions.

Issue with containskey and gethashcode

I am currently trying to use the containskey method to check if a dictionary i have contains a certain key of a custom type. To do this i should override the gethashcode function, which i have, however the containskey method is still not working. There must be something i am not doing right but i havent figured out what exactly in the past 5 hours i have been trying this:
public class Parameter : IEquatable<Parameter>
{
public string Field { get; set; }
public string Content { get; set; }
public bool Equals(Parameter other)
{
if (other == null)
{
return false;
}
return Field.Equals(other.Field) && Content.Equals(other.Content);
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + Field.GetHashCode();
hash = hash * 23 + Content.GetHashCode();
return hash;
}
}
}
public class Trigger : IEquatable<Trigger>
{
public Dictionary<int, Parameter> Parameters { get; private set; }
private string Event { get; set; }
public bool Equals(Trigger item)
{
if (item == null)
{
return false;
}
return Event.Equals(item.Event) && Parameters.Equals(item.Parameters);
}
public override int GetHashCode()
{
unchecked
{
var hash = 17;
hash = hash * 23 + Parameters.GetHashCode();
hash = hash * 23 + Event.GetHashCode();
return hash;
}
}
}
For additional clarity: I have a Dictionary(Trigger, State) that i want to check the keys of so i assumed if i made sure all my sub-classes were equatable i could just use the containskey method, but apparently it does not.
Edit: What i have done now is implement Jon Skeet's Dictionary class and use this to do my checks on:
public override bool Equals(object o)
{
var item = o as Trigger;
if (item == null)
{
return false;
}
return Event.Equals(item.Event) && Dictionaries.Equals(Parameters, item.Parameters);
}
public override int GetHashCode()
{
var hash = 17;
hash = hash * 23 + Dictionaries.GetHashCode(Parameters);
hash = hash * 23 + Event.GetHashCode();
return hash;
}
Dictionary<,> doesn't itself override Equals and GetHashCode - so your Trigger implementations are broken. You would need to work out what equality you wanted and implement it yourself.
I have a sample implementation in protobuf-csharp-port which you might want to look at.
EDIT: Your change still isn't quite right. You should implement equality like this:
return Event.Equals(item.Event) &&
Dictionaries.Equals(Parameters, item.Parameters);
and implement GetHashCode as:
var hash = 17;
hash = hash * 23 + Dictionaries.GetHashCode(Parameters);
hash = hash * 23 + Event.GetHashCode();
return hash;
You should use the overloaded constructor:
Dictionary<TKey, TValue>(IDictionary<TKey, TValue>, IEqualityComparer<TKey>)
instead of using autoproperty, use a backup field.

Override Equals and GetHashCode in class with one field

I have a class:
public abstract class AbstractDictionaryObject
{
public virtual int LangId { get; set; }
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
if (other.LangId != LangId)
{
return false;
}
return true;
}
public override int GetHashCode()
{
int hashCode = 0;
hashCode = 19 * hashCode + LangId.GetHashCode();
return hashCode;
}
And I have derived classes:
public class Derived1:AbstractDictionaryObject
{...}
public class Derived2:AbstractDictionaryObject
{...}
In the AbstractDictionaryObject is only one common field: LangId.
I think this is not enough to overload methods (properly).
How can I identify objects?
For one thing you can simplify both your methods:
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
AbstractDictionaryObject other = (AbstractDictionaryObject)obj;
return other.LangId == LangId;
}
public override int GetHashCode()
{
return LangId;
}
But at that point it should be fine. If the two derived classes have other fields, they should override GetHashCode and Equals themselves, first calling base.Equals or base.GetHashCode and then applying their own logic.
Two instances of Derived1 with the same LangId will be equivalent as far as AbstractDictionaryObject is concerned, and so will two instances of Derived2 - but they will be different from each other as they have different types.
If you wanted to give them different hash codes you could change GetHashCode() to:
public override int GetHashCode()
{
int hash = 17;
hash = hash * 31 + GetType().GetHashCode();
hash = hash * 31 + LangId;
return hash;
}
However, hash codes for different objects don't have to be different... it just helps in performance. You may want to do this if you know you will have instances of different types with the same LangId, but otherwise I wouldn't bother.

C# - Generic HashCode implementation for classes

I'm looking at how build the best HashCode for a class and I see some algorithms. I saw this one : Hash Code implementation, seems to be that .NET classes HashCode methods are similar (see by reflecting the code).
So question is, why don't create the above static class in order to build a HashCode automatically, just by passing fields we consider as a "key".
// Old version, see edit
public static class HashCodeBuilder
{
public static int Hash(params object[] keys)
{
if (object.ReferenceEquals(keys, null))
{
return 0;
}
int num = 42;
checked
{
for (int i = 0, length = keys.Length; i < length; i++)
{
num += 37;
if (object.ReferenceEquals(keys[i], null))
{ }
else if (keys[i].GetType().IsArray)
{
foreach (var item in (IEnumerable)keys[i])
{
num += Hash(item);
}
}
else
{
num += keys[i].GetHashCode();
}
}
}
return num;
}
}
And use it as like this :
// Old version, see edit
public sealed class A : IEquatable<A>
{
public A()
{ }
public string Key1 { get; set; }
public string Key2 { get; set; }
public string Value { get; set; }
public override bool Equals(object obj)
{
return this.Equals(obj as A);
}
public bool Equals(A other)
{
if(object.ReferenceEquals(other, null))
? false
: Key1 == other.Key1 && Key2 == other.Key2;
}
public override int GetHashCode()
{
return HashCodeBuilder.Hash(Key1, Key2);
}
}
Will be much simpler that always is own method, no? I'm missing something?
EDIT
According all remarks, I got the following code :
public static class HashCodeBuilder
{
public static int Hash(params object[] args)
{
if (args == null)
{
return 0;
}
int num = 42;
unchecked
{
foreach(var item in args)
{
if (ReferenceEquals(item, null))
{ }
else if (item.GetType().IsArray)
{
foreach (var subItem in (IEnumerable)item)
{
num = num * 37 + Hash(subItem);
}
}
else
{
num = num * 37 + item.GetHashCode();
}
}
}
return num;
}
}
public sealed class A : IEquatable<A>
{
public A()
{ }
public string Key1 { get; set; }
public string Key2 { get; set; }
public string Value { get; set; }
public override bool Equals(object obj)
{
return this.Equals(obj as A);
}
public bool Equals(A other)
{
if(ReferenceEquals(other, null))
{
return false;
}
else if(ReferenceEquals(this, other))
{
return true;
}
return Key1 == other.Key1
&& Key2 == other.Key2;
}
public override int GetHashCode()
{
return HashCodeBuilder.Hash(Key1, Key2);
}
}
Your Equals method is broken - it's assuming that two objects with the same hash code are necessarily equal. That's simply not the case.
Your hash code method looked okay at a quick glance, but could actually do some with some work - see below. It means boxing any value type values and creating an array any time you call it, but other than that it's okay (as SLaks pointed out, there are some issues around the collection handling). You might want to consider writing some generic overloads which would avoid those performance penalties for common cases (1, 2, 3 or 4 arguments, perhaps). You might also want to use a foreach loop instead of a plain for loop, just to be idiomatic.
You could do the same sort of thing for equality, but it would be slightly harder and messier.
EDIT: For the hash code itself, you're only ever adding values. I suspect you were trying to do this sort of thing:
int hash = 17;
hash = hash * 31 + firstValue.GetHashCode();
hash = hash * 31 + secondValue.GetHashCode();
hash = hash * 31 + thirdValue.GetHashCode();
return hash;
But that multiplies the hash by 31, it doesn't add 31. Currently your hash code will always return the same for the same values, whether or not they're in the same order, which isn't ideal.
EDIT: It seems there's some confusion over what hash codes are used for. I suggest that anyone who isn't sure reads the documentation for Object.GetHashCode and then Eric Lippert's blog post about hashing and equality.
This is what I'm using:
public static class ObjectExtensions
{
/// <summary>
/// Simplifies correctly calculating hash codes based upon
/// Jon Skeet's answer here
/// http://stackoverflow.com/a/263416
/// </summary>
/// <param name="obj"></param>
/// <param name="memberThunks">Thunks that return all the members upon which
/// the hash code should depend.</param>
/// <returns></returns>
public static int CalculateHashCode(this object obj, params Func<object>[] memberThunks)
{
// Overflow is okay; just wrap around
unchecked
{
int hash = 5;
foreach (var member in memberThunks)
hash = hash * 29 + member().GetHashCode();
return hash;
}
}
}
Example usage:
public class Exhibit
{
public virtual Document Document { get; set; }
public virtual ExhibitType ExhibitType { get; set; }
#region System.Object
public override bool Equals(object obj)
{
return Equals(obj as Exhibit);
}
public bool Equals(Exhibit other)
{
return other != null &&
Document.Equals(other.Document) &&
ExhibitType.Equals(other.ExhibitType);
}
public override int GetHashCode()
{
return this.CalculateHashCode(
() => Document,
() => ExhibitType);
}
#endregion
}
Instead of calling keys[i].GetType().IsArray, you should try to cast it to IEnumerable (using the as keyword).
You can fix the Equals method without repeating the field list by registering a static list of fields, like I do here using a collection of delegates.
This also avoids the array allocation per-call.
Note, however, that my code doesn't handle collection properties.

Categories