I have a type Relation that overrides GetHashCode and Equals.
/// <see cref="System.Object.Equals"/>
public override bool Equals(object obj)
{
bool equals = false;
if (obj is Relation)
{
Relation relation = (Relation)obj;
equals = name.Equals(relation.name);
}
return equals;
}
/// <see cref="System.Object.GetHashCode"/>
public override int GetHashCode()
{
return name.GetHashCode();
}
I have two Relation objects relation1 and relation2.
I have a HashSet<Relation> named relations.
HashSet<Relation> relations = new HashSet<Relation>();
relations.Add(relation1);
The following snippet is supposed to output the hash code of relation1 and relation2.
HashSet<Relation>.Enumerator enumerator = relations.GetEnumerator();
enumerator.MoveNext();
Console.WriteLine(relations.Comparer.GetHashCode(enumerator.Current));
Console.WriteLine(relations.Comparer.GetHashCode(relation2));
The output is:
134042217
134042217
The following snippet compares relation1 and relation2 for equality.
Console.WriteLine(relations.Comparer.Equals(enumerator.Current, relation2));
The output is:
True
However, when I try to determine if the hashSet contains relation2 I get an unexpected result.
relations.Contains(relation2)
The output is:
False
I would expect True.
Here is what I understood from MSDN: to determine the presence of an element, Contains is supposed to call GetHashCode first and then Equals. If the hash codes match and Equals returns True, there is a match.
Can you give me an explanation?
This would happen if you changed Name after inserting the first object into the HashSet.
The HashSet recorded the object's original hash code, which is not the same as the second object's.
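A minimal sketch of that failure mode. The question does not show how name is declared, so MutableRelation and MutationDemo are hypothetical names, and the settable Name property is an assumption:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Relation type; Name is deliberately mutable.
public class MutableRelation
{
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        return obj is MutableRelation other && Name == other.Name;
    }

    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class MutationDemo
{
    // Returns whether the set still finds an equal object after the stored one was renamed.
    public static bool ContainsAfterRename()
    {
        var stored = new MutableRelation { Name = "before" };
        var set = new HashSet<MutableRelation> { stored };

        // The set filed "stored" under the hash code of "before"; renaming does not re-file it.
        stored.Name = "after";

        var probe = new MutableRelation { Name = "after" };
        // probe and stored are now equal and hash alike, yet the lookup compares
        // the hash of "after" with the hash that was cached at insertion time.
        return set.Contains(probe);
    }
}
```

Unless the two names happen to collide on hash code, ContainsAfterRename() returns false. That matches the symptom in the question: Comparer.GetHashCode and Comparer.Equals look at the current state of the objects, while Contains relies on the hash recorded when the object was added.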
I wrote a test program with your overrides in it.
internal class Relation
{
public string Name { get; set; }
public override bool Equals(object obj)
{
bool equals = false;
if (obj is Relation)
{
Relation relation = (Relation)obj;
equals = Name.Equals(relation.Name);
}
return equals;
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
class Program
{
static void Main(string[] args)
{
var relation1 = new Relation() {Name = "Bob"};
var relation2 = new Relation {Name = "Bob"};
var relations = new HashSet<Relation>();
relations.Add(relation1);
var does = relations.Contains(relation2);
// does is now true
}
}
So in the minimal case, your code does what you expect. Therefore I suggest that something else is going on that's causing relation1 and relation2 to not be the same by the time you do the Contains check.
Based on your snippets I wrote the following full sample:
class Program
{
static void Main(string[] args)
{
var r1 = new Relation("name");
var r2 = new Relation("name");
HashSet<Relation> r = new HashSet<Relation>();
r.Add(r1);
bool test = r.Contains(r2);
}
}
class Relation
{
public readonly string Name;
public Relation(string name)
{
Name = name;
}
public override bool Equals(object obj)
{
bool equals = false;
if (obj is Relation)
{
Relation relation = (Relation)obj;
equals = Name.Equals(relation.Name);
}
return equals;
}
/// <see cref="System.Object.GetHashCode"/>
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
The value of 'test' is true. The only way I can explain the difference in behaviour between your code and this sample is that the Name property must not be the same between the two objects at the time that you perform the Contains check.
class Program
{
static void Main(string[] args)
{
List<Book> books = new List<Book>
{
new Book
{
Name="C# in Depth",
Authors = new List<Author>
{
new Author
{
FirstName = "Jon", LastName="Skeet"
},
new Author
{
FirstName = "Jon", LastName="Skeet"
},
}
},
new Book
{
Name="LINQ in Action",
Authors = new List<Author>
{
new Author
{
FirstName = "Fabrice", LastName="Marguerie"
},
new Author
{
FirstName = "Steve", LastName="Eichert"
},
new Author
{
FirstName = "Jim", LastName="Wooley"
},
}
},
};
var temp = books.SelectMany(book => book.Authors).Distinct();
foreach (var author in temp)
{
Console.WriteLine(author.FirstName + " " + author.LastName);
}
Console.Read();
}
}
public class Book
{
public string Name { get; set; }
public List<Author> Authors { get; set; }
}
public class Author
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object obj)
{
return true;
//if (obj.GetType() != typeof(Author)) return false;
//else return ((Author)obj).FirstName == this.FirstName && ((Author)obj).FirstName == this.LastName;
}
}
This is based on an example in "LINQ in Action". Listing 4.16.
This prints Jon Skeet twice. Why? I have even tried overriding the Equals method in the Author class. Still, Distinct does not seem to work. What am I missing?
Edit:
I have added == and != operator overload too. Still no help.
public static bool operator ==(Author a, Author b)
{
return true;
}
public static bool operator !=(Author a, Author b)
{
return false;
}
LINQ Distinct is not that smart when it comes to custom objects.
All it does is look at your list and see that it has two different objects (it doesn't care that they have the same values for the member fields).
One workaround is to implement the IEquatable<Author> interface.
If you modify your Author class like so it should work.
public class Author : IEquatable<Author>
{
public string FirstName { get; set; }
public string LastName { get; set; }
public bool Equals(Author other)
{
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
public override int GetHashCode()
{
int hashFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
int hashLastName = LastName == null ? 0 : LastName.GetHashCode();
return hashFirstName ^ hashLastName;
}
}
By default, Distinct() falls back to reference equality for reference types that do not override Equals and GetHashCode. This means it looks for literally the same object duplicated, not different objects that contain the same values.
There is an overload which takes an IEqualityComparer, so you can specify different logic for determining whether a given object equals another.
If you want Author to normally behave like a normal object (i.e. only reference equality), but for the purposes of Distinct check equality by name values, use an IEqualityComparer. If you always want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.
The two members on the IEqualityComparer interface are Equals and GetHashCode. Your logic for determining whether two Author objects are equal appears to be if the First and Last name strings are the same.
public class AuthorEquals : IEqualityComparer<Author>
{
public bool Equals(Author left, Author right)
{
if((object)left == null && (object)right == null)
{
return true;
}
if((object)left == null || (object)right == null)
{
return false;
}
return left.FirstName == right.FirstName && left.LastName == right.LastName;
}
public int GetHashCode(Author author)
{
return (author.FirstName + author.LastName).GetHashCode();
}
}
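A comparer like this is passed to the Distinct overload mentioned above. The following is a minimal, self-contained sketch of that call; AuthorNameComparer and ComparerDemo are hypothetical names, and the comparer applies the same "equal names mean equal authors" rule:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical comparer: two authors are equal when both names match.
public sealed class AuthorNameComparer : IEqualityComparer<Author>
{
    public bool Equals(Author left, Author right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (left is null || right is null) return false;
        return left.FirstName == right.FirstName && left.LastName == right.LastName;
    }

    public int GetHashCode(Author author)
    {
        // Hashing a tuple tolerates null names and hashes the two names as separate components.
        return author is null ? 0 : (author.FirstName, author.LastName).GetHashCode();
    }
}

public static class ComparerDemo
{
    public static List<string> DistinctNames()
    {
        var authors = new List<Author>
        {
            new Author { FirstName = "Jon", LastName = "Skeet" },
            new Author { FirstName = "Jon", LastName = "Skeet" },
            new Author { FirstName = "Fabrice", LastName = "Marguerie" },
        };

        // The comparer decides equality here; Author itself is left untouched.
        return authors.Distinct(new AuthorNameComparer())
                      .Select(a => a.FirstName + " " + a.LastName)
                      .ToList();
    }
}
```

With this approach Author keeps its default reference equality everywhere else, and only this one query treats same-named authors as duplicates.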
Another solution that avoids implementing IEquatable, Equals and GetHashCode is to use LINQ's GroupBy method and select the first item from each IGrouping.
var temp = books.SelectMany(book => book.Authors)
.GroupBy(y => new { y.FirstName, y.LastName }) // anonymous type: compared property by property, unlike a concatenated string
.Select(y => y.First());
foreach (var author in temp){
Console.WriteLine(author.FirstName + " " + author.LastName);
}
There is one more way to get distinct values from a list of a user-defined data type:
YourList.GroupBy(i => i.Id).Select(i => i.FirstOrDefault()).ToList();
This gives a distinct set of items, one per Id.
Distinct() performs the default equality comparison on objects in the enumerable. If you have not overridden Equals() and GetHashCode(), then it uses the default implementation on object, which compares references.
The simple solution is to add a correct implementation of Equals() and GetHashCode() to all classes which participate in the object graph you are comparing (ie Book and Author).
The IEqualityComparer interface is a convenience that allows you to implement Equals() and GetHashCode() in a separate class when you don't have access to the internals of the classes you need to compare, or if you are using a different method of comparison.
You've overridden Equals(), but make sure you also override GetHashCode().
The above answers are wrong!
As stated on MSDN, Distinct uses the default equality comparer, EqualityComparer<T>.Default. The documentation for that property says: "The Default property checks whether type T implements the System.IEquatable<T> interface and, if so, returns an EqualityComparer<T> that uses that implementation. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T."
This means that as long as you override Equals together with GetHashCode, you are fine.
Your code is not working because GetHashCode is not overridden; in addition, the commented-out Equals compares the other object's FirstName with this.LastName.
See https://msdn.microsoft.com/library/bb348436(v=vs.100).aspx and https://msdn.microsoft.com/en-us/library/ms224763(v=vs.100).aspx
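Whether overriding Equals alone is enough can be checked directly. The sketch below uses two hypothetical classes, EqualsOnly and EqualsAndHash, that differ only in whether GetHashCode is overridden:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

#pragma warning disable CS0659 // Equals overridden without GetHashCode, on purpose
// Hypothetical class that overrides Equals only.
public class EqualsOnly
{
    public string Name { get; set; }
    public override bool Equals(object obj) => obj is EqualsOnly o && Name == o.Name;
}
#pragma warning restore CS0659

// Hypothetical class that overrides both members consistently.
public class EqualsAndHash
{
    public string Name { get; set; }
    public override bool Equals(object obj) => obj is EqualsAndHash o && Name == o.Name;
    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}

public static class DistinctDemo
{
    public static int CountWithBoth()
    {
        var items = new[] { new EqualsAndHash { Name = "Jon" }, new EqualsAndHash { Name = "Jon" } };
        // Same hash code, then Equals returns true: the duplicate is dropped.
        return items.Distinct().Count();
    }

    public static int CountWithEqualsOnly()
    {
        var items = new[] { new EqualsOnly { Name = "Jon" }, new EqualsOnly { Name = "Jon" } };
        // The inherited GetHashCode is identity-based, so the two objects almost
        // always hash differently and Equals is never consulted.
        return items.Distinct().Count();
    }
}
```

CountWithBoth() returns 1. CountWithEqualsOnly() will normally return 2, because Distinct compares hash codes before it calls Equals.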
You can achieve this several ways:
1. You can implement the IEquatable interface, as shown in the documentation for the Enumerable.Distinct method, or see @skalb's answer to this post.
2. If your object does not have a unique key, you can use the GroupBy method to get a distinct object list: group by all of the object's properties, then select the first object from each group.
For example, the following works for me:
var distinctList= list.GroupBy(x => new {
Name= x.Name,
Phone= x.Phone,
Email= x.Email,
Country= x.Country
}, y=> y)
.Select(x => x.First())
.ToList();
The MyClass class looks like this:
public class MyClass{
public string Name{get;set;}
public string Phone{get;set;}
public string Email{get;set;}
public string Country{get;set;}
}
3. If your object has a unique key, you can group by that key alone.
For example, my object's unique key is Id:
var distinctList = list.GroupBy(x => x.Id)
.Select(x => x.First())
.ToList();
You can use an extension method on List<T> that checks uniqueness based on a computed hash.
You can also change the extension method to support IEnumerable<T>.
Example:
public class Employee{
public string Name{get;set;}
public int Age{get;set;}
}
List<Employee> employees = new List<Employee>();
employees.Add(new Employee{Name="XYZ", Age=30});
employees.Add(new Employee{Name="XYZ", Age=30});
employees = employees.Unique(); //Gives list which contains unique objects.
Extension Method:
public static class LinqExtension
{
public static List<T> Unique<T>(this List<T> input)
{
HashSet<string> uniqueHashes = new HashSet<string>();
List<T> uniqueItems = new List<T>();
input.ForEach(x =>
{
string hashCode = ComputeHash(x);
if (uniqueHashes.Contains(hashCode))
{
return;
}
uniqueHashes.Add(hashCode);
uniqueItems.Add(x);
});
return uniqueItems;
}
private static string ComputeHash<T>(T entity)
{
// JsonConvert comes from the Newtonsoft.Json package.
string input = JsonConvert.SerializeObject(entity);
// UTF-8 encodes every character the same way on every platform.
byte[] originalBytes = System.Text.Encoding.UTF8.GetBytes(input);
using (var sha = System.Security.Cryptography.SHA1.Create())
{
byte[] encodedBytes = sha.ComputeHash(originalBytes);
return BitConverter.ToString(encodedBytes).Replace("-", "");
}
}
}
The Equals method in the code below is incorrect.
Old:
public bool Equals(Author other)
{
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
New:
public override bool Equals(Object obj)
{
var other = obj as Author;
if (other is null)
{
return false;
}
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
Instead of
var temp = books.SelectMany(book => book.Authors).Distinct();
use DistinctBy (built into LINQ from .NET 6 onward), keyed on the property that should be unique:
var temp = books.SelectMany(book => book.Authors).DistinctBy(f => f.Property);
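A minimal sketch of DistinctBy on an author list, assuming .NET 6 or later. DistinctByDemo and UniqueByName are hypothetical names, and the key is a tuple of both name properties rather than a single property:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class DistinctByDemo
{
    public static List<Author> UniqueByName(IEnumerable<Author> authors)
    {
        // Requires .NET 6 or later, where Enumerable.DistinctBy was introduced.
        // The tuple key compares both names, so Author needs no Equals override.
        return authors.DistinctBy(a => (a.FirstName, a.LastName)).ToList();
    }
}
```

DistinctBy keeps the first element seen for each key, so no Equals or GetHashCode override is needed on Author.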
I need to write a method that will take two employee objects as input parameters and compare their IDs to determine whether or not they match.
It's incomplete at the moment; but so far, I have an Employee class that inherits first and last name properties from a "Person" class, and has ID as its own property. I'm writing the method in the employee file and have already instantiated 2 example employees in my program. As for overloading the ==, I'm running into an error that says "'Employee' defines operator == but does not override Object.Equals". It also says I need to define "!=", but I'm confused about how to set up the != overload when it doesn't even figure into this method.
I've seen two ways of doing a compare method, one that returns true or false, and another that simply writes "match" to the console. Either of those would work for my purposes, but I can't figure out a workaround for the errors or how I'd change the code in this situation in order to determine a match between 2 employee IDs. Here is my code below; I'd appreciate any input on what may be going wrong! (I have a feeling it may be very off). I'm also unsure of how to call the method but I'm currently trying to figure it out.
Program file:
namespace OperatorOverload
{
class Program
{
static void Main(string[] args)
{
Employee example = new Employee();
example.FirstName = "Kitty";
example.LastName = "Katz";
example.ID = 24923;
Employee example2 = new Employee();
example2.FirstName = "John";
example2.LastName = "Dudinsky";
example2.ID = 39292;
Console.ReadLine();
}
}
}
Employee Class:
namespace OperatorOverload
{
class Employee : Person
{
public int ID { get; set; }
public static bool operator==(Employee employee, Employee employee2)
{
if (employee.ID == employee2.ID)
return true;
else
return false;
}
}
}
Person Class:
namespace OperatorOverload
{
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
}
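You also need to override the Equals method:
public override bool Equals(object obj)
{
    // Cast to Employee, since ID is declared there and not on Person.
    Employee employee = obj as Employee;
    // ReferenceEquals avoids calling the overloaded == operator with a null operand.
    if (ReferenceEquals(employee, null))
        return false;
    return this.ID.Equals(employee.ID);
}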
Microsoft's recommendations:
Implement the GetHashCode method whenever you implement the Equals
method. This keeps Equals and GetHashCode synchronized.
Override the Equals method whenever you implement the equality
operator (==), and make them do the same thing.
What you need to do is override the Equals method like this:
public override bool Equals(object obj)
{
    var item = obj as Employee;
    // ReferenceEquals avoids the overloaded == operator, which dereferences its operands.
    if (ReferenceEquals(item, null))
    {
        return false;
    }
    return this.ID.Equals(item.ID);
}
The compiler is basically telling you that if you want to overload the == operator for the class Employee you will also have to override Object.Equals method and the != operator so that you will get a consistent semantic which you can use to compare instances of type Employee.
This means that you must not look for workarounds: you just have to overload the != operator and override Object.Equals so that Employee objects are compared by ID and not by reference (as they do by default if you don't provide your own equality semantic).
You should use this class. Overload operator != and operator ==, and override the Equals(object) method.
class Employee : Person
{
    public int ID { get; set; }

    public static bool operator ==(Employee employee, Employee employee2)
    {
        // Handle nulls first; ReferenceEquals does not re-enter this operator.
        if (ReferenceEquals(employee, employee2))
            return true;
        if (ReferenceEquals(employee, null) || ReferenceEquals(employee2, null))
            return false;
        return employee.ID == employee2.ID;
    }

    public static bool operator !=(Employee employee, Employee employee2)
    {
        return !(employee == employee2);
    }

    public override bool Equals(object obj)
    {
        var emp = obj as Employee;
        if (ReferenceEquals(emp, null))
            return false;
        return this.ID.Equals(emp.ID);
    }

    // Equal employees must produce equal hash codes.
    public override int GetHashCode()
    {
        return ID.GetHashCode();
    }
}
This is a better way to override an Equals method:
public override bool Equals(object obj)
{
if (obj is null) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != this.GetType()) return false; //optional. depends on logic
return this.ID.Equals(((Employee)obj).ID);
}
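Since the question also asks how to call the comparison, here is a minimal end-to-end sketch. It assumes the Person and Employee shapes from the question; EmployeeDemo and SameId are hypothetical names, and implementing IEquatable<Employee> is an optional addition:

```csharp
using System;

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class Employee : Person, IEquatable<Employee>
{
    public int ID { get; set; }

    public bool Equals(Employee other)
    {
        // "is null" bypasses the overloaded == operator, so there is no recursion.
        return !(other is null) && ID == other.ID;
    }

    public override bool Equals(object obj) => Equals(obj as Employee);

    // Kept in sync with Equals: equal IDs give equal hash codes.
    public override int GetHashCode() => ID.GetHashCode();

    public static bool operator ==(Employee left, Employee right)
    {
        if (left is null) return right is null;
        return left.Equals(right);
    }

    public static bool operator !=(Employee left, Employee right) => !(left == right);
}

public static class EmployeeDemo
{
    // Builds two employees and compares them with the overloaded operator.
    public static bool SameId(int firstId, int secondId)
    {
        var first = new Employee { FirstName = "Kitty", LastName = "Katz", ID = firstId };
        var second = new Employee { FirstName = "John", LastName = "Dudinsky", ID = secondId };
        return first == second;
    }
}
```

With the IDs from the question, EmployeeDemo.SameId(24923, 39292) returns false. Routing every comparison through one Equals(Employee) method keeps ==, != and Equals from drifting apart.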
I have a tree structure as follows:
public class TAGNode
{
public string Val;
public string Type = "";
private List<TAGNode> childs;
public IList<TAGNode> Childs
{
get { return childs.AsReadOnly(); }
}
public TAGNode AddChild(string val)
{
TAGNode tree = new TAGNode(val);
tree.Parent = this;
childs.Add(tree);
return tree;
}
public override bool Equals(object obj)
{
var t = obj as TAGNode;
bool eq = Val == t.Val && childs.Count == t.Childs.Count;
if (eq)
{
for (int i = 0; i < childs.Count; i++)
{
eq &= childs[i].Equals(t.childs[i]);
}
}
return eq;
}
}
I have a list of such trees which can contain repeated trees; by repeated I mean they have the same structure with the same labels. Now I want to select distinct trees from this list. I tried
etrees = new List<TAGNode>();
TAGNode test1 = new TAGNode("S");
test1.AddChild("A").AddChild("B");
test1.AddChild("C");
TAGNode test2 = new TAGNode("S");
test2.AddChild("A").AddChild("B");
test2.AddChild("C");
TAGNode test3 = new TAGNode("S");
test3.AddChild("A");
test3.AddChild("B");
etrees.Add(test1);
etrees.Add(test2);
etrees.Add(test3);
var results = etrees.Distinct();
label1.Text = results.Count() + " unique trees";
This returns the count of all the trees (3) while I expect 2 distinct trees! I think maybe I should implement a suitable Equals function for it, but as I tested it doesn't care what Equals returns!
I think maybe I should implement a suitable Equals function for it
Correct.
but as I tested it doesn't care what Equals returns!
Because you have to implement a matching GetHashCode! It doesn't need to include all the items used inside the Equals, in your case Val could be sufficient. Remember, all you need is to return one and the same hash code for the potentially equal items. The items with different hash codes are considered non equal and never checked with Equals.
So something like this should work:
public bool Equals(TAGNode other)
{
if ((object)this == (object)other) return true;
if ((object)other == null) return false;
return Val == other.Val && childs.SequenceEqual(other.childs);
}
public override bool Equals(object obj) => Equals(obj as TAGNode);
public override int GetHashCode() => Val?.GetHashCode() ?? 0;
Once you do that, you can also "mark" your TAGNode as IEquatable<TAGNode>, to let the default equality comparer directly call the Equals(TAGNode other) overload.
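Put together, the approach can be sketched as a self-contained class. TagNode and TagNodeDemo are hypothetical, simplified versions of the question's TAGNode (no Parent or Type members), built with the same three test trees:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified version of the question's TAGNode.
public sealed class TagNode : IEquatable<TagNode>
{
    private readonly List<TagNode> children = new List<TagNode>();

    public string Val { get; }

    public TagNode(string val) { Val = val; }

    public TagNode AddChild(string val)
    {
        var child = new TagNode(val);
        children.Add(child);
        return child;
    }

    public bool Equals(TagNode other)
    {
        if (ReferenceEquals(this, other)) return true;
        if (other is null) return false;
        // SequenceEqual uses the default comparer, which calls this method for each child pair.
        return Val == other.Val && children.SequenceEqual(other.children);
    }

    public override bool Equals(object obj) => Equals(obj as TagNode);

    // Only Val is hashed: equal trees always share a root value, so the contract holds.
    public override int GetHashCode() => Val?.GetHashCode() ?? 0;
}

public static class TagNodeDemo
{
    public static int CountDistinct()
    {
        var test1 = new TagNode("S");
        test1.AddChild("A").AddChild("B");
        test1.AddChild("C");

        var test2 = new TagNode("S");
        test2.AddChild("A").AddChild("B");
        test2.AddChild("C");

        var test3 = new TagNode("S");
        test3.AddChild("A");
        test3.AddChild("B");

        return new[] { test1, test2, test3 }.Distinct().Count();
    }
}
```

All three roots hash alike because they share the value "S", so Distinct falls through to Equals, which separates test3 from the other two and yields 2.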
see https://msdn.microsoft.com/en-us/library/bb348436(v=vs.100).aspx
If you want to return distinct elements from sequences of objects of some custom data type, you have to implement the IEquatable generic interface in the class. The following code example shows how to implement this interface in a custom data type and provide GetHashCode and Equals methods.
You need to implement IEquatable for TAGNode.
Try the following for GetHashCode. I updated the method below to make it more robust; I was afraid the original answer might not create unique hash values.
private int GetHashCode(TAGNode node)
{
string hash = node.Val;
foreach(TAGNode child in node.childs)
{
hash += GetHashStr(child);
}
return hash.GetHashCode();
}
private string GetHashStr(TAGNode node)
{
string hash = node.Val;
foreach (TAGNode child in node.childs)
{
hash += ":" + GetHashStr(child);
}
return hash;
}
I have the following class
public class ModInfo : IEquatable<ModInfo>
{
public int ID { get; set; }
public string MD5 { get; set; }
public bool Equals(ModInfo other)
{
return other.MD5.Equals(MD5);
}
public override int GetHashCode()
{
return MD5.GetHashCode();
}
}
I load some data into a list of that class using a method like this:
public void ReloadEverything() {
var beforeSort = new List<ModInfo>();
// Bunch of loading from local sqlite database.
// not included since it's really boring to look at
var modinfo = beforeSort.OrderBy(m => m.ID).AsEnumerable().Distinct().ToList();
}
The problem is that the Distinct() call doesn't seem to do its job. There are still objects that are equal to each other.
According to this article: https://msdn.microsoft.com/en-us/library/vstudio/bb348436%28v=vs.100%29.aspx
that is how you are supposed to make Distinct work; however, it doesn't seem to be calling the Equals method on the ModInfo object.
What could be causing this to happen?
Example values:
modinfo[0]: id=2069, MD5 =0AAEBF5D2937BDF78CB65807C0DC047C
modinfo[1]: id=2208, MD5 = 0AAEBF5D2937BDF78CB65807C0DC047C
I don't care which value gets chosen, they are likely to be the same anyway since the md5 value is the same.
You also need to override Object.Equals, not just implement IEquatable.
If you add this to your class:
public override bool Equals(object other)
{
ModInfo mod = other as ModInfo;
if (mod != null)
return Equals(mod);
return false;
}
It should work.
See this article for more info: Implementing IEquatable Properly
EDIT: Okay, here's a slightly different implementation based on best practices with GetHashCode.
public class ModInfo : IEquatable<ModInfo>
{
public int ID { get; set; }
public string MD5 { get; set; }
public bool Equals(ModInfo other)
{
if (other == null) return false;
return (this.MD5.Equals(other.MD5));
}
public override int GetHashCode()
{
unchecked
{
int hash = 13;
hash = (hash * 7) + MD5.GetHashCode();
return hash;
}
}
public override bool Equals(object obj)
{
ModInfo other = obj as ModInfo;
if (other != null)
{
return Equals(other);
}
else
{
return false;
}
}
}
You can verify it:
ModInfo mod1 = new ModInfo {ID = 1, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};
ModInfo mod2 = new ModInfo {ID = 2, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};
// You should get true here
bool areEqual = mod1.Equals(mod2);
List<ModInfo> mods = new List<ModInfo> {mod1, mod2};
// You should get 1 result here
mods = mods.Distinct().ToList();
What's with those specific numbers in GetHashCode?
Add
public override bool Equals(object other)
{
// "as" yields null for other types, so Equals(ModInfo) must tolerate a null argument.
return this.Equals(other as ModInfo);
}
Also see here the recommendations how to implement the equality members: https://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx
I have the following class
public class ResourceInfo
{
public string Id { get; set; }
public string Url { get; set; }
}
which contains information about some resource.
Now I need to be able to check whether two such resources are equal according to the following scenario (I've implemented the IEquatable interface):
public class ResourceInfo : IEquatable<ResourceInfo>
{
public string Id { get; set; }
public string Url { get; set; }
public bool Equals(ResourceInfo other)
{
if (other == null)
return false;
// Try to match by Id
if (!string.IsNullOrEmpty(Id) && !string.IsNullOrEmpty(other.Id))
{
return string.Equals(Id, other.Id, StringComparison.InvariantCultureIgnoreCase);
}
// Match by Url if can't match by Id
return string.Equals(Url, other.Url, StringComparison.InvariantCultureIgnoreCase);
}
}
Usage: oneResource.Equals(otherResource). And everything is just fine. But some time has passed and now I need to use this equality comparison in a LINQ query.
As a result I need to implement separate Equality comparer which looks like this:
class ResourceInfoEqualityComparer : IEqualityComparer<ResourceInfo>
{
public bool Equals(ResourceInfo x, ResourceInfo y)
{
if (x == null || y == null)
return object.Equals(x, y);
return x.Equals(y);
}
public int GetHashCode(ResourceInfo obj)
{
if (obj == null)
return 0;
return obj.GetHashCode();
}
}
Seems to be OK: it performs some null checks and then uses the class's own equality logic. But then I need to implement the GetHashCode method in the ResourceInfo class, and that is where I have a problem.
I don't know how to do this correctly without changing the class itself.
At first glance, the following example can work
public override int GetHashCode()
{
// Try to get hashcode from Id
if(!string.IsNullOrEmpty(Id))
return Id.GetHashCode();
// Try to get hashcode from url
if(!string.IsNullOrEmpty(Url))
return Url.GetHashCode();
// Return zero
return 0;
}
But this implementation is not very good.
GetHashCode should match the Equals method : if two objects are equal, then they should have the same hashcode, right? But my Equals method uses two objects to compare them. Here is the usecase, where you can see the problem itself:
var resInfo1 = new ResourceInfo()
{
Id = null,
Url = "http://res.com/id1"
};
var resInfo2 = new ResourceInfo()
{
Id = "id1",
Url = "http://res.com/id1"
};
So, what happens when we invoke the Equals method? Obviously they will be equal, because Equals will try to match them by Id and fail, then fall back to matching by Url, where the values are the same. As intended.
resInfo1.Equals(resInfo2) -> true
But then, if they are equal, they should have the same hash codes:
var hash1 = resInfo1.GetHashCode(); // -263327347
var hash2 = resInfo2.GetHashCode(); // 1511443452
hash1 == hash2 -> false
In short, the problem is that the Equals method decides which field to use for the comparison by looking at two different objects, while the GetHashCode method has access to only one object.
Is there a way to implement it correctly, or do I just have to change my class to avoid such situations?
Many thanks.
Your approach to equality fundamentally breaks the specifications in Object.Equals.
In particular, consider:
var x = new ResourceInfo { Id = null, Url = "a" };
var y = new ResourceInfo { Id = "yz", Url = "a" };
var z = new ResourceInfo { Id = "yz", Url = "b" };
Here, x.Equals(y) would be true, and y.Equals(z) would be true - but x.Equals(z) would be false. That is specifically prohibited in the documentation:
If (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
You'll need to redesign, basically.
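The broken transitivity can be reproduced with a short sketch. It copies the Equals logic from the question; TransitivityDemo and Check are hypothetical names:

```csharp
using System;

// Same equality rule as in the question: match by Id when both have one, else by Url.
public class ResourceInfo : IEquatable<ResourceInfo>
{
    public string Id { get; set; }
    public string Url { get; set; }

    public bool Equals(ResourceInfo other)
    {
        if (other == null)
            return false;
        if (!string.IsNullOrEmpty(Id) && !string.IsNullOrEmpty(other.Id))
            return string.Equals(Id, other.Id, StringComparison.InvariantCultureIgnoreCase);
        return string.Equals(Url, other.Url, StringComparison.InvariantCultureIgnoreCase);
    }
}

public static class TransitivityDemo
{
    // Returns the three pairwise comparisons for the x, y, z example.
    public static (bool xy, bool yz, bool xz) Check()
    {
        var x = new ResourceInfo { Id = null, Url = "a" };
        var y = new ResourceInfo { Id = "yz", Url = "a" };
        var z = new ResourceInfo { Id = "yz", Url = "b" };

        // x vs y falls back to Url, y vs z matches on Id, x vs z falls back to Url again.
        return (x.Equals(y), y.Equals(z), x.Equals(z));
    }
}
```

Check() returns (true, true, false). No GetHashCode can be consistent with this: x and y must hash alike, y and z must hash alike, so x and z would share a hash code even though they are not equal, and a hash-based collection could still treat them inconsistently depending on insertion order.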