How does HashSet compare elements for equality? - c#

I have a class that is IComparable:
public class a : IComparable
{
public int Id { get; set; }
public string Name { get; set; }
public a(int id)
{
this.Id = id;
}
public int CompareTo(object obj)
{
return this.Id.CompareTo(((a)obj).Id);
}
}
When I add objects of this class to a hash set:
a a1 = new a(1);
a a2 = new a(2);
HashSet<a> ha = new HashSet<a>();
ha.Add(a1);
ha.Add(a2);
ha.Add(a1);
Everything is fine and ha.Count is 2, but:
a a1 = new a(1);
a a2 = new a(2);
HashSet<a> ha = new HashSet<a>();
ha.Add(a1);
ha.Add(a2);
ha.Add(new a(1));
Now ha.Count is 3.
Why doesn't HashSet respect a's CompareTo method?
Is HashSet the best way to have a list of unique objects?

It uses an IEqualityComparer<T> (EqualityComparer<T>.Default unless you specify a different one on construction).
When you add an element to the set, it will find the hash code using IEqualityComparer<T>.GetHashCode, and store both the hash code and the element (after checking whether the element is already in the set, of course).
To look an element up, it will first use the IEqualityComparer<T>.GetHashCode to find the hash code, then for all elements with the same hash code, it will use IEqualityComparer<T>.Equals to compare for actual equality.
That means you have two options:
Pass a custom IEqualityComparer<T> into the constructor. This is the best option if you can't modify the T itself, or if you want a non-default equality relation (e.g. "all users with a negative user ID are considered equal"). This is almost never implemented on the type itself (i.e. Foo doesn't implement IEqualityComparer<Foo>) but in a separate type which is only used for comparisons.
Implement equality in the type itself, by overriding GetHashCode and Equals(object). Ideally, implement IEquatable<T> in the type as well, particularly if it's a value type. These methods will be called by the default equality comparer.
Note how none of this is in terms of an ordered comparison - which makes sense, as there are certainly situations where you can easily specify equality but not a total ordering. This is all the same as Dictionary<TKey, TValue>, basically.
If you want a set which uses ordering instead of just equality comparisons, you should use SortedSet<T> from .NET 4 - which allows you to specify an IComparer<T> instead of an IEqualityComparer<T>. This will use IComparer<T>.Compare - which will delegate to IComparable<T>.CompareTo or IComparable.CompareTo if you're using Comparer<T>.Default.
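As a minimal sketch of the difference, assume a hypothetical Item type (not from the question) that defines both equality and ordering on its Id. HashSet<Item> only ever calls Equals and GetHashCode, while SortedSet<Item> only calls CompareTo:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: equality and ordering both use Id only.
public sealed class Item : IEquatable<Item>, IComparable<Item>
{
    public int Id { get; }
    public Item(int id) { Id = id; }

    // Picked up by EqualityComparer<Item>.Default, and therefore by HashSet<Item>.
    public bool Equals(Item other) => other != null && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as Item);
    public override int GetHashCode() => Id.GetHashCode();

    // Picked up by Comparer<Item>.Default, and therefore by SortedSet<Item>.
    public int CompareTo(Item other) => other == null ? 1 : Id.CompareTo(other.Id);
}

public static class SetDemo
{
    public static int HashSetCount()
    {
        // The second Item(1) is rejected because Equals/GetHashCode match.
        var set = new HashSet<Item> { new Item(1), new Item(2), new Item(1) };
        return set.Count;
    }

    public static int SortedSetCount()
    {
        // The second Item(1) is rejected because CompareTo returns 0.
        var set = new SortedSet<Item> { new Item(1), new Item(2), new Item(1) };
        return set.Count;
    }
}
```

Both methods return 2. If the Equals and GetHashCode members were removed, HashSetCount would return 3 while SortedSetCount would still return 2.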

Here's clarification on a part of the answer that's been left unsaid: The object type of your HashSet<T> doesn't have to implement IEqualityComparer<T> but instead just has to override Object.GetHashCode() and Object.Equals(Object obj).
Instead of this:
public class a : IEqualityComparer<a>
{
public int GetHashCode(a obj) { /* Implementation */ }
public bool Equals(a obj1, a obj2) { /* Implementation */ }
}
You do this:
public class a
{
public override int GetHashCode() { /* Implementation */ }
public override bool Equals(object obj) { /* Implementation */ }
}
It is subtle, but this tripped me up for the better part of a day trying to get HashSet to function the way it is intended. And like others have said, HashSet<a> will end up calling a.GetHashCode() and a.Equals(obj) as necessary when working with the set.

HashSet uses Equals and GetHashCode().
CompareTo is for ordered sets.
If you want unique objects, but you don't care about their iteration order, HashSet<T> is typically the best choice.

The HashSet constructor can take an object that implements IEqualityComparer<T>, which is then used when adding new objects.
If you want to use the HashSet methods without passing a comparer, you need to override Equals and GetHashCode.
using System;
using System.Collections.Generic;
using System.Linq;
namespace HashSet
{
public class Employe
{
public Employe() {
}
public string Name { get; set; }
public override string ToString() {
return Name;
}
public override bool Equals(object obj) {
// Pattern match instead of casting, so null or a different type returns false
return obj is Employe other && this.Name == other.Name;
}
public override int GetHashCode() {
return this.Name.GetHashCode();
}
}
class EmployeComparer : IEqualityComparer<Employe>
{
public bool Equals(Employe x, Employe y)
{
return x.Name.Trim().ToLower().Equals(y.Name.Trim().ToLower());
}
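public int GetHashCode(Employe obj)
{
// Must normalize the name the same way Equals does, otherwise equal names can land in different buckets
return obj.Name.Trim().ToLower().GetHashCode();
}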
}
class Program
{
static void Main(string[] args)
{
HashSet<Employe> hashSet = new HashSet<Employe>(new EmployeComparer());
hashSet.Add(new Employe() { Name = "Nik" });
hashSet.Add(new Employe() { Name = "Rob" });
hashSet.Add(new Employe() { Name = "Joe" });
Display(hashSet);
hashSet.Add(new Employe() { Name = "Rob" });
Display(hashSet);
HashSet<Employe> hashSetB = new HashSet<Employe>(new EmployeComparer());
hashSetB.Add(new Employe() { Name = "Max" });
hashSetB.Add(new Employe() { Name = "Solomon" });
hashSetB.Add(new Employe() { Name = "Werter" });
hashSetB.Add(new Employe() { Name = "Rob" });
Display(hashSetB);
var union = hashSet.Union<Employe>(hashSetB).ToList();
Display(union);
var inter = hashSet.Intersect<Employe>(hashSetB).ToList();
Display(inter);
var except = hashSet.Except<Employe>(hashSetB).ToList();
Display(except);
Console.ReadKey();
}
static void Display(HashSet<Employe> hashSet)
{
if (hashSet.Count == 0)
{
Console.Write("Collection is Empty");
return;
}
foreach (var item in hashSet)
{
Console.Write("{0}, ", item);
}
Console.Write("\n");
}
static void Display(List<Employe> list)
{
if (list.Count == 0)
{
Console.WriteLine("Collection is Empty");
return;
}
foreach (var item in list)
{
Console.Write("{0}, ", item);
}
Console.Write("\n");
}
}
}

I came here looking for answers, but found that all the answers had too much info or not enough, so here is my answer...
Since you've created a custom class you need to implement GetHashCode and Equals. In this example I will use a class Student instead of a because it's easier to follow and doesn't violate any naming conventions. Here is what the implementations look like:
public override bool Equals(object obj)
{
return obj is Student student && Id == student.Id;
}
public override int GetHashCode()
{
return HashCode.Combine(Id);
}
I stumbled across this article from Microsoft that gives an incredibly easy way to implement these if you're using Visual Studio. In case it's helpful to anyone else, here are complete steps for using a custom data type in a HashSet using Visual Studio:
Given a class Student with two simple properties and a constructor:
public class Student
{
public int Id { get; set; }
public string Name { get; set; }
public Student(int id)
{
this.Id = id;
}
}
To Implement IComparable, add : IComparable<Student> like so:
public class Student : IComparable<Student>
You will see a red squiggly appear with an error message saying your class doesn't implement IComparable. Click on suggestions or press Alt+Enter and use the suggestion to implement it.
You will see the method generated. You can then write your own implementation like below:
public int CompareTo(Student student)
{
return this.Id.CompareTo(student.Id);
}
In the above implementation only the Id property is compared, name is ignored. Next right-click in your code and select Quick actions and refactorings, then Generate Equals and GetHashCode
A window will pop up where you can select which properties to use for hashing and even implement IEquatable if you'd like:
Here is the generated code:
public class Student : IComparable<Student>, IEquatable<Student> {
...
public override bool Equals(object obj)
{
return Equals(obj as Student);
}
public bool Equals(Student other)
{
return other != null && Id == other.Id;
}
public override int GetHashCode()
{
return HashCode.Combine(Id);
}
}
Now if you try to add a duplicate item like shown below it will be skipped:
static void Main(string[] args)
{
Student s1 = new Student(1);
Student s2 = new Student(2);
HashSet<Student> hs = new HashSet<Student>();
hs.Add(s1);
hs.Add(s2);
hs.Add(new Student(1)); //will be skipped
hs.Add(new Student(3));
}
You can now use .Contains like so:
for (int i = 0; i <= 4; i++)
{
if (hs.Contains(new Student(i)))
{
Console.WriteLine($@"Set contains student with Id {i}");
}
else
{
Console.WriteLine($@"Set does NOT contain a student with Id {i}");
}
}
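Output (assuming hs is the set filled in the previous snippet):
Set does NOT contain a student with Id 0
Set contains student with Id 1
Set contains student with Id 2
Set contains student with Id 3
Set does NOT contain a student with Id 4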

Related

C# Merge two Lists of the same values

This is my Clients class:
public class Clients
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
}
I want to make a new list which contains the clients that appear in both List A and List B.
For example:
List A - John, Jonathan, James ....
List B - Martha, Jane, Jonathan ....
Unsubscribers - Jonathan
public static List<Clients> SameClients(List<Clients> A, List<Clients> B)
{
List<Clients> Unsubscribers = new List<Clients>();
Unsubscribers = A.Intersect(B).ToList();
return Unsubscribers;
}
However, for some reason I get an empty list and I have no idea what's wrong.
The problem is that Equals and GetHashCode are used to compare the objects. You can override these two methods and provide your own implementation based on your needs; there is already an answer below covering how to override them.
However, I normally prefer to keep my entities/models (or whatever you want to call them) very simple and keep comparison details out of them. In that case, you can implement an IEqualityComparer<T> and use the overload of Intersect that takes an IEqualityComparer<T>.
Here's an example implementation of IEqualityComparer<Clients> based only on the Name property:
public class ClientNameEqualityComparer : IEqualityComparer<Clients>
{
public bool Equals(Clients c1, Clients c2)
{
if (c2 == null && c1 == null)
return true;
else if (c1 == null || c2 == null)
return false;
else if(c1.Name == c2.Name)
return true;
else
return false;
}
public int GetHashCode(Clients c)
{
return c.Name.GetHashCode();
}
}
Basically, the implementation above only cares about the Name property, if two instances of Clients have the same value for the Name property, then they are considered equal.
Now you can do the following:
A.Intersect(B, new ClientNameEqualityComparer()).ToList();
And that will produce the results you are expecting...
Intersect uses GetHashCode and Equals by default, but you haven't overridden them, so Object.Equals is used, which just compares references. Since all your Clients instances are created with new, they are separate instances even if they have equal values. That's why Intersect "thinks" that there are no common clients.
So you have several options.
Implement a custom IEqualityComparer<Clients> and pass it to Intersect (or many other LINQ methods). This has the advantage that you can implement different comparers for different requirements without modifying the original class.
Let Clients override Equals and GetHashCode, and/or
let Clients implement IEquatable<Clients>.
For example (showing the last two, because the other answer already showed IEqualityComparer<T>):
public class Clients : IEquatable<Clients>
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
public override bool Equals(object obj)
{
return obj is Clients && this.Equals((Clients)obj);
}
public bool Equals(Clients other)
{
// A null argument is never equal; otherwise compare both properties
return other != null
&& Email == other.Email
&& Name == other.Name;
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + (Email?.GetHashCode() ?? 0);
hash = hash * 23 + (Name?.GetHashCode() ?? 0);
return hash;
}
}
}
Worth reading:
Differences between IEquatable<T>, IEqualityComparer<T>, and overriding .Equals() when using LINQ on a custom object collection?

Define equality based on class property to be used in HashSet

I have the following class:
public class OrderRule {
public OrderDirection Direction { get; set; }
public String Property { get; set; }
}
And a HashSet of it:
HashSet<OrderRule> rules = // ...
I need two OrderRules to be considered equal if their Property is equal.
How can I do this?
Since the specification for this equality is not coming from the OrderRule class, but your collection, use the constructor overload of the HashSet that accepts an IEqualityComparer.
public class MyOrderRuleComparer : EqualityComparer<OrderRule>
{
private IEqualityComparer<string> _c = EqualityComparer<string>.Default;
public override bool Equals(OrderRule l, OrderRule r)
{
return _c.Equals(l.Property, r.Property);
}
public override int GetHashCode(OrderRule rule)
{
return _c.GetHashCode(rule.Property);
}
}
...
HashSet<OrderRule> rules = new HashSet<OrderRule>(new MyOrderRuleComparer());
Please note that by using OrderRule.Property as a key, you imply that it must not change after the instance is added to the set. This is why implementing IEquatable<OrderRule> could be the best approach depending on your developer team.
If I add two OrderRules with same Property but different Direction I
still need both to be considered equal
You could override Equals and GetHashCode and/or implement IEquatable<OrderRule>:
public class OrderRule: IEquatable<OrderRule>
{
public OrderRule(string property)
{
this.Property = property;
}
public OrderDirection Direction { get; set; }
public String Property { get; }
public OrderRule Rule { get; set; }
public bool Equals(OrderRule other)
{
return (other != null && other.Property == this.Property);
}
public override int GetHashCode()
{
return Property?.GetHashCode() ?? int.MinValue;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
if(ReferenceEquals(this, obj))
return true;
OrderRule other = obj as OrderRule;
return this.Equals(other);
}
}
Note that I've made the property read-only because you should not be able to modify a property or field that is used in GetHashCode.
Why? "Guideline: the integer returned by GetHashCode should never change.
Ideally, the hash code of a mutable object should be computed from only fields which cannot mutate, and therefore the hash value of an object is the same for its entire lifetime."
The hash code is used, for example, by a Dictionary or HashSet to decide where the object is stored. If it changed after the object was added, the object could no longer be found.
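To illustrate the guideline, here is a small sketch using a hypothetical MutableKey type (not part of the question) whose hash code is derived from a settable property. Once the property changes while the object sits in a HashSet, the set can no longer locate it:

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration: the hash code depends on a mutable property.
public class MutableKey
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    // int.GetHashCode() returns the value itself, so changing Id changes the hash.
    public override int GetHashCode() => Id.GetHashCode();
}

public static class MutationDemo
{
    public static (bool FoundBefore, bool FoundAfter, int Count) Run()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        bool foundBefore = set.Contains(key); // looked up with the hash it was stored under

        key.Id = 2;                           // hash changes while the object is in the set

        bool foundAfter = set.Contains(key);  // looked up with the new hash, so it is missed

        return (foundBefore, foundAfter, set.Count);
    }
}
```

Run returns (true, false, 1): the object is still in the set and still counted, but Contains and Remove can no longer reach it. Making the property read-only, as in the class above, rules this situation out.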

Benefits of using IEquatable

I've been researching IEqualityComparer and IEquatable.
From posts such as What is the difference between IEqualityComparer<T> and IEquatable<T>? the difference between the two is now clear. "IEqualityComparer is an interface for an object that performs the comparison on two objects of the type T."
Following the example at https://msdn.microsoft.com/en-us/library/ms132151(v=vs.110).aspx the purpose of IEqualityComparer is clear and simple.
I've followed the example at https://dotnetcodr.com/2015/05/05/implementing-the-iequatable-of-t-interface-for-object-equality-with-c-net/ to work out how to use it and I get the following code:
class clsIEquitable
{
public static void mainLaunch()
{
Person personOne = new Person() { Age = 6, Name = "Eva", Id = 1 };
Person personTwo = new Person() { Age = 7, Name = "Eva", Id = 1 };
//If Person didn't implement IEquatable<Person> or override Equals, Equals would compare references.
//This would then be false, because the two objects are stored in different locations.
//With IEquatable<Person> implemented, the objects are compared by Id instead.
bool p = personOne.Equals(personTwo);
bool o = personOne.Id == personTwo.Id;
//Here an object-typed reference is compared with a Person, which calls Equals(object).
//Without the override of Equals(object) this would return false; with it, this works.
object personThree = new Person() { Age = 7, Name = "Eva", Id = 1 };
bool p2 = personOne.Equals(personThree);
Console.WriteLine("Equatable Check: {0}", p);
}
}
public class Person : IEquatable<Person>
{
public int Id { get; set; }
public string Name { get; set; }
public int Age { get; set; }
public bool Equals(Person other)
{
if (other == null) return false;
return Id == other.Id;
}
//These are to support creating an object and comparing it to person rather than comparing person to person
public override bool Equals(object obj)
{
if (obj is Person)
{
Person p = (Person)obj;
return Equals(p);
}
return false;
}
public override int GetHashCode()
{
return Id;
}
}
My question is WHY would I use it? It seems like a lot of extra code compared to the simple version below (bool o):
//By using IEquatable on class it compares the objects directly
bool p = personOne.Equals(personTwo);
bool o = personOne.Id == personTwo.Id;
IEquatable<T> is used by generic collections to determine equality.
From this msdn article https://msdn.microsoft.com/en-us/library/ms131187.aspx
The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.
This provides an added benefit when using structs, since calling the IEquatable<T> equals method does not box the struct like calling the base object equals method would.
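A minimal sketch of the struct case, using a hypothetical Point type that is not part of the question. The strongly typed Equals(Point) is what the default equality comparer calls, so List<Point>.Contains and IndexOf compare values without boxing them; Equals(object) remains as a fallback:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type for illustration.
public readonly struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed: EqualityComparer<Point>.Default calls this without boxing.
    public bool Equals(Point other) => X == other.X && Y == other.Y;

    // Fallback for callers that only hold an object; the argument arrives boxed.
    public override bool Equals(object obj) => obj is Point p && Equals(p);

    public override int GetHashCode() => unchecked((X * 397) ^ Y);
}

public static class PointDemo
{
    public static bool ListContains()
    {
        var points = new List<Point> { new Point(1, 2), new Point(3, 4) };
        return points.Contains(new Point(3, 4)); // goes through Equals(Point)
    }

    public static int ListIndexOf()
    {
        var points = new List<Point> { new Point(1, 2), new Point(3, 4) };
        return points.IndexOf(new Point(3, 4));
    }
}
```

So the comparison bool o = personOne.Id == personTwo.Id works for one hand-written check, but collections have no way of knowing about it; implementing IEquatable<T> is how a type tells them which rule to apply.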

Distinct List of object in C#

I need to get the distinct items of a list of objects, but NOT only by ID, because sometimes two different objects have the same ID.
I have class:
public class MessageDTO
{
public MessageDTO(MessageDTO a)
{
this.MsgID = a.MsgID;
this.Subject = a.Subject;
this.MessageText = a.MessageText;
this.ViewedDate = a.ViewedDate;
this.CreatedDate = a.CreatedDate;
}
public int? MsgID { get; set; }
public string Subject { get; set; }
public string MessageText { get; set; }
public System.DateTime? ViewedDate { get; set; }
public System.DateTime? CreatedDate { get; set; }
}
How can I get the distinct items of:
List<MessageDTO> example;
Thanks
Use LINQ.
public class MessageDTOEqualityComparer : EqualityComparer<MessageDTO>
{
public override bool Equals(MessageDTO a, MessageDTO b)
{
// Your logic, which checks each message's properties on whatever
// grounds you need to deem them "equal". In your case, it sounds like
// this is just a matter of comparing every property and returning
// false as soon as one differs.
if (ReferenceEquals(a, b)) return true;
if (a == null || b == null) return false;
return a.MsgID == b.MsgID
&& a.Subject == b.Subject
&& a.MessageText == b.MessageText
&& a.ViewedDate == b.ViewedDate
&& a.CreatedDate == b.CreatedDate;
}
public override int GetHashCode(MessageDTO message)
{
// I'd probably just hash the message ID, assuming it doesn't
// collide too often; it has to be equal on any two messages
// that Equals treats as equal.
return message.MsgID.GetHashCode();
}
}
Then
return nonDistinct.Distinct(new MessageDTOEqualityComparer());
You can also avoid the need for an extra class by overriding object.Equals(object) and object.GetHashCode() and calling the empty overload of nonDistinct.Distinct(). Make sure you recognize the implications of this decision, though: for instance, those will then become the equality-testing functions in all non-explicit scopes of their use. This might be perfect and exactly what you need, or it could lead to some unexpected consequences. Just make sure you know what you're getting into.
If you want to use other properties, you should implement the IEqualityComparer<T> interface. More on MSDN.
class MsgComparer : IEqualityComparer<MessageDTO>
{
public bool Equals(MessageDTO x, MessageDTO y)
{
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(MessageDTO m)
{
// it must be implemented as well
}
}
Then:
example.Distinct(new MsgComparer());
You could also override Equals in the MessageDTO class:
class MessageDTO
{
// rest of members
public override bool Equals(object obj)
{
// your stuff. See: http://msdn.microsoft.com/en-us/library/ms173147%28v=vs.80%29.aspx
}
public override int GetHashCode()
{
}
}
Then it's enough:
example.Distinct();
You could use the extension method DistinctBy from the MoreLinq library:
string[] source = { "first", "second", "third", "fourth", "fifth" };
var distinct = source.DistinctBy(word => word.Length);
I recommend using the solution of @Matthew Haugen.
In case you don't want to create a new class for that, there is a way to use LINQ: group your list by the distinct field(s), then select the first item of each group. For example:
example.GroupBy(e => new { e.MsgID, e.Subject }).Select(grp => grp.FirstOrDefault());

Can IEquatable compare custom objects having a list property of other custom objects?

I'd like to compare two custom class objects of the same type. The custom class being compared has a List property which is filled with items of another custom type. Is this possible by implementing IEquatable?
I couldn't figure out how to make this work by modifying MSDN's code to compare class objects containing List properties of a custom type.
I did successfully derive from the EqualityComparer class to make a separate comparison class (code below), but I'd like to implement the comparison ability in the actual classes being compared. Here's what I have so far:
EDIT: This doesn't work after all. My apologies - I've been working on this awhile and I may have pasted incorrect example code. I'm working on trying to find my working solution...
class Program
{
static void Main(string[] args)
{
// Test the ContractComparer.
Contract a = new Contract("Contract X", new List<Commission>() { new Commission(1), new Commission(2), new Commission(3) });
Contract b = new Contract("Contract X", new List<Commission>() { new Commission(1), new Commission(2), new Commission(3) });
ContractComparer comparer = new ContractComparer();
Console.WriteLine(comparer.Equals(a, b));
// Output returns True. I can't get this to return
// True when I inherit IEquatable in my custom classes
// if I include the list property ("Commissions") in my
// comparison.
Console.ReadLine();
}
}
public class Contract
{
public string Name { get; set; }
public List<Commission> Commissions { get; set; }
public Contract(string name, List<Commission> commissions)
{
this.Name = name;
this.Commissions = commissions;
}
}
public class Commission
{
public int ID;
public Commission(int id)
{
this.ID = id;
}
}
public class ContractComparer : IEqualityComparer<Contract>
{
public bool Equals(Contract a, Contract b)
{
//Check whether the objects are the same object.
if (Object.ReferenceEquals(a, b)) return true;
//Check whether the contracts' properties are equal.
return a != null && b != null && a.Name.Equals(b.Name) && a.Commissions.Equals(b.Commissions);
}
public int GetHashCode(Contract obj)
{
int hashName = obj.Name.GetHashCode();
int hashCommissions = obj.Commissions.GetHashCode();
return hashName ^ hashCommissions;
}
}
You have to implement some kind of comparer for Commission, e.g. by implementing Commission : IEquatable<Commission>, then use it:
... && a.Commissions.SequenceEqual(b.Commissions)
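A sketch of how that could look when the comparison lives in the classes themselves, as the question asks. It reuses the Contract and Commission shapes from the question and assumes that Commissions is never null and that the order of commissions matters; both are assumptions, not something the question states:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Commission : IEquatable<Commission>
{
    public int ID;
    public Commission(int id) { ID = id; }

    public bool Equals(Commission other) => other != null && ID == other.ID;
    public override bool Equals(object obj) => Equals(obj as Commission);
    public override int GetHashCode() => ID;
}

public class Contract : IEquatable<Contract>
{
    public string Name { get; set; }
    public List<Commission> Commissions { get; set; }

    public Contract(string name, List<Commission> commissions)
    {
        Name = name;
        Commissions = commissions;
    }

    public bool Equals(Contract other)
    {
        if (other == null) return false;
        if (ReferenceEquals(this, other)) return true;
        // SequenceEqual walks both lists in order and compares each pair
        // with Commission.Equals, instead of comparing the list references.
        return Name == other.Name
            && Commissions.SequenceEqual(other.Commissions);
    }

    public override bool Equals(object obj) => Equals(obj as Contract);

    public override int GetHashCode()
    {
        // Combine the element hashes; List<T>.GetHashCode() is reference-based
        // and would differ for two lists with equal contents.
        unchecked
        {
            int hash = Name?.GetHashCode() ?? 0;
            foreach (Commission c in Commissions)
                hash = hash * 31 + c.GetHashCode();
            return hash;
        }
    }
}
```

With this in place, a.Equals(b) is true for the two contracts built in Main above. Because the hash code is computed from mutable state, such objects should not be modified while they are stored in a HashSet or used as dictionary keys.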
