Define equality based on class property to be used in HashSet - c#

I have the following class:
public class OrderRule {
public OrderDirection Direction { get; set; }
public String Property { get; set; }
}
And a HashSet of it:
HashSet<OrderRule> rules = // ...
I need two OrderRules to be considered equal if their Property values are equal.
How can I do this?

Since the specification for this equality is not coming from the OrderRule class, but your collection, use the constructor overload of the HashSet that accepts an IEqualityComparer.
public class MyOrderRuleComparer : EqualityComparer<OrderRule>
{
private IEqualityComparer<string> _c = EqualityComparer<string>.Default;
public override bool Equals(OrderRule l, OrderRule r)
{
return _c.Equals(l.Property, r.Property);
}
public override int GetHashCode(OrderRule rule)
{
return _c.GetHashCode(rule.Property);
}
}
...
HashSet<OrderRule> rules = new HashSet<OrderRule>(new MyOrderRuleComparer());
Please note that by using OrderRule.Property as a key, you imply that it must not change after the instance is added to the set. This is why implementing IEquatable<OrderRule> could be the best approach depending on your developer team.
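As a quick check of the comparer approach, here is a minimal sketch that adds two rules sharing a Property but differing in Direction. The OrderDirection values and the PropertyComparer name are assumptions made for illustration; only the HashSet<T> constructor that takes an IEqualityComparer<T> comes from the answer above.

```csharp
using System;
using System.Collections.Generic;

public enum OrderDirection { Ascending, Descending } // assumed values

public class OrderRule
{
    public OrderDirection Direction { get; set; }
    public string Property { get; set; }
}

// Hypothetical comparer: two rules are equal when their Property matches.
public sealed class PropertyComparer : EqualityComparer<OrderRule>
{
    public override bool Equals(OrderRule l, OrderRule r)
    {
        if (ReferenceEquals(l, r)) return true;
        if (l is null || r is null) return false;
        return string.Equals(l.Property, r.Property, StringComparison.Ordinal);
    }

    // Null rules and null properties hash to 0 instead of throwing.
    public override int GetHashCode(OrderRule rule) =>
        rule?.Property is null ? 0 : StringComparer.Ordinal.GetHashCode(rule.Property);
}

public static class Program
{
    public static void Main()
    {
        var rules = new HashSet<OrderRule>(new PropertyComparer());
        bool first = rules.Add(new OrderRule { Property = "Name", Direction = OrderDirection.Ascending });
        bool second = rules.Add(new OrderRule { Property = "Name", Direction = OrderDirection.Descending });

        // The second Add is rejected because the comparer ignores Direction.
        Console.WriteLine($"{first} {second} {rules.Count}"); // True False 1
    }
}
```

The null handling is a design choice of this sketch; the comparer in the answer would throw on a null rule.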

If I add two OrderRules with same Property but different Direction I
still need both to be considered equal
You could override Equals and GetHashCode and/or implement IEquatable<OrderRule>:
public class OrderRule: IEquatable<OrderRule>
{
public OrderRule(string property)
{
this.Property = property;
}
public OrderDirection Direction { get; set; }
public String Property { get; }
public OrderRule Rule { get; set; }
public bool Equals(OrderRule other)
{
return (other != null && other.Property == this.Property);
}
public override int GetHashCode()
{
return Property?.GetHashCode() ?? int.MinValue;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
if(ReferenceEquals(this, obj))
return true;
OrderRule other = obj as OrderRule;
return this.Equals(other);
}
}
Note that I've made the property read-only because you should not be able to modify a property or field that is used in GetHashCode.
Why? "Guideline: the integer returned by GetHashCode should never change
Ideally, the hash code of a mutable object should be computed from only fields which cannot mutate, and therefore the hash value of an object is the same for its entire lifetime."
A Dictionary or HashSet, for example, uses this value to decide where to store the object. If the value changed after the object was added, the object could no longer be found.
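To make the guideline concrete, here is a small sketch of what can go wrong. MutableKey is a hypothetical class invented for this example; it deliberately hashes on a settable property.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose hash code depends on a mutable property.
public class MutableKey
{
    public string Name { get; set; }

    public override bool Equals(object obj) => obj is MutableKey k && k.Name == Name;

    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}

public static class Program
{
    public static void Main()
    {
        var key = new MutableKey { Name = "before" };
        var set = new HashSet<MutableKey> { key };
        Console.WriteLine(set.Contains(key)); // True

        // The hash code changes while the object is stored in the set.
        key.Name = "after";

        // The lookup now uses the new hash code, so the set is expected to miss
        // the object unless both strings happen to produce the same hash code.
        Console.WriteLine(set.Contains(key));

        // The object is still stored; it is just unreachable by lookup.
        Console.WriteLine(set.Count); // 1
    }
}
```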

Related

In C#, How can I create or overload an assignment operator to possibly assign two values at once?

This is probably a stupid question, but just in case....
We have a 3rd party package with weird models like:
public partial class CountingDevice
{
public int countingDeviceNo { get; set; }
public string countingDeviceName { get; set; }
public string obis { get; set; }
public int integralPart { get; set; }
public bool integralPartFieldSpecified;
public int fractionalPart { get; set; }
public bool fractionalPartFieldSpecified;
public double value { get; set; }
public bool valueFieldSpecified;
public bool offPeakFlag { get; set; }
public bool offPeakFlagFieldSpecified;
public ExpectedMeterReading expectedMeterReading { get; set; }
// snipped for brevity
}
You'll notice that sometimes there are pairs of fields like integralPart and integralPartFieldSpecified.
Here is the problem: If I simply assign some value to integralPart but do not set integralPartFieldSpecified = true, the value of integralPart will be completely ignored causing the solution to fail.
So when mapping our own models to this madness, I need to litter the code with constructs like:
if (IntegralPart != null)
{
countingDevice.integralPartFieldSpecified = true;
countingDevice.integralPart = (int)IntegralPart!;
}
Both in the interest of reducing lines of code and not stumbling over a minefield, I would like to do any one of the following:
A. Overload the = operator so it will automatically check for a property which is a boolean and has "Specified" concatenated to the current property's name. If such a property exists, it will be assigned true when the value is assigned; if not, then assignment will operate as normal. Ideally, it should be "smart" enough to assign "...Specified" to false if the value assigned is null/default/empty.
B. Create some custom operator which will do the same as A.
C. Create some method which I could invoke in a concise and preferably typesafe way to do the same.
Is this possible?
If so, how?
To make it clear: I need to build quite a few wrappers.
I don't want to repeat this logic for every field and worry about missing some fields which it applies to.
I want a generic way of assigning both fields at once if the "Specified" field exists and being able to do assignments in exactly the same way if it does not exist.
not stumbling over a minefield
Encapsulate the minefield.
If you don't control this 3rd party DTO then don't use it throughout your domain. Encapsulate or wrap the integration of this 3rd party tool within a black box that you control. Then throughout your domain use your models.
Within the integration component for this 3rd party system, simply map to/from your Domain Models and this 3rd party DTO. So this one extra line of code which sets a second field on the DTO only exists in that one place.
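A minimal sketch of such a mapping boundary might look like the following. The Meter domain model and the CountingDeviceMapper name are assumptions for illustration, and CountingDevice is a trimmed stand-in for the third-party class from the question.

```csharp
using System;

// Trimmed stand-in for the third-party DTO from the question.
public class CountingDevice
{
    public int integralPart { get; set; }
    public bool integralPartFieldSpecified;
    public double value { get; set; }
    public bool valueFieldSpecified;
}

// Hypothetical domain model: null means "not specified".
public class Meter
{
    public int? IntegralPart { get; set; }
    public double? Value { get; set; }
}

// The only place that knows about the "...FieldSpecified" convention.
public static class CountingDeviceMapper
{
    public static CountingDevice ToDto(Meter meter)
    {
        if (meter is null) throw new ArgumentNullException(nameof(meter));

        return new CountingDevice
        {
            integralPart = meter.IntegralPart.GetValueOrDefault(),
            integralPartFieldSpecified = meter.IntegralPart.HasValue,
            value = meter.Value.GetValueOrDefault(),
            valueFieldSpecified = meter.Value.HasValue,
        };
    }
}

public static class Program
{
    public static void Main()
    {
        CountingDevice dto = CountingDeviceMapper.ToDto(new Meter { IntegralPart = 42 });
        Console.WriteLine(dto.integralPartFieldSpecified); // True
        Console.WriteLine(dto.valueFieldSpecified);        // False
    }
}
```

The rest of the domain only ever sees Meter, so a forgotten flag can only be a bug in this one class.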
Another (expensive) solution would be to write a method that takes in an object, a property name, and the new property value. You can then use reflection to both set the property value for the specified property, as well as search for the bool field that you want to set (if it exists).
Note that you need to pass the correct type for the property. There's no compile-time checking that you're passing a double instead of a string for the value property, for example.
Below I've created an extension method on the object type to simplify calling the method in our main code (the method becomes a member of the object itself):
public static class Extensions
{
// Requires: using System.Reflection;
public static bool SetPropertyAndSpecified(this object obj,
string propertyName, object propertyValue)
{
// Argument validation left to user
// Check if 'obj' has specified 'propertyName'
// and set 'propertyValue' if it does
PropertyInfo prop = obj.GetType().GetProperty(propertyName,
BindingFlags.Public | BindingFlags.Instance);
if (prop != null && prop.CanWrite)
{
prop.SetValue(obj, propertyValue, null);
// Check for related "FieldSpecified" field
// and set it to 'true' if it exists
obj.GetType().GetField($"{propertyName}FieldSpecified",
BindingFlags.Public | BindingFlags.Instance)?.SetValue(obj, true);
return true;
}
return false;
}
}
After you add this class to your project, you can do something like:
static void Main(string[] args)
{
var counter = new CountingDevice();
// Note that 'valueFieldSpecified' and 'integralPartFieldSpecified'
// are set to 'false' on 'counter'
// Call our method to set some properties
counter.SetPropertyAndSpecified(nameof(counter.integralPart), 42);
counter.SetPropertyAndSpecified(nameof(counter.value), 69d);
// Now 'valueFieldSpecified' and 'integralPartFieldSpecified'
// are set to 'true' on 'counter'
}
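If the missing compile-time checking is a concern, one possible variation (not part of the answer above) is to pass the property as a lambda instead of a string. The sketch below assumes the same "<name>FieldSpecified" naming convention; SetWithSpecified is a hypothetical name, and CountingDevice is again a trimmed stand-in. It still pays the reflection cost at run time.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Trimmed stand-in for the third-party DTO from the question.
public class CountingDevice
{
    public string obis { get; set; }
    public int integralPart { get; set; }
    public bool integralPartFieldSpecified;
    public double value { get; set; }
    public bool valueFieldSpecified;
}

public static class SpecifiedExtensions
{
    // Hypothetical helper: the compiler checks that 'value' matches the property type.
    public static void SetWithSpecified<TTarget, TValue>(
        this TTarget target, Expression<Func<TTarget, TValue>> selector, TValue value)
        where TTarget : class
    {
        if (!(selector.Body is MemberExpression member) || !(member.Member is PropertyInfo prop))
            throw new ArgumentException("Selector must be a simple property access.", nameof(selector));

        prop.SetValue(target, value);

        // Set the companion flag only if a public bool field with the expected name exists.
        FieldInfo flag = typeof(TTarget).GetField(prop.Name + "FieldSpecified",
            BindingFlags.Public | BindingFlags.Instance);
        if (flag != null && flag.FieldType == typeof(bool))
            flag.SetValue(target, true);
    }
}

public static class Program
{
    public static void Main()
    {
        var device = new CountingDevice();
        device.SetWithSpecified(d => d.integralPart, 42);
        device.SetWithSpecified(d => d.value, 69d);
        device.SetWithSpecified(d => d.obis, "demo"); // no companion field: plain assignment

        Console.WriteLine(device.integralPartFieldSpecified); // True
        Console.WriteLine(device.valueFieldSpecified);        // True
        // device.SetWithSpecified(d => d.value, "text");     // would not compile
    }
}
```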
You cannot overload the = operator in C#.
You can just use custom properties and set the "FieldSpecified" fields in the setters e.g.
private int _integralPart;
public int integralPart
{
get { return _integralPart; }
set
{
_integralPart = value;
integralPartFieldSpecified = true;
}
}
public bool integralPartFieldSpecified;
Update
If you want a generic solution you can use a generic class for properties that you want to achieve the specified behaviour with e.g.
public class ValueWithSpecifiedCheck<T>
{
private T _fieldValue;
public T FieldValue
{
get
{
return _fieldValue;
}
set
{
_fieldValue = value;
FieldSpecified = true;
}
}
public bool FieldSpecified { get; set; }
}
public class Data
{
public ValueWithSpecifiedCheck<int> IntegralPart { get; set; }
}
Then the class/property would be used as following:
public static void Main()
{
var data = new Data();
data.IntegralPart = new ValueWithSpecifiedCheck<int>();
data.IntegralPart.FieldValue = 7;
Console.WriteLine(data.IntegralPart.FieldSpecified);// Prints true
}
If you implement a generic solution and add implicit conversion operators, it's quite convenient to use.
Here's a sample Optional<T> struct (I made it a readonly struct to ensure immutable mechanics):
public readonly struct Optional<T> where T : struct
{
public Optional(T value)
{
_value = value;
}
public static implicit operator T(Optional<T> opt) => opt.Value;
public static implicit operator Optional<T>(T opt) => new(opt);
public T Value => _value!.Value;
public bool Specified => _value is not null;
public override string ToString() => _value is null ? "<NONE>" : _value.ToString()!;
readonly T? _value;
}
You could use that to implement your CountingDevice class like so:
public partial class CountingDevice
{
public int countingDeviceNo { get; set; }
public string countingDeviceName { get; set; }
public string obis { get; set; }
public Optional<int> integralPart { get; set; }
public Optional<int> fractionalPart { get; set; }
public Optional<double> value { get; set; }
public Optional<bool> offPeakFlag { get; set; }
// snipped for brevity
}
Usage is quite natural because of the implicit conversions:
public static void Main()
{
var dev = new CountingDevice
{
integralPart = 10, // Can initialise with the underlying type.
value = 123.456
};
Console.WriteLine(dev.fractionalPart.Specified); // False
Console.WriteLine(dev.integralPart.Specified); // True
Console.WriteLine(dev.value); // 123.456
Console.WriteLine(dev.value.ToString()); // 123.456
Console.WriteLine(dev.fractionalPart.ToString()); // "<NONE>"
dev.fractionalPart = 42; // Can set the value using int.
Console.WriteLine(dev.fractionalPart.Specified); // True
Console.WriteLine(dev.fractionalPart); // 42
var optCopy = dev.offPeakFlag;
Console.WriteLine(optCopy.Specified); // False
dev.offPeakFlag = true;
Console.WriteLine(dev.offPeakFlag.Specified); // True
Console.WriteLine(optCopy.Specified); // Still False - not affected by the original.
Console.WriteLine(optCopy); // Throws an exception because it's not specified.
}
You might also want to use optional reference types, but to do that you will need to declare a generic with the class constraint:
public readonly struct OptionalRef<T> where T : class
{
public OptionalRef(T value)
{
_value = value;
}
public static implicit operator T(OptionalRef<T> opt) => opt.Value;
public static implicit operator OptionalRef<T>(T opt) => new(opt);
public T Value => _value ?? throw new InvalidOperationException("Accessing an unspecified value.");
public bool Specified => _value is not null;
public override string ToString() => _value is null ? "<NONE>" : _value.ToString()!;
readonly T? _value;
}
Personally, I think that's a bit overkill. I'd just use nullable value types, int?, double? etc, but it depends on the expected usage.
C# doesn't allow overloading the = operator (unlike, e.g., C++). However, your suggestion C should work. It's a bit of a hassle, too, since you'll have to write a bunch of methods, but you could write an extension method such as
public static class Extensions
{
public static void UpdateIntegralPart(this CountingDevice dev, int value)
{
dev.integralPart = value;
dev.integralPartFieldSpecified = true;
}
}
Then you can call
countingDevice.UpdateIntegralPart(1234);

When and why to use a "ValueObject" base class (from the Microsoft Docs) in C#?

I am trying to understand the use case for ValueObject in C#: when to use it and why it is needed. The documentation says it can be used when we want to initialize an object and then never change its properties, that is, make it immutable. But isn't that the same as the Singleton pattern, where you initialize the object's properties in the constructor and they persist for the lifetime of the application? So why do we need ValueObject, and what is the purpose of the equality comparison, GetHashCode() and so on?
Code from Microsoft Docs:
public abstract class ValueObject
{
protected static bool EqualOperator(ValueObject left, ValueObject right)
{
if (ReferenceEquals(left, null) ^ ReferenceEquals(right, null))
{
return false;
}
return ReferenceEquals(left, null) || left.Equals(right);
}
protected static bool NotEqualOperator(ValueObject left, ValueObject right)
{
return !(EqualOperator(left, right));
}
protected abstract IEnumerable<object> GetEqualityComponents();
public override bool Equals(object obj)
{
if (obj == null || obj.GetType() != GetType())
{
return false;
}
var other = (ValueObject)obj;
return this.GetEqualityComponents().SequenceEqual(other.GetEqualityComponents());
}
public override int GetHashCode()
{
return GetEqualityComponents()
.Select(x => x != null ? x.GetHashCode() : 0)
.Aggregate((x, y) => x ^ y);
}
// Other utility methods
}
Now Address entity:
public class Address : ValueObject
{
public String Street { get; private set; }
public String City { get; private set; }
public String State { get; private set; }
public String Country { get; private set; }
public String ZipCode { get; private set; }
public Address() { }
public Address(string street, string city, string state, string country, string zipcode)
{
Street = street;
City = city;
State = state;
Country = country;
ZipCode = zipcode;
}
protected override IEnumerable<object> GetEqualityComponents()
{
// Using a yield return statement to return each element one at a time
yield return Street;
yield return City;
yield return State;
yield return Country;
yield return ZipCode;
}
}
In the code above, what is this Address, why does it derive from ValueObject, and what are EqualOperator and the other members of the ValueObject class for? I am trying to understand the real-world use case for ValueObject: in which scenarios to use it, and why it needs the equality and inequality operators.
I am a novice on this topic.
The ValueObject class is not the same as value types or immutable objects. In my opinion, that ValueObject class should be used for business logic when you follow a Domain-Driven Design (DDD) style approach.
In DDD, value objects are not identified by an Id but by the fields in the object, hence the need for equality operators that compare one or more properties of the object.
That's also the reason why you cannot change the properties: the object would then no longer compare equal to other loaded instances of it. As in the example, if you change the street, it's not the same address anymore.
Without a shared base class, implementing DDD value objects would mean copying all that comparison logic into every class and therefore duplicating code.
The ValueObject class removes that need and makes your own objects more business-centric (i.e. more readable), as they should be when using DDD (or just a clean business layer, for that matter).
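To illustrate the difference this makes, here is a trimmed sketch of the base class with a two-field Address. It is not the full Microsoft Docs version: the operator helpers are left out, and Aggregate is given a seed of 0, which is a change made here so an empty component list does not throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed variant of the ValueObject base class from the question.
public abstract class ValueObject
{
    protected abstract IEnumerable<object> GetEqualityComponents();

    public override bool Equals(object obj)
    {
        if (obj == null || obj.GetType() != GetType()) return false;
        return GetEqualityComponents().SequenceEqual(((ValueObject)obj).GetEqualityComponents());
    }

    // Seeded with 0 (an assumption of this sketch) so zero components are allowed.
    public override int GetHashCode() =>
        GetEqualityComponents()
            .Select(x => x != null ? x.GetHashCode() : 0)
            .Aggregate(0, (x, y) => x ^ y);
}

// Simplified Address with only two components.
public sealed class Address : ValueObject
{
    public string Street { get; }
    public string City { get; }

    public Address(string street, string city)
    {
        Street = street;
        City = city;
    }

    protected override IEnumerable<object> GetEqualityComponents()
    {
        yield return Street;
        yield return City;
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Address("1 Main St", "Springfield");
        var b = new Address("1 Main St", "Springfield");

        Console.WriteLine(ReferenceEquals(a, b));               // False: two separate instances
        Console.WriteLine(a.Equals(b));                         // True: same component values
        Console.WriteLine(new HashSet<Address> { a, b }.Count); // 1
    }
}
```

A singleton, by contrast, is about having exactly one instance; here there are two instances that are interchangeable because their values match.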

How to elegantly check for equality in a hierarchy of classes which have a common base class that holds a primary key?

Background
I have a base class which holds an integer ID that is used for ORM (Microsoft Entity Framework). There are about 25 classes derived from this, and the inheritance hierarchy is up to 4 classes deep.
Requirement
I need to be able to test if an object in this hierarchy is equal to another object. To be equal it is necessary but not sufficient for the IDs to be the same. For example, if two Person objects have different IDs then they are not equal, but if they have the same ID then they may or may not be equal.
Algorithm
In order to implement the C# Equals method you have to check that:
The Supplied object is not null.
It must be of the same type as this object
The IDs must match
In addition to this, all other attributes must be compared, except in the special case where the two objects are identical.
Implementation
/// <summary>
/// An object which is stored in the database
/// </summary>
public abstract class DatabaseEntity
{
/// <summary>
/// The unique identifier; if zero (0) then the ID is not assigned
/// </summary>
public int ID { get; set; }
public override bool Equals(object obj)
{
if (obj == null)
{
return false;
}
if (ReferenceEquals(obj, this))
{
return true;
}
if (obj.GetType() != GetType())
{
return false;
}
DatabaseEntity databaseEntity = (DatabaseEntity)obj;
if (ID != databaseEntity.ID)
{
return false;
}
return EqualsIgnoringID(databaseEntity);
}
public override int GetHashCode()
{
return ID;
}
/// <summary>
/// Check if this object is equal to the supplied one, disregarding the IDs
/// </summary>
/// <param name="databaseEntity">another object, which should be of the same type as this one</param>
/// <returns>true if they are equal (disregarding the ID)</returns>
protected abstract bool EqualsIgnoringID(DatabaseEntity databaseEntity);
}
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
protected override bool EqualsIgnoringID(DatabaseEntity databaseEntity)
{
Person person = (Person)databaseEntity;
return person.FirstName == FirstName && person.LastName == LastName;
}
}
public class User: Person
{
public string Password { get; set; }
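protected override bool EqualsIgnoringID(DatabaseEntity databaseEntity)
{
User user = (User)databaseEntity;
// Include Person's checks; otherwise FirstName and LastName are skipped
return base.EqualsIgnoringID(databaseEntity) && user.Password == Password;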
}
}
Comments
The feature of this solution that I dislike the most is the explicit conversions. Is there an alternative solution, which avoids having to repeat all the common logic (checking for null, type etc) in each class?
It seems simpler if instead of using abstract, you just keep overriding the Equals method for subclasses. Then you can extend like this:
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object other)
{
if (!base.Equals(other))
return false;
Person person = (Person)other;
return person.FirstName == FirstName && person.LastName == LastName;
}
}
You have to cast to Person, but this works with relatively few lines of code and with long hierarchies without any worries. (Because you already checked that the runtime types are the same at the very root of the hierarchy, you don't even have to use "as Person" with a null check.)
As mentioned in the comments, with the above approach you can't stop evaluating (short-circuit) if you know for certain this is equal to other. (Although you do short-circuit if you know for sure this is not equal to other.) For example, if this has reference equality with other, you can short-circuit because there's no doubt that an object is equal to itself.
Being able to return early would mean that you can skip a lot of checks. This is useful if the checks are expensive.
To allow Equals to short-circuit true as well as false, we can add a new equality method that returns bool? to represent three states:
true: this is definitely equal to other without any need to check derived classes' properties. (Short-circuit.)
false: this is definitely not equal to other without any need to check derived classes' properties. (Short-circuit.)
null: this might or might not be equal to other, depending on derived classes' properties. (Do not short-circuit.)
Since this doesn't match the bool of Equals, you need to define Equals in terms of BaseEquals. Each derived class checks its base class' BaseEquals and chooses to short circuit if an answer is already definite (true or false) and if not, find out if the current class proves inequality. In Equals, then, a null means that no class in the inheritance hierarchy could determine inequality, so the two objects are equal and Equals should return true. Here's an implementation that will hopefully explain this better:
public class DatabaseEntity
{
public int ID { get; set; }
public override bool Equals(object other)
{
// Turn a null answer into true: if the most derived class has not
// eliminated the possibility of equality, this and other are equal.
return BaseEquals(other) ?? true;
}
protected virtual bool? BaseEquals(object other)
{
if (other == null)
return false;
if (ReferenceEquals(this, other))
return true;
if (GetType() != other.GetType())
return false;
DatabaseEntity databaseEntity = (DatabaseEntity)other;
if (ID != databaseEntity.ID)
return false;
return null;
}
}
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
protected override bool? BaseEquals(object other)
{
bool? baseEquals = base.BaseEquals(other);
if (baseEquals != null)
return baseEquals;
Person person = (Person)other;
if (person.FirstName != FirstName || person.LastName != LastName)
return false;
return null;
}
}
This is pretty easy using generics:
public abstract class Entity<T>
{
protected abstract bool IsEqual(T other);
}
public class Person : Entity<Person>
{
protected override bool IsEqual(Person other) { ... }
}
This works fine for one level of inheritance, or when all the levels are abstract except for the last one.
If that's not good enough for you, you have a decision to make:
If it's not all that common, it might be just fine to keep the few exceptions with manual casts.
If it is common, you're out of luck. Making Person generic works, but it kind of defeats the purpose - it requires you to specify the concrete Person-derived type whenever you need to use Person. This can be handled by having an interface IPerson that's not generic. Of course, in effect, this still means that Person is abstract - you have no way of constructing a non-concrete version of Person. Why wouldn't it be abstract, in fact? Can you have a Person that isn't one of the derived types of Person? That sounds like a bad idea.
Here is a variant of @31eee384's answer.
I don't use its three-state (bool?) method. I assume that even if base.Equals() returns true, I still need to perform the derived class's Equals checks.
The drawback is that the reference-equality short-circuit in base.Equals no longer propagates to the Equals methods of derived classes.
Maybe C# has some way to stop the chain of overrides and return true immediately when reference equality holds, without running the derived Equals checks.
Also note that by following 31eee384's answer we give up the template method pattern used by the OP. Using that pattern again leads back to the OP's implementation.
public class Base : IEquatable<Base>
{
public int ID {get; set;}
public Base(int id)
{ID = id;}
public virtual bool Equals(Base other)
{
Console.WriteLine("Begin Base.Equals(Base other);");
if (other == null) return false;
if (ReferenceEquals(this, other)) return true;
if (GetType() != other.GetType()) return false;
return ID == other.ID;
}
public override bool Equals(object other)
{
return this.Equals(other as Base);
}
public override int GetHashCode()
{
unchecked
{
// Choose large primes to avoid hashing collisions
const int HashingBase = (int) 2166136261;
const int HashingMultiplier = 16777619;
int hash = HashingBase;
hash = (hash * HashingMultiplier) ^ (!Object.ReferenceEquals(null, ID) ? ID.GetHashCode() : 0);
return hash;
}
}
public override string ToString()
{
return "A Base object with ["+ID+"] as ID";
}
}
public class Derived : Base, IEquatable<Derived>
{
public string Name {get; set;}
public Derived(int id, string name) : base(id)
{Name = name;}
public bool Equals(Derived other)
{
Console.WriteLine("Begin Derived.Equals(Derived other);");
if (!base.Equals(other)) return false;
return Name == other.Name;
}
public override bool Equals(object other)
{
return this.Equals(other as Derived);
}
public override int GetHashCode()
{
unchecked
{
// Choose large primes to avoid hashing collisions
const int HashingBase = (int) 2166136261;
const int HashingMultiplier = 16777619;
int hash = HashingBase;
hash = (hash * HashingMultiplier) ^ base.GetHashCode();
hash = (hash * HashingMultiplier) ^ (!Object.ReferenceEquals(null, Name) ? Name.GetHashCode() : 0);
return hash;
}
}
public override string ToString()
{
return "A Derived object with '" + Name + "' as Name, and also " + base.ToString();
}
}
Here is my fiddle link.

Optimal way to ensure uniqueness of values in datastructure?

I am required to key a dictionary based on values of a datastructure. I'm wondering what would be the optimal way to create this key?
The datastructure has 3 values: two strings and a datetime. The three of these values combined represents a "unique" key for my dictionary.
public class RouteIdentity
{
public string RouteId {get;set;}
public string RegionId {get;set;}
public DateTime RouteDate {get;set;}
}
One solution that comes to mind is to add a property to RouteIdentity (called Key perhaps?) that returns some representation of the 3 unique values. The type of Key would be the type of the key value of the dictionary. Key could be a string value that simply concatenates the various properties, but this seems terribly inefficient. I suppose if there were a way to implement a fast hashing function to return a different type that might also work.
Another possibility is to override the Equals method for RouteIdentity. I'm thinking this might be a better approach, but I'm unsure of how to override the GetHashCode() function for such a purpose.
Can anyone shed some light on what the optimal approach would be for this case? If you feel that it would be best to use operator overloading, could you please provide some guidance as to how to implement it properly?
Thanks in advance.
Implement Equals() and GetHashCode() ...
public class RouteIdentity
{
public string RouteId { get; set; }
public string RegionId { get; set; }
public DateTime RouteDate { get; set; }
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj))
{
return false;
}
if (ReferenceEquals(this, obj))
{
return true;
}
if (obj.GetType() != typeof(RouteIdentity))
{
return false;
}
RouteIdentity other = (RouteIdentity) obj;
return Equals(other.RouteId, RouteId) &&
Equals(other.RegionId, RegionId) &&
other.RouteDate.Equals(RouteDate);
}
public override int GetHashCode()
{
unchecked
{
int result = (RouteId != null ? RouteId.GetHashCode() : 0);
result = (result * 397) ^ (RegionId != null ? RegionId.GetHashCode() : 0);
result = (result * 397) ^ RouteDate.GetHashCode();
return result;
}
}
}
... and use new Dictionary<RouteIdentity, TValue>(), which internally will instantiate EqualityComparer<RouteIdentity>.Default, which uses these 2 methods to compare your RouteIdentity instances.
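As a usage sketch, the following shows a lookup with a key that was constructed separately from the one stored. This is a compact, immutable variant of RouteIdentity rather than the class above: making the properties get-only and using HashCode.Combine (which needs .NET Core 2.1 or later) are assumptions of this example, as are the sample values.

```csharp
using System;
using System.Collections.Generic;

// Immutable variant of RouteIdentity; key fields cannot change once stored.
public sealed class RouteIdentity : IEquatable<RouteIdentity>
{
    public RouteIdentity(string routeId, string regionId, DateTime routeDate)
    {
        RouteId = routeId;
        RegionId = regionId;
        RouteDate = routeDate;
    }

    public string RouteId { get; }
    public string RegionId { get; }
    public DateTime RouteDate { get; }

    public bool Equals(RouteIdentity other) =>
        other != null
        && RouteId == other.RouteId
        && RegionId == other.RegionId
        && RouteDate == other.RouteDate;

    public override bool Equals(object obj) => Equals(obj as RouteIdentity);

    public override int GetHashCode() => HashCode.Combine(RouteId, RegionId, RouteDate);
}

public static class Program
{
    public static void Main()
    {
        var stops = new Dictionary<RouteIdentity, int>();
        stops[new RouteIdentity("R1", "North", new DateTime(2024, 1, 15))] = 12;

        // A different instance with the same three values finds the same entry.
        var lookup = new RouteIdentity("R1", "North", new DateTime(2024, 1, 15));
        Console.WriteLine(stops.TryGetValue(lookup, out int count) ? count : -1); // 12
    }
}
```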
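Implement IComparable<RouteIdentity> for RouteIdentity and use SortedSet<RouteIdentity>. (A HashSet<RouteIdentity> ignores IComparable and relies on Equals and GetHashCode instead.)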

How does HashSet compare elements for equality?

I have a class that is IComparable:
public class a : IComparable
{
public int Id { get; set; }
public string Name { get; set; }
public a(int id)
{
this.Id = id;
}
public int CompareTo(object obj)
{
return this.Id.CompareTo(((a)obj).Id);
}
}
When I add a list of object of this class to a hash set:
a a1 = new a(1);
a a2 = new a(2);
HashSet<a> ha = new HashSet<a>();
ha.Add(a1);
ha.Add(a2);
ha.Add(a1);
Everything is fine and ha.Count is 2, but:
a a1 = new a(1);
a a2 = new a(2);
HashSet<a> ha = new HashSet<a>();
ha.Add(a1);
ha.Add(a2);
ha.Add(new a(1));
Now ha.Count is 3.
Why doesn't HashSet respect a's CompareTo method?
Is HashSet the best way to have a list of unique objects?
It uses an IEqualityComparer<T> (EqualityComparer<T>.Default unless you specify a different one on construction).
When you add an element to the set, it will find the hash code using IEqualityComparer<T>.GetHashCode, and store both the hash code and the element (after checking whether the element is already in the set, of course).
To look an element up, it will first use the IEqualityComparer<T>.GetHashCode to find the hash code, then for all elements with the same hash code, it will use IEqualityComparer<T>.Equals to compare for actual equality.
That means you have two options:
Pass a custom IEqualityComparer<T> into the constructor. This is the best option if you can't modify the T itself, or if you want a non-default equality relation (e.g. "all users with a negative user ID are considered equal"). This is almost never implemented on the type itself (i.e. Foo doesn't implement IEqualityComparer<Foo>) but in a separate type which is only used for comparisons.
Implement equality in the type itself, by overriding GetHashCode and Equals(object). Ideally, implement IEquatable<T> in the type as well, particularly if it's a value type. These methods will be called by the default equality comparer.
Note how none of this is in terms of an ordered comparison - which makes sense, as there are certainly situations where you can easily specify equality but not a total ordering. This is all the same as Dictionary<TKey, TValue>, basically.
If you want a set which uses ordering instead of just equality comparisons, you should use SortedSet<T> from .NET 4 - which allows you to specify an IComparer<T> instead of an IEqualityComparer<T>. This will use IComparer<T>.Compare - which will delegate to IComparable<T>.CompareTo or IComparable.CompareTo if you're using Comparer<T>.Default.
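The contrast between the two set types can be sketched as follows. Item is a hypothetical class that implements only IComparable<Item> and does not override Equals or GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: ordered by Id, but with default (reference) equality.
public class Item : IComparable<Item>
{
    public int Id { get; }

    public Item(int id)
    {
        Id = id;
    }

    public int CompareTo(Item other) => other is null ? 1 : Id.CompareTo(other.Id);
}

public static class Program
{
    public static void Main()
    {
        // SortedSet uses Comparer<Item>.Default, which calls CompareTo.
        var sorted = new SortedSet<Item> { new Item(1), new Item(2), new Item(1) };
        Console.WriteLine(sorted.Count); // 2: CompareTo returned 0 for the duplicate

        // HashSet uses EqualityComparer<Item>.Default, which falls back to
        // reference equality here; CompareTo is never consulted.
        var hashed = new HashSet<Item> { new Item(1), new Item(2), new Item(1) };
        Console.WriteLine(hashed.Count); // 3
    }
}
```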
Here's clarification on a part of the answer that's been left unsaid: The object type of your HashSet<T> doesn't have to implement IEqualityComparer<T> but instead just has to override Object.GetHashCode() and Object.Equals(Object obj).
Instead of this:
public class a : IEqualityComparer<a>
{
public int GetHashCode(a obj) { /* Implementation */ }
public bool Equals(a obj1, a obj2) { /* Implementation */ }
}
You do this:
public class a
{
public override int GetHashCode() { /* Implementation */ }
public override bool Equals(object obj) { /* Implementation */ }
}
It is subtle, but this tripped me up for the better part of a day trying to get HashSet to function the way it is intended. And like others have said, HashSet<a> will end up calling a.GetHashCode() and a.Equals(obj) as necessary when working with the set.
HashSet uses Equals and GetHashCode().
CompareTo is for ordered sets.
If you want unique objects, but you don't care about their iteration order, HashSet<T> is typically the best choice.
The HashSet constructor accepts an object that implements IEqualityComparer, which is then used when adding new objects.
If you want the HashSet methods to work without a custom comparer, you need to override Equals and GetHashCode in the class itself.
namespace HashSet
{
public class Employe
{
public Employe() {
}
public string Name { get; set; }
public override string ToString() {
return Name;
}
public override bool Equals(object obj) {
return this.Name.Equals(((Employe)obj).Name);
}
public override int GetHashCode() {
return this.Name.GetHashCode();
}
}
class EmployeComparer : IEqualityComparer<Employe>
{
public bool Equals(Employe x, Employe y)
{
return x.Name.Trim().ToLower().Equals(y.Name.Trim().ToLower());
}
public int GetHashCode(Employe obj)
{
// Normalize exactly as Equals does, so names that compare equal hash alike
return obj.Name.Trim().ToLower().GetHashCode();
}
}
class Program
{
static void Main(string[] args)
{
HashSet<Employe> hashSet = new HashSet<Employe>(new EmployeComparer());
hashSet.Add(new Employe() { Name = "Nik" });
hashSet.Add(new Employe() { Name = "Rob" });
hashSet.Add(new Employe() { Name = "Joe" });
Display(hashSet);
hashSet.Add(new Employe() { Name = "Rob" });
Display(hashSet);
HashSet<Employe> hashSetB = new HashSet<Employe>(new EmployeComparer());
hashSetB.Add(new Employe() { Name = "Max" });
hashSetB.Add(new Employe() { Name = "Solomon" });
hashSetB.Add(new Employe() { Name = "Werter" });
hashSetB.Add(new Employe() { Name = "Rob" });
Display(hashSetB);
var union = hashSet.Union<Employe>(hashSetB).ToList();
Display(union);
var inter = hashSet.Intersect<Employe>(hashSetB).ToList();
Display(inter);
var except = hashSet.Except<Employe>(hashSetB).ToList();
Display(except);
Console.ReadKey();
}
static void Display(HashSet<Employe> hashSet)
{
if (hashSet.Count == 0)
{
Console.Write("Collection is Empty");
return;
}
foreach (var item in hashSet)
{
Console.Write("{0}, ", item);
}
Console.Write("\n");
}
static void Display(List<Employe> list)
{
if (list.Count == 0)
{
Console.WriteLine("Collection is Empty");
return;
}
foreach (var item in list)
{
Console.Write("{0}, ", item);
}
Console.Write("\n");
}
}
}
I came here looking for answers, but found that all the answers had too much info or not enough, so here is my answer...
Since you've created a custom class you need to implement GetHashCode and Equals. In this example I will use a class Student instead of a because it's easier to follow and doesn't violate any naming conventions. Here is what the implementations look like:
public override bool Equals(object obj)
{
return obj is Student student && Id == student.Id;
}
public override int GetHashCode()
{
return HashCode.Combine(Id);
}
I stumbled across this article from Microsoft that gives an incredibly easy way to implement these if you're using Visual Studio. In case it's helpful to anyone else, here are complete steps for using a custom data type in a HashSet using Visual Studio:
Given a class Student with 2 simple properties and a constructor
public class Student
{
public int Id { get; set; }
public string Name { get; set; }
public Student(int id)
{
this.Id = id;
}
}
To Implement IComparable, add : IComparable<Student> like so:
public class Student : IComparable<Student>
You will see a red squiggly appear with an error message saying your class doesn't implement IComparable. Click on suggestions or press Alt+Enter and use the suggestion to implement it.
You will see the method generated. You can then write your own implementation like below:
public int CompareTo(Student student)
{
return this.Id.CompareTo(student.Id);
}
In the above implementation only the Id property is compared; Name is ignored. Next, right-click in your code and select Quick Actions and Refactorings, then Generate Equals and GetHashCode.
A window will pop up where you can select which properties to use for hashing and even implement IEquatable if you'd like.
Here is the generated code:
public class Student : IComparable<Student>, IEquatable<Student> {
...
public override bool Equals(object obj)
{
return Equals(obj as Student);
}
public bool Equals(Student other)
{
return other != null && Id == other.Id;
}
public override int GetHashCode()
{
return HashCode.Combine(Id);
}
}
Now if you try to add a duplicate item like shown below it will be skipped:
static void Main(string[] args)
{
Student s1 = new Student(1);
Student s2 = new Student(2);
HashSet<Student> hs = new HashSet<Student>();
hs.Add(s1);
hs.Add(s2);
hs.Add(new Student(1)); //will be skipped
hs.Add(new Student(3));
}
You can now use .Contains like so:
for (int i = 0; i <= 4; i++)
{
if (hs.Contains(new Student(i)))
{
Console.WriteLine($@"Set contains student with Id {i}");
}
else
{
Console.WriteLine($@"Set does NOT contain a student with Id {i}");
}
}
Output:
