Optimal way to ensure uniqueness of values in datastructure? - c#

I am required to key a dictionary based on values of a datastructure. I'm wondering what would be the optimal way to create this key?
The datastructure has 3 values: two strings and a datetime. The three of these values combined represents a "unique" key for my dictionary.
public class RouteIdentity
{
public string RouteId {get;set;}
public string RegionId {get;set;}
public DateTime RouteDate {get;set;}
}
One solution that comes to mind is to add a property to RouteIdentity (called Key perhaps?) that returns some representation of the 3 unique values. The type of Key would be the type of the key value of the dictionary. Key could be a string value that simply concatenates the various properties, but this seems terribly inefficient. I suppose if there were a way to implement a fast hashing function to return a different type that might also work.
Another possibility is to override the Equals method for RouteIdentity. I'm thinking this might be a better approach, but I'm unsure of how to override the GetHashCode() function for such a purpose.
Can anyone shed some light on what the optimal approach would be for this case? If you feel that it would be best to override Equals, could you please provide some guidance as to how to implement it properly?
Thanks in advance.

Implement Equals() and GetHashCode(), ..
public class RouteIdentity
{
public string RouteId { get; set; }
public string RegionId { get; set; }
public DateTime RouteDate { get; set; }
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj))
{
return false;
}
if (ReferenceEquals(this, obj))
{
return true;
}
if (obj.GetType() != typeof(RouteIdentity))
{
return false;
}
RouteIdentity other = (RouteIdentity) obj;
return Equals(other.RouteId, RouteId) &&
Equals(other.RegionId, RegionId) &&
other.RouteDate.Equals(RouteDate);
}
public override int GetHashCode()
{
unchecked
{
int result = (RouteId != null ? RouteId.GetHashCode() : 0);
result = (result * 397) ^ (RegionId != null ? RegionId.GetHashCode() : 0);
result = (result * 397) ^ RouteDate.GetHashCode();
return result;
}
}
}
... and use new Dictionary<RouteIdentity, TValue>(), which internally will instantiate EqualityComparer<RouteIdentity>.Default, which uses these 2 methods to compare your RouteIdentity instances.
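As a rough illustration of the lookup behaviour described above, here is a minimal sketch. RouteKey and RouteKeyDemo are hypothetical names (an immutable variant of RouteIdentity made up for this example), and the values are invented; it only shows that a separately constructed key with equal values finds the same dictionary entry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable variant of RouteIdentity, for illustration only.
public sealed class RouteKey : IEquatable<RouteKey>
{
    public RouteKey(string routeId, string regionId, DateTime routeDate)
    {
        RouteId = routeId;
        RegionId = regionId;
        RouteDate = routeDate;
    }

    public string RouteId { get; }
    public string RegionId { get; }
    public DateTime RouteDate { get; }

    public bool Equals(RouteKey other)
    {
        return other != null
            && RouteId == other.RouteId
            && RegionId == other.RegionId
            && RouteDate == other.RouteDate;
    }

    public override bool Equals(object obj) => Equals(obj as RouteKey);

    public override int GetHashCode()
    {
        unchecked
        {
            // Same combining scheme as the answer above.
            int result = RouteId != null ? RouteId.GetHashCode() : 0;
            result = (result * 397) ^ (RegionId != null ? RegionId.GetHashCode() : 0);
            result = (result * 397) ^ RouteDate.GetHashCode();
            return result;
        }
    }
}

public static class RouteKeyDemo
{
    public static string LookUpWithEqualKey()
    {
        var stops = new Dictionary<RouteKey, string>();
        stops[new RouteKey("R1", "North", new DateTime(2020, 1, 15))] = "12 stops";

        // A second, separately constructed key with the same values.
        var probe = new RouteKey("R1", "North", new DateTime(2020, 1, 15));
        return stops.TryGetValue(probe, out var value) ? value : null;
    }
}
```

Making the key read-only is a design choice of this sketch, not something the answer requires; it avoids changing a key's hash code while it sits in the dictionary.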

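Implement IEquatable<RouteIdentity> for RouteIdentity (overriding Equals and GetHashCode as well) and use HashSet<RouteIdentity>. Note that HashSet<T> relies on equality and hash codes, not on IComparable.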

Related

C# Merge two Lists of the same values

this is my Clients class:
public class Clients
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
}
I want to make a new list which contains the same clients from List A and List B .
For example:
List A - John, Jonathan, James ....
List B - Martha, Jane, Jonathan ....
Unsubscribers - Jonathan
public static List<Clients> SameClients(List<Clients> A, List<Clients> B)
{
List<Clients> Unsubscribers = new List<Clients>();
Unsubscribers = A.Intersect(B).ToList();
return Unsubscribers;
}
However, for some reason I get an empty list and I have no idea what's wrong.
The problem is that Equals and GetHashCode are used when objects are compared. You can override these two methods and provide your own implementation based on your needs. There is already an answer below covering how to override them.
However, I normally prefer to keep my entities/models (or whatever you want to call them) very simple and keep comparison details out of them. In that case, you can implement an IEqualityComparer<TSource> and use the overload of Intersect that takes an IEqualityComparer.
Here's an example implementation of IEqualityComparer based only on the Name property...
public class ClientNameEqualityComparer : IEqualityComparer<Clients>
{
public bool Equals(Clients c1, Clients c2)
{
if (c1 == null && c2 == null)
return true;
else if (c1 == null || c2 == null)
return false;
else
return c1.Name == c2.Name;
}
public int GetHashCode(Clients c)
{
// Guard against a null Name so hashing never throws
return c.Name != null ? c.Name.GetHashCode() : 0;
}
}
Basically, the implementation above only cares about the Name property: if two instances of Clients have the same value for the Name property, they are considered equal.
Now you can do the following...
A.Intersect(B, new ClientNameEqualityComparer()).ToList();
And that will produce the results you are expecting...
Intersect uses GetHashCode and Equals by default, but you haven't overridden them, so Object.Equals is used, which just compares references. Since all your client instances are created with new, they are separate instances even if they have equal values. That's why Intersect "thinks" that there are no common clients.
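To make the reference-comparison point concrete, here is a small sketch. PlainClient, PlainClientNameComparer and IntersectDemo are hypothetical names invented for this example; the class deliberately does not override Equals or GetHashCode.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class that does NOT override Equals/GetHashCode.
public class PlainClient
{
    public string Name { get; set; }
}

// Compares PlainClient instances by Name only (an assumption of this sketch).
public class PlainClientNameComparer : IEqualityComparer<PlainClient>
{
    public bool Equals(PlainClient x, PlainClient y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Name == y.Name;
    }

    public int GetHashCode(PlainClient c)
    {
        return c?.Name?.GetHashCode() ?? 0;
    }
}

public static class IntersectDemo
{
    // Returns { matches with default equality, matches with the name comparer }.
    public static int[] CountCommon()
    {
        var a = new List<PlainClient>
        {
            new PlainClient { Name = "John" },
            new PlainClient { Name = "Jonathan" }
        };
        var b = new List<PlainClient>
        {
            new PlainClient { Name = "Martha" },
            new PlainClient { Name = "Jonathan" }
        };

        // Default equality compares references, so nothing matches.
        int byReference = a.Intersect(b).Count();
        // With the comparer, the two "Jonathan" instances are treated as equal.
        int byName = a.Intersect(b, new PlainClientNameComparer()).Count();
        return new[] { byReference, byName };
    }
}
```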
So you have several options.
Implement a custom IEqualityComparer<Clients> and pass it to Intersect (or many other LINQ methods). This has the advantage that you can implement different comparers for different requirements without modifying the original class.
Let Clients override Equals and GetHashCode, and/or
let Clients implement IEquatable<Clients>.
For example (showing the last two, because the other answer already showed IEqualityComparer<T>):
public class Clients : IEquatable<Clients>
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
public override bool Equals(object obj)
{
return obj is Clients && this.Equals((Clients)obj);
}
public bool Equals(Clients other)
{
// A null argument is never equal to this instance
return other != null
&& Email == other.Email
&& Name == other.Name;
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + (Email?.GetHashCode() ?? 0);
hash = hash * 23 + (Name?.GetHashCode() ?? 0);
return hash;
}
}
}
Worth reading:
Differences between IEquatable<T>, IEqualityComparer<T>, and overriding .Equals() when using LINQ on a custom object collection?

Define equality based on class property to be used in HashSet

I have the following class:
public class OrderRule {
public OrderDirection Direction { get; set; }
public String Property { get; set; }
}
And a HashSet of it:
HashSet<OrderRule> rules = // ...
I need two OrderRules to be considered equal if their Property values are equal.
How can I do this?
Since the specification for this equality is not coming from the OrderRule class, but your collection, use the constructor overload of the HashSet that accepts an IEqualityComparer.
public class MyOrderRuleComparer : EqualityComparer<OrderRule>
{
private IEqualityComparer<string> _c = EqualityComparer<string>.Default;
public override bool Equals(OrderRule l, OrderRule r)
{
return _c.Equals(l.Property, r.Property);
}
public override int GetHashCode(OrderRule rule)
{
return _c.GetHashCode(rule.Property);
}
}
...
HashSet<OrderRule> rules = new HashSet<OrderRule>(new MyOrderRuleComparer());
Please note that by using OrderRule.Property as a key, you imply that it must not change after the instance is added to the set. This is why implementing IEquatable<OrderRule> could be the best approach depending on your developer team.
If I add two OrderRules with same Property but different Direction I
still need both to be considered equal
You could override Equals and GetHashCode and/or implement IEquatable<OrderRule>:
public class OrderRule: IEquatable<OrderRule>
{
public OrderRule(string property)
{
this.Property = property;
}
public OrderDirection Direction { get; set; }
public String Property { get; }
public OrderRule Rule { get; set; }
public bool Equals(OrderRule other)
{
return (other != null && other.Property == this.Property);
}
public override int GetHashCode()
{
return Property?.GetHashCode() ?? int.MinValue;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
if(ReferenceEquals(this, obj))
return true;
OrderRule other = obj as OrderRule;
return this.Equals(other);
}
}
Note that I've made the property read-only, because you should not be able to modify a property or field that is used in GetHashCode.
Why?: "Guideline: the integer returned by GetHashCode should never change
Ideally, the hash code of a mutable object should be computed from only fields which cannot mutate, and therefore the hash value of an object is the same for its entire lifetime."
This value is used, for example, by a Dictionary or HashSet to decide where an object is stored. If it changed after the object was added, the object could no longer be found.
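The following sketch illustrates that failure mode. MutableRule and MutationDemo are hypothetical names, and the type deliberately bases its hash code on a settable property, which is exactly what the guideline advises against.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type whose hash code depends on a mutable property.
public class MutableRule
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as MutableRule;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode() => Id;
}

public static class MutationDemo
{
    public static bool StillFoundAfterMutation()
    {
        var rule = new MutableRule { Id = 1 };
        var set = new HashSet<MutableRule> { rule };

        // The set recorded the entry under the hash code for Id == 1.
        rule.Id = 2;

        // The lookup now uses the hash code for Id == 2 and misses the entry.
        return set.Contains(rule);
    }
}
```

Even though the very same instance is still in the set, the lookup is expected to report that it is missing, which is why key members are best kept read-only.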

How to elegantly check for equality in a hierarchy of classes which have a common base class that holds a primary key?

Background
I have a base class which holds an integer ID that is used for ORM (Microsoft Entity Framework). There are about 25 classes derived from this, and the inheritance hierarchy is up to 4 classes deep.
Requirement
I need to be able to test if an object in this hierarchy is equal to another object. To be equal it is necessary but not sufficient for the IDs to be the same. For example, if two Person objects have different IDs then they are not equal, but if they have the same ID then they may or may not be equal.
Algorithm
In order to implement the C# Equals method you have to check that:
The Supplied object is not null.
It must be of the same type as this object
The IDs must match
In addition to this, all other attributes must be compared, except in the special case where the two objects are identical.
Implementation
/// <summary>
/// An object which is stored in the database
/// </summary>
public abstract class DatabaseEntity
{
/// <summary>
/// The unique identifier; if zero (0) then the ID is not assigned
/// </summary>
public int ID { get; set; }
public override bool Equals(object obj)
{
if (obj == null)
{
return false;
}
if (ReferenceEquals(obj, this))
{
return true;
}
if (obj.GetType() != GetType())
{
return false;
}
DatabaseEntity databaseEntity = (DatabaseEntity)obj;
if (ID != databaseEntity.ID)
{
return false;
}
return EqualsIgnoringID(databaseEntity);
}
public override int GetHashCode()
{
return ID;
}
/// <summary>
/// Check if this object is equal to the supplied one, disregarding the IDs
/// </summary>
/// <param name="databaseEntity">another object, which should be of the same type as this one</param>
/// <returns>true if they are equal (disregarding the ID)</returns>
protected abstract bool EqualsIgnoringID(DatabaseEntity databaseEntity);
}
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
protected override bool EqualsIgnoringID(DatabaseEntity databaseEntity)
{
Person person = (Person)databaseEntity;
return person.FirstName == FirstName && person.LastName == LastName;
}
}
public class User: Person
{
public string Password { get; set; }
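protected override bool EqualsIgnoringID(DatabaseEntity databaseEntity)
{
User user = (User)databaseEntity;
// Also compare the inherited Person fields, not just Password
return base.EqualsIgnoringID(databaseEntity) && user.Password == Password;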
}
}
Comments
The feature of this solution that I dislike the most is the explicit conversions. Is there an alternative solution, which avoids having to repeat all the common logic (checking for null, type etc) in each class?
It seems simpler if instead of using abstract, you just keep overriding the Equals method for subclasses. Then you can extend like this:
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object other)
{
if (!base.Equals(other))
return false;
Person person = (Person)other;
return person.FirstName == FirstName && person.LastName == LastName;
}
}
You have to cast to Person, but this works with relatively few lines of code and with long hierarchies without any worries. (Because you already checked that the runtime types are the same at the very root of the hierarchy, you don't even need an as Person cast with a null check.)
As mentioned in the comments, with the above approach you can't stop evaluating (short-circuit) if you know for certain this is equal to other. (Although you do short-circuit if you know for sure this is not equal to other.) For example, if this has reference equality with other, you can short-circuit because there's no doubt that an object is equal to itself.
Being able to return early would mean that you can skip a lot of checks. This is useful if the checks are expensive.
To allow Equals to short-circuit true as well as false, we can add a new equality method that returns bool? to represent three states:
true: this is definitely equal to other without any need to check derived classes' properties. (Short-circuit.)
false: this is definitely not equal to other without any need to check derived classes' properties. (Short-circuit.)
null: this might or might not be equal to other, depending on derived classes' properties. (Do not short-circuit.)
Since this doesn't match the bool of Equals, you need to define Equals in terms of BaseEquals. Each derived class checks its base class' BaseEquals and chooses to short circuit if an answer is already definite (true or false) and if not, find out if the current class proves inequality. In Equals, then, a null means that no class in the inheritance hierarchy could determine inequality, so the two objects are equal and Equals should return true. Here's an implementation that will hopefully explain this better:
public class DatabaseEntity
{
public int ID { get; set; }
public override bool Equals(object other)
{
// Turn a null answer into true: if the most derived class has not
// eliminated the possibility of equality, this and other are equal.
return BaseEquals(other) ?? true;
}
protected virtual bool? BaseEquals(object other)
{
if (other == null)
return false;
if (ReferenceEquals(this, other))
return true;
if (GetType() != other.GetType())
return false;
DatabaseEntity databaseEntity = (DatabaseEntity)other;
if (ID != databaseEntity.ID)
return false;
return null;
}
}
public class Person : DatabaseEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
protected override bool? BaseEquals(object other)
{
bool? baseEquals = base.BaseEquals(other);
if (baseEquals != null)
return baseEquals;
Person person = (Person)other;
if (person.FirstName != FirstName || person.LastName != LastName)
return false;
return null;
}
}
This is pretty easy using generics:
public abstract class Entity<T>
{
protected abstract bool IsEqual(T other);
}
public class Person : Entity<Person>
{
protected override bool IsEqual(Person other) { ... }
}
This works fine for one level of inheritance, or when all the levels are abstract except for the last one.
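A fuller sketch of the generic approach might look as follows. It is only an illustration under assumptions: it adds the ID property and the common null/type checks from the question to Entity<T>, constrains T to Entity<T>, and assumes every concrete class passes itself as T.

```csharp
using System;

// Generic base: T is the concrete entity type (assumed to be the class itself).
public abstract class Entity<T> where T : Entity<T>
{
    public int ID { get; set; }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(this, obj)) return true;
        if (obj == null || obj.GetType() != GetType()) return false;

        // The only cast lives here; it assumes each class passes itself as T.
        T other = (T)obj;
        return ID == other.ID && IsEqual(other);
    }

    public override int GetHashCode() => ID;

    // Derived classes compare their own members, already strongly typed.
    protected abstract bool IsEqual(T other);
}

public class Person : Entity<Person>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    protected override bool IsEqual(Person other)
    {
        return FirstName == other.FirstName && LastName == other.LastName;
    }
}
```

The derived class no longer needs an explicit conversion, at the cost of the single-level limitation described above.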
If that's not good enough for you, you have a decision to make:
If it's not all that common, it might be just fine to keep the few exceptions with manual casts.
If it is common, you're out of luck. Making Person generic works, but it kind of defeats the purpose - it requires you to specify the concrete Person-derived type whenever you need to use Person. This can be handled by having an interface IPerson that's not generic. Of course, in effect, this still means that Person is abstract - you have no way of constructing a non-concrete version of Person. Why wouldn't it be abstract, in fact? Can you have a Person that isn't one of the derived types of Person? That sounds like a bad idea.
Well, here is a variant of 31eee384's answer.
I don't use its three-state (bool?) method. I assume that even if base.Equals() returns true, I still need to perform the derived Equals checks.
The drawback is that you give up propagating the reference-equality short-circuit from base.Equals to the Equals methods of derived classes.
Maybe C# offers some way to stop the chain of overrides and return true immediately when reference equality holds, without running the derived Equals checks.
Also note that by following 31eee384's answer, we give up the template method pattern used by the OP. Using that pattern again essentially brings us back to the OP's implementation.
public class Base : IEquatable<Base>
{
public int ID {get; set;}
public Base(int id)
{ID = id;}
public virtual bool Equals(Base other)
{
Console.WriteLine("Begin Base.Equals(Base other);");
if (other == null) return false;
if (ReferenceEquals(this, other)) return true;
if (GetType() != other.GetType()) return false;
return ID == other.ID;
}
public override bool Equals(object other)
{
return this.Equals(other as Base);
}
public override int GetHashCode()
{
unchecked
{
// Choose large primes to avoid hashing collisions
const int HashingBase = (int) 2166136261;
const int HashingMultiplier = 16777619;
int hash = HashingBase;
// ID is an int and can never be null, so no null check is needed
hash = (hash * HashingMultiplier) ^ ID.GetHashCode();
return hash;
}
}
public override string ToString()
{
return "A Base object with ["+ID+"] as ID";
}
}
public class Derived : Base, IEquatable<Derived>
{
public string Name {get; set;}
public Derived(int id, string name) : base(id)
{Name = name;}
public bool Equals(Derived other)
{
Console.WriteLine("Begin Derived.Equals(Derived other);");
if (!base.Equals(other)) return false;
return Name == other.Name;
}
public override bool Equals(object other)
{
return this.Equals(other as Derived);
}
public override int GetHashCode()
{
unchecked
{
// Choose large primes to avoid hashing collisions
const int HashingBase = (int) 2166136261;
const int HashingMultiplier = 16777619;
int hash = HashingBase;
hash = (hash * HashingMultiplier) ^ base.GetHashCode();
hash = (hash * HashingMultiplier) ^ (!Object.ReferenceEquals(null, Name) ? Name.GetHashCode() : 0);
return hash;
}
}
public override string ToString()
{
return "A Derived object with '" + Name + "' as Name, and also " + base.ToString();
}
}
Here is my fiddle link.

Linq/Enumerable Any Vs Contains

I've solved a problem I was having but although I've found out how something works (or doesn't) I'm not clear on why.
As I'm the type of person who likes to know the "why" I'm hoping someone can explain:
I have list of items and associated comments, and I wanted to differentiate between admin comments and user comments, so I tried the following code:
User commentUser = userRepository.GetUserById(comment.userId);
Role commentUserRole = context.Roles.Single(x=>x.Name == "admin");
if(commentUser.Roles.Contains(commentUserRole))
{
//do stuff
}
else
{
// do other stuff
}
Stepping through the code showed that although it had the correct Role object, it didn't recognise the role in the commentUser.Roles
The code that eventually worked is:
if(commentUser.Roles.Any(x=>x.Name == "admin"))
{
//do stuff
}
I'm happy with this because it's less code and in my opinion cleaner, but I don't understand how contains didn't work.
Hoping someone can clear that up for me.
This is probably because you didn't override the equality members (Equals, GetHashCode, operator ==) on your Role class. Contains therefore falls back to reference comparison, so two Role objects with identical values are treated as different unless they are the very same instance. You need to override those members to get value equality.
You have to override Equals (and always also GetHashCode then) if you want to use Contains. Otherwise Equals will just compare references.
So for example:
public class Role
{
public string RoleName{ get; set; }
public int RoleID{ get; set; }
// ...
public override bool Equals(object obj)
{
Role r2 = obj as Role;
if (r2 == null) return false;
return RoleID == r2.RoleID;
}
public override int GetHashCode()
{
return RoleID;
}
public override string ToString()
{
return RoleName;
}
}
Another option is to implement a custom IEqualityComparer<Role> for the overload of Enumerable.Contains:
public class RoleComparer : IEqualityComparer<Role>
{
public bool Equals(Role x, Role y)
{
return x.RoleID.Equals(y.RoleID);
}
public int GetHashCode(Role obj)
{
return obj.RoleID;
}
}
Use it in this way:
var comparer = new RoleComparer();
User commentUser = userRepository.GetUserById(comment.userId);
Role commentUserRole = context.Roles.Single(x=>x.Name == "admin");
if(commentUser.Roles.Contains(commentUserRole, comparer))
{
// ...
}
When using the Contains method, you check whether the array Roles of the user object contains the object you retrieved from the database beforehand. Although the array contains an object for the role "admin", it does not contain the exact object you fetched before.
When using the Any-method you check if there is any role having the name "admin" - and that delivers the expected result.
To get the same result with the Contains-method implement the IEquatable<Role>-interface on the role-class and compare the name to check whether two instances have actually the same value.
It will be your equality comparison for a Role.
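The difference can be reproduced without a database. In this sketch, SimpleRole and ContainsVsAnyDemo are hypothetical names; SimpleRole does not override Equals, so Contains falls back to reference comparison.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical role type without any equality overrides.
public class SimpleRole
{
    public string Name { get; set; }
}

public static class ContainsVsAnyDemo
{
    // Returns { result of Contains, result of Any }.
    public static bool[] Check()
    {
        var userRoles = new List<SimpleRole> { new SimpleRole { Name = "admin" } };

        // A different instance carrying the same value, as if loaded separately.
        var adminFromElsewhere = new SimpleRole { Name = "admin" };

        // Reference comparison: the list holds another instance.
        bool contains = userRoles.Contains(adminFromElsewhere);
        // The predicate compares the Name values instead.
        bool any = userRoles.Any(r => r.Name == "admin");

        return new[] { contains, any };
    }
}
```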
The object in commentUserRole is not the same object as the one you are looking for in commentUser.Roles.
Your context object will create a new object when you select from it and populate your Roles property with a collection of new Roles. If your context is not tracking the objects in order to return the same object when a second copy is requested, then it will be a different object even though all the properties may be the same. Hence the failure of Contains.
Your Any clause explicitly checks the Name property, which is why it works.
Try making Role implement IEquatable<Role>
public class Role : IEquatable<Role> {
public bool Equals(Role compare) {
return compare != null && this.Name == compare.Name;
}
}
Whilst MSDN shows you only need this for a List<T> you may actually need to override Equals and GetHashCode to make this work
in which case:
public class Role : IEquatable<Role> {
public bool Equals(Role compare) {
return compare != null && this.Name == compare.Name;
}
public override bool Equals(object compare) {
return this.Equals(compare as Role); // this will call the above equals method
}
public override int GetHashCode() {
return this.Name == null ? 0 : this.Name.GetHashCode();
}
}

How to extract string representation of key object

I'm storing items in a strongly typed IDictionary<TKey, TValue> such that the value also represents the key:
public class MyObject
{
public string Name { get; private set; }
public SectionId Section { get; private set; }
public MyObject(SectionId section, string name)
{
Section = section;
Name = name;
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof(MyObject)) return false;
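// Compare the same members GetHashCode uses; assuming no Equals(MyObject) overload exists,
// calling Equals((MyObject)obj) here would re-enter this method
MyObject other = (MyObject)obj;
return Name.ToLower() == other.Name.ToLower() && Section.Equals(other.Section);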
}
public override int GetHashCode()
{
unchecked
{
return (Name.ToLower().GetHashCode() * 397) ^ Section.GetHashCode();
}
}
}
In my presentation tier, I need to iterate through this Dictionary, adding each item to a ListBox control. I'm having a difficult time figuring out how to transform MyObject (which also acts as a key) into a string that the ListBox control can use as a value. Should I just make an explicit call to MyObject.GetHashCode() like this:
MyListBox.Add(new ListItem(myObject.Name, myObject.GetHashCode()));
I would override the ToString method and, in it, write code that generates a meaningful string to be displayed in the UI.
Hope I understood your question correctly.
Should I just make an explicit call to MyObject.GetHashCode()
No, GetHashCode() is:
Not guaranteed to give you a unique value, and
Going to be very difficult to reverse-engineer to produce a MyObject from.
Instead, each of your MyObjects should have some kind of unique identifier key. This can be an enum value or a number generated by an IDENTITY column in your database, or just a string that uniquely identifies each particular MyObject, and from which you can retrieve the MyObject from whatever collection or database you're using as a repository.
If there can only ever be a single MyObject with a given Name and Section, you could just combine the two: SectionId + ":" + Name. That way you can parse those two values out after the fact.
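A minimal sketch of such a combined key is shown below. SectionKey, Format and TryParse are hypothetical names, and the sketch assumes the section identifier can be represented as an int, which the question does not state.

```csharp
using System;
using System.Globalization;

public static class SectionKey
{
    // Builds "section:name". Assumes the section id is an int (hypothetical).
    public static string Format(int sectionId, string name)
    {
        return sectionId.ToString(CultureInfo.InvariantCulture) + ":" + name;
    }

    // Splits on the FIRST ':' so that names containing ':' still round-trip.
    public static bool TryParse(string key, out int sectionId, out string name)
    {
        sectionId = 0;
        name = null;
        if (key == null) return false;

        int separator = key.IndexOf(':');
        if (separator < 0) return false;

        string sectionPart = key.Substring(0, separator);
        if (!int.TryParse(sectionPart, NumberStyles.Integer,
                          CultureInfo.InvariantCulture, out sectionId))
        {
            return false;
        }

        name = key.Substring(separator + 1);
        return true;
    }
}
```

Putting the numeric part first keeps parsing unambiguous even when the name itself contains the separator.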
