DDD: GetHashCode and primary id - c#

I have seen DDD Domain implementations where entities rely on the primary key ID when building Equals/GetHashCode methods.
I understand why it is a good idea, as the primary key might be the only member which is not mutable. But there might also be situations where it is no good idea, though.
Think about a Dictionary, holding entities which where just instantiated. The primary key (assume it is an auto-incrementing value, not a "business-related" value) is not assigned yet, it might be 0 on each entity.
Now, the entities within the Dictionary will be saved. This implies a change of the primary key resulting in a different hash code.
My question: what pattern should be used regarding GetHashCode if no business-related primary key is available and all members are mutable?
Thank you in advance.

In such cases, I only rely on the Id in the Equals and GetHashcode methods, if the Id has been assigned.
In other cases, I compare for equality using the reference.
The same goes for the GetHashCode implementation; I only create a hashcode based on the 'Id', if the Id has been assigned. (That is, if the entity is not transient).
The code below, is what I use. This is somewhat based on the entities in Sharp architecture:
public override bool Equals( object obj )
{
Entity<TId> other = obj as Entity<TId>;
if( other == null || this.GetType() != other.GetType() )
{
return false;
}
bool otherIsTransient = Equals (other.Id, default(TId));
bool thisIsTransient = Equals (this.Id, default (TId));
if( otherIsTransient && thisIsTransient )
{
return ReferenceEquals (this, other);
}
return Id.Equals (other.Id);
}
First, I check whether both entities are of the same type. (Actually, you could create a typed version of this method as well, by implementing the correct interface (IEquality<T>)
When both entities are of the same type, I check whether one of them is transient. I just do it by checking their Id property: if it contains the default value, it means that they haven't been saved in the DB yet.
If both of them are transient, I compare both entities on their reference. Otherwise, I can compare them on their ID.
The GetHashCode method that I use, makes sure that the value that's being returned, never changes:
private int? _oldHashCode;
public override int GetHashCode()
{
// Once we have a hash code we'll never change it
if( _oldHashCode.HasValue )
{
return _oldHashCode.Value;
}
bool thisIsTransient = Equals (Id, default(TId));
// When this instance is transient, we use the base GetHashCode()
// and remember it, so an instance can NEVER change its hash code.
if( thisIsTransient )
{
_oldHashCode = base.GetHashCode ();
return _oldHashCode.Value;
}
return Id.GetHashCode ();
}

You can check, assuming it's an auto-incremented identity, and only compare on Id if they are not 0 or default(int). Something like:
public virtual bool Equals(MyClass other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
if (Id != default(int) && other.Id != default(int))
return Equals(Id, other.Id);
return Equals(other.ChildProp, ChildProp);
}

Well, if you can't tell that an object equals another you should not do it :)
It is ok if equals returns false all the time (given another object than itself). If different local objects (without id) get different ids at the time inserted into the db/storage this is perfectly sane.
If you can tell that two objects are equal and they would both be mapped into the same table line and thus would get the same id on store Equals should return true, intuitively. But you are right that using mutable values to determine equality (and esp. hash code) can be dangerous. Collections usually get confused when you do this.
In this case it is probably better to use other methods than Equals and GetHashCode to implement this kind of equality, as it really changes over time.
So essentially you need to determine which kind of equality you really need for what purpose.

I expect that the systems you have seen use NHibernate for ORM. The guidelines for NHibernate (which supposedly allows you entities to be POCOs, and enforces no requirement on them) recommends this practice. The problem is that NH sometimes has difficulty recognising an entity as being the same as another instance, especially if the session has been closed and reopened, or if two proxies represent the same object. You can end up with the same entity twice in a collection if you do not do this. Follow this link for more information:
http://community.jboss.org/wiki/EqualsAndHashCode

Related

C# List comparison between custom objects

Pair<BoardLocation, BoardLocation> loc = new Pair<BoardLocation, BoardLocation>( this.getLocation(), l );
if(!this.getPlayer().getMoves().Contains( loc )) {
this.getPlayer().addMove( loc );
}
I'm using a Type I have created called "Pair" but, I'm trying to use the contains function in C# that would compare the two types but, I have used override in the Type "Pair" itself to compare the "ToString()" of both Pair objects being compared. So there are 4 strings being compared. The two Keys and two value. If the two Keys are equal, then the two values are compared. The reason why this makes sense is the Key is the originating(key) location for the location(value) being attacked. If the key and value are the same then the object should not be added.
public override bool Equals( object obj ) {
Pair<K, V> objNode = (Pair<K, V>)obj;
if(this.value.ToString().CompareTo( objNode.value.ToString() ) == 0) {
if(this.key.ToString().CompareTo( objNode.key.ToString() ) == 0) {
return true;
} else
return false;
} else {
return false;
}
}
The question is, Is there a better way to do this that doesn't involve stupid amounts of code or creating new objects for dealing with this. Of course if any ideas involve these, I am all ears. The part that confuses me about this is, perhaps I dont understand what is going on but, I was hoping that C# offered a method that just equivalence of values and not the object memory locations and etc.
I've just ported this from Java as well, and it works exactly the same but, I'm asking this question for C# because I'm hoping there was a better way for me to compare these objects without using ToString() with generic Types.
You can definitely make this code a lot simpler by using && and just returning the value of equality comparisons, instead of all those if statements and return true; or return false; statements.
public override bool Equals (object obj) {
// Safety first: handle the case where the other object isn't
// of the same type, or obj is null. In both cases we should
// return false, rather than throwing an exception
Pair<K, V> otherPair = objNode as Pair<K, V>;
if (otherPair == null) {
return false;
}
return key.ToString() == otherPair.key.ToString() &&
value.ToString() == otherPair.value.ToString();
}
In Java you could use equals rather than compareTo.
Note that these aren't exactly the same as == (and Equals) use an ordinal comparison rather than a culture-sensitive one - but I suspect that's what you want anyway.
I would personally shy away from comparing the values by ToString() representations. I would use the natural equality comparisons of the key and value types instead:
public override bool Equals (object obj) {
// Safety first: handle the case where the other object isn't
// of the same type, or obj is null. In both cases we should
// return false, rather than throwing an exception
Pair<K, V> otherPair = objNode as Pair<K, V>;
if (otherPair == null) {
return false;
}
return EqualityComparer<K>.Default.Equals(key, otherPair.key) &&
EqualityComparer<K>.Default.Equals(value, otherPair.value);
}
(As Avner notes, you could just use Tuple of course...)
As noted in comments, I'd also strongly recommend that you start using properties and C# naming conventions, e.g.:
if (!Player.Moves.Contains(loc)) {
Player.AddMove(loc);
}
The simplest way to improve this is to use, instead of your custom Pair class, an instance of the built-in Tuple<T1,T2> class.
The Tuple class, in addition to giving you an easy way to bundle several values together, automatically implements structural equality, meaning that a Tuple object is equal to another if:
It is a Tuple object.
Its two components are of the same types as the current instance.
Its two components are equal to those of the current instance. Equality is determined by the default object equality comparer for each component.
(from MSDN)
This means that instead of your Pair having to compare its values, you're delegating the responsibility to the types held in the Tuple.

Remove duplicates in custom IComparable class

I have a table that has combo pairs identifiers, and I use that to go through CSV files looking for matches. I'm trapping the unidentified pairs in a List, and sending them to an output box for later addition. I would like the output to only have single occurrences of unique pairs. The class is declared as follows:
public class Unmatched:IComparable<Unmatched>
{
public string first_code { get; set; }
public string second_code { get; set; }
public int CompareTo(Unmatched other)
{
if (this.first_code == other.first_code)
{
return this.second_code.CompareTo(other.second_code);
}
return other.first_code.CompareTo(this.first_code);
}
}
One note on the above code: This returns it in reverse alphabetical order, to get it in alphabetical order use this line:
return this.first_code.CompareTo(other.first_code);
Here is the code that adds it. This is directly after the comparison against the datatable elements
unmatched.Add(new Unmatched()
{ first_code = fields[clients[global_index].first_match_column]
, second_code = fields[clients[global_index].second_match_column] });
I would like to remove all pairs from the list where both first code and second code are equal, i.e.;
PTC,138A
PTC,138A
PTC,138A
MA9,5A
MA9,5A
MA9,5A
MA63,138A
MA63,138A
MA59,87BM
MA59,87BM
Should become:
PTC, 138A
MA9, 5A
MA63, 138A
MA59, 87BM
I have tried adding my own Equate and GetHashCode as outlined here:
http://www.morgantechspace.com/2014/01/Use-of-Distinct-with-Custom-Class-objects-in-C-Sharp.html
The SE links I have tried are here:
How would I distinct my list of key/value pairs
Get list of distinct values in List<T> in c#
Get a list of distinct values in List
All of them return a list that still has all the pairs. Here is the current code (Yes, I know there are two distinct lines, neither appears to be working) that outputs the list:
parser.Close();
List<Unmatched> noDupes = unmatched.Distinct().ToList();
noDupes.Sort();
noDupes.Select(x => x.first_code).Distinct();
foreach (var pair in noDupes)
{
txtUnmatchedList.AppendText(pair.first_code + "," + pair.second_code + Environment.NewLine);
}
Here is the Equate/Hash code as requested:
public bool Equals(Unmatched notmatched)
{
//Check whether the compared object is null.
if (Object.ReferenceEquals(notmatched, null)) return false;
//Check whether the compared object references the same data.
if (Object.ReferenceEquals(this, notmatched)) return true;
//Check whether the UserDetails' properties are equal.
return first_code.Equals(notmatched.first_code) && second_code.Equals(notmatched.second_code);
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public override int GetHashCode()
{
//Get hash code for the UserName field if it is not null.
int hashfirst_code = first_code == null ? 0 : first_code.GetHashCode();
//Get hash code for the City field.
int hashsecond_code = second_code.GetHashCode();
//Calculate the hash code for the GPOPolicy.
return hashfirst_code ^ hashsecond_code;
}
I have also looked at a couple of answers that are using queries and Tuples, which I honestly don't understand. Can someone point me to a source or answer that will explain the how (And why) of getting distinct pairs out of a custom list?
(Side question-Can you declare a class as both IComparable and IEquatable?)
The problem is you are not implementing IEquatable<Unmatched>.
public class Unmatched : IComparable<Unmatched>, IEquatable<Unmatched>
EqualityComparer<T>.Default uses the Equals(T) method only if you implement IEquatable<T>. You are not doing this, so it will instead use Object.Equals(object) which uses reference equality.
The overload of Distinct you are calling uses EqualityComparer<T>.Default to compare different elements of the sequence for equality. As the documentation states, the returned comparer uses your implementation of GetHashCode to find potentially-equal elements. It then uses the Equals(T) method to check for equality, or Object.Equals(Object) if you have not implemented IEquatable<T>.
You have an Equals(Unmatched) method, but it will not be used since you are not implementing IEquatable<Unmatched>. Instead, the default Object.Equals method is used which uses reference equality.
Note your current Equals method is not overriding Object.Equals since that takes an Object parameter, and you would need to specify the override modifier.
For an example on using Distinct see here.
You have to implement the IEqualityComparer<TSource> and not IComparable<TSource>.

Implementing GetHashCode for IEqualityComparer<T> with conditional equality

I'm wondering if anyone as any suggestions for this problem.
I'm using intersect and except (Linq) with a custom IEqualityComparer in order to query the set differences and set intersections of two sequences of ISyncableUsers.
public interface ISyncableUser
{
string Guid { get; }
string UserPrincipalName { get; }
}
The logic behind whether two ISyncableUsers are equal is conditional. The conditions center around whether either of the two properties, Guid and UserPrincipalName, have values. The best way to explain this logic is with code. Below is my implementation of the Equals method of my customer IEqualityComparer.
public bool Equals(ISyncableUser userA, ISyncableUser userB)
{
if (userA == null && userB == null)
{
return true;
}
if (userA == null)
{
return false;
}
if (userB == null)
{
return false;
}
if ((!string.IsNullOrWhiteSpace(userA.Guid) && !string.IsNullOrWhiteSpace(userB.Guid)) &&
userA.Guid == userB.Guid)
{
return true;
}
if (UsersHaveUpn(userA, userB))
{
if (userB.UserPrincipalName.Equals(userA.UserPrincipalName, StringComparison.InvariantCultureIgnoreCase))
{
return true;
}
}
return false;
}
private bool UsersHaveUpn(ISyncableUser userA, ISyncableUser userB)
{
return !string.IsNullOrWhiteSpace(userA.UserPrincipalName)
&& !string.IsNullOrWhiteSpace(userB.UserPrincipalName);
}
The problem I'm having, is with implementing GetHashCode so that the above conditional equality, represented above, is respected. The only way I've been able to get the intersect and except calls to work as expected is to simple always return the same value from GetHashCode(), forcing a call to Equals.
public int GetHashCode(ISyncableUser obj)
{
return 0;
}
This works but the performance penalty is huge, as expected. (I've tested this with non-conditional equality. With two sets containing 50000 objects, a proper hashcode implementation allows execution of intercept and except in about 40ms. A hashcode implementation that always returns 0 takes approximately 144000ms (yes, 2.4 minutes!))
So, how would I go about implementing a GetHashCode() in the scenario above?
Any thoughts would be more than welcome!
If I'm reading this correctly, your equality relation is not transitive. Picture the following three ISyncableUsers:
A { Guid: "1", UserPrincipalName: "2" }
B { Guid: "2", UserPrincipalName: "2" }
C { Guid: "2", UserPrincipalName: "1" }
A == B because they have the same UserPrincipalName
B == C because they have the same Guid
A != C because they don't share either.
From the spec,
The Equals method is reflexive, symmetric, and transitive. That is, it returns true if used to compare an object with itself; true for two objects x and y if it is true for y and x; and true for two objects x and z if it is true for x and y and also true for y and z.
If your equality relation isn't consistent, there's no way you can implement a hash code that backs it up.
From another point of view: you're essentially looking for three functions:
G mapping GUIDs to ints (if you know the GUID but the UPN is blank)
U mapping UPNs to ints (if you know the UPN but the GUID is blank)
P mapping (guid, upn) pairs to ints (if you know both)
such that G(g) == U(u) == P(g, u) for all g and u. This is only possible if you ignore g and u completely.
If we suppose that your Equals implementation is correct, i.e. it's reflective, transitive and symmetric then the basic implementation for your GetHashCode function should look like this:
public int GetHashCode(ISyncableUser obj)
{
if (obj == null)
{
return SOME_CONSTANT;
}
if (!string.IsNullOrWhiteSpace(obj.UserPrincipalName) &&
<can have user object with different guid and the same name>)
{
return GetHashCode(obj.UserPrincipalName);
}
return GetHashCode(obj.Guid);
}
You should also understand that you've got rather intricate dependencies between your objects.
Indeed, let's take two ISyncableUser objects: 'u1' and 'u2', such that u1.Guid != u2.Guid, but u1.UserPrincipalName == u2.UserPrincipalName and names are not empty. Requirements for Equality imposes that for any 'ISyncableUser' object 'u' such that u.Guid == u1.Guid, the condition u.UserPrincipalName == u1.UserPrincipalName should be also true. This reasoning dictates GetHashCode implementation, for each user object it should be based either on it's name or guid.
One way would be to maintain a dictionary of hashcodes for usernames and GUIDS.
You could generate this dictionary at the start once for all users, which would probably the cleanest solution.
You could add or update an entry in the Constructor of each user.
Or, you could maintain that dictionary inside the GetHashCode function. This means your GetHashCode function has more work to do and is not free of side-effects. Getting this to work with multiple threads or parallel-linq will need some more carefull work. So I don't know whether I would recommend this approach.
Nevertheless, here is my attempt:
private Dictionary<string, int> _guidHash =
new Dictionary<string, int>();
private Dictionary<string, int> _nameHash =
new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
public int GetHashCode(ISyncableUser obj)
{
int hash = 0;
if (obj==null) return hash;
if (!String.IsNullOrWhiteSpace(obj.Guid)
&& _guidHash.TryGetValue(obj.Guid, out hash))
return hash;
if (!String.IsNullOrWhiteSpace(obj.UserPrincipalName)
&& _nameHash.TryGetValue(obj.UserPrincipalName, out hash))
return hash;
hash = RuntimeHelpers.GetHashCode(obj);
// or use some other method to generate an unique hashcode here
if (!String.IsNullOrWhiteSpace(obj.Guid))
_guidHash.Add(obj.Guid, hash);
if (!String.IsNullOrWhiteSpace(obj.UserPrincipalName))
_nameHash.Add(obj.UserPrincipalName, hash);
return hash;
}
Note that this will fail if the ISyncableUser objects do not play nice and exhibit cases like in Rawling's answer. I am assuming that users with the same GUID will have the same name or no name at all, and users with the same principalName have the same GUID or no GUID at all. (I think the given Equals implementation has the same limitations)

LINQ Except() Method Does Not Work

I have 2 IList<T> of the same type of object ItemsDTO. I want to exclude one list from another. However this does not seem to be working for me and I was wondering why?
IList<ItemsDTO> related = itemsbl.GetRelatedItems();
IList<ItemsDTO> relating = itemsbl.GetRelatingItems().Except(related).ToList();
I'm trying to remove items in related from the relating list.
Since class is a reference type, your ItemsDTO class must override Equals and GetHashCode for that to work.
From MSDN:
Produces the set difference of two sequences by using the default
equality comparer to compare values.
The default equality comparer is going to be a reference comparison. So if those lists are populated independently of each other, they may contain the same objects from your point of view but different references.
When you use LINQ against SQL Server you have the benefit of LINQ translating your LINQ statement to a SQL query that can perform logical equality for you based on primary keys or value comparitors. With LINQ to Objects you'll need to define what logical equality means to ItemsDTO. And that means overriding Equals() as well as GetHashCode().
Except works well for value types. However, since you are using Ref types, you need to override Equals and GethashCode on your ItemsDTO in order to get this to work
I just ran into the same problem. Apparently .NET thinks the items in one list are different from the same items in the other list (even though they are actually the same). This is what I did to fix it:
Have your class inherit IEqualityComparer<T>, eg.
public class ItemsDTO: IEqualityComparer<ItemsDTO>
{
public bool Equals(ItemsDTO x, ItemsDTO y)
{
if (x == null || y == null) return false;
return ReferenceEquals(x, y) || (x.Id == y.Id); // In this example, treat the items as equal if they have the same Id
}
public int GetHashCode(ItemsDTO obj)
{
return this.Id.GetHashCode();
}
}

Comparing two objects' properties simply in c#

I have a class with lots of properties. A shallow copy is enough to fully replicate the object.
I need to compare an object just to check if it contains exactly the same values as another.
My ideas:
The first and most obvious solution is just to create a huge method block that compares each property, one after the other.
The second would be to serialize each object and hash the file or do some sort of md5 checksum on it. (Would this actually work?)
The third is to do some sort of reflection on the object, which would automate the first option, but create an added level of complexity.
Speed isn't really an issue.
I'm interested to hear thoughts, or any other methods I am missing to do such a thing.
Edit:
Thanks all. My solution (Modified to now be recursive on generic items):
public static bool IsSame<T>(T objA, T objB)
{
var type = typeof(T);
var properties = type.GetProperties();
foreach (var pi in properties.Where(a => a.CanRead))
{
if (pi.Name == "Item")
{
var piCount = properties.FirstOrDefault(a => a.Name == "Count");
int count = -1;
if (piCount != null && piCount.PropertyType == typeof(System.Int32))
count = (int)piCount.GetValue(objA, null);
if (count > 0)
{
for (int i = 0; i < count; i++)
{
dynamic child1 = pi.GetValue(objA, new object[] { i });
dynamic child2 = pi.GetValue(objB, new object[] { i });
return IsSame(child1, child2);
}
}
}
else
{
var val1 = type.GetProperty(pi.Name).GetValue(objA, null);
var val2 = type.GetProperty(pi.Name).GetValue(objB, null);
if (val1 != val2 && (val1 == null || !val1.Equals(val2)))
return false;
}
}
return true;
}
Most serializers are designed to ensure that data retains its integrity during serialization and deserialization, not to produce a consistent serialized format. I would avoid using serialization for this purpose.
You may consider implementing IEquatable, to have each instance capable of comparing itself with instances of the same type. Or a class do do the comparisons for you that implements IEqualityComparer. How they do this comparison may be the 'big method' that compares properties one after the other, or uses reflection.
Reflection can be a fairly quick and simple way to achieve your goal but can cause problems down the line (for example if someone adds a property to your type that should not be included for comparing equality), but obviously the converse is also true (someone adds a property that should be checked for equality, but the equality comparison isn't updated). Which approach you use should generally be decided by how comfortable the team is with each approach in tandem with the context in which the class will be used.
In your case I'd probably recommend using the reflection based approach since you wish to check the result of a shallow clone operation, so all properties should be equal.
As an aside, I'd recommend using the MemberwiseClone method to create the clone, which would lessen the need for such stringent tests.
The third option (reflection) would be the slowest, but it would also be the most robust/testable.
The hash code would be very similar to the first option, since you would have to call all of your member variables, so 1 and 2 would require you to update your .Equals(obj) and .GenerateHash() methods every time you modify your member variable lists.
Here is some code to get you started:
foreach (FieldInfo field in this.GetType().GetFields())
{
if (o[field.Name] == null)
{
if (!field.GetValue(this).Equals(o[field.Name]))
return false;
}
else
{
return false;
}
}
return true;
Another thought is that if the properties return simple value types you could group them into immutable value types of their own. For instance, if you have a customer with properties string Address1 string Address2 string City string State string Zip, you could create a value type Address that implements its own equality comparer and use that.
Not sure if this applies to your situation, but I've found that when I have a class with a lot of properties, it is often possible to do this, and it tends to make my classes simpler and easier to reason about.
If you serialize you have to specify the serialization order on each Field/Property to ensure that all data is serialized int the same order. But all you would achieve is to implement GetHashCode() in an external class.
GetHashCode() has some examples on how to override it, so look there first.
If your class has a lot of fields, and you don't want to write all the code by hand. You could use reflection to autogenerate the code. If your class changes from time to time then you could create a partial class with the GetHashCode() implementation in a T4 template.

Categories