Checking "value equality" of mutable classes in unit tests - c#

In a project, I have several classes that are mutable.
In unit tests using MSTest, I'd like to check whether objects of those classes are what I expect, which basically means comparing some fields and properties of the objects, i.e. value equality.
As I understand, if I want to use Assert.AreEqual and CollectionAssert.AreEqual etc., I'd have to override Equals (and maybe implement IEquatable) and do the comparison of the fields and properties there.
But then, as I understand it, to make sure nothing bad happens when somebody else uses these classes, say in a hash set, I also have to override GetHashCode. But (according to several questions and answers on Stack Overflow) overriding Equals to provide value equality and overriding GetHashCode cannot both be done correctly for mutable classes.
Thus, in unit tests, how can I check that objects of mutable classes are kind of "value equal"?

The problem with overriding GetHashCode is that if you put the object in a HashSet or Dictionary and then mutate it, the collection will no longer work correctly.
A workaround for this is to implement an IEqualityComparer<T> instead. That reduces the risk of bugs, since anyone using a dictionary would have to do extra work to specifically use your MyValueEqualityComparer.
This comparer should be usable with the Assert.AreEqual method, at least in the latest version of MSTest:
Assert.AreEqual(a, b, new MyValueEqualityComparer());
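A minimal sketch of such a comparer, assuming a hypothetical mutable class Customer with two properties (neither the class nor the comparer name comes from the question). The class keeps the default reference-based Equals and GetHashCode, and value equality lives only in the comparer:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable class; Equals and GetHashCode are deliberately NOT overridden.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Value equality is defined outside the class, so hash-based collections
// only see it when somebody passes this comparer explicitly.
public sealed class CustomerValueComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name && x.Age == y.Age;
    }

    public int GetHashCode(Customer obj)
    {
        if (obj is null) return 0;
        unchecked
        {
            // Built from the same members that Equals compares.
            return ((obj.Name != null ? obj.Name.GetHashCode() : 0) * 397) ^ obj.Age;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var expected = new Customer { Name = "Ann", Age = 30 };
        var actual = new Customer { Name = "Ann", Age = 30 };
        var comparer = new CustomerValueComparer();

        Console.WriteLine(expected.Equals(actual));           // False (reference equality)
        Console.WriteLine(comparer.Equals(expected, actual)); // True  (value equality)

        // In a test method, assuming your MSTest version has an overload
        // that accepts an IEqualityComparer<T>:
        // Assert.AreEqual(expected, actual, comparer);
    }
}
```

If the MSTest version in use has no such overload, Assert.IsTrue(comparer.Equals(expected, actual)) is a fallback that works everywhere, at the cost of a less informative failure message.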

Related

Should == operator behave exactly as Equals()? [duplicate]

Let's consider Polygon class. Check for equality should compare references most of the time, but there are many situations where value equality comes in handy (like when one compares two polygons with Assert.AreEqual).
My idea is to make value equality somewhat secondary to reference equality. In this case it's pretty obvious that the == operator should keep its default reference check implementation.
What about object.Equals() and IEquatable<Polygon>.Equals() then? MSDN doesn't imply that == and .Equals() should do the same but still - wouldn't it make the behavior of Polygon objects too ambiguous?
Also, the Polygon class is mutable.
MSDN is almost clear about it:
To check for reference equality, use ReferenceEquals. To check for
value equality, you should generally use Equals. However, Equals as it
is implemented by Object just performs a reference identity check. It
is therefore important, when you call Equals, to verify whether the
type overrides it to provide value equality semantics. When you create
your own types, you should override Equals.
By default, the operator == tests for reference equality by
determining if two references indicate the same object, so reference
types do not need to implement operator == in order to gain this
functionality. When a type is immutable, meaning the data contained in
the instance cannot be changed, overloading operator == to compare
value equality instead of reference equality can be useful because, as
immutable objects, they can be considered the same as long as they
have the same value. Overriding operator == in non-immutable types is
not recommended.
The IEquatable<T> documentation is also very clear:
Defines a generalized method that a value type or class implements to
create a type-specific method for determining equality of instances.
A major difficulty with equality testing in .NET (and also Java) is that there are two useful equivalence relations, each based on a question that can sensibly be asked of any class object, but .NET isn't consistent about which of the two questions Equals and GetHashCode are supposed to answer. The questions are:
Will you always and forever be equivalent to the object identified by some particular reference, no matter what happens to you?
Will you consider yourself equivalent to the object identified by some particular reference unless or until something with a reference to you does something that would affect that equivalence?
For immutable objects, both relationships should test value equality. For mutable objects, the first question should test referential equivalence and the second should test value equality. For an immutable object which holds a reference to an object which is of mutable type, but which nobody will ever mutate, both questions should test value equality of that encapsulated object.
My personal recommendation would be that mutable objects not override Object.Equals, but that they provide a static property that returns an IEqualityComparer which tests value equality. This would require that any object that immutably encapsulates the mutable object will have to get that IEqualityComparer to be able to report the encapsulated object's value-equivalence relation as its own, but having an IEqualityComparer would make it possible to store such things in e.g. a Dictionary provided they are never modified.
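One way this recommendation could look for the Polygon class from the question is sketched below. The Vertices list, the ValueComparer property name and the nested VertexComparer class are all assumptions made for illustration; the question does not show the real class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable Polygon; Object.Equals is left as reference equality.
public class Polygon
{
    public List<(int X, int Y)> Vertices { get; } = new List<(int X, int Y)>();

    // Static property exposing the value-equality relation on request.
    public static IEqualityComparer<Polygon> ValueComparer { get; } = new VertexComparer();

    private sealed class VertexComparer : IEqualityComparer<Polygon>
    {
        public bool Equals(Polygon a, Polygon b)
        {
            if (ReferenceEquals(a, b)) return true;
            if (a is null || b is null) return false;
            return a.Vertices.SequenceEqual(b.Vertices);
        }

        public int GetHashCode(Polygon p)
        {
            if (p is null) return 0;
            unchecked
            {
                int hash = 17;
                foreach (var v in p.Vertices) hash = hash * 31 + v.GetHashCode();
                return hash;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Polygon { Vertices = { (0, 0), (1, 0), (0, 1) } };
        var b = new Polygon { Vertices = { (0, 0), (1, 0), (0, 1) } };

        Console.WriteLine(a.Equals(b));                       // False
        Console.WriteLine(Polygon.ValueComparer.Equals(a, b)); // True

        // Safe only as long as no key is mutated while it is in the dictionary.
        var areas = new Dictionary<Polygon, double>(Polygon.ValueComparer);
        areas[a] = 0.5;
        Console.WriteLine(areas.ContainsKey(b));              // True
    }
}
```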

Continuing confusion regarding overriding Equals for mutable objects that are used in data bound collections

Background:
I've written a large scale WPF application using MVVM and it's been suffering from some intermittent problems. I initially asked the question "'An item with the same key has already been added' Exception on selecting a ListBoxItem from code" here, which explains the problem, but got no answers.
After some time, I managed to work out the cause of the Exceptions that I was getting and documented it in the question "What to return when overriding Object.GetHashCode() in classes with no immutable fields?". Basically, it was because I had used mutable fields in the formula to return a value for GetHashCode.
From the very useful answers that I received for that question, I managed to deepen my understanding in that area. Here are three relevant rules:
If x equals y then the hash code of x must equal the hash code of y. Equivalently, if the hash code of x does not equal the hash code of y, then x and y must be unequal.
The hash code of x must remain stable while x is in a hash table.
The hash function should generate a random distribution among all integers for all inputs.
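The second rule is the one that mutable fields break. A small illustration, using a hypothetical Item class whose hash code is derived from a settable Id (not the asker's real code):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose hash code depends on a mutable property.
public class Item
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is Item other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class Program
{
    public static void Main()
    {
        var item = new Item { Id = 1 };
        var set = new HashSet<Item> { item };

        Console.WriteLine(set.Contains(item)); // True

        // Mutating the object while it sits in the set changes its hash code,
        // so the set now looks for it under the wrong hash.
        item.Id = 2;

        Console.WriteLine(set.Contains(item)); // False
        Console.WriteLine(set.Count);          // 1 (the item is still stored, just unreachable)
    }
}
```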
These rules affected the possible solutions that I had to my problem of not knowing what to return from the GetHashCode method:
I couldn't return a constant because that would break the first and third rules above.
I couldn't create an additional readonly field for each class, solely to be used in the GetHashCode method for the same reasons.
The solution that I eventually went with was to remove each item from its ObservableCollection before editing any of the properties used in the GetHashCode method and then to re-add it again afterwards. While this has worked Ok in a number of views so far, I've run into a further problem as my UI items are animated using custom Panels. When I re-add an item (even by inserting it back to its original index in the collection), it sets off the entry animation(s) again.
I had already added a number of base class methods such as AddWithoutAnimation, RemoveWithoutAnimation, which has helped fix some of these issues, but it doesn't affect any Storyboard animations, which still get triggered after re-adding. So finally, we come to the question:
Question:
First, I'd like to clearly state that I am not using any Dictionary objects in my code... the Dictionary that throws the Exception must be internal to an ObservableCollection<T>. This point seems to have been missed by most people in my last question. Therefore, I cannot choose to simply not use a Dictionary... if only I could.
So, my question is 'is there any other way that I can implement GetHashCode in mutable classes while not breaking the three rules above, or avoid implementing it in the first place?'
I received a comment on the previous question from @HansPassant that suggested that
A good starting point is to completely remove the Equals and GetHashCode overrides, the default implementations inherited from Object are excellent and guarantee object uniqueness.
Can anyone tell me how can I remove the Equals and GetHashCode overrides? On the IEquatable<T> Interface page on MSDN it says It should be implemented for any object that might be stored in a generic collection and then on the IEquatable<T>.Equals Method page it says If you implement Equals, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behaviour is consistent with that of the IEquatable<T>.
If this is possible, it would be my preferred solution.
UPDATE >>>
After downloading and installing dotPeek, I have been able to look inside the PresentationFramework namespace where the Exception is actually occurring. I have found the exact part that uses the Dictionary that is causing this problem. It is in the internal InternalSelectedItemsStorage class constructor:
    internal InternalSelectedItemsStorage(Selector.InternalSelectedItemsStorage collection, IEqualityComparer<ItemsControl.ItemInfo> equalityComparer = null)
    {
        this._equalityComparer = equalityComparer ?? collection._equalityComparer;
        this._list = new List<ItemsControl.ItemInfo>((IEnumerable<ItemsControl.ItemInfo>) collection._list);
        if (collection.UsesItemHashCodes)
            this._set = new Dictionary<ItemsControl.ItemInfo, ItemsControl.ItemInfo>((IDictionary<ItemsControl.ItemInfo, ItemsControl.ItemInfo>) collection._set, this._equalityComparer);
        this._resolvedCount = collection._resolvedCount;
        this._unresolvedCount = collection._unresolvedCount;
    }
This is used internally by the Selector class after the ListBoxItem.OnSelected method has been called, so I can only assume that this has something to do with when a selection is made on the ListBox.
Can anyone tell me how can I remove the Equals and GetHashCode overrides? On the IEquatable<T> Interface page on MSDN it says It should be implemented for any object that might be stored in a generic collection and then on the IEquatable<T>.Equals Method page it says If you implement Equals, you should also override the base class implementations of Object.Equals(Object) and GetHashCode so that their behaviour is consistent with that of the IEquatable<T>.
Mutable objects are comparable by their identity while immutable or value objects by their values.
If you have a mutable object you need to figure out its identity (e.g. if it is a representation of an entity stored in the database, the identity is the primary key of the entity; if it is just an 'ad hoc' mutable object created in memory, then its identity is the reference of this object, i.e. the default implementation of Equals and GetHashCode).
So if your object is not an entity you simply implement IEquatable<T>.Equals(T x) { return base.Equals(x); }, i.e. you say that yes, you can compare objects of this class with objects of class T, and you compare them by reference (the Equals() method inherited from System.Object). Note that the call has to go through base: if T is the class itself, this.Equals(x) would bind to the same Equals(T) overload and recurse forever.
If your object is an entity and e.g. has a primary key PersonId, then you do comparison by PersonId and return PersonId.GetHashCode() from your GetHashCode() method.
Btw, in case of entities you usually use some OR mapper and Identity map pattern which ensures that within a given unit of work you don't have more than one instance of a given entity, i.e. whenever primary keys are equal the object references are equal too.
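For the entity case, a minimal sketch could look like this. The Person class, its constructor and the Name property are assumptions for illustration; only PersonId is taken from the answer. Because the key is read-only, the hash code stays stable while other properties change:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: identity is the primary key, everything else is mutable.
public class Person : IEquatable<Person>
{
    public Person(int personId) { PersonId = personId; }

    public int PersonId { get; }          // never changes after construction
    public string Name { get; set; }      // free to change, not part of equality

    public bool Equals(Person other) => !(other is null) && other.PersonId == PersonId;

    public override bool Equals(object obj) => Equals(obj as Person);

    public override int GetHashCode() => PersonId.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        var first = new Person(42) { Name = "Ann" };
        var second = new Person(42) { Name = "Ann Smith" };

        Console.WriteLine(first.Equals(second)); // True (same key)

        var people = new HashSet<Person> { first };
        first.Name = "Renamed";                  // does not affect the hash code
        Console.WriteLine(people.Contains(first)); // True
    }
}
```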

A questionable insight into overriding Equals

Following Guidelines for Overriding Equals() and Operator == (C# Programming Guide), it seems advisable to override GetHashCode when overriding Equals(object), as well as Equals(type).
It is my understanding that there is an endless discussion about what's the best implementation for overriding Equals. However, I'd still like to understand the Equals concept a little better and decide for myself.
My questions will probably be kinda noobish, but here we go:
What is the main difference between Equals(object) and Equals(type) (independently of the given parameters)?
As far as I understand (And I could be completely wrong, so this is a question at the same time):
Equals(object) is a built-in method that checks (by default) whether object references are the same. And Equals(Type) is a local method you create. So in fact, what you have in that class is the method Equals with 2 overloads.
Why do they check for property equality twice?
In Equals(object):
    return base.Equals(obj) && z == p.z;
and in Equals(type):
    return base.Equals((TwoDPoint)p) && z == p.z;
Why is it advisable to implement the Equals(type) method?
Most of my questions are wrapped up in my statement in question 1. So please point out any wrong or misleading arguments. Also, feel free to add any information, it will certainly help.
Thanks in advance
First let's distinguish the 2 methods
object.Equals() is a method on the root object which is marked as virtual and therefore can be overridden in a derived class.
IEquatable<T>.Equals is a method obtained by implementing the IEquatable<T> interface.
The latter is used for determining equality inside a generic collection; as the documentation says:
The IEquatable<T> interface is used by generic collection objects such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.
The former is used for determining equality everywhere else.
So with the groundwork in place let's try to answer some of your specific questions
What is the main difference between Equals(object) and Equals(type) (independently of the given parameters)?
One operates on any type, the other compares instances of the same type
Why do they check for property equality twice?
They don't; generally only one is used. However, quite often one implementation calls the other internally
Why is it advisable to implement the Equals(type) method?
The answer is above - if you intend to store the object in a generic collection
As a side note, and one which may help you understand this, the default behaviour of equality checking is to check that the references are the same (ie, that one object is exactly the same instance as another). Quite often overriding/implementing different equality logic is used to compare some data within fields of the object (akin to your example of z == p.z)
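A common arrangement, sketched here with a hypothetical immutable Point class rather than the TwoDPoint from the guide, is to compare the fields in Equals(Point) only and let Equals(object) delegate to it:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable type, so the hash code can safely use its fields.
public sealed class Point : IEquatable<Point>
{
    public Point(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    // Type-specific overload: the single place where fields are compared.
    public bool Equals(Point other) => !(other is null) && X == other.X && Y == other.Y;

    // Object overload: 'as' handles null and foreign types, then delegates.
    public override bool Equals(object obj) => Equals(obj as Point);

    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Point(1, 2);
        var b = new Point(1, 2);

        Console.WriteLine(a.Equals(b));                        // True, binds to Equals(Point)
        Console.WriteLine(a.Equals((object)b));                // True, binds to Equals(object)
        Console.WriteLine(a.Equals("text"));                   // False, binds to Equals(object)
        Console.WriteLine(new List<Point> { a }.Contains(b));  // True, List<T> uses IEquatable<T>
    }
}
```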
One difference between the overloads is that, as noted, one will be invoked when comparing an object to things which are known at compile time to be of the same type, while the other will be invoked in all other circumstances. Another very important difference which has not been mentioned is that Equals(ownType) will act not only on things of ownType, but also on things that are implicitly convertible to ownType. Because of this, Equals cannot be expected to implement an equivalence relation among objects of convertible types unless one forces its operands to be of type Object. Consider, for example,
(12.0).Equals(12);
This converts the integer value 12 to the Double value 12.0. Since the type and value of the passed value precisely match the 12.0 whose Equals method is being invoked, the method returns true.
(12).Equals(12.0);
Because Double is not implicitly convertible to Int32, this passes the Double value as Object instead. Since the Double does not match the type of the 12 whose Equals method is being invoked, the method returns false.
The virtual method Equals(Object) implements an equivalence relation; in many cases involving implicit type conversions, the type-specific methods cannot be expected to do so.
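The asymmetry is easy to reproduce. The third call below is an addition to the two from the answer: it forces the argument to Object, which is the situation the virtual overload handles consistently.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        // Binds to Double.Equals(double): the int argument is converted to 12.0 first.
        Console.WriteLine((12.0).Equals(12));          // True

        // No implicit double -> int conversion, so this binds to Int32.Equals(object).
        Console.WriteLine((12).Equals(12.0));          // False

        // Forcing the argument to object selects Double.Equals(object),
        // which rejects a boxed Int32.
        Console.WriteLine((12.0).Equals((object)12));  // False
    }
}
```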

What's compared in my Class without an EqualityComparer?

I want to check if an object is in a Queue before I enqueue it. If I don't explicitly define an EqualityComparer, what does the Contains() function compare?
If it compares property values, that's perfect. If it compares to see if a reference to that object exists in the Queue then that defeats what I'm trying to accomplish in my code.
For classes, the default equality operation is by reference - it assumes that object identity and equality are the same, basically.
You can overcome this by overriding Equals and GetHashCode. I'd also suggest implementing IEquatable<T> to make this clear. Your hash code implementation should generate the hash code from the same values as the equality operation.
The default for reference types is to compare the reference.
However, if the type implements IEquatable<> it can be doing a different comparison. If you need to have a specific equality comparison in place, you need to create one yourself.
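To make the difference concrete, here is a short sketch with two hypothetical classes, PlainJob (no equality members) and ValueJob (implements IEquatable<T> over a read-only Id). Neither name comes from the question.

```csharp
using System;
using System.Collections.Generic;

// No equality members: Contains falls back to reference equality.
public class PlainJob
{
    public int Id { get; set; }
}

// Implements IEquatable<T>, so Queue<T>.Contains compares by Id.
public sealed class ValueJob : IEquatable<ValueJob>
{
    public ValueJob(int id) { Id = id; }

    public int Id { get; }

    public bool Equals(ValueJob other) => !(other is null) && other.Id == Id;

    public override bool Equals(object obj) => Equals(obj as ValueJob);

    public override int GetHashCode() => Id;
}

public static class Program
{
    public static void Main()
    {
        var plain = new Queue<PlainJob>();
        plain.Enqueue(new PlainJob { Id = 1 });
        Console.WriteLine(plain.Contains(new PlainJob { Id = 1 })); // False

        var value = new Queue<ValueJob>();
        value.Enqueue(new ValueJob(1));
        Console.WriteLine(value.Contains(new ValueJob(1)));         // True
    }
}
```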

Overriding Object.Equals() instance method in C#; now Code Analysis/FxCop warning CA2218: "should also redefine GetHashCode". Should I suppress this?

I've got a complex class in my C# project on which I want to be able to do equality tests. It is not a trivial class; it contains a variety of scalar properties as well as references to other objects and collections (e.g. IDictionary). For what it's worth, my class is sealed.
To enable a performance optimization elsewhere in my system (an optimization that avoids a costly network round-trip), I need to be able to compare instances of these objects to each other for equality – other than the built-in reference equality – and so I'm overriding the Object.Equals() instance method. However, now that I've done that, Visual Studio 2008's Code Analysis a.k.a. FxCop, which I keep enabled by default, is raising the following warning:
warning : CA2218 : Microsoft.Usage : Since 'MySuperDuperClass'
redefines Equals, it should also redefine GetHashCode.
I think I understand the rationale for this warning: If I am going to be using such objects as the key in a collection, the hash code is important. i.e. see this question. However, I am not going to be using these objects as the key in a collection. Ever.
Feeling justified to suppress the warning, I looked up code CA2218 in the MSDN documentation to get the full name of the warning so I could apply a SuppressMessage attribute to my class as follows:
    [SuppressMessage("Microsoft.Usage",
        "CA2218:OverrideGetHashCodeOnOverridingEquals",
        Justification="This class is not to be used as key in a hashtable.")]
However, while reading further, I noticed the following:
How to Fix Violations
To fix a violation of this rule, provide an implementation of GetHashCode. For a pair of objects of the same type, you must ensure that the implementation returns the same value if your implementation of Equals returns true for the pair.
When to Suppress Warnings
-----> Do not suppress a warning from this
rule. [arrow & emphasis mine]
So, I'd like to know: Why shouldn't I suppress this warning as I was planning to? Doesn't my case warrant suppression? I don't want to code up an implementation of GetHashCode() for this object that will never get called, since my object will never be the key in a collection. If I wanted to be pedantic, instead of suppressing, would it be more reasonable for me to override GetHashCode() with an implementation that throws a NotImplementedException?
Update: I just looked this subject up again in Bill Wagner's good book Effective C#, and he states in "Item 10: Understand the Pitfalls of GetHashCode()":
If you're defining a type that won't ever be used as the key in a container, this won't matter. Types that represent window controls, web page controls, or database connections are unlikely to be used as keys in a collection. In those cases, do nothing. All reference types will have a hash code that is correct, even if it is very inefficient. [...] In most types that you create, the best approach is to avoid the existence of GetHashCode() entirely.
... that's where I originally got this idea that I need not be concerned about GetHashCode() always.
If you are reallio-trulio absosmurfly positive that you'll never use the thing as a key to a hash table then your proposal is reasonable. Override GetHashCode; make it throw an exception.
Note that hash tables hide in unlikely places. Plenty of LINQ sequence operators use hash table implementations internally to speed things up. By rejecting the implementation of GetHashCode you are also rejecting being able to use your type in a variety of LINQ queries. I like to build algorithms that use memoization for speed increases; memoizers usually use hash tables. You are therefore also rejecting the ability to memoize method calls that take your type as a parameter.
Alternatively, if you don't want to be that harsh: Override GetHashCode; make it always return zero. That meets the semantic requirements of GetHashCode; that two equal objects always have the same hash code. If it is ever used as a key in a dictionary performance is going to be terrible, but you can deal with that problem when it arises, which you claim it never will.
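Both options can be tried against a LINQ operator that hashes internally. The sketch below uses two hypothetical classes, ZeroHashConfig and ThrowingConfig, each with value-based Equals on a Name property; they stand in for the asker's class, which is not shown.

```csharp
using System;
using System.Linq;

// Option 1: constant hash code. Correct, but every instance collides.
public sealed class ZeroHashConfig
{
    public string Name { get; set; }

    public override bool Equals(object obj) => obj is ZeroHashConfig c && c.Name == Name;

    public override int GetHashCode() => 0;
}

// Option 2: refuse to be hashed at all.
public sealed class ThrowingConfig
{
    public string Name { get; set; }

    public override bool Equals(object obj) => obj is ThrowingConfig c && c.Name == Name;

    public override int GetHashCode() =>
        throw new NotSupportedException("This type is not meant to be hashed.");
}

public static class Program
{
    public static void Main()
    {
        var zero = new[]
        {
            new ZeroHashConfig { Name = "a" },
            new ZeroHashConfig { Name = "a" },
            new ZeroHashConfig { Name = "b" },
        };
        // Distinct still works; it just degrades to comparing everything with Equals.
        Console.WriteLine(zero.Distinct().Count()); // 2

        var throwing = new[] { new ThrowingConfig { Name = "a" } };
        try
        {
            throwing.Distinct().Count();
        }
        catch (NotSupportedException)
        {
            // Distinct hashes its elements, so the type is unusable here.
            Console.WriteLine("Distinct called GetHashCode");
        }
    }
}
```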
All that said: come on. You've probably spent more time typing up the question than it would take to correctly implement it. Just do it.
You should not suppress it. Look at how your equals method is implemented. I'm sure it compares one or more members on the class to determine equality. One of these members is oftentimes enough to distinguish one object from another, and therefore you could implement GetHashCode by returning membername.GetHashCode();.
My $0.10 worth? Implement GetHashCode.
As much as you say you'll never, ever need it, you may change your mind, or someone else may have other ideas on how to use the code. A working GetHashCode isn't hard to make, and guarantees that there won't be any problems in the future.
As soon as you forget, or another developer who isn't aware uses this, someone is going to have a painful bug to track down. I'd recommend simply implementing GetHashCode correctly and then you won't have to worry about it. Or just don't use Equals for your special equality comparison case.
The GetHashCode and Equals methods work together to provide value-based equality semantics for your type - you ought to implement them together.
For more information on this topic please see these articles:
All types are not compared equally
All types are not compared equally (part 2)
Shameless plug: These articles were written by me.
