Equals method implementation helpers (C#) - c#

Every time I write some data class, I usually spend so much time writing the IEquatable implementation.
The last class I wrote was something like:
public class Polygon
{
public Point[] Vertices { get; set; }
}
Implementing IEquatable was exhausting. C# 3.0/LINQ certainly helps a lot, but the vertices can be shifted and/or in reverse order, and that adds a lot of complexity to the Equals method. After many unit tests and the corresponding implementation, I gave up and changed my application to accept only triangles, whose IEquatable implementation required only 11 unit tests to be fully covered.
Is there any tool or technique that helps with implementing Equals and GetHashCode?

I use ReSharper to generate equality members. It will optionally implement IEquatable<T> as well as overriding operators if you want that (which of course you never do, but it's cool anyway).
The implementation of Equals includes an override of Object.Equals(Object), as well as a strongly typed variant (which can avoid unnecessary type checking). The weakly typed version calls the strongly typed one after performing a type check. The strongly typed version performs a reference equality check (Object.ReferenceEquals(Object,Object)) and then compares the values of all fields (well, only those that you tell the generator to include).
As for GetHashCode, the GetHashCode values of the fields are combined in a smart way (using unchecked to avoid overflow exceptions if you use the compiler's checked option). Each field's value (apart from the first one) is multiplied by a prime number before being combined. You can also specify which fields will never be null, and it'll drop any null checks.
Here's what you get for your Polygon class by pressing ALT+Insert then selecting "Generate Equality Members":
public class Polygon : IEquatable<Polygon>
{
public Point[] Vertices { get; set; }
public bool Equals(Polygon other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return Equals(other.Vertices, Vertices);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof (Polygon)) return false;
return Equals((Polygon) obj);
}
public override int GetHashCode()
{
return (Vertices != null ? Vertices.GetHashCode() : 0);
}
}
Some of the features I talked about above don't apply as there is only one field. Note too that it hasn't checked the contents of the array.
In general though, ReSharper pumps out a lot of excellent code in just a matter of seconds. And that feature is pretty low on my list of things that makes ReSharper such an amazing tool.

For comparing two arrays of items, I use the SequenceEqual extension method.
As for a generic Equals and GetHashCode, there's a technique based on serialization that might work for you.
Using MemoryStream and BinaryFormatter for reuseable GetHashCode and DeepCopy functions
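As a rough illustration of the SequenceEqual suggestion, here is a minimal sketch of a Polygon whose equality looks at the contents of the array. The Point struct below is a hypothetical stand-in (the question does not show its Point type), and the comparison is deliberately order-sensitive: it does not handle the shifted or reversed vertex lists the question mentions.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the question's Point type.
public struct Point : IEquatable<Point>
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
    public bool Equals(Point other) { return X == other.X && Y == other.Y; }
    public override bool Equals(object obj) { return obj is Point && Equals((Point)obj); }
    public override int GetHashCode() { unchecked { return (X * 397) ^ Y; } }
}

public class Polygon : IEquatable<Polygon>
{
    public Point[] Vertices { get; set; }

    public bool Equals(Polygon other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        // If either array is missing, the polygons are equal only when both are missing.
        if (Vertices == null || other.Vertices == null)
            return Vertices == other.Vertices;
        // Element-by-element comparison; the order of the vertices matters here.
        return Vertices.SequenceEqual(other.Vertices);
    }

    public override bool Equals(object obj) { return Equals(obj as Polygon); }

    public override int GetHashCode()
    {
        if (Vertices == null) return 0;
        unchecked
        {
            // Combine the element hashes so that equal sequences hash equally.
            int hash = 17;
            foreach (Point p in Vertices) hash = hash * 31 + p.GetHashCode();
            return hash;
        }
    }
}
```

Two polygons built from separate arrays with the same vertices in the same order now compare equal and share a hash code. Treating rotated or reversed vertex lists as equal would need extra normalisation on top of this.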

Related

System.Array.IndexOf allocates memory

I've been profiling my code and found that System.Array.IndexOf is allocating a fair bit of memory. I've been trying to find out how come this happens.
public struct LRItem
{
public ProductionRule Rule { get; } // ProductionRule is a class
public int Position { get; }
}
// ...
public List<LRItem> Items { get; } = new List<LRItem>();
// ...
public bool Add(LRItem item)
{
if (Items.Contains(item)) return false;
Items.Add(item);
return true;
}
I'm assuming the IndexOf is called by Items.Contains because I don't think Items.Add has any business checking indices. I've tried looking at the reference source and .NET Core source but to no avail. Is this a bug in the VS profiler? Is this function actually allocating memory? Could I optimize my code somehow?
I know this is probably a bit late, but in case anyone else has the same question...
When List<T>.Contains(...) is called, it uses the EqualityComparer<T>.Default to compare the individual items to find what you've passed in[1]. The docs say this about EqualityComparer<T>.Default:
The Default property checks whether type T implements the System.IEquatable<T> interface and, if so, returns an EqualityComparer<T> that uses that implementation. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T.
Since your LRItem does not implement IEquatable<T>, the comparer falls back to the virtual Object.Equals(object). And because LRItem is a struct, each item ends up being boxed as an object so it can be passed to Object.Equals(...), which is where the allocations are coming from.
The easy fix for this is to take a hint from the docs and implement the IEquatable<T> interface:
public struct LRItem : IEquatable<LRItem>
{
// ...
public bool Equals(LRItem other)
{
// Implement this
return true;
}
}
This will now cause EqualityComparer<T>.Default to return a specialised comparer that does not need to box your LRItem structs, and hence avoids the allocation.
[1] I'm not sure if something's changed since this question was asked (or maybe it's a .net framework vs core difference or something) but List<T>.Contains() doesn't call Array.IndexOf() nowadays. Either way, both of them do defer to EqualityComparer<T>.Default, which means that this should still be relevant in either case.
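For completeness, here is one way the "Implement this" placeholder might be filled in. This is only a sketch: ProductionRule is an empty placeholder class (its real members are not shown in the question), the constructor is added for illustration, and comparing Rule through object.Equals is an assumption about what equality should mean for rules.

```csharp
using System;
using System.Collections.Generic;

// Placeholder for the question's ProductionRule class; its members are not shown there.
public class ProductionRule { }

public struct LRItem : IEquatable<LRItem>
{
    public ProductionRule Rule { get; }
    public int Position { get; }

    // Constructor added for illustration only.
    public LRItem(ProductionRule rule, int position) { Rule = rule; Position = position; }

    // Strongly typed comparison: no boxing when called through EqualityComparer<LRItem>.Default.
    public bool Equals(LRItem other)
    {
        return Position == other.Position && object.Equals(Rule, other.Rule);
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return obj is LRItem && Equals((LRItem)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return ((Rule != null ? Rule.GetHashCode() : 0) * 397) ^ Position;
        }
    }
}
```

With this in place, List<LRItem>.Contains goes through the typed Equals(LRItem) overload, so the per-comparison boxing described above should disappear.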

Why Implement the IEquatable<T> Interface

I have been reading articles and understand interfaces to an extent; however, if I wanted to write my own custom Equals method, it seems I can do this without implementing the IEquatable interface. An example:
using System;
using System.Collections;
using System.ComponentModel;
namespace ProviderJSONConverter.Data.Components
{
public class Address : IEquatable<Address>
{
public string address { get; set; }
[DefaultValue("")]
public string address_2 { get; set; }
public string city { get; set; }
public string state { get; set; }
public string zip { get; set; }
public bool Equals(Address other)
{
if (Object.ReferenceEquals(other, null)) return false;
if (Object.ReferenceEquals(this, other)) return true;
return (this.address.Equals(other.address)
&& this.address_2.Equals(other.address_2)
&& this.city.Equals(other.city)
&& this.state.Equals(other.state)
&& this.zip.Equals(other.zip));
}
}
}
Now if I don't implement the interface and leave : IEquatable<Address> out of the code, it seems the application operates exactly the same. Therefore, I am unclear as to why I should implement the interface. I can write my own custom Equals method without it, and the breakpoint will still hit the method and give back the same results.
Can anyone help explain this to me more? I am hung up on why to include "IEquatable<Address>" before calling the Equals method.
"Now if I don't implement the interface and leave : IEquatable<Address> out of the code, it seems the application operates exactly the same."
Well, that depends on what "the application" does. For example:
List<Address> addresses = new List<Address>
{
new Address { ... }
};
int index = addresses.IndexOf(new Address { ... });
... that won't work (i.e. index will be -1) if you have neither overridden Equals(object) nor implemented IEquatable<T>. List<T>.IndexOf won't call your Equals overload.
Code that knows about your specific class will pick up the Equals overload - but any code (e.g. generic collections, all of LINQ to Objects etc) which just works with arbitrary objects won't pick it up.
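The difference can be seen in a small sketch. The two classes below are hypothetical, cut-down variants of the question's Address (a single City property each): one only declares an Equals overload, the other implements IEquatable<T> and overrides Equals(object) and GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Only an overload: generic code such as List<T>.IndexOf never sees it.
public class LooseAddress
{
    public string City { get; set; }
    public bool Equals(LooseAddress other) { return other != null && City == other.City; }
}

// Implements the interface and overrides the object-based members.
public class StrictAddress : IEquatable<StrictAddress>
{
    public string City { get; set; }
    public bool Equals(StrictAddress other) { return other != null && City == other.City; }
    public override bool Equals(object obj) { return Equals(obj as StrictAddress); }
    public override int GetHashCode() { return City != null ? City.GetHashCode() : 0; }
}

public static class IndexOfDemo
{
    // Falls back to reference equality, so the new instance is not found.
    public static int FindLoose()
    {
        var list = new List<LooseAddress> { new LooseAddress { City = "Oslo" } };
        return list.IndexOf(new LooseAddress { City = "Oslo" });
    }

    // EqualityComparer<StrictAddress>.Default picks up IEquatable<StrictAddress>.
    public static int FindStrict()
    {
        var list = new List<StrictAddress> { new StrictAddress { City = "Oslo" } };
        return list.IndexOf(new StrictAddress { City = "Oslo" });
    }
}
```

FindLoose returns -1 while FindStrict returns 0, even though both classes contain the same comparison logic.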
The .NET framework has confusingly many possibilities for equality checking:
1. The virtual Object.Equals(object)
2. The overloadable comparison operators (==, !=, <, >, <=, >=)
3. IEquatable<T>.Equals(T)
4. IComparable.CompareTo(object)
5. IComparable<T>.CompareTo(T)
6. IEqualityComparer.Equals(object, object)
7. IEqualityComparer<T>.Equals(T, T)
8. IComparer.Compare(object, object)
9. IComparer<T>.Compare(T, T)
And that is without mentioning ReferenceEquals, the static Object.Equals(object, object) and the special cases (e.g. string and floating-point comparison); these are just the cases where we can implement something.
Additionally, the default behavior of the first two points is different for structs and classes. So it is no wonder that a user can be confused about what to implement and how.
As a rule of thumb, you can follow this pattern:
Classes
By default, both the Equals(object) method and equality operators (==, !=) check reference equality.
If reference equality is not right for you, override the Equals method (and also GetHashCode; otherwise, your class will not be able to be used in hashed collections)
You can keep the original reference equality functionality for the == and != operators, it is common for classes. But if you overload them, it must be consistent with Equals.
If your instances can be ordered relative to each other (less than or greater than), implement the IComparable interface. When Equals reports equality, CompareTo must return 0 (again, consistency).
Basically that's it. Implementing the generic IEquatable<T> and IComparable<T> interfaces for classes is not a must: as there is no boxing, the performance gain would be minimal in the generic collections. But remember, if you implement them, keep them consistent.
Structs
By default, Equals(object) performs a value comparison for structs (it checks the field values). Though this is normally the expected behavior for a value type, the base implementation does this by using reflection, which performs terribly. So always override Equals(object) in a public struct, even if you implement the same functionality as it originally had.
When the Equals(object) method is used for structs, boxing occurs, which has a performance cost (not as bad as the reflection in ValueType.Equals, but it matters). That's why the IEquatable<T> interface exists. You should implement it on structs if you want to use them in generic collections. Have I already mentioned keeping things consistent?
By default, the == and != operators cannot be used for structs so you must overload them if you want to use them. Simply call the strongly-typed IEquatable<T>.Equals(T) implementation.
Similarly to classes, if less-or-greater is meaningful for your type, implement the IComparable interface. In case of structs, you should implement the IComparable<T> as well to make things performant (eg. Array.Sort, List<T>.BinarySearch, using the type as a key in a SortedList<TKey, TValue>, etc.). If you overloaded the ==, != operators, you should do it for <, >, <=, >=, too.
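Put together, the struct recommendations above might look like the following sketch. Priority and its Level property are hypothetical names chosen for illustration; every member simply delegates to the strongly typed Equals or CompareTo so that all the entry points stay consistent.

```csharp
using System;

public struct Priority : IEquatable<Priority>, IComparable<Priority>, IComparable
{
    public int Level { get; }
    public Priority(int level) { Level = level; }

    // Equality: typed version first, object version delegates to it.
    public bool Equals(Priority other) { return Level == other.Level; }
    public override bool Equals(object obj) { return obj is Priority && Equals((Priority)obj); }
    public override int GetHashCode() { return Level; }

    // Ordering: typed version first, non-generic version delegates to it.
    public int CompareTo(Priority other) { return Level.CompareTo(other.Level); }
    public int CompareTo(object obj)
    {
        if (obj == null) return 1; // by convention, any instance sorts after null
        if (!(obj is Priority)) throw new ArgumentException("Object is not a Priority.", "obj");
        return CompareTo((Priority)obj);
    }

    // Operators are not provided for structs by default, so define them explicitly.
    public static bool operator ==(Priority a, Priority b) { return a.Equals(b); }
    public static bool operator !=(Priority a, Priority b) { return !a.Equals(b); }
    public static bool operator <(Priority a, Priority b) { return a.CompareTo(b) < 0; }
    public static bool operator >(Priority a, Priority b) { return a.CompareTo(b) > 0; }
    public static bool operator <=(Priority a, Priority b) { return a.CompareTo(b) <= 0; }
    public static bool operator >=(Priority a, Priority b) { return a.CompareTo(b) >= 0; }
}
```

Because CompareTo returns 0 exactly when Equals returns true, sorting, hashing and the operators all agree with each other.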
A little addendum:
If you must use a type that has an improper comparison logic for your needs, you can use the interfaces from 6. to 9. in the list. This is where you can forget consistency (at least considering the self Equals of the type) and you can implement a custom comparison that can be used in hash-based and sorted collections.
If you had overridden the Equals(object obj) method, then it would only be a matter of performance, as noted here: What's the difference between IEquatable and just overriding Object.Equals()?
But as long as you don't override Equals(object obj) and only provide your own strongly typed Equals(Address obj) method without implementing IEquatable<T>, nothing tells the classes that rely on this interface for comparisons that you have your own Equals method that should be used.
So, as Jon Skeet noted, the EqualityComparer<Address>.Default property used by List<Address>.IndexOf to compare addresses wouldn't be able to know it should use your Equals method.
The IEquatable<T> interface just adds an Equals method that takes whatever type we supply as the generic parameter. Function overloading then takes care of the rest.
If we add IEquatable<Employee> to an Employee structure, an instance can be compared with another Employee without any type casting. We can achieve the same with the default Equals method, which accepts an Object as its parameter,
but passing a struct as an Object involves boxing (and converting it back involves unboxing). Hence having IEquatable<Employee> will improve performance.
For example, assume we want to compare an Employee structure with another Employee:
if(e1.Equals(e2))
{
//do some
}
In the above example, the Equals overload with an Employee parameter is used, so neither boxing nor unboxing is required.
struct Employee : IEquatable<Employee>
{
public int Id { get; set; }
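public bool Equals(Employee other)
{
// no boxing or unboxing, direct comparison
return this.Id == other.Id;
}
public override bool Equals(object obj)
{
// obj arrives boxed; unbox it and delegate to the strongly typed overload
return obj is Employee && Equals((Employee)obj);
}
public override int GetHashCode()
{
// keep GetHashCode consistent with Equals
return Id.GetHashCode();
}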
}
Some more examples:
The Int32 structure (int) implements IEquatable<int>
The Boolean structure (bool) implements IEquatable<bool>
The Single structure (float) implements IEquatable<float>
So if you call someInt.Equals(1), it doesn't invoke the Equals(object) method; it invokes the Equals(int) method.

HashSet<T>.RemoveWhere() and GetHashCode()

Aloha,
Here's a simple class that overrides GetHashCode:
class OverridesGetHashCode
{
public string Text { get; set; }
public override int GetHashCode()
{
return (Text != null ? Text.GetHashCode() : 0);
}
// overriding Equals() doesn't change anything, so I'll leave it out for brevity
}
When I create an instance of that class, add it to a HashSet and then change its Text property, like this:
var hashset = new HashSet<OverridesGetHashCode>();
var oghc = new OverridesGetHashCode { Text = "1" };
hashset.Add(oghc);
oghc.Text = "2";
then this doesn't work:
var removedCount = hashset.RemoveWhere(c => ReferenceEquals(c, oghc));
// fails, nothing is removed
Assert.IsTrue(removedCount == 1);
and neither does this:
// this line works, i.e. it does find a single item matching the predicate
var existing = hashset.Single(c => ReferenceEquals(c, oghc));
// but this fails; nothing is removed again
var removed = hashset.Remove(existing);
Assert.IsTrue(removed);
I guess the hash it uses internally is generated when the item is inserted and, if that's true, it's understandable that hashset.Contains(oghc) doesn't work.
I also guess it looks up the item by its hash code and only checks the predicate if it finds a match, and that might be why the first test fails (again, I'm just guessing here).
But why does the last test fail? I just got that object out of the hashset. Am I missing something? Is this the wrong way to remove something from a HashSet?
Thank you for taking the time to read this.
UPDATE: To avoid confusion, here's the Equals():
protected bool Equals(OverridesGetHashCode other)
{
return string.Equals(Text, other.Text);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != this.GetType()) return false;
return Equals((OverridesGetHashCode) obj);
}
Changing the hash code of your object while that object is being used in a HashSet is a violation of the HashSet's contract.
Being unable to remove the object is not the problem here. You are not allowed to change the hash code in the first place.
Let me quote from MSDN:
The GetHashCode method for an object must consistently return the same
hash code as long as there is no modification to the object state that
determines the return value of the object's Equals method. Note that
this is true only for the current execution of an application, and
that a different hash code can be returned if the application is run
again.
They tell the story a little differently but the essence is the same. They say, the hash code can never change. In practice, you can change it as long as you make sure no one uses the old hash code anymore. Not that this is good practice, but it works.
It's important that any items added to a hash based table (HashSet, Dictionary, etc.) not be modified once they are inserted into the structure (at least not until they are removed).
To find an object in the data structure, it computes its hash code and then finds a location based on that hash code. If you mutate that object, the hash code it returns no longer reflects its current location in that data structure (unless you're very, very lucky and it just happens to be a hash collision).
On the MSDN page for Dictionary it says:
As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.
This same assertion applies to HashSet as well, as they both are implemented using hash tables.
There are good answers here and just wanted to add this. If you look at the decompiled HashSet<T> code, you'll see that Add(value) does the following:
1. Calls IEqualityComparer<T>.GetHashCode() to get the hash code for value. For the default comparer this boils down to GetHashCode().
2. Uses that hash code to calculate which "bucket" and "slot" the (reference to) value should be stored in.
3. Stores the reference.
When you call Remove(value) it does steps 1. and 2. again, to find where the reference is. Then it calls IEqualityComparer<T>.Equals() to make sure that it indeed has found the right value. However, since you've changed what GetHashCode() returns, it calculates a different bucket/slot location, which is invalid. Thus, it cannot find the object.
So, note that Equals() doesn't really come into play here, because it will never even get to the right bucket/slot location if the hash code changes.
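The behaviour described in these answers can be reproduced with a short sketch. Item and HashSetMutationDemo are hypothetical names; the class mirrors the question's OverridesGetHashCode. The second method shows the workaround implied above: remove the object, change it, then add it back.

```csharp
using System;
using System.Collections.Generic;

public class Item
{
    public string Text { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as Item;
        return other != null && Text == other.Text;
    }

    public override int GetHashCode() { return Text != null ? Text.GetHashCode() : 0; }
}

public static class HashSetMutationDemo
{
    // Mutates the item while it is still inside the set, then tries to remove it.
    public static bool RemoveAfterUnsafeMutation()
    {
        var set = new HashSet<Item>();
        var item = new Item { Text = "1" };
        set.Add(item);
        item.Text = "2";          // hash code changes while the item is stored
        return set.Remove(item);  // the lookup now starts from the new hash code
    }

    // Takes the item out, mutates it, and puts it back under its new hash code.
    public static bool RemoveAfterSafeMutation()
    {
        var set = new HashSet<Item>();
        var item = new Item { Text = "1" };
        set.Add(item);
        set.Remove(item);
        item.Text = "2";
        set.Add(item);
        return set.Remove(item);
    }
}
```

Barring an unlucky hash collision between the two strings, the first method fails to remove the item and the second succeeds.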

How does implementing an interface give us a strongly typed API?

In C# in Depth, Jon Skeet uses IEquatable<> to overload the Equals() operation.
public sealed class Pair<T1, T2> : IEquatable<Pair<T1, T2>>
{
public bool Equals(Pair<T1, T2> other)
{
//...
}
}
He says we do this "to give a strongly typed API that'll avoid unnecessary execution-time checks".
Which execution time checks are avoided? More importantly, how does implementing an interface achieve a strongly typed API?
I may have missed something in the book's context. I thought interfaces gave us code re-use via polymorphism. I also understand that they are good for programming to an abstraction instead of a concrete type. That's all I'm aware of.
The default Equals method takes an object as the parameter. Thus, when implementing this method, you have to make a runtime check in your code to ensure that this object is of type Pair (before you can compare those two):
public override bool Equals(Object obj) {
// runtime type check here
var otherPair = obj as Pair<T1, T2>;
if (otherPair == null)
return false;
// comparison code here
...
}
The Equals method of IEquatable<T>, however, takes a Pair<T1, T2> as its parameter. Thus, you can avoid the check in your implementation, making it more efficient:
public bool Equals(Pair<T1, T2> other)
{
// comparison code here
...
}
Classes such as Dictionary<TKey, TValue>, List<T>, and LinkedList<T> are smart enough to use IEquatable<T>.Equals instead of object.Equals on their elements, if available (see MSDN).
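To make the two fragments above concrete, here is a minimal sketch of a complete Pair<T1, T2>. It is not taken from the book: the First and Second properties, the constructor and the hash combination are assumptions made for illustration. The point is that the typed Equals needs no cast, and Equals(object) simply delegates to it.

```csharp
using System;
using System.Collections.Generic;

public sealed class Pair<T1, T2> : IEquatable<Pair<T1, T2>>
{
    public T1 First { get; }
    public T2 Second { get; }

    public Pair(T1 first, T2 second) { First = first; Second = second; }

    // Strongly typed: the compiler guarantees 'other' is a Pair<T1, T2> (or null).
    public bool Equals(Pair<T1, T2> other)
    {
        if (ReferenceEquals(other, null)) return false;
        return EqualityComparer<T1>.Default.Equals(First, other.First)
            && EqualityComparer<T2>.Default.Equals(Second, other.Second);
    }

    // Weakly typed: the run-time type check happens once, here, via 'as'.
    public override bool Equals(object obj) { return Equals(obj as Pair<T1, T2>); }

    public override int GetHashCode()
    {
        unchecked
        {
            return EqualityComparer<T1>.Default.GetHashCode(First) * 37
                 + EqualityComparer<T2>.Default.GetHashCode(Second);
        }
    }
}
```

A Dictionary<Pair<int, string>, TValue> would then find a key built from equal components, because the default comparer calls the typed Equals directly.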
In this case he's providing a strongly typed version of Object.Equals, which will replace code that might look like the following:
public override bool Equals(object other)
{
// The following type check is not needed with IEquatable<Pair<T1, T2>>
Pair<T1, T2> pair = other as Pair<T1, T2>;
if (pair != null)
{
// <-- IEquatable<Pair<T1, T2>> implementation
}
else
{
return base.Equals(other);
}
}
The IEquatable<T> interface provides a strongly typed implementation of the Equals method, as opposed to the Equals method in System.Object that receives a System.Object.
I think that by "strongly typed" Jon is talking about generics.
I haven't found a non-generic IEquatable interface, but IComparable<T> and IComparable both exist.
To be fair to Skeet (although I'm sure he will be along soon), he does devote time to discussing what "strong typing" means in section 2.2.1.
In the context of your question (page 85 in my edition), I think he means that the default Equals method (which takes an object as an argument) defers to the strongly-typed Equals method that implements the interface.

How to check if Dictionary already has a key 'x'?

I am trying to implement a simple algorithm using C#'s Dictionary:
My 'outer' dictionary looks like this: Dictionary<paramID, Dictionary<string, object>> (where paramID is simply an identifier which holds 2 strings).
If key 'x' is already in the dictionary, then add a specific entry to this record's dictionary; if it doesn't exist, then add its entry to the outer Dictionary and then add an entry to the inner dictionary.
Somehow, when I use TryGetValue it always returns false, so it always creates new entries in the outer Dictionary, which produces duplicates.
My code looks more or less like this :
Dictionary<string, object> tempDict = new Dictionary<string, object>();
if(outerDict.TryGetValue(new paramID(xKey, xValue), out tempDict))
{
tempDict.Add(newKey, newValue);
}
The block inside the if is never executed, even if this specific entry is in the outer Dictionary.
Am I missing something? (If you want, I can post screenshots from the debugger, or something else if you desire.)
If you haven't overridden Equals and GetHashCode on your paramID type, and it's a class rather than a struct, then the default meaning of equality will be in effect, and each paramID will only be equal to itself.
You likely want something like:
public class ParamID : IEquatable<ParamID> // IEquatable makes this faster
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
//change for case-insensitive, culture-aware, etc.
return other != null && _first == other._first && _second == other._second;
}
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
public override int GetHashCode()
{
//change for case-insensitive, culture-aware, etc.
int fHash = _first.GetHashCode();
return ((fHash << 16) | (fHash >> 16)) ^ _second.GetHashCode();
}
}
For the requested explanation, I'm going to do a different version of ParamID where the string comparison is case-insensitive and ordinal rather than culture-based (a form that would be appropriate for some computer-readable codes, e.g. matching keywords in a case-insensitive computer language or case-insensitive identifiers like language tags, but not for something human-readable, e.g. it will not realise that "SS" is a case-insensitive match to "ß"). This version also considers {"A", "B"} to match {"B", "A"}; that is, it doesn't care which way around the strings are. By doing a different version with different rules it should be possible to touch on a few of the design considerations that come into play.
Let's start with our class containing just the two fields that are its state:
public class ParamID
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
}
At this point if we do the following:
ParamID x = new ParamID("a", "b");
ParamID y = new ParamID("a", "b");
ParamID z = x;
bool a = x == y;//a is false
bool b = z == x;//b is true
Here a is false and b is true because, by default, a reference type is only equal to itself. Why? Well, firstly, sometimes that's just what we want, and secondly, it isn't always clear what else we might want without the programmer defining how equality works.
Note also, that if ParamID was a struct, then it would have equality defined much like what you wanted. However, the implementation would be rather inefficient, and also buggy if it contained a decimal, so either way it's always a good idea to implement equality explicitly.
The first thing we are going to do to give this a different concept of equality is to implement IEquatable<ParamID>. This is not strictly necessary (and didn't exist until .NET 2.0), but:
It will be more efficient in a lot of use cases, including when key to a Dictionary<TKey, TValue>.
It's easy to do the next step with this as a starting point.
Now, there are four rules we must follow when we implement an equality concept:
1. An object must still always be equal to itself.
2. If X == Y and X != Z, then later, if the state of none of those objects has changed, X == Y and X != Z still.
3. If X == Y and Y == Z, then X == Z.
4. If X == Y and Y != Z, then X != Z.
Most of the time, you'll end up following all these rules without even thinking about it; you just have to check them if you're being particularly strange and clever in your implementation. Rule 1 is also something that we can take advantage of to give us a performance boost in some cases:
public class ParamID : IEquatable<ParamID>
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
if(other == null)
return false;
if(ReferenceEquals(this, other))
return true;
// Ordinal, case-insensitive comparison, so that Equals agrees with the OrdinalIgnoreCase hash code used in GetHashCode
if(string.Equals(_first, other._first, StringComparison.OrdinalIgnoreCase) && string.Equals(_second, other._second, StringComparison.OrdinalIgnoreCase))
return true;
return string.Equals(_first, other._second, StringComparison.OrdinalIgnoreCase) && string.Equals(_second, other._first, StringComparison.OrdinalIgnoreCase);
}
}
The first thing we've done is see if we're being compared with equality to null. We almost always want to return false in such cases (not always, but the exceptions are very, very rare and if you don't know for sure you're dealing with such an exception, you almost certainly are not), and certainly we don't want to throw a NullReferenceException.
The next thing we do is to see if the object is being compared with itself. This is purely an optimisation. In this case, it's probably a waste of time, but it can be very useful with more complicated equality tests, so it's worth pointing out this trick here. This takes advantage of the rule that identity entails equality, that is, any object is equal to itself (Ayn Rand seemed to think this was somehow profound).
Finally, having dealt with these two special cases, we get to the actual rule for equality. As I said above, my example considers two objects equal if they have the same two strings, in either order, for case-insensitive ordinal comparisons, so I've a bit of code to work that out.
(Note that the order in which we compare component parts can have a performance impact. Not in this case, but with a class that contains both an int and a string we would compare the ints first, because that is faster and we may hence find an answer of false before we even look at the strings.)
Now at this point we've a good basis for overriding the Equals method defined in object:
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
Since as will return a ParamID reference if other is a ParamID and null for anything else (including if null was what we were passed in the first place), and since we already handle comparison with null, we're all set.
Try to compile at this point and you will get a warning that you have overridden Equals but not GetHashCode (the same is true if you'd done it the other way around).
GetHashCode is used by the dictionary (and other hash-based collections like Hashtable and HashSet) to decide where to place the key internally. It will take the hashcode, re-hash it down to a smaller value in a way that is its business, and use it to place the object in its internal store.
Because of this, it's clear why the following would be a bad idea if ParamID's fields were not all readonly:
ParamID x = new ParamID("a", "b");
dict.Add(x, 33);
x.First = "c";//x will now likely never be found in dict because its hashcode doesn't match its position!
This means the following rules apply to hash-codes:
1. Two objects considered equal must have the same hashcode. (This is a hard rule; you will have bugs if you break it.)
2. While we can't guarantee uniqueness, the more spread out the returned results, the better. (Soft rule; you will have better performance the better you do at it.)
3. (Well, 2½.) While not a strict rule, if we take such a complicated approach to point 2 above that it takes forever to return a result, the nett effect will be worse than if we had a poorer-quality hash. So we want to try to be reasonably quick too if we can.
Despite the last point, it's rarely worth memoising the results. Hash-based collections will normally memoise the value themselves, so it's a waste to do so in the object.
For the first implementation, because our approach to equality depended upon the default approach to equality of the strings, we could use string's default hashcode. For my different version I'll use another approach that we'll explore more later:
public override int GetHashCode()
{
return StringComparer.OrdinalIgnoreCase.GetHashCode(_first) ^ StringComparer.OrdinalIgnoreCase.GetHashCode(_second);
}
Let's compare this to the first version. In both cases we get hashcodes of the component parts. If the values were integers, chars or bytes we would have worked with the values themselves, but here we build on the work done in implementing the same logic for those parts. In the first version we use the GetHashCode of string itself, but since "a" has a different hashcode to "A" that won't work here, so we use a class that produces a hashcode ignoring that difference.
The other big difference between the two is that in the first case we mix the bits up more with ((fHash << 16) | (fHash >> 16)). The reason for this is to avoid duplicate hashes. We can't produce a perfect hashcode where every different object has a different value, because there are only 4294967296 possible hashcode values, but many more possible values for ParamID (including null, which is treated as having a hashcode of 0). (There are cases where perfect hashes are possible, but they bring in different concerns than here.) Because of this imperfection we have to think not only about what values are possible, but which are likely. Generally, shifting bits like we've done in the first version avoids common values having the same hash. We don't want {"A", "B"} to hash the same as {"B", "A"}.
It's an interesting experiment to produce a deliberately poor GetHashCode that always returns 0. It'll work, but instead of being close to O(1), dictionaries will be O(n), and poor as O(n) goes at that!
The second version doesn't do that, because it has different rules: for it we actually want to consider values that are the same except for being switched around as equal, and hence to have the same hashcode.
The other big difference is the use of StringComparer.OrdinalIgnoreCase. This is an instance of StringComparer which, among other interfaces, implements IEqualityComparer<string> and IEqualityComparer. There are two interesting things about the IEqualityComparer<T> and IEqualityComparer interfaces.
The first is that hash-based collections (such as dictionary) all use them; it's just that unless passed an instance of one to their constructor they will use EqualityComparer<T>.Default, which calls into the Equals and GetHashCode methods we've described above.
The other, is that it allows us to ignore the Equals and GetHashCode mentioned above, and provide them from another class. There are three advantages to this:
We can use them in cases (string is a classic case) where there is more than one likely definition of "equals".
We can ignore the definition provided by the class's author, and provide our own.
We can use them to avoid a particular attack. This attack is based on being in a situation where input you provide will be hashed by the code you are attacking. You pick input so as to deliberately provide objects that are different, but hash the same. This means that the poor performance we talked about avoiding earlier is hit, and it can be so bad that it becomes a denial of service attack. By providing different IEqualityComparer implementations with random elements to the hash code (but the same for every instance of the comparer) we can vary the algorithm enough each time as to thwart the attack. The use for this is rare (it has to be something that will hash based purely on outside input that is large enough for the poor performance to really hurt), but vital when it comes up.
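As a sketch of the first two advantages, the comparer below supplies a different notion of equality for a type we do not control. Tuple<string, string> already defines its own ordered, case-sensitive equality; the hypothetical UnorderedPairComparer ignores both case and order instead. The class name is an assumption made for this example, and the randomised-hash defence is not shown.

```csharp
using System;
using System.Collections.Generic;

public sealed class UnorderedPairComparer : IEqualityComparer<Tuple<string, string>>
{
    private static readonly StringComparer Cmp = StringComparer.OrdinalIgnoreCase;

    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // Equal if the strings match in the same order or in swapped order.
        return (Cmp.Equals(x.Item1, y.Item1) && Cmp.Equals(x.Item2, y.Item2))
            || (Cmp.Equals(x.Item1, y.Item2) && Cmp.Equals(x.Item2, y.Item1));
    }

    public int GetHashCode(Tuple<string, string> obj)
    {
        if (obj == null) return 0;
        // XOR is symmetric, so swapped pairs produce the same hash code.
        return Hash(obj.Item1) ^ Hash(obj.Item2);
    }

    // StringComparer.GetHashCode rejects null, so treat null as 0.
    private static int Hash(string s) { return s == null ? 0 : Cmp.GetHashCode(s); }
}
```

Passing an instance to the collection's constructor, for example new Dictionary<Tuple<string, string>, int>(new UnorderedPairComparer()), makes the dictionary use this definition rather than the one built into Tuple.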
Finally, if we override Equals we may or may not want to overload == and != too. It can be useful to keep them referring to identity only (there are times when that is what we care most about), but it can also be useful to have them refer to other semantics ("abc" == "ab" + "c" is an example of an overload).
In summary:
The default equality of reference objects is identity (equal only to itself).
The default equality of value types is a simple comparison of all fields (but poor in performance).
We can change the concept of equality for our classes in either case, but this MUST involve both Equals and GetHashCode*
We can override this and provide another concept of equality.
Dictionary, HashSet, ConcurrentDictionary, etc. all depend on this.
Hashcodes represent a mapping from all values of an object to a 32-bit number.
Hashcodes must be the same for objects we consider equal.
Hashcodes must be spread well.
*Incidentally, anonymous classes have a simple comparison like that of value types, but better performance, which matches almost any case in which we might care about the hash code of an anonymous type.
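Pulling the pieces of this answer together, the second version of ParamID might look like the sketch below. It follows the rules described above (ordinal, case-insensitive, order-insensitive). The null-safe Hash helper and the == and != overloads are additions made for illustration, and the class is sealed here only to keep the example simple.

```csharp
using System;

public sealed class ParamID : IEquatable<ParamID>
{
    private readonly string _first;
    private readonly string _second;

    public ParamID(string first, string second)
    {
        _first = first;
        _second = second;
    }

    public bool Equals(ParamID other)
    {
        // ReferenceEquals avoids calling the overloaded == operator recursively.
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        StringComparer c = StringComparer.OrdinalIgnoreCase;
        return (c.Equals(_first, other._first) && c.Equals(_second, other._second))
            || (c.Equals(_first, other._second) && c.Equals(_second, other._first));
    }

    public override bool Equals(object obj) { return Equals(obj as ParamID); }

    public override int GetHashCode()
    {
        // XOR is symmetric, so {"A", "B"} and {"B", "A"} hash the same, as required.
        return Hash(_first) ^ Hash(_second);
    }

    private static int Hash(string s)
    {
        return s == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(s);
    }

    public static bool operator ==(ParamID a, ParamID b)
    {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }

    public static bool operator !=(ParamID a, ParamID b) { return !(a == b); }
}
```

With this class, new ParamID("a", "b") and new ParamID("B", "A") are equal, hash the same, and therefore land on the same dictionary entry.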
Most likely, paramID does not implement equality comparison correctly.
It should be implementing IEquatable<paramID> and that means especially that the GetHashCode implementation must adhere to the requirements (see "Notes to implementers").
As for keys in dictionaries, MSDN says:
As long as an object is used as a key in the Dictionary(Of TKey,
TValue), it must not change in any way that affects its hash value.
Every key in a Dictionary(Of TKey, TValue) must be unique according to
the dictionary's equality comparer. A key cannot be Nothing, but a
value can be, if the value type TValue is a reference type.
Dictionary(Of TKey, TValue) requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer(Of T) generic interface by using a constructor
that accepts a comparer parameter; if you do not specify an
implementation, the default generic equality comparer
EqualityComparer(Of T).Default is used. If type TKey implements the
System.IEquatable(Of T) generic interface, the default equality
comparer uses that implementation.
Since you don't show the paramID type I cannot go into more detail.
As an aside: that's a lot of keys and values getting tangled in there. There's a dictionary inside a dictionary, and the keys of the outer dictionary aggregate some kind of value as well. Perhaps this arrangement can be advantageously simplified? What exactly are you trying to achieve?
Use the Dictionary.ContainsKey method.
And so:
Dictionary<string, object> tempDict = new Dictionary<string, object>();
paramID searchKey = new paramID(xKey, xValue);
if(outerDict.ContainsKey(searchKey))
{
outerDict.TryGetValue(searchKey, out tempDict);
tempDict.Add(newKey, newValue);
}
Also don't forget to override the Equals and GetHashCode methods in order to correctly compare two paramIDs:
class paramID
{
// rest of things
public override bool Equals(object obj)
{
// "as" yields null for null or for any other type, instead of throwing
paramID p = obj as paramID;
// how do you determine if two paramIDs are the same?
return p != null && p.key == this.key;
}
public override int GetHashCode()
{
return this.key.GetHashCode();
}
}
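Once the key type compares by value, the "add to the inner dictionary, creating it if needed" logic from the question can be written with a single TryGetValue call. The helper below is a sketch: NestedDictionary and AddNested are hypothetical names, and it assumes the outer key type (paramID in the question) implements Equals and GetHashCode consistently.

```csharp
using System;
using System.Collections.Generic;

public static class NestedDictionary
{
    // Adds (innerKey, value) under outerKey, creating the inner dictionary on first use.
    public static void AddNested<TOuter>(
        Dictionary<TOuter, Dictionary<string, object>> outer,
        TOuter outerKey,
        string innerKey,
        object value)
    {
        Dictionary<string, object> inner;
        if (!outer.TryGetValue(outerKey, out inner))
        {
            // First time this outer key is seen: create and register its inner dictionary.
            inner = new Dictionary<string, object>();
            outer.Add(outerKey, inner);
        }
        // The indexer overwrites an existing inner entry instead of throwing like Add would.
        inner[innerKey] = value;
    }
}
```

For a quick experiment, Tuple<string, string> can stand in for the key type, since it already has value-based equality; two calls with equal tuples then end up in the same inner dictionary.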
