I have a reference type that implements the IEquatable Interface. I have a Hashset that contains a single object. I then create an object that, by IEquatable's standards are example the same. But, when I run
var equivalentEntry = _riskControlATMEntries[grouping.Key].FirstOrDefault(e => e == atmEntry);
on the object I get null.
On the otherhand when I do
var equivalentEntry = _riskControlATMEntries[grouping.Key].FirstOrDefault(e => e.Equals(atmEntry));
I get the object that is considered equal based on the IEquatable interface's implementation.
So why does a HashSet rely on public bool Equals(ReferenceType other) but FirstOrDefault does not? What equality is the == operator in FirstOrDefault(e => e == other) looking for?
FirstOrDefault doesn't compare items for equality at all. You provided a filtering delegate that uses the == operator to compare the two objects in one case and used the Equals method in the other.
The == operator does whatever the class defines it to do by that type, or if not defined, by the closest base type that does (with object being the base type that is always there, and will always have a definition if nothing better was defined; it will compare objects based on their reference). Good design says that you should make sure the == operator for a class is defined to behave exactly the same as the Equals method, but nothing in the language forces you to do this, and apparently this class doesn't ensure they're the same, and it's unsurprisingly causing you problems.
This is more a question out of curiosity than necessity and came about having had to deal with Active Directory (MS) types such as SearchResultCollection (in the System.DirectoryServices namespace).
Frequently when dealing with AD in code, I find that I'm having to check values for null, Count, [0] .. whatever and convert what I get out .. all the while hoping that the underlying AD object via COM doesn't go poof etc.
After having a play about with Parallel.ForEach recently - and having to pass in an IEnumerable<T>, I thought, maybe it would be fun to see how I could cast a SearchResultCollection to an IEnumerable of my own custom type. In this type I would pull out all the values from the SearchResult object and stick them in my own, .NET managed code. Then I'd do away with the DirectoryEntry, DirectorySearcher etc. etc.
So, having already worked out that it's a good idea to do searchResultCollection.Cast() in order to supply Parallel.ForEach with it's source, I added an explicit operator for casting to my own type (let's just call it 'Item').
I tested this out within the ParallelForEach, var myItem = (Item)currentSearchResult.
Good times, my cast operator is called and it all works. I then thought, it would be nice to do something like searchResultCollection.Cast<Item>(). Sadly this didn't work, didn't hit any breakpoints in the cast operator.
I did some Googling and discovered a helpful post which Jon Skeet had answered:
IEnumerable.Cast<>
The crux of it, use .Select(...) to force the explicit cast operation. OK, but, hmmm.
I went off and perhaps disassembled System.Core -> System.Linq.Enumerable.Cast<TResult>, I noticed that this 'cast' is actually doing an 'as' keyword conversion under the hood:
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
{
IEnumerable<TResult> enumerable = source as IEnumerable<TResult>;
if (enumerable != null)
{
return enumerable;
}
if (source == null)
{
throw Error.ArgumentNull("source");
}
return CastIterator<TResult>(source);
}
I read some more and found this:
Implicit/Explicit conversion with respect to the "as" keyword
The top answer here states that 'as' doesn't invoke any conversion operators .. use a (cast). Semantically I find this a little weird, since the extension method is called cast.. Shouldn't it be casting? There will no doubt be a really good reason why this doesn't happen, anyone know what it is?
Even if it would have used the cast operator instead of as it still wouldn't be invoking user defined explicit operators, so it wouldn't matter. The only difference would be the type of exception thrown.
Explicit operators aren't known at all by the runtime. According to the CLR there is no way to cast your search result to Item. When the compiler notices that there is a cast that matches a given explicit operator it injects at compile time a call to that explicit operator (which is basically a static method) into the code. Once you get to runtime there is no remaining knowledge of the cast, there is simply a method call in place to handle the conversion.
Because this is how explicit operators are implemented, rather than providing knowledge to the runtime of how to do the conversion, there is no way for Cast to inject the explicit operator's call into the code. It's already been compiled. When it was compiled there was no knowledge of any explicit operator to inject, so none was injected.
Semantically I find this a little weird, since the extension method is called cast.. Shouldn't it be casting?
It's casting each element if it needs to, within CastIterator... although using a generic cast, which won't use explicit conversion operator you've defined anyway. You should think of the explicit conversion operator as a custom method with syntactic sugar over the top, and not something the CLR cares about in most cases.
For for the as operator: that's just used to say "If this is already a sequence of the right type, we can just return." It's used on the sequence as a whole, not each element.
This can actually cause problems in some slightly bizarre situations where the C# conversion and the CLR conversions aren't aligned, although I can't remember the example I first came upon immediately.
See the Cast/OfType post within Edulinq for more details.
If I understand correctly you need a DynamicCast which I wrote sometime ago.
Runtime doesn't know about implicit and explicit casting; it is the job of the compiler to do that. but using Enumerable.Cast<> you can't get that because Enumerable.Cast involves casting from System.Object to Item where there is no conversion available(you have conversion from X to Item, and not Object to Item)
Take the advantage of dynamic in .Net4.0
public static class DynamicEnumerable
{
public static IEnumerable<T> DynamicCast<T>(this IEnumerable source)
{
foreach (dynamic current in source)
{
yield return (T)(current);
}
}
}
Use it as
var result = collection.DynamicCast<Item>();
It is casting. See the implementation of CastIterator.
static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source) {
foreach (object obj in source) yield return (TResult)obj;
}
The use of the as keyword here is only to check if the entire collection can be casted to your target instead of casting item by item. If the as returns something that isn't null, then the entire collection is returned and we skip the whole iteration process to cast every item.
For example this trivial example would return a casted collection instead of iterating over every item
int?[] collection = ...;
var castedCollection = collection.Cast<int?>()
because the as works right off the bat, no need to iterate over every item.
In this example, the as gives a null result and we have to use the CastIterator to go over every object
int?[] collection = ...;
var castedCollection = collection.Cast<object>()
I am trying to implement simple algorithm with use of C#'s Dictionary :
My 'outer' dictionary looks like this : Dictionary<paramID, Dictionary<string, object>> [where paramID is simply an identifier which holds 2 strings]
if key 'x' is already in the dictionary then add specific entry to this record's dictionary, if it doesn't exist then add its entry to the outer Dictionary and then add entry to the inner dictionary.
Somehow, when I use TryGetValue it always returns false, therefore it always creates new entries in the outer Dictionary - what produces duplicates.
My code looks more or less like this :
Dictionary<string, object> tempDict = new Dictionary<string, object>();
if(outerDict.TryGetValue(new paramID(xKey, xValue), out tempDict))
{
tempDict.Add(newKey, newValue);
}
Block inside the ifis never executed, even if there is this specific entry in the outer Dictionary.
Am I missing something ? (If you want I can post screen shots from debugger - or something else if you desire)
If you haven't over-ridden equals and GetHashCode on your paramID type, and it's a class rather than a struct, then the default equality meaning will be in effect, and each paramID will only be equal to itself.
You likely want something like:
public class ParamID : IEquatable<ParamID> // IEquatable makes this faster
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
//change for case-insensitive, culture-aware, etc.
return other != null && _first == other._first && _second == other._second;
}
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
public override int GetHashCode()
{
//change for case-insensitive, culture-aware, etc.
int fHash = _first.GetHashCode();
return ((fHash << 16) | (fHash >> 16)) ^ _second.GetHashCode();
}
}
For the requested explanation, I'm going to do a different version of ParamID where the string comparison is case-insensitive and ordinal rather than culture based (a form that would be appropriate for some computer-readable codes (e.g. matching keywords in a case-insensitive computer language or case-insensitive identifiers like language tags) but not for something human-readable (e.g. it will not realise that "SS" is a case-insensitive match to "ß"). This version also considers {"A", "B"} to match {"B", "A"} - that is, it doesn't care what way around the strings are. By doing a different version with different rules it should be possible to touch on a few of the design considerations that come into play.
Let's start with our class containing just the two fields that are it's state:
public class ParamID
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
}
At this point if we do the following:
ParamID x = new ParamID("a", "b");
ParamID y = new ParamID("a", "b");
ParamID z = x;
bool a = x == y;//a is false
bool b = z == x;//b is true
Because by default a reference type is only equal to itself. Why? Well firstly, sometimes that's just what we want, and secondly it isn't always clear what else we might want without the programmer defining how equality works.
Note also, that if ParamID was a struct, then it would have equality defined much like what you wanted. However, the implementation would be rather inefficient, and also buggy if it contained a decimal, so either way it's always a good idea to implement equality explicitly.
The first thing we are going to do to give this a different concept of equality is to override IEquatable<ParamID>. This is not strictly necessary, (and didn't exist until .NET 2.0) but:
It will be more efficient in a lot of use cases, including when key to a Dictionary<TKey, TValue>.
It's easy to do the next step with this as a starting point.
Now, there are four rules we must follow when we implement an equality concept:
An object must still be always equal to itself.
If X == Y and X != Z, then later if the state of none of those objects has changed, X == Y and X != Z still.
If X == Y and Y == Z, then X == Z.
If X == Y and Y != Z then X != Z.
Most of the time, you'll end up following all these rules without even thinking about it, you just have to check them if you're being particularly strange and clever in your implementation. Rule 1 is also something that we can take advantage of to give us a performance boost in some cases:
public class ParamID : IEquatable<ParamID>
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
if(other == null)
return false;
if(ReferenceEquals(this, other))
return true;
if(string.Compare(_first, other._first, StringComparison.InvariantCultureIgnoreCase) == 0 && string.Compare(_second, other._second, StringComparison.InvariantCultureIgnoreCase) == 0)
return true;
return string.Compare(_first, other._second, StringComparison.InvariantCultureIgnoreCase) == 0 && string.Compare(_second, other._first, StringComparison.InvariantCultureIgnoreCase) == 0;
}
}
The first thing we've done is see if we're being compared with equality to null. We almost always want to return false in such cases (not always, but the exceptions are very, very rare and if you don't know for sure you're dealing with such an exception, you almost certainly are not), and certainly we don't want to throw a NullReferenceException.
The next thing we do is to see if the object is being compared with itself. This is purely an optimisation. In this case, it's probably a waste of time, but it can be very useful with more complicated equality tests, so it's worth pointing out this trick here. This takes advantage of the rule that identity entails equality, that is, any object is equal to itself (Ayn Rand seemed to think this was somehow profound).
Finally, having dealt with these two special cases, we get to the actual rule for equality. As I said above, my example considers two objects equal if they have the same two strings, in either order, for case-insensitive ordinal comparisons, so I've a bit of code to work that out.
(Note that the order in which we compare component parts can have a performance impact. Not in this case, but with a class that contains both an int and a string we would compare the ints first because is faster and we will hence perhaps find an answer of false before we even look at the strings)
Now at this point we've a good basis for overriding the Equals method defined in object:
public override bool Equals(object other)
{
return (other as ParamID);
}
Since as will return a ParamID reference if other is a ParamID and null for anything else (including if null was what we were passed in the first place), and since we already handle comparison with null, we're all set.
Try to compile at this point and you will get a warning that you have overriden Equals but not GetHashCode (the same is true if you'd done it the other way around).
GetHashCode is used by the dictionary (and other hash-based collections like HashTable and HashSet) to decide where to place the key internally. It will take the hashcode, re-hash it down to a smaller value in a way that is its business, and use it to place the object in its internal store.
Because of this, it's clear why the following is a bad idea were ParamID not readonly on all fields:
ParamID x = new ParamID("a", "b");
dict.Add(x, 33);
x.First = "c";//x will now likely never be found in dict because its hashcode doesn't match its position!
This means the following rules apply to hash-codes:
Two objects considered equal, must have the same hashcode. (This is a hard rule, you will have bugs if you break it).
While we can't guarantee uniqueness, the more spread out the returned results, the better. (Soft rule, you will have better performance the better you do at it).
(Well, 2½.) While not a strict rule, if we take such a complicated approach to point 2 above that it takes forever to return a result, the nett effect will be worse than if we had a poorer-quality hash. So we want to try to be reasonably quick too if we can.
Despite the last point, it's rarely worth memoising the results. Hash-based collections will normally memoise the value themselves, so it's a waste to do so in the object.
For the first implementation, because our approach to equality depended upon the default approach to equality of the strings, we could use strings default hashcode. For my different version I'll use another approach that we'll explore more later:
public override int GetHashCode()
{
return StringComparer.OrdinalIgnoreCase.GetHashCode(_first) ^ StringComparer.OrdinalIgnoreCase.GetHashCode(_second);
}
Let's compare this to the first version. In both cases we get hashcodes of the component parts. If the values where integers, chars or bytes we would have worked with the values themselves, but here we build on the work done in implementing the same logic for those parts. In the first version we use the GetHashCode of string itself, but since "a" has a different hashcode to "A" that won't work here, so we use a class that produces a hashcode ignoring that difference.
The other big difference between the two is that in the first case we mix the bits up more with ((fHash << 16) | (fHash >> 16)). The reason for this is to avoid duplicate hashes. We can't produce a perfect hashcode where every different object has a different value, because there are only 4294967296 possible hashcode values, but many more possible values for ParamID (including null, which is treated as having a hashcode of 0). (There are cases where prefect hashes are possible, but they bring in different concerns than here). Because of this imperfection we have to think not only about what values are possible, but which are likely. Generally, shifting bits like we've done in the first version avoids common values having the same hash. We don't want {"A", "B"} to hash the same as {"B", "A"}.
It's an interesting experiment to produce a deliberately poor GetHashCode that always returns 0, it'll work, but instead of being close to O(1), dictionaries will be O(n), and poor as O(n) goes for that!
The second version doesn't do that, because it has different rules so for it we actually want to consider values the same but for being switch around as equal, and hence with the same hashcode.
The other big difference is the use of StringComparer.OrdinalIgnoreCase. This is an instance of StringComparer which, among other interfaces, implements IEqualityComparer<string> and IEqualityComparer. There are two interesting things about the IEqualityComparer<T> and IEqualityComparer interfaces.
The first is that hash-based collections (such as dictionary) all use them, it's just that unless passed an instance of one to their constructor they will use DefaultEqualityComparer which calls into the Equals and GetHashCode methods we've described above.
The other, is that it allows us to ignore the Equals and GetHashCode mentioned above, and provide them from another class. There are three advantages to this:
We can use them in cases (string is a classic case) where there is more than one likely definition of "equals".
We can ignore that by the class' author, and provide our own.
We can use them to avoid a particular attack. This attack is based on being in a situation where input you provide will be hashed by the code you are attacking. You pick input so as to deliberately provide objects that are different, but hash the same. This means that the poor performance we talked about avoiding earlier is hit, and it can be so bad that it becomes a denial of service attack. By providing different IEqualityComparer implementations with random elements to the hash code (but the same for every instance of the comparer) we can vary the algorithm enough each time as to twart the attack. The use for this is rare (it has to be something that will hash based purely on outside input that is large enough for the poor performance to really hurt), but vital when it comes up.
Finally. If we override Equals we may or may not want to override == and != too. It can be useful to keep them refering to identity only (there are times when that is what we care most about) but it can be useful to have them refer to other semantics (`"abc" == "ab" + "c" is an example of an override).
In summary:
The default equality of reference objects is identity (equal only to itself).
The default equality of value types is a simple comparison of all fields (but poor in performance).
We can change the concept of equality for our classes in either case, but this MUST involve both Equals and GetHashCode*
We can override this and provide another concept of equality.
Dictionary, HashSet, ConcurrentDictionary, etc. all depend on this.
Hashcodes represent a mapping from all values of an object to a 32-bit number.
Hashcodes must be the same for objects we consider equal.
Hashcodes must be spread well.
*Incidentally, anonymous classes have a simple comparison like that of value types, but better performance, which matches almost any case in which we mght care about the hash code of an anonymous type.
Most likely, paramID does not implement equality comparison correctly.
It should be implementing IEquatable<paramID> and that means especially that the GetHashCode implementation must adhere to the requirements (see "Notes to implementers").
As for keys in dictionaries, MSDN says:
As long as an object is used as a key in the Dictionary(Of TKey,
TValue), it must not change in any way that affects its hash value.
Every key in a Dictionary(Of TKey, TValue) must be unique according to
the dictionary's equality comparer. A key cannot be Nothing, but a
value can be, if the value type TValue is a reference type.
Dictionary(Of TKey, TValue) requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer(Of T) generic interface by using a constructor
that accepts a comparer parameter; if you do not specify an
implementation, the default generic equality comparer
EqualityComparer(Of T).Default is used. If type TKey implements the
System.IEquatable(Of T) generic interface, the default equality
comparer uses that implementation.
Since you don't show the paramID type I cannot go into more detail.
As an aside: that's a lot of keys and values getting tangled in there. There's a dictionary inside a dictionary, and the keys of the outer dictionary aggregate some kind of value as well. Perhaps this arrangement can be advantageously simplified? What exactly are you trying to achieve?
Use the Dictionary.ContainsKey method.
And so:
Dictionary<string, object> tempDict = new Dictionary<string, object>();
paramID searchKey = new paramID(xKey, xValue);
if(outerDict.ContainsKey(searchKey))
{
outerDict.TryGetValue(searchKey, out tempDict);
tempDict.Add(newKey, newValue);
}
Also don't forget to override the Equals and GetHashCode methods in order to correctly compare two paramIDs:
class paramID
{
// rest of things
public override bool Equals(object obj)
{
paramID p = (paramID)obj;
// how do you determine if two paramIDs are the same?
if(p.key == this.key) return true;
return false;
}
public override int GetHashCode()
{
return this.key.GetHashCode();
}
}
Today I stumbled upon an interesting bug I wrote. I have a set of properties which can be set through a general setter. These properties can be value types or reference types.
public void SetValue( TEnum property, object value )
{
if ( _properties[ property ] != value )
{
// Only come here when the new value is different.
}
}
When writing a unit test for this method I found out the condition is always true for value types. It didn't take me long to figure out this is due to boxing/unboxing. It didn't take me long either to adjust the code to the following:
public void SetValue( TEnum property, object value )
{
if ( !_properties[ property ].Equals( value ) )
{
// Only come here when the new value is different.
}
}
The thing is I'm not entirely satisfied with this solution. I'd like to keep a simple reference comparison, unless the value is boxed.
The current solution I am thinking of is only calling Equals() for boxed values. Doing a check for a boxed values seems a bit overkill. Isn't there an easier way?
If you need different behaviour when you're dealing with a value-type then you're obviously going to need to perform some kind of test. You don't need an explicit check for boxed value-types, since all value-types will be boxed** due to the parameter being typed as object.
This code should meet your stated criteria: If value is a (boxed) value-type then call the polymorphic Equals method, otherwise use == to test for reference equality.
public void SetValue(TEnum property, object value)
{
bool equal = ((value != null) && value.GetType().IsValueType)
? value.Equals(_properties[property])
: (value == _properties[property]);
if (!equal)
{
// Only come here when the new value is different.
}
}
( ** And, yes, I know that Nullable<T> is a value-type with its own special rules relating to boxing and unboxing, but that's pretty much irrelevant here.)
Equals() is generally the preferred approach.
The default implementation of .Equals() does a simple reference comparison for reference types, so in most cases that's what you'll be getting. Equals() might have been overridden to provide some other behavior, but if someone has overridden .Equals() in a class it's because they want to change the equality semantics for that type, and it's better to let that happen if you don't have a compelling reason not to. Bypassing it by using == can lead to confusion when your class sees two things as different when every other class agrees that they're the same.
Since the input parameter's type is object, you will always get a boxed value inside the method's context.
I think your only chance is to change the method's signature and to write different overloads.
How about this:
if(object.ReferenceEquals(first, second)) { return; }
if(first.Equals(second)) { return; }
// they must differ, right?
Update
I realized this doesn't work as expected for a certain case:
For value types, ReferenceEquals returns false so we fall back to Equals, which behaves as expected.
For reference types where ReferenceEquals returns true, we consider them "same" as expected.
For reference types where ReferenceEquals returns false and Equals returns false, we consider them "different" as expected.
For reference types where ReferenceEquals returns false and Equals returns true, we consider them "same" even though we want "different"
So the lesson is "don't get clever"
I suppose
I'd like to keep a simple reference comparison, unless the value is boxed.
is somewhat equivalent to
If the value is boxed, I'll do a non-"simple reference comparison".
This means the first thing you'll need to do is to check whether the value is boxed or not.
If there exists a method to check whether an object is a boxed value type or not, it should be at least as complex as that "overkill" method you provided the link to unless that is not the simplest way. Nonetheless, there should be a "simplest way" to determine if an object is a boxed value type or not. It's unlikely that this "simplest way" is simpler than simply using the object Equals() method, but I've bookmarked this question to find out just in case.
(not sure if I was logical)
In a program I need to evaluate lots of objects. The result of evaluation is a double.
for example
Object myObject = new Object(x,y,z);
double a = eval(myObject);
after this lots of other objects should be evaluated.
I want to avoid reevaluating same objects. So I need to add evaluated objects and the evaluation result to a hash structure.
for example something like this after first evaluation: -------> this is a pseudo code
myHash.add(myObject, a);
Object anotherObject = new Object(x,y,z);
if (myHash.find(anotherObject))
double evaluationForAnotherObject = myHash.get(anotherObject);
any help would be highly welcomed
A Dictionary<TKey,TValue> can be used for such lookups:
Dictionary<object,double> dict=new Dictionary<object,double>();
if(dict.ContainsKey(obj))
x=dict[obj];
It's important to use the correct equality comparer. For example on objects it uses referential equality by default. If your TKey type doesn't use the desired equality comparison you can supply an IEqualityComparer<TKey> to the constructor of the dictionary.
As an alternative you can pass your function into a memoizer. It returns a new function which caches the result of earlier computations. AFAIK the MiscUtil library contains one.
Func<object,double> memoizingEval=Memoizer.Memoize(eval);
and then use memoizingEval(obj)