Add a specific number to a hashed object in C# - c#

In a program I need to evaluate lots of objects. The result of evaluation is a double.
for example
Object myObject = new Object(x,y,z);
double a = eval(myObject);
After this, lots of other objects should be evaluated.
I want to avoid re-evaluating the same objects, so I need to add each evaluated object and its evaluation result to a hash structure.
For example, something like this after the first evaluation (this is pseudocode):
myHash.add(myObject, a);
Object anotherObject = new Object(x,y,z);
if (myHash.find(anotherObject))
double evaluationForAnotherObject = myHash.get(anotherObject);
any help would be highly welcomed

A Dictionary<TKey,TValue> can be used for such lookups:
Dictionary<object, double> dict = new Dictionary<object, double>();
double x;
if (!dict.TryGetValue(obj, out x)) // cache miss: evaluate once and remember the result
    dict[obj] = x = eval(obj);
It's important to use the correct equality comparer. For example, for a key type such as object that doesn't override Equals and GetHashCode, the dictionary uses reference equality by default. If your TKey type doesn't provide the desired equality comparison, you can supply an IEqualityComparer<TKey> to the constructor of the dictionary.
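A minimal sketch of the comparer approach, assuming a hypothetical Point3 class that stands in for the question's object built from x, y and z. The names Point3, Point3Comparer and EvalCache, and the placeholder Eval body, are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's object built from x, y, z.
public sealed class Point3
{
    public readonly int X, Y, Z;
    public Point3(int x, int y, int z) { X = x; Y = y; Z = z; }
}

// Compares Point3 instances by value instead of by reference.
public sealed class Point3Comparer : IEqualityComparer<Point3>
{
    public bool Equals(Point3 a, Point3 b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (ReferenceEquals(a, null) || ReferenceEquals(b, null)) return false;
        return a.X == b.X && a.Y == b.Y && a.Z == b.Z;
    }

    public int GetHashCode(Point3 p)
    {
        // Equal points must produce equal hash codes; overflow is harmless here.
        unchecked { return (((p.X * 397) ^ p.Y) * 397) ^ p.Z; }
    }
}

public static class EvalCache
{
    private static readonly Dictionary<Point3, double> Cache =
        new Dictionary<Point3, double>(new Point3Comparer());

    // Counts real evaluations, for demonstration only.
    public static int Evaluations;

    // Placeholder for the expensive evaluation from the question (assumption).
    private static double Eval(Point3 p) { Evaluations++; return p.X + p.Y + p.Z; }

    public static double GetOrEvaluate(Point3 p)
    {
        double result;
        if (!Cache.TryGetValue(p, out result))
        {
            result = Eval(p);
            Cache.Add(p, result);
        }
        return result;
    }
}
```

With this in place, two separately constructed Point3 instances with the same coordinates hit the same cache entry, so Eval runs only once for them.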
As an alternative you can pass your function into a memoizer. It returns a new function which caches the result of earlier computations. AFAIK the MiscUtil library contains one.
Func<object,double> memoizingEval=Memoizer.Memoize(eval);
and then use memoizingEval(obj)
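If you'd rather not pull in a library, a memoizer is small enough to write by hand. The following is only a sketch under a few assumptions: SimpleMemoizer is a made-up name (it is not the MiscUtil API), the key type has suitable Equals and GetHashCode, keys are never null, and the code is used from a single thread.

```csharp
using System;
using System.Collections.Generic;

public static class SimpleMemoizer
{
    // Wraps func so that each distinct argument is evaluated at most once.
    // Not thread-safe, and the cache grows without bound.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> func)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            TResult result;
            if (!cache.TryGetValue(arg, out result))
            {
                result = func(arg);
                cache[arg] = result;
            }
            return result;
        };
    }
}
```

The same equality caveat applies: the cache only recognises "the same" argument if the argument type defines equality the way you expect.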

Related

Safe convert function

Can someone explain this code to me? In particular, I'm not sure how a generic function as a parameter works:
result.Notes= orderInfo.Notes.SafeConvert(x => (Communication.OrderNotes)x);
public static TOut[] SafeConvert<TIn, TOut>(
this TIn[] input,
Func<TIn, TOut> converter)
{
if (input == null)
{
return null;
}
return input
.Where(i => i != null)
.Select(converter)
.ToArray();
}
SafeConvert is a generic extension method. The first parameter (an array of the generic type TIn) is supplied implicitly when the method is invoked on an array of some type (in this case, perhaps an array of notes). The method also requires a function as a parameter. This function must take an instance of the type TIn and return a TOut instance. So, you'd invoke this method on an array of some type, supply a lambda expression or a delegate, and it will return an array of whatever type your supplied function returns. It does this by using LINQ to filter out nulls, run each remaining item through the supplied function, then return the resulting sequence as an array.
In the implementation you've given, it takes the "Notes" of "orderInfo" and explicitly casts each of them to Communication.OrderNotes.
Here's another way you could invoke the method.
var decimals = new [] {5, 3, 2, 1}.SafeConvert(someInt => (decimal) someInt);
This is what's known as an extension method. It's a static function that allows you to "add" methods to types without modifying the original code. It's somewhat analogous to the Decorator Pattern but there's controversy about whether it's actually an implementation of that particular pattern.
"Under the hood," at least, extension methods are just "syntactic sugar" for calling a static method, but you can call them as if they were an instance method of the extended object (in this case, arrays).
The <TIn, TOut> part means that TIn and TOut are some type you haven't specified yet (but that you intend to specify what they actually are when you go to use the class). To understand the purpose of this, think of a Linked List - really, the implementation of a Linked List of integers isn't any different than the code for a Linked List of strings, so you'd like it to be the case that you can create a single class and specify later that you want a list of integers or a list of strings or whatever. You definitely would not want to have to create an implementation for every single possible type of object - that would require a massive amount of redundant code.
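To make the last two points concrete, here is a small sketch showing that the extension-method call and the plain static call are the same thing, and that the compiler infers TIn and TOut from the arguments. ArrayExtensions and SugarDemo are names invented for this example; SafeConvert is the method from the question.

```csharp
using System;
using System.Linq;

public static class ArrayExtensions
{
    public static TOut[] SafeConvert<TIn, TOut>(this TIn[] input, Func<TIn, TOut> converter)
    {
        if (input == null) return null;
        return input.Where(i => i != null).Select(converter).ToArray();
    }
}

public static class SugarDemo
{
    public static int[] ViaExtension(string[] words)
    {
        // Reads like an instance method on the array; TIn = string, TOut = int are inferred.
        return words.SafeConvert(w => w.Length);
    }

    public static int[] ViaStaticCall(string[] words)
    {
        // What the call above is shorthand for, with the type arguments spelled out.
        return ArrayExtensions.SafeConvert<string, int>(words, w => w.Length);
    }
}
```

Because it is really a static call, invoking the extension method on a null array reference does not throw by itself; that is why the null check inside SafeConvert can work.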
Now, for the LINQ query:
return input
.Where(i => i != null)
.Select(converter)
.ToArray();
LINQ (Language Integrated Query) is a mechanism for querying different types of collections using a single syntax. You can use it to query .NET collections (like they're doing here), XML documents, or databases, for example.
LINQ Queries take an anonymous function of some kind and apply the operator to the collection in some way (see below).
Going through this query:
.Where(i => i != null)
As the name suggests, Where applies a filter. When applied to a collection, it returns a second collection with all of the elements of the first collection that match the filter condition. The
i => i != null
bit is the anonymous function that acts as a filter. Basically, what this is saying is "give me a collection with all of the members of the array that aren't null."
The Select method applies a transform to every element of the collection and returns the result as a second collection. In this case, you apply whatever transformation you passed in as an argument to the method.
It might sound slightly odd to think of code as data, but this is actually very routine in some other languages like F#, Lisp, and Scala. ("Under the hood", C# is implementing this behavior in an object-oriented way, but the effect is the same).
The basic idea of this function, then, is that it converts an array of one type to an array of a second type, filtering out all of the null references.

How to check if Dictionary already has a key 'x'?

I am trying to implement simple algorithm with use of C#'s Dictionary :
My 'outer' dictionary looks like this : Dictionary<paramID, Dictionary<string, object>> [where paramID is simply an identifier which holds 2 strings]
if key 'x' is already in the dictionary then add specific entry to this record's dictionary, if it doesn't exist then add its entry to the outer Dictionary and then add entry to the inner dictionary.
Somehow, when I use TryGetValue it always returns false, so it always creates new entries in the outer Dictionary, which produces duplicates.
My code looks more or less like this :
Dictionary<string, object> tempDict = new Dictionary<string, object>();
if(outerDict.TryGetValue(new paramID(xKey, xValue), out tempDict))
{
tempDict.Add(newKey, newValue);
}
The block inside the if is never executed, even if this specific entry is in the outer Dictionary.
Am I missing something ? (If you want I can post screen shots from debugger - or something else if you desire)
If you haven't overridden Equals and GetHashCode on your paramID type, and it's a class rather than a struct, then the default meaning of equality will be in effect, and each paramID will only be equal to itself.
You likely want something like:
public class ParamID : IEquatable<ParamID> // IEquatable makes this faster
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
//change for case-insensitive, culture-aware, etc.
return other != null && _first == other._first && _second == other._second;
}
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
public override int GetHashCode()
{
//change for case-insensitive, culture-aware, etc.
int fHash = _first.GetHashCode();
return ((fHash << 16) | (fHash >> 16)) ^ _second.GetHashCode();
}
}
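With a key type like this, the "add to the inner dictionary, creating it if needed" logic from the question can be sketched as below. ParamID is repeated in compact form so the example compiles on its own; OuterDictionaryDemo and AddEntry are names invented here, and the sketch assumes neither string is null.

```csharp
using System;
using System.Collections.Generic;

public class ParamID : IEquatable<ParamID>
{
    private readonly string _first;
    private readonly string _second;
    public ParamID(string first, string second) { _first = first; _second = second; }

    public bool Equals(ParamID other)
    {
        return !ReferenceEquals(other, null) && _first == other._first && _second == other._second;
    }
    public override bool Equals(object other) { return Equals(other as ParamID); }
    public override int GetHashCode()
    {
        int fHash = _first.GetHashCode();
        return ((fHash << 16) | (fHash >> 16)) ^ _second.GetHashCode();
    }
}

public static class OuterDictionaryDemo
{
    // Adds an entry to the inner dictionary for key, creating the inner dictionary on first use.
    public static void AddEntry(
        Dictionary<ParamID, Dictionary<string, object>> outer,
        ParamID key, string newKey, object newValue)
    {
        Dictionary<string, object> inner;
        if (!outer.TryGetValue(key, out inner))
        {
            inner = new Dictionary<string, object>();
            outer.Add(key, inner);
        }
        inner.Add(newKey, newValue);
    }
}
```

Two ParamID instances built from the same pair of strings now land on the same outer entry instead of producing duplicates.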
For the requested explanation, I'm going to do a different version of ParamID where the string comparison is case-insensitive and ordinal rather than culture-based. This form would be appropriate for some computer-readable codes (e.g. matching keywords in a case-insensitive computer language, or case-insensitive identifiers like language tags) but not for something human-readable (e.g. it will not realise that "SS" is a case-insensitive match to "ß"). This version also considers {"A", "B"} to match {"B", "A"}; that is, it doesn't care which way around the strings are. By doing a different version with different rules it should be possible to touch on a few of the design considerations that come into play.
Let's start with our class containing just the two fields that are its state:
public class ParamID
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
}
At this point if we do the following:
ParamID x = new ParamID("a", "b");
ParamID y = new ParamID("a", "b");
ParamID z = x;
bool a = x == y;//a is false
bool b = z == x;//b is true
Here a is false and b is true, because by default a reference type is only equal to itself. Why? Well firstly, sometimes that's just what we want, and secondly it isn't always clear what else we might want without the programmer defining how equality works.
Note also, that if ParamID was a struct, then it would have equality defined much like what you wanted. However, the implementation would be rather inefficient, and also buggy if it contained a decimal, so either way it's always a good idea to implement equality explicitly.
The first thing we are going to do to give this a different concept of equality is to implement IEquatable<ParamID>. This is not strictly necessary (and the interface didn't exist until .NET 2.0), but:
It will be more efficient in a lot of use cases, including when key to a Dictionary<TKey, TValue>.
It's easy to do the next step with this as a starting point.
Now, there are four rules we must follow when we implement an equality concept:
An object must still be always equal to itself.
If X == Y and X != Z, then later if the state of none of those objects has changed, X == Y and X != Z still.
If X == Y and Y == Z, then X == Z.
If X == Y and Y != Z then X != Z.
Most of the time, you'll end up following all these rules without even thinking about it, you just have to check them if you're being particularly strange and clever in your implementation. Rule 1 is also something that we can take advantage of to give us a performance boost in some cases:
public class ParamID : IEquatable<ParamID>
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
if(other == null)
return false;
if(ReferenceEquals(this, other))
return true;
// ordinal, case-insensitive comparison, matching the rules described above
if(string.Equals(_first, other._first, StringComparison.OrdinalIgnoreCase) && string.Equals(_second, other._second, StringComparison.OrdinalIgnoreCase))
return true;
return string.Equals(_first, other._second, StringComparison.OrdinalIgnoreCase) && string.Equals(_second, other._first, StringComparison.OrdinalIgnoreCase);
}
}
The first thing we've done is see if we're being compared with equality to null. We almost always want to return false in such cases (not always, but the exceptions are very, very rare and if you don't know for sure you're dealing with such an exception, you almost certainly are not), and certainly we don't want to throw a NullReferenceException.
The next thing we do is to see if the object is being compared with itself. This is purely an optimisation. In this case, it's probably a waste of time, but it can be very useful with more complicated equality tests, so it's worth pointing out this trick here. This takes advantage of the rule that identity entails equality, that is, any object is equal to itself (Ayn Rand seemed to think this was somehow profound).
Finally, having dealt with these two special cases, we get to the actual rule for equality. As I said above, my example considers two objects equal if they have the same two strings, in either order, for case-insensitive ordinal comparisons, so I've a bit of code to work that out.
(Note that the order in which we compare component parts can have a performance impact. Not in this case, but with a class that contains both an int and a string we would compare the ints first because that is faster, and we may hence find an answer of false before we even look at the strings.)
Now at this point we've a good basis for overriding the Equals method defined in object:
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
Since as will return a ParamID reference if other is a ParamID and null for anything else (including if null was what we were passed in the first place), and since we already handle comparison with null, we're all set.
Try to compile at this point and you will get a warning that you have overridden Equals but not GetHashCode (the same is true if you'd done it the other way around).
GetHashCode is used by the dictionary (and other hash-based collections like Hashtable and HashSet<T>) to decide where to place the key internally. It will take the hashcode, re-hash it down to a smaller value in a way that is its business, and use it to place the object in its internal store.
Because of this, it's clear why the following is a bad idea were ParamID not readonly on all fields:
ParamID x = new ParamID("a", "b");
dict.Add(x, 33);
x.First = "c";//x will now likely never be found in dict because its hashcode doesn't match its position!
This means the following rules apply to hash-codes:
Two objects considered equal, must have the same hashcode. (This is a hard rule, you will have bugs if you break it).
While we can't guarantee uniqueness, the more spread out the returned results, the better. (Soft rule, you will have better performance the better you do at it).
(Well, 2½.) While not a strict rule, if we take such a complicated approach to point 2 above that it takes forever to return a result, the nett effect will be worse than if we had a poorer-quality hash. So we want to try to be reasonably quick too if we can.
Despite the last point, it's rarely worth memoising the results. Hash-based collections will normally memoise the value themselves, so it's a waste to do so in the object.
For the first implementation, because our approach to equality depended upon the default approach to equality of the strings, we could use string's default hashcode. For my different version I'll use another approach that we'll explore more later:
public override int GetHashCode()
{
return StringComparer.OrdinalIgnoreCase.GetHashCode(_first) ^ StringComparer.OrdinalIgnoreCase.GetHashCode(_second);
}
Let's compare this to the first version. In both cases we get hashcodes of the component parts. If the values were integers, chars or bytes we would have worked with the values themselves, but here we build on the work done in implementing the same logic for those parts. In the first version we use the GetHashCode of string itself, but since "a" has a different hashcode to "A" that won't work here, so we use a class that produces a hashcode ignoring that difference.
The other big difference between the two is that in the first case we mix the bits up more with ((fHash << 16) | (fHash >> 16)). The reason for this is to avoid duplicate hashes. We can't produce a perfect hashcode where every different object has a different value, because there are only 4294967296 possible hashcode values, but many more possible values for ParamID (including null, which is treated as having a hashcode of 0). (There are cases where perfect hashes are possible, but they bring in different concerns than here.) Because of this imperfection we have to think not only about what values are possible, but which are likely. Generally, shifting bits like we've done in the first version avoids common values having the same hash. We don't want {"A", "B"} to hash the same as {"B", "A"}.
It's an interesting experiment to produce a deliberately poor GetHashCode that always returns 0, it'll work, but instead of being close to O(1), dictionaries will be O(n), and poor as O(n) goes for that!
The second version doesn't do that, because it has different rules: for it we actually want to consider values that are the same but switched around as equal, and hence as having the same hashcode.
The other big difference is the use of StringComparer.OrdinalIgnoreCase. This is an instance of StringComparer which, among other interfaces, implements IEqualityComparer<string> and IEqualityComparer. There are two interesting things about the IEqualityComparer<T> and IEqualityComparer interfaces.
The first is that hash-based collections (such as dictionary) all use them; it's just that unless passed an instance of one to their constructor they will use EqualityComparer<T>.Default, which calls into the Equals and GetHashCode methods we've described above.
The other, is that it allows us to ignore the Equals and GetHashCode mentioned above, and provide them from another class. There are three advantages to this:
We can use them in cases (string is a classic case) where there is more than one likely definition of "equals".
We can ignore the definition supplied by the class's author, and provide our own.
We can use them to avoid a particular attack. This attack is based on being in a situation where input you provide will be hashed by the code you are attacking. You pick input so as to deliberately provide objects that are different, but hash the same. This means that the poor performance we talked about avoiding earlier is hit, and it can be so bad that it becomes a denial of service attack. By providing different IEqualityComparer implementations with random elements to the hash code (but the same for every instance of the comparer) we can vary the algorithm enough each time as to thwart the attack. The use for this is rare (it has to be something that will hash based purely on outside input that is large enough for the poor performance to really hurt), but vital when it comes up.
Finally, if we override Equals we may or may not want to overload == and != too. It can be useful to keep them referring to identity only (there are times when that is what we care most about) but it can be useful to have them refer to other semantics ("abc" == "ab" + "c" is an example of an overload).
In summary:
The default equality of reference objects is identity (equal only to itself).
The default equality of value types is a simple comparison of all fields (but poor in performance).
We can change the concept of equality for our classes in either case, but this MUST involve both Equals and GetHashCode*
We can override this and provide another concept of equality.
Dictionary, HashSet, ConcurrentDictionary, etc. all depend on this.
Hashcodes represent a mapping from all values of an object to a 32-bit number.
Hashcodes must be the same for objects we consider equal.
Hashcodes must be spread well.
*Incidentally, anonymous classes have a simple comparison like that of value types, but with better performance, which matches almost any case in which we might care about the hash code of an anonymous type.
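For reference, the second (case-insensitive, order-insensitive) version discussed above can be assembled into one piece roughly as follows. This is a sketch rather than the definitive form: it uses StringComparer.OrdinalIgnoreCase for both equality and hashing so the two stay consistent, and it assumes neither string is null.

```csharp
using System;

public class ParamID : IEquatable<ParamID>
{
    private readonly string _first;
    private readonly string _second;

    public ParamID(string first, string second)
    {
        _first = first;
        _second = second;
    }

    public bool Equals(ParamID other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true; // identity entails equality
        StringComparer c = StringComparer.OrdinalIgnoreCase;
        // Equal if the two strings match in either order.
        return (c.Equals(_first, other._first) && c.Equals(_second, other._second))
            || (c.Equals(_first, other._second) && c.Equals(_second, other._first));
    }

    public override bool Equals(object other)
    {
        return Equals(other as ParamID);
    }

    public override int GetHashCode()
    {
        // XOR is symmetric, so swapping the strings gives the same hash code,
        // which is required because swapped pairs are considered equal.
        StringComparer c = StringComparer.OrdinalIgnoreCase;
        return c.GetHashCode(_first) ^ c.GetHashCode(_second);
    }
}
```

One trade-off of the plain XOR is that a pair with two identical strings (ignoring case) always hashes to 0; that is acceptable for a sketch but worth knowing if such keys are common.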
Most likely, paramID does not implement equality comparison correctly.
It should be implementing IEquatable<paramID> and that means especially that the GetHashCode implementation must adhere to the requirements (see "Notes to implementers").
As for keys in dictionaries, MSDN says:
As long as an object is used as a key in the Dictionary(Of TKey,
TValue), it must not change in any way that affects its hash value.
Every key in a Dictionary(Of TKey, TValue) must be unique according to
the dictionary's equality comparer. A key cannot be Nothing, but a
value can be, if the value type TValue is a reference type.
Dictionary(Of TKey, TValue) requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer(Of T) generic interface by using a constructor
that accepts a comparer parameter; if you do not specify an
implementation, the default generic equality comparer
EqualityComparer(Of T).Default is used. If type TKey implements the
System.IEquatable(Of T) generic interface, the default equality
comparer uses that implementation.
Since you don't show the paramID type I cannot go into more detail.
As an aside: that's a lot of keys and values getting tangled in there. There's a dictionary inside a dictionary, and the keys of the outer dictionary aggregate some kind of value as well. Perhaps this arrangement can be advantageously simplified? What exactly are you trying to achieve?
Use the Dictionary.ContainsKey method.
And so:
Dictionary<string, object> tempDict = new Dictionary<string, object>();
paramID searchKey = new paramID(xKey, xValue);
if(outerDict.ContainsKey(searchKey))
{
outerDict.TryGetValue(searchKey, out tempDict);
tempDict.Add(newKey, newValue);
}
Also don't forget to override the Equals and GetHashCode methods in order to correctly compare two paramIDs:
class paramID
{
// rest of things
public override bool Equals(object obj)
{
paramID p = obj as paramID; // null if obj is null or not a paramID
// how do you determine if two paramIDs are the same?
return p != null && p.key == this.key;
}
public override int GetHashCode()
{
return this.key.GetHashCode();
}
}

Why can't I compare a KeyValuePair<TKey, TValue> with default

In .Net 2.5 I can usually get an equality comparison (==) between a value and its type default
if (myString == default(string))
However I get the following exception when I try to run an equality comparison on a default KeyValuePair and a KeyValuePair
Code Sample (from a pre-extension method, proto-lambda static ListUtilities class :) )
public static TKey
FirstKeyOrDefault<TKey, TValue>(Dictionary<TKey, TValue> lookups,
Predicate<KeyValuePair<TKey, TValue>> predicate)
{
KeyValuePair<TKey, TValue> pair = FirstOrDefault(lookups, predicate);
return pair == default(KeyValuePair<TKey, TValue>) ?
default(TKey) : pair.Key;
}
Exception:
Operator '==' cannot be applied to
operands of type
'System.Collections.Generic.KeyValuePair<string,object>'
and
'System.Collections.Generic.KeyValuePair<string,object>'
Is it because, as a struct, the KeyValuePair is not nullable? If this is the case, why, as, presumably, default was implemented to handle not nullable types?
EDIT
For the record, I chose @Chris Hannon as the selected answer, as he gave me what I was looking for, the most elegant option, and a succinct explanation. However, I do encourage reading @Dasuraga's answer for a very comprehensive explanation of why this is the case.
This happens because KeyValuePair<TKey, TValue> does not define a custom == operator and is not included in the predefined list of value types that can use it.
Here is a link to the MSDN documentation for that operator.
For predefined value types, the equality operator (==) returns true if the values of its operands are equal, false otherwise.
Your best bet for an equality check in this case, because this is not a struct you have control over, is to call default(KeyValuePair<TKey,TValue>).Equals(pair) instead.
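As a sketch, that check can be wrapped in a small helper. KvpDefaultCheck and IsDefault are names invented for this example; the comparison itself is the Equals call suggested above.

```csharp
using System;
using System.Collections.Generic;

public static class KvpDefaultCheck
{
    // True when pair is indistinguishable from default(KeyValuePair<TKey, TValue>).
    // Uses the Equals inherited from ValueType, since KeyValuePair defines no == operator.
    public static bool IsDefault<TKey, TValue>(KeyValuePair<TKey, TValue> pair)
    {
        return default(KeyValuePair<TKey, TValue>).Equals(pair);
    }
}
```

Note that a genuine entry whose key and value both happen to be default values (for example key 0 with a null value in a Dictionary<int, string>) cannot be told apart from "nothing found" this way.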
(If you don't care about the generics discussion linked to this error, you can just jump to the end for your "real" answer)
As the error says, there is no equality testing for KeyValuePairs (i.e. there is no built-in comparison method). The reason for this is to avoid having to place constraints on the types of KeyValuePairs (there are many cases where key,value comparisons would never be made).
Obviously if you want to compare these KeyValuePairs, I'd imagine what you'd want is to check if the keys and values are equal. But this implies a whole mess of things, notably that TKey and TValue are both comparable types (i.e. they implement the IComparable interface).
You could write your own comparison function between keyvaluepairs, for example:
static bool KeyValueEqual<TKey , TValue>(KeyValuePair<TKey, TValue> fst,
KeyValuePair<TKey, TValue> snd)
where TValue:IComparable
where TKey:IComparable
{
return (fst.Value.CompareTo(snd.Value)==0)
&& (snd.Key.CompareTo(fst.Key)==0);
}
(Excuse the awful indentation)
Here we impose that TKey and TValue are both comparable (via the CompareTo member function).
The CompareTo function (as defined for pre-defined types) returns 0 when two objects are equal, à la strcmp. a.CompareTo(b)==0 means a and b are the "same" (in value, not the same object).
so this function would take two KVPs (k,v) and (k',v') and would return true if and only if k==k' and v==v' (in the intuitive sense).
But is this necessary? It seems your test where you're having problems is based on some sort of verification on the return of FirstOrDefault.
But there's a reason your function's called FirstOrDefault:
Returns the first element of the
sequence that satisfies a condition or
a default value if no such element is
found.
(emphasis mine)
This function returns default values if something isn't found, meaning if your predicate isn't verified you'll get a KeyValuePair equal to (default(TKey), default(TValue)).
Your code therefore (intends to) check whether pair.Key==default(TKey), only to return default(TKey) anyways. Wouldn't it just make more sense to return pair.Key from the outset?
In order for you to use the "==" equality operator on any class or struct, it needs to overload the operator: http://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx
KeyValuePair doesn't, and therefore you get the compile error. Note, you'll get the same error if you just try this:
var k1 = new KeyValuePair<int,string>();
var k2 = new KeyValuePair<int,string>();
bool b = k1 == k2; //compile error
EDIT: As Eric Lippert corrected me in the comments, classes obviously don't need to override the equality operator for "==" to be valid. It'll compile fine and do a reference equality check. My mistake.
It fails for the same reason as the following:
var kvp = new KeyValuePair<string,string>("a","b");
var res = kvp == kvp;
The clue is in the error message, naturally. (It has nothing to do with default).
Operator '==' cannot be applied to operands of type 'System.Collections.Generic.KeyValuePair<string,string>' and 'System.Collections.Generic.KeyValuePair<string,string>'
The operator == is not defined for KeyValuePair<T,U>.
Error messages FTW.
Happy coding.
Defaults are pitched at scalar types.
Ask yourself this question: What does it mean for KVP to have a default value?
For non-scalars the default is whatever you get from calling the nil constructor. Assuming that KVP Equals performs instance identity comparison, I would expect it to return false since you get a new object each time the constructor is invoked.
This goes in a slightly different direction, but I am presuming you queried a Dictionary to get this result, and then you want to check if it returned a valid result or not.
I found the better method of doing this was to query out the actual value instead of the whole KeyValuePair, like this:
var valitem = MyDict.Values.FirstOrDefault(x=> x.Something == aVar);
Now you can check if valitem is null or not. Again, it doesn't directly answer your question, but offers what might be an alternative approach to your intended goal.

Generic method to cast one arbitrary type to another in c#

I want to do something like this:
public static TResult MyCast<TSource, TResult>(TSource item)
{
return (TResult)item;
}
Without restrictions on TSource or TResult and avoiding unnecessary boxing if possible.
Edit: I want to stress that I want a simple cast between types, not elaborate type conversion. It would be perfectly OK for the cast to fail for, say, string to int.
Is there any sane way to do this using CLR 2.0?
Edit: this is a simplified version, so it's pretty useless, yes.
But consider casting generic collections, such as this:
public static Dictionary<string, TResult> CastValues<TSource, TResult>(this Dictionary<string, TSource> dictionary)
After some discussions with my co-workers, it seems like there's no simple way to implement such a feature (if at all possible), so I'm stuck with code bloat of several very simple methods for different situations (i.e. up- and downcast of reference types and casting of some value types) :(
Too bad I can't use .NET 4.0 with all its dynamic et al. goodness.
How would
x = MyCast<SourceType, ResultType>(y)
be any more useful than
x = (ResultType)y ?
This is straightforward when TSource and TResult are both reference types.
If one or the other are value types, how do you want it to work? Value types can't inherit from each other, so it's not a matter of doing an up- or down-cast. You might expect numeric conversions between, say, int and double, but you'd have to code these yourself: .NET doesn't treat them as typecasts. And conversion between, say, DateTime and string involves more intelligence (what format? which culture? etc.).
If you're just handling reference types then this method can be a one-liner. If you want to handle value types as well then you'll need to write special case code for the various combinations.
Edit: Convert.ChangeType does a reasonable job at encapsulating the various conversions between value types. However you mentioned you're keen not to introduce boxing: Convert.ChangeType isn't generic and it takes an object.
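A minimal sketch of what that looks like in practice. ChangeTypeDemo and ConvertTo are hypothetical names; Convert.ChangeType is the real framework method, and the invariant culture is passed here only to keep the results predictable.

```csharp
using System;
using System.Globalization;

public static class ChangeTypeDemo
{
    // Converts via Convert.ChangeType. The argument is an object, so value types
    // are boxed on the way in and unboxed on the way out.
    public static TResult ConvertTo<TResult>(object value)
    {
        return (TResult)Convert.ChangeType(value, typeof(TResult), CultureInfo.InvariantCulture);
    }
}
```

This handles int to double or string to int, but it is a conversion rather than a cast, which is exactly the boxing and "more intelligence" trade-off mentioned above.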
I think that the problem you are trying to solve is the same as the problem that you cannot cast a collection of one type to a collection of another type.
eg
class Obj1
{}
class Obj2:Obj1
{}
List<Obj2> srcList = GetList();
List<Obj1> castedList = (List<Obj1>)srcList; // this line won't compile
I have not done much at actually looking at the CLR code
However, on the assumption that it is like C++, what you would have here is actually different values stored in the collection. In other words, srcList would contain a list of pointers to object 2's interface, while in castedList you would have pointers to the interface of the object 1 within object 2.
In order to resolve this you would need to have your casting function iterate through each of the items within the collection. However in order to be able to iterate through the items the list would have to implement some sort of enumeration interface. So the enumeration interface would need to be a constraint on the casting function.
So the answer would therefore be no.
However if you were prepared to implement this with restrictions on the in types you could have:
static class ListCast<TSource,TResult,TItemType>
where TSource:IEnumerable<TItemType>
where TResult:IList<TItemType>,new()
{
public static TResult Cast(TSource list)
{
TResult castedList = new TResult();
foreach (TItemType item in list)
{
castedList.Add((TItemType)item);
}
return castedList;
}
}
you can just do this:
public static TResult MyCast<TSource, TResult>(TSource item)
{
return (TResult)((object)item);
}
Would love to hear how this could be bad.
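One way it can surprise you: casting through object boxes value types, and a boxed value can only be unboxed to its exact type, so numeric conversions that a direct cast would allow fail at run time. A small sketch (CastDemo and Throws are names invented for the demonstration):

```csharp
using System;

public static class CastDemo
{
    public static TResult MyCast<TSource, TResult>(TSource item)
    {
        return (TResult)((object)item);
    }

    // Returns true if MyCast threw InvalidCastException for this combination.
    public static bool Throws<TSource, TResult>(TSource item)
    {
        try
        {
            MyCast<TSource, TResult>(item);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

So MyCast<object, string>("hi") and MyCast<int, IComparable>(5) work, while MyCast<int, double>(5) throws even though (double)5 is a perfectly valid cast when written directly.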

Caching delegate results

I have a C# method which accepts a Predicate<Foo> and returns a list of matching items...
public static List<Foo> FindAll( Predicate<Foo> filter )
{
...
}
The filter will often be one of a common set...
public static class FooPredicates
{
public static readonly Predicate<Foo> IsEligible = ( foo => ...)
...
}
...but may be an anonymous delegate.
I'd now like to have this method cache its results in the ASP.NET cache, so repeated calls with the same delegate just return the cached result. For this, I need to create a cache key from the delegate. Will Delegate.GetHashCode() produce sensible results for this purpose? Is there some other member of Delegate that I should look at? Would you do this another way entirely?
To perform your caching task, you can follow the other suggestions and create a Dictionary<Predicate<Foo>,List<Foo>> (static for global, or member field otherwise) that caches the results. Before actually executing the Predicate<Foo>, you would need to check if the result already exists in the dictionary.
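A rough sketch of that idea, using int items instead of Foo so that it compiles on its own. CachedFinder, Items, Scans and IsEven are all made up for the example, and the sketch is not thread-safe and never expires anything.

```csharp
using System;
using System.Collections.Generic;

public static class CachedFinder
{
    private static readonly int[] Items = { 1, 2, 3, 4, 5, 6 };

    private static readonly Dictionary<Predicate<int>, List<int>> Cache =
        new Dictionary<Predicate<int>, List<int>>();

    // Counts how often the data was actually scanned, for demonstration only.
    public static int Scans;

    // A shared, named predicate: every caller passes the same delegate instance.
    public static readonly Predicate<int> IsEven = i => i % 2 == 0;

    public static List<int> FindAll(Predicate<int> filter)
    {
        List<int> result;
        if (!Cache.TryGetValue(filter, out result))
        {
            Scans++;
            result = new List<int>(Array.FindAll(Items, filter));
            Cache.Add(filter, result);
        }
        return result;
    }
}
```

Callers that reuse a shared predicate such as IsEven hit the cache. Note that the same List instance is handed back each time, so callers should treat it as read-only.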
The general name for this kind of deterministic function caching is memoization, and it's awesome :)
Ever since C# 3.0 added lambdas and the swag of Func/Action delegates, adding memoization to C# is quite easy.
Wes Dyer has a great post that brings the concept to C# with some great examples.
If you want me to show you how to do this, let me know...otherwise, Wes' post should be adequate.
In answer to your query about delegate hash codes: if two delegates are the same, d1.GetHashCode() should equal d2.GetHashCode(), but I'm not 100% sure about this. You can check this quickly by giving memoization a go and adding a WriteLine to your FindAll method. If this ends up not being true, another option is to use Expression<Predicate<Foo>> (from System.Linq.Expressions) as a parameter. If the expressions are not closures, then expressions that do the same thing should be equal.
Let me know how this goes, I'm interested to know the answer about delegate.Equals.
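A rough sketch of the expression idea, assuming the filters are simple non-closure lambdas: the textual form of the expression tree is used as a cache key. ExpressionKeyDemo and KeyFor are hypothetical names for this illustration only.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionKeyDemo
{
    // Builds a string key from the element type and the expression's text.
    // Only meaningful for expressions that do not capture local variables:
    // a captured variable is printed by name, not by value, so two closures
    // over different values could end up with the same key.
    public static string KeyFor<T>(Expression<Predicate<T>> filter)
    {
        return typeof(T).FullName + "|" + filter.ToString();
    }
}
```

Two separately written but identical lambdas, such as i => i > 0 assigned to two Expression<Predicate<int>> variables, are different objects and do not compare equal with Equals, yet KeyFor gives the same string for both.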
Delegate equality looks at each invocation in the invocation list, testing for equality of method to be invoked, and target of method.
The method is a simple piece of the cache key, but the target of the method (the instance to call it on - assuming an instance method) could be impossible to cache in a serializable way. In particular, for anonymous functions which capture state, it will be an instance of a nested class created to capture that state.
If this is all in memory, just keeping the delegate itself as the hash key will be okay - although it may mean that some objects which clients would expect to be garbage collected hang around. If you need to serialize this to a database, it gets hairier.
Could you make your method accept a cache key (e.g. a string) as well? (That's assuming an in-memory cache is inadequate.)
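To make the "method plus target" rule concrete, here is a small sketch. The class and method names are invented for the example. Two delegates over the same static method compare equal, while two delegates created from the same lambda but with separately captured state do not, because each one has its own closure object as its target.

```csharp
using System;

public static class DelegateEqualityDemo
{
    private static bool IsPositive(int i) { return i > 0; }

    // Same static method, no target: the delegates compare equal.
    public static bool SameStaticMethod()
    {
        Predicate<int> a = IsPositive;
        Predicate<int> b = IsPositive;
        return a.Equals(b);
    }

    // Same lambda body, but every call to MakeGreaterThan creates a new
    // closure object, so the two delegates have different targets.
    public static bool SeparatelyCapturedState()
    {
        Predicate<int> a = MakeGreaterThan(0);
        Predicate<int> b = MakeGreaterThan(0);
        return a.Equals(b);
    }

    private static Predicate<int> MakeGreaterThan(int threshold)
    {
        return i => i > threshold;
    }
}
```

SameStaticMethod() returns true and SeparatelyCapturedState() returns false, even though both closures captured the same value.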
Keeping the cached results in a Dictionary<Predicate<Foo>,List<Foo>> is awkward for me because I want the ASP.NET cache to handle expiry for me rather than caching all results forever, but it's otherwise a good solution. I think I'll end up going with Will's Dictionary<Predicate<Foo>,string> to cache a string that I can use in the ASP.NET cache key.
Some initial tests suggest that delegate equality does the "right thing" as others have said, but Delegate.GetHashCode is pathologically unhelpful. Reflector reveals:
public override int GetHashCode()
{
return base.GetType().GetHashCode();
}
So any Predicate<Foo> returns the same hash code.
My remaining issue was how equality works for anonymous delegates. What does "same method called on the same target" mean then? It seems that as long as the delegate was defined in the same place, the delegates compare equal. Delegates with the same body defined in different places do not.
static Predicate<int> Test()
{
Predicate<int> test = delegate(int i) { return false; };
return test;
}
static void Main()
{
Predicate<int> test1 = Test();
Predicate<int> test2 = Test();
Console.WriteLine(test1.Equals( test2 )); // True
test1 = delegate(int i) { return false; };
test2 = delegate(int i) { return false; };
Console.WriteLine(test1.Equals( test2 )); // False
}
This should be OK for my needs. Calls with the predefined predicates will be cached. Multiple calls to one method that calls FindAll with an anonymous method should get cached results. Two methods calling FindAll with apparently the same anonymous method won't share cached results, but this should be fairly rare.
Unless you're sure Delegate's implementation of GetHashCode is deterministic and doesn't result in any collisions, I wouldn't trust it.
Here are two ideas. First, store the results of the delegates within a Predicate/List dictionary, using the predicate as the key, and then store the entire dictionary of results under a single key in the cache. The downside is that you lose all your cached results if the cache item is lost.
An alternative would be to create an extension method for Predicate, GetKey(), that uses an object/string dictionary to store and retrieve all keys for all Predicates. You index into the dictionary with the delegate and return its key, creating one if you don't find it. This way you're assured that you are getting the correct key per delegate and there aren't any collisions. A naive one would be type name + Guid.
The same instance of an object will always return the same hash code (a requirement of GetHashCode() in .NET). If your predicates are inside a static list and you are not redefining them each time, I can't see a problem in using them as keys.
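A possible shape for that GetKey() extension, as a sketch only: the class name, the lock object and the key format are assumptions of this example. Lookups rely on delegate equality, so a poor GetHashCode only costs speed, not correctness. Note that the dictionary keeps every delegate it has seen alive.

```csharp
using System;
using System.Collections.Generic;

public static class PredicateKeyExtensions
{
    // Maps each delegate (by delegate equality) to its generated key.
    private static readonly Dictionary<Delegate, string> Keys =
        new Dictionary<Delegate, string>();
    private static readonly object Sync = new object();

    // Returns a stable string key for the predicate, creating one on first use.
    public static string GetKey<T>(this Predicate<T> predicate)
    {
        if (predicate == null) throw new ArgumentNullException("predicate");
        lock (Sync)
        {
            string key;
            if (!Keys.TryGetValue(predicate, out key))
            {
                // Naive key: type name plus a fresh Guid.
                key = typeof(T).FullName + ":" + Guid.NewGuid().ToString("N");
                Keys[predicate] = key;
            }
            return key;
        }
    }
}
```

The returned string can then be used as (part of) the ASP.NET cache key, so expiry is still handled by the cache itself.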
