Aloha,
Here's a simple class that overrides GetHashCode:
class OverridesGetHashCode
{
public string Text { get; set; }
public override int GetHashCode()
{
return (Text != null ? Text.GetHashCode() : 0);
}
// overriding Equals() doesn't change anything, so I'll leave it out for brevity
}
When I create an instance of that class, add it to a HashSet and then change its Text property, like this:
var hashset = new HashSet<OverridesGetHashCode>();
var oghc = new OverridesGetHashCode { Text = "1" };
hashset.Add(oghc);
oghc.Text = "2";
then this doesn't work:
var removedCount = hashset.RemoveWhere(c => ReferenceEquals(c, oghc));
// fails, nothing is removed
Assert.IsTrue(removedCount == 1);
and neither does this:
// this line works, i.e. it does find a single item matching the predicate
var existing = hashset.Single(c => ReferenceEquals(c, oghc));
// but this fails; nothing is removed again
var removed = hashset.Remove(existing);
Assert.IsTrue(removed);
I guess the hash it internally uses is generated when item is inserted and, if that's true, it's
understandable that hashset.Contains(oghc) doesn't work.
I also guess it looks up item by its hash code and if it finds a match, only then it checks the predicate, and that might be why the first test fails (again, I'm just guessing here).
But why does the last test fail, I just got that object out of the hashset? Am I missing something, is this a wrong way to remove something from a HashSet?
Thank you for taking the time to read this.
UPDATE: To avoid confusion, here's the Equals():
protected bool Equals(OverridesGetHashCode other)
{
return string.Equals(Text, other.Text);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != this.GetType()) return false;
return Equals((OverridesGetHashCode) obj);
}
By changing the hash code of your object while that object is being used in a HashSet is a violation of the HashSet's contract.
Being unable to remove the object is not the problem here. You are not allowed to change the hash code in the first place.
Let me quote from MSDN:
The GetHashCode method for an object must consistently return the same
hash code as long as there is no modification to the object state that
determines the return value of the object's Equals method. Note that
this is true only for the current execution of an application, and
that a different hash code can be returned if the application is run
again.
They tell the story a little differently but the essence is the same. They say, the hash code can never change. In practice, you can change it as long as you make sure no one uses the old hash code anymore. Not that this is good practice, but it works.
It's important that any items added to a hash based table (HashSet, Dictionary, etc.) not be modified once they are inserted into the structure (at least not until they are removed).
To find an object in the data structure it computes it hash code, and then finds a location based on that hash code. If you mutate that object then the hash code it returns no longer reflects it's current location in that data structure (unless you're very, very lucky and it just happens to be a hash collision).
On the MSDN page for Dictionary is says:
As long as an object is used as a key in the Dictionary<TKey, TValue>, it must not change in any way that affects its hash value.
This same assertion applies to HashSet as well, as they both are implemented using hash tables.
There are good answers here and just wanted to add this. If you look at the decompiled HashSet<T> code, you'll see that Add(value) does the following:
Calls IEqualityComparer<T>.GetHashCode() to get the hash code for value. For the default comparer this boils down to GetHashCode().
Uses that hash code to calculate which "bucket" and "slot" the (reference to) value should be stored in.
Stores the reference.
When you call Remove(value) it does steps 1. and 2. again, to find where the reference is at. Then it calls IEqualityComparer<T>.Equals() to make sure that it indeed has found the right value. However, since you've changed what GetHashCode() returns, it calculates a different bucket/slot location, which is invalid. Thus, it cannot find the object.
So, note that Equals() doesn't really come into play here, because it will never even get to the right bucket/slot location if the hash code changes.
Related
Probably I miss something using a HashSet and HashCode, but I don´t know why this doesn't work as I thought. I have an object with the HashCode is overridden. I added the object to a HashSet and later I change a property (which is used to calculate the HashCode) then I can´t remove the object.
public class Test
{
public string Code { get; set; }
public override int GetHashCode()
{
return (Code==null)?0: Code.GetHashCode();
}
}
public void TestFunction()
{
var test = new Test();
System.Collections.Generic.HashSet<Test> hashSet = new System.Collections.Generic.HashSet<Test>();
hashSet.Add(test);
test.Code = "Change";
hashSet.Remove(test); //this doesn´t remove from the hashset
}
Firstly, you're overriding GetHashCode but not overriding Equals. Don't do that. They should always be overridden at the same time, in a consistent manner.
Next, you're right that changing an objects hash code will affect finding it in any hash-based data structures. After all, the first part of checking for the presence of a key in a hash-based structure is to find candidate equal values by quickly finding all existing entries with the same hash code. There's no way that the data structure can "know" that the hash code has been changed and update its own representation.
The documentation makes this clear:
You can override GetHashCode for immutable reference types. In general, for mutable reference types, you should override GetHashCode only if:
You can compute the hash code from fields that are not mutable; or
You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.
public void TestFunction()
{
var test = new Test();
System.Collections.Generic.HashSet<Test> hashSet = new System.Collections.Generic.HashSet<Test>();
test.Code = "Change";
hashSet.Add(test);
hashSet.Remove(test); //this doesn´t remove from the hashset
}
first set the value into object of Test, then add it to HashSet.
HashCode is used to find object in the HashSet/ Dictionary. If hash code changed object no longer can be found in the HashSet because when looked up by new HashSet the bucket for items with that new hash code will not (most likely) contain the object (which is in bucket marked with old hash code).
Note that after initial search by hash code Equals will be used to perform final match, but it does not apply to your particular case as you have the same object and use default Equals that compares references.
Detailed explanation/guidelines - Guidelines and rules for GetHashCode by Eric Lippert.
I have a problem in an Windows Forms application which uses a XNA screen. I want to see if a change happens to an object after completing multiple lines of code. If it does, it should add a * to the Title to tell the user the file did changes but hasn't saved.
To do that, I copy the object and after those lines I check if they are equal.
MapSquare afterChange = TileMap.GetMapSquareAtPixel((int)mouseLoc.X,(int)mouseLoc.Y);
MapSquare beforeChange = (MapSquare)afterChange.Clone();
// code.....
if (!Object.Equals(beforeChange,afterChange))
parentForm.MapChanged = true; // this happens even when no changes happend
This mistake must be in my Clone Method I used with the Iclonable interface because even when I check equals right after copying it, it doesn't work.
public object Clone()
{
return new MapSquare(this);
}
private MapSquare(MapSquare squere)
{
this.LayerTiles = (int[])squere.LayerTiles.Clone();
this.CodeValue = squere.CodeValue;
this.Behavior = squere.Behavior;
}
What's the mistake? I think it's in the layertiles array but I already tried many things there, so I don't know what to do. Or is there even a other much easier way to solve my problem?
You would need to override Object.Equals for your MapSquare type to make it compare equality based on the values. By default, Object.Equals only returns true if the two variables refer to the same actual instance - not if they have the same member values.
If you plan to do this, I would recommend implementing IEquatable<MapSquare>, as well.
Unless you overload the equality operator, you are testing whether the two object references are equal, not whether they contain identical values.
I am trying to implement simple algorithm with use of C#'s Dictionary :
My 'outer' dictionary looks like this : Dictionary<paramID, Dictionary<string, object>> [where paramID is simply an identifier which holds 2 strings]
if key 'x' is already in the dictionary then add specific entry to this record's dictionary, if it doesn't exist then add its entry to the outer Dictionary and then add entry to the inner dictionary.
Somehow, when I use TryGetValue it always returns false, therefore it always creates new entries in the outer Dictionary - what produces duplicates.
My code looks more or less like this :
Dictionary<string, object> tempDict = new Dictionary<string, object>();
if(outerDict.TryGetValue(new paramID(xKey, xValue), out tempDict))
{
tempDict.Add(newKey, newValue);
}
Block inside the ifis never executed, even if there is this specific entry in the outer Dictionary.
Am I missing something ? (If you want I can post screen shots from debugger - or something else if you desire)
If you haven't over-ridden equals and GetHashCode on your paramID type, and it's a class rather than a struct, then the default equality meaning will be in effect, and each paramID will only be equal to itself.
You likely want something like:
public class ParamID : IEquatable<ParamID> // IEquatable makes this faster
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
//change for case-insensitive, culture-aware, etc.
return other != null && _first == other._first && _second == other._second;
}
public override bool Equals(object other)
{
return Equals(other as ParamID);
}
public override int GetHashCode()
{
//change for case-insensitive, culture-aware, etc.
int fHash = _first.GetHashCode();
return ((fHash << 16) | (fHash >> 16)) ^ _second.GetHashCode();
}
}
For the requested explanation, I'm going to do a different version of ParamID where the string comparison is case-insensitive and ordinal rather than culture based (a form that would be appropriate for some computer-readable codes (e.g. matching keywords in a case-insensitive computer language or case-insensitive identifiers like language tags) but not for something human-readable (e.g. it will not realise that "SS" is a case-insensitive match to "ß"). This version also considers {"A", "B"} to match {"B", "A"} - that is, it doesn't care what way around the strings are. By doing a different version with different rules it should be possible to touch on a few of the design considerations that come into play.
Let's start with our class containing just the two fields that are it's state:
public class ParamID
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
}
At this point if we do the following:
ParamID x = new ParamID("a", "b");
ParamID y = new ParamID("a", "b");
ParamID z = x;
bool a = x == y;//a is false
bool b = z == x;//b is true
Because by default a reference type is only equal to itself. Why? Well firstly, sometimes that's just what we want, and secondly it isn't always clear what else we might want without the programmer defining how equality works.
Note also, that if ParamID was a struct, then it would have equality defined much like what you wanted. However, the implementation would be rather inefficient, and also buggy if it contained a decimal, so either way it's always a good idea to implement equality explicitly.
The first thing we are going to do to give this a different concept of equality is to override IEquatable<ParamID>. This is not strictly necessary, (and didn't exist until .NET 2.0) but:
It will be more efficient in a lot of use cases, including when key to a Dictionary<TKey, TValue>.
It's easy to do the next step with this as a starting point.
Now, there are four rules we must follow when we implement an equality concept:
An object must still be always equal to itself.
If X == Y and X != Z, then later if the state of none of those objects has changed, X == Y and X != Z still.
If X == Y and Y == Z, then X == Z.
If X == Y and Y != Z then X != Z.
Most of the time, you'll end up following all these rules without even thinking about it, you just have to check them if you're being particularly strange and clever in your implementation. Rule 1 is also something that we can take advantage of to give us a performance boost in some cases:
public class ParamID : IEquatable<ParamID>
{
private readonly string _first; //not necessary, but immutability of keys prevents other possible bugs
private readonly string _second;
public ParamID(string first, string second)
{
_first = first;
_second = second;
}
public bool Equals(ParamID other)
{
if(other == null)
return false;
if(ReferenceEquals(this, other))
return true;
if(string.Compare(_first, other._first, StringComparison.InvariantCultureIgnoreCase) == 0 && string.Compare(_second, other._second, StringComparison.InvariantCultureIgnoreCase) == 0)
return true;
return string.Compare(_first, other._second, StringComparison.InvariantCultureIgnoreCase) == 0 && string.Compare(_second, other._first, StringComparison.InvariantCultureIgnoreCase) == 0;
}
}
The first thing we've done is see if we're being compared with equality to null. We almost always want to return false in such cases (not always, but the exceptions are very, very rare and if you don't know for sure you're dealing with such an exception, you almost certainly are not), and certainly we don't want to throw a NullReferenceException.
The next thing we do is to see if the object is being compared with itself. This is purely an optimisation. In this case, it's probably a waste of time, but it can be very useful with more complicated equality tests, so it's worth pointing out this trick here. This takes advantage of the rule that identity entails equality, that is, any object is equal to itself (Ayn Rand seemed to think this was somehow profound).
Finally, having dealt with these two special cases, we get to the actual rule for equality. As I said above, my example considers two objects equal if they have the same two strings, in either order, for case-insensitive ordinal comparisons, so I've a bit of code to work that out.
(Note that the order in which we compare component parts can have a performance impact. Not in this case, but with a class that contains both an int and a string we would compare the ints first because is faster and we will hence perhaps find an answer of false before we even look at the strings)
Now at this point we've a good basis for overriding the Equals method defined in object:
public override bool Equals(object other)
{
return (other as ParamID);
}
Since as will return a ParamID reference if other is a ParamID and null for anything else (including if null was what we were passed in the first place), and since we already handle comparison with null, we're all set.
Try to compile at this point and you will get a warning that you have overriden Equals but not GetHashCode (the same is true if you'd done it the other way around).
GetHashCode is used by the dictionary (and other hash-based collections like HashTable and HashSet) to decide where to place the key internally. It will take the hashcode, re-hash it down to a smaller value in a way that is its business, and use it to place the object in its internal store.
Because of this, it's clear why the following is a bad idea were ParamID not readonly on all fields:
ParamID x = new ParamID("a", "b");
dict.Add(x, 33);
x.First = "c";//x will now likely never be found in dict because its hashcode doesn't match its position!
This means the following rules apply to hash-codes:
Two objects considered equal, must have the same hashcode. (This is a hard rule, you will have bugs if you break it).
While we can't guarantee uniqueness, the more spread out the returned results, the better. (Soft rule, you will have better performance the better you do at it).
(Well, 2½.) While not a strict rule, if we take such a complicated approach to point 2 above that it takes forever to return a result, the nett effect will be worse than if we had a poorer-quality hash. So we want to try to be reasonably quick too if we can.
Despite the last point, it's rarely worth memoising the results. Hash-based collections will normally memoise the value themselves, so it's a waste to do so in the object.
For the first implementation, because our approach to equality depended upon the default approach to equality of the strings, we could use strings default hashcode. For my different version I'll use another approach that we'll explore more later:
public override int GetHashCode()
{
return StringComparer.OrdinalIgnoreCase.GetHashCode(_first) ^ StringComparer.OrdinalIgnoreCase.GetHashCode(_second);
}
Let's compare this to the first version. In both cases we get hashcodes of the component parts. If the values where integers, chars or bytes we would have worked with the values themselves, but here we build on the work done in implementing the same logic for those parts. In the first version we use the GetHashCode of string itself, but since "a" has a different hashcode to "A" that won't work here, so we use a class that produces a hashcode ignoring that difference.
The other big difference between the two is that in the first case we mix the bits up more with ((fHash << 16) | (fHash >> 16)). The reason for this is to avoid duplicate hashes. We can't produce a perfect hashcode where every different object has a different value, because there are only 4294967296 possible hashcode values, but many more possible values for ParamID (including null, which is treated as having a hashcode of 0). (There are cases where prefect hashes are possible, but they bring in different concerns than here). Because of this imperfection we have to think not only about what values are possible, but which are likely. Generally, shifting bits like we've done in the first version avoids common values having the same hash. We don't want {"A", "B"} to hash the same as {"B", "A"}.
It's an interesting experiment to produce a deliberately poor GetHashCode that always returns 0, it'll work, but instead of being close to O(1), dictionaries will be O(n), and poor as O(n) goes for that!
The second version doesn't do that, because it has different rules so for it we actually want to consider values the same but for being switch around as equal, and hence with the same hashcode.
The other big difference is the use of StringComparer.OrdinalIgnoreCase. This is an instance of StringComparer which, among other interfaces, implements IEqualityComparer<string> and IEqualityComparer. There are two interesting things about the IEqualityComparer<T> and IEqualityComparer interfaces.
The first is that hash-based collections (such as dictionary) all use them, it's just that unless passed an instance of one to their constructor they will use DefaultEqualityComparer which calls into the Equals and GetHashCode methods we've described above.
The other, is that it allows us to ignore the Equals and GetHashCode mentioned above, and provide them from another class. There are three advantages to this:
We can use them in cases (string is a classic case) where there is more than one likely definition of "equals".
We can ignore that by the class' author, and provide our own.
We can use them to avoid a particular attack. This attack is based on being in a situation where input you provide will be hashed by the code you are attacking. You pick input so as to deliberately provide objects that are different, but hash the same. This means that the poor performance we talked about avoiding earlier is hit, and it can be so bad that it becomes a denial of service attack. By providing different IEqualityComparer implementations with random elements to the hash code (but the same for every instance of the comparer) we can vary the algorithm enough each time as to twart the attack. The use for this is rare (it has to be something that will hash based purely on outside input that is large enough for the poor performance to really hurt), but vital when it comes up.
Finally. If we override Equals we may or may not want to override == and != too. It can be useful to keep them refering to identity only (there are times when that is what we care most about) but it can be useful to have them refer to other semantics (`"abc" == "ab" + "c" is an example of an override).
In summary:
The default equality of reference objects is identity (equal only to itself).
The default equality of value types is a simple comparison of all fields (but poor in performance).
We can change the concept of equality for our classes in either case, but this MUST involve both Equals and GetHashCode*
We can override this and provide another concept of equality.
Dictionary, HashSet, ConcurrentDictionary, etc. all depend on this.
Hashcodes represent a mapping from all values of an object to a 32-bit number.
Hashcodes must be the same for objects we consider equal.
Hashcodes must be spread well.
*Incidentally, anonymous classes have a simple comparison like that of value types, but better performance, which matches almost any case in which we mght care about the hash code of an anonymous type.
Most likely, paramID does not implement equality comparison correctly.
It should be implementing IEquatable<paramID> and that means especially that the GetHashCode implementation must adhere to the requirements (see "Notes to implementers").
As for keys in dictionaries, MSDN says:
As long as an object is used as a key in the Dictionary(Of TKey,
TValue), it must not change in any way that affects its hash value.
Every key in a Dictionary(Of TKey, TValue) must be unique according to
the dictionary's equality comparer. A key cannot be Nothing, but a
value can be, if the value type TValue is a reference type.
Dictionary(Of TKey, TValue) requires an equality implementation to
determine whether keys are equal. You can specify an implementation of
the IEqualityComparer(Of T) generic interface by using a constructor
that accepts a comparer parameter; if you do not specify an
implementation, the default generic equality comparer
EqualityComparer(Of T).Default is used. If type TKey implements the
System.IEquatable(Of T) generic interface, the default equality
comparer uses that implementation.
Since you don't show the paramID type I cannot go into more detail.
As an aside: that's a lot of keys and values getting tangled in there. There's a dictionary inside a dictionary, and the keys of the outer dictionary aggregate some kind of value as well. Perhaps this arrangement can be advantageously simplified? What exactly are you trying to achieve?
Use the Dictionary.ContainsKey method.
And so:
Dictionary<string, object> tempDict = new Dictionary<string, object>();
paramID searchKey = new paramID(xKey, xValue);
if(outerDict.ContainsKey(searchKey))
{
outerDict.TryGetValue(searchKey, out tempDict);
tempDict.Add(newKey, newValue);
}
Also don't forget to override the Equals and GetHashCode methods in order to correctly compare two paramIDs:
class paramID
{
// rest of things
public override bool Equals(object obj)
{
paramID p = (paramID)obj;
// how do you determine if two paramIDs are the same?
if(p.key == this.key) return true;
return false;
}
public override int GetHashCode()
{
return this.key.GetHashCode();
}
}
I'm reading up on core C# programming constructs and having a hard time wrapping my head around the out parameter modifier. I know what it does by reading but am trying to think of a scenerio when I would use it.
Can someone give me a real-world example? Thanks.
The main motivation to using an out parameter is to allow a function to return multiple values to the caller and everyone else provided examples in the framework. I'll take a different approach to answering your question by exploring the reasoning behind having out parameters in the first place. I won't write out actual examples but describe them.
Normally you have only one mechanism to return values, the function's return value. Sure you could use a global (static) or instance variables too but that's not very practical nor safe to do in general (for reasons I won't explain here). Prior to .NET 3.5, there wasn't a really practical way to return multiple values from a function. If out or ref modifiers were not available, you would have a few options:
If all your values had the same type, you could return some collection of the values. This is perfectly fine in most cases, you could return an array of number, list of strings, whatever. This is perfect if all the values were related in exactly the same way. i.e., All numbers were the number of items in a container, or the list was of names of guests at a party. But what if the values you returned represented different quantities? What if they had different types? A list of objects could hold them all but it is not a very intuitive way to manipulate that sort of data.
For the case when you need to return multiple values of different types, the only practical option you had was to create a new class/struct type to encapsulate all these values and return an instance of that type. Doing so you could return strongly typed values with intuitive names and you could return multiple values this way. The problem is that in order to get that, you had to define the type with a specific name and everything just to be able to return multiple values. What if you wanted to return only two values which were simple enough making it impractical to create a type for it? You have a couple more options again:
You could create a set of generic types to contain a fixed amount of values of varying types (like a tuple in functional languages). But it is not as appealing to do so in a reusable manner since it wasn't part of the framework at the time. It could be put in a library but now you add a dependency on that library just for the sake of these simple types. (just be glad that .NET 4.0 now includes the Tuple type) But this still doesn't solve the fact that these are simple values which means added complexity for a simple task.
The option that was used was to include an out modifier which allows the caller to pass a "reference" to a variable so that the function may set the referenced variable as another way to return a value. This way of returning values was also available in C and C++ in many ways for the same reasons and played a role in influencing this decision. However the difference in C# is that for an out parameter, the function must set the value to something. If it doesn't, it results in a compiler error. This makes this less error prone since by having an out parameter, you're promising the caller that you will set the value to something and they can use it, the compiler makes sure you stick to that promise.
A note on the typical usage of the out (or ref) modifier, it will be rare to see more than one or two out parameters. In those cases, it will almost always be a much better idea to create the encapsulating type. You would typically use it if you needed just one more value to return.
However since C#-3.0/.NET-3.5 with the introduction of anonymous types and tuples introduced in .NET 4.0, these options provided alternative methods to return multiple values of varying types easier (and more intuitive) to do.
there are many scenarios where you would use it, but the main one would be where your method needs to return more then one parameter. Take, for example, the TryParse methods on int type. In this case, instead of throwing an exception a bool is returned as a success/failure flag and the parsed int is return as the out param. if you were to call int.Parse(...) you could potentially throw an exception.
string str = "123456";
int val;
if ( !int.TryParse(str,out val) )
{
// do some error handling, notify user, etc.
}
Sure, take a look at any of the TryParse methods, such as int.TryParse:
The idea is you actually want two pieces of information: whether a parse operation was successful (the return value), and, if so, what the result of it actually was (the out parameter).
Usage:
string input = Console.ReadLine();
int value;
// First we check the return value, which is a bool
// indicating success or failure.
if (int.TryParse(input, out value))
{
// On success, we also use the value that was parsed.
Console.WriteLine(
"You entered the number {0}, which is {1}.",
value,
value % 2 == 0 ? "even" : "odd"
);
}
else
{
// Generally, on failure, the value of an out parameter
// will simply be the default value for the parameter's
// type (e.g., default(int) == 0). In this scenario you
// aren't expected to use it.
Console.WriteLine(
"You entered '{0}', which is not a valid integer.",
input
);
}
Many developers complain of out parameters as a "code smell"; but they can be by far the most appropriate choice in many scenarios. One very important modern example would be multithreaded code; often an out parameter is necessary to permit "atomic" operations where a return value would not suffice.
Consider for example Monitor.TryEnter(object, ref bool), which acquires a lock and sets a bool atomically, something that wouldn't be possible via a return value alone since the lock acquisition would necessarily happen before the return value were assigned to a bool variable. (Yes, technically ref and out are not the same; but they're very close).
Another good example would be some of the methods available to the collection classes in the System.Collections.Concurrent namespace new to .NET 4.0; these provide similarly thread-safe operations such as ConcurrentQueue<T>.TryDequeue(out T) and ConcurrentDictionary<TKey, TValue>.TryRemove(TKey, out TValue).
Output parameters are found all over the .NET framework. Some of the uses I see most often are the TryParse methods, which return a boolean (indicating whether or not the parse was valid) and the actual result is returned via the output parameter. While it's also very common place to use a class when you need to return multiple values, in such an example as this it's a little heavy handed. For more on output parameters, see Jon Skeet's article on Parameter passing in C#.
Simple, when you have a method that returns more than one value.
One of the most "famous" cases is Dictionary.TryGetValue:
string value = "";
if (openWith.TryGetValue("tif", out value))
{
Console.WriteLine("For key = \"tif\", value = {0}.", value);
}
else
{
Console.WriteLine("Key = \"tif\" is not found.");
}
As others have said - out parameters allow us to return more than one value from a method call without having to wrap the results in struct/class.
The addition of the xxx.TryParse methods greatly simplified the coding necessary to convert between a string value (frequently from the UI) and a primitive type.
An example of what you might have had to write to achieve the same functionality is here:
/// <summary>
/// Example code for how <see cref="int.TryParse(string,out int)"/> might be implemented.
/// </summary>
/// <param name="integerString">A string to convert to an integer.</param>
/// <param name="result">The result of the parse if the operation was successful.</param>
/// <returns>true if the <paramref name="integerString"/> parameter was successfully
/// parsed into the <paramref name="result"/> integer; false otherwise.</returns>
public bool TryParse(string integerString, out int result)
{
try
{
result = int.Parse(integerString);
return true;
}
catch (OverflowException)
{
// Handle a number that was correctly formatted but
// too large to fit into an Int32.
}
catch (FormatException)
{
// Handle a number that was incorrectly formatted
// and so could not be converted to an Int32.
}
result = 0; // Default.
return false;
}
The two exception checks that are avoided here make the calling code much more readable. I believe that the actual .NET implementations avoid the exceptions altogether so perform better as well. Similarly, this example shows how IDictionary.TryGetValue(...) makes code simpler and more efficient:
private readonly IDictionary<string,int> mDictionary = new Dictionary<string, int>();
public void IncrementCounter(string counterKey)
{
if(mDictionary.ContainsKey(counterKey))
{
int existingCount = mDictionary[counterKey];
mDictionary[counterKey] = existingCount + 1;
}
else
{
mDictionary.Add(counterKey, 1);
}
}
public void TryIncrementCounter(string counterKey)
{
int existingCount;
if (mDictionary.TryGetValue(counterKey, out existingCount))
{
mDictionary[counterKey] = existingCount + 1;
}
else
{
mDictionary.Add(counterKey, 1);
}
}
And all thanks to the out parameter.
bool Int32.TryParse(String, out Int);
or something similar like Dictionary.TryGetValue.
But I would consider this one to be a not too good practice to employ it, of course, using those provided by API like the Int32 one to avoid Try-Catch is exceptions.
//out key word is used in function instead of return. we can use multiple parameters by using out key word
public void outKeyword(out string Firstname, out string SecondName)
{
Firstname = "Muhammad";
SecondName = "Ismail";
}
//on button click Event
protected void btnOutKeyword_Click(object sender, EventArgs e)
{
string first, second;
outKeyword(out first, out second);
lblOutKeyword.Text = first + " " + second;
}
Assuming a method with the following signature
bool TryXxxx(object something, out int toReturn)
What is it acceptable for toReturn to be if TryXxxx returns false?
In that it's infered that toReturn should never be used if TryXxxx fails does it matter?
If toReturn was a nulable type, then it would make sense to return null. But int isn't nullable and I don't want to have to force it to be.
If toReturn is always a certain value if TryXxxx fails we risk having the position where 2 values could be considered to indicate the same thing. I can see this leading to potential possible confusion if the 'default' value was returned as a valid response (when TryXxxx returns true).
From an implementation point if view it looks like having toReturn be a[ny] value is easiest, but is there anything more important to consider?
I would explicitly document it as using the default value for the type (whatever that type is; so 0 in this case but default(T) in a more general case). There are various cases where the default value is exactly what you want if the value isn't present - and in that case you can just ignore the return value from the method.
Note that this is what methods like int.TryParse and Dictionary.TryGetValue do.
It could be default(int):
bool TryXxxx(object something, out int toReturn)
{
toReturn = default(int);
return false;
}
I would say default, but really it shouldn't matter. The convention with the TryX is that the caller should check the return value and only use the out parameter when the method returns true.
Basically it is something. I would document it as "not defined". Sensible values are:
default()
Minvalue, MaxCValue, NEWvalue (as new int ()), null
NAN value (Double.NaN)
But in general, I woul really say "not defined" and not give people something they may try to rely on.
1) Actually, I think it should not matter because you should always check the boolean result of those methods before processing the value. That's what the TryXXX methods are for.
2) However, in such cases, I always refer to the implementation in the .NET framework to guarantee consistency. And a quick look into Reflector shows that the Int32 type returns 0 if parsing failed:
internal static unsafe bool TryParseInt32(string s, NumberStyles style, NumberFormatInfo info, out int result)
{
byte* stackBuffer = stackalloc byte[1 * 0x72];
NumberBuffer number = new NumberBuffer(stackBuffer);
result = 0; // <== see here!
if (!TryStringToNumber(s, style, ref number, info, false))
{
return false;
}
if ((style & NumberStyles.AllowHexSpecifier) != NumberStyles.None)
{
if (!HexNumberToInt32(ref number, ref result))
{
return false;
}
}
else if (!NumberToInt32(ref number, ref result))
{
return false;
}
return true;
}
However, not knowing the implementation details, it might still happen that the parsing problem occurs when a value has already been assigned (partly). In this case, the value might no longer be 0. Therefore, you should always stick to "solution" 1)! :-)
gehho.
Prior to .net, a normal pattern was for TryXX methods to simply leave the passed-by-reference argument unmodified. This was a very useful pattern, since it meant that code which wanted to use a default value could do something like:
MyNum = 5;
TryParsing(myString, &myNum);
while code that didn't want to use a default value could use:
if (TryParsing(myString, &myNum))
{ .. code that uses myNum .. }
else
{ .. code that doesn't use myNum .. }
In the former usage, the calling code would have to ensure myNum was initialized before the call, but would not have to worry about the TryParsing return value. In the latter usage, the calling code would have to worry about the return value, but would not have to initialize myNum before the call. The TryParsing routine itself wouldn't have to worry about which usage was intended.
C# does not very well permit such a pattern, however, unless TryParsing is written in another language. Either TryParsing has to be written in such a way that the previous value of myNum will unconditionally be overwritten without having been examined, the caller must unconditionally initialize it, or different methods must be provided for the two scenarios. If the TryParsing method were written in some other language, it could in theory behave like the old-style methods (write the argument on success, and leave it alone if not) while still calling it an out parameter. I would not recommend this, however, because the quirky behavior would not be confined to that out parameter.
Consider, for example, that a method of that style used an argument of type fooStruct, and fooStruct had a constructor that looked like:
fooStruct(string st)
{
fooStruct.TryParse(st, out this);
}
The compiler would be perfectly happy with such a constructor, since it "definitely" writes this. On the other hand, if some other code does:
while(someCondition)
{
var s = new fooStruct(someString);
...
}
One might expect that s will either hold an initialized structure (if someString is valid) or be blank (if it isn't). Nothing about that code would suggest that s could kept its value between repetitions of the loop. Nonetheless, that is exactly what would likely happen.