What is the best way to tell if an object is modified? - c#

I have an object that is mapped to a cookie as a serialized base-64 string. I only want to write out a new cookie if there are changes made to the object stored in the cookie on server-side.
What I want to do is get a hash code when the object is pulled from the cookie/initialized, and compare that original hash code to the one that exists just before I send the cookie header off to the client, so I don't have to re-serialize/send the cookie unless changes were made.
I was going to override .NET's Object.GetHashCode() method, but I wasn't sure that this is the best way to go about checking whether an object has been modified.
Are there any other ways I can check if an object is modified, or should I override the GetHashCode() method?
Update: I decided to accept rmbarnes's answer as it had an interesting solution to the problem, and because I decided to follow the advice at the end of his post and not check for modification. However, I'd still be interested to hear any other solutions to my scenario.

GetHashCode() should always be in sync with Equals(), and Equals() isn't necessarily guaranteed to check all of the fields in your object (there are certain situations where you want that not to be the case).
Furthermore, GetHashCode() isn't guaranteed to return unique values for all possible object states. It's conceivable (though unlikely) that two object states could result in the same HashCode (which does, after all, only have an int's worth of possible states; see the Pigeonhole Principle for more details).
If you can ensure that Equals() checks all of the appropriate fields, then you could clone the object to record its state, and later check the clone against the new state with Equals() to see if it's changed.
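A minimal sketch of that idea (UserPrefs, LoadFromCookie and WriteCookie are hypothetical stand-ins; it assumes the class implements ICloneable and a field-wise Equals):

// Snapshot the state at load time, compare just before writing the cookie out.
UserPrefs current = LoadFromCookie();               // hypothetical helper
UserPrefs snapshot = (UserPrefs)current.Clone();    // record the original state

// ... request processing may mutate 'current' here ...

if (!current.Equals(snapshot))                      // field-wise comparison
{
    WriteCookie(current);                           // only re-serialize on change
}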
BTW: Your mention of serialization gave me an idea. You could serialize the object, record it, and then when you check for object changing, repeat the process and compare the serialized values. That would let you check for state changes without having to make any code changes to your object. However, this isn't a great solution, because:
It's probably very inefficient
It's prone to serialization changes in the object; you might get false positives on the object state change.

At the end of the object's constructor you could serialize the object to a base 64 string just like the cookie stores it, and store this in a member variable.
When you want to check if the cookie needs recreating, re-serialize the object and compare this new base 64 string against the one stored in the member variable. If it has changed, reset the cookie with the new value.
Watch out for the gotcha: don't include the member variable storing the base 64 serialization in the serialization itself. Some languages have a hook for this (PHP uses the __sleep() magic method); in C# you can mark the field with [NonSerialized] so it isn't included.
This will always work because you are comparing the exact value you'd be saving in the cookie, and wouldn't need to override GetHashCode() which sounds like it could have nasty consequences.
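A minimal sketch of this approach (assuming a hypothetical [Serializable] CookieData class and the classic BinaryFormatter; the snapshot field is excluded via [NonSerialized]):

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class CookieData
{
    public string Name;
    public string Email;

    // Snapshot of the serialized state; must not itself be serialized.
    [NonSerialized]
    private string originalBase64;

    // Call at the end of construction/deserialization.
    public void TakeSnapshot() { originalBase64 = ToBase64(); }

    // True if the current state no longer matches the snapshot.
    public bool IsDirty() { return ToBase64() != originalBase64; }

    private string ToBase64()
    {
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, this);
            return Convert.ToBase64String(stream.ToArray());
        }
    }
}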
All that said, I'd probably just drop the test and always reset the cookie; there can't be much overhead in that compared to doing the change check, and there's far less likelihood of bugs.

Personally, I would say go with the plan you have. A good hash code is the best way to see if an object is "as-is". There are tons of hashing algorithms you can look at; check out the obvious Wikipedia page on hash functions and go from there.
Override GetHashCode and go for it! Just make sure ALL the elements of the object's state make up part of the hash :)
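For example, a common way to fold every field into the hash (the field names here are placeholders):

public override int GetHashCode()
{
    unchecked   // let the arithmetic wrap instead of throwing on overflow
    {
        int hash = 17;
        hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode());
        hash = hash * 31 + Age.GetHashCode();
        hash = hash * 31 + (Email == null ? 0 : Email.GetHashCode());
        return hash;
    }
}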

It seems odd to me that you'd want to store the same object both server side and client side - especially if you're comparing them on each trip.
I'd guess that deserializing the cookie and comparing it to the server side object would be equivalent in performance to just serializing the object again.
But, if you wanted to do this, I'd compare the serialized server side object with the cookie's value and update accordingly. Worst case, you did the serialization for naught. Best case, you did a string compare.
The alternative, deserializing and comparing the objects, has a worst case of deserializing, comparing n fields, and then serializing. Best case is deserializing and comparing n fields.

Related

Object to GUID/UUID

I want to take any object and get a guid that represents that object.
I know that entails a lot of things. I am looking for a good-enough solution for common applications.
My specific use case is caching: I want to know whether the object used to create the thing I am caching has already made one in the past. There would be 2 different types of objects. Each type contains only public properties, and may contain a list/IEnumerable.
Assuming the object is serializable, my first idea was to serialize it to JSON (via the native JsonSerializer or Newtonsoft) and then convert the JSON string to a UUID version 5, as detailed in a gist here: How can I generate a GUID for a string?
My second approach, if it's not serializable (for example, it contains a dictionary), would be to use reflection on the public properties to generate a unique string of some sort and then convert that to a UUID version 5.
Both approaches use UUID version 5 to take a string to a GUID. Is there a proven C# class that makes valid UUID 5 GUIDs? The gist looks good, but I want to be sure.
I was thinking of making the C# namespace and type name the namespace for the UUID 5. Is that a valid use of a namespace?
My first approach is good enough for my simple use case but I wanted to explore the second approach as it's more flexible.
If creating the GUID can't guarantee reasonable uniqueness, it should throw an error. Surely super-complicated objects would fail; how might I know that is the case when using reflection?
I am looking for new approaches or concerns/implementations to the second approach.
Edit: The reason why I bountied/reopened this almost 3 years later is because I need this again (and for caching again); but also because of the introduction of the generic unmanaged constraint in C# 7.3. The blog post at http://devblogs.microsoft.com/premier-developer/dissecting-new-generics-constraints-in-c-7-3/ seems to suggest that if the object can obey the unmanaged spec you can find a suitable key for a key-value store. Am I misunderstanding something?
This is still limited because the object (generic) must obey the unmanaged type constraint, which is very limiting (no strings, no arrays, etc.), but it's one step closer. I don't completely understand why the method of getting the memory and taking a SHA-1 hash can't be done on non-unmanaged types.
I understand that reference types point to places in memory, and it's not as easy to get the memory that represents the whole object; but it feels doable. After all, objects are eventually made up of unmanaged types (a string is an array of chars, etc.).
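For what it's worth, the byte-grabbing part under the C# 7.3 constraint can look like this (a sketch assuming .NET Core 2.1+ for MemoryMarshal.CreateSpan):

using System;
using System.Runtime.InteropServices;
using System.Security.Cryptography;

static class UnmanagedId
{
    // T : unmanaged guarantees no reference-type fields, so the raw bytes
    // fully describe the value. Caveat: padding bytes between struct fields
    // are not guaranteed to be zeroed, so two "equal" values can still hash
    // differently unless the layout has no padding.
    public static Guid ToGuid<T>(T value) where T : unmanaged
    {
        byte[] bytes = MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref value, 1)).ToArray();
        using (var md5 = MD5.Create())
            return new Guid(md5.ComputeHash(bytes));   // MD5 = 16 bytes = one Guid
    }
}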
PS: The GUID requirement is loose; any integer/string at or under 512 bits would suffice.
The problem of equality is a difficult one.
Here are some thoughts on how you could solve your problem.
Hashing a serialized object
One method would be to serialize an object and then hash the result as proposed by Georg.
Using an MD5 checksum gives you a strong checksum with the right input, but getting the input right is the problem.
You might have trouble using a common serialization framework, because:
They don't care whether a float is 1.0 or 1.000000000000001.
They might have a different understanding about what is equal than you / your employer.
They bloat the serialized text with unneeded symbols. (performance)
Just a little deviation in the serialized text causes a large deviation in the hashed GUID/UUID.
That's why you should carefully test any serialization you do; otherwise you might get false positives/negatives for objects (mostly false negatives).
Some points to think about:
Floats & Doubles:
Always write them the same way, preferably with the same number of digits to prevent something like 1.000000000000001 vs 1.0 from interfering.
DateTime, TimeStamp, etc.:
Apply a fixed format that won't change and is unambiguous.
Unordered collections:
Sort the data before serializing it. The ordering must be unambiguous.
Strings:
Is the equality case-sensitive? If not, make all the strings lower- or upper-case.
If necessary, make them culture invariant.
More:
For every type, think carefully what is equal and what is not. Think especially about edge cases. (float.NaN, -0 vs 0, null, etc.)
It's up to you whether you use an existing serializer or do it yourself.
Doing it yourself is more work and error prone, but you have full control over all aspects of equality and serialization.
Using an existing serializer is also error prone, because you need to test or prove that the results are always what you want.
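As an illustration, here is a minimal sketch of a hand-rolled canonical form for one hypothetical type, applying the points above before hashing to a GUID:

using System;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

class Order
{
    public double Price;
    public DateTime Created;
    public string[] Tags;   // an unordered collection

    // Canonical form: round-trip number format, fixed date format,
    // invariant culture, sorted collections.
    public string ToCanonicalString()
    {
        return string.Join("|",
            Price.ToString("R", CultureInfo.InvariantCulture),
            Created.ToString("o", CultureInfo.InvariantCulture),
            string.Join(",", Tags.OrderBy(t => t, StringComparer.Ordinal)));
    }

    public Guid ToGuid()
    {
        using (var md5 = MD5.Create())
            return new Guid(md5.ComputeHash(Encoding.UTF8.GetBytes(ToCanonicalString())));
    }
}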
Introducing an unambiguous order and using a tree
If you have control over the source code, you can introduce a custom order function.
The order must take all properties, sub objects, lists, etc. into account.
Then you can create a binary tree, and use the order to insert and lookup objects.
The same problems mentioned for the first approach still apply; you need to make sure that equal values are detected as such.
The big O performance is also worse than using hashing. But in most real-life examples, the actual performance should be comparable or at least fast enough.
The good thing is, you can stop comparing two objects, as soon as you found a property or value that is not equal. Thus no need to always look at the whole object.
A binary tree needs O(log2(n)) comparisons for a lookup, thus that would be quite fast.
The bad thing is, you need access to all actual objects, thus keep them in memory.
A hashtable needs only O(1) comparisons for a lookup, thus would even be faster (theoretically at least).
Put them in a database
If you store all your objects in a database, then the database can do the lookup for you.
Databases are quite good at comparing objects, and they have built-in mechanisms to handle the equality/near-equality problem.
I'm not a database expert, so for this option, someone else might have more insight on how good this solution is.
As others have said in comments, it sounds like GetHashCode might do the trick for you if you're willing to settle for int as your key. If not, there is a Guid constructor that takes a byte[] of length 16. You could try something like the following:
using System;
using System.Linq;
using System.Text;

class Foo
{
    public int A { get; set; }
    public char B { get; set; }
    public string C { get; set; }

    public Guid GetGuid()
    {
        byte[] aBytes = BitConverter.GetBytes(A);
        byte[] bBytes = BitConverter.GetBytes(B);
        // BitConverter.GetBytes has no string overload; encode the string instead.
        byte[] cBytes = Encoding.UTF8.GetBytes(C ?? string.Empty);
        byte[] padding = new byte[16];
        // Pad to at least 16 bytes, then truncate: note that only the first
        // 16 bytes of state end up influencing the Guid.
        byte[] allBytes =
            aBytes
            .Concat(bBytes)
            .Concat(cBytes)
            .Concat(padding)
            .Take(16)
            .ToArray();
        return new Guid(allBytes);
    }
}
As said in the comments, there is no bullet entirely out of silver here, but a few that come quite close. Which of them to use depends on the types you want to use your class with and your context, e.g. when do you consider two objects to be equal. However, be aware that you will always face possible conflicts, a single GUID will not be sufficient to guarantee collision avoidance. All you can do is to decrease the probability of a collision.
In your case,
already made one in the past
sounds like you don't want reference equality but a notion of value equality. The simplest way is to trust that the classes implement value equality, because in that case you would already be done with GetHashCode; but that has a higher probability of collisions because it is only 32 bits. Further, you would be assuming that whoever wrote the class did a good job, which is not always a good assumption to make, particularly since people tend to blame you rather than themselves.
Otherwise, your best chance is serialization combined with a hashing algorithm of your choice. I would recommend MD5 because it is the fastest and produces the 128 bits you need for a GUID. If your types really do consist of public properties only, I would suggest using an XmlSerializer, like so:
// Requires: System, System.Collections.Generic, System.IO,
// System.Security.Cryptography, System.Xml.Serialization.
private readonly MD5 _md5 = new MD5CryptoServiceProvider();
private readonly Dictionary<Type, XmlSerializer> _serializers = new Dictionary<Type, XmlSerializer>();

public Guid CreateID(object obj)
{
    if (obj == null) return Guid.Empty;

    // XmlSerializer construction is expensive, so cache one per type.
    var type = obj.GetType();
    if (!_serializers.TryGetValue(type, out var serializer))
    {
        serializer = new XmlSerializer(type);
        _serializers.Add(type, serializer);
    }

    using (var stream = new MemoryStream())
    {
        serializer.Serialize(stream, obj);
        stream.Position = 0;
        // MD5 produces exactly the 16 bytes a Guid needs.
        return new Guid(_md5.ComputeHash(stream));
    }
}
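Usage might look like this (assuming the fields and method above live on some hypothetical IdFactory class, with Foo as one of your property-only types):

var factory = new IdFactory();
Guid a = factory.CreateID(new Foo { A = 1, C = "x" });
Guid b = factory.CreateID(new Foo { A = 1, C = "x" });
// a == b: equal public state serializes to the same XML, so the hashes match.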
Just about all serializers have their drawbacks. XmlSerializer is not capable of serializing cyclic object graphs, DataContractSerializer requires your types to have dedicated attributes, and the old serializers based on SerializableAttribute require that attribute to be set. You somehow have to make assumptions.

What to return when overriding Object.GetHashCode() in classes with no immutable fields?

Ok, before you get all mad because there are hundreds of similar sounding questions posted on the internet, I can assure you that I have just spent the last few hours reading all of them and have not found the answer to my question.
Background:
Basically, one of my large scale applications had been suffering from a situation where some Bindings on the ListBox.SelectedItem property would stop working or the program would crash after an edit had been made to the currently selected item. I initially asked the 'An item with the same key has already been added' Exception on selecting a ListBoxItem from code question here, but got no answers.
I hadn't had time to address that problem until this week, when I was given a number of days to sort it out. Now to cut a long story short, I found out the reason for the problem. It was because my data type classes had overridden the Equals method and therefore the GetHashCode method as well.
Now for those of you that are unaware of this issue, I discovered that you can only implement the GetHashCode method using immutable fields/properties. Using an excerpt from Harvey Kwok's answer to the Overriding GetHashCode() post to explain this:
The problem is that GetHashCode is being used by Dictionary and HashSet collections to place each item in a bucket. If hashcode is calculated based on some mutable fields and the fields are really changed after the object is placed into the HashSet or Dictionary, the object can no longer be found from the HashSet or Dictionary.
So the actual problem was caused because I had used mutable properties in the GetHashCode methods. When users changed these property values in the UI, the associated hash code values of the objects changed and then items could no longer be found in their collections.
Question:
So, my question is what is the best way of handling the situation where I need to implement the GetHashCode method in classes with no immutable fields? Sorry, let me be more specific, as that question has been asked before.
The answers in the Overriding GetHashCode() post suggest that in these situations, it is better to simply return a constant value... some suggest returning the value 1, while others suggest returning a prime number. Personally, I can't see any difference between these suggestions, because I would have thought there would only be one bucket used for either of them.
Furthermore, the Guidelines and rules for GetHashCode article in Eric Lippert's Blog has a section titled Guideline: the distribution of hash codes must be "random" which highlights the pitfalls of using an algorithm that results in not enough buckets being used. He warns of algorithms that decrease the number of buckets used and cause a performance problem when the bucket gets really big. Surely, returning a constant falls into this category.
I had an idea of adding an extra Guid field to all of my data type classes (just in C#, not the database) specifically to be used in and only in the GetHashCode method. So I suppose at the end of this long intro, my actual question is which implementation is better? To summarise:
Summary:
When overriding Object.GetHashCode() in classes with no immutable fields, is it better to return a constant from the GetHashCode method, or to create an additional readonly field for each class, solely to be used in the GetHashCode method? If I should add a new field, what type should it be and shouldn't I then include it in the Equals method?
While I am happy to receive answers from anyone, I am really hoping to receive answers from advanced developers with a sound knowledge on this subject.
Go back to basics. You read my article; read it again. The two ironclad rules that are relevant to your situation are:
if x equals y then the hash code of x must equal the hash code of y. Equivalently: if the hash code of x does not equal the hash code of y then x and y must be unequal.
the hash code of x must remain stable while x is in a hash table.
Those are requirements for correctness. If you can't guarantee those two simple things then your program will not be correct.
You propose two solutions.
Your first solution is that you always return a constant. That meets the requirement of both rules, but you are then reduced to linear searches in your hash table. You might as well use a list.
The other solution you propose is to somehow produce a hash code for each object and store it in the object. That is perfectly legal provided that equal items have equal hash codes. If you do that then you are restricted such that x equals y must be false if the hash codes differ. This seems to make value equality basically impossible. Since you wouldn't be overriding Equals in the first place if you wanted reference equality, this seems like a really bad idea, but it is legal provided that equals is consistent.
I propose a third solution, which is: never put your object in a hash table, because a hash table is the wrong data structure in the first place. The point of a hash table is to quickly answer the question "is this given value in this set of immutable values?" and you don't have a set of immutable values, so don't use a hash table. Use the right tool for the job. Use a list, and live with the pain of doing linear searches.
A fourth solution is: hash on the mutable fields used for equality, remove the object from all hash tables it is in just before every time you mutate it, and put it back in afterwards. This meets both requirements: the hash code agrees with equality, and hashes of objects in hash tables are stable, and you still get fast lookups.
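A minimal sketch of that fourth solution (Person and the rename helper are illustrative, not from the question):

using System.Collections.Generic;

class Person
{
    public string Name;
    public override bool Equals(object o) => o is Person p && p.Name == Name;
    public override int GetHashCode() => Name == null ? 0 : Name.GetHashCode();
}

static class Example
{
    // Re-home the object around the mutation so its bucket stays
    // consistent with its hash while it is inside the table.
    public static void Rename(HashSet<Person> people, Person p, string newName)
    {
        people.Remove(p);    // remove under the old hash
        p.Name = newName;    // mutate the fields Equals/GetHashCode use
        people.Add(p);       // re-add under the new hash
    }
}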
I would either create an additional readonly field or else throw NotSupportedException. In my view the other option is meaningless. Let's see why.
Distinct (fixed) hash codes
Providing distinct hash codes is easy, e.g.:
class Sample
{
    private static int counter;
    private readonly int hashCode;

    public Sample() { this.hashCode = counter++; }

    public override int GetHashCode()
    {
        return this.hashCode;
    }

    public override bool Equals(object other)
    {
        return object.ReferenceEquals(this, other);
    }
}
Technically you have to look out for creating too many objects and overflowing the counter here, but in practice I think that's not going to be an issue for anyone.
The problem with this approach is that instances will never compare equal. However, that's perfectly fine if you only want to use instances of Sample as indexes into a collection of some other type.
Constant hash codes
If there is any scenario in which distinct instances should compare equal then at first glance you have no other choice than returning a constant. But where does that leave you?
Locating an instance inside a container will always degenerate to the equivalent of a linear search. So in effect by returning a constant you allow the user to make a keyed container for your class, but that container will exhibit the performance characteristics of a LinkedList<T>. This might be obvious to someone familiar with your class, but personally I see it as letting people shoot themselves in the foot. If you know from beforehand that a Dictionary won't behave as one might expect, then why let the user create one? In my view, better to throw NotSupportedException.
But throwing is what you must not do!
Some people will disagree with the above, and when those people are smarter than oneself then one should pay attention. First of all, this code analysis warning states that GetHashCode should not throw. That's something to think about, but let's not be dogmatic. Sometimes you have to break the rules for a reason.
However, that is not all. In his blog post on the subject, Eric Lippert says that if you throw from inside GetHashCode then
your object cannot be a result in many LINQ-to-objects queries that use hash tables
internally for performance reasons.
Losing LINQ is certainly a bummer, but fortunately the road does not end here. Many (all?) LINQ methods that use hash tables have overloads that accept an IEqualityComparer<T> to be used when hashing. So you can in fact use LINQ, but it's going to be less convenient.
In the end you will have to weigh the options yourself. My opinion is that it's better to operate with a whitelist strategy (provide an IEqualityComparer<T> whenever needed) as long as it is technically feasible because that makes the code explicit: if someone tries to use the class naively they get an exception that helpfully tells them what's going on and the equality comparer is visible in the code wherever it is used, making the extraordinary behavior of the class immediately clear.
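For instance, the whitelist approach could look like this (Point and PointComparer are illustrative):

using System;
using System.Collections.Generic;
using System.Linq;

class Point
{
    public int X, Y;
    public override bool Equals(object o) => o is Point p && p.X == X && p.Y == Y;
    // Fail fast instead of silently degrading a Dictionary to a linear search.
    public override int GetHashCode() =>
        throw new NotSupportedException("Point is mutable; pass a PointComparer explicitly.");
}

class PointComparer : IEqualityComparer<Point>
{
    public bool Equals(Point a, Point b) => a?.X == b?.X && a?.Y == b?.Y;
    public int GetHashCode(Point p) => unchecked(p.X * 31 + p.Y);
}

// LINQ still works when the comparer is supplied explicitly, e.g.:
// var distinct = points.Distinct(new PointComparer()).ToList();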
Where I want to override Equals, but there is no sensible immutable "key" for an object (and for whatever reason it doesn't make sense to make the whole object immutable), in my opinion there is only one "correct" choice:
Implement GetHashCode to hash the same fields as Equals uses. (This might be all the fields.)
Document that these fields must not be altered while in a dictionary.
Trust that users either don't put these objects in dictionaries, or obey the second rule.
(Returning a constant value compromises dictionary performance. Throwing an exception disallows too many useful cases where objects are cached but not modified. Any other implementation for GetHashCode would be wrong.)
Where this runs the user into trouble anyway, it's probably their fault. (Specifically: using a dictionary where they shouldn't, or using a model type in a context where they should be using a view-model type that uses reference equality instead.)
Or perhaps I shouldn't be overriding Equals in the first place.
If the classes truly contain nothing constant on which a hash value can be calculated then I would use something simpler than a GUID. Just use a random number persisted in the class (or in a wrapper class).
A simple approach is to store the hashCode in a private member and generate it on the first use. If your entity doesn't change often, and you're not going to be using two different objects that are Equal (where your Equals method returns true) as keys in your dictionary, then this should be fine:
private int? _hashCode;

public override int GetHashCode() {
    if (!_hashCode.HasValue)
        // Combine whatever fields your Equals method uses
        // (Property1/Property2 are placeholders).
        _hashCode = Property1.GetHashCode() ^ Property2.GetHashCode();
    return _hashCode.Value;
}
However, suppose you have objects a and b, where a.Equals(b) == true, and you store an entry in your dictionary using a as the key (dictionary[a] = value).
If a does not change, then dictionary[b] will return value; however, if you change a after storing the entry in the dictionary, then dictionary[b] will most likely fail.
The only workaround to this is to rehash the dictionary when any of the keys change.

Snappy names for a ReferenceType value and a ValueType value

I have 2 classes. One handles a ReferenceType value; the other does the same for a ValueType value. This is the only difference, but it is important. I am struggling to find a decent name for each class:
ReferenceTypeValueHandler and ValueTypeValueHandler?
Neah, ValueTypeValue sounds confusing.
ClassValueHandler and StructValueHandler?
I shouldn't use "Class" in a name of a class, should I?
NullableValueHandler and NonNullableValueHandler?
"Nullable" is already used for nullable value types (Nullable<>)
HeapValueHandler and StackValueHandler?
That's dumb. It exploits the fact that reference type values are stored on the heap and value type values on the stack, but who cares? Also, "Stack" is confusing, implying it has something to do with a stack data structure.
Any more ideas?
Update:
Some people suggest I should explain the purpose of the class. Well, although I don't think it's important, here it is: I am working on an XML-to-entity deserializer. I use XmlReader to take advantage of streaming reading rather than working with the DOM. As I read XML I build entities. Some entities are just wrappers for other ones. These wrappers can take either a single entity or a collection (enumerable) of entities.
For those which take a single entity, the entity has to be provided, and it has to be provided exactly one time. If the XML doesn't have it, that's a problem. If the XML has more than one, that's a problem too. So for keeping the entity, and ensuring that it is provided exactly one time, I have a class ValueKeeper<TValue>. It has 2 methods: TakeValue(TValue value) and TValue ClaimValue(). The TakeValue method takes the value and checks whether a value was already provided before; if so, it throws an exception with appropriate details. The ClaimValue method is called once the reading of the wrapper XML is finished and the wrapper entity has to be created over the scraped value; this method checks whether a value was received via the TakeValue method: if so, it just returns that value; if not, it throws an exception.
Now, the problem is that for reference type values I am using comparison to NULL to see if the value was provided. To make such a comparison possible there must be a generic constraint on the TValue type parameter: where TValue : class. With this constraint in place I cannot use this class for value type values. So I need another class that does the same, but operates on values where TValue : struct, using a Nullable<TValue> field to keep the provided or not-yet-provided value. Now, with 2 classes I cannot get along with just ValueKeeper; I need one name for the reference type and another for the value type version. This is where the question comes from. I need a way to express this subtle difference. But again, it's not important what the class does; what's important is to find an appropriate way to make the difference clear.
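For context, a minimal sketch of the two variants being named (the member names come from the description above; the struct-variant name is a placeholder, since naming it is exactly the question):

using System;

// Reference-type variant: "no value yet" is represented by null.
class ValueKeeper<TValue> where TValue : class
{
    private TValue value;

    public void TakeValue(TValue v)
    {
        if (value != null) throw new InvalidOperationException("Value already provided.");
        value = v;
    }

    public TValue ClaimValue()
    {
        if (value == null) throw new InvalidOperationException("No value was provided.");
        return value;
    }
}

// Value-type variant: Nullable<TValue> tracks whether a value was provided.
class StructValueKeeper<TValue> where TValue : struct   // placeholder name
{
    private TValue? value;

    public void TakeValue(TValue v)
    {
        if (value.HasValue) throw new InvalidOperationException("Value already provided.");
        value = v;
    }

    public TValue ClaimValue()
    {
        if (!value.HasValue) throw new InvalidOperationException("No value was provided.");
        return value.Value;
    }
}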
I wouldn't agree that the rest of the class name is not important. You want to make your code speak for itself and make it easy for the reader to understand the concepts you had in mind when designing the classes/structs. The class names you suggest would give me no idea of what the class is actually doing. I suggest searching for more concrete names: How is the value being handled? What value?
How do struct and class values differ from each other apart from that one is a class and the other one a struct? There must be some more difference because otherwise it wouldn't make sense to have the same thing as a struct and as a class (DRY).
If it's a very abstract operation you perform, try to search for the pattern, or a general name for a concept. To keep the value and make sure it was provided sounds a bit like a caching mechanism?
Secondly, you're facing a semantic issue here: what is the term that subsumes 'values' of value types and 'values' of reference types? We could simply ask the inheritance chain of the .NET Framework here and call both an object.
So, in this case, something like CacheForValueTypeObjects and CacheForReferenceTypeObjects could work. I don't know whether cache expresses the purpose well, but if not, I would try to search for a term which best describes the 'final' purpose of the class, the reason why it's there.
I bet you didn't think 'Well, what I really need now is a ValueTypeValueHandler!'. There was something more to it. ;) I like this kind of question, thanks!

Is there any harm in having many enum values? (many >= 1000)

I have a large list of error messages that my biz code can return based on what's entered. The list may end up with more than a thousand.
I'd like to just enum these all out, using the [Description("")] attribute to record the friendly message.
Something like:
public enum ErrorMessage
{
    [Description("A first name is required for users.")]
    User_FirstName_Required = 1,

    [Description("The first name is too long. It cannot exceed 32 characters.")]
    User_FirstName_Length = 2,

    ...
}
I know enums are primitive types, integers specifically. There shouldn't be any problem with that many integers, right?
Is there something I'm not thinking of? It seems like this should be okay, but I figured I should ask the community before spending the time to do it this way.
Does .Net care about enum types differently when they have lots of values?
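For reference, here is how I'd planned to fetch the friendly message back out of the attribute (a sketch; the helper name is mine, and you'd want to cache results if it's called often):

using System;
using System.ComponentModel;
using System.Reflection;

public static class EnumDescriptions
{
    // Returns the [Description] text, falling back to the enum member's name.
    public static string Get(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        var attr = (DescriptionAttribute)Attribute.GetCustomAttribute(field, typeof(DescriptionAttribute));
        return attr == null ? value.ToString() : attr.Description;
    }
}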
Update
The reason I didn't want to use Resources is because
a) I need to be able to reference each unique error message with an integer value. The biz layer services an API, in addition to other things, and a list of integer values has to be returned denoting the errors. I don't believe Resources allows you to address a resource value with an integer. Am I wrong?
b) There are no localization requirements.
I think a design that has 1,000+ values in an enum needs some more thought. Sounds like a "God Enum" anti-pattern will have to be invented for this case.
The main downside I'd point out with having the friendly description in an Attribute is that this will cause challenges if you ever need to localize your app for another language. If this is a consideration, it would be a good idea to put the strings in a resource file.
The enum itself should not be a problem, though having all of your error codes in one master list can be confusing. You may consider creating separate enums for separate categories of return codes, as this will make it easier for developers to understand the possible return values for a particular function. You can still give them distinct numeric values (by specifying the numeric values explicitly) if it's important that the codes be unique.
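For example, explicit numeric ranges keep the codes unique across separate enums (the ranges here are illustrative):

public enum UserError
{
    FirstNameRequired = 1000,
    FirstNameTooLong = 1001,
}

public enum OrderError
{
    QuantityOutOfRange = 2000,
    UnknownProduct = 2001,
}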
On a side note, the .NET BCL does not make much use of return codes and return codes are somewhat discouraged in modern .NET development. They create maintainability issues (you can almost never remove old return codes or risk breaking backwards compatibility) and they require special validation logic to handle the returns for every call. Stateful validation can be accomplished with IDataErrorInfo, where you use an intermediate class that can represent invalid states, but that only allows a Commit of changes that are validated. This allows you to manipulate the object freely, but also provide feedback to the user as to the validity of its state. The equivalent logic with error codes often requires a switch statement for each use.
1000 is not many; you should just make sure that the underlying integer type is big enough (don't use byte as the underlying type if you might exceed 256 values).
On second thought, 1000 is tons if you're manually entering them; if they are generated from some data set it could kind of make sense...
I fully agree with duffymo. An enum with 1000+ values smells bad from a design point of view. Not to mention that it would be quite nasty for the developer to use IntelliSense on such a GOD ENUM :-)
I would rather go with using resources.
I think it's very bad; for error handling you can simply use resources. As I see it, you want to use reflection to fetch the description, and that's bad too.
If you don't want to use resources, you can define a different enum for each of your business rules. Also, your different business areas don't need each other's error messages (and it shouldn't be like this).

Check if an object is the same

I'm doing some queries against an Active Directory, building up my own Dictionary to contain name, phone and email per user.
Then I store each user to a file, something like this:
domain\groupt\group\user;<checksum>
where the path before the ; is the unique id for the user (a user can be in different groups, so I have to track that) and <checksum> is the .GetHashCode() of the Dictionary object for that user.
Then I have a timer that checks AD every 30 seconds, builds up the same Dictionary for the user, and looks up the id and checksum from the file.
BUT this is not working. For some reason .GetHashCode() generates a new int when there is no change to name, phone or email... so I'm thinking it's not the function to use.
So how could I check if an object has changed, in the way I tried to describe above?
You may have to override the GetHashCode method of your user object to create your custom one.
Something like
public override int GetHashCode()
{
    // Concatenate the fields that identify the user, then hash the result.
    return string.Concat(this.Domain, this.Name /* , ...the rest of the identifying fields */).GetHashCode();
}
But anyway, comparing hash codes only assures you that the objects are not equal when the results differ; if the hash codes are the same, you still have to check whether the contents are the same or not.
A hash code is useful for distributing objects in hashtables, but not for real comparison.
You'd be better off implementing the IComparable interface in your classes.
I suspect the reason your hash codes are changing is that you have not overridden GetHashCode() in the class that combines your three fields, so it is using the default implementation, which returns a hash code specific to the particular object instance (not its values). However, I would recommend not using GetHashCode at all for your problem, as you need these values to persist to disk and be useful between invocations of your application.
GetHashCode() returns a hash that you can rely on only within the context of a single process instance (a single run). Implementations of GetHashCode are free to return values that are based upon memory ordering of elements, which can change between process runs; in addition, the values generated may differ between process architectures or between versions of .NET. See the documentation of GetHashCode for more details.
If you want something you can rely on when saved to disk and reloaded, you should switch to a hash function with behaviour that is well defined (or, alternatively, that you can control). A list of such algorithms can be found on the 'list of hash functions' Wikipedia page. In particular, the non-cryptographic ones are suitable for this task. (Although there is no reason, other than performance, why you couldn't use a cryptographic one.)
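A minimal sketch of that switch, using SHA-256 over the three fields (a cryptographic choice, simply because it's in the BCL; the separator and encoding are arbitrary but must stay fixed between runs):

using System;
using System.Security.Cryptography;
using System.Text;

static class UserChecksum
{
    // Stable across runs, architectures and .NET versions, unlike GetHashCode().
    public static string Compute(string name, string phone, string email)
    {
        string canonical = name + "\n" + phone + "\n" + email;
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(canonical));
            return Convert.ToBase64String(hash);
        }
    }
}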
One point I haven't seen mentioned is that when you override GetHashCode(), you should also override Equals() to keep them consistent and to avoid unexpected behaviour.
Try to override GetHashCode() to return something unique.
