Trying to understand GetHashCode() - C#

I found the following on Microsoft documentation:
Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.
I made my own tests to understand the method:
public static void HashMethod()
{
    List<Cliente> listClientTest = new List<Cliente>
    {
        new Cliente { ID = 1, name = "Marcos", Phones = "2222" }
    };

    List<Empresa> CompanyList = new List<Empresa>
    {
        new Empresa { ID = 1, name = "NovaQuimica", Clients = listClientTest },
        new Empresa { ID = 1, name = "NovaQuimica", Clients = listClientTest }
    };
    CompanyList.Add(CompanyList[0]);

    foreach (var item in CompanyList)
    {
        Console.WriteLine("Hash code = {0}", item.GetHashCode());
    }

    Console.WriteLine("CompanyList[0].Equals(CompanyList[1]) = {0}", CompanyList[0].Equals(CompanyList[1]));
    Console.WriteLine("CompanyList[0].Equals(CompanyList[2]) = {0}", CompanyList[0].Equals(CompanyList[2]));
}
My question is: how can two different objects return the same hash code? I believed that if two objects return the same hash code, they must be equal (that's what my method seems to show). Run my method and check it out.

A simple observation based on the pigeonhole principle:
GetHashCode returns an int - a 32 bit integer.
There are 4,294,967,296 32-bit integers;
Considering only uppercase English letters, there are 141,167,095,653,376 ten-letter words. If we include upper- and lowercase, then we have 144,555,105,949,057,024 combinations.
Since there are more objects than available hash-codes, some (different) objects must have the same hash code.
Another, more real-world example: if you wanted to give each person on Earth a hash code, you would have collisions, since there are more people than 32-bit integers.
"Fun" fact: because of the birthday paradox, in a city of 100,000 people you have a more than 50% chance of a hash collision.

Here is an example:
string s1 = "AMY";
string s2 = "MAY";
Two different objects, but if the hash code were calculated from, say, the sum of the characters' ASCII codes, it would be the same for "MAY" and "AMY".
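A sketch of such a naive character-sum hash (purely illustrative; no real string implementation works this way):
static int NaiveHash(string s)
{
    int sum = 0;
    foreach (char c in s)
        sum += c; // add each character's code
    return sum;
}
// NaiveHash("AMY") == NaiveHash("MAY") == 231, yet the strings differ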
You should basically understand the concept of hashing for this.
Hashing an object means finding a value (a number) that can be reproduced by the very same instance again and again.
Because hash codes from Object.GetHashCode() are of type int, you can only have 2^32 different values.
That's why you get so-called "collisions", depending on the hashing algorithm, when two distinct objects produce the same hash code.
To understand them better, you can go through a series of good examples:
- the pigeonhole principle (sock picking, hair counting)
- the softball team
- the birthday problem.
Hope this helps.

You can read about hashing on the wiki page, but the whole point of hashing is to convert a value into an index, which is done with a hashing function. Hashing functions vary, but pretty much all of them end with a mod to constrain the index within a maximum so it can be used in an array. For each mod n there are infinitely many numbers that yield the same index (e.g. 5 mod 2 and 7 mod 2 both give 1).
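A minimal sketch of that final mod step (the 0x7FFFFFFF mask is one common way to keep the index non-negative; the helper name is made up):
static int BucketIndex(object value, int bucketCount)
{
    // Clear the sign bit so the remainder is non-negative,
    // then constrain the hash to [0, bucketCount).
    return (value.GetHashCode() & 0x7FFFFFFF) % bucketCount;
}
// 5 % 2 == 1 and 7 % 2 == 1: infinitely many hashes share each index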

You probably just need to read up on Hash Functions in general to make sure you understand that. From Wikipedia:
Hash functions are primarily used to generate fixed-length output data
that acts as a shortened reference to the original data
So essentially you know that you are taking a large (potentially infinite) set of possibilities and trying to fit it into a smaller, more manageable set of possibilities. Because the two sets have different sizes, you're guaranteed to have collisions, where two different source objects share the same hash. That said, a good hash function minimizes those collisions as much as possible.

Hash code is an int, which has 2^32 different values. Now take the String class - it can have infinitely many different values, so we can conclude that there must be different String values with the same hash code.
To find hash collisions you can exploit the birthday paradox. For instance, for Doubles it could be:
// requires using System.Globalization; for CultureInfo
Random gen = new Random();
Dictionary<int, Double> dict = new Dictionary<int, Double>();

// In general it'll take about
// 2 * sqrt(2^32) = 2 * 65536 = 131072 ≈ 1.3e5 iterations
// to find a hash collision (two unequal values with the same hash)
while (true) {
    Double d = gen.NextDouble();
    int key = d.GetHashCode();
    if (dict.ContainsKey(key)) {
        Console.Write(d.ToString(CultureInfo.InvariantCulture));
        Console.Write(".GetHashCode() == ");
        Console.Write(dict[key].ToString(CultureInfo.InvariantCulture));
        Console.Write(".GetHashCode() == ");
        Console.Write(key.ToString(CultureInfo.InvariantCulture));
        break;
    }
    dict.Add(key, d);
}
In my case
0.540086061479564.GetHashCode() == 0.0337553788133689.GetHashCode() == -1350313817

The purpose of a hash code is to allow code which receives an object to quickly identify things that the object cannot possibly be equal to. Suppose a collection class is asked to store many objects it knows nothing about other than how to test them for equality, and is then handed another object and asked whether it matches any of the objects it has stored. The collection would have to call Equals on every object it holds. On the other hand, if the collection can call GetHashCode on each item that's added, as well as on the item it's looking for, and 99% of the objects in the collection report a hash code which doesn't match the hash code of the item being sought, then only the 1% of objects whose hash code does match need to be examined.
The fact that two items' hash codes match won't help compare those two items any faster than could have been done without checking their hash codes, but the fact that two items' hash codes don't match eliminates any need to examine them further. In scenarios where items are far more likely not to match than to match, hash codes make it possible to accelerate the non-match case, sometimes by many orders of magnitude.
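A sketch of that filtering idea over a plain list (a hypothetical ContainsByHashFilter helper, not how any real collection is implemented):
static bool ContainsByHashFilter<T>(List<T> items, T sought)
{
    int soughtHash = sought.GetHashCode();
    foreach (T item in items)
    {
        // Cheap rejection: a differing hash code proves inequality...
        if (item.GetHashCode() != soughtHash)
            continue;
        // ...so only the rare hash matches pay for a full Equals.
        if (item.Equals(sought))
            return true;
    }
    return false;
}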

Related

How can hashset.contains be O(1) with this implementation?

HashSet.Contains implementation in .NET is:
/// <summary>
/// Checks if this hashset contains the item
/// </summary>
/// <param name="item">item to check for containment</param>
/// <returns>true if item contained; false if not</returns>
public bool Contains(T item) {
    if (m_buckets != null) {
        int hashCode = InternalGetHashCode(item);
        // see note at "HashSet" level describing why "- 1" appears in for loop
        for (int i = m_buckets[hashCode % m_buckets.Length] - 1; i >= 0; i = m_slots[i].next) {
            if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {
                return true;
            }
        }
    }
    // either m_buckets is null or wasn't found
    return false;
}
And I read in a lot of places "search complexity in hashset is O(1)". How?
Then why does that for-loop exist?
Edit: .net reference link: https://github.com/microsoft/referencesource/blob/master/System.Core/System/Collections/Generic/HashSet.cs
The classic implementation of a hash table works by assigning elements to one of a number of buckets, based on the hash of the element. If the hashing were perfect, i.e. no two elements had the same hash, we'd be living in a perfectly perfect world where we wouldn't need to care about anything - any lookup would always be O(1), because we'd only need to compute the hash, get the bucket and see if something is inside.
We're not living in a perfectly perfect world. First off, consider string hashing. In .NET, there are (2^16)^n possible strings of length n; GetHashCode returns an int, and there are 2^32 possible values of int. That's exactly enough to hash every string of length 2 to a unique int, but if we want strings longer than that, there must exist two different values that give the same hash - this is called a collision. Also, we don't want to maintain 2^32 buckets at all times anyway. The usual way of dealing with that is to take the hash code and compute its value modulo the number of buckets to determine the bucket's number.[1] So, the takeaway is - we need to allow for collisions.
The referenced .NET Framework implementation uses the simplest way of dealing with collisions - every bucket holds a linked list of all objects that hash to that bucket. You add object A, and it's assigned to bucket i. You add object B, it has the same hash, so it's added to the list in bucket i right after A. Now if you look up any element, you need to traverse the list of all those objects and call the actual Equals method to find out whether that thing is actually the one you're looking for. That explains the for loop - in the worst case you have to go through the entire list.
Okay, so how "search complexity in hashset is O(1)"? It's not. The worst case complexity is proportional to the number of items. It's O(1) on average.2 If all objects fall to the same bucket, asking for the elements at the end of the list (or for ones that are not in the structure but would fall into the same bucket) will be O(n).
So what do people mean by "it's O(1) on average"? The structure monitors the ratio of objects to buckets - the load factor - and resizes when it exceeds some threshold. It's easy to see that this makes the average lookup time proportional to the load factor.
That's why it's important for hash functions to be uniform, meaning that the probability that two randomly chosen distinct objects get the same int assigned is 1/2^32.[3] That keeps the distribution of objects in a hash table uniform, so we avoid pathological cases where one bucket contains a huge number of items.
Note that if you know the hash function and the algorithm used by the hash table, you can force such a pathological case and O(n) lookups. If a server takes inputs from a user and stores them in a hash table, an attacker knowing the hash function and the hash table implementation could use this as a vector for a denial-of-service attack. There are ways of dealing with that too. Treat this as a demonstration that yes, the worst case can be O(n) and that people are generally aware of that.
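A sketch of how such a pathological case can be forced (a deliberately terrible hash; every instance lands in one bucket, so a HashSet degrades into a linear scan):
sealed class Pathological
{
    public int Value;

    public override int GetHashCode() => 1; // all instances collide
    public override bool Equals(object obj) =>
        obj is Pathological p && p.Value == Value;
}

// Every Add and Contains now walks one ever-growing chain:
var set = new HashSet<Pathological>();
for (int i = 0; i < 20_000; i++)
    set.Add(new Pathological { Value = i });        // O(n) per insert
bool found = set.Contains(new Pathological { Value = -1 }); // O(n) lookup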
There are dozens of other, more complicated ways hash tables can be implemented. If you're interested, you'll need to research them on your own. Since lookup structures are so commonplace in computer science, people have come up with all sorts of crazy optimisations that minimise not only the theoretical number of operations, but also things like CPU cache misses.
[1] That's exactly what's happening in the statement int i = m_buckets[hashCode % m_buckets.Length] - 1
[2] At least the ones using naive chaining are not. There exist hash tables with worst-case constant time complexity. But usually they're worse in practice compared to the theoretically (in regards to time complexity) slower implementations, mainly due to CPU cache misses.
[3] I'm assuming the domain of possible hashes is the set of all ints, so there are 2^32 of them, but everything I wrote generalises to any other non-empty, finite set of values.

Comparing string hashes on different machines

I have a bunch of IDs in string form, like "enemy1", "enemy2".
I want to save progress depending on how many of each enemy I have killed. For that I use a dictionary like { { "enemy1", 0 }, { "enemy2", 1 } }.
Then I want to share the player's save between the few machines he can play on (like a PC and a laptop) via the network (serialized to a JSON file first). To decrease size and increase performance, I use hashes instead of the full strings, with this algorithm (because MSDN says the default hash algorithm can differ between machines):
int hash_ = 0;

public override int GetHashCode()
{
    if (hash_ == 0)
    {
        hash_ = 5381;
        foreach (var ch in id_)
            hash_ = ((hash_ << 5) + hash_) ^ ch;
    }
    return hash_;
}
So, the question is: will this algorithm in C# return the same results on any machine the player uses?
UPD: in the comments I was told that the main part of the question was unclear.
So: if I can guarantee that all files will be in the same encoding before deserialization, will the char representation be the same on every machine the player can use, and will the operation ^ ch give the same result? I mean WinX64/WinX32/Mac/Linux/... machines.
Yes, that code will give the same result on every platform, for the same input. A char is a UTF-16 code unit, regardless of platform, and any given char will convert to the same int value on every platform. As normal with hash codes computed like this, you shouldn't assume that equal hash codes implies equal original values. (It's unclear how you're intending to use the hash, to be honest.)
I would point out that your code isn't thread-safe though - if two threads call GetHashCode at basically the same time, one may see a value of 0 (and therefore start hashing) whereas the second may see an interim result (as computed by the first thread) and assume that's the final hash. If you really believe caching is important here (and I'd test that first) you should compute the complete hash using a local variable, then copy it to the field only when you're done.
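A sketch of that fix applied to the code above (same hypothetical id_ field; the hash is built in a local and only the finished value is published):
int hash_ = 0;

public override int GetHashCode()
{
    int hash = hash_;       // read the cached field once
    if (hash == 0)
    {
        hash = 5381;
        foreach (var ch in id_)
            hash = ((hash << 5) + hash) ^ ch;
        hash_ = hash;       // publish only the completed hash
    }
    // Quirk: if the computed hash happens to be 0 it is simply
    // recomputed on each call; the result is still correct.
    return hash;
}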

Understanding Hash Codes in .NET

What I've gathered up till now is that hash codes are integers that help finding data from an array faster. Look at this code:
string x = "Run the program to find this string's hash code!";
int hashCode = x.GetHashCode();
Random random = new Random(hashCode);
for(int i = 0; i<100; i++)
{
// Always generates the same set of random integers 60, 23, 67, 80, 89, 44, 44 and so on...
int randomNumber = random.Next(0, 100);
Console.WriteLine("Hash Code is: {0}", hashCode);
Console.WriteLine("The random number it generates is: {0}", randomNumber);
Console.ReadKey();
As you can see, I used the hash code of string x as the seed for the random number generator. This code gives me 100 random integers, but every time I run the program, it gives me the SAME set of random numbers! My questions are: why does it give me a different random number on each iteration of the loop, and why does the hash code for x stay the same from run to run when the string isn't changed? What are hash codes exactly, and how are they generated (if necessary)?
It's vitally important for the hash code to remain the same for a given object throughout the lifetime of that program's execution. The hash code of a given object should not be relied on to remain the same across multiple executions of the program, which is what you're doing. Many implementations will happen to remain the same in different program invocations, but the .NET string implementation does not.
What I've gathered up till now is that hash codes are integers that help finding data from an array faster
No, they help find data in a hash based collection faster. An array is just a sequence of items; there is no reliance on, or benefit from using, hash codes in a normal array.
What are Hash Codes exactly
It is a 32-bit integer that is used to insert and identify an object in a hash-based collection like a Hashtable or Dictionary
and how are they generated (if necessary)?
There is not one algorithm that all objects use to generate a hash code. The only restrictions are that 1) two "equal" objects must generate the same hash code, and 2) an object's hash code must not change over the life of that object. There is no restriction that two "equal" objects in different programs return the same hash code.
The default implementation uses the location of the object in memory. Classes such as string that define "equality" as something other than "a reference to the same object in memory" override this default behavior to honor rule 1 above.
If you want a hash code that can be persisted and is guaranteed to be the same each time you ask for it, then use a standard hashing algorithm like SHA1 or MD5.
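A minimal sketch of that approach, shown here with SHA-256 (SHA1 or MD5 as named above would work the same way); the assumption is that truncating the digest to 4 bytes is acceptable:
using System;
using System.Security.Cryptography;
using System.Text;

static class StableHash
{
    // Same input always yields the same value, on any machine or run.
    public static int Compute(string s)
    {
        using var sha = SHA256.Create();
        byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(s));
        return BitConverter.ToInt32(digest, 0); // keep the first 4 bytes
    }
}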

Quickly creating 32 bit hash code uniquely identifying a struct composed of (mostly) primitive values

EDIT: 64 or 128 bit would also work. My brain just jumped to 32bit for some reason, thinking it would be sufficient.
I have a struct that is composed of mostly numeric values (int, decimal), and 3 strings that are never more than 12 alpha-characters each. I'm trying to create an integer value that will work as a hash code, and trying to create it quickly. Some of the numeric values are also nullable.
It seems like BitVector32 or BitArray would be useful entities for this endeavor, but I'm just not sure how to bend them to my will for this task. My struct contains 3 strings, 12 decimals (7 of which are nullable), and 4 ints.
To simplify my use case, lets say you have the following struct:
public struct Foo
{
    public decimal MyDecimal;
    public int? MyInt;
    public string Text;
}
I know I can get numeric identifiers for each value. MyDecimal and MyInt are of course unique, from a numerical standpoint. And the string has a GetHashCode() function which will return a usually-unique value.
So, with a numeric identifier for each, is it possible to generate a hash code that uniquely identifies this structure? e.g. I can compare two different Foos containing the same values and get the same hash code, every time (regardless of app domain, restarting the app, time of day, alignment of Jupiter's moons, etc.).
The hash would be sparse, so I don't anticipate collisions from my use cases.
Any ideas? On my first run at it I converted everything to a string representation, concatenated it, and used the built-in GetHashCode(), but that seems terribly... inefficient.
EDIT: A bit more background information. The structure's data is being delivered to a web client, and the client does a lot of computation of the included values, string construction, etc. to re-render the page. The aforementioned 19-field structure represents a single unit of information, and each page could have many units. I'd like to do some client-side caching of the rendered result, so I can quickly re-render a unit without recomputing on the client side if I see the same hash identifier from the server. JavaScript numeric values are all 64 bit, so I suppose my 32-bit constraint is artificial and limiting. 64 bit would work, or I suppose even 128 bit if I can break it into two 64-bit values on the server.
Well, even with a sparse table one had better be prepared for collisions, depending on what "sparse" means.
You would need to be able to make very specific assumptions about the data you will be hashing at the same time to beat the birthday-problem collision odds with 32 bits.
Go with SHA256. Your hashes will not depend on the CLR version and you will have no collisions. Well, you will still have some, but less frequently than meteorite impacts, so you can afford not to anticipate any.
Two things I suggest you take a look at here and here. I don't think you'll be able to GUARANTEE no collisions with just 32 bits.
Hash codes, by the definition of a hash function, are not meant to be unique. They are only meant to be as evenly distributed across all result values as possible. Getting a hash code for an object is meant to be a quick way to check whether two objects are different. If the hash codes of two objects are different, then those objects are different. But if the hash codes are the same, you have to deeply compare the objects to be sure. Hash codes' main usage is in hash-based collections, where they make nearly O(1) retrieval speed possible.
So in this light, your GetHashCode does not have to be complex, and in fact it shouldn't be. It must be balanced between being very quick and producing evenly distributed values. If it takes too long to get a hash code, it becomes pointless, because the advantage over a deep compare is gone. If, at the other extreme, the hash code were always 1 for example (lightning fast), it would lead to a deep compare in every case, which makes that hash code pointless too.
So get the balance right and don't try to come up with a perfect hash code. Call GetHashCode on all (or most) of your members and combine the results using the xor operator, maybe with a bitwise shift operator << or >>. Framework types have quite optimized GetHashCode implementations, although they are not guaranteed to be the same in each application run. There is no guarantee, but they also do not have to change, and a lot of them don't. Use a decompiler such as .NET Reflector to make sure, or create your own versions based on the reflected code.
In your particular case, deciding whether you have already processed a structure by looking only at its hash code is a bit risky. The better the hash, the smaller the risk, but still. The ultimate and only truly unique hash code is... the data itself. When working with hash codes, you must also override Object.Equals for your code to be truly reliable.
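For the simplified Foo above, that advice might look like this sketch (quick and reasonably distributed, but, as noted, not stable across application runs because of the string member):
public override int GetHashCode()
{
    int h = MyDecimal.GetHashCode();
    h = (h << 5) ^ (MyInt?.GetHashCode() ?? 0); // nullable: contribute 0 when empty
    h = (h << 5) ^ (Text?.GetHashCode() ?? 0);
    return h;
}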
I believe the usual method in .NET is to call GetHashCode on each member of the structure and xor the results.
However, I don't think GetHashCode claims to produce the same hash for the same value in different app domains.
Could you give a bit more information in your question about why you want this hash value and why it needs to be stable over time, different app domains etc.
What goal are you after? If it is performance then you should use a class since a struct will be copied by value whenever you pass it as a function parameter.
3 strings, 12 decimals (7 of which are nullable), and 4 ints.
On a 64-bit machine a pointer will be 8 bytes in size, a decimal takes 16 bytes, and an int 4 bytes. Ignoring padding, your struct will use 232 bytes per instance. This is much bigger than the recommended maximum of 16 bytes, which makes sense perf-wise (classes take up at least 16 bytes due to their object header, ...).
If you need a fingerprint of the value you can use a cryptographic-grade hash algorithm like SHA256, which will produce a 32-byte fingerprint. This is still not unique, but at least unique enough. But it will cost quite some performance as well.
Edit1:
After you made clear that you need the hash code to identify the object in a JavaScript web-client cache, I am confused. Why does the server send the same data again? Would it not be simpler to make the server smart enough to send only data the client has not yet received?
A SHA hash algo could be ok in your case to create some object instance tag.
Why do you need a hash code at all? If your goal is to store the values in a memory-efficient manner, you can create a FooList which uses dictionaries to store identical values only once and uses an int as the lookup key.
using System;
using System.Collections.Generic;

namespace MemoryEfficientFoo
{
    class Foo // This is our data structure
    {
        public int A;
        public string B;
        public Decimal C;
    }

    /// <summary>
    /// List which stores Foos with much less memory if many values are equal. You can cut memory consumption by a factor of 3; but if all values
    /// are different, you consume 5 times as much memory as if you stored them in a plain list! So beware that this trick
    /// might not help in your case. It saves memory only if many values are repeated.
    /// </summary>
    class FooList : IEnumerable<Foo>
    {
        Dictionary<int, string> Index2B = new Dictionary<int, string>();
        Dictionary<string, int> B2Index = new Dictionary<string, int>();
        Dictionary<int, Decimal> Index2C = new Dictionary<int, decimal>();
        Dictionary<Decimal, int> C2Index = new Dictionary<decimal, int>();

        struct FooIndex
        {
            public int A;
            public int BIndex;
            public int CIndex;
        }

        // List of foos which contains only the index values into the dictionaries
        // so the actual data can be looked up later.
        List<FooIndex> FooValues = new List<FooIndex>();

        public void Add(Foo foo)
        {
            int bIndex;
            if (!B2Index.TryGetValue(foo.B, out bIndex))
            {
                bIndex = B2Index.Count;
                B2Index[foo.B] = bIndex;
                Index2B[bIndex] = foo.B;
            }

            int cIndex;
            if (!C2Index.TryGetValue(foo.C, out cIndex))
            {
                cIndex = C2Index.Count;
                C2Index[foo.C] = cIndex;
                Index2C[cIndex] = foo.C; // store the value, not the index
            }

            FooIndex idx = new FooIndex
            {
                A = foo.A,
                BIndex = bIndex,
                CIndex = cIndex
            };
            FooValues.Add(idx);
        }

        public Foo GetAt(int pos)
        {
            var idx = FooValues[pos];
            return new Foo
            {
                A = idx.A,
                B = Index2B[idx.BIndex],
                C = Index2C[idx.CIndex]
            };
        }

        public IEnumerator<Foo> GetEnumerator()
        {
            for (int i = 0; i < FooValues.Count; i++)
            {
                yield return GetAt(i);
            }
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            FooList list = new FooList();
            List<Foo> fooList = new List<Foo>();

            long before = GC.GetTotalMemory(true);
            for (int i = 0; i < 1000 * 1000; i++)
            {
                list        // swap in fooList here to measure the plain list instead
                //fooList
                    .Add(new Foo
                    {
                        A = i,
                        B = "Hi",
                        C = i
                    });
            }
            long after = GC.GetTotalMemory(true);
            Console.WriteLine("Did consume {0:N0} bytes", after - before);
        }
    }
}
A similar memory conserving list can be found here

How generate unique Integers based on GUIDs

Is it possible to generate a (highly probably unique) integer from a GUID?
int i = Guid.NewGuid().GetHashCode();
int j = BitConverter.ToInt32(Guid.NewGuid().ToByteArray(), 0);
Which one is better?
Eric Lippert did a very interesting (as always) post about the probability of hash collisions.
You should read it all, but he concludes with a very illustrative graphic of how the collision probability climbs with the number of hashed values.
Related to your specific question, I would also go with GetHashCode since collisions will be unavoidable either way.
The GetHashCode function is specifically designed to create a well-distributed range of integers with a low probability of collision, so for this use case it is likely the best you can do.
But, as I'm sure you're aware, hashing 128 bits of information into 32 bits of information throws away a lot of data, so there will almost certainly be collisions if you have a sufficiently large number of GUIDs.
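A throwaway sketch showing how quickly those collisions turn up in practice; by the birthday bound you would expect one after roughly sqrt(2^32) ≈ 65,536 GUIDs:
var seen = new Dictionary<int, Guid>();
int count = 0;
while (true)
{
    Guid g = Guid.NewGuid();
    int h = g.GetHashCode();
    count++;
    if (seen.TryGetValue(h, out Guid other))
    {
        Console.WriteLine($"{other} and {g} share hash code {h} after {count} GUIDs");
        break;
    }
    seen[h] = g;
}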
A GUID is a 128-bit integer (it's just written in hex rather than base 10). With .NET 4 use http://msdn.microsoft.com/en-us/library/dd268285%28v=VS.100%29.aspx like so:
// Turn the GUID into a string and strip out the '-' characters first
BigInteger huge = BigInteger.Parse(modifiedGuidString, NumberStyles.AllowHexSpecifier);
If you don't have .NET 4 you can look at IntX or Solver Foundation.
Here is the simplest way:
Guid guid = Guid.NewGuid();
Random random = new Random();
int i = random.Next();
You'll notice that guid is not actually used here, mainly because there would be no point in using it. Microsoft's GUID algorithm does not use the computer's MAC address any more - GUIDs are actually generated using a pseudo-random generator (based on time values), so if you want a random integer it makes more sense to use the Random class for this.
Update: actually, using a GUID to generate an int would probably be worse than just using Random ("worse" in the sense that this would be more likely to generate collisions). This is because not all 128 bits in a GUID are random. Ideally, you would want to exclude the non-varying bits from a hashing function, although it would be a lot easier to just generate a random number, as I think I mentioned before. :)
If you are looking to break through the 2^32 barrier then try this method:
/// <summary>
/// Generate a BigInteger given a Guid. Returns a number from 0 to 2^128 - 1,
/// i.e. 0 to 340,282,366,920,938,463,463,374,607,431,768,211,455.
/// </summary>
public BigInteger GuidToBigInteger(Guid guid)
{
    BigInteger l_retval = 0;
    byte[] ba = guid.ToByteArray();
    int i = ba.Length;
    foreach (byte b in ba)
    {
        // treat the byte array as one base-256 number, most significant byte first
        l_retval += b * BigInteger.Pow(256, --i);
    }
    return l_retval;
}
The universe will decay to a cold and dark expanse before you experience a collision.
I had a requirement where multiple instances of a console application needed to get a unique integer ID. It is used to identify the instance and is assigned at startup. Because the .exe is started by hand, I settled on a solution using the ticks of the start time.
My reasoning was that it would be nearly impossible for the user to start two .exes in the same millisecond. This behavior is deterministic: if you get a collision, you know the problem was that two instances were started at the same time. Methods depending on hash codes, GUIDs or random numbers might fail in unpredictable ways.
I set the date to 0001-01-01, add the current time and divide the ticks by 10000 (because I don't set the microseconds) to get a number that is small enough to fit into an integer.
var now = DateTime.Now;
var zeroDate = DateTime.MinValue
    .AddHours(now.Hour)
    .AddMinutes(now.Minute)
    .AddSeconds(now.Second)
    .AddMilliseconds(now.Millisecond);
int uniqueId = (int)(zeroDate.Ticks / 10000); // 10,000 ticks per millisecond
EDIT: There are some caveats. To make collisions unlikely, make sure that:
The instances are started manually (more than one millisecond apart)
The ID is generated once per instance, at startup
The ID must only be unique in regard to other instances that are currently running
Only a small number of IDs will ever be needed
Because the GUID space is larger than the number of 32-bit integers, you're guaranteed to have collisions if you have enough GUIDs. Given that you understand that and are prepared to deal with collisions, however rare, GetHashCode() is designed for exactly this purpose and should be preferred.
Maybe not integers, but small unique keys, in any case shorter than GUIDs:
http://www.codeproject.com/Articles/14403/Generating-Unique-Keys-in-Net
In a static class, keep a static integer, then add 1 to it on every access (using a public get property). This will ensure you cycle through the whole int range before you get a non-unique value.
/// <summary>
/// The command id to use. This is a thread-safe id that is unique over the lifetime of the process. It changes
/// at each access.
/// </summary>
internal static int NextCommandId
{
    get
    {
        // Interlocked.Increment (System.Threading) makes the increment atomic,
        // which a plain _nextCommandId++ would not be.
        return Interlocked.Increment(ref _nextCommandId);
    }
}
private static int _nextCommandId = 0;
This will produce a unique integer value within a running process. Since you do not explicitly define how unique your integer should be, this will probably fit.
Here is the simplest solution: just call GetHashCode() on the Guid. Note that a GUID is a 128-bit int while an int is 32 bits, so it's not guaranteed to be unique, but it's probably statistically good enough for most implementations.
public override bool Equals(object obj)
{
    if (obj is IBase)
        return ((IBase)obj).Id == this.Id;
    return base.Equals(obj);
}

public override int GetHashCode()
{
    if (this.Id == Guid.Empty)
        return base.GetHashCode();
    return this.Id.GetHashCode();
}
