What does Collection.Contains() use to check for existing objects? - c#

I have a strongly typed list of custom objects, MyObject, which has a property Id, along with some other properties.
Let's say that the Id of a MyObject defines it as unique and I want to check if my collection doesn't already have a MyObject object that has an Id of 1 before I add my new MyObject to the collection.
I want to use if(!List<MyObject>.Contains(myObj)), but how do I enforce the fact that only one or two properties of MyObject define it as unique?
Can I use IComparable? Or do I only have to override an Equals method? If so, I'd need to inherit something first, is that right?

List<T>.Contains uses EqualityComparer<T>.Default, which in turn uses IEquatable<T> if the type implements it, or object.Equals otherwise.
You could just implement IEquatable<T> but it's a good idea to override object.Equals if you do so, and a very good idea to override GetHashCode() if you do that:
public class SomeIDdClass : IEquatable<SomeIDdClass>
{
private readonly int _id;
public SomeIDdClass(int id)
{
_id = id;
}
public int Id
{
get { return _id; }
}
public bool Equals(SomeIDdClass other)
{
return null != other && _id == other._id;
}
public override bool Equals(object obj)
{
return Equals(obj as SomeIDdClass);
}
public override int GetHashCode()
{
return _id;
}
}
Note that the hash code relates to the criteria for equality. This is vital.
This also makes it applicable for any other case where equality, as defined by having the same ID, is useful. If you have a one-off requirement to check if a list has such an object, then I'd probably suggest just doing:
return someList.Any(item => item.Id == cmpItem.Id);
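To see the lookup rule in action, here is a minimal sketch. The names Item, ContainsDemo and AddIfMissing are made up for illustration and are not part of the question; the point is only that List<T>.Contains ends up calling the IEquatable<T> implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration: equality is defined by Id alone.
public sealed class Item : IEquatable<Item>
{
    public int Id { get; }
    public string Name { get; }

    public Item(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public bool Equals(Item other) => other != null && Id == other.Id;

    public override bool Equals(object obj) => Equals(obj as Item);

    // The hash code is derived from the same field that Equals compares.
    public override int GetHashCode() => Id;
}

public static class ContainsDemo
{
    // Adds the item only if no item with the same Id is already present.
    public static bool AddIfMissing(List<Item> list, Item item)
    {
        // Contains uses EqualityComparer<Item>.Default, which calls Item.Equals(Item).
        if (list.Contains(item))
            return false;
        list.Add(item);
        return true;
    }
}
```

With this in place, adding new Item(1, "b") to a list that already holds new Item(1, "a") is rejected, because Name plays no part in equality.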

List<T> uses the comparer returned by EqualityComparer<T>.Default and according to the documentation for that:
The Default property checks whether type T implements the System.IEquatable(Of T) interface and, if so, returns an EqualityComparer(Of T) that uses that implementation. Otherwise, it returns an EqualityComparer(Of T) that uses the overrides of Object.Equals and Object.GetHashCode provided by T.
So you can either implement IEquatable<T> on your custom class, or override the Equals (and GetHashCode) methods to do the comparison by the properties you require. Alternatively, you could use LINQ:
bool contains = list.Any(i => i.Id == obj.Id);

You can use LINQ to do this pretty easily.
var result = MyCollection.Any(p=>p.myId == Id);
if(result)
{
//something
}

You can override Equals and GetHashCode, implement an IEqualityComparer<MyObject> and use that in the Contains call, or use an extension method like Any:
if (!myList.Any(obj => obj.Property == obj2.Property && obj.Property2 == obj2.Property2))
myList.Add(obj2);
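The comparer option mentioned above could look roughly like the following sketch. MyObject here is a stand-in with an assumed Name property (only Id comes from the question), and MyObjectIdComparer and ComparerContainsDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's MyObject; only Id is taken from the question.
public class MyObject
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two MyObject instances as equal when their Ids match.
public sealed class MyObjectIdComparer : IEqualityComparer<MyObject>
{
    public bool Equals(MyObject x, MyObject y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(MyObject obj) => obj == null ? 0 : obj.Id;
}

public static class ComparerContainsDemo
{
    public static bool AddIfMissing(List<MyObject> list, MyObject candidate)
    {
        // List<T>.Contains takes only the item; the two-argument call binds to
        // the LINQ extension Enumerable.Contains(source, value, comparer).
        if (list.Contains(candidate, new MyObjectIdComparer()))
            return false;
        list.Add(candidate);
        return true;
    }
}
```

This leaves MyObject itself untouched, which may matter if other code relies on its default reference equality.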

First, define a helper class that implements IEqualityComparer<T>.
public class MyEqualityComparer<T> : IEqualityComparer<T>
{
Func<T, int> _hashDelegate;
public MyEqualityComparer(Func<T, int> hashDelegate)
{
_hashDelegate = hashDelegate;
}
public bool Equals(T x, T y)
{
return _hashDelegate(x) == _hashDelegate(y);
}
public int GetHashCode(T obj)
{
return _hashDelegate(obj);
}
}
Then, in your code, define the comparer and use it:
var myComparer = new MyEqualityComparer<MyObject>(delegate(MyObject obj){
return obj.ID;
});
var result = collection
.Where(f => anotherCollection.Contains(f.First, myComparer))
.ToArray();
This way you can define how equality is computed without modifying your classes. You can also use it to process objects from third-party libraries, whose code you cannot modify.
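Note that the helper above treats two objects as equal whenever the delegate returns the same int, so it is only safe when that int identifies the object uniquely (as an ID does). A possible variant, sketched here under the hypothetical name KeyEqualityComparer<T, TKey>, compares by a key of any type and lets the key's own comparer decide equality:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant: compares items by a key of any type.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer;

    public KeyEqualityComparer(Func<T, TKey> keySelector, IEqualityComparer<TKey> keyComparer = null)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
        _keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;
    }

    // Two items are equal when their keys are equal.
    // The selector is called on both items, so null items are not handled here.
    public bool Equals(T x, T y)
    {
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    // Equal keys produce equal hash codes, which keeps Equals and GetHashCode consistent.
    public int GetHashCode(T obj)
    {
        TKey key = _keySelector(obj);
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}
```

For the example above, this would be constructed as new KeyEqualityComparer<MyObject, int>(obj => obj.ID).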

You can use IEquatable<T>. Implement this in your class, and then check to see if the T passed to the Equals has the same Id as this.Id. I'm sure this works for checking a key in a dictionary, but I've not used it for a collection.

Related

How to compare two IEnumerable<T> in C# if I don't know the actual object type?

I'm struggling with implementing the IEquatable<> interface for a class. The class has a Parameter property that uses a generic type. Basically the class definition is like this:
public class MyClass<T> : IEquatable<MyClass<T>>
{
public T Parameter { get; }
...
}
In the Equals() method I'm using EqualityComparer<T>.Default.Equals(Parameter, other.Parameter) to compare the property. Generally, this works fine – as long as the property is not a collection, for example an IEnumerable<T>. The problem is that the default equality comparer for IEnumerable<T> is checking reference equality.
Obviously, you'd want to use SequenceEqual() to compare the IEnumerable<T>. But to get this running, you need to specify the generic type of the SequenceEqual() method. This is the closest I could get:
var parameterType = typeof(T);
var enumerableType = parameterType.GetInterfaces()
.Where(type => type.IsGenericType && type.GetGenericTypeDefinition() == typeof(IEnumerable<>))
.Select(type => type.GetGenericArguments().First()).FirstOrDefault();
if (enumerableType != null)
{
var castedThis = Convert.ChangeType(Parameter, enumerableType);
var castedOther = Convert.ChangeType(other.Parameter, enumerableType);
var isEqual = castedThis.SequenceEqual(castedOther);
}
But this does not work because Convert.ChangeType() returns an object. And of course object does not implement SequenceEqual().
How do I get this working? Thanks for any tips!
Best regards,
Oliver
Given that you have a generic container that you want to use to compare various generic items, you don't want to hard-code specific equality checks for certain types. There are going to be lots of situations where the default equality comparison won't work for what some particular caller is trying to do. The comments have numerous examples of problems that can come up, but also just consider the many, many classes out there whose default equality is a reference comparison but for which someone might want a value comparison. You can't have this equality comparer hard-code a solution for all of those types.
The solution of course is easy. Let the caller provide their own equality implementation, which in C#, means an IEqualityComparer<T>. Your class can become:
public class MyClass<T> : IEquatable<MyClass<T>>
{
private IEqualityComparer<T> comparer;
public MyClass(IEqualityComparer<T> innerComparer = null)
{
comparer = innerComparer ?? EqualityComparer<T>.Default;
}
public T Parameter { get; }
...
}
And now by default the default comparer will be used for any given type, but the caller can always specify a non-default comparer for any type that needs different equality semantics.
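The snippet above elides the Equals method, so here is one way the pieces might fit together. This is a sketch, not the answerer's code: the extra constructor parameter for the value and the SequenceComparer<TItem> class are assumptions added to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass<T> : IEquatable<MyClass<T>>
{
    private readonly IEqualityComparer<T> comparer;

    // The parameter value is passed in here only to keep the sketch self-contained.
    public MyClass(T parameter, IEqualityComparer<T> innerComparer = null)
    {
        Parameter = parameter;
        comparer = innerComparer ?? EqualityComparer<T>.Default;
    }

    public T Parameter { get; }

    // Delegates the comparison of Parameter to the injected comparer.
    public bool Equals(MyClass<T> other)
    {
        return other != null && comparer.Equals(Parameter, other.Parameter);
    }

    public override bool Equals(object obj) => Equals(obj as MyClass<T>);

    public override int GetHashCode() => Parameter == null ? 0 : comparer.GetHashCode(Parameter);
}

// Hypothetical comparer a caller could supply for sequence-valued parameters.
public sealed class SequenceComparer<TItem> : IEqualityComparer<IEnumerable<TItem>>
{
    public bool Equals(IEnumerable<TItem> x, IEnumerable<TItem> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    // Combines the element hash codes in order, so equal sequences hash alike.
    public int GetHashCode(IEnumerable<TItem> obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            int hash = 17;
            foreach (TItem item in obj)
                hash = hash * 31 + (item == null ? 0 : item.GetHashCode());
            return hash;
        }
    }
}
```

One caveat of this design: each instance uses its own comparer, so two instances built with different comparers may disagree about whether they are equal.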
Effectively you want a way to say
var castedThis = (IEnumerable<U>)Convert.ChangeType(Parameter, enumerableType);
where T is IEnumerable<U> and U is dynamic.
I don't think you can do that.
If you are happy with some boxing though, you can use the non-generic IEnumerable interface:
public bool Equals(MyClass<T> other)
{
var parameterType = typeof(T);
if (typeof(IEnumerable).IsAssignableFrom(parameterType))
{
var castedThis = ((IEnumerable)this.Parameter).GetEnumerator();
var castedOther = ((IEnumerable)other.Parameter).GetEnumerator();
try
{
while (castedThis.MoveNext())
{
if (!castedOther.MoveNext())
return false;
if (!object.Equals(castedThis.Current, castedOther.Current))
return false;
}
return !castedOther.MoveNext();
}
finally
{
(castedThis as IDisposable)?.Dispose();
(castedOther as IDisposable)?.Dispose();
}
}
else
{
return EqualityComparer<T>.Default.Equals(this.Parameter, other.Parameter);
}
}
If you are not happy with the boxing, then you can use reflection to construct and call SequenceEqual (as inspired by How do I invoke an extension method using reflection?):
public bool Equals(MyClass<T> other)
{
var parameterType = typeof(T);
if (typeof(IEnumerable).IsAssignableFrom(parameterType))
{
var enumerableType = parameterType.GetGenericArguments().First();
var sequenceEqualMethod = typeof(Enumerable)
.GetMethods(BindingFlags.Static | BindingFlags.Public)
.Where(mi => {
if (mi.Name != "SequenceEqual")
return false;
if (mi.GetGenericArguments().Length != 1)
return false;
var pars = mi.GetParameters();
if (pars.Length != 2)
return false;
return pars[0].ParameterType.IsGenericType && pars[0].ParameterType.GetGenericTypeDefinition() == typeof(IEnumerable<>) && pars[1].ParameterType.IsGenericType && pars[1].ParameterType.GetGenericTypeDefinition() == typeof(IEnumerable<>);
})
.First()
.MakeGenericMethod(enumerableType)
;
return (bool)sequenceEqualMethod.Invoke(null, new object[] { this.Parameter, other.Parameter }); // static method, so no target instance
}
else
{
return EqualityComparer<T>.Default.Equals(this.Parameter, other.Parameter);
}
}
You can cache the sequenceEqualMethod for better performance.
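A rough sketch of such a cache follows. SequenceEqualCache and its Invoke method are hypothetical names, and the lookup assumes Enumerable exposes exactly one public SequenceEqual overload with two parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper: resolves Enumerable.SequenceEqual<TItem> once per element type.
public static class SequenceEqualCache
{
    // The open generic two-parameter overload, looked up a single time.
    private static readonly MethodInfo OpenMethod = typeof(Enumerable)
        .GetMethods(BindingFlags.Static | BindingFlags.Public)
        .First(mi => mi.Name == "SequenceEqual" && mi.GetParameters().Length == 2);

    // Closed generic methods, keyed by element type.
    private static readonly Dictionary<Type, MethodInfo> Closed = new Dictionary<Type, MethodInfo>();

    public static bool Invoke(Type elementType, object first, object second)
    {
        MethodInfo method;
        lock (Closed)
        {
            if (!Closed.TryGetValue(elementType, out method))
            {
                method = OpenMethod.MakeGenericMethod(elementType);
                Closed[elementType] = method;
            }
        }
        // SequenceEqual is static, so the target object is null.
        return (bool)method.Invoke(null, new[] { first, second });
    }
}
```

The reflection lookup then happens once, although each call still pays for MethodInfo.Invoke.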

How to implement unit tests on IEqualityComparer?

I have a class and a comparer for this class that implements IEqualityComparer:
class Foo
{
public int Int { get; set; }
public string Str { get; set; }
public Foo(int i, string s)
{
Int = i;
Str = s;
}
private sealed class FooEqualityComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if (ReferenceEquals(x, y)) return true;
if (ReferenceEquals(x, null)) return false;
if (ReferenceEquals(y, null)) return false;
if (x.GetType() != y.GetType()) return false;
return x.Int == y.Int && string.Equals(x.Str, y.Str);
}
public int GetHashCode(Foo obj)
{
unchecked
{
return (obj.Int * 397) ^ (obj.Str != null ? obj.Str.GetHashCode() : 0);
}
}
}
public static IEqualityComparer<Foo> Comparer { get; } = new FooEqualityComparer();
}
The two methods Equals and GetHashCode are used, for example, in Enumerable.Except via an instance of the comparer.
My question is: how do I properly implement unit tests on this comparer? I want to detect if someone adds a public property to Foo without modifying the comparer, because in this case the comparer becomes invalid.
If I do something like:
Assert.That(new Foo(42, "answer"), Is.EqualTo(new Foo(42, "answer")));
This cannot detect that a new property was added, and that this property differs in the two objects.
Is there any way to do this?
If it is possible, can we add an attribute to a property to say that this property is not relevant in the comparison?
You can use reflection to get the properties of the type, e.g.:
var knownPropNames = new string[]
{
"Int",
"Str",
};
var props = typeof(Foo).GetProperties(BindingFlags.Public | BindingFlags.Instance);
var unknownProps = props
.Where(x => !knownPropNames.Contains(x.Name))
.Select(x => x.Name)
.ToArray();
// Use assertion instead of Console.WriteLine
Console.WriteLine("Unknown props: {0}", string.Join("; ", unknownProps));
This way, you can implement a test that fails if any properties are added. Of course, you'd have to add new properties to the array at the beginning. As using reflection is an expensive operation from a performance point of view, I'd propose to use it in the test, not in the comparer itself if you need to compare lots of objects.
Please also note the use of the BindingFlags parameter so you can restrict the properties to only the public ones and the ones on instance-level.
Also, you can define a custom attribute that you use to mark properties that are not relevant. For example:
[AttributeUsage(AttributeTargets.Property)]
public class ComparerIgnoreAttribute : Attribute {}
You can apply it to a property:
[ComparerIgnore]
public decimal Dec { get; set; }
In addition, you'd have to extend the code that discovers unknown properties:
var unknownProps = props
.Where(x => !knownPropNames.Contains(x.Name)
&& !x.GetCustomAttributes(typeof(ComparerIgnoreAttribute)).Any())
.Select(x => x.Name)
.ToArray();
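Pulled together, the check might be packaged as a small helper that a test can call. This is a sketch: Foo is a cut-down stand-in for the question's class, and ComparerCoverageCheck and FindUnknownProperties are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public class ComparerIgnoreAttribute : Attribute { }

// Cut-down stand-in for the question's Foo, with one ignored property.
public class Foo
{
    public int Int { get; set; }
    public string Str { get; set; }

    [ComparerIgnore]
    public decimal Dec { get; set; }
}

public static class ComparerCoverageCheck
{
    // Returns the names of public instance properties that are neither
    // listed as known nor marked with [ComparerIgnore].
    public static string[] FindUnknownProperties(Type type, params string[] knownPropNames)
    {
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => !knownPropNames.Contains(p.Name)
                && !p.GetCustomAttributes(typeof(ComparerIgnoreAttribute), true).Any())
            .Select(p => p.Name)
            .ToArray();
    }
}
```

A test would then assert that FindUnknownProperties(typeof(Foo), "Int", "Str") is empty; in NUnit that would presumably be Assert.That(..., Is.Empty).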
Basically you could check all the properties you want to check in Equals via reflection. To filter some of them out use an attribute on those properties:
class Foo
{
[MyAttribute]
public string IgnoredProperty { get; set; }
public string MyProperty { get; set; }
}
Now in your comparer check for that specific attribute. Afterwards compare every property that is contained in the remaining list via PropertyInfo.GetValue
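class MyComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
// Compare only the properties of Foo that are not marked with MyAttribute.
var properties = typeof(Foo).GetProperties()
.Where(p => !Attribute.IsDefined(p, typeof(MyAttribute)));
var equal = true;
foreach(var p in properties)
// GetValue returns object, so use object.Equals; == would compare references.
equal &= object.Equals(p.GetValue(x, null), p.GetValue(y, null));
return equal;
}
// GetHashCode(Foo obj) is omitted here; IEqualityComparer<Foo> requires it as well.
}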
However, you should have some good pre-checks within GetHashCode to avoid unnecessary calls to this slow method.
EDIT: As you've mentioned ReSharper: I assume that, because you supply the properties to be validated at runtime, even R# doesn't know a good way to implement GetHashCode. You will need some properties that are always available on your type and that give a good enough idea of what might be considered equal. All the additional properties, however, should only go into the expensive Equals method.
EDIT2: As mentioned in the comments, doing reflection within Equals or even GetHashCode is a bad idea, as it's usually quite slow and can often be avoided. If you know the properties to be checked for equality at compile time, you should definitely include them within those two methods, as doing so gives you much more safety. If you find yourself really needing this because you have too many properties, you probably have a more basic problem: your class is doing too much.
I guess you can check properties count inside the comparer. Something like this:
private sealed class FooEqualityComparer : IEqualityComparer<Foo>
{
private List<bool> comparisonResults = new List<bool>();
private List<Func<Foo, Foo, bool>> conditions = new List<Func<Foo, Foo, bool>>{
(x, y) => x.Int == y.Int,
(x, y) => string.Equals(x.Str, y.Str)
};
private int propertiesCount = typeof(Foo)
.GetProperties(BindingFlags.Public | BindingFlags.Instance)
//.Where(someLogicToExclude(e.g. attribute))
.Count();
public bool Equals(Foo x, Foo y)
{
if (ReferenceEquals(x, y)) return true;
if (ReferenceEquals(x, null)) return false;
if (ReferenceEquals(y, null)) return false;
if (x.GetType() != y.GetType()) return false;
//has new property which is not presented in the conditions list and not excluded
if (conditions.Count() != propertiesCount) return false;
foreach(var func in conditions)
if(!func(x, y)) return false;//returns false on first mismatch
return true;//only if all conditions are satisfied
}
public int GetHashCode(Foo obj)
{
unchecked
{
return (obj.Int * 397) ^ (obj.Str != null ? obj.Str.GetHashCode() : 0);
}
}
}

Each Property-Value in a MyObject-list must be unique

Let's say I have the following object:
public class MyObject
{
public string MyValue { get; set; }
}
And in another class I have a list of these objects:
public class MyClass
{
private List<MyObject> _list;
public MyClass(List<MyObject> myObjects)
{
_list = myObjects;
}
public bool AllUniqueValues()
{
...
}
}
I want to check if all MyObjects in the list have a unique (non-duplicated) MyValue. When I use the following, it works:
public bool AllUnique()
{
return _list.All(x => _list.Count(y => String.Equals(y.MyValue, x.MyValue)) == 1);
}
But I have the feeling this can be done easier / more elegant. So, my question, is there a better / more elegant approach to check if all MyObjects have a non-duplicated Value, and if so, how?
I find this quite elegant:
public static class EnumerableExtensions
{
public static bool AllUnique<TSource, TResult>(this IEnumerable<TSource> enumerable,
Func<TSource, TResult> selector)
{
var uniques = new HashSet<TResult>();
return enumerable.All(item => uniques.Add(selector(item)));
}
}
And now your code becomes:
var allUnique = _list.AllUnique(i => i.MyValue);
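If the notion of "duplicate" needs to be adjustable (for instance, case-insensitive strings), the same idea can take a comparer. The overload below is a hypothetical extension of the method above, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableUniquenessExtensions
{
    // Hypothetical overload of the AllUnique idea that takes an explicit key comparer.
    public static bool AllUnique<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> selector,
        IEqualityComparer<TKey> comparer)
    {
        var seen = new HashSet<TKey>(comparer);
        // HashSet<T>.Add returns false for a duplicate, so All stops at the first repeat.
        return source.All(item => seen.Add(selector(item)));
    }
}
```

For example, _list.AllUnique(i => i.MyValue, StringComparer.OrdinalIgnoreCase) would treat "abc" and "ABC" as duplicates.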
One of many ways to do it:
return !_list.GroupBy(c=>c.MyValue).Any(c=>c.Count() > 1);
At least it is a little bit more clear.
The most elegant way of solving this is using a set data structure. An unordered collection of unique elements. In .NET, you need to use HashSet<T>.
You can either override Equals and GetHashCode of MyObject to provide what equality means in your case, or implement an IEqualityComparer<T>.
If you instantiate HashSet<T> without providing an IEqualityComparer<T> implementation, it will use your overrides; otherwise it will use the comparer you supplied. Usually you implement equality comparers when there is more than one meaning of equality for the same object.
I might still need an ordered collection of elements
If you still need to store your objects in order, you can store the elements in both a HashSet<T> and a List<T> in parallel. What you get with HashSet<T> is practically O(1) access to your items when you need to check whether an item exists, get one, or perform some other supported operation on the collection: since it's a hashed collection, it doesn't need to iterate over all of its elements to find one.
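As a sketch of the parallel-collections idea, the wrapper below keeps insertion order in a List<T> and uniqueness in a HashSet<T>. UniqueOrderedCollection<T> is a hypothetical name, and removal is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: insertion order lives in a List<T>, uniqueness in a HashSet<T>.
public sealed class UniqueOrderedCollection<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly HashSet<T> _set;

    public UniqueOrderedCollection(IEqualityComparer<T> comparer = null)
    {
        _set = new HashSet<T>(comparer ?? EqualityComparer<T>.Default);
    }

    // Items in the order they were first added.
    public IReadOnlyList<T> Items => _items;

    // Hash-based lookup; does not scan the list.
    public bool Contains(T item) => _set.Contains(item);

    // Returns false, and leaves both collections unchanged, when the item is a duplicate.
    public bool Add(T item)
    {
        if (!_set.Add(item))
            return false;
        _items.Add(item);
        return true;
    }
}
```

The cost is that every element is stored twice and both collections must be kept in sync on every mutation.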
There are many ways to do it, but personally, I'd do the following:
public bool AllUnique()
{
return _list.GroupBy(x => x.MyValue).Count() == _list.Count();
}

What is the proper way to set up an always-false IEqualityComparer<T>?

I have a case where two objects can be compared many different ways for equality. For example:
public class HeightComparer : IEqualityComparer<Person> {
public bool Equals(Person x, Person y) {
return x.Height.Equals(y.Height);
}
public int GetHashCode(Person obj) {
return obj.Height;
}
}
And I use these comparers in Dictionary<Person,Person>(IEqualityComparer<Person>) for various methods. How would you make a comparer that guarantees each person is unique? I came up with the following, but it runs slowly since the GetHashCode() method often returns the same value.
public class NullPersonComparer : IEqualityComparer<Person> {
public bool Equals(Person x, Person y) {
return false; // always unequal
}
public int GetHashCode(Person obj) {
return obj.GetHashCode();
}
}
I could return the same value of 0 from GetHashCode(Person obj) but it still is slow populating the dictionary.
Edit
Here is a use case:
Dictionary<Person, Person> people = new Dictionary<Person, Person>(comparer);
foreach (string name in Names)
{
Person person= new Person(name);
Person realPerson;
if (people.TryGetValue(person, out realPerson))
{
realPerson.AddName(name);
}
else
{
people.Add(person, person);
}
}
If the type has not overridden the Equals or GetHashCode methods then their default implementations, from object, do what you want, namely provide equality based on their identity, rather than their value. You can use EqualityComparer<Person>.Default to get an IEqualityComparer that uses those semantics if you want.
If the Equals method has been overridden to provide some sort of value semantics, but you don't want that, you want identity semantics, then you can use object.ReferenceEquals in your own implementation:
public class IdentityComparer<T> : IEqualityComparer<T>
{
public bool Equals(T x, T y)
{
return object.ReferenceEquals(x, y);
}
public int GetHashCode(T obj)
{
return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(obj);
}
}
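On .NET 5 and later there is also a built-in ReferenceEqualityComparer with the same identity semantics, so a hand-written comparer may not be needed. The sketch below assumes such a runtime; Person here is a stand-in with value-based Equals, and IdentityDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Stand-in Person with value-based equality, to show the comparer overriding it.
public sealed class Person
{
    public string Name { get; }

    public Person(string name) { Name = name; }

    public override bool Equals(object obj) => obj is Person p && p.Name == Name;

    public override int GetHashCode() => Name == null ? 0 : Name.GetHashCode();
}

public static class IdentityDemo
{
    // Counts distinct instances, ignoring Person's own Equals override.
    public static int CountDistinctInstances(IEnumerable<Person> people)
    {
        // ReferenceEqualityComparer is an IEqualityComparer<object>; contravariance
        // lets it serve as an IEqualityComparer<Person>.
        var seen = new HashSet<Person>(ReferenceEqualityComparer.Instance);
        foreach (Person p in people)
            seen.Add(p);
        return seen.Count;
    }
}
```

Two Person objects with the same Name compare equal through Equals, yet count as two entries here because the comparer looks only at object identity.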

Use custom classes IEquality by default rather than specifying it as a parameter?

Is it possible to have the Dictionary<> class use the IEqualityComparer specified inside the class it's using as its key rather than specifying it as a parameter every time I construct it?
public class mytest : IEqualityComparer<mytest>
{
public string name = "foo";
bool IEqualityComparer<mytest>.Equals(mytest x, mytest y) { return x.name == y.name; }
int IEqualityComparer<mytest>.GetHashCode(mytest obj) { return obj.name.GetHashCode(); }
public override int GetHashCode() { return name.GetHashCode(); }
}
...
var a = new Dictionary<mytest, int>();
a.Add(new mytest(), 1);
a.Add(new mytest(), 2);//does not throw error...bad!
var b = new Dictionary<mytest, int>(new mytest());
b.Add(new mytest(), 1);
b.Add(new mytest(), 2);//will throw error...good!
You just need to override the Equals and GetHashCode methods of mytest, or implement IEquatable<mytest>.
http://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.80).aspx
Dictionary<TKey, TValue> requires an equality implementation to determine whether keys are equal. You can specify an implementation of the IEqualityComparer<T> generic interface by using a constructor that accepts a comparer parameter; if you do not specify an implementation, the default generic equality comparer EqualityComparer<T>.Default is used. If type TKey implements the System.IEquatable<T> generic interface, the default equality comparer uses that implementation.
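Applied to the question's key type, that could look like the following sketch. The class is renamed MyTest and DefaultComparerDemo is a hypothetical helper; neither name comes from the question.

```csharp
using System;
using System.Collections.Generic;

// Sketch of the question's key type using IEquatable<T> instead of IEqualityComparer<T>.
public class MyTest : IEquatable<MyTest>
{
    public string Name = "foo";

    public bool Equals(MyTest other) => other != null && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as MyTest);

    public override int GetHashCode() => Name == null ? 0 : Name.GetHashCode();
}

public static class DefaultComparerDemo
{
    // Returns true when the second Add is rejected as a duplicate key.
    public static bool SecondAddThrows()
    {
        var a = new Dictionary<MyTest, int>(); // no comparer argument
        a.Add(new MyTest(), 1);
        try
        {
            a.Add(new MyTest(), 2);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Because the default comparer now finds the IEquatable<MyTest> implementation, the dictionary built without a comparer argument rejects the second key.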
