Why does my dynamic IEqualityComparer not work? - c#

I have a class
public class Foo
{
public int ID { get; set; }
}
and I've implemented a LinqEqualityComparer to allow dynamic IEqualityComparer tests for the Except extension method.
public class LinqEqualityComparer<T> : IEqualityComparer<T>
{
protected Func<T, T, bool> Comparison { get; set; }
public LinqEqualityComparer(Func<T, T, bool> comparison)
{
Comparison = comparison;
}
public bool Equals(T x, T y)
{
return Comparison(x, y);
}
public int GetHashCode(T obj)
{
return obj.GetHashCode();
}
}
I've created the following code to test it:
IEnumerable<Foo> settings = new Foo[]
{
new Foo{ID = 1},
new Foo{ID = 2}
};
IEnumerable<Foo> currentSettings = new Foo[]
{
new Foo{ID = 1},
new Foo{ID = 2},
new Foo{ID = 3}
};
IEqualityComparer<Foo> comparer = new LinqEqualityComparer<Foo>((x, y) => x.ID == y.ID);
IEnumerable<Foo> missing = currentSettings.Except(settings, comparer);
However Foos 1,2 and 3 are all present in the 'missing' variable.
Why does this LinqEqualityComparer not work?

Because your equality comparer does not implement GetHashCode correctly. The GetHashCode implementation must produce the same code for elements that compare equal. This does not happen here because the equality comparison is customized while the hash codes are not generated accordingly.
To make this work you would need to do one of two things:
Make the comparer accept the hash code implementation as an additional argument, i.e. x => x.ID.GetHashCode() and forward to that. This is easiest and what you should do in practice.
Modify GetHashCode in such a way that it is an aggregate function of the hash codes of the properties that take part in the comparison (here that is the ID property) -- a straight xor of the individual hash codes would work (even though it might not be optimal).
That leaves you with the problem of how to detect which properties are compared. To be able to answer that question automatically you would need to accept an expression tree instead of a delegate for the comparison, i.e. an Expression<Func<T, T, bool>>, and then visit the expression tree to determine what to do. That's bound to not be easy going.
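A minimal sketch of the first option, assuming a hypothetical class name KeyedEqualityComparer<T> and an extra hashCode delegate (neither appears in the question's code). It reuses the test data from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int ID { get; set; }
}

// Hypothetical variant of LinqEqualityComparer that also takes a hash function.
public class KeyedEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _comparison;
    private readonly Func<T, int> _hashCode;

    public KeyedEqualityComparer(Func<T, T, bool> comparison, Func<T, int> hashCode)
    {
        _comparison = comparison;
        _hashCode = hashCode;
    }

    public bool Equals(T x, T y) => _comparison(x, y);

    // Elements that compare equal must return the same hash code,
    // so the hash is derived from the same data the comparison uses.
    public int GetHashCode(T obj) => _hashCode(obj);
}

public static class ExceptDemo
{
    // Returns the IDs present in currentSettings but not in settings.
    public static List<int> MissingIds()
    {
        var settings = new[] { new Foo { ID = 1 }, new Foo { ID = 2 } };
        var currentSettings = new[] { new Foo { ID = 1 }, new Foo { ID = 2 }, new Foo { ID = 3 } };

        var comparer = new KeyedEqualityComparer<Foo>(
            (x, y) => x.ID == y.ID,
            x => x.ID.GetHashCode());

        return currentSettings.Except(settings, comparer).Select(f => f.ID).ToList();
    }
}
```

With the hash derived from ID, Except can place equal Foos in the same bucket, so only the Foo with ID 3 should be reported as missing.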

Related

Collection of objects with two keys

Imagine this scenario: I need to manipulate (add, search and delete) items from a list of objects of type Book.
class Book{
int Id {get; set;}
string Title {get; set;}
string Author {get; set;}
int Year {get; set;}
// more properties
}
Constraints:
Id should be unique within the collection of Books
Title should be unique within the collection of Books
What I have so far is a Dictionary<int, Book> that has Id as a key and Book as a value. But in this case, if I want to add a new book to the dictionary I have to loop through all the values to check whether the Title is a duplicate or not.
I started thinking about creating a HashSet only for Titles or having a second dictionary Dictionary<string, Book> that has Title as a key.
Any suggestions on how to handle this scenario?
Edit:
As @David mentioned, I forgot to say that my main concern here is performance. I want to look up objects by Id and by Title in the fastest way (O(1)).
You might use Tuple as the key:
var collection = new Dictionary<Tuple<int, string>, Book> (...);
var key = new Tuple<int, string>(1, "David"); // <<-----------
if(!collection.ContainsKey(key))
collection [key] = new Book(...);
Note that Tuple has its built in Equals() to make your life easier.
Update:
@AustinWBryan mentioned using ValueTuples (a C# 7.0 feature) to replace Tuple, which is highly recommended. For more info about ValueTuples, refer to this link.
To ensure that both sides of the composite key are also unique a tuple won't cut it. Instead make your own key that checks for this in the equality checker.
public struct CompositeKey<T1, T2> : IEquatable<CompositeKey<T1, T2>>
{
private static readonly EqualityComparer<T1> t1Comparer = EqualityComparer<T1>.Default;
private static readonly EqualityComparer<T2> t2Comparer = EqualityComparer<T2>.Default;
public T1 Key1;
public T2 Key2;
public CompositeKey(T1 key1, T2 key2)
{
Key1 = key1;
Key2 = key2;
}
public override bool Equals(object obj) => obj is CompositeKey<T1, T2> && Equals((CompositeKey<T1, T2>)obj);
public bool Equals(CompositeKey<T1, T2> other)
{
return t1Comparer.Equals(Key1, other.Key1)
&& t2Comparer.Equals(Key2, other.Key2);
}
public override int GetHashCode() => Key1.GetHashCode();
}
So the dictionary works on buckets. It puts all the keys into buckets based on the hash code generated by GetHashCode(). Then it searches that bucket using a for loop over Equals(). The idea is that buckets should be as small as possible (ideally one item).
So we can control when a key will match, and how many buckets/items there are by controlling the hash code. If we return a constant hash code like 0, then everything is in the same bucket and it's down to the equality method to compare every item.
This key only returns the hash of the first key item. Assuming the first key item is unique, this is enough: each bucket should still hold one item, and a lookup (which uses the full Equals method) also checks that the second key has the same value.
If you want to use ValueTuple as the key type you can pass in a custom comparer to the dictionary to achieve the same effect.
public class CompositeValueTupleComparer<T1, T2> : IEqualityComparer<(T1, T2)>
{
private static readonly EqualityComparer<T1> t1Comparer = EqualityComparer<T1>.Default;
private static readonly EqualityComparer<T2> t2Comparer = EqualityComparer<T2>.Default;
public bool Equals((T1, T2) x, (T1, T2) y) =>
t1Comparer.Equals(x.Item1, y.Item1) && t2Comparer.Equals(x.Item2, y.Item2);
public int GetHashCode((T1, T2) obj) => obj.Item1.GetHashCode();
}
new Dictionary<(int, string), Book>(new CompositeValueTupleComparer<int, string>());
It seems like both the ID and Name are going to be unique, as in, you shouldn't be able to use the same ID twice, regardless if the name has been used already. Otherwise, we'd end up with dict[3] referring to two different values.
Tuples or structs can't give that behavior, and still require you to loop. What you should instead do, is use a class similar to the one I've created:
public class TwoKeyDictionary<TKey1, TKey2, TValue>
{
public readonly List<TKey1> firstKeys = new List<TKey1>();
public readonly List<TKey2> secondKeys = new List<TKey2>();
public readonly List<TValue> values = new List<TValue>();
public void Add(TKey1 key1, TKey2 key2, TValue value)
{
if (firstKeys.Contains(key1)) throw new ArgumentException();
if (secondKeys.Contains(key2)) throw new ArgumentException();
firstKeys.Add(key1);
secondKeys.Add(key2);
values.Add(value);
}
public void Remove(TKey1 key) => RemoveAll(firstKeys.IndexOf(key));
public void Remove(TKey2 key) => RemoveAll(secondKeys.IndexOf(key));
private void RemoveAll(int index)
{
if (index < 0) return; // IndexOf returns -1 when the key is not found; 0 is a valid index
firstKeys.RemoveAt(index);
secondKeys.RemoveAt(index);
values.RemoveAt(index);
}
public TValue this[TKey1 key1]
{
get
{
int index = firstKeys.IndexOf(key1);
if (index < 0) throw new IndexOutOfRangeException();
return values[index];
}
}
public TValue this[TKey2 key2]
{
get
{
int index = secondKeys.IndexOf(key2);
if (index < 0) throw new IndexOutOfRangeException();
return values[index];
}
}
}
And then you can use it like this:
var twoDict = new TwoKeyDictionary<int, string, float>();
twoDict.Add(0, "a", 0.5f);
twoDict.Add(2, "b", 0.25f);
Console.WriteLine(twoDict[0]); // Prints "0.5"
Console.WriteLine(twoDict[2]); // Prints "0.25"
Console.WriteLine(twoDict["a"]); // Prints "0.5"
Console.WriteLine(twoDict["b"]); // Prints "0.25"
twoDict.Add(0, "d", 2); // Throws exception: 0 has already been added, even though "d" hasn't
twoDict.Add(1, "a", 5); // Throws exception: "a" has already been added, even though "1" hasn't
To behave like a full collection, the TwoKeyDictionary would also need to implement ICollection, IEnumerable, etc.
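The list-based class above searches linearly on every Add and lookup. A sketch of the same idea backed by two dictionaries, as the question itself suggested, keeps both lookups at roughly O(1). The class name TwoIndexDictionary and its method names are hypothetical, and named methods are used instead of overloaded indexers so the class still compiles when both key types are the same:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical two-index collection: each key type gets its own dictionary.
public class TwoIndexDictionary<TKey1, TKey2, TValue>
{
    // Each entry remembers the other key so removal can clean up both indexes.
    private readonly Dictionary<TKey1, (TKey2 Key2, TValue Value)> _byKey1 =
        new Dictionary<TKey1, (TKey2 Key2, TValue Value)>();
    private readonly Dictionary<TKey2, (TKey1 Key1, TValue Value)> _byKey2 =
        new Dictionary<TKey2, (TKey1 Key1, TValue Value)>();

    public void Add(TKey1 key1, TKey2 key2, TValue value)
    {
        // Check both keys before mutating either index.
        if (_byKey1.ContainsKey(key1)) throw new ArgumentException("Duplicate first key.", nameof(key1));
        if (_byKey2.ContainsKey(key2)) throw new ArgumentException("Duplicate second key.", nameof(key2));
        _byKey1.Add(key1, (key2, value));
        _byKey2.Add(key2, (key1, value));
    }

    public bool TryGetByKey1(TKey1 key1, out TValue value)
    {
        if (_byKey1.TryGetValue(key1, out var entry)) { value = entry.Value; return true; }
        value = default;
        return false;
    }

    public bool TryGetByKey2(TKey2 key2, out TValue value)
    {
        if (_byKey2.TryGetValue(key2, out var entry)) { value = entry.Value; return true; }
        value = default;
        return false;
    }

    public bool RemoveByKey1(TKey1 key1)
    {
        if (!_byKey1.TryGetValue(key1, out var entry)) return false;
        _byKey1.Remove(key1);
        _byKey2.Remove(entry.Key2);
        return true;
    }
}
```

The trade-off is memory: every value is referenced from two dictionaries, in exchange for hash lookups on either key.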

Each Property-Value in a MyObject-list must be unique

Let's say I have the following object:
public class MyObject
{
public string MyValue { get; set; }
}
And in another class I have a list of these objects:
public class MyClass
{
private List<MyObject> _list;
public MyClass(List<MyObject> myObjects)
{
_list = myObjects;
}
public bool AllUniqueValues()
{
...
}
}
I want to check if all MyObjects in the list have a unique (non-duplicated) MyValue. When I use the following it works:
public bool AllUnique()
{
return _list.All(x => _list.Count(y => String.Equals(y.MyValue, x.MyValue)) == 1);
}
But I have the feeling this can be done in a simpler, more elegant way. So, my question: is there a better approach to check if all MyObjects have a non-duplicated MyValue, and if so, how?
I find this quite elegant:
public static class EnumerableExtensions
{
public static bool AllUnique<TSource, TResult>(this IEnumerable<TSource> enumerable,
Func<TSource, TResult> selector)
{
var uniques = new HashSet<TResult>();
return enumerable.All(item => uniques.Add(selector(item)));
}
}
And now your code becomes:
var allUnique = _list.AllUnique(i => i.MyValue);
One of many ways to do it:
return !_list.GroupBy(c=>c.MyValue).Any(c=>c.Count() > 1);
At least it is a little bit more clear.
The most elegant way of solving this is using a set data structure. An unordered collection of unique elements. In .NET, you need to use HashSet<T>.
You can either override Equals and GetHashCode of MyObject to provide what equality means in your case, or implement an IEqualityComparer<T>.
If you instantiate HashSet<T> without providing an IEqualityComparer<T> implementation, it will use your Equals and GetHashCode overrides; otherwise it will use the comparer's implementation. Usually you implement equality comparers when there is more than one meaning of equality for the same object.
I might still need an ordered collection of elements
If you still need to store your objects in order, you can store the elements in both a HashSet<T> and a List<T> in parallel. What you get with HashSet<T> is practically O(1) access when you need to check whether an item exists or perform other supported set operations: since it is a hashed collection, it does not need to iterate over every element to find one.
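As a rough sketch of the comparer route, assuming a hypothetical MyValueComparer that treats two MyObjects as equal when their MyValue strings match (ordinal comparison is an assumption too):

```csharp
using System;
using System.Collections.Generic;

public class MyObject
{
    public string MyValue { get; set; }
}

// Hypothetical comparer: equality is defined by MyValue only.
public class MyValueComparer : IEqualityComparer<MyObject>
{
    public bool Equals(MyObject x, MyObject y) =>
        string.Equals(x?.MyValue, y?.MyValue, StringComparison.Ordinal);

    // Must agree with Equals: the hash is built from MyValue alone.
    public int GetHashCode(MyObject obj) =>
        obj?.MyValue == null ? 0 : StringComparer.Ordinal.GetHashCode(obj.MyValue);
}

public static class UniqueCheck
{
    public static bool AllUnique(IEnumerable<MyObject> items)
    {
        var seen = new HashSet<MyObject>(new MyValueComparer());
        foreach (var item in items)
        {
            // Add returns false when an equal element is already in the set.
            if (!seen.Add(item)) return false;
        }
        return true;
    }
}
```

The loop stops at the first duplicate, so the whole list is only walked when every value is unique.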
There are many ways to do it, but personally, I'd do the following:
public bool AllUnique()
{
return _list.GroupBy(x => x.MyValue).Count() == _list.Count();
}

Having trouble removing duplicates in List<class>

I have a class and list:
public class className
{
public string firstParam { get; set; }
public string secondParam { get; set; }
}
public static List<className> listName = new List<className>();
The list includes (for example):
Apple Banana
Corn Celery
Corn Celery
Corn Grapes
Raisins Pork
I am trying to edit the list (or create a new list) to get:
Apple Banana
Corn Celery
Corn Grapes
Raisins Pork
I have tried:
var listNoDupes = listName.Distinct();
And:
IEnumerable<className> listNoDupes = listName.Distinct();
But both return the list in the same condition as before, with duplicates.
You need to override/implement Equals() and GetHashCode(), right now you are listing distinct instances and they are correctly ALL distinct/unique from each other.
The problem you are running into is that the identity of the objects is not what you think. Your intuition tells you that the identity is the combination of firstParam and secondParam. What truly happens is that each distinct instance of className has its own identity that does not depend on the values it holds. You will need to override the methods inherited from System.Object, namely Equals and GetHashCode. Both are needed here: Distinct uses a hash-based set internally, so two instances are only compared with Equals when their hash codes match.
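A minimal sketch of such overrides for the class from the question. The hash-combining constants (17 and 31) are a common convention, not something the answers prescribe, and the DistinctDemo helper is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class className : IEquatable<className>
{
    public string firstParam { get; set; }
    public string secondParam { get; set; }

    // Two instances are equal when both parameters match.
    public bool Equals(className other) =>
        other != null &&
        firstParam == other.firstParam &&
        secondParam == other.secondParam;

    public override bool Equals(object obj) => Equals(obj as className);

    // Built from the same two properties that Equals compares.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (firstParam?.GetHashCode() ?? 0);
            hash = hash * 31 + (secondParam?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class DistinctDemo
{
    // Uses the sample data from the question, including the duplicate "Corn Celery" row.
    public static int CountDistinct()
    {
        var listName = new List<className>
        {
            new className { firstParam = "Apple", secondParam = "Banana" },
            new className { firstParam = "Corn", secondParam = "Celery" },
            new className { firstParam = "Corn", secondParam = "Celery" },
            new className { firstParam = "Corn", secondParam = "Grapes" },
            new className { firstParam = "Raisins", secondParam = "Pork" },
        };
        return listName.Distinct().Count();
    }
}
```

With these overrides in place, the plain Distinct() call from the question should drop the repeated "Corn Celery" entry.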
If your class only contains those two fields then instead of implementing Equals and GetHashCode You can also do:
var listNoDupes = listName.GroupBy(r => new { r.firstParam, r.secondParam })
.Select(grp => grp.First())
.ToList();
Or you can get an IEnumerable<T> back like:
IEnumerable<className> listNoDupes =
listName
.GroupBy(r => new { r.firstParam, r.secondParam })
.Select(grp => grp.First());
The code above groups on the properties firstParam and secondParam; grp.First() then returns a single item from each group, so you end up with one item per group (no duplicates).
There is a third possibility: use the Distinct overload that takes an IEqualityComparer. Unfortunately, C# does not support creating anonymous, temporary implementations of interfaces. We can create a helper class and an extension method:
public static class IEnumerableExtensions
{
public class LambdaEqualityComparer<T> : IEqualityComparer<T>
{
private Func<T, T, bool> comparer;
private Func<T, int> hash;
public LambdaEqualityComparer(Func<T, T, bool> comparer,
Func<T, int> hash)
{
this.comparer = comparer;
this.hash = hash;
}
public bool Equals(T x, T y)
{
return comparer(x, y);
}
public int GetHashCode(T x)
{
return hash(x);
}
}
public static IEnumerable<T> Distinct<T>(this IEnumerable<T> elems,
Func<T, T, bool> comparer,
Func<T, int> hash)
{
return elems.Distinct(new LambdaEqualityComparer<T>(comparer, hash));
}
}
and then we can provide lambdas for Distinct method:
var filteredList = myList.Distinct((x, y) => x.firstParam == y.firstParam &&
x.secondParam == y.secondParam,
x => 17 * x.firstParam.GetHashCode() + x.secondParam.GetHashCode());
This lets you de-duplicate objects in a single call, without implementing Equals and GetHashCode. If, for example, there is a single place in the project where you call such a Distinct, this extension is probably enough. If, on the other hand, the identity of className objects is a concept that spans many methods and classes, it will certainly be better to simply define Equals and GetHashCode.

How to assert that two lists contain elements with the same public properties in NUnit?

I want to assert that the elements of two lists contain the values that I expect, something like:
var foundCollection = fooManager.LoadFoo();
var expectedCollection = new List<Foo>()
{
new Foo() { Bar = "a", Bar2 = "b" },
new Foo() { Bar = "c", Bar2 = "d" }
};
//assert: I use AreEquivalent since the order does not matter
CollectionAssert.AreEquivalent(expectedCollection, foundCollection);
However the above code will not work (I guess because .Equals() does not return true for different objects with the same value). In my test, I only care about the public property values, not whether the objects are equal. What can I do to make my assertion?
REWORKED ANSWER
There is a CollectionAssert.AreEqual(IEnumerable, IEnumerable, IComparer) overload to assert that two collections contain the same objects in the same order, using an IComparer implementation to check the object equivalence.
In the scenario described above, the order is not important. However, to sufficiently handle also the situation where there are multiple equivalent objects in the two collections, it becomes necessary to first order the objects in each collection and use one-by-one comparison to ensure that also the number of equivalent objects are the same in the two collections.
Enumerable.OrderBy provides an overload that takes an IComparer<T> argument. To ensure that the two collections are sorted in the same order, it is more or less required that the types of the identifying properties implement IComparable. Here is an example of a comparer class that implements both the IComparer and IComparer<Foo> interfaces, and where it is assumed that Bar takes precedence when ordering:
public class FooComparer : IComparer, IComparer<Foo>
{
public int Compare(object x, object y)
{
var lhs = x as Foo;
var rhs = y as Foo;
if (lhs == null || rhs == null) throw new InvalidOperationException();
return Compare(lhs, rhs);
}
public int Compare(Foo x, Foo y)
{
int temp;
return (temp = x.Bar.CompareTo(y.Bar)) != 0 ? temp : x.Bar2.CompareTo(y.Bar2);
}
}
To assert that the objects in the two collections are the same and come in equal numbers (but not necessarily in the same order to begin with), the following lines should do the trick:
var comparer = new FooComparer();
CollectionAssert.AreEqual(
expectedCollection.OrderBy(foo => foo, comparer),
foundCollection.OrderBy(foo => foo, comparer), comparer);
No, NUnit has no such mechanism as of current state. You'll have to roll your own assertion logic. Either as separate method, or utilizing Has.All.Matches:
Assert.That(found, Has.All.Matches<Foo>(f => IsInExpected(f, expected)));
private bool IsInExpected(Foo item, IEnumerable<Foo> expected)
{
var matchedItem = expected.FirstOrDefault(f =>
f.Bar1 == item.Bar1 &&
f.Bar2 == item.Bar2 &&
f.Bar3 == item.Bar3
);
return matchedItem != null;
}
This of course assumes you know all relevant properties upfront (otherwise, IsInExpected will have to resort to reflection) and that element order is not relevant.
(And your assumption was correct, NUnit's collection asserts use default comparers for types, which in most cases of user defined ones will be object's ReferenceEquals)
Using Has.All.Matches() works very well for comparing a found collection to the expected collection. However, it is not necessary to define the predicate used by Has.All.Matches() as a separate function. For relatively simple comparisons, the predicate can be included as part of the lambda expression like this.
Assert.That(found, Has.All.Matches<Foo>(f =>
expected.Any(e =>
f.Bar1 == e.Bar1 &&
f.Bar2 == e.Bar2 &&
f.Bar3 == e.Bar3)));
Now, while this assertion will ensure that every entry in the found collection also exists in the expected collection, it does not prove the reverse, namely that every entry in the expected collection is contained in the found collection. So, when it is important to know that found and expected are semantically equivalent (i.e., they contain the same semantically equivalent entries), we must add an additional assertion.
The simplest choice is to add the following.
Assert.AreEqual(expected.Count(), found.Count());
For those who prefer a bigger hammer, the following assertion could be used instead.
Assert.That(expected, Has.All.Matches<Foo>(e =>
found.Any(f =>
e.Bar1 == f.Bar1 &&
e.Bar2 == f.Bar2 &&
e.Bar3 == f.Bar3)));
By using the first assertion above in conjunction with either the second (preferred) or third assertion, we have now proven that the two collections are semantically the same.
Have you tried something like this?
Assert.That(expectedCollection, Is.EquivalentTo(foundCollection));
I had a similar problem. I was listing contributors, which contains "commenters" and other people. I want to get all the comments and from that derive the creators, but of course I'm only interested in unique creators. If someone created 50 comments I only want her name to appear once. So I wrote a test to see that the commenters are in the GetContributors() result.
I may be wrong, but what I think you're after (what I was after when I found this post) is to assert that there is exactly one of each item in one collection, found in another collection.
I solved this like so:
Assert.IsTrue(commenters.All(c => actual.Count(p => p.Id == c.Id) == 1));
If you also want the resulting list not to contain items other than the expected ones, you could just compare the lengths of the lists as well:
Assert.IsTrue(commenters.Count() == actual.Count());
I hope this is helpful, if so, I'd be very grateful if you would rate my answer.
To perform equivalence operations on complex types you need to implement IComparable.
http://support.microsoft.com/kb/320727
Alternatively you could use recursive reflection, which is less desirable.
One option is to write custom constraints to compare the items. Here's a nice article on the subject: http://www.davidarno.org/2012/07/25/improving-nunit-custom-constraints-with-syntax-helpers/
I recommend against using reflection or anything complex; it just adds more work and maintenance.
Serialize the objects (I recommend JSON) and compare the strings.
I'm unsure why you object to ordering, but I'd still recommend it, as it saves writing a custom comparer for each type.
And it keeps working automatically when the domain objects change.
Example (SharpTestsEx for fluent)
using Newtonsoft.Json;
using SharpTestsEx;
JsonConvert.SerializeObject(actual).Should().Be.EqualTo(JsonConvert.SerializeObject(expected));
You can write it as a simple extensions and make it more readable.
public static class CollectionAssertExtensions
{
public static void CollectionAreEqual<T>(this IEnumerable<T> actual, IEnumerable<T> expected)
{
JsonConvert.SerializeObject(actual).Should().Be.EqualTo(JsonConvert.SerializeObject(expected));
}
}
and then using your example call it like so:
var foundCollection = fooManager.LoadFoo();
var expectedCollection = new List<Foo>()
{
new Foo() { Bar = "a", Bar2 = "b" },
new Foo() { Bar = "c", Bar2 = "d" }
};
foundCollection.CollectionAreEqual(expectedCollection);
You'll get an assert message like so:
...:"a","Bar2":"b"},{"Bar":"d","Bar2":"d"}]
...:"a","Bar2":"b"},{"Bar":"c","Bar2":"d"}]
...__________________^_____
Simple code explaining how to use the IComparer
using System.Collections;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace CollectionAssert
{
[TestClass]
public class UnitTest1
{
[TestMethod]
public void TestMethod1()
{
IComparer collectionComparer = new CollectionComparer();
var expected = new List<SomeModel>{ new SomeModel { Name = "SomeOne", Age = 40}, new SomeModel{Name="SomeOther", Age = 50}};
var actual = new List<SomeModel> { new SomeModel { Name = "SomeOne", Age = 40 }, new SomeModel { Name = "SomeOther", Age = 50 } };
NUnit.Framework.CollectionAssert.AreEqual(expected, actual, collectionComparer);
}
}
public class SomeModel
{
public string Name { get; set; }
public int Age { get; set; }
}
public class CollectionComparer : IComparer, IComparer<SomeModel>
{
public int Compare(SomeModel x, SomeModel y)
{
if(x == null || y == null) return -1;
return x.Age == y.Age && x.Name == y.Name ? 0 : -1;
}
public int Compare(object x, object y)
{
var modelX = x as SomeModel;
var modelY = y as SomeModel;
return Compare(modelX, modelY);
}
}
}
This solved my problem using NUnit's Assertion class from the NUnitCore assembly:
AssertArrayEqualsByElements(list1.ToArray(), list2.ToArray());

c# 2D auto expandable collection

I'm looking for a collection.
I need to be able to add elements as if using a 2D integer key, for example .Add(3, 4, element). If I add outside the range of the collection I need the collection to expand, including in the negative direction, although it can have a limit; for example, the range of an Int16 would be good. Every element in the collection can have the same type, but I need to be able to specify what that type is, for example Set<type> s;
I also need to avoid slow operations such as searching when looking up an element, performance is less important when adding to the collection.
Does anyone have any ideas about what approach to use or, better, could someone provide the class in their answer?
If you want a compound key, you can use the Tuple<T1,T2> class in a Dictionary<Tuple<T1,T2>, TItem>.
var coll = new Dictionary<Tuple<int,int>, AnyClass>();
coll.Add(new Tuple<int,int>(2, 3), new AnyClass("foo"));
coll.Add(new Tuple<int,int>(4, 2), new AnyClass("bar"));
var foo = coll[new Tuple<int,int>(2,3)];
var bar = coll[new Tuple<int,int>(4,2)];
If the syntax is too weird, you may wrap the class like this :
public class Dictionary2d<TKey1, TKey2, TItem> : Dictionary<Tuple<TKey1, TKey2>,TItem>
{
public void Add(TKey1 k1, TKey2 k2, TItem item) {
this.Add(Tuple.Create(k1,k2), item);
}
public TItem this[TKey1 k1, TKey2 k2] {
get { return this[Tuple.Create(k1,k2)]; }
}
}
public class Program
{
static void Main() {
var coll = new Dictionary2d<int,int, AnyClass>();
coll.Add(2, 3, new AnyClass("foo"));
coll.Add(4, 2, new AnyClass("bar"));
var foo = coll[2,3];
var bar = coll[4,2];
}
}
The benefit of using the Tuple class is that equality and hash code comparison are handled natively, so even though it is a class, two different instances of a tuple with the same values will be considered equal.
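The same approach can be sketched with C# 7 value tuples as the key, which avoids allocating a Tuple object per lookup. This is an assumption layered on the answer above rather than part of it, and the SparseGrid name is hypothetical. Negative coordinates work without any special handling because the key is hashed, not used as an array index:

```csharp
using System.Collections.Generic;

// Hypothetical sparse 2D collection keyed by an (x, y) value tuple.
public class SparseGrid<T>
{
    private readonly Dictionary<(int X, int Y), T> _cells =
        new Dictionary<(int X, int Y), T>();

    public T this[int x, int y]
    {
        // The getter throws KeyNotFoundException for a cell that was never set.
        get => _cells[(x, y)];
        // The setter adds the cell or overwrites an existing one.
        set => _cells[(x, y)] = value;
    }

    public bool TryGet(int x, int y, out T value) => _cells.TryGetValue((x, y), out value);

    public int Count => _cells.Count;
}
```

Only cells that were actually set take up memory, so the grid "expands" in any direction simply by writing to a new coordinate.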
It sounds like you want a Dictionary<int, T>.
You can implement this Set<T> by storing its data in a private variable of type Dictionary<int, Dictionary<int, T>>.
You can then store using
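public void Add(int key1, int key2, T value)
{
// Create the inner dictionary the first time key1 is used
if (!_storage.TryGetValue(key1, out var inner))
_storage[key1] = inner = new Dictionary<int, T>();
inner[key2] = value;
}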
