Intersect with a custom IEqualityComparer using Linq - c#

Long story short: I have 2 collections of objects. One contains good values (Let's call it "Good"), the other default values (Mr. "Default"). I want the Intersect of the Union between Good and Default, and Default. In other words: Intersect(Union(Good, Default), Default). One might think it resolves as Default, but here is where it gets tricky : I use a custom IEqualityComparer.
I got the following classes :
class MyClass
{
public string MyString1;
public string MyString2;
public string MyString3;
}
class MyEqualityComparer : IEqualityComparer<MyClass>
{
public bool Equals(MyClass item1, MyClass item2)
{
if(item1 == null && item2 == null)
return true;
else if((item1 != null && item2 == null) ||
(item1 == null && item2 != null))
return false;
return item1.MyString1.Equals(item2.MyString1) &&
item1.MyString2.Equals(item2.MyString2);
}
public int GetHashCode(MyClass item)
{
return new { item.MyString1, item.MyString2 }.GetHashCode();
}
}
Here are the characteristics of my Good and Default collections:
Default : It's a large set, containing all the wanted { MyString1, MyString2 } pairs, but the MyString3 values are, as you can guess, default values.
Good : It's a smaller set, containing mostly items which are in the Default set, but with some good MyString3 values. It also has some { MyString1, MyString2 } that are outside of the wanted set.
What I want to do is this : Take only the items from Good that are in Default, but add the other items in Default to that.
Here is, what I think is, my best try :
HalfWantedResult = Good.Union(Default, new MyEqualityComparer());
WantedResult= HalfWantedResult.Intersect(Good, new MyEqualityComparer());
I thought it should have worked, but the result I get is basically only the good { MyString1, MyString2 } pairs set, but all coming from the Default set, so I have the default value all across. I also tried switching the Default and Good of the last Intersect, but I get the same result.

First of all this is wrong:
public bool Equals(MyClass item1, MyClass item2)
{
return GetHashCode(item1) == GetHashCode(item2);
}
If the hash codes are different, the two items are certainly different; but if the hash codes are equal, it is not guaranteed that the two items are equal.
So this is the correct Equals implementation:
public bool Equals(MyClass item1, MyClass item2)
{
if(object.ReferenceEquals(item1, item2))
return true;
if(item1 == null || item2 == null)
return false;
return item1.MyString1.Equals(item2.MyString1) &&
item1.MyString2.Equals(item2.MyString2);
}
As Slacks suggested (anticipating me) the code is the following:
var Default = new List<MyClass>
{
new MyClass{MyString1="A",MyString2="A",MyString3="-"},
new MyClass{MyString1="B",MyString2="B",MyString3="-"},
new MyClass{MyString1="X",MyString2="X",MyString3="-"},
new MyClass{MyString1="Y",MyString2="Y",MyString3="-"},
new MyClass{MyString1="Z",MyString2="Z",MyString3="-"},
};
var Good = new List<MyClass>
{
new MyClass{MyString1="A",MyString2="A",MyString3="+"},
new MyClass{MyString1="B",MyString2="B",MyString3="+"},
new MyClass{MyString1="C",MyString2="C",MyString3="+"},
new MyClass{MyString1="D",MyString2="D",MyString3="+"},
new MyClass{MyString1="E",MyString2="E",MyString3="+"},
};
var wantedResult = Good.Intersect(Default, new MyEqualityComparer())
.Union(Default, new MyEqualityComparer());
// wantedResult:
// A A +
// B B +
// X X -
// Y Y -
// Z Z -

You need to check for actual equality, not just hashcode equality.
GetHashCode() is not (and cannot be) collision free, which is why the Equals method is required in the first place.
Also, you can do this much more simply by writing
WantedResult = Good.Concat(Default).Distinct();
The Distinct method will return the first item of each pair of duplicates, so this will return the desired result.
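EDIT: That would also keep the Good items that are outside the wanted set (and Distinct() without the custom comparer would not treat any pair as duplicates), so it should be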
WantedResult = Good.Intersect(Default, new MyEqualityComparer())
.Union(Default, new MyEqualityComparer());


C# Compare two object values

I currently have two objects (of the same type) that may represent any primitive value such as string, int, datetime etc.
var valueX = ...;
var valueY = ...;
At the moment I compare them at string level like this
var result = string.Compare(valueX.ToString(), valueY.ToString(), StringComparison.Ordinal);
But I need to compare them on type level (as ints if those happen to be ints
int i = 0;
int j = 2;
i.CompareTo(j);
, as dates if they happen to be date etc), something like
object.Compare(x,y);
That returns -1,0,1 in the same way. What are the ways to achieve that ?
Thanks for your answers, the correct way was to check if the object implements IComparable and if it does - make a typecast and call CompareTo
if (valueX is IComparable)
{
var compareResult = ((IComparable)valueX).CompareTo((IComparable)valueY);
}
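As a minimal sketch of that IComparable approach, the helper below also handles nulls and mismatched types. The names ValueComparer and CompareValues are hypothetical, and sorting null before any value is an assumption, not something the question specifies.

```csharp
using System;

static class ValueComparer
{
    // Hypothetical helper: returns -1, 0 or 1 for two boxed values.
    public static int CompareValues(object x, object y)
    {
        if (x == null && y == null) return 0;
        if (x == null) return -1;   // assumption: null sorts before any value
        if (y == null) return 1;

        // Most CompareTo implementations throw when the runtime types differ,
        // so reject mismatched types up front with a clearer message.
        if (x.GetType() != y.GetType())
            throw new ArgumentException("Values must have the same runtime type.");

        if (x is IComparable comparable)
            return Math.Sign(comparable.CompareTo(y)); // normalise to -1, 0, 1

        throw new ArgumentException(x.GetType() + " does not implement IComparable.");
    }
}
```

Math.Sign is used because CompareTo only promises a negative, zero or positive result, not exactly -1, 0 or 1.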
Object.Equals(obj1, obj2) won't work for a class that doesn't override Equals unless both variables reference the same object.
EG:
var obj1 = new MyObject();
var obj2 = new MyObject();
This will return "False" for Object.Equals(obj1, obj2) as they are different references.
var obj1 = new MyObject();
var obj2 = obj1;
This will return "True" for Object.Equals(obj1, obj2) as they are the same reference.
Solution:
You will most likely need to override Object.Equals in your class. Alternatively, create a custom object comparer for a specific type, or dynamically go through each property and compare.
There are several options to do this.
Override Object.Equals
You can override the Object.Equals() method in the class, and then determine what makes the objects equal there. This can also let you cleverly decide what to compare, since it appears those objects can be multiple data types. Inside this override, you'll need to handle each possible case. You can read more about this option here:
https://msdn.microsoft.com/en-us/library/bsc2ak47(v=vs.110).aspx
It should be noted that by default, Object.Equals() compares the references of reference-type objects.
Implement IComparable
IComparable is a neat interface that gives an object a CompareTo method. As the comments mention, this will let you define how to compare the objects based on whatever criteria you want.
This option gets covered here: https://msdn.microsoft.com/en-us/library/system.icomparable(v=vs.110).aspx
Implement CompareBy() Methods
Alternatively, you can implement methods for each possible type, i.e. CompareByInt() or CompareByString(), but this method depends on you knowing what you're going to have when you go to do it. This will also have the negative effect of making code more difficult to maintain, as there are many more methods involved.
You can write a GeneralComparer with a Compare method, overloaded as necessary.
For types that must perform a standard comparison you can use Comparer<T>.Default; for other types you write your own comparison function. Here's a sample:
static class GeneralComparer
{
public static int Compare(int x, int y)
{
//for int, use the standard comparison:
return Comparer<int>.Default.Compare(x, y);
}
public static int Compare(string x, string y)
{
//for string, use custom comparison:
return string.Compare(x, y, StringComparison.Ordinal);
}
//overload for DateTime
//overload for MyType
//overload for object
//...
}
The overload is chosen at compile time, based on the declared types of the arguments.
There's a drawback: if you declare two int (or other specific types) as object, the object overload is called:
object a = 2;
object b = 3;
//this will call the "Compare(object x, object y)" overload!
int comparison = GeneralComparer.Compare(a, b);
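The sketch below illustrates that binding behaviour and one possible workaround: casting the arguments to dynamic defers overload selection to run time. OverloadDemo and its Which overloads are hypothetical stand-ins that only report which overload was picked, and using dynamic here is a suggestion rather than part of the original answer.

```csharp
using System;

static class OverloadDemo
{
    // Hypothetical overloads that only report which one was selected.
    public static string Which(int x, int y) => "int";
    public static string Which(string x, string y) => "string";
    public static string Which(object x, object y) => "object";

    // Bound at compile time: the arguments are declared as object,
    // so the object overload is always selected.
    public static string StaticCall(object a, object b) => Which(a, b);

    // Bound at run time: dynamic makes the binder look at the actual
    // types of the boxed values and select the matching overload.
    public static string DynamicCall(object a, object b) => Which((dynamic)a, (dynamic)b);
}
```

With a = 2 and b = 3 boxed as object, StaticCall selects the object overload while DynamicCall selects the int overload. Dynamic dispatch has a run-time cost, so a type check inside the object overload may be preferable in hot paths.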
Another approach: convert both objects to dictionaries and then, following the set concept from math, subtract one key set from the other; the difference should be empty if the objects are identical.
public static IDictionary<string, object> ToDictionary(this object source)
{
var fields = source.GetType().GetFields(
BindingFlags.GetField |
BindingFlags.Public |
BindingFlags.Instance).ToDictionary
(
propInfo => propInfo.Name,
propInfo => propInfo.GetValue(source) ?? string.Empty
);
var properties = source.GetType().GetProperties(
BindingFlags.GetField |
BindingFlags.GetProperty |
BindingFlags.Public |
BindingFlags.Instance).ToDictionary
(
propInfo => propInfo.Name,
propInfo => propInfo.GetValue(source, null) ?? string.Empty
);
return fields.Concat(properties).ToDictionary(key => key.Key, value => value.Value);
}
public static bool EqualsByValue(this object source, object destination)
{
var firstDic = source.ToFlattenDictionary();
var secondDic = destination.ToFlattenDictionary();
if (firstDic.Count != secondDic.Count)
return false;
if (firstDic.Keys.Except(secondDic.Keys).Any())
return false;
if (secondDic.Keys.Except(firstDic.Keys).Any())
return false;
return firstDic.All(pair =>
pair.Value.ToString().Equals(secondDic[pair.Key].ToString())
);
}
public static bool IsAnonymousType(this object instance)
{
if (instance == null)
return false;
return instance.GetType().Namespace == null;
}
public static IDictionary<string, object> ToFlattenDictionary(this object source, string parentPropertyKey = null, IDictionary<string, object> parentPropertyValue = null)
{
var propsDic = parentPropertyValue ?? new Dictionary<string, object>();
foreach (var item in source.ToDictionary())
{
var key = string.IsNullOrEmpty(parentPropertyKey) ? item.Key : $"{parentPropertyKey}.{item.Key}";
if (item.Value.IsAnonymousType())
// Recurse into the nested object, then keep processing the remaining members.
item.Value.ToFlattenDictionary(key, propsDic);
else
propsDic.Add(key, item.Value);
}
return propsDic;
}
originalObj.EqualsByValue(messageBody); // will compare values.
source of the code

Prevent stack overflow while crawling inside objects via reflection in c#

I have this method called MatchNodes: IEnumerable<bool> MatchNodes<T>(T n1, T n2)
Which basically gets every property and field from both T objects (via reflection, and not including properties/fields from base classes) and compares them, returning the result as a IEnumerable of bools.
When it finds a primitive type or string, it just returns the == between them.
When it finds a type derived from a collection, it iterates each member and calls MatchNodes for each of them (ouch).
When it finds any other type, it calls MatchNodes for each property/field.
My solution is obviously asking for a stack overflow exception, but I don't have a clue on how make it better, because I have no idea how deep the objects will go.
Code (try not to cry please, it's ugly as hell):
public static IEnumerable<bool> MatchNodes<T>(T n1, T n2)
{
Func<PropertyInfo, bool> func= null;
if (typeof(T) == typeof(String))
{
String str1 = n1 as String;
String str2 = n2 as String;
func = new Func<PropertyInfo, bool>((property) => str1 == str2);
}
else if (typeof(System.Collections.IEnumerable).IsAssignableFrom(typeof(T)))
{
System.Collections.IEnumerable e1 = (System.Collections.IEnumerable)n1;
System.Collections.IEnumerable e2 = (System.Collections.IEnumerable)n2;
func = new Func<PropertyInfo, bool>((property) =>
{
// Walk e2 with a single enumerator, in step with e1.
var it2 = e2.GetEnumerator();
foreach (var v1 in e1)
{
if (!it2.MoveNext())
{
return false; // e2 has fewer elements
}
var v2 = it2.Current;
if (!MatchNodes(v1, v2).All(b => b))
{
return false; // an element pair differs
}
}
// Equal only if e2 has no extra elements.
return !it2.MoveNext();
});
}
else if (typeof(T).IsPrimitive || typeof(T) == typeof(Decimal))
{
func = new Func<PropertyInfo, bool>((property) => property.GetValue(n1, null) == property.GetValue(n2, null));
}
else
{
func = new Func<PropertyInfo, bool>((property) =>
((IEnumerable<bool>)MatchNodes(property.GetValue(n1, null),
property.GetValue(n2, null))).All(b => b == true));
}
foreach (PropertyInfo property in typeof(T).GetProperties().Where((property) => property.DeclaringType == typeof(T)))
{
bool result =func(property);
yield return result;
}
}
What I'm looking for is a way to crawl into the objects without calling my method recursively.
EDIT
To clarify, example:
public class Class1 : RandomClassWithMoreProperties{
public string Str1{get;set;}
public int Int1{get;set;}
}
public class Class2{
public List<Class1> MyClassProp1 {get;set;}
public Class1 MyClassProp2 {get;set;}
public string MyStr {get;set;}
}
MatchNodes(n1,n2) where n1.GetType() and n2.GetType() are Class2 would return true if:
Every Class1 object inside MyClassProp1 has the same Str1,Int1 for both objects
MyClassProp2 has the same Str1,Int1 for both objects
MyStr is equal for both objects
And I won't compare any properties from RandomClassWithMoreProperties.
You can use a stack or queue to store the properties you want to compare. It goes along these lines:
var stack = new Stack<Tuple<object, object>>();
// prime the stack
foreach (var prop in n1.GetType().GetProperties())
{
stack.Push(Tuple.Create(prop.GetValue(n1), prop.GetValue(n2)));
}
while (stack.Count > 0)
{
var current = stack.Pop();
// if current is primitive: compare
// if current is enumerable: push all elements as Tuples on the stack
// else: push all properties as tuples on the stack
}
If you use a Queue instead of a Stack you get a BFS instead of a DFS. Also you should probably keep track of already visited nodes in a HashSet. You also might want to add a check to make sure the types of n1 and n2 are the same.
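Putting those pieces together, here is a minimal sketch of a non-recursive comparison with an explicit stack, a visited set and a type check. GraphComparer and DeepEquals are hypothetical names; comparing value types and strings with Equals, and skipping any object that was already visited, are simplifying assumptions.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

static class GraphComparer
{
    // Tracks visited objects by reference, ignoring any Equals override.
    sealed class RefComparer : IEqualityComparer<object>
    {
        bool IEqualityComparer<object>.Equals(object a, object b) => ReferenceEquals(a, b);
        int IEqualityComparer<object>.GetHashCode(object o) => RuntimeHelpers.GetHashCode(o);
    }

    public static bool DeepEquals(object n1, object n2)
    {
        var stack = new Stack<(object, object)>();
        var visited = new HashSet<object>(new RefComparer());
        stack.Push((n1, n2));

        while (stack.Count > 0)
        {
            var (a, b) = stack.Pop();
            if (ReferenceEquals(a, b)) continue;        // same object or both null
            if (a == null || b == null) return false;

            Type type = a.GetType();
            if (type != b.GetType()) return false;

            // Leaves: value types (int, decimal, DateTime, ...) and strings.
            if (type.IsValueType || type == typeof(string))
            {
                if (!a.Equals(b)) return false;
                continue;
            }

            // Simplification: an object seen before is not compared again,
            // which is what stops cycles from looping forever.
            if (!visited.Add(a)) continue;

            if (a is IEnumerable seqA)
            {
                IEnumerator ea = seqA.GetEnumerator();
                IEnumerator eb = ((IEnumerable)b).GetEnumerator();
                while (true)
                {
                    bool moreA = ea.MoveNext(), moreB = eb.MoveNext();
                    if (moreA != moreB) return false;   // different lengths
                    if (!moreA) break;
                    stack.Push((ea.Current, eb.Current));
                }
                continue;
            }

            foreach (var prop in type.GetProperties())
            {
                // Skip base-class properties (as the question asks) and indexers.
                if (prop.DeclaringType != type || prop.GetIndexParameters().Length > 0) continue;
                stack.Push((prop.GetValue(a), prop.GetValue(b)));
            }
        }
        return true;
    }
}
```

Because the pending pairs live on the heap in a Stack rather than on the call stack, the depth of the object graph no longer risks a StackOverflowException.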
A good approach here is to keep a breadcrumb trail of objects that you've touched, and passing that forward as you delve deeper. For each new object, check to see whether it is in the graph of objects that you have already seen, and if it is, short circuit and bail out (you've already seen that node). A stack is probably appropriate.
You are not likely to get stack overflows by comparing an acyclic object graph- it's when you end up with loops that things blow up.
Just keep track of the objects you already visited, in a List<object> for example (or a HashSet<> or anything like that)...
Also, any recursion can be un-recursed using the stack that you'll control manually.

Intersect between two lists not working

I have two lists, see below. The result is coming back empty.
List<Pay> olist = new List<Pay>();
List<Pay> nlist = new List<Pay>();
Pay oldpay = new Pay()
{
EventId = 1,
Number = 123,
Amount = 1
};
olist.Add(oldpay);
Pay newpay = new Pay ()
{
EventId = 1,
Number = 123,
Amount = 100
};
nlist.Add(newpay);
var Result = nlist.Intersect(olist);
any clue why?
You need to override the Equals and GetHashCode methods in your Pay class, otherwise Intersect doesn't know when 2 instances are considered equal. How could it guess that it is the EventId that determines equality? oldPay and newPay are different instances, so by default they're not considered equal.
You can override the methods in Pay like this:
public override int GetHashCode()
{
return this.EventId;
}
public override bool Equals(object other)
{
if (other is Pay)
return ((Pay)other).EventId == this.EventId;
return false;
}
Another option is to implement an IEqualityComparer<Pay> and pass it as a parameter to Intersect:
public class PayComparer : IEqualityComparer<Pay>
{
public bool Equals(Pay x, Pay y)
{
if (x == y) // same instance or both null
return true;
if (x == null || y == null) // either one is null but not both
return false;
return x.EventId == y.EventId;
}
public int GetHashCode(Pay pay)
{
return pay != null ? pay.EventId : 0;
}
}
...
var Result = nlist.Intersect(olist, new PayComparer());
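For reference, here is a self-contained sketch of the comparer approach applied to the data from the question. The shape of Pay is assumed (the question does not show the class), and PayByEventId and IntersectDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Pay; the question does not show the class itself.
class Pay
{
    public int EventId { get; set; }
    public int Number { get; set; }
    public int Amount { get; set; }
}

// Treats two Pay instances as equal when their EventId matches.
class PayByEventId : IEqualityComparer<Pay>
{
    public bool Equals(Pay x, Pay y)
    {
        if (ReferenceEquals(x, y)) return true;      // same instance or both null
        if (x is null || y is null) return false;    // exactly one is null
        return x.EventId == y.EventId;
    }

    public int GetHashCode(Pay pay) => pay is null ? 0 : pay.EventId;
}

static class IntersectDemo
{
    public static List<Pay> Run()
    {
        var olist = new List<Pay> { new Pay { EventId = 1, Number = 123, Amount = 1 } };
        var nlist = new List<Pay> { new Pay { EventId = 1, Number = 123, Amount = 100 } };

        // Intersect yields the matching elements of the first sequence (nlist).
        return nlist.Intersect(olist, new PayByEventId()).ToList();
    }
}
```

Run() returns one item, and it is the instance from nlist (Amount = 100), because Intersect enumerates the first sequence and only uses the second one to decide membership.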
Intersect is probably only adding objects when the same instance of Pay is in both List. As oldPay and newPay are instantiated apart they're considered not equal.
Intersect uses the Equals method to compare objects. If you don't override it it keeps the same behavior of the Object class: returning true only if both are the same instance of the object.
You should override the Equals method in Pay (and GetHashCode along with it, because Intersect hashes the items before it calls Equals).
//in the Pay class
public override bool Equals(Object o) {
Pay pay = o as Pay;
if (pay == null) return false;
// you haven't said if Number should be included in the comparison
return EventId == pay.EventId; // && Number == pay.Number; (if applies)
}
Objects are reference types. When you create two objects, you have two unique references. The only way they would ever compare equal is if you did:
object a = new object();
object b = a;
In this case, (a == b) is true. Read up on reference vs value types, and objects.
And to fix your issue, override Equals and GetHashCode, as Thomas Levesque pointed out.
As others have noted, you need to provide the appropriate overrides to get Intersect to work correctly. But there is another way if you don't want to bother with overrides and your use case is simple. This assumes you want to match items on EventId, but you can modify this to compare any property. Note that this approach is likely more expensive than calling Intersect, but for small data sets it may not matter.
List<Pay> intersectedPays = new List<Pay>();
foreach (Pay o in olist)
{
var intersectedPay = nlist.Where(n => n.EventId == o.EventId).SingleOrDefault();
if (intersectedPay != null)
intersectedPays.Add(intersectedPay);
}
List<Pay> result = intersectedPays;

c# compare the data in two object models

I have a dialog, when spawned it gets populated with the data in an object model. At this point the data is copied and stored in a "backup" object model. When the user has finished making their changes, and click "ok" to dismiss the dialog, I need a quick way of comparing the backup object model with the live one - if anything is changed I can create the user a new undo state.
I don't want to have to go and write comparison function for every single class in the object model if possible.
If I serialised both object models and they were identical but stored in different memory locations would they be equal? Does some simple way exist to compare two serialised object models?
I didn't bother with a hash string but just a straight Binary serialisation works wonders. When the dialog opens serialise the object model.
BinaryFormatter formatter = new BinaryFormatter();
m_backupStream = new MemoryStream();
formatter.Serialize(m_backupStream,m_objectModel);
The user may then change the object model using the available controls (or not). When the dialog closes, you can compare a new serialisation with the original one; this is how I decide whether or not an Undo state is required.
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream liveStream = new MemoryStream();
formatter.Serialize(liveStream,m_objectModel);
byte[] streamOneBytes = liveStream.ToArray();
byte[] streamTwoBytes = m_backupStream.ToArray();
if(!CompareArrays(streamOneBytes, streamTwoBytes))
AddUndoState();
And the compare arrays function in case anybody needs it; it's probably not the best way of comparing two arrays, I'm sure.
private bool CompareArrays(byte[] a, byte[] b)
{
if (a.Length != b.Length)
return false;
for (int i = 0; i < a.Length;i++)
{
if (a[i] != b[i])
return false;
}
return true;
}
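BinaryFormatter is marked obsolete in recent .NET versions, so the same snapshot idea can be sketched with a different serializer. The example below swaps in System.Text.Json, which is not what the answer above uses; SnapshotComparer, TakeSnapshot and HasChanged are hypothetical names, and it assumes the model's state is exposed through public properties.

```csharp
using System;
using System.Text.Json;

static class SnapshotComparer
{
    // Serializes the model's public properties to a JSON string.
    // Fields and private state are not included by default.
    public static string TakeSnapshot<T>(T model) => JsonSerializer.Serialize(model);

    // True when the model no longer serializes to the backup snapshot.
    public static bool HasChanged<T>(T model, string backupSnapshot) =>
        !string.Equals(TakeSnapshot(model), backupSnapshot, StringComparison.Ordinal);
}
```

Take the snapshot when the dialog opens, then call HasChanged when it closes to decide whether an undo state is needed.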
I'd say the best way is to implement the equality operators on all classes in your model (which is usually a good idea anyway if you're going to do comparisons).
class Book
{
public string Title { get; set; }
public string Author { get; set; }
public ICollection<Chapter> Chapters { get; set; }
public bool Equals(Book other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return Equals(other.Title, Title) && Equals(other.Author, Author) && Equals(other.Chapters, Chapters);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof (Book)) return false;
return Equals((Book) obj);
}
public override int GetHashCode()
{
unchecked
{
int result = (Title != null ? Title.GetHashCode() : 0);
result = (result*397) ^ (Author != null ? Author.GetHashCode() : 0);
result = (result*397) ^ (Chapters != null ? Chapters.GetHashCode() : 0);
return result;
}
}
}
This snippet is auto-generated by ReSharper, but you can use this as a basis. Basically you will have to extend the non-overridden Equals method with your custom comparison logic.
For instance, you might want to use SequenceEqual from the LINQ extensions to check if the chapters collections are equal in sequence.
Comparing two books will now be as simple as saying:
Book book1 = new Book();
Book book2 = new Book();
book1.Title = "A book!";
book2.Title = "A book!";
bool equality = book1.Equals(book2); // returns true
book2.Title = "A different Title";
equality = book1.Equals(book2); // returns false
Keep in mind that there's another way of implementing equality: the System.IEquatable<T> interface, which is used by various classes in the System.Collections.Generic namespace for determining equality.
I'd say check that out as well and you're well on your way!
I understand your question to be how one can compare two objects for value equality (as opposed to reference equality) without prior knowledge of the types, such as if they implement IEquatable or override Equals.
To do this I recommend two options:
A. Use an all-purpose serialization class to serialize both objects and compare the serialized values. For example I have a class called XmlSerializer that takes any object and serializes its public properties as an XML document. Two objects that have the same values (whether or not they are the same reference) will be equal in this sense.
B. Using reflection, compare the values of all of the properties of both objects, like:
bool Equal(object a, object b)
{
// They're both null.
if (a == null && b == null) return true;
// One is null, so they can't be the same.
if (a == null || b == null) return false;
// How can they be the same if they're different types?
if (a.GetType() != b.GetType()) return false;
var Props = a.GetType().GetProperties();
foreach(var Prop in Props)
{
// See notes *
var aPropValue = Prop.GetValue(a) ?? string.Empty;
var bPropValue = Prop.GetValue(b) ?? string.Empty;
if(aPropValue.ToString() != bPropValue.ToString())
return false;
}
return true;
}
Here we're assuming that we can easily compare the properties, like if they all implement IConvertible, or correctly override ToString. If that's not the case I would check if they implement IConvertible and if not, recursively call Equal() on the properties.
This only works if you're content with comparing public properties. Of course you COULD check private and protected fields and properties too, but if you know so little about the objects you're probably asking for trouble by doing so.

Array.BinarySearch where a certain condition is met

I have an array of a certain type. Now I want to find an entry where a certain condition is met.
What is the preferred way to do this with the restriction that I don't want to create a temporary object to find, but instead I only want to give a search condition.
MyClass[] myArray;
// fill and sort array..
MyClass item = Array.BinarySearch(myArray, x=>x.Name=="Joe"); // is this possible?
Maybe is it possible to use LINQ to solve it?
EDIT:
I know that it works on normal collections, but I need it to work for BinarySearch.
Just use FirstOrDefault (or SingleOrDefault, if unique).
var myItem = myArray.FirstOrDefault( x => x.Name == "Joe" );
Or if you want to force a BinarySearch and you know that the array is sorted
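// BinarySearch returns the index of the match, or a negative number if there is none.
int index = Array.BinarySearch( myArray,
new MyClass { Name = "Joe" },
new MyClassNameComparer() );
var myItem = index >= 0 ? myArray[index] : null;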
where MyClassNameComparer is IComparer<MyClass> and compares based on the name property.
If you don't want any temporary object -- I assume that a constant string is ok, otherwise you're lost -- then you can use.
int index = Array.BinarySearch( myArray,
"Joe",
new MyClassOrStringComparer() );
Where MyClassOrStringComparer is able to compare a string to a MyClass object (and vice versa).
public class MyClassOrStringComparer : System.Collections.IComparer
{
    public int Compare( object a, object b )
    {
        if (object.Equals(a,b))
        {
            return 0;
        }
        else if (a == null)
        {
            return -1;
        }
        else if (b == null)
        {
            return 1;
        }
        // Either argument may be the search string or an array element.
        string aName = a is string ? (string)a : ((MyClass)a).Name;
        string bName = b is string ? (string)b : ((MyClass)b).Name;
        return string.Compare( aName, bName );
    }
}
BinarySearch can only be used when the array is sorted, and only when searching for a particular value of the sort key. So this rules out use of an arbitrary predicate.
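If the array is sorted by the key being searched, a small hand-written binary search can take the key and a key selector, so no temporary object is needed. This is a sketch, not a framework API: SortedSearch and BinarySearchBy are hypothetical names, and the array must already be sorted ascending by the same key and comparer.

```csharp
using System;
using System.Collections.Generic;

static class SortedSearch
{
    // Returns the index of an element whose key equals `key`, or -1 if none.
    // Precondition: `sorted` is ordered ascending by keySelector under `comparer`.
    public static int BinarySearchBy<T, TKey>(
        T[] sorted, TKey key, Func<T, TKey> keySelector, IComparer<TKey> comparer = null)
    {
        comparer = comparer ?? Comparer<TKey>.Default;
        int lo = 0, hi = sorted.Length - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;               // avoids overflow of lo + hi
            int c = comparer.Compare(keySelector(sorted[mid]), key);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1;                    // key is in the upper half
            else hi = mid - 1;                          // key is in the lower half
        }
        return -1;
    }
}
```

For the question's array this might be called as SortedSearch.BinarySearchBy(myArray, "Joe", x => x.Name, StringComparer.Ordinal), assuming myArray was sorted by Name with the same ordinal comparer.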
No, BinarySearch does not have an overload with a Comparison<T> parameter. You can use a LINQ method instead:
MyClass item = myArray.FirstOrDefault(x => x.Name == "Joe");
