I've got a class:
class ThisClass
{
private string a {get; set;}
private string b {get; set;}
}
I would like to use the Intersect and Except methods of Linq, i.e.:
private List<ThisClass> foo = new List<ThisClass>();
private List<ThisClass> bar = new List<ThisClass>();
Then I fill the two lists separately. I'd like to do, for example (and I know this isn't right, just pseudo code), the following:
foo[a].Intersect(bar[a]);
How would I do this?
If you only need the intersection of a single property's values, the other LINQ solutions work just fine.
If you want to intersect whole objects, though, and get a List<ThisClass> back instead of a List<string>, you'll have to write your own equality comparer:
foo.Intersect(bar, new YourEqualityComparer());
The same works with Except.
public class YourEqualityComparer: IEqualityComparer<ThisClass>
{
#region IEqualityComparer<ThisClass> Members
public bool Equals(ThisClass x, ThisClass y)
{
//no null check here, you might want to do that, or correct that to compare just one part of your object
return x.a == y.a && x.b == y.b;
}
public int GetHashCode(ThisClass obj)
{
unchecked
{
var hash = 17;
//same here, if you only want to get a hashcode on a, remove the line with b
hash = hash * 23 + obj.a.GetHashCode();
hash = hash * 23 + obj.b.GetHashCode();
return hash;
}
}
#endregion
}
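As a rough usage sketch of a comparer like the one above: Pair, PairComparer and IntersectExceptDemo are hypothetical stand-ins (not from the question), and the properties are assumed to be public so the comparer can read them. Null checks are added because the answer notes they are missing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ThisClass, with public properties (an assumption).
public class Pair
{
    public string a { get; set; }
    public string b { get; set; }
}

public sealed class PairComparer : IEqualityComparer<Pair>
{
    public bool Equals(Pair x, Pair y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.a == y.a && x.b == y.b;
    }

    public int GetHashCode(Pair obj)
    {
        if (obj is null) return 0;
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + (obj.a?.GetHashCode() ?? 0);
            hash = hash * 23 + (obj.b?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class IntersectExceptDemo
{
    public static (List<Pair> Common, List<Pair> OnlyInFoo) Run()
    {
        var foo = new List<Pair> { new Pair { a = "1", b = "x" }, new Pair { a = "2", b = "y" } };
        var bar = new List<Pair> { new Pair { a = "2", b = "y" }, new Pair { a = "3", b = "z" } };
        var comparer = new PairComparer();

        // Elements of foo that also occur in bar, compared by value.
        var common = foo.Intersect(bar, comparer).ToList();
        // Elements of foo that do not occur in bar.
        var onlyInFoo = foo.Except(bar, comparer).ToList();
        return (common, onlyInFoo);
    }
}
```

Both calls return ThisClass-style objects taken from the first list, not just property values.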
Maybe
// returns list of intersecting property 'a' values
foo.Select(f => f.a).Intersect(bar.Select(b => b.a));
BTW property a should be public.
Not sure of the speed of this compared to intersect and compare but how about:
//Intersect
var inter = foo.Where(f => bar.Any(b => b.a == f.a));
//Except - values of foo not in bar
var except = foo.Where(f => !bar.Any(b => b.a == f.a));
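On the speed question: Where combined with Any scans bar once for every element of foo. A possible alternative, sketched here with hypothetical names (Entry, KeyLookupDemo), is to collect the keys of bar into a HashSet first so that each lookup is a hash probe. Like the Where/Any version, it keeps duplicates from foo, which Intersect would not.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ThisClass with a public 'a' property.
public class Entry
{
    public string a { get; set; }
}

public static class KeyLookupDemo
{
    // Elements of foo whose 'a' value also appears in bar.
    public static List<Entry> IntersectByA(IEnumerable<Entry> foo, IEnumerable<Entry> bar)
    {
        var barKeys = new HashSet<string>(bar.Select(b => b.a));
        return foo.Where(f => barKeys.Contains(f.a)).ToList();
    }

    // Elements of foo whose 'a' value does not appear in bar.
    public static List<Entry> ExceptByA(IEnumerable<Entry> foo, IEnumerable<Entry> bar)
    {
        var barKeys = new HashSet<string>(bar.Select(b => b.a));
        return foo.Where(f => !barKeys.Contains(f.a)).ToList();
    }
}
```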
foo.Select(x=>x.a).Intersect(bar.Select(x=>x.a))
What exactly is the desired effect? Do you want to get a list of strings composed of all the a's in your classes, or a list of ThisClass, when two ThisClass instances are identified via unique values of a?
If it's the former, the two answers from @lazyberezovksy and @Tilak should work. If it's the latter, you'll have to implement IEqualityComparer<ThisClass> or IEquatable<ThisClass> so that Intersect knows what makes two instances of ThisClass equivalent:
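class ThisClass : IEquatable<ThisClass>
{
private string a;
public bool Equals(ThisClass other)
{
return other != null && string.Equals(this.a, other.a);
}
public override bool Equals(object obj)
{
return Equals(obj as ThisClass);
}
// Intersect hashes the items first, so GetHashCode must agree with Equals
public override int GetHashCode()
{
return a == null ? 0 : a.GetHashCode();
}
}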
then you can just call:
var intersection = foo.Intersect(bar);
I know this is old but couldn't you also just override the Equals & GetHashCode on the class itself?
class ThisClass
{
public string a {get; set;}
private string b {get; set;}
public override bool Equals(object obj)
{
// If you only want to compare on a
// 'as' avoids an InvalidCastException when obj is null or another type
ThisClass that = obj as ThisClass;
return that != null && string.Equals(a, that.a/* optional: not case sensitive? */);
}
public override int GetHashCode()
{
return a == null ? 0 : a.GetHashCode();
}
}
You should create an IEqualityComparer and pass it to the Intersect() method. That makes it easy to get the list of items in foo that also appear in bar.
var intersectionList = foo.Intersect(bar, new ThisClassEqualityComparer()).ToList();
class ThisClassEqualityComparer : IEqualityComparer<ThisClass>
{
public bool Equals(ThisClass b1, ThisClass b2)
{
return b1.a == b2.a;
}
public int GetHashCode(ThisClass obj)
{
// Returning a constant puts every item in the same hash bucket,
// which forces Equals() to be called for every candidate pair.
// This is correct, but slow for large lists.
return 0;
}
}
I'm desperately trying to delete all the items with a list of the same value inside.
Here's the code:
private void Button_deleteDouble_MouseDown(object sender, EventArgs e)
{
boardGenerate.Add(new BoardInformation(146, new List<string> { "test" }));
boardGenerate.Add(new BoardInformation(545, new List<string> { "test" }));
boardGenerate = boardGenerate.DistinctBy(x => x.positionQueen).ToList();
}
Since the two lists inside the objects have the same content, I expected .DistinctBy() to remove one of the two objects.
But my list still contains both objects, each with the same list.
.positionQueen is the name of the property containing the list.
Could somebody help me?
Edit :
The DistinctBy() method comes from MoreLinq.
And this is my BoardInformation class:
public class BoardInformation
{
public BoardInformation(int nbQueen, List<string> positionQueen)
{
this.nbQueen = nbQueen;
this.positionQueen = positionQueen;
}
public int nbQueen { get; set; }
public List<string> positionQueen { get; set; }
}
Set-based operations like Distinct and DistinctBy need a way of determining whether two values are the same. You're using DistinctBy, so you're already asking MoreLINQ to compare the "inner lists" for equality - but you're not saying how to do that.
List<T> doesn't override Equals or GetHashCode, which means it inherits the reference equality behaviour from System.Object. In other words, if you create two separate List<T> objects, they won't compare as equal, even if they have the same content. For example:
List<int> list1 = new List<int>();
List<int> list2 = new List<int>();
Console.WriteLine(list1.Equals(list2)); // False
You need to tell DistinctBy how you want to compare the two lists, using an IEqualityComparer<T> - where T in this case is List<string> (because that's the type of BoardInformation.positionQueen).
Here's an example of a generic ListEqualityComparer you could use:
using System;
using System.Collections.Generic;
using System.Linq;
public sealed class ListEqualityComparer<T> : IEqualityComparer<List<T>>
{
private readonly IEqualityComparer<T> elementComparer;
public ListEqualityComparer(IEqualityComparer<T> elementComparer) =>
this.elementComparer = elementComparer;
public ListEqualityComparer() : this(EqualityComparer<T>.Default)
{
}
public bool Equals(List<T> x, List<T> y) =>
ReferenceEquals(x, y) ? true
: x is null || y is null ? false
// Delegate to LINQ's SequenceEqual method
: x.SequenceEqual(y, elementComparer);
public int GetHashCode(List<T> obj)
{
if (obj is null)
{
return 0;
}
// Just a very simple hash implementation
int hash = 23;
foreach (var item in obj)
{
hash = hash * 31 +
(item is null ? 0
: elementComparer.GetHashCode(item));
}
return hash;
}
}
You'd then pass that to DistinctBy, like this:
// We're fine to use the default *element* comparer (string.Equals etc)
var comparer = new ListEqualityComparer<string>();
boardGenerate = boardGenerate.DistinctBy(x => x.positionQueen, comparer).ToList();
Now DistinctBy will call into the comparer, passing in the lists, and will consider your two BoardInformation objects to be equal - so only the first will be yielded by DistinctBy, and you'll end up with a list containing a single item.
It comes down to whether an equality check uses reference equality or value equality. You want value equality based on a specific property, and that has to be implemented by hand.
When no IEqualityComparer is provided to compare individual objects (which the Distinct call needs), the default comparer falls back to the Equals and GetHashCode methods inherited from Object. For a class that does not override them, both are based on the reference, so every item in the list counts as unique (not equal) regardless of matching property values.
What you are looking for is value equality checked specifically on the nbQueen property.
To fully utilize Distinct, create an IEqualityComparer whose Equals and GetHashCode are both based on that value. Objects that should count as equal must return the same hash code; that way you can weed out instances with the same positionQueen (or other properties).
Example
public class MyClass
{
public string Name { get; set; }
public int nbQueen { get; set; }
}
Equality comparer to weed out all nbQueen similarities:
class ContactEmailComparer : IEqualityComparer < MyClass >
{
public bool Equals(MyClass x, MyClass y)
{
return x.nbQueen.Equals(y.nbQueen); // Compares the two nbQueen values directly
}
public int GetHashCode(MyClass obj)
{
return obj.nbQueen.GetHashCode(); // Add or remove other properties as needed.
}
}
Test code
var original = new List<MyClass>()
{
new MyClass() { nbQueen = 1, Name="Alpha" },
new MyClass() { nbQueen = 1, Name="Omega" },
new MyClass() { nbQueen = 3, Name="Delta" }
};
IEqualityComparer<MyClass> comparer = new ContactEmailComparer();
var newOne = original.Distinct( comparer ).ToList();
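Result: newOne contains two items, "Alpha" (nbQueen = 1) and "Delta" (nbQueen = 3); "Omega" is dropped because its nbQueen matches Alpha's.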
To be clear...
... .DistinctBy() command should remove one of the two objects.
It does not remove anything. It returns a new sequence that is distinct according to the equality operation. The original list (and the reference to it) does not change.
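A small sketch of that point, using the built-in Distinct on a list of ints rather than MoreLINQ's DistinctBy (the class and method names are hypothetical). The source list keeps all of its items; only the result is deduplicated.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DistinctIsNonMutatingDemo
{
    public static (int OriginalCount, int DistinctCount) Run()
    {
        var numbers = new List<int> { 1, 1, 3 };

        // Distinct only describes a query; nothing is evaluated yet.
        IEnumerable<int> distinct = numbers.Distinct();

        // ToList runs the query and copies the results into a new list.
        List<int> result = distinct.ToList();

        // numbers still holds all three items; result holds two.
        return (numbers.Count, result.Count);
    }
}
```

To keep only the distinct items, assign the result back to the variable, as the question's code already does.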
LINQ solution
Because you have another List inside your class, Distinct or DistinctBy will not work with the default comparer. Alternatively, you can use a LINQ query to filter the list:
boardGenerate = (from b in boardGenerate
from l in b.positionQueen
group new { l,b } by l into g
select g.First().b
).ToList();
// this returns just the first item of each group of duplicates, like Distinct
this is my Clients class:
public class Clients
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
}
I want to make a new list which contains the same clients from List A and List B .
For example:
List A - John, Jonathan, James ....
List B - Martha, Jane, Jonathan ....
Unsubscribers - Jonathan
public static List<Clients> SameClients(List<Clients> A, List<Clients> B)
{
List<Clients> Unsubscribers = new List<Clients>();
Unsubscribers = A.Intersect(B).ToList();
return Unsubscribers;
}
However, for some reason I get an empty list and I have no idea what's wrong.
The problem is that Equals and GetHashCode are used to compare the objects. You can override these two methods and provide your own implementation based on your needs; there is already an answer below covering how to override them.
However, I normally prefer to keep my entities/models (or whatever you want to call them) very simple and keep comparison implementation details away from them. In that case, you can implement an IEqualityComparer<TSource> and use the overload of Intersect that takes an IEqualityComparer.
Here's an example implementation of IEqualityComparer based only on the Name property:
public class ClientNameEqualityComparer : IEqualityComparer<Clients>
{
public bool Equals(Clients c1, Clients c2)
{
if (c2 == null && c1 == null)
return true;
else if (c1 == null || c2 == null)
return false;
else if(c1.Name == c2.Name)
return true;
else
return false;
}
public int GetHashCode(Clients c)
{
return c?.Name?.GetHashCode() ?? 0;
}
}
Basically, the implementation above only cares about the Name property, if two instances of Clients have the same value for the Name property, then they are considered equal.
Now you can do the following:
A.Intersect(B, new ClientNameEqualityComparer()).ToList();
And that will produce the results you are expecting...
Intersect uses GetHashCode and Equals by default, but you haven't overridden them, so the implementations inherited from Object are used, and those just compare references. Since all your client instances are created with new, they are separate instances even if they have equal values. That's why Intersect "thinks" that there are no common clients.
So you have several options.
implement a custom IEqualityComparer<Clients> and pass that to Intersect(or many other LINQ methods). This has the advantage that you could implement different comparer for different requirements and you don't need to modify the original class
let Clients override Equals and GetHashCode and /or
let Clients implement IEquatable<Clients>
For example (showing the last two, because the other answer already showed IEqualityComparer<T>):
public class Clients : IEquatable<Clients>
{
public string Email { get; set; }
public string Name { get; set; }
public Clients(string e, string n)
{
Email = e;
Name = n;
}
public override bool Equals(object obj)
{
return obj is Clients && this.Equals((Clients)obj);
}
public bool Equals(Clients other)
{
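// Check other for null explicitly; otherwise an instance whose Email and
// Name are both null would compare equal to null
return other != null
&& Email == other.Email
&& Name == other.Name;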
}
public override int GetHashCode()
{
unchecked
{
int hash = 17;
hash = hash * 23 + (Email?.GetHashCode() ?? 0);
hash = hash * 23 + (Name?.GetHashCode() ?? 0);
return hash;
}
}
}
Worth reading:
Differences between IEquatable<T>, IEqualityComparer<T>, and overriding .Equals() when using LINQ on a custom object collection?
I have some classes that contain several fields. I need to compare them by value, i.e. two instances of a class are equal if their fields contain the same data. I have overridden the GetHashCode and Equals methods for that.
It can happen that these classes contain circular references.
Example: We want to model institutions (like government, sports clubs, whatever). An institution has a name. A Club is an institution that has a name and a list of members. Each member is a Person that has a name and a favourite institution. If a member of a certain club has this club as his favourite institution, we have a circular reference.
But circular references, in conjunction with value equality, lead to infinite recursion. Here is a code example:
interface IInstitution { string Name { get; } }
class Club : IInstitution
{
public string Name { get; set; }
public HashSet<Person> Members { get; set; }
public override int GetHashCode() { return Name.GetHashCode() + Members.Count; }
public override bool Equals(object obj)
{
Club other = obj as Club;
if (other == null)
return false;
return Name.Equals(other.Name) && Members.SetEquals(other.Members);
}
}
class Person
{
public string Name { get; set; }
public IInstitution FavouriteInstitution { get; set; }
public override int GetHashCode() { return Name.GetHashCode(); }
public override bool Equals(object obj)
{
Person other = obj as Person;
if (other == null)
return false;
return Name.Equals(other.Name)
&& FavouriteInstitution.Equals(other.FavouriteInstitution);
}
}
class Program
{
public static void Main()
{
Club c1 = new Club { Name = "myClub", Members = new HashSet<Person>() };
Person p1 = new Person { Name = "Johnny", FavouriteInstitution = c1 };
c1.Members.Add(p1);
Club c2 = new Club { Name = "myClub", Members = new HashSet<Person>() };
Person p2 = new Person { Name = "Johnny", FavouriteInstitution = c2 };
c2.Members.Add(p2);
bool c1_and_c2_equal = c1.Equals(c2); // StackOverflowException!
// c1.Equals(c2) calls Members.SetEquals(other.Members)
// Members.SetEquals(other.Members) calls p1.Equals(p2)
// p1.Equals(p2) calls c1.Equals(c2)
}
}
c1_and_c2_equal should return true, and in fact we (humans) can see that they are value-equal with a little bit of thinking, without running into infinite recursion. However, I can't really say how we figure that out. But since it is possible, I hope that there is a way to resolve this problem in code as well!
So the question is: How can I check for value equality without running into infinite recursions?
Note that I need to resolve circular references in general, not only the case from above. I'll call it a 2-circle since c1 references p1, and p1 references c1. There can be other n-circles, e.g. if a club A has a member M whose favourite is club B which has member N whose favourite club is A. That would be a 4-circle. Other object models might also allow n-circles with odd numbers n. I am looking for a way to resolve all these problems at once, since I won't know in advance which value n can have.
An easy workaround (used in RDBMS) is to use a unique Id to identify a Person (or any other type). Then you don't need to compare every other property and you never run into such circular references.
Another way is to compare differently in Equals: do the deep check only for the type that Equals belongs to, not for the types it references. You could use a custom comparer:
public class PersonNameComparer : IEqualityComparer<Person>
{
public bool Equals(Person x, Person y)
{
if (x == null && y == null) return true;
if (x == null || y == null) return false;
if(object.ReferenceEquals(x, y)) return true;
return x.Name == y.Name;
}
public int GetHashCode(Person obj)
{
return obj?.Name?.GetHashCode() ?? int.MinValue;
}
}
Now you can change the Equals implementation of Club so that the Members (Persons) are compared only by their Name, instead of by their deep check, which includes the institution:
public override bool Equals(object obj)
{
if (Object.ReferenceEquals(this, obj))
return true;
Club other = obj as Club;
if (other == null)
return false;
var personNameComparer = new PersonNameComparer();
return Name.Equals(other.Name)
&& Members.Count == other.Members.Count
&& !Members.Except(other.Members, personNameComparer).Any();
}
Note that I can't call SetEquals on Members directly here because it has no overload that takes a custom comparer.
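SetEquals has no comparer parameter, but a HashSet<T> always uses the comparer it was constructed with. One possible workaround, sketched here with hypothetical names (Member, MemberNameComparer, SameMembersByName), is to copy one side into a HashSet built with the custom comparer and call SetEquals on that copy:

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for Person, reduced to the Name property.
public class Member
{
    public string Name { get; set; }
}

public sealed class MemberNameComparer : IEqualityComparer<Member>
{
    public bool Equals(Member x, Member y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name;
    }

    public int GetHashCode(Member obj)
    {
        return obj?.Name?.GetHashCode() ?? 0;
    }
}

public static class SetEqualsDemo
{
    // True when both sequences contain the same names, ignoring order and duplicates.
    public static bool SameMembersByName(IEnumerable<Member> left, IEnumerable<Member> right)
    {
        // The copy uses MemberNameComparer for every lookup, including SetEquals.
        var byName = new HashSet<Member>(left, new MemberNameComparer());
        return byName.SetEquals(right);
    }
}
```

The cost is one extra set per comparison, so this trades some allocation for a shorter Equals body.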
Following the suggestion of Dryadwoods, I changed the Equals methods so that I can keep track of the items that were already compared.
First we need an equality comparer that checks reference equality for corresponding elements of pairs:
public class ValuePairRefEqualityComparer<T> : IEqualityComparer<(T,T)> where T : class
{
public static ValuePairRefEqualityComparer<T> Instance
= new ValuePairRefEqualityComparer<T>();
private ValuePairRefEqualityComparer() { }
public bool Equals((T,T) x, (T,T) y)
{
return ReferenceEquals(x.Item1, y.Item1)
&& ReferenceEquals(x.Item2, y.Item2);
}
public int GetHashCode((T,T) obj)
{
return RuntimeHelpers.GetHashCode(obj.Item1)
+ 2 * RuntimeHelpers.GetHashCode(obj.Item2);
}
}
And here is the modified Equals method of Club:
static HashSet<(Club,Club)> checkedPairs
= new HashSet<(Club,Club)>(ValuePairRefEqualityComparer<Club>.Instance);
public override bool Equals(object obj)
{
Club other = obj as Club;
if (other == null)
return false;
if (!Name.Equals(other.Name))
return false;
if (checkedPairs.Contains((this,other)) || checkedPairs.Contains((other,this)))
return true;
checkedPairs.Add((this,other));
bool membersEqual = Members.SetEquals(other.Members);
checkedPairs.Clear();
return membersEqual;
}
The version for Person is analogous. Note that I add (this,other) to checkedPairs and check if either (this,other) or (other,this) is contained because it might happen that after the first call of c1.Equals(c2), we end up with a call of c2.Equals(c1) instead of c1.Equals(c2). I am not sure if this actually happens, but since I can't see the implementation of SetEquals, I believe it is a possibility.
Since I am not happy with using a static field for the already checked pairs (it will not work if the program is concurrent!), I asked another question: make a variable last for a call stack.
For the general case that I am interested in
-- where we have classes C1, ..., Cn where each of these classes can have any number of VALUES (like int, string, ...) as well as any number of REFERENCES to any other classes of C1, ..., Cn (e.g. by having for each type Ci a field ICollection<Ci>) --
the question "Are two objects A and B equal?", in the sense of equality that I described here,
seems to be EQUIVALENT to
the question "For two finite, directed, connected, colored graphs G and H, does there exist an isomorphism from G to H?".
Here is the equivalence:
graph vertices correspond to objects (class instances)
graph edges correspond to references to objects
color corresponds to the conglomerate of values and the type itself (i.e. colors of two vertices are the same if their corresponding objects have the same type and the same values)
Graph isomorphism has no known polynomial-time algorithm in general (although it is not known to be NP-complete either), so I think I'm going to discard my plan to implement this and go with a circular-reference-free approach instead.
I have a little strange problem. I use Visual Studio and I am developing a project with C#.
I have two custom classes "Attr" and "FD" and I use lists that includes their objects e.g.
List<Attr> attrList = new List<Attr>();
List<FD> fdList = new List<FD>();
So when I try to find the intersection of two lists the result is not what I expect. To make it more simple I tried to Intersect similar Objects and the result is wrong again. What is going wrong?
This is the fd. It is an object of class FD.
This is the ff which is also an object of FD class.
As you can see, these objects contain exactly the same values.
The method GetLeft() returns a list that contains objects of class Attr.
So when I try to find the intersection between those two lists (fd.GetLeft() and ff.GetLeft() ) the result is nothing (it should be a list that contains an Attr object "A").
What did I miss?
P.S. These screenshots are from debug mode in Visual Studio.
In order to use Intersect I suggest implementing IEqualityComparer<T>, something like this :
public class FD
{
public string Name { get; set; }
}
static void Main()
{
List<FD> fdList1 = new List<FD>();
fdList1.Add(new FD { Name = "a" });
List<FD> fdList2 = new List<FD>();
fdList2.Add(new FD { Name = "a" });
IEnumerable<FD> fd = fdList1.Intersect<FD>(fdList2, new ComparerFd()).ToList();
}
And the ComparerFd should look like this:
public class ComparerFd : IEqualityComparer<FD>
{
public bool Equals(FD x, FD y)
{
return x.Name == y.Name;
}
public int GetHashCode(FD obj)
{
if(obj == null) return 0;
return obj.Name.GetHashCode();//Or whatever way to get hash code
}
}
If you created your own class, and did not override the Equals-method in that class, the Intersect-method will only compare the references of the objects, and not the properties.
Take the following, really simple class:
class MyClass
{
int Value { get; set; }
public MyClass(int value)
{
this.Value = value;
}
}
Now, create two lists, with both containing one object. The properties of the objects are the same, but the instances are not:
var list1 = new List<MyClass>
{
new MyClass(5)
};
var list2 = new List<MyClass>
{
new MyClass(5)
};
So the following will happen:
list1[0].Equals(list2[0]); // false
list1.Intersect(list2); // No matches
If you want these methods to compare the properties of your MyClass objects, implement IEquatable<MyClass> (and override GetHashCode to match), e.g. change the class signature to:
class MyClass : IEquatable<MyClass>
{
..
}
Alternatively, you can just override Equals and GetHashCode, as the default equality comparer will then call these methods. A separate IEqualityComparer<MyClass> class passed to Intersect is a third option.
See this answer on how to properly override Equals and GetHashCode.
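As a minimal sketch of a class that the default comparer treats by value: Measurement and DefaultComparerDemo are hypothetical names, and the class compares on a single int property only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical value-like class; equality is based on Value alone.
public sealed class Measurement : IEquatable<Measurement>
{
    public int Value { get; }

    public Measurement(int value)
    {
        Value = value;
    }

    public bool Equals(Measurement other)
    {
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Measurement);
    }

    // Must return the same code for objects that Equals treats as equal.
    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}

public static class DefaultComparerDemo
{
    public static int CountCommon()
    {
        var list1 = new List<Measurement> { new Measurement(5) };
        var list2 = new List<Measurement> { new Measurement(5) };

        // No comparer argument: Intersect falls back to the default comparer,
        // which uses the Equals and GetHashCode defined above.
        return list1.Intersect(list2).Count();
    }
}
```

With these members in place, two separate instances holding the same value count as one match.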
I have a List<MyClass> someList.
class MyClass
{
public int Prop1...
public int Prop2...
public int Prop3...
}
I would like to know how to get a new distinct List<MyClass> distinctList from List<MyClass> someList, but only comparing it to Prop2.
You can emulate the effect of DistinctBy using GroupBy and then just using the first entry in each group. It might be a bit slower than the other implementations, though.
someList.GroupBy(elem=>elem.Prop2).Select(group=>group.First());
Unfortunately there's no really easy built-in support for this in the framework - but you can use the DistinctBy implementation I have in MoreLINQ.
You'd use:
var distinctList = someList.DistinctBy(x => x.Prop2).ToList();
(You can take just the DistinctBy implementation. If you'd rather use a Microsoft implementation, I believe there's something similar in the System.Interactive assembly of Reactive Extensions.)
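For readers who would rather not add a dependency, the idea can be sketched in a few lines. This is not MoreLINQ's actual code; DistinctByKey, Row and DistinctByKeyDemo are hypothetical names, and the helper simply remembers which keys it has already seen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByKeyExtensions
{
    // Yields the first item seen for each distinct key, in source order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (var item in source)
        {
            // Add returns false when the key is already in the set.
            if (seen.Add(keySelector(item)))
            {
                yield return item;
            }
        }
    }
}

// Hypothetical stand-in for MyClass.
public class Row
{
    public int Prop1 { get; set; }
    public int Prop2 { get; set; }
}

public static class DistinctByKeyDemo
{
    // Returns the Prop1 values of the rows that survive deduplication on Prop2.
    public static List<int> Run()
    {
        var someList = new List<Row>
        {
            new Row { Prop1 = 1, Prop2 = 10 },
            new Row { Prop1 = 2, Prop2 = 10 },
            new Row { Prop1 = 3, Prop2 = 20 }
        };
        return someList.DistinctByKey(r => r.Prop2).Select(r => r.Prop1).ToList();
    }
}
```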
You need to use the .Distinct(..) extension method with a custom comparer.
Here's a quick sample:
public class Comparer : IEqualityComparer<Point>
{
public bool Equals(Point x, Point y)
{
return x.X == y.X;
}
public int GetHashCode(Point obj)
{
return (int)obj.X;
}
}
Do not forget about GetHashCode.
Usage:
List<Point> p = new List<Point>();
// add items
p.Distinct(new Comparer());
Override Equals(object obj) and GetHashCode() methods:
class MyClass
{
public int Prop1 { get; set; }
public int Prop2 { get; set; }
public int Prop3 { get; set; }
public override bool Equals(object obj)
{
return ((MyClass)obj).Prop2 == Prop2;
}
public override int GetHashCode()
{
return Prop2.GetHashCode();
}
}
and then just call:
List<MyClass> distinctList = someList.Distinct().ToList();
Since the introduction of value tuples, if you want a LINQ equivalent to SQL's DISTINCT
items.GroupBy(item => (item.prop1, item.prop2, ...)).Select(group => group.First())
If you would like to make your list distinct by multiple fields, you have to implement the IEqualityComparer interface:
public class MyComparer : IEqualityComparer<MyModel>
{
public bool Equals(MyModel x, MyModel y)
{
// compare multiple fields
return
x.Field1 == y.Field1 &&
x.Field2 == y.Field2 &&
x.Field3 == y.Field3 ;
}
public int GetHashCode(MyModel obj)
{
return
obj.Field1.GetHashCode() +
obj.Field2.GetHashCode() +
obj.Field3.GetHashCode();
}
}
Then use the comparer to distinct your list:
var distinctedList = myList.Distinct(new MyComparer()).ToList();
I know it's been a while, but I needed the simplest answer and at this time (with .NET 4.5.1) I found the following to be the most straight-forward answer I could get to:
IEnumerable<long> allIds = waitingFiles.Values.Select(wf => wf.groupId).Distinct();
My situation is that I have a ConcurrentDictionary that looks something like:
ConcurrentDictionary<long, FileModel>
The ConcurrentDictionary Values property is basically my List<FileModel>.
*FileModel has a groupId that isn't necessarily unique (though, obviously the key (long) that I use to add the FileModel object into the dictionary is unique to the FileModel).
*Named for clarity in the example.
The point is that I have a large number of FileModels (imagine 100) in the ConcurrentDictionary and within those 100 FileModels there are 5 different groupIds.
At this point I just need a list of the distinct groupId.
So, again if I just had a list of FileModel the code would look like the following:
IEnumerable<long> allIds = allFileModel.Select(fm => fm.groupId).Distinct();