C# Linq, Searching for same items in two lists - c#

We have the following setup:
We have an array of objects, each holding a string (XML-ish but not normalized), and a list/array of ID strings.
We need to find out whether any string from the ID list is also present in one of the objects.
Here is what we have tried:
public class Wrapper
{
public string MyProperty { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<Wrapper> wrappers = new List<Wrapper>()
{
new Wrapper{ MyProperty = "<flkds,dlsklkdlsqkdkqslkdlqk><id>3</id><sqjldkjlfdskjlkfjsdklfj>"},
new Wrapper{ MyProperty = "<flkds,dlsklkdlsqkdkqslkdlqk><id>2</id><sqjldkjlfdskjlkfjsdklfj>"}
};
string[] ids = { "<id>0</id>", "<id>1</id>", "<id>2</id>" };
var props = wrappers.Select(w => w.MyProperty);
var intersect = props.Intersect(ids, new MyEquilityTester());
Debugger.Break();
}
}
class MyEquilityTester: IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Contains(y);
}
public int GetHashCode(string obj)
{
return obj.GetHashCode();
}
}
Edit:
What we expect is that calling .Any() on intersect returns true, because wrappers has an object with a property that contains <id>2</id>. Instead, intersect comes back empty.
If we are using the wrong method, please say so. It should work as fast as possible. A simple true when found will do!

For your case, you could write your IEqualityComparer like this:
class MyEquilityTester: IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Contains(y) || y.Contains(x);
}
public int GetHashCode(string obj)
{
return 0;
}
}
and it will find
<flkds,dlsklkdlsqkdkqslkdlqk><id>2</id><sqjldkjlfdskjlkfjsdklfj>
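This works because GetHashCode always returns 0, so every pair of strings ends up in the same hash bucket and is compared with Equals, where the x.Contains(y) || y.Contains(x) check does the matching. The price is that the hash lookup degrades to a linear scan.
Another, less hacky solution is to use Where in combination with Any: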
IEnumerable<String> intersect = props.Where(p => ids.Any (i => p.Contains(i)));
or replace the Where with another Any if you don't care about the actual items and you only want a true or false.
bool intersect = props.Any(p => ids.Any (i => p.Contains(i)));
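As a rough sketch of the Any/Any variant, here is a small self-contained helper. The names IdLookup and ContainsAnyId are made up for illustration and are not part of the original question; the only assumption is that a plain ordinal substring match is what is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not from the original post.
public static class IdLookup
{
    // Returns true as soon as any text contains any of the given ids.
    // string.Contains(string) is an ordinal (case-sensitive) comparison.
    public static bool ContainsAnyId(IEnumerable<string> texts, IReadOnlyCollection<string> ids)
    {
        return texts.Any(text => ids.Any(id => text.Contains(id)));
    }
}
```

With the data from the question, IdLookup.ContainsAnyId(wrappers.Select(w => w.MyProperty), ids) should stop at the first wrapper whose string contains one of the ids, so nothing beyond that point is scanned.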

wrappers.Where(w=>ids.Any(i=>w.MyProperty.Contains(i)))

Related

Intersection of List of List

I have a list of lists which looks like the following
public class FilteredVM
{
public int ID { get; set; }
public string Name { get; set; }
public string Number { get; set; }
}
List<List<FilteredVM>> groupedExpressionResults = new List<List<FilteredVM>>();
I would like to intersect the lists within this list based on their IDs. What's the best way to tackle this?
Here's an optimized extension method:
public static HashSet<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> series, IEqualityComparer<T> equalityComparer = null)
{
if (series == null)
throw new ArgumentNullException("series");
HashSet<T> set = null;
foreach (var values in series)
{
if (set == null)
set = new HashSet<T>(values, equalityComparer ?? EqualityComparer<T>.Default);
else
set.IntersectWith(values);
}
return set ?? new HashSet<T>();
}
Use this with the following comparer:
public class FilteredVMComparer : IEqualityComparer<FilteredVM>
{
public static readonly FilteredVMComparer Instance = new FilteredVMComparer();
private FilteredVMComparer()
{
}
public bool Equals(FilteredVM x, FilteredVM y)
{
return x.ID == y.ID;
}
public int GetHashCode(FilteredVM obj)
{
return obj.ID;
}
}
Like this:
series.IntersectAll(FilteredVMComparer.Instance)
You could just write
series.Aggregate((a, b) => a.Intersect(b, FilteredVMComparer.Instance))
but it'd be wasteful because it'd have to construct multiple sets.
Intersect only works out of the box when the elements compare as equal, which won't be the case here because you haven't overridden GetHashCode and Equals; doing so is the most complete solution.
If you only want the elements that appear in both lists, the following will do.
Assuming list1 and list2 are of type List<FilteredVM>, the simplest way is:
var intersectByIDs = list1.Where(elem => list2.Any(elem2 => elem2.ID == elem.ID));
If you are a fan of one-liner solutions you can use this:
List<FilteredVM> result = groupedExpressionResults.Aggregate((x, y) => x.Where(xi => y.Select(yi => yi.ID).Contains(xi.ID)).ToList());
And if you just want the IDs you can just add .Select(x => x.ID), like this:
var ids = groupedExpressionResults.Aggregate((x, y) => x.Where(xi => y.Select(yi => yi.ID).Contains(xi.ID)).ToList()).Select(x => x.ID);
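The Aggregate one-liners above rebuild a list and re-scan it at every step. A possible alternative, sketched here under the assumption that only the ID matters, is to intersect a HashSet<int> of IDs and then filter the first list once. IntersectById is a hypothetical name, and FilteredVM is repeated only to keep the snippet self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class FilteredVM
{
    public int ID { get; set; }
    public string Name { get; set; }
    public string Number { get; set; }
}

// Hypothetical helper, not from the original answers.
public static class FilteredVMExtensions
{
    // Returns the items of the first list whose ID occurs in every list.
    public static List<FilteredVM> IntersectById(this IEnumerable<IEnumerable<FilteredVM>> lists)
    {
        List<FilteredVM> first = null;
        HashSet<int> commonIds = null;
        foreach (var list in lists)
        {
            if (first == null)
            {
                // Remember the first list and seed the set with its IDs.
                first = list.ToList();
                commonIds = new HashSet<int>(first.Select(item => item.ID));
            }
            else
            {
                // Keep only the IDs that also occur in this list.
                commonIds.IntersectWith(list.Select(item => item.ID));
            }
        }
        if (first == null) return new List<FilteredVM>();
        return first.Where(item => commonIds.Contains(item.ID)).ToList();
    }
}
```

Usage would be groupedExpressionResults.IntersectById(). Note that if the first list contains the same ID twice, both items are kept.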

Get distinct list values

I have a C# application in which I'd like to get, from a List of Project objects, another List that contains only distinct objects.
I tried this:
List<Project> model = notre_admin.Get_List_Project_By_Expert(u.Id_user);
if (model != null) model = model.Distinct().ToList();
The list model still contains 4 identical Project objects.
What is the reason for this? How can I fix it?
You need to define "identical" here. I'm guessing you mean "have the same contents", but that is not the default definition for classes: the default definition is "are the same instance".
If you want "identical" to mean "have the same contents", you have two options:
write a custom comparer (IEqualityComparer<Project>) and supply that as a parameter to Distinct
override Equals and GetHashCode on Project
There are also custom methods like DistinctBy available in lots of places, which are useful if identity can be determined by a single property (Id, typically). It was not in the BCL when this was written (Enumerable.DistinctBy was added in .NET 6). For example:
if (model != null) model = model.DistinctBy(x => x.Id).ToList();
With, for example:
public static IEnumerable<TItem>
DistinctBy<TItem, TValue>(this IEnumerable<TItem> items,
Func<TItem, TValue> selector)
{
var uniques = new HashSet<TValue>();
foreach(var item in items)
{
if(uniques.Add(selector(item))) yield return item;
}
}
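For the first option (a custom comparer passed to Distinct), a minimal sketch could look like the following. The Project class here is a hypothetical stand-in with an Id property, since the real class isn't shown in the question, and ProjectIdComparer and DistinctById are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Project class from the question.
public class Project
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Treats two projects as identical when their Id matches.
public sealed class ProjectIdComparer : IEqualityComparer<Project>
{
    public bool Equals(Project x, Project y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    // Must agree with Equals: equal projects produce equal hash codes.
    public int GetHashCode(Project obj) => obj is null ? 0 : obj.Id.GetHashCode();
}

public static class ProjectListExtensions
{
    public static List<Project> DistinctById(this IEnumerable<Project> projects)
        => projects.Distinct(new ProjectIdComparer()).ToList();
}
```

With this in place, model = model.DistinctById(); keeps the first project seen for each Id.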
var newList =
(
from x in model
select new {Id_user= x.Id_user}
).Distinct();
or you can write like this
var list1 = model.DistinctBy(x=> x.Id_user);
How do you define identical? You should override Equals in Project with this definition (if you override Equals also override GetHashCode). For example:
public class Project
{
public int ProjectID { get; set; }
public override bool Equals(object obj)
{
var p2 = obj as Project;
if (p2 == null) return false;
return this.ProjectID == p2.ProjectID;
}
public override int GetHashCode()
{
return ProjectID;
}
}
Otherwise you are just checking reference equality.
The objects' references aren't equal. If you want to be able to do that on the entire object itself and not just a property, you have to implement IEqualityComparer<T> or IEquatable<T>.
Check this example: you need to either use a comparer or override Equals()
class Program
{
static void Main( string[] args )
{
List<Item> items = new List<Item>();
items.Add( new Item( "A" ) );
items.Add( new Item( "A" ) );
items.Add( new Item( "B" ) );
items.Add( new Item( "C" ) );
items = items.Distinct().ToList();
}
}
public class Item
{
string Name { get; set; }
public Item( string name )
{
Name = name;
}
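public override bool Equals( object obj )
{
// Guard against null and non-Item arguments instead of throwing.
var other = obj as Item;
return other != null && Name.Equals(other.Name);
}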
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
Here's an answer from basically the same question that will help.
Explanation:
The Distinct() method uses the default equality comparer, which for reference types that do not override Equals means reference equality. It is therefore looking for literally the same object duplicated, not different objects which contain the same values.
Credits to @Rex M.
Isn't it simpler to use one of the approaches shown below? :)
You can just group your domain objects by some key and select FirstOrDefault, as shown below.
A more interesting option is to create a comparer adapter that takes your domain object and projects it to a value the comparer can work with out of the box. Based on that comparer you can create custom LINQ extensions like in the sample below. Hope it helps :)
[TestMethod]
public void CustomDistinctTest()
{
// Generate some sample of domain objects
var listOfDomainObjects = Enumerable
.Range(10, 10)
.SelectMany(x =>
Enumerable
.Range(15, 10)
.Select(y => new SomeClass { SomeText = x.ToString(), SomeInt = x + y }))
.ToList();
var uniqueStringsByUsingGroupBy = listOfDomainObjects
.GroupBy(x => x.SomeText)
.Select(x => x.FirstOrDefault())
.ToList();
var uniqueStringsByCustomExtension = listOfDomainObjects.DistinctBy(x => x.SomeText).ToList();
var uniqueIntsByCustomExtension = listOfDomainObjects.DistinctBy(x => x.SomeInt).ToList();
var uniqueStrings = listOfDomainObjects
.Distinct(new EqualityComparerAdapter<SomeClass, string>(x => x.SomeText))
.OrderBy(x=>x.SomeText)
.ToList();
var uniqueInts = listOfDomainObjects
.Distinct(new EqualityComparerAdapter<SomeClass, int>(x => x.SomeInt))
.OrderBy(x => x.SomeInt)
.ToList();
}
Custom comparer adapter:
public class EqualityComparerAdapter<T, V> : EqualityComparer<T>
where V : IEquatable<V>
{
private Func<T, V> _valueAdapter;
public EqualityComparerAdapter(Func<T, V> valueAdapter)
{
_valueAdapter = valueAdapter;
}
public override bool Equals(T x, T y)
{
return _valueAdapter(x).Equals(_valueAdapter(y));
}
public override int GetHashCode(T obj)
{
return _valueAdapter(obj).GetHashCode();
}
}
Custom linq extension (definition of DistinctBy extension method):
// Embed this class in a suitable custom namespace
public static class DistByExt
{
public static IEnumerable<T> DistinctBy<T,V>(this IEnumerable<T> enumerator,Func<T,V> valueAdapter)
where V : IEquatable<V>
{
return enumerator.Distinct(new EqualityComparerAdapter<T, V>(valueAdapter));
}
}
Definition of domain object used in test case:
public class SomeClass
{
public string SomeText { get; set; }
public int SomeInt { get; set; }
}
List<ViewClReceive> passData = (List<ViewClReceive>)TempData["passData_Select_BankName_List"];
passData = passData?.DistinctBy(b=>b.BankNm).ToList();
This will work.

Grouping by IEnumerable<string> does not work at all

I'm not really sure, why grouping by IEnumerable<string> does not work. I provide custom IEqualityComparer, of course.
public class StringCollectionEqualityComparer : EqualityComparer<IEnumerable<string>>
{
public override bool Equals(IEnumerable<string> x, IEnumerable<string> y)
{
if (Object.Equals(x, y) == true)
return true;
if (x == null) return y == null;
if (y == null) return x == null;
return x.SequenceEqual(y, StringComparer.OrdinalIgnoreCase);
}
public override int GetHashCode(IEnumerable<string> obj)
{
return obj.OrderBy(value => value, StringComparer.OrdinalIgnoreCase).Aggregate(0, (hashCode, value) => value == null ? hashCode : hashCode ^ value.GetHashCode() + 33);
}
}
class A
{
public IEnumerable<string> StringCollection { get; set; }
}
IEnumerable<A> collection = // collection of A
var grouping = collection.GroupBy(obj => obj.StringCollection, StringCollectionEqualityComparer.Default).ToList();
(ToList() is to force evaluation, I have breakpoints in StringCollectionEqualityComparer, but unfortunately, they're not invoked, as expected)
When I group collection in this dumb way, it actually works.
var grouping = collection.GroupBy(obj => String.Join("|", obj.StringCollection));
Unfortunately, obviously it is not something I want to use.
By not working, I mean the results are not the ones I expect (using dumb way, the results are correct).
StringCollectionEqualityComparer.Default is just another way to access EqualityComparer<IEnumerable<string>>.Default, because the static Default property is inherited from the base class. It returns the framework's default comparer, not an instance of your class, which is why your breakpoints are never hit. You need to create an instance of StringCollectionEqualityComparer instead, simply using new StringCollectionEqualityComparer().
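To make the difference visible, here is a small sketch. IgnoreCaseSequenceComparer and GroupingDemo are hypothetical names, and the comparer is a simplified variant of the one in the question: it hashes in sequence order with the same case-insensitive comparer that Equals uses, so the two methods agree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified variant of the comparer from the question.
public sealed class IgnoreCaseSequenceComparer : EqualityComparer<IEnumerable<string>>
{
    public override bool Equals(IEnumerable<string> x, IEnumerable<string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y, StringComparer.OrdinalIgnoreCase);
    }

    public override int GetHashCode(IEnumerable<string> obj)
    {
        if (obj == null) return 0;
        int hash = 17;
        foreach (var value in obj)
        {
            // Case-insensitive hash, so it agrees with Equals above.
            int valueHash = value == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(value);
            hash = unchecked(hash * 31 + valueHash);
        }
        return hash;
    }
}

public static class GroupingDemo
{
    // Counts the groups produced when grouping the keys with the given comparer.
    public static int CountGroups(IEnumerable<IEnumerable<string>> keys,
                                  IEqualityComparer<IEnumerable<string>> comparer)
        => keys.GroupBy(key => key, comparer).Count();
}
```

Passing new IgnoreCaseSequenceComparer() groups { "a", "b" } and { "A", "B" } together, whereas passing IgnoreCaseSequenceComparer.Default silently falls back to the default comparer and keeps every array in its own group.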

How to compare, in C#, two lists of objects on one or more properties of these objects?

First of all I must say I'm not a seasoned programmer. I looked at similar problems on StackOverflow but didn't seem to find a suitable answer that I can implement with my limited skills.
In C#, I need to compare two lists of objects based on the values of one or more properties in those objects. I want to create two new lists, one of the objects that exist in the left, but have differences in some property values in, or don't exist at all in the right list and vice versa.
Before I only had to compare the two based on one value, so I did not have to work on objects but on string, so I did something like this:
(LeftItems and RightItems are Entities)
List<String> leftList = new List<string>();
List<String> rightList = new List<string>();
List<String> leftResultList = new List<string>();
List<String> rightResultList = new List<string>();
List<String> leftResultObjList = new List<string>();
List<String> rightResultObjList = new List<string>();
foreach (item i in leftItems)
{
leftList.Add(i.value);
}
//same for right
foreach (string i in leftList)
{
if(!rightList.Contains(i))
{
leftResultList.Add(i);
}
}
//same for the right list
Now I have to compare on more than one value, so I created a class which has several properties that I need to compare, so I'd like to do the same as the above, but with object properties:
class CompItem
{
string _x;
string _y;
public CompItem(string x, string y)
{
_x = x;
_y = y;
}
}
foreach (item i in leftItems)
{
leftList.Add(new CompItem(i.value1,i.value2));
}
//same for the right list
foreach (CompItem c in leftItems)
{
// Here is where things go wrong
if(one property of object in rightItems equals property of object in leftItems) && some other comparisons
{
resultLeftObjList.Add(c)
}
}
//And the same for the right list
You can make your class implement IComparable and do the comparison based on the properties you want, like the following:
class Employee : IComparable
{
private string name;
public string Name
{
get { return name; }
set { name = value ; }
}
public Employee( string a_name)
{
name = a_name;
}
#region IComparable Members
public int CompareTo( object obj)
{
Employee temp = (Employee)obj;
// Order by name length: negative, zero or positive, as CompareTo requires.
return this.name.Length.CompareTo(temp.name.Length);
}
#endregion
}
You can find the details of this solution here
The easiest and most OOP approach in this case, IMO, is a simple implementation of the IComparable interface on both of your types, followed by a call to CompareTo.
Hope this helps.
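For example, override Equals (and GetHashCode along with it):
public class Coordinates
{
public Coordinates(string x, string y)
{
X = x;
Y = y;
}
public string X { get; private set; }
public string Y { get; private set; }
public override bool Equals(object obj)
{
if (!(obj is Coordinates))
{
return false;
}
Coordinates coordinates = (Coordinates)obj;
return ((coordinates.X == this.X) && (coordinates.Y == this.Y));
}
// Equal objects must produce equal hash codes.
public override int GetHashCode()
{
return (X ?? "").GetHashCode() ^ (Y ?? "").GetHashCode();
}
}
And then call Equals on the list items; list methods such as Contains will use it too.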
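Once equality is defined on the object, the two result lists the question asks for can be produced with Except. The following is only a sketch: CompItem mirrors the class from the question but with public, hypothetical Value1/Value2 properties, and ListDiff.Compare is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical value-like class comparing on two properties.
public sealed class CompItem : IEquatable<CompItem>
{
    public CompItem(string value1, string value2)
    {
        Value1 = value1;
        Value2 = value2;
    }

    public string Value1 { get; }
    public string Value2 { get; }

    public bool Equals(CompItem other)
        => !(other is null) && Value1 == other.Value1 && Value2 == other.Value2;

    public override bool Equals(object obj) => Equals(obj as CompItem);

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (Value1 == null ? 0 : Value1.GetHashCode());
            hash = hash * 31 + (Value2 == null ? 0 : Value2.GetHashCode());
            return hash;
        }
    }
}

public static class ListDiff
{
    // Items that have no equal counterpart on the other side, for each side.
    public static (List<CompItem> LeftOnly, List<CompItem> RightOnly) Compare(
        IEnumerable<CompItem> left, IEnumerable<CompItem> right)
    {
        var leftList = left.ToList();
        var rightList = right.ToList();
        return (leftList.Except(rightList).ToList(), rightList.Except(leftList).ToList());
    }
}
```

Except is set-based, so repeated equal items on one side show up only once in the result; if duplicates matter, a Where with a HashSet lookup would be needed instead.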

How to use the IEqualityComparer

I have some bells in my database with the same number. I want to get all of them without duplication. I created a comparer class to do this work, but with Distinct the function takes far longer than without it: from 0.6 sec to 3.2 sec!
Am I doing it right or do I have to use another method?
reg.AddRange(
(from a in this.dataContext.reglements
join b in this.dataContext.Clients on a.Id_client equals b.Id
where a.date_v <= datefin && a.date_v >= datedeb
where a.Id_client == b.Id
orderby a.date_v descending
select new Class_reglement
{
nom = b.Nom,
code = b.code,
Numf = a.Numf,
})
.AsEnumerable()
.Distinct(new Compare())
.ToList());
class Compare : IEqualityComparer<Class_reglement>
{
public bool Equals(Class_reglement x, Class_reglement y)
{
if (x.Numf == y.Numf)
{
return true;
}
else { return false; }
}
public int GetHashCode(Class_reglement codeh)
{
return 0;
}
}
Your GetHashCode implementation always returns the same value. Distinct relies on a good hash function to work efficiently because it internally builds a hash table.
When implementing interfaces of classes it is important to read the documentation, to know which contract you’re supposed to implement. [1]
In your code, the solution is to forward GetHashCode to Class_reglement.Numf.GetHashCode and implement it appropriately there.
Apart from that, your Equals method is full of unnecessary code. It could be rewritten as follows (same semantics, ¼ of the code, more readable):
public bool Equals(Class_reglement x, Class_reglement y)
{
return x.Numf == y.Numf;
}
Lastly, the ToList call is unnecessary and time-consuming: AddRange accepts any IEnumerable so conversion to a List isn’t required. AsEnumerable is also redundant here since processing the result in AddRange will cause this anyway.
[1] Writing code without knowing what it actually does is called cargo cult programming. It’s a surprisingly widespread practice. It fundamentally doesn’t work.
Try this code:
public class GenericCompare<T> : IEqualityComparer<T> where T : class
{
private Func<T, object> _expr { get; set; }
public GenericCompare(Func<T, object> expr)
{
this._expr = expr;
}
public bool Equals(T x, T y)
{
var first = _expr.Invoke(x);
var sec = _expr.Invoke(y);
if (first != null && first.Equals(sec))
return true;
else
return false;
}
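public int GetHashCode(T obj)
{
// Hash the selected key, not the object itself, so that items Equals treats as equal share a hash code.
var key = _expr.Invoke(obj);
return key == null ? 0 : key.GetHashCode();
}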
}
Example of its use would be
collection = collection
.Except(ExistedDataEles, new GenericCompare<DataEle>(x=>x.Id))
.ToList();
If you want a generic solution that creates an IEqualityComparer for your class based on a property (which acts as a key) of that class have a look at this:
public class KeyBasedEqualityComparer<T, TKey> : IEqualityComparer<T>
{
private readonly Func<T, TKey> _keyGetter;
public KeyBasedEqualityComparer(Func<T, TKey> keyGetter)
{
if (default(T) == null)
{
_keyGetter = (x) => x == null ? default : keyGetter(x);
}
else
{
_keyGetter = keyGetter;
}
}
public bool Equals(T x, T y)
{
return EqualityComparer<TKey>.Default.Equals(_keyGetter(x), _keyGetter(y));
}
public int GetHashCode(T obj)
{
TKey key = _keyGetter(obj);
return key == null ? 0 : key.GetHashCode();
}
}
public static class KeyBasedEqualityComparer<T>
{
public static KeyBasedEqualityComparer<T, TKey> Create<TKey>(Func<T, TKey> keyGetter)
{
return new KeyBasedEqualityComparer<T, TKey>(keyGetter);
}
}
This gives better performance with structs, because there isn't any boxing.
Usage is like this:
IEqualityComparer<Class_reglement> equalityComparer =
KeyBasedEqualityComparer<Class_reglement>.Create(x => x.Numf);
Just code, with implementation of GetHashCode and NULL validation:
public class Class_reglementComparer : IEqualityComparer<Class_reglement>
{
public bool Equals(Class_reglement x, Class_reglement y)
{
if (x is null || y is null)
return false;
return x.Numf == y.Numf;
}
public int GetHashCode(Class_reglement product)
{
//Check whether the object is null
if (product is null) return 0;
//Get hash code for the Numf field if it is not null.
int hashNumf = product.Numf == null ? 0 : product.Numf.GetHashCode();
return hashNumf;
}
}
Example:
List of Class_reglement, distinct by Numf:
List<Class_reglement> distinctItems = items.Distinct(new Class_reglementComparer()).ToList();
The purpose of this answer is to improve on previous answers by:
making the lambda expression optional in the constructor, so that full object equality can be checked by default, not just equality on one of the properties;
operating on different kinds of classes, even complex types including sub-objects or nested lists, and not only on simple classes made up of primitive-type properties;
not taking possible list container differences into account.
Below you'll find a first simple code sample that works only on simple types (those composed only of primitive properties), and a second one that is more complete (for a wider range of classes and complex types).
Here is my two-pennies attempt:
public class GenericEqualityComparer<T> : IEqualityComparer<T> where T : class
{
private Func<T, object> _expr { get; set; }
public GenericEqualityComparer() => _expr = null;
public GenericEqualityComparer(Func<T, object> expr) => _expr = expr;
public bool Equals(T x, T y)
{
var first = _expr?.Invoke(x) ?? x;
var sec = _expr?.Invoke(y) ?? y;
if (first == null && sec == null)
return true;
if (first != null && first.Equals(sec))
return true;
var typeProperties = typeof(T).GetProperties();
foreach (var prop in typeProperties)
{
var firstPropVal = prop.GetValue(first, null);
var secPropVal = prop.GetValue(sec, null);
if (firstPropVal != null && !firstPropVal.Equals(secPropVal))
return false;
}
return true;
}
public int GetHashCode(T obj) =>
_expr?.Invoke(obj).GetHashCode() ?? obj.GetHashCode();
}
I know we can still optimize it (and maybe use recursion?).
But it works like a charm without much complexity, and on a wide range of classes. ;)
Edit: After a day, here is my $10 attempt:
First, in a separate static extension class, you'll need:
public static class CollectionExtensions
{
public static bool HasSameLengthThan<T>(this IEnumerable<T> list, IEnumerable<T> expected)
{
if (list.IsNullOrEmptyCollection() && expected.IsNullOrEmptyCollection())
return true;
if ((list.IsNullOrEmptyCollection() && !expected.IsNullOrEmptyCollection()) || (!list.IsNullOrEmptyCollection() && expected.IsNullOrEmptyCollection()))
return false;
return list.Count() == expected.Count();
}
/// <summary>
/// Used to find out if a collection is null or if it contains no elements.
/// </summary>
/// <typeparam name="T">Type of the collection's items.</typeparam>
/// <param name="list">Collection of items to test.</param>
/// <returns><c>true</c> if the collection is <c>null</c> or empty (without items), <c>false</c> otherwise.</returns>
public static bool IsNullOrEmptyCollection<T>(this IEnumerable<T> list) => list == null || !list.Any();
}
Then, here is the updated class that works on a wider range of classes:
public class GenericComparer<T> : IEqualityComparer<T> where T : class
{
private Func<T, object> _expr { get; set; }
public GenericComparer() => _expr = null;
public GenericComparer(Func<T, object> expr) => _expr = expr;
public bool Equals(T x, T y)
{
var first = _expr?.Invoke(x) ?? x;
var sec = _expr?.Invoke(y) ?? y;
if (ObjEquals(first, sec))
return true;
var typeProperties = typeof(T).GetProperties();
foreach (var prop in typeProperties)
{
var firstPropVal = prop.GetValue(first, null);
var secPropVal = prop.GetValue(sec, null);
if (!ObjEquals(firstPropVal, secPropVal))
{
var propType = prop.PropertyType;
if (IsEnumerableType(propType) && firstPropVal is IEnumerable && !ArrayEquals(firstPropVal, secPropVal))
return false;
if (propType.IsClass)
{
if (!DeepEqualsFromObj(firstPropVal, secPropVal, propType))
return false;
if (!DeepObjEquals(firstPropVal, secPropVal))
return false;
}
}
}
return true;
}
public int GetHashCode(T obj) =>
_expr?.Invoke(obj).GetHashCode() ?? obj.GetHashCode();
#region Private Helpers
private bool DeepObjEquals(object x, object y) =>
new GenericComparer<object>().Equals(x, y);
private bool DeepEquals<U>(U x, U y) where U : class =>
new GenericComparer<U>().Equals(x, y);
private bool DeepEqualsFromObj(object x, object y, Type type)
{
dynamic a = Convert.ChangeType(x, type);
dynamic b = Convert.ChangeType(y, type);
return DeepEquals(a, b);
}
private bool IsEnumerableType(Type type) =>
type.GetInterface(nameof(IEnumerable)) != null;
private bool ObjEquals(object x, object y)
{
if (x == null && y == null) return true;
return x != null && x.Equals(y);
}
private bool ArrayEquals(object x, object y)
{
var firstList = new List<object>((IEnumerable<object>)x);
var secList = new List<object>((IEnumerable<object>)y);
if (!firstList.HasSameLengthThan(secList))
return false;
var elementType = firstList?.FirstOrDefault()?.GetType();
int cpt = 0;
foreach (var e in firstList)
{
if (!DeepEqualsFromObj(e, secList[cpt++], elementType))
return false;
}
return true;
}
#endregion Private Helpers
}
We can still optimize it, but it's worth giving it a try ^^.
The inclusion of your comparison class (or more specifically the AsEnumerable call you needed to use to get it to work) meant that the sorting logic went from being based on the database server to being on the database client (your application). This meant that your client now needs to retrieve and then process a larger number of records, which will always be less efficient than performing the lookup on the database, where the appropriate indexes can be used.
You should try to develop a where clause that satisfies your requirements instead, see Using an IEqualityComparer with a LINQ to Entities Except clause for more details.
IEquatable<T> can be a much easier way to do this with modern frameworks.
You get a nice simple bool Equals(T other) function and there's no messing around with casting or creating a separate class.
public class Person : IEquatable<Person>
{
public Person(string name, string hometown)
{
this.Name = name;
this.Hometown = hometown;
}
public string Name { get; set; }
public string Hometown { get; set; }
// can't get much simpler than this!
public bool Equals(Person other)
{
return this.Name == other.Name && this.Hometown == other.Hometown;
}
public override int GetHashCode()
{
return Name.GetHashCode(); // see other links for hashcode guidance
}
}
Note you DO have to implement GetHashCode if using this in a dictionary or with something like Distinct.
PS. I don't think any custom Equals methods work with entity framework directly on the database side (I think you know this because you do AsEnumerable) but this is a much simpler method to do a simple Equals for the general case.
If things don't seem to be working (such as duplicate key errors when doing ToDictionary) put a breakpoint inside Equals to make sure it's being hit and make sure you have GetHashCode defined (with override keyword).
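As one possible way to fill in the hash code part, the sketch below combines both properties with HashCode.Combine and adds the null check and the Equals(object) override. This assumes a framework that ships System.HashCode (.NET Core 2.1 or later); on older frameworks a manual combination would be needed. The properties are made read-only here so the hash cannot change while the object sits in a set or dictionary.

```csharp
using System;
using System.Collections.Generic;

// Variation on the Person class above; a sketch, not the original author's code.
public sealed class Person : IEquatable<Person>
{
    public Person(string name, string hometown)
    {
        Name = name;
        Hometown = hometown;
    }

    public string Name { get; }
    public string Hometown { get; }

    public bool Equals(Person other)
        => !(other is null) && Name == other.Name && Hometown == other.Hometown;

    // Keeps the non-generic Equals consistent with the typed one.
    public override bool Equals(object obj) => Equals(obj as Person);

    // Uses the same properties as Equals, so equal persons hash alike.
    public override int GetHashCode() => HashCode.Combine(Name, Hometown);
}
```

With this, new HashSet<Person> { new Person("Ann", "Oslo"), new Person("Ann", "Oslo") } ends up holding a single element, and Distinct and ToDictionary behave the same way.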
