I know 2 ways to remove doubles from an array of objects that support explicit comparing:
Using HashSet constructor and
Using LINQ's Distinct().
How to remove doubles from array of structs, comparing array members by a single field only? In other words, how to write the predicate, that could be used by Distinct().
Regards,
Well, you could implement IEqualityComparer<T> to pick out that field and use that for equality testing and hashing... or you could use DistinctBy which is in MoreLINQ.
Of course, you don't have to take a dependency on MoreLINQ really - you can implement it very simply:
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector)
{
// TODO: Implement null argument checking :)
HashSet<TKey> keys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (knownKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
I would probably just loop:
var values = new HashSet<FieldType>();
var newList = new List<ItemType>();
foreach(var item in oldList) {
if(hash.Add(item.TheField)) newList.Add(item);
}
The LINQ answer has been published before. I am copying from Richard Szalay's answer here: Filtering duplicates out of an IEnumerable
public static class EnumerationExtensions
{
public static IEnumerable<TSource> Distinct<TSource,TKey>(
this IEnumerable<TSource> source, Func<TSource,TKey> keySelector)
{
KeyComparer comparer = new KeyComparer(keySelector);
return source.Distinct(comparer);
}
private class KeyComparer<TSource,TKey> : IEqualityComparer<TSource>
{
private Func<TSource,TKey> keySelector;
public DelegatedComparer(Func<TSource,TKey> keySelector)
{
this.keySelector = keySelector;
}
bool IEqualityComparer.Equals(TSource a, TSource b)
{
if (a == null && b == null) return true;
if (a == null || b == null) return false;
return keySelector(a) == keySelector(b);
}
int IEqualityComparer.GetHashCode(TSource obj)
{
return keySelector(obj).GetHashCode();
}
}
}
Which, as Richard says, is used like this:
var distinct = arr.Distinct(x => x.Name);
implement a custom IEqualityComparer<T>
public class MyStructComparer : IEqualityComparer<MyStruct>
{
public bool Equals(MyStruct x, MyStruct y)
{
return x.MyVal.Equals(y.MyVal);
}
public int GetHashCode(MyStruct obj)
{
return obj.MyVal.GetHashCode();
}
}
then
var distincts = myStructList.Distinct(new MyStructComparer());
Related
I'm trying to put non duplicated values in a lookup edit using the code below but the variable unique is always null, and I don't know where is the problem.
Any help please?
List<VueItemItemUnit> liste = ObjReservation.LoadAllFamilles();
var unique =
from element in liste
group element by element.FA_CODE into Group
where Group.Count() == 1
select Group.Key;
lookUpFamille.Properties.DataSource = unique;
Try this :
var unique = liste.Distinct(element => element.FA_CODE).Select(element => element.FA_CODE);
I suggest you use the following approach:
lookUpFamille.Properties.DataSource = list.DistinctBy(e => e.FA_CODE).ToList();
//...
// DistinctBy<T,TKey> extension
static class EnumerableHelper {
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> source, Func<T, TKey> keySelector) {
return source.Distinct(new EqualityComparer<T, TKey>(keySelector));
}
class EqualityComparer<T, TKey> : IEqualityComparer<T> {
readonly Func<T, TKey> keySelector;
public EqualityComparer(Func<T, TKey> keySelector) {
this.keySelector = keySelector;
}
bool IEqualityComparer<T>.Equals(T x, T y) {
return Equals(keySelector(x), keySelector(y));
}
int IEqualityComparer<T>.GetHashCode(T obj) {
return keySelector(obj).GetHashCode();
}
}
}
I am wondering what rules there are to tell if a portion of LINQ code is suffering from double enumeration?
For example, what are the telltale signs that the following code might be double enumerating? What are some other signs to look out for?
public static bool MyIsIncreasingMonotonicallyBy<T, TResult>(this IEnumerable<T> list, Func<T, TResult> selector)
where TResult : IComparable<TResult>
{
return list.Zip(list.Skip(1), (a, b) => selector(a).CompareTo(selector(b)) <= 0).All(b => b);
}
Pass in one of these:
public class OneTimeEnumerable<T> : IEnumerable<T>
{
public OneTimeEnumerable(IEnumerable<T> source)
{
_source = source;
}
private IEnumerable<T> _source;
private bool _wasEnumerated = false;
public IEnumerator<T> GetEnumerator()
{
if (_wasEnumerated)
{
throw new Exception("double enumeration occurred");
}
_wasEnumerated = true;
return _source.GetEnumerator();
}
}
I hate posting this since it's somewhat subjective, but it feels like there's a better method to do this that I'm just not thinking of.
Sometimes I want to 'distinct' a collection by certain columns/properties but without throwing away other columns (yes, this does lose information, as it becomes arbitrary which values of those other columns you'll end up with).
Note that this extension is less powerful than the Distinct overloads that take an IEqualityComparer<T> since such things could do much more complex comparison logic, but this is all I need for now :)
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> getKeyFunc)
{
return from s in source
group s by getKeyFunc(s) into sourceGroups
select sourceGroups.First();
}
Example usage:
var items = new[]
{
new { A = 1, B = "foo", C = Guid.NewGuid(), },
new { A = 2, B = "foo", C = Guid.NewGuid(), },
new { A = 1, B = "bar", C = Guid.NewGuid(), },
new { A = 2, B = "bar", C = Guid.NewGuid(), },
};
var itemsByA = items.DistinctBy(item => item.A).ToList();
var itemsByB = items.DistinctBy(item => item.B).ToList();
Here you go. I don't think this is massively more efficient than your own version, but it should have a slight edge. It only requires a single pass through the sequence, yielding each item as it goes, rather than needing to group the entire sequence first.
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
return source.DistinctBy(keySelector, null);
}
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> keyComparer)
{
if (source == null)
throw new ArgumentNullException("source");
if (keySelector == null)
throw new ArgumentNullException("keySelector");
return source.DistinctByIterator(keySelector, keyComparer);
}
private static IEnumerable<TSource> DistinctByIterator<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> keyComparer)
{
var keys = new HashSet<TKey>(keyComparer);
foreach (TSource item in source)
{
if (keys.Add(keySelector(item)))
yield return item;
}
}
I've previously written a generic Func => IEqualityComparer utility class just for the purpose of being able to call overloads of LINQ methods that accept an IEqualityComparer with having to write a custom class each time.
It uses a delegate (just like your example) to supply the comparison semantics. This allows me to use the built-in implementations of the library methods rather than rolling my own - which I presume are more likely to be correct and efficiently implemented.
public static class ComparerExt
{
private class GenericEqualityComparer<T> : IEqualityComparer<T>
{
private readonly Func<T, T, bool> m_CompareFunc;
public GenericEqualityComparer( Func<T,T,bool> compareFunc ) {
m_CompareFunc = compareFunc;
}
public bool Equals(T x, T y) {
return m_CompareFunc(x, y);
}
public int GetHashCode(T obj) {
return obj.GetHashCode(); // don't override hashing semantics
}
}
public static IComparer<T> Compare<T>( Func<T,T,bool> compareFunc ) {
return new GenericEqualityComparer<T>(compareFunc);
}
}
You can use this as so:
var result = list.Distinct( ComparerExt.Compare( (a,b) => { /*whatever*/ } );
I also often throw in a Reverse() method to allow for changing the ordering of operands in the comparison, like so:
private class GenericComparer<T> : IComparer<T>
{
private readonly Func<T, T, int> m_CompareFunc;
public GenericComparer( Func<T,T,int> compareFunc ) {
m_CompareFunc = compareFunc;
}
public int Compare(T x, T y) {
return m_CompareFunc(x, y);
}
}
public static IComparer<T> Reverse<T>( this IComparer<T> comparer )
{
return new GenericComparer<T>((a, b) => comparer.Compare(b, a));
}
My Code looks like this :
Collection<NameValueCollection> optionInfoCollection = ....
List<NameValueCollection> optionInfoList = new List<NameValueCollection>();
optionInfoList = optionInfoCollection.ToList();
if(_isAlphabeticalSoting)
Sort optionInfoList
I tried optionInfoList.Sort() but it is not working.
Using the sort method and lambda expressions, it is really easy.
myList.Sort((a, b) => String.Compare(a.Name, b.Name))
The above example shows how to sort by the Name property of your object type, assuming Name is of type string.
If you just want Sort() to work, then you'll need to implement IComparable or IComparable<T> in the class.
If you don't mind creating a new list, you can use the OrderBy/ToList LINQ extension methods. If you want to sort the existing list with simpler syntax, you can add a few extension methods, enabling:
list.Sort(item => item.Name);
For example:
public static void Sort<TSource, TValue>(
this List<TSource> source,
Func<TSource, TValue> selector)
{
var comparer = Comparer<TValue>.Default;
source.Sort((x, y) => comparer.Compare(selector(x), selector(y)));
}
public static void SortDescending<TSource, TValue>(
this List<TSource> source,
Func<TSource, TValue> selector)
{
var comparer = Comparer<TValue>.Default;
source.Sort((x, y) => comparer.Compare(selector(y), selector(x)));
}
public class Person {
public string FirstName { get; set; }
public string LastName { get; set; }
}
List<Person> people = new List<Person>();
people.Sort(
delegate(Person x, Person y) {
if (x == null) {
if (y == null) { return 0; }
return -1;
}
if (y == null) { return 0; }
return x.FirstName.CompareTo(y.FirstName);
}
);
You need to set up a comparer that tells Sort() how to arrange the items.
Check out List.Sort Method (IComparer) for an example of how to do this...
F# has a bunch of standard sequence operators I have come to know and love from my experience with Mathematica. F# is getting lots of my attention now, and when it is in general release, I intend to use it frequently.
Right now, since F# isn't yet in general release, I can't really use it in production code. LINQ implements some of these operators using SQL-like names (e.g. 'select' is 'map', and 'where' is 'filter'), but I can find no implementation of 'fold', 'iter' or 'partition'.
Has anyone seen any C# implementation of standard sequence operators? Is this something someone should write?
If you look carefully, many Seq operations have a LINQ equivalent or can be easily derived. Just looking down the list...
Seq.append = Concat<TSource>(IEnumerable<TSource> second)
Seq.concat = SelectMany<IEnumerable<TSource>, TResult>(s => s)
Seq.distinct_by = GroupBy(keySelector).Select(g => g.First())
Seq.exists = Any<TSource>(Func<TSource, bool> predicate)
Seq.mapi = Select<TSource, TResult>(Func<TSource, Int32, TResult> selector)
Seq.fold = Aggregate<TSource, TAccumulate>(TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> func)
List.partition is defined like this:
Split the collection into two collections, containing the elements for which the given predicate returns true and false respectively
Which we can implement using GroupBy and a two-element array as a poor-man's tuple:
public static IEnumerable<TSource>[] Partition<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
return source.GroupBy(predicate).OrderByDescending(g => g.Key).ToArray();
}
Element 0 holds the true values; 1 holds the false values. GroupBy is essentially Partition on steroids.
And finally, Seq.iter and Seq.iteri map easily to foreach:
public static void Iter<TSource>(this IEnumerable<TSource> source, Action<TSource> action)
{
foreach (var item in source)
action(item);
}
public static void IterI<TSource>(this IEnumerable<TSource> source, Action<Int32, TSource> action)
{
int i = 0;
foreach (var item in source)
action(i++, item);
}
fold = Aggregate
Tell use what iter and partition do, and we might fill in the blanks. I'm guessing iter=SelectMany and partition might involve Skip/Take?
(update) I looked up Partition - here's a crude implementation that does some of it:
using System;
using System.Collections.Generic;
static class Program { // formatted for space
// usage
static void Main() {
int[] data = { 1, 2, 3, 4, 5, 6 };
var qry = data.Partition(2);
foreach (var grp in qry) {
Console.WriteLine("---");
foreach (var item in grp) {
Console.WriteLine(item);
}
}
}
static IEnumerable<IEnumerable<T>> Partition<T>(
this IEnumerable<T> source, int size) {
int count = 0;
T[] group = null; // use arrays as buffer
foreach (T item in source) {
if (group == null) group = new T[size];
group[count++] = item;
if (count == size) {
yield return group;
group = null;
count = 0;
}
}
if (count > 0) {
Array.Resize(ref group, count);
yield return group;
}
}
}
iter exists as a method in the List class which is ForEach
otherwise :
public static void iter<T>(this IEnumerable<T> source, Action<T> act)
{
foreach (var item in source)
{
act(item);
}
}
ToLookup would probably be a better match for List.partition:
IEnumerable<T> sequence = SomeSequence();
ILookup<bool, T> lookup = sequence.ToLookup(x => SomeCondition(x));
IEnumerable<T> trueValues = lookup[true];
IEnumerable<T> falseValues = lookup[false];
Rolling your own in C# is an interesting exercise, here are a few of mine. (See also here)
Note that iter/foreach on an IEnumerable is slightly controversial - I think because you have to 'finalise' (or whatever the word is) the IEnumerable in order for anything to actually happen.
//mimic fsharp map function (it's select in c#)
public static IEnumerable<TResult> Map<T, TResult>(this IEnumerable<T> input, Func<T, TResult> func)
{
foreach (T val in input)
yield return func(val);
}
//mimic fsharp mapi function (doens't exist in C#, I think)
public static IEnumerable<TResult> MapI<T, TResult>(this IEnumerable<T> input, Func<int, T, TResult> func)
{
int i = 0;
foreach (T val in input)
{
yield return func(i, val);
i++;
}
}
//mimic fsharp fold function (it's Aggregate in c#)
public static TResult Fold<T, TResult>(this IEnumerable<T> input, Func<T, TResult, TResult> func, TResult seed)
{
TResult ret = seed;
foreach (T val in input)
ret = func(val, ret);
return ret;
}
//mimic fsharp foldi function (doens't exist in C#, I think)
public static TResult FoldI<T, TResult>(this IEnumerable<T> input, Func<int, T, TResult, TResult> func, TResult seed)
{
int i = 0;
TResult ret = seed;
foreach (T val in input)
{
ret = func(i, val, ret);
i++;
}
return ret;
}
//mimic fsharp iter function
public static void Iter<T>(this IEnumerable<T> input, Action<T> action)
{
input.ToList().ForEach(action);
}
Here is an update to dahlbyk's partition solution.
It returned an array[] where "element 0 holds the true values; 1 holds the false values" — but this doesn't hold when all the elements match or all fail the predicate, in which case you've got a singleton array and a world of pain.
public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
var partition = source.GroupBy(predicate);
IEnumerable<T> matches = partition.FirstOrDefault(g => g.Key) ?? Enumerable.Empty<T>();
IEnumerable<T> rejects = partition.FirstOrDefault(g => !g.Key) ?? Enumerable.Empty<T>();
return Tuple.Create(matches, rejects);
}