Checking if all items in one generic collection exist in another using custom comparison delegate - c#

I have a situation where I need a generic method to which I can pass two collections of type T along with a delegate that compares the two collections and returns true if every element in collection 1 has an equal element in collection 2, even if they are not in the same index of the collection. What I mean by "equal" is handled by the delegate. My initial thought was to return false if the collections were different lengths and otherwise sort them and then compare them like parallel arrays. Then it occurred to me that I can't sort a collection of a generic type without the types sharing an interface. So now I am thinking a LINQ expression might do the trick, but I can't think of how to write it. Consider my current code:
private static bool HasSameCollectionItems<T>(ICollection<T> left, ICollection<T> right, Func<T, T, bool> func)
{
if (left.Count != right.Count)
{
return false;
}
foreach (var item in left)
{
bool leftItemIsInRightCollection = ??? MAGIC ???
if (!leftItemIsInRightCollection)
{
return false;
}
}
return true;
}
I would like to replace ??? MAGIC ??? with a LINQ expression to see if item is "equal" to an element in right using the passed in delegate func. Is this even possible?
Note: For reasons I don't want to bother getting into here, implementing IEquatable or overriding the Equals method is not an option here.

It looks like you want the .All() and .Any() methods (the first checks that every element satisfies a condition; the second only checks whether at least one such element exists):
bool leftItemIsInRightCollection = right.Any(rItem => func(item, rItem));
Also, I'd refactor your code to something like:
private static bool HasSameCollectionItems<T>(ICollection<T> left, ICollection<T> right, Func<T, T, bool> func)
{
return left.Count == right.Count && left.All(LI => right.Any(RI => func(LI, RI)));
}
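One caveat with the Count plus All/Any approach: it does not account for duplicates, so {1, 1, 2} and {1, 2, 2} are reported as having the same items. If that matters, a minimal sketch along the following lines could be used instead. The names CollectionMatching and HasSameItemsWithDuplicates are made up for illustration, and the sketch assumes the delegate behaves like a true equivalence (symmetric and transitive); it removes each matched element from a working copy of right so that no element is matched twice.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionMatching
{
    // Hypothetical helper: like HasSameCollectionItems, but every element of
    // right can be matched at most once, so duplicates have to pair up.
    public static bool HasSameItemsWithDuplicates<T>(
        ICollection<T> left, ICollection<T> right, Func<T, T, bool> areEqual)
    {
        if (left.Count != right.Count)
        {
            return false;
        }

        // Working copy of right; matched elements are removed from it.
        var unmatched = new List<T>(right);
        foreach (T item in left)
        {
            int index = unmatched.FindIndex(candidate => areEqual(item, candidate));
            if (index < 0)
            {
                return false; // nothing left in right that equals this item
            }
            unmatched.RemoveAt(index);
        }
        return true;
    }
}
```

Like the All/Any version, this is quadratic in the collection size, which is usually acceptable for small collections.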

The following works by checking whether there are any elements in left which are not in right.
If you insist on a delegate to determine equality, you can wrap it in a FuncEqualityComparer. (Note that you must also provide an implementation for GetHashCode.)
private static bool HasSameCollectionItems<T>(ICollection<T> left, ICollection<T> right, IEqualityComparer<T> comparer)
{
if (left.Count != right.Count) return false;
return !left.Except(right, comparer).Any();
}
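The FuncEqualityComparer mentioned above is not reproduced here, so the following is only a sketch of what such an adapter might look like. The class name DelegateEqualityComparer and the constant-hash fallback are assumptions for illustration, not the referenced implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical adapter that turns an equality delegate into an IEqualityComparer<T>.
public sealed class DelegateEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _getHashCode;

    public DelegateEqualityComparer(Func<T, T, bool> equals, Func<T, int> getHashCode = null)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
        // Assumption: when no hash function is supplied, a constant hash puts
        // every item in the same bucket, so the delegate alone decides equality
        // (correct, but quadratic instead of linear).
        _getHashCode = getHashCode ?? (_ => 0);
    }

    public bool Equals(T x, T y) => _equals(x, y);

    public int GetHashCode(T obj) => _getHashCode(obj);
}
```

It could then be passed as the comparer argument, for example `new DelegateEqualityComparer<string>((a, b) => string.Equals(a, b, StringComparison.OrdinalIgnoreCase))`. Keep in mind that Except is a set operation, so duplicates in either collection are ignored by the comparison.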

Related

Looking for alternative LINQ expression(s)

I'm working on a code generator that validates objects based on certain business rules. As an example, I'm curious to find out the various ways the logic below can be written as a LINQ expression.
Assertion should evaluate to true when collection is null OR when count of "TrueAndCorrect" items is anything but 1. One possible solution is:
bool assertion = report.DeclarationOfTrusteeCollection == null
|| report.DeclarationOfTrusteeCollection.Count(f => f.FTER99.Equals("TrueAndCorrect")) != 1;
Are there other, perhaps more compact, ways to express this in LINQ, such as using Any or inverting the operators?
The original code is:
bool assertion =
report.DeclarationOfTrusteeCollection == null ||
report.DeclarationOfTrusteeCollection.Count(
f => f.FTER99.Equals("TrueAndCorrect")) != 1;
There are some problems here.
First, the intention of the null check seems to be "a null collection has the same semantics as an empty collection". This is a worst-practice in C#. Never do this! If you want to represent an empty collection, make an empty collection. There's even an Enumerable.Empty helper method for you.
So, start with that; the code should be:
if (report.DeclarationOfTrusteeCollection == null)
throw some appropriate exception
or
Debug.Assert(report.DeclarationOfTrusteeCollection != null);
if the condition is impossible.
That leaves us with
bool assertion =
report.DeclarationOfTrusteeCollection.Count(
f => f.FTER99.Equals("TrueAndCorrect")) != 1;
This is bad. Suppose I show you a jar that contains some number of pennies and I ask you "is there exactly one penny in the jar?" How many pennies do you have to count before you know the answer? Your code here is counting all of them, but you could stop after two.
Enumerable gives you a method which throws if a sequence is not a singleton, but no method that tests it. Fortunately it is easy to write. The best practice here is to write a helper method that has the exact semantics you want:
static class Extensions
{
public static bool IsSingleton<T>(this IEnumerable<T> items)
{
bool seenOne = false;
foreach(T item in items)
{
if (seenOne) return false;
seenOne = true;
}
return seenOne;
}
public static bool IsSingleton<T>(
this IEnumerable<T> items, Func<T, bool> predicate) =>
items.Where(predicate).IsSingleton();
}
Done. And now your code is:
if (report.DeclarationOfTrusteeCollection == null)
throw some appropriate exception
bool assertion =
report.DeclarationOfTrusteeCollection.IsSingleton(f => ...);
Write the code so that it reads like what it is logically doing. That's the beauty and power of LINQ sequence operators.
You could use the null-propagation operator:
bool assertion = report.DeclarationOfTrusteeCollection?.Count(f => f.FTER99.Equals("TrueAndCorrect")) != 1;
Since null is not 1 this is also true if the collection is null.
It would be nice if you didn't need to count the whole collection, since you already know the answer once there is more than one matching element. But I don't know of a built-in method for that. You could write your own extension:
public static class MyExtensions
{
public static bool IsNullOrHasNotExactlyOneMatching<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
if (source == null) return true;
bool found = false;
foreach(T element in source)
{
if (!predicate(element)) continue;
if (found) return true; // this is the second match!
found = true;
}
return !found; // one match found (or not)
}
}
And use it:
bool assertion = report.DeclarationOfTrusteeCollection.IsNullOrHasNotExactlyOneMatching(f => f.FTER99.Equals("TrueAndCorrect"));
As mentioned by Rawling you could shorten the extension using Take():
public static bool IsNullOrHasNotExactlyOneMatching<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
return source?.Where(predicate).Take(2).Count() != 1;
}
or do this directly:
bool assertion = report.DeclarationOfTrusteeCollection?.Where(f => f.FTER99.Equals("TrueAndCorrect"))
.Take(2).Count() != 1;
Both versions only iterate until a second match was found (or until the end if no match was found).
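To make the early exit visible, here is a small sketch that counts how many elements are actually pulled from the source. EarlyExitDemo, Numbers and Visited are hypothetical names used only for this illustration; the test for even numbers stands in for the real predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitDemo
{
    // Number of elements pulled from the source by the most recent call.
    public static int Visited;

    // Hypothetical source that records every element the consumer pulls.
    private static IEnumerable<int> Numbers(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Visited++;
            yield return i;
        }
    }

    // Counts every match, so the whole source is enumerated.
    public static bool FullCount(int count)
    {
        Visited = 0;
        return Numbers(count).Count(n => n % 2 == 0) != 1;
    }

    // Stops pulling from the source once a second match has been seen.
    public static bool TakeTwo(int count)
    {
        Visited = 0;
        return Numbers(count).Where(n => n % 2 == 0).Take(2).Count() != 1;
    }
}
```

After `EarlyExitDemo.FullCount(1000)` the counter equals the full length of the source, whereas after `EarlyExitDemo.TakeTwo(1000)` it stays at a handful of elements, because enumeration stops at the second even number.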

How to simply convert an IEnumerable into IOrderedEnumerable in O(1)? [duplicate]

Say there is an extension method to order an IQueryable based on several types of Sorting (i.e. sorting by various properties) designated by a SortMethod enum.
public static IOrderedEnumerable<AClass> OrderByX(this IQueryable<AClass> values,
SortMethod? sortMethod)
{
IOrderedEnumerable<AClass> queryRes = null;
switch (sortMethod)
{
case SortMethod.Method1:
queryRes = values.OrderBy(a => a.Property1);
break;
case SortMethod.Method2:
queryRes = values.OrderBy(a => a.Property2);
break;
case null:
queryRes = values.OrderBy(a => a.DefaultProperty);
break;
default:
queryRes = values.OrderBy(a => a.DefaultProperty);
break;
}
return queryRes;
}
In the case where sortMethod is null (i.e. where it is specified that I don't care about the order of the values), is there a way, instead of ordering by some default property, to just pass the values through as "ordered" without having to perform the actual sort?
I would like the ability to call this extension, and then possibly perform some additional ThenBy orderings.
All you need to do for the default case is:
queryRes = values.OrderBy(a => 1);
This will effectively be a noop sort. Because OrderBy performs a stable sort, the original order will be maintained when the selected keys are equal. Note that since this is an IQueryable and not an IEnumerable, it's possible for the query provider to not perform a stable sort. In that case, you need to know whether it's important that order be maintained, or whether it's appropriate to just say "I don't care what order the result is in, so long as I can call ThenBy on the result".
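For LINQ to Objects specifically, the constant-key trick can be checked with a small sketch (ConstantKeyDemo and its method names are made up for illustration). It relies on Enumerable.OrderBy being a stable sort and says nothing about how a database-backed IQueryable provider would translate the call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConstantKeyDemo
{
    // Every element gets the same key, so the stable sort keeps the input order.
    public static List<string> Unsorted(IEnumerable<string> words) =>
        words.OrderBy(w => 1).ToList();

    // All primary keys tie, so the ThenBy key decides the final order.
    public static List<string> ThenByLength(IEnumerable<string> words) =>
        words.OrderBy(w => 1).ThenBy(w => w.Length).ToList();
}
```

For the input "pear", "fig", "apple", Unsorted returns the words unchanged, while ThenByLength returns "fig", "pear", "apple".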
Another option, that allows you to avoid the actual sort is to create your own IOrderedEnumerable implementation:
public class NoopOrder<T> : IOrderedEnumerable<T>
{
private IQueryable<T> source;
public NoopOrder(IQueryable<T> source)
{
this.source = source;
}
public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending)
{
if (descending)
{
return source.OrderByDescending(keySelector, comparer);
}
else
{
return source.OrderBy(keySelector, comparer);
}
}
public IEnumerator<T> GetEnumerator()
{
return source.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return source.GetEnumerator();
}
}
With that your query can be:
queryRes = new NoopOrder<AClass>(values);
Note that the consequence of the above class is that if there is a call to ThenBy, that ThenBy will effectively be a top-level sort. It is in effect turning the subsequent ThenBy into an OrderBy call. (This should not be surprising; ThenBy will call the CreateOrderedEnumerable method, and in there this code is calling OrderBy, basically turning that ThenBy into an OrderBy.) From a conceptual sorting point of view, this is a way of saying that "all of the items in this sequence are equal in the eyes of this sort, but if you specify that equal objects should be tiebroken by something else, then do so".
Another way of thinking of a "no op sort" is that it orders the items based on their index in the input sequence. This means that the items are not all "equal"; rather, the order of the input sequence will be the final order of the output sequence, and since each item in the input sequence always sorts after the one before it, adding additional "tiebreaker" comparisons will do nothing, making any subsequent ThenBy calls pointless. If this behavior is desired, it is even easier to implement than the previous one:
public class NoopOrder<T> : IOrderedEnumerable<T>
{
private IQueryable<T> source;
public NoopOrder(IQueryable<T> source)
{
this.source = source;
}
public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending)
{
return new NoopOrder<T>(source);
}
public IEnumerator<T> GetEnumerator()
{
return source.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return source.GetEnumerator();
}
}
If you always return the same key value, you will get an IOrderedEnumerable that preserves the original list order:
case null:
queryRes = values.OrderBy(a => 1);
break;
By the way, I don't think this is the right thing to do. You will get a collection that is supposed to be ordered but actually is not.
Bottom line, IOrderedEnumerable exists solely to provide a grammar structure to the OrderBy()/ThenBy() methods, preventing you from trying to start an ordering clause with ThenBy(). It's not intended to be a "marker" that identifies the collection as ordered, unless it was actually ordered by OrderBy(). So, the answer is that if the sorting method being null is supposed to indicate that the enumerable is in some "default order", you should specify that default order (as your current implementation does). It's disingenuous to state that the enumerable is ordered when in fact it isn't, even if, by not specifying a SortMethod, you are implying it's "ordered by nothing" and don't care about the actual order.
The "problem" inherent in trying to simply mark the collection as ordered using the interface is that there's more to the process than simply sorting. By executing an ordering method chain, such as myCollection.OrderBy().ThenBy().ThenByDescending(), you're not actually sorting the collection with each call; not yet anyway. You are instead defining the behavior of an "iterator" class, named OrderedEnumerable, which will use the projections and comparisons you define in the chain to perform the sorting at the moment you need an actual sorted element.
Servy's answer, stating that OrderBy(x=>1) is a noop and should be optimized out by SQL providers, ignores the reality that this call, made against an Enumerable, will still do quite a bit of work, and that most SQL providers in fact do not optimize this kind of call; OrderBy(x=>1) will, in most LINQ providers, produce a query with an "ORDER BY 1" clause, which not only forces the SQL provider to perform its own sorting, it will actually result in a change to the order, because in T-SQL at least "ORDER BY 1" means to order by the first column of the select list.

How to get excluded collection without a second LINQ query?

I have a LINQ query that looks like this:
var p = option.GetType().GetProperties().Where(t => t.PropertyType == typeof(bool));
What is the most efficient way to get the items which aren't included in this query, without executing a second iteration over the list?
I could easily do this with a for loop but I was wondering if there's a shorthand with LINQ.
var p = option.GetType().GetProperties().ToLookup(t => t.PropertyType == typeof(bool));
var bools = p[true];
var notBools = p[false];
.ToLookup() is used to partition an IEnumerable based on a key function. In this case, it will return a Lookup which will have at most 2 groups in it. Groups in the Lookup can be accessed using a key, similar to an IDictionary.
.ToLookup() is evaluated immediately and is an O(n) operation and accessing a partition in the resulting Lookup is an O(1) operation.
Lookup is very similar to a Dictionary and has similar generic parameters (a key type and a value type). However, where Dictionary maps a key to a single value, Lookup maps a key to a set of values. Lookup can be implemented as IDictionary<TKey, IEnumerable<TValue>>.
.GroupBy() could also be used. But it is different from .ToLookup() in that GroupBy is lazily evaluated and could possibly be enumerated multiple times. .ToLookup() is evaluated immediately and the work is only done once.
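A self-contained sketch of the same partitioning idea, using integers instead of PropertyInfo objects (LookupDemo and SplitByEven are hypothetical names for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Partitions the numbers in one pass: key true holds the even numbers,
    // key false holds the odd ones.
    public static ILookup<bool, int> SplitByEven(IEnumerable<int> numbers) =>
        numbers.ToLookup(n => n % 2 == 0);
}
```

With `var parts = LookupDemo.SplitByEven(new[] { 1, 2, 3, 4 });`, `parts[true]` enumerates 2 and 4 and `parts[false]` enumerates 1 and 3. One difference from a Dictionary worth knowing: indexing a Lookup with a key that has no elements yields an empty sequence rather than throwing, so `parts[true]` is safe even when the source contains no even numbers.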
You cannot get something that you don't ask for. So if you exclude everything but bool, you can't expect to get the rest later. You need to ask for them.
For what it's worth, if you need both the ones you want and all the others in a single query, you could GroupBy this condition or use ToLookup, which I would prefer:
var isboolOrNotLookup = option.GetType().GetProperties()
.ToLookup(t => t.PropertyType == typeof(bool)); // use PropertyType instead
Now you can use this lookup for further processing. For example, if you want a collection of all properties which are bool:
List<System.Reflection.PropertyInfo> boolTypes = isboolOrNotLookup[true].ToList();
or just the count:
int boolCount = isboolOrNotLookup[true].Count();
So if you want to process all which are not bool:
foreach(System.Reflection.PropertyInfo prop in isboolOrNotLookup[false])
{
}
Well, you could go for source.Except(p), but it would reiterate the list and perform a lot of comparisons.
I'd say - write an extension method that does it using foreach, basically splitting the list into two destinations. Or something like this.
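The foreach-based split suggested above might look roughly like this. PartitionExtensions and Partition are hypothetical names, and the sketch eagerly builds two lists in a single pass over the source:

```csharp
using System;
using System.Collections.Generic;

public static class PartitionExtensions
{
    // Hypothetical helper: walks the source once and routes each item
    // to one of two lists depending on the predicate.
    public static (List<T> Matching, List<T> Rest) Partition<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var matching = new List<T>();
        var rest = new List<T>();
        foreach (T item in source)
        {
            if (predicate(item))
            {
                matching.Add(item);
            }
            else
            {
                rest.Add(item);
            }
        }
        return (matching, rest);
    }
}
```

Usage could be `var (bools, others) = properties.Partition(p => p.PropertyType == typeof(bool));`, where properties is the array returned by GetProperties().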
How about:
// Requires: using System; using System.Collections.Generic;
public class UnzipResult<T>
{
    private readonly IEnumerator<T> _enumerator;
    private readonly Func<T, bool> _filter;
    private readonly Queue<T> _nonMatching = new Queue<T>();
    private readonly Queue<T> _matching = new Queue<T>();

    public UnzipResult(IEnumerable<T> source, Func<T, bool> filter)
    {
        _enumerator = source.GetEnumerator();
        _filter = filter;
    }

    // Both views share one enumerator, so the source is only walked once.
    public IEnumerable<T> Matching => Drain(_matching, _nonMatching, true);
    public IEnumerable<T> Rest => Drain(_nonMatching, _matching, false);

    private IEnumerable<T> Drain(Queue<T> mine, Queue<T> other, bool wanted)
    {
        while (true)
        {
            // First hand out items the other view has already set aside for us.
            if (mine.Count > 0)
            {
                yield return mine.Dequeue();
                continue;
            }
            if (!_enumerator.MoveNext())
                yield break;
            T current = _enumerator.Current;
            if (_filter(current) == wanted)
                yield return current;
            else
                other.Enqueue(current); // keep it for the other view
        }
    }
}
public static class UnzipExtensions
{
    public static UnzipResult<T> Unzip<T>(this IEnumerable<T> source, Func<T, bool> filter)
    {
        return new UnzipResult<T>(source, filter);
    }
}
It's written in notepad, so probably doesn't compile, but my idea is: whatever collection you enumerate (matching or non-matching), you only enumerate the source once. And it should work fairly well with those pesky infinite collections (think yield return random.Next()), unless all elements do/don't fulfil filter.

Check if multiple values (stored in a dedicated collection) are in a LINQ collection, in query

What is the method in LINQ to supply a collection of values and check if any/all of these values are in a collection?
Thanks
You can emulate this via .Intersect() and check if the intersection set has all the required elements. I guess this is pretty inefficient but quick and dirty.
List<T> list = ...
List<T> shouldBeContained = ...
// Intersect returns distinct elements, so compare against the distinct count.
bool containsAll = list.Intersect(shouldBeContained).Count() == shouldBeContained.Distinct().Count();
Or you could do it with .All(). I guess this is more efficient and cleaner:
List<T> list = ...
List<T> shouldBeContained = ...
bool containsAll = shouldBeContained.All(x => list.Contains(x));
Linq has a number of operators that can be used to check existence of one set of values in another.
I would use Intersect:
Produces the set intersection of two sequences by using the default equality comparer to compare values.
While there's nothing easy that is built in...you could always create extension methods to make life easier:
public static bool ContainsAny<T>(this IEnumerable<T> data,
IEnumerable<T> intersection)
{
foreach(T item in intersection)
if(data.Contains(item))
return true;
return false;
}
public static bool ContainsAll<T>(this IEnumerable<T> data,
IEnumerable<T> intersection)
{
foreach(T item in intersection)
if(!data.Contains(item))
return false;
return true;
}
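Calling Contains inside a loop scans data once per item. If the collections are large, a HashSet-based variant may be worth considering. The sketch below uses hypothetical names (SetContainment, ContainsAllFast, ContainsAnyFast) and assumes the default equality comparer is what you want:

```csharp
using System;
using System.Collections.Generic;

public static class SetContainment
{
    // True if every item in required also occurs in data.
    public static bool ContainsAllFast<T>(this IEnumerable<T> data, IEnumerable<T> required) =>
        new HashSet<T>(data).IsSupersetOf(required);

    // True if at least one item in candidates also occurs in data.
    public static bool ContainsAnyFast<T>(this IEnumerable<T> data, IEnumerable<T> candidates) =>
        new HashSet<T>(data).Overlaps(candidates);
}
```

Building the set costs one pass over data, after which each membership check is a hash lookup instead of a linear scan.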

How do I implement a matching algorithm using predicates?

I understand how to use delegates and I am okay with lambda expressions to make use of predicates. I've come to a point where I want to implement a method that uses a predicate as an argument and can't figure out how to reference the predicate to find the matches in my collection:
private static T FindInCollection<T>(ICollection<T> collection, Predicate<T> match)
{
foreach (T item in collection)
{
//So how do I reference match to return the matching item?
}
return default(T);
}
I want to then reference this using something akin to:
ICollection<MyTestClass> receivedList = //Some list I've received from somewhere else
MyTestClass UsefulItem = FindInCollection<MyTestClass>(receivedList, i => i.SomeField == "TheMatchingData");
If anyone can give me an explanation or point me to a reference regarding implementation of predicates, I'd appreciate it. The documentation out there seems to all relate to passing predicates (which I can do just fine), not actually implementing the functionality that uses them...
Thanks
private static T FindInCollection<T>(ICollection<T> collection, Predicate<T> match)
{
foreach (T item in collection)
{
if (match(item))
return item;
}
return default(T);
}
You just use the predicate like any other delegate. It's basically a method you can call with any argument of type T, and it returns true if the argument matches and false otherwise.
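As a rough, self-contained illustration of calling such a method (PredicateDemo and StartsWithB are hypothetical names; the loop is the same as above, only made public so it can be called from outside the class), both a lambda and a named method can be passed as the Predicate<T>:

```csharp
using System;
using System.Collections.Generic;

public static class PredicateDemo
{
    // Same loop as above: invoke the delegate on each item and return the first match.
    public static T FindInCollection<T>(ICollection<T> collection, Predicate<T> match)
    {
        foreach (T item in collection)
        {
            if (match(item))
                return item;
        }
        return default(T); // null for reference types, 0 for int, and so on
    }

    // A named method with a matching signature also converts to Predicate<string>.
    public static bool StartsWithB(string s) => s.StartsWith("b", StringComparison.Ordinal);
}
```

For example, `PredicateDemo.FindInCollection(fruits, PredicateDemo.StartsWithB)` and `PredicateDemo.FindInCollection(fruits, s => s.Length > 10)` both compile against a `List<string> fruits`; the second returns null when no item is longer than ten characters.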
