LINQ extension methods - Any() vs. Where() vs. Exists()

LINQ extension methods - Any() vs. Where() vs. Exists() - c#

Unfortunately the names of these methods make terrible search terms, and I've been unable to find a good resource that explains the difference between these methods--as in when to use each.
Thanks.
Edit:
The sort of query that I'm trying to fully understand is something like this:
context.Authors.Where(a => a.Books.Any(b => b.BookID == bookID)).ToList();
And thanks to all who've answered.

Where returns a new sequence of items matching the predicate.
Any returns a Boolean value; there's a version with a predicate (in which case it returns whether or not any items match) and a version without (in which case it returns whether the query-so-far contains any items).
I'm not sure about Exists - it's not a LINQ standard query operator. If there's a version for the Entity Framework, perhaps it checks for existence based on a key - a sort of specialized form of Any? (There's an Exists method in List<T> which is similar to Any(predicate) but that predates LINQ.)

context.Authors.Where(a => a.Books.Any(b => b.BookID == bookID)).ToList();
a.Books is the list of books by that author. The property is automatically created by Linq-to-Sql, provided you have a foreign-key relationship set up.
So, a.Books.Any(b => b.BookID == bookID) translates to "Do any of the books by this author have an ID of bookID", which makes the complete expression "Who are the authors of the book with id bookID?"
That could also be written something like
from a in context.Authors
join b in context.Books on a.AuthorId equal b.AuthorID
where b.BookID == bookID
select a;
UPDATE:
Any() as far as I know, only returns a bool. Its effective implementation is:
public Any(this IEnumerable<T> coll, Func<T, bool> predicate)
{
foreach(T t in coll)
{
if (predicte(t))
return true;
}
return false;
}

Just so you can find it next time, here is how you search for the enumerable Linq extensions. The methods are static methods of Enumerable, thus Enumerable.Any, Enumerable.Where and Enumerable.Exists.
google.com/search?q=Enumerable.Any
google.com/search?q=Enumerable.Where
google.com/search?q=Enumerable.Exists
As the third returns no usable result, I found that you meant List.Exists, thus:
google.com/search?q=List.Exists
I also recommend hookedonlinq.com as this is has very comprehensive and clear guides, as well clear explanations of the behavior of Linq methods in relation to deferness and lazyness.

Any - boolean function that returns true when any of object in list satisfies condition set in function parameters. For example:
List<string> strings = LoadList();
boolean hasNonEmptyObject = strings.Any(s=>string.IsNullOrEmpty(s));
Where - function that returns list with all objects in list that satisfy condition set in function parameters. For example:
IEnumerable<string> nonEmptyStrings = strings.Where(s=> !string.IsNullOrEmpty(s));
Exists - basically the same as any but it's not generic - it's defined in List class, while Any is defined on IEnumerable interface.

Any() returns true if any of the elements in a collection meet your predicate's criteria. (Any() does not iterate through the entire collection, as it returns upon the first match.)
Where() returns an enumerable of all elements in a collection that meet your predicate's criteria.
Exists() does the same thing as Any() except it's just an older implementation that was there on the List back before LINQ.

IEnumerable introduces quite a number of extensions to it which helps you to pass your own delegate and invoking the resultant from the IEnumerable back. Most of them are by nature of type Func
The Func takes an argument T and returns TResult.
In case of
Where - Func : So it takes IEnumerable of T and Returns a bool. The where will ultimately returns the IEnumerable of T's for which Func returns true.
So if you have 1,5,3,6,7 as IEnumerable and you write .where(r => r<5) it will return a new IEnumerable of 1,3.
Any - Func basically is similar in signature but returns true only when any of the criteria returns true for the IEnumerable. In our case, it will return true as there are few elements present with r<5.
Exists - Predicate on the other hand will return true only when any one of the predicate returns true.
So in our case if you pass .Exists(r => 5) will return true as 5 is an element present in IEnumerable.

foreach (var item in model.Where(x => !model2.Any(y => y.ID == x.ID)).ToList())
{
enter code here
}
same work you also can do with Contains
secondly Where is give you new list of values.
thirdly using Exist is not a good practice, you can achieve your target from Any and contains like
EmployeeDetail _E = Db.EmployeeDetails.where(x=>x.Id==1).FirstOrDefault();
Hope this will clear your confusion.

Related

Does Linq's IEnumerable.Select return a reference to the original IEnumerable?

I was trying to clone an List in my code, because I needed to output that List to some other code, but the original reference was going to be cleared later on. So I had the idea of using the Select extension method to create a new reference to an IEnumerable of the same elements, for example:
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => s);
Now after doing ogList.Clear(), I was surprised to see that my new enumerable was also empty.
So I started fiddling around in LINQPad, and saw that even if my Select returned different objects entirely, the behaviour was the same.
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => 5); // Doesn't return the original int
enumerable.Count().Dump(); // Count is 3
ogList.Clear();
enumerable.Count().Dump(); // Count is 0!
Note that in LINQPad, the Dump()s are equivalent to Console.WriteLine().
Now probably my need to clone the list in the first place was due to bad design, and even if I didn't want to rethink the design I could easily clone it properly. But this got me thinking about what the Select extension method actually does.
According to the documentation for Select:
This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.
So then I tried adding this code before clearing:
foreach (int i in enumerable)
{
i.Dump();
}
The result was still the same.
Finally, I tried one last thing to figure out if the reference in my new enumerable was the same as the old one. Instead of clearing the original List, I did:
ogList.Add(4);
Then I printed out the contents of my enumerable (the "cloned" one), expecting to see '4' appended to the end of it. Instead, I got:
5
5
5
5 // Huh?
Now I have no choice but to admit that I have no idea how the Select extension method works behind the scenes. What's going on?

List/List<T> are for all intents and purposes fancy resizable arrays. They own and hold the data for value types such as your ints or references to the data for reference types in memory and they always know how many items they have.
IEnumerable/IEnumerable<T> are different beasts. They provide a different service/contract. An IEnumerable is fictional, it does not exist. It can create data out of thin air, with no physical backing. Their only promise is that they have a public method called GetEnumerator() that returns an IEnumerator/IEnumerator<T>. The promise that an IEnumerator makes is simple:
some item could be available or not at a time when you decide you need it. This is achieved through a simple method that the IEnumerator interface has: bool MoveNext() - which returns false when the enumeration is completed or true if there was in fact a new item that needed to be returned. You can read the data through a property that the IEnumerator interface has, conveniently called Current.
To get back to your observations/question: as far as the IEnumerable in your example is concerned, it does not even think about the data unless your code tells it to fetch some data.
When you are writing:
List<int> ogList = new List<int> {1, 2, 3};
IEnumerable<int> enumerable = ogList.Select(s => s);
You are saying: Listen here IEnumerable, I might come to you asking for some items at some point in the future. I'll tell you when I will need them, for now sit still and do nothing. With Select(s => s) you are conceptually defining an identity projection of int to int.
A very rough simplified, non-real-life implementation of the select you've written is:
IEnumerable<T> Select(this IEnumerable<int> source, Func<int,T> transformer) something like
{
foreach (var i in source) //create an enumerator for source and starts enumeration
{
yield return transformer(i); //yield here == return an item and wait for orders
}
}
(this explains why you got a 5 when expecting a for, your transform was s => 5)
For value types, such as the ints in your case: If you want to clone the list, clone the whole list or part of it for future enumeration by using the result of an enumeration materialized through a List. This way you create a list that is a clone of the original list, entirely detached from its original list:
IEnumerable<int> cloneOfEnumerable = ogList.Select(s => s).ToList();
Later edit: Of course ogList.Select(s => s) is equivalent to ogList. I'm leaving the projection here, as it was in the question.
What you are creating here is: a list from the result of an enumerable, further consumed through the IEnumerable<int> interface. Considering what I've said above about the nature of IList vs IEnumerable, I would prefer to write/read:
IList<int> cloneOfEnumerable = ogList.ToList();
CAUTION: Be careful with reference types. IList/List make no promise of keeping the objects "safe", they can mutate to null for all IList cares. Keyword if you ever need it: deep cloning.
CAUTION: Beware of infinite or non-rewindable IEnumerables

Provided answers explain why you are not obtaining a cloned list (due to deferred execution of some LINQ extension methods).
However, keep in mind that list.Select(e => e).ToList() will get a real clone only when dealing with value types such as int.
If you have a list of reference types you will receive a cloned list of references to existent objects. In this case you should consider one of the solutions provided here for deep-cloning or my favorite from here (which might be limited by object inner structure).

You have to be aware that an object that implements IEnumerable does not have to be a collection itself. It is an object that makes it possible to get an object that implements IEnumerator. Once you have the enumerator you can ask for the first element and for the next element until there are no more next elements.
Every LINQ function that returns an IEnumerable is not the sequence itself, it only enables you to ask for the enumerator. If you want a sequence, you'll have to use ToList.
There are several other LINQ functions that do not return an IEnumerable, but for instance a Dictionary, or only one element (FirstOrDefault(), Max(), Single(), Any(). These functions will get the enumerator from the IEnumerable and start enumerating until they have the result. Any will only have to check if you can start enumerating. Max will enumerate over all elements and remember the largest one. etc.
You'll have to be aware: as long as your LINQ statement is an IEnumerable of something, your source sequence is not accessed yet. If you change your source sequence before you start enumerating, the enumeration is over your changed source sequence.
If you don't want this, you'll have to do the enumeration before you change your source. Usually this will be ToList, but this can be any of the non-deferred function: Max(), Any(), FirstOrDefault(), etc.
List<TSource> sourceItems = ...
var myEnumerable = sourceItems
.Where(sourceItem => ...)
.GroupBy(sourceItem => ...)
.Select(group => ...);
// note: myEnumerable is an IEnumerable, it is not a sequence yet.
var list1 = sourceItems.ToList(); // Enumerate over the sequence
var first = sourceItems.FirstOrDefault(); // Enumerate and stop after the first
// now change the source, and to the same things again
sourceItems.Clear();
var list1 = sourceItems.ToList(); // returns empty list
var first = sourceItems.FirstOrDefault(); // return null: there is no first element
So every LINQ function that does not return IEnumerable, will start enumerating over sourceItems as the sequence is at the moment that you start enumerating. The IEnumerable is not the sequence itself.

This is an enumerable.
var enumerable = ogList.Select(s => s);
If you iterate through this enumerable, LINQ will in turn iterate over the original resultset. Each and every time. If you do anything to the original enumerable, the results will also be reflected in your LINQ calls.
If you need to freeze the data, store it in a list instead:
var enumerable = ogList.Select(s => s).ToList();
Now you've made a copy. Iterating over this list will not touch the original enumerable.

Would using an extension method to query an IEnumerable be equivalent to using IQueryable?

I understand the main difference between IQueryable and IEnumerable. IQueryable executes the query in the database and returns the filtered results while IEnumerable brings the full result set into memory and executes the filtering query there.
I don't think using an extension method to query on the initial assignment of a variable will cause it to be executed in the database like IQueryable, but I just wanted to make sure.
This code will cause the full result set of People to be returned from the database, and then the filtering is done in memory:
(The People property on the context is of type DbSet)
IEnumerable<Person> people = context.People;
Person person = people.Where(x => x.FirstName == "John");
Even though I am adding the filtering below as an extension method before assigning the item to my variable, I'm assuming this code should work the same way as the code above, and bring back the full result set into memory before filtering it, right?
Person person = context.People.Where(x => x.FirstName == "John");
EDIT:
Thanks for the replies guys. I modified the code example to show what I meant (removed the IEnumerable in the second paragraph of code).
Also, to clarify, context.People is of type DbSet, which implements both IQueryable and IEnumerable. So I 'm not actually sure which .Where method is being called. IntelliSense tells me it is the IQueryable version, but can this be trusted? Is this always the case when working directly with a DbSet of a context?

IEnumerable<Person> people = context.People;
Person person = people.Where(x => x.FirstName == "John");
... will execute the IEnumerable<T>.Where method extension, which accepts a Func<TSource, bool> predicate parameter, forcing the filtering to happen in memory.
In contrast...
IEnumerable<Person> people = context.People.Where(x => x.FirstName == "John");
...will execute the IQueryable<T>.Where method extension, which accepts a Expression<Func<TSource, bool>> predicate parameter. Notice that this is an expression, not a delegate, which allows it to translate the where condition to a database condition.
So it really does make a difference which extension method you invoke.

IQueryable<T> works on expressions. It is effectively a query-builder, accumulating information about a query without doing anything... until the moment you need a value from it, when:
a query is generated from the accumulated Expression, in the target language (e.g. SQL)
the query is executed, usually on the database,
results are converted back to a C# object.
IEnumerable<T> extensions works on pre-compiled functions. When you need a value from it:
C# code in those functions is executed.
It is easy to confuse the two, because:
both have extension functions with similar names,
the lambda syntax is the same for Expressions and Functions - so you cannot tell them apart,
the use of "var" to declare variables removes the datatype (often the only clue as to which interface is being used).
IQueryable<int> a;
IEnumerable<int> b;
int x1 = a.FirstOrDefault(i => i > 10); // Expression passed in
int x2 = b.FirstOrDefault(i => i > 10); // Function passed in
Extension methods with the same name usually do the same thing (because they were written that way) but sometimes they don't.
So the answer is: No, they are not equivalent.

Compare predicates

I have a list of predicates
public List<Func<Album, bool>> Predicates { get; set; }
I'd like to check if a list contains specific predicate.
What I do is this :
bool check = Predicates.Contains(x=>x.AlbumName == "Winter");
But this always returns false even though there is such a predicate in the list. I assume it is because predicates are anonymous methods and each is kind of unique, but still is it possible to compare them somehow?

I'm afraid the answer is basically "no". If you had expression trees instead of delegates then you could probably compare those with effort, but basically you've got references to separate methods. You'd need to inspect the IL inside the methods to compare whether or not they're the same.
Of course if you have a set of objects on which the predicates operate, you could find out whether you have any predicates which matches the same subset as your "target" predicate, but that's not the same as testing whether the predicate is actually the same.

Predicate comparison is perfectly possible. What you are doing wrong is you are using lambda expressions, basically declaring new methods for each predicate (AND for their comparison)! :)
Instead of lambdas, use normal functions.
For example, declare: bool AlbumNameFilter(Album album) {return album.Name == "Winter";}
You will be able to add AlbumNameFilter to Predicates list and Find it later:
Predicates.Contains(AlbumnameFilter);

Switch to
Expression< Func< Album, bool>> list
and create custom comparer

You can use Dictionary instead of List. Then you can compare the different Predicates by whatever you assign as a Key.
Dictionary<(YourKey), Predicate<Album>> predicates = new Dictionary<(YourKey), Predicate<Album>>;
....
if (!predicates.ContainsKey((YourKey)))
{
predicates.Add((YourKey), x => x.AlbumName == "Winter")
}
else
{
predicates.Remove((YourKey));
}
Then to use the Predicate you just access the Dictionary Value.

How does OrderBy extension method orders collection

Suppose we have a class Foo which has an Int32 Field Bar and you want to sort the collection of the Foo objects by the Bar Value. One way is to implement IComparable's CompareTo() method, but it can also be done using Language Integrated Query (LINQ) like this
List<Foo> foos = new List<Foo>();
// assign some values here
var sortedFoos = foos.OrderBy(f => f.Bar);
now in sortedFoos we have foos collection which is sorted. But if you use System.Diagnostics.StopWatch object to measure the time it took OrderBy() to sort the collection is always 0 milliseconds. But whenever you print sortedFoos collection it's obviously sorted. How is that possible. It takes literary no time to sort collection but after method execution the collection is sorted ? Can somebody explain to me how that works ? And also one thing. Suppose after sorting foos collection I added another element to it. now whenever I print out the collection the the element I added should be in the end right ? WRONG ! The foos collection will be sorted like, the element I added was the part of foos collection even if I add that element to foos after sorting. I don't quit understand how any of that work so can anybody make it clear for me ?!

Almost all LINQ methods use lazy evaluation - they don't immediately do anything, but they set up the query to do the right thing when you ask for the data.
OrderBy follows this model too - although it's less lazy than methods like Where and Select. When you ask for the first result from the result of OrderBy, it will read all the data from the source, sort all of it, and then return the first element. Compare this to Select for example, where asking for the first element from the projection only asks for the first element from the source.
If you're interested in how LINQ to Objects works behind the scenes, you might want to read my Edulinq blog series - a complete reimplementation of LINQ to Objects, with a blog post describing each method's behaviour and implementation.
(In my own implementation of OrderBy, I actually only sort lazily - I use a quick sort and sort "just enough" to return the next element. This can make things like largeCollection.OrderBy(...).First() a lot quicker.)

LINQ believe in deferred execution. This means the expression will only be evaluated when you started iterating or accessing the result.

The OrderBy extension uses the deault IComparer for the type it is working on, unless an alternative is passed via the appropriate overload.
The sorting work is defered until the IOrderedEnumerable<T> returned by your statement is first accessed. If you place the Stopwatch around that first access, you'll see how long sorting takes.
This makes a lot of sense, since your statement could be formed from multiple calls that return IOrderedEnumerables. As the ordering calls are chained, they fluently extend the result, allowing the finally returned IOrderedEnumerable to perform the sorting in the most expedient way possible. This is probably achieved by chaining all the IComparer calls and sorting once. A greedy implementation would have to wastefully sort multiple times.
For instance, consider
class MadeUp
{
public int A;
public DateTime B;
public string C;
public Guid D;
}
var verySorted = madeUps.OrderBy(m => m.A)
.ThenBy(m => m.B)
.ThenByDescending(m => m.C)
.ThenBy(m => m.D);
If verySorted was evaluated greedily, then every property in the sequence would be evaluated and the sequence would be reordered 4 times. Because the linq implementation of IOrderedEnumerable deferes sorting until enumeration, it is able to optimise the process.
The IComparers for A, B, C and D can be combined into a composite delegate, something like this simplified representation,
int result
result = comparerA(A, B);
if (result == 0)
{
result = comparerB(A, B);
if (result == 0)
{
result = comparerC(A, B);
if (result == 0)
{
result = comparerD(A, B);
}
}
}
return result;
the composite delegate is then used to sort the sequence once.

You have to add the ToList() to get a new collection. If you do't the OrderBy will be called when you start iterating on sortedFoos.
List<Foo> foos = new List<Foo>();
// assign some values here
var sortedFoos = foos.OrderBy(f => f.Bar).ToList();

lambda expressions on populated lists

There are a few posts on the site about how to order by using lambda expressions however I cannot seem to get mine to work. I am trying to reorder a list that is already populated. Am i wrong in thinking that i can rearrange the order of this list using lambada expressions?
QuarterMileTimes.OrderByDescending(c => c.PquartermileTime);
I was wondering if it's down to PquartermileTime being a string? I also tried this expression on a date
QuarterMileTimes.orderBy(c => c.RaceDay);
Still no luck where am I going wrong?

When you call OrderByDescending, the method returns a new IEnumerable<T> - it does not reorder the collection in place.
Try doing:
QuarterMileTimes = QuarterMileTimes.OrderByDescending(c => c.PquartermileTime).ToList();
(This is if your collection is a List<T>...)

The result of OrderByDescending (and all of the other Enumerable extension methods) is an IEnumerable<T> that projects the source data in the order you're describing. It does not alter the original data in any way.
If you prefer, you can use the ToList() extension method to create a new List<T> from that result and assign it back to the original variable.
QuarterMileTimes = QuarterMileTimes.OrderByDescending(/*...*/).ToList();
(This is assuming, of course, that QuarterMileTimes is a List<T>)
The gist of the answer is no, OrderByDescending does not alter the data source in any way.

You are assigning it to a new variable aren't you?
var sortedTimes = QuarterMileTimes.OrderByDescending(c => c.PquartermileTime);
It isn't like e.g. the List.Sort method, that sorts the existing list.
The result of the method has to be assigned to a variable.

OrderByDescending returns an IOrderedEnumerable<T> i.e. a new sequence with the items in the specified order. You'll have to re-assign QuarterMileTimes to get the behaviour you expect:
QuarterMileTimes = QuarterMileTimes.OrderByDescending(c => c.PquarterMileTime).ToList();
Alternatively you can just use the returned sequence separately, which is the usual approach.

QuarterMileTimes.OrderByDescending(c => c.PquartermileTime) returns a new enumerable, ordered by PquartermileTime. It does not reorder QuarterMileTimes in place.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

LINQ extension methods - Any() vs. Where() vs. Exists() - c#

Related

Does Linq's IEnumerable.Select return a reference to the original IEnumerable?

Would using an extension method to query an IEnumerable be equivalent to using IQueryable?

Compare predicates

How does OrderBy extension method orders collection

lambda expressions on populated lists

Categories

Resources