So I know that Find() is only a List<T> method, whereas First() is an extension for any IEnumerable<T>. I also know that First() will return the first element if no parameter is passed, whereas Find() will throw an exception. Lastly, I know that First() will throw an exception if the element is not found, whereas Find() will return the type's default value.
I hope that clears up confusion about what I'm actually asking. This is a computer science question and deals with these methods at the computational level. I've come to understand that IEnumerable<T> extensions do not always operate as one would expect under the hood. So here's the Q, and I mean from a "close to the metal" standpoint: What is the difference between Find() and First()?
Here's some code to provide basic assumptions to operate under for this question.
var l = new List<int> { 1, 2, 3, 4, 5 };
var x = l.First(i => i == 3);
var y = l.Find(i => i == 3);
Is there any actual computational difference between how First() and Find() discover their values in the code above?
Note: Let us ignore things like AsParallel() and AsQueryable() for now.
Here's the code for List<T>.Find (from Reflector):
public T Find(Predicate<T> match)
{
if (match == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
}
for (int i = 0; i < this._size; i++)
{
if (match(this._items[i]))
{
return this._items[i];
}
}
return default(T);
}
And here's Enumerable.First:
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
if (predicate == null)
{
throw Error.ArgumentNull("predicate");
}
foreach (TSource local in source)
{
if (predicate(local))
{
return local;
}
}
throw Error.NoMatch();
}
So both methods work roughly the same way: they iterate all items until they find one that matches the predicate. The only noticeable difference is that Find uses a for loop because it already knows the number of elements, and First uses a foreach loop because it doesn't know it.
First will throw an exception when it finds nothing, FirstOrDefault however does exactly the same as Find (apart from how it iterates through the elements).
BTW Find is rather equal to FirstOrDefault() than to First(). Because if predicate of First() is not satisfied with any list elements you will get an exception.
Here what returns a dotpeek, another great free reflector replacement with some of ReSharper features
Here for Enumerable.First(...) and Enumerable.FirstOrDefault(...) extension methods:
public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
foreach (TSource element in source) {
if (predicate(element)) return element;
}
return default(TSource);
}
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
foreach (TSource element in source) {
if (predicate(element)) return element;
}
throw Error.NoMatch();
}
and here is for List<>.Find:
/// <summary>
/// Searches for an element that matches the conditions defined by the specified predicate, and returns the first occurrence within the entire <see cref="T:System.Collections.Generic.List`1"/>.
/// </summary>
///
/// <returns>
/// The first element that matches the conditions defined by the specified predicate, if found; otherwise, the default value for type <paramref name="T"/>.
/// </returns>
/// <param name="match">The <see cref="T:System.Predicate`1"/> delegate that defines the conditions of the element to search for.</param><exception cref="T:System.ArgumentNullException"><paramref name="match"/> is null.</exception>
[__DynamicallyInvokable]
public T Find(Predicate<T> match)
{
if (match == null)
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
for (int index = 0; index < this._size; ++index)
{
if (match(this._items[index]))
return this._items[index];
}
return default (T);
}
1- Find() returns Null if the entity is not in the context but First() will throw an exception
2- Find() returns entities that have been added to the context but have not yet been saved to the database
Since List<> is not indexed in any way, it has to go through all values to find a specific value. Therefore it doesn't make much of a difference compared to traversing the list via an enumerable (apart from the creation of a enumerable helper object instance).
That said, keep in mind that the Find function was created way earlier than the First extension method (Framework V2.0 vs. V3.5), and I doubt that they would have implemented Find if the List<> class had been implemented at the same time as the extension methods.
Related
When looking at Microsoft's implementation of various C# LINQ methods, I noticed that the public extension methods are mere wrappers that return the actual implementation in the form of a separate iterator function.
For example (from System.Linq.Enumerable.cs):
public static IEnumerable<TSource> Concat<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second) {
if (first == null) throw Error.ArgumentNull("first");
if (second == null) throw Error.ArgumentNull("second");
return ConcatIterator<TSource>(first, second);
}
static IEnumerable<TSource> ConcatIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second) {
foreach (TSource element in first) yield return element;
foreach (TSource element in second) yield return element;
}
What is the reason for wrapping the iterator like that instead of combining them into one and return the iterator directly?
Like this:
public static IEnumerable<TSource> Concat<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second) {
if (first == null) throw Error.ArgumentNull("first");
if (second == null) throw Error.ArgumentNull("second");
foreach (TSource element in first) yield return element;
foreach (TSource element in second) yield return element;
}
Wrappers are used to check method arguments immediately (i.e. when you call LINQ extension method). Otherwise arguments will not be checked until you start to consume iterator (i.e. use query in foreach loop or call some extension method which executes query - ToList, Count etc). This approach is used for all extension methods with deferred type of execution.
If you will use approach without wrapper then:
int[] first = { 1, 2, 3 };
int[] second = null;
var all = first.Concat(second); // note that query is not executed yet
// some other code
...
var name = Console.ReadLine();
Console.WriteLine($"Hello, {name}, we have {all.Count()} items!"); // boom! exception here
With argument-checking wrapper method you will get exception at first.Concat(second) line.
I have me some Resharper squiggles here.
and they tell me that I have a possible multiple enumeration of IEnumerable going on. However you can see that this is not true. final is explicitly declared as a list ( List<Point2D> ) and pointTangents is declared previously as List<PointVector2D>
Any idea on why Resharper might be telling me this?
Edit Experiments To See If I can replicate with simpler code
As you can see below there are no squiggles and no warnings even though Bar is declared to take IEnumerable as arg.
Looks a lot like RSRP-429474 False-positive warning for possible multiple enumeration :
I have this code:
List<string> duplicateLabelsList = allResourcesLookup.SelectMany(x => x).Select(x => x.LoaderOptions.Label).Duplicates<string, string>().ToList(); ;
if (duplicateLabelsList.Any())
throw new DuplicateResourceLoaderLabelsException(duplicateLabelsList);
For both usages of duplicateLabelsList, I'm being warned about
possible multiple enumeration, despite the fact I've called ToList and
therefore there should be no multiple enumeration.
which (currently) has a Fix Version of 9.2, which (currently) isn't yet released.
The extension method public static TSource Last<TSource>(this IEnumerable<TSource> source); is defined for the type IEnumberable<TSource>.
If one looks at the implentation of Last<TSource>:
public static TSource Last<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
IList<TSource> list = source as IList<TSource>;
if (list != null)
{
int count = list.Count;
if (count > 0) return list[count - 1];
}
else
{
using (IEnumerator<TSource> e = source.GetEnumerator())
{
if (e.MoveNext())
{
TSource result;
do
{
result = e.Current;
} while (e.MoveNext());
return result;
}
}
}
throw Error.NoElements();
}
It is clear that if source implements IList then source is not enumerated and therefore your assumption that this is a "bug" in Resharper is correct.
I'd consider it more like a false positive probably due to the fact that Resharper has no general way to know that Last()'s implementation avoids unnecessary enumerations. It is probably deciding to flag the potential multiple enumeration based on the fact that Last<TSource> is defined for typed IEnumerable<T> objects.
To be more specific: will the Linq extension method Any(IEnumerable collection, Func predicate) stop checking all the remaining elements of the collections once the predicate has yielded true for an item?
Because I don't want to spend to much time on figuring out if I need to do the really expensive parts at all:
if(lotsOfItems.Any(x => x.ID == target.ID))
//do expensive calculation here
So if Any is always checking all the items in the source this might end up being a waste of time instead of just going with:
var candidate = lotsOfItems.FirstOrDefault(x => x.ID == target.ID)
if(candicate != null)
//do expensive calculation here
because I'm pretty sure that FirstOrDefault does return once it got a result and only keeps going through the whole Enumerable if it does not find a suitable entry in the collection.
Does anyonehave information about the internal workings of Any, or could anyone suggest a solution for this kind of decision?
Also, a colleague suggested something along the lines of:
if(!lotsOfItems.All(x => x.ID != target.ID))
since this is supposed to stop once the conditions returns false for the first time but I'm not sure on that, so if anyone could shed some light on this as well it would be appreciated.
As we see from the source code, Yes:
internal static bool Any<T>(this IEnumerable<T> source, Func<T, bool> predicate) {
foreach (T element in source) {
if (predicate(element)) {
return true; // Attention to this line
}
}
return false;
}
Any() is the most efficient way to determine whether any element of a sequence satisfies a condition with LINQ.
also:a colleague suggested something along the lines of
if(!lotsOfItems.All(x => x.ID != target.ID)) since this is supposed to
stop once the conditions returns false for the first time but i'm not
sure on that, so if anyone could shed some light on this as well it
would be appreciated :>]
All() determines whether all elements of a sequence satisfy a condition. So, the enumeration of source is stopped as soon as the result can be determined.
Additional note:
The above is true if you are using Linq to objects. If you are using Linq to Database, then it will create a query and will execute it against database.
You could test it yourself: https://ideone.com/nIDKxr
public static IEnumerable<int> Tester()
{
yield return 1;
yield return 2;
throw new Exception();
}
static void Main(string[] args)
{
Console.WriteLine(Tester().Any(x => x == 1));
Console.WriteLine(Tester().Any(x => x == 2));
try
{
Console.WriteLine(Tester().Any(x => x == 3));
}
catch
{
Console.WriteLine("Error here");
}
}
Yes, it does :-)
also:a colleague suggested something along the lines of
if(!lotsOfItems.All(x => x.ID != target.ID))
since this is supposed to stop once the conditions returns false for the first time but i'm not sure on that, so if anyone could shed some light on this as well it would be appreciated :>]
Using the same reasoning, All() could continue even if one of the element returns false :-) No, even All() is programmed correctly :-)
It does whatever is the quickest way of doing what it has to do.
When used on an IEnumerable this will be along the lines of:
foreach(var item in source)
if(predicate(item))
return true;
return false;
Or for the variant that doesn't take a predicate:
using(var en = source.GetEnumerator())
return en.MoveNext();
When run against at database it will be something like
SELECT EXISTS(SELECT null FROM [some table] WHERE [some where clause])
And so on. How that was executed would depend in turn on what indices were available for fulfilling the WHERE clause, so it could be a quick index lookup, a full table scan aborting on first match found, or an index lookup followed by a partial table scan aborting on first match found, depending on that.
Yet other Linq providers would have yet other implementations, but generally the people responsible will be trying to be at least reasonably efficient.
In all, you can depend upon it being at least slightly more efficient than calling FirstOrDefault, as FirstOrDefault uses similar approaches but does have to return a full object (perhaps constructing it). Likewise !All(inversePredicate) tends to be pretty much on a par with Any(predicate) as per this answer.
Single is an exception to this
Update: The following from this point on no longer applies to .NET Core, which has changed the implementation of Single.
It's important to note that in the case of linq-to objects, the overloads of Single and SingleOrDefault that take a predicate do not stop on identified failure. While the obvious approach to Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) would be something like:
public static TSource Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
/* do null checks */
using(var en = source.GetEnumerator())
while(en.MoveNext())
{
var val = en.Current;
if(predicate(val))
{
while(en.MoveNext())
if(predicate(en.Current))
throw new InvalidOperationException("too many matching items");
return val;
}
}
throw new InvalidOperationException("no matching items");
}
The actual implementation is something like:
public static TSource Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
/* do null checks */
var result = default(TSource);
long tally = 0;
for(var item in source)
if(predicate(item))
{
result = item;
checked{++tally;}
}
switch(tally)
{
case 0:
throw new InvalidOperationException("no matching items");
case 1:
return result;
default:
throw new InvalidOperationException("too many matching items");
}
}
Now, while successful Single will have to scan everything, this can mean that an unsucessful Single is much, much slower than it needs to (and can even potentially throw an undocumented error) and if the reason for the unexpected duplicate is a bug which is duplicating items into the sequence - and hence making it far larger than it should be, then the Single that should have helped you find that problem is now dragging away through this.
SingleOrDefault has the same issue.
This only applies to linq-to-objects, but it remains safer to do .Where(predicate).Single() rather than Single(predicate).
Any stops at the first match. All stops at the first non-match.
I don't know whether the documentation guarantees that but this behavior is now effectively fixed for all time due to compatibility reasons. It also makes sense.
Yes it stops when the predicate is satisfied once. Here is code via RedGate Reflector:
[__DynamicallyInvokable]
public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
if (predicate == null)
{
throw Error.ArgumentNull("predicate");
}
foreach (TSource local in source)
{
if (predicate(local))
{
return true;
}
}
return false;
}
i want to check for the occurrence of some given items in the collection and thought it work like this but it doesn't
public static bool ContainsAny<T>(this IEnumerable<T> collection, IEnumerable<T> otherCollection)
{
if (otherCollection == null)
throw new ArgumentNullException("otherCollection", ExceptionResources.NullParameterUsed);
if (collection == null)
throw new ArgumentNullException("collection", ExceptionResources.ExtensionUsedOnNull);
else
return Enumerable.Any<T>(otherCollection, new Func<T, bool>(((Enumerable)collection).Contains<T>));
}
I want to get true if the specified collection contains any of the otherCollection items, otherwise false.
But there is an error telling me that i can not convert system.collections.generic.iEnumerable>T> in system.linq.Enumberable. Where is my mistake?
This should work:
return otherCollection.Any(collection.Contains);
You don't need any cast to call Contains method,because there is already an extension method that takes an IEnumerable<T> as first parameter.
I have two sets of datarows. They are each IEnumerable. I want to append/concatenate these two lists into one list. I'm sure this is doable. I don't want to do a for loop and noticed that there is a Union method and a Join method on the two Lists. Any ideas?
Assuming your objects are of the same type, you can use either Union or Concat. Note that, like the SQL UNION keyword, the Union operation will ensure that duplicates are eliminated, whereas Concat (like UNION ALL) will simply add the second list to the end of the first.
IEnumerable<T> first = ...;
IEnumerable<T> second = ...;
IEnumerable<T> combined = first.Concat(second);
or
IEnumerable<T> combined = first.Union(second);
If they are of different types, then you'll have to Select them into something common. For example:
IEnumerable<TOne> first = ...;
IEnumerable<TTwo> second = ...;
IEnumerable<T> combined = first.Select(f => ConvertToT(f)).Concat(
second.Select(s => ConvertToT(s)));
Where ConvertToT(TOne f) and ConvertToT(TTwo s) represent an operation that somehow converts an instance of TOne (and TTwo, respectively) into an instance of T.
I just encountered a similar situation where I need to concatenate multiple sequences.
Naturally searched for existing solutions on Google/StackOverflow, however did not find anything the did not evaluate the enumerable, e.g. convert to array then use Array.Copy() etc., so I wrote an extension and static utiltiy method called ConcatMultiple.
Hope this helps anyone that needs to do the same.
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="first">The first sequence to concatenate.</param>
/// <param name="source">The other sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(this IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
if (first == null)
throw new ArgumentNullException("first");
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(source);
}
private static IEnumerable<TSource> ConcatIterator<TSource>(IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
foreach (var iteratorVariable in first)
yield return iteratorVariable;
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="source">The sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(params IEnumerable<TSource>[] source)
{
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(source);
}
private static IEnumerable<TSource> ConcatIterator<TSource>(params IEnumerable<TSource>[] source)
{
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
The Join method is like a SQL join, where the list are cross referenced based upon a condition, it isn't a string concatenation or Adding to a list. The Union method does do what you want, as does the Concat method, but both are LAZY evaluations, and have the requirement the parameters be non-null. They return either a ConcatIterator or a UnionIterator, and if called repeatedly this could cause problems. Eager evaluation results in different behavior, if that is what you want, then an extension method like the below could be used.
public static IEnumerable<T> myEagerConcat<T>(this IEnumerable<T> first,
IEnumerable<T> second)
{
return (first ?? Enumerable.Empty<T>()).Concat(
(second ?? Enumerable.Empty<T>())).ToList();
}
Delayed invocation of the second and subsequent enumerables
I usually use Linq IEnumerable<T>.Concat() but today I needed to be 100% sure that the second enumeration was not enumerated until the first one has been processed until the end. (e.g. two db queries that I didn't want to run simultaneously). So the following function made the trick to delay the enumerations.
IEnumerable<T> DelayedConcat<T>(params Func<IEnumerable<T>>[] enumerableList)
{
foreach(var enumerable in enumerableList)
{
foreach (var item in enumerable())
{
yield return item;
}
}
}
Usage:
return DelayedConcat(
() => GetEnumerable1(),
() => GetEnumerable2(),
// and so on.. () => GetEnumerable3(),
);
In this example GetEnumerable2 function invocation will be delayed until GetEnumerable1 has been enumerated till the end.