I am new to LINQ and discovered yesterday that you can have multiple where clauses such as:
var items = from obj in objectList
where obj.value1 < 100
where obj.value2 > 10
select obj;
Or you can write:
var items = from obj in objectList
where obj.value1 < 100
&& obj.value2 > 10
select obj;
What is the difference between the two?
The first one will be translated into:
objectList.Where(o => o.value1 < 100).Where(o=> o.value2 > 10)
while the second one will be translated into:
objectList.Where(o => o.value1 < 100 && o.value2 > 10)
So, in the first one, a first filtered sequence is filtered again (the first sequence contains all the objects with value1 < 100, the second contains all the objects from the first sequence with value2 > 10), while in the second one you do the same comparisons in a single lambda expression. This is valid for LINQ to Objects; for other providers it depends on how the expression is translated.
The accepted answer is slightly inaccurate.
As @Philippe said, the first one will be translated into:
objectList.Where(o => o.value1 < 100).Where(o=> o.value2 > 10)
while the second one will be translated into:
objectList.Where(o => o.value1 < 100 && o.value2 > 10)
But Linq has a little optimization for chained Where calls.
If you inspect Linq's source code you will see the following:
class WhereEnumerableIterator<TSource> : Iterator<TSource>
{
public override IEnumerable<TSource> Where(Func<TSource, bool> predicate)
{
return new WhereEnumerableIterator<TSource>(source,
CombinePredicates(this.predicate, predicate));
}
}
What CombinePredicates does is combine the two predicates with && between them:
static Func<TSource, bool> CombinePredicates<TSource>(Func<TSource, bool> predicate1,
Func<TSource, bool> predicate2)
{
return x => predicate1(x) && predicate2(x);
}
So objectList.Where(X).Where(Y) is equivalent to objectList.Where(X && Y), except for the creation time of the query (which is extremely short anyway) and the invocation of two predicates instead of one.
The bottom line is that it does not filter or iterate the collection twice; it iterates once with a composite predicate.
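A minimal sketch to check the single-pass claim on LINQ to Objects. The ChainedWhereDemo class, its Source iterator and the ItemsPulled counter are hypothetical names used only for illustration; the sketch counts how many elements are pulled from the source for the chained and the combined form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainedWhereDemo
{
    // Hypothetical counter: how many elements were pulled from the source.
    public static int ItemsPulled;

    // Hypothetical source: yields 1..n lazily and records every element it hands out.
    static IEnumerable<int> Source(int n)
    {
        for (int i = 1; i <= n; i++)
        {
            ItemsPulled++;
            yield return i;
        }
    }

    // Two chained Where calls, as produced by two where clauses.
    public static int[] Chained()
    {
        ItemsPulled = 0;
        return Source(10).Where(x => x < 8).Where(x => x > 3).ToArray();
    }

    // One Where call with both conditions joined by &&.
    public static int[] Combined()
    {
        ItemsPulled = 0;
        return Source(10).Where(x => x < 8 && x > 3).ToArray();
    }
}
```

Both methods should return the same elements, and ItemsPulled should equal the length of the source after either call, which is what you would expect if the source is walked only once.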
The first one translates to:
objectList.Where(o => o.value1 < 100)
.Where(o => o.value2 > 10);
while the latter gets you:
objectList.Where(o => o.value1 < 100 && o.value2 > 10);
It's functionally the same, and while the second one would spare a method call, the difference in performance is negligible. Use what's more readable for you.
That is, if you're using LINQ to Objects. If you're using a provider, it depends on how it's implemented (if the predicate is not factored in the resulting query, the result can be sub-optimal).
I've just profiled it.
There is no difference in the generated SQL code.
At the most basic level, you get two Where operations instead of one. Using Reflector is the best way to examine what comes out the other end of a query expression.
Whether they get optimised down to the same thing depends on the actual LINQ provider - it needs to take the entire tree and convert it to another syntax. For LINQ To Objects, it doesn't.
C# in Depth is good to give you an understanding of this topic.
How about this for an answer: with && you can't guarantee that both expressions will be evaluated (if the first condition is false then the second might not be evaluated). With the two where clauses you can. No idea whether it's true but it sounds good to me!
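The guess above can be checked for LINQ to Objects with a small sketch. ShortCircuitDemo and its method names are hypothetical; the sketch counts how often the second condition runs in each form. With chained Where calls, only elements that pass the first filter ever reach the second one, so the second condition should be skipped in both forms when the first one fails.

```csharp
using System;
using System.Linq;

public static class ShortCircuitDemo
{
    static readonly int[] Data = { 1, 50, 150, 200 };

    // Counts evaluations of the second condition with two chained Where calls.
    public static int SecondCallsChained()
    {
        int calls = 0;
        Data.Where(x => x < 100)
            .Where(x => { calls++; return x > 10; })
            .ToList(); // force enumeration
        return calls;
    }

    // Counts evaluations of the second condition with a single && predicate.
    public static int SecondCallsCombined()
    {
        int calls = 0;
        Func<int, bool> second = x => { calls++; return x > 10; };
        Data.Where(x => x < 100 && second(x))
            .ToList(); // force enumeration
        return calls;
    }
}
```

Only two of the four elements are below 100, so both methods should report two evaluations of the second condition.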
All other things being equal, I would choose the condition1 && condition2 version, for the sake of code readability.
I'm performing a query in which I occasionally expect NULL like this:
.Where(d => d.Id == varid && d.Date >= vardate1 && d.Date <= vardate2)
.Sum(d => (decimal?)d.Delta);
Delta is a non-nullable decimal and intellisense shows that the result of the Sum is going to be a decimal? because I introduced the cast. The generated SQL is as expected, and when run manually it correctly returns NULL when there are no matching records. However, the result from the materialized query is always 0. This behavior is different than the non-core EF, which would have returned null. Is this really the new expected behavior? If so, how can I force it to return null when I need it to? Null and 0 have different meanings in this context.
I can bring in the records first and then sum on the server, but it would be nice if EF core did what I would expect on its own.
Most likely a bug, but knowing the EF Core designers' vision for non-nullable Max / Min / Average and also First / Single translation, I won't be surprised if they are doing that intentionally to emulate the (weird) LINQ to Objects nullable Sum behavior, which returns 0 even though the result type of the method is nullable.
It can be seen by the following snippet
decimal? result = Enumerable.Empty<decimal?>().Sum(); // result is 0
and is even documented (!?):
Remarks
This method returns zero if source contains no elements.
The "funny" thing is that this is just for root query Sum execution - inside projections it has the SQL behavior you are looking for.
Which leads us to the workaround of using the group-by-constant trick combined with a projection. To avoid repeating it everywhere you need it, and to make it easy to remove if this gets fixed in some later EF Core version, you can encapsulate it in a custom extension method like this:
public static partial class EfCoreExtensions
{
public static decimal? SumOrDefault<T>(this IQueryable<T> source, Expression<Func<T, decimal?>> selector)
=> source.GroupBy(e => 0, selector).Select(g => g.Sum()).AsEnumerable().FirstOrDefault();
}
and replace
.Sum(d => (decimal?)d.Delta);
with
.SumOrDefault(d => d.Delta);
Just make sure you use it only for final calls, because if you use it inside a query expression tree, like any custom method it won't be recognized and will cause client evaluation or a runtime exception.
The above "group by constant trick" does not work in EF Core 5.0.
A variation of the extension method using an Aggregate function can achieve the desired result.
So to have all NULLs return Null, but otherwise return the sum of non null values:
public static decimal? SumOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, decimal?> selector)
=> (from s in source select selector(s))
.Aggregate((decimal?)null, (acc, item) => acc.HasValue ? acc + item.GetValueOrDefault() : item);
Or, if you prefer that any NULL value results in NULL being returned:
public static decimal? SumAllOrNull<TSource>(this IEnumerable<TSource> source, Func<TSource, decimal?> selector)
{
// Materialize once so that an empty source or any NULL element yields NULL.
var values = source.Select(selector).ToList();
return values.Count == 0 || values.Any(v => !v.HasValue) ? (decimal?)null : values.Sum();
}
But note that, as mentioned, this will work only on LINQ to Objects, not LINQ to SQL, so you need ToList() or AsEnumerable() beforehand, which brings more data back from the database than you might want or need.
.Where(d => d.Id == varid && d.Date >= vardate1 && d.Date <= vardate2)
.AsEnumerable()
.SumOrDefault(d => d.Delta);
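A minimal in-memory sketch of how the Aggregate-based variant behaves compared with the built-in Sum. The NullableSumDemo class and its helper method names are hypothetical; the extension method is repeated so the block is self-contained, and everything here runs on LINQ to Objects only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableSumDemo
{
    // Same Aggregate-based approach as above, repeated so the sketch stands alone.
    public static decimal? SumOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, decimal?> selector)
        => source.Select(selector)
                 .Aggregate((decimal?)null, (acc, item) => acc.HasValue ? acc + item.GetValueOrDefault() : item);

    // Built-in Sum over an empty nullable sequence.
    public static decimal? BuiltInEmpty() => Enumerable.Empty<decimal?>().Sum();

    // Custom sum over an empty sequence.
    public static decimal? CustomEmpty() => Enumerable.Empty<decimal?>().SumOrDefault(d => d);

    // Custom sum where every element is NULL.
    public static decimal? CustomAllNull() => new decimal?[] { null, null }.SumOrDefault(d => d);

    // Custom sum where NULLs are mixed with values; NULLs are skipped.
    public static decimal? CustomMixed() => new decimal?[] { 1.5m, null, 2.5m }.SumOrDefault(d => d);
}
```

The built-in Sum should give 0 for the empty sequence, while the custom method should give null for the empty and all-NULL cases and the sum of the non-NULL values otherwise.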
I am wondering what happens under the hood of list.First() and list[0], and which performs better.
For example which is faster?
for(int i = 0; i < 999999999999... i++)
{
str.Split(';').First() vs. str.Split(';')[0]
list.Where(x => x > 1).First() vs. list.Where(x => x > 1).ToList()[0]
}
Sorry if this is a duplicate question.
Which performs better? The array accessor, since that doesn't need a method to be put on the stack and doesn't have to execute the First method to eventually get to the array accessor.
As the Reference Source of Enumerable shows, First() is actually:
IList<TSource> list = source as IList<TSource>;
if (list != null) {
if (list.Count > 0) return list[0];
}
So it doesn't do anything else, it just takes more steps to get there.
For your second part (list.Where(x => x > 1).First() vs. list.Where(x => x > 1).ToList()[0]):
Where returns an IEnumerable, which isn't IList so it doesn't go for the first part of the First method but the second part:
using (IEnumerator<TSource> e = source.GetEnumerator()) {
if (e.MoveNext()) return e.Current;
}
This will traverse each item one by one until it gets the desired index. In this case 0, so it will get there very soon. The other one calling ToList will always be less efficient since it has to create a new object and put all items in there in order to get the first one. The first call is definitely faster.
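The difference described above can be made visible by counting predicate evaluations. This is a minimal sketch for LINQ to Objects; FirstVsToListDemo and its method names are hypothetical, and the counts say nothing about wall-clock time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVsToListDemo
{
    // Counts predicate evaluations when taking the first match with First().
    public static int PredicateCallsWithFirst()
    {
        int calls = 0;
        List<int> list = Enumerable.Range(1, 1000).ToList();
        int first = list.Where(x => { calls++; return x > 1; }).First();
        return calls;
    }

    // Counts predicate evaluations when materializing with ToList() and indexing.
    public static int PredicateCallsWithToList()
    {
        int calls = 0;
        List<int> list = Enumerable.Range(1, 1000).ToList();
        int first = list.Where(x => { calls++; return x > 1; }).ToList()[0];
        return calls;
    }
}
```

With First(), enumeration stops at the first match (the second element here), whereas ToList() has to test every element of the list before the indexer can be applied.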
Simple comparison of performance: http://pastebin.com/bScgyDaM
str.Split(';').First(); : 529103
str.Split(';')[0]; : 246753
list.Where(x => x == "a").First(); : 98590
list.Where(x => x == "a").ToList()[0]; : 230858
First vs [0]
If you have a simple array, [0] is faster because it only calculates an address in memory.
But if you combine it with other LINQ operators, First() is faster. For example, Where().First() searches only until it finds the first matching element. Where().ToList()[0] finds all matching elements, converts them to a list and then does a simple index lookup.
Another thing is that Where() is a deferred method. A query that contains only deferred methods is not executed until the items in the result are enumerated.
So you can write:
var query = list.Where(x => x > 12);
list.Add(10);
list.Add(13);
foreach (int item in query)
{
Console.WriteLine(item);
}
13 will appear in the result but 10 will not. Both 10 and 13 were added to the list before the query was actually executed, which only happens when it is enumerated in the foreach.
If you want to know more about Linq you can read that book Pro LINQ by Joseph Rattz and Adam Freeman http://www.apress.com/9781430226536
There is no significant difference between these, you will get pretty much the same results:
str.Split(';').First() vs. str.Split(';')[0]
For your second comparison, here you are asking only the first element
list.Where(x => x > 1).First()
So as soon as the Where iterator returns an item, it's done. But in the second one you are putting all the results into a list and then getting the first item using the indexer, therefore it will be slower.
list.Where(x => x > 1).First() vs. list.Where(x => x > 1).ToList()[0]
First() should be faster when applied to an Enumerable because of deferred execution. In your case, the result will be returned as soon as one item in your list has been found that matches the criteria Where(x => x > 1).
In the second example, your initial list has to be fully enumerated, and ALL items matching the criteria will be put in a temporary list, from which you get the first item with the indexer.
str.Split(';').First() vs. str.Split(';')[0]
In that case the method Split() already returns an array. The array accessor might be marginally faster, but the performance gain will be negligible in most cases.
Suppose we have a source IEnumerable sequence:
IEnumerable<Tuple<int, int>> source = new [] {
new Tuple<int, int>(1, 2),
new Tuple<int, int>(2, 3),
new Tuple<int, int>(3, 2),
new Tuple<int, int>(5, 2),
new Tuple<int, int>(2, 0),
};
We want to apply some filters and some transformations:
IEnumerable<int> result1 = source.Where(t => (t.Item1 + t.Item2) % 2 == 0)
.Select(t => t.Item2)
.Select(i => 1 / i);
IEnumerable<int> result2 = from t in source
where (t.Item1 + t.Item2) % 2 == 0
let i = t.Item2
select 1 / i;
These two queries are equivalent, and both will throw a DivideByZeroException on the last item.
However, when the second query is enumerated, the VS debugger will let me inspect the entire query, thus very handy in determining the source of the problem.
However, there is no equivalent help when the first query is enumerated. Inspecting the LINQ implementation yields no useful data, probably due to the binary being optimized.
Is there a way to usefully inspect the enumerable values up the "stack" of IEnumerables when not using query syntax? Query syntax is not an option because sharing code is impossible with it (ie, the transformations are non trivial and used more than once).
But you can debug the first one. Just insert a breakpoint on any one of the lambdas and you're free to inspect the values of the parameters or whatever else is in scope.
When debugging you can then inspect the values of (in the case of breaking within the first Where) t, t.Item1, etc.
As for the reason that you can inspect t when performing the final select in your second query, but not your first, it's because you haven't created equivalent queries. The second query you wrote, when written out by the compiler, will not generate something like your first query. It will create something subtly, but still significantly, different. It will create something like this:
IEnumerable<int> result2 = source.Where(t => (t.Item1 + t.Item2) % 2 == 0)
.Select(t => new
{
t,
i = t.Item2,
})
.Select(result => 1 / result.i);
A let clause doesn't just select out that value, as the first query you wrote does. It selects out a new anonymous type that holds the value from the let clause as well as the previous value, and then modifies the subsequent clauses to pull out the appropriate variable. That's why the "previous" variables (i.e. t) are still in scope at the end of the query (at both compile time and runtime; that alone should have been a big hint to you). Using the query I provided above, when breaking on the select, you can see the value of result.t through the debugger.
I'm trying to create a filter method for an Entity Framework list and to better understand Expression<Func<...
I have a Test Function like this.
public IQueryable<T> Filter<T>(IEnumerable<T> src, Expression<Func<T, bool>> pred)
{
return src.AsQueryable().Where(pred);
}
and if I do this:
context.Table.Filter(e => e.ID < 500);
or this:
context.Table.Filter(e => e.SubTable.Where(et => et.ID < 500).Count() > 0 && e.ID < 500);
it all works well.
But if I do this:
context.Table.Filter(e => e.SubTable.Filter(et => et.ID < 500).Count() > 0 && e.ID < 500);
or this:
context.Table.Where(e => e.SubTable.Filter(et => et.ID < 500).Count() > 0 && e.ID < 500);
I receive an error:
LINQ to Entities does not recognise the method ...Filter...
Why does it work in one case and not in the other? What should I change in the filter to make it work with related tables?
I prefer to stay away from other external libraries as what I want is to learn how it works and be able to use it in any scenario in future.
In the first two cases, the filter runs in the database correctly.
Jon and Tim already explained why it doesn't work.
Assuming that the filter code inside Filter is not trivial, you could change Filter so that it returns an expression EF can translate.
Let's assume you have this code:
context.Table.Where(x => x.Name.Length > 500);
You can now create a method that returns this expression:
Expression<Func<YourEntity, bool>> FilterByNameLength(int length)
{
return x => x.Name.Length > length;
}
Usage would be like this:
context.Table.Where(FilterByNameLength(500));
The expression you build inside FilterByNameLength can be arbitrarily complex as long as you could pass it directly to Where.
It's useful to understand the difference between Expression<Func<>> and Func<>.
An Expression e => e.ID < 500 stores the info about that expression: that there's a T e, that you're accessing the property ID, calling the < operator with the int value 500. When EF looks at that, it might turn it into something like [SomeTable].[ID] < 500.
A Func e => e.ID < 500 is a method equivalent to:
static bool MyMethod(T e) { return e.ID < 500; }
It is compiled as IL code that does this; it's not designed to be 'reconstituted' into a SQL query or anything else, only run.
When EF takes your Expression, it must understand every piece of it, because it uses that to build a SQL query. It is programmed to know what the existing Where method means. It does not know what your Filter method means, even though it's a trivial method, so it just gives up.
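A minimal sketch of that difference, assuming a plain int parameter instead of an entity; ExpressionVsFuncDemo and its member names are hypothetical. The same lambda is assigned once as an expression tree, whose structure can be inspected the way a provider would, and once as a delegate, which can only be invoked.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsFuncDemo
{
    // Stored as data: a tree describing "id < 500".
    public static readonly Expression<Func<int, bool>> AsExpression = id => id < 500;

    // Stored as compiled code: can only be called.
    public static readonly Func<int, bool> AsFunc = id => id < 500;

    // Reads the kind of the root node of the tree, as a provider would when translating.
    public static ExpressionType RootNodeType()
    {
        var body = (BinaryExpression)AsExpression.Body;
        return body.NodeType;
    }

    // An expression tree can still be turned into a delegate when you need to run it.
    public static bool RunExpression(int id)
    {
        Func<int, bool> compiled = AsExpression.Compile();
        return compiled(id);
    }
}
```

A call to a custom method such as Filter inside the lambda would show up in such a tree as a method-call node the provider has no translation for, which is consistent with the error in the question.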
Why does it work in one case and not in the other?
Because EF doesn't really "know" about your Filter method. It has no understanding of what it's meant to do, so it doesn't know how to translate it into SQL. Compare that with Where etc, which it does understand.
The version where you call it directly on the initial table works because that way you don't end up with an expression tree containing a call to Filter - it just calls Filter directly, which in turn does build up a query... but one which EF understands.
I'd be very surprised if you could work out a way of getting your Filter method to work within an EF query... but you've already said that using Where works anyway, so why use Filter at all? I'd use the Where version - or better yet, use the Any overload which takes a predicate:
context.Table.Filter(e => e.SubTable.Any(et => et.ID < 500) && e.ID < 500);
I'm making a query with Linq, backed by an Entity Framework datasource.
I'm getting the following error:
LINQ to Entities does not recognize the method 'Double Sqrt(Double)'
method, and this method cannot be translated into a store expression.
Here's a simplified version of my function (my version is more complex and uses ACos, sin, cos and other C# Math class functions).
var objects =
from n in context.Products.Where(p => p.r == r)
let a = Math.Sqrt((double)n.Latitude)
where a < 5
orderby a
select n;
return objects.Take(100).ToList();
I think the problem may be related to the situation that Linq to EF (and a SQL datasource) has a limited set of supported function compared to Linq to SQL. I'm relatively new to this so I'm not 100% sure.
Can anyone give me a pointer in the right direction?
Cheers,
Try SquareRoot function defined in SqlFunctions
var objects =
from n in context.Products.Where(p => p.r == r)
let a = SqlFunctions.SquareRoot((double)n.Latitude)
where a < 5
orderby a
select n;
return objects.Take(100).ToList();
If you start off learning LINQ with LINQ-to-objects, you'll run into this a lot once you start using LINQ-to-Entities.
You can do pretty much anything that will compile in LINQ-to-objects, because LINQ-to-objects translates into code when compiled.
LINQ-to-Entities (and LINQ-to-SQL) translates into expression trees. So, only the syntax that that specific LINQ provider allowed for is valid. In my first "for real" LINQ-to-Entities expression, which compiled just fine, I ran into this error about 5 times, as one by one I removed code that wasn't handled by LINQ-to-Entities.
So when you see this, it's normal and common. You need to find another way each time.
You could avoid the problem with a logical equivalent:
var objects =
from n in context.Products.Where(p => p.r == r)
where (double)n.Latitude < 25
orderby n.Latitude
select n;
return objects.Take(100).ToList();
You could also pull all the data to the client and then run your code using LINQ-to-objects:
var objects =
from n in context.Products.Where(p => p.r == r).ToList()
let a = Math.Sqrt((double)n.Latitude)
where a < 5
orderby a
select n;
return objects.Take(100).ToList();
Finally, you should be able to do this math on the server. Check out the System.Data.Objects.SqlClient.SqlFunctions class. These functions will translate into the store expression. SqlFunctions.SquareRoot in particular looks like it might be the ticket.
Please try
var objects =
from n in context.Products.Where(p => p.r == r)
let a = Math.Pow((double)n.Latitude, 0.5)
where a < 5
orderby a
select n;