I am writing some code that takes a LINQ to SQL IQueryable<T> and adds further dynamically generated Where clauses. For example here is the skeleton of one of the methods:
IQueryable<T> ApplyContains(IQueryable<T> source, string field, string value)
{
    Expression<Func<T, bool>> lambda;
    // ... dynamically generate lambda for p => p.<field>.Contains(value) ...
    return source.Where(lambda);
}
I might chain several of these methods together and finish off with a Skip/Take page.
Am I correct in thinking that, when the IQueryable is finally evaluated, an exception will be thrown if anything in the lambda expressions can't be translated to SQL? In particular, I'm concerned I might accidentally do something that causes the IQueryable to evaluate early and then continue the evaluation in memory (thereby pulling in thousands of records).
From some things I've read I suspect IQueryable will not evaluate early like this. Can anyone confirm this please?
Yes, you are correct in thinking that your IQueryable can throw an exception at runtime if part of the expression can't be translated into SQL. Because of this, I think it's a good idea to keep your queries in a business layer class (like a data service or repository) and make sure each query is covered by an automated test.
Regarding your LINQ expression evaluating at an unexpected time, the basic rule to keep in mind is that the expression is evaluated whenever you enumerate it with a foreach. This includes methods that enumerate behind the scenes, like ToList() and FirstOrDefault().
BTW an easy way to tell if a method is going to call a foreach and force your lambda to evaluate is to check whether the return value on that method is an IQueryable. If the return value is another IQueryable then the method is probably just adding to the expression but not forcing it to evaluate. If the return value is a List<T>, an anonymous type, or anything that looks like data instead of an IQueryable then the method had to force your expression to evaluate to get that data.
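The return-type rule can be checked without a database. The following is only a sketch: `Rows()` is a hypothetical in-memory stand-in for a table, wrapped with AsQueryable(), and the `Reads` counter is an assumption added purely to observe when the data is actually read.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ReturnTypeRule
{
    // Counts how many times the underlying data is actually read.
    public static int Reads;

    // Hypothetical stand-in for a table; the body runs only when enumerated.
    private static IEnumerable<int> Rows()
    {
        Reads++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static (int readsAfterBuilding, int readsAfterToList) Run()
    {
        Reads = 0;

        // Each call returns IQueryable<int>, so it only extends the expression.
        IQueryable<int> query = Rows().AsQueryable()
            .Where(x => x > 1)
            .Skip(0)
            .Take(10);
        int readsAfterBuilding = Reads;

        // ToList() returns data (a List<int>), so it has to run the query.
        List<int> data = query.ToList();
        int readsAfterToList = Reads;

        return (readsAfterBuilding, readsAfterToList);
    }
}
```

In this sketch the counter stays at 0 while the query is being composed and becomes 1 only after ToList() is called.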
Your thinking is correct.
As long as you pass the IQueryable an Expression in your Where clauses it will not evaluate unexpectedly.
Also, the extension methods beginning with "To" will cause evaluation (e.g. ToList(), ToArray()).
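For illustration, here is one possible way to generate the `p => p.<field>.Contains(value)` lambda from the question with the expression API. This is a sketch, not the asker's actual code: the `Person` class is hypothetical, and the helper assumes `field` names a string property. Because the predicate is passed to Where as an Expression, the result is still an IQueryable<T> and nothing is evaluated inside the method.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical entity used only for this example.
public class Person
{
    public string Name { get; set; }
}

public static class DynamicFilters
{
    // Builds p => p.<field>.Contains(value); assumes <field> is a string property.
    public static IQueryable<T> ApplyContains<T>(IQueryable<T> source, string field, string value)
    {
        ParameterExpression p = Expression.Parameter(typeof(T), "p");
        MemberExpression property = Expression.Property(p, field);

        // Pick the string.Contains(string) overload explicitly.
        MethodInfo contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });

        MethodCallExpression body = Expression.Call(
            property, contains, Expression.Constant(value, typeof(string)));

        Expression<Func<T, bool>> lambda = Expression.Lambda<Func<T, bool>>(body, p);

        // Queryable.Where only appends to the expression tree.
        return source.Where(lambda);
    }
}
```

Several such calls can be chained and followed by Skip/Take; whether the final tree translates to SQL still depends on the provider.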
If the IQueryable interface performs the query expression in the server rather than fetching all records like IEnumerable, why is IQueryable not replaced by IEnumerable where it can be faster and more efficient?
DbSet<T> has two flavors of Where (IQueryable and IEnumerable). Is there a way to call the IEnumerable version, since the IQueryable one is called by default, without calling ToList()?
If the IQueryable interface performs the query expression on the server rather than fetching all records like IEnumerable, why is IQueryable not replaced by IEnumerable where it can be faster and more efficient?
IQueryable and IEnumerable represent two different things. Think of an IQueryable as a "question": it does not have any results itself. An IEnumerable is an "answer": it only has data connected to it, but you can't tell what generated that data.
It is not that an IQueryable is "faster" per se; it just allows you to put your filtering and projections into the "question" you ask the SQL server and let it return only the answers it needs to (in the form of an IEnumerable, by calling .ToList() or similar).
If you only use an IEnumerable, the only question you can ask is "Give me everything you know", and you then perform your filtering and projections on the answer it gives you. That is why IQueryable is considered faster: there is a lot less data to process because you were able to ask the server a more specific question.
The reason IQueryable has not replaced IEnumerable everywhere is that the thing you are asking has to be able to understand the question. It takes a lot of work to parse every possible thing you could ask it to filter or project on, so most implementations limit themselves to the common things they know they need to be able to answer. For example, in Entity Framework, when you ask a question it does not know how to handle, you will get an error that says something similar to "Specified method is not supported" when you try to get an IEnumerable (an answer) from the IQueryable.
DbSet<T> has two flavors of Where (IQueryable and IEnumerable). Is there a way to call the IEnumerable version, since the IQueryable one is called by default, without calling ToList()?
The class DbSet<T> has no Where method on it at all. The two Where functions come from two extension methods, Enumerable.Where and Queryable.Where. You can force it to use the Enumerable overload by casting the object to an IEnumerable<T> (or calling AsEnumerable()) before you call the extension method. However, do remember that Queryable.Where filters the question, while Enumerable.Where only filters the result.
It is wasteful to ask for results from the server to then just throw them away, so I would not recommend doing this.
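The overload choice is made at compile time from the static type of the receiver. The sketch below uses an in-memory array wrapped with AsQueryable() as a stand-in for a DbSet<T> (an assumption, so that it runs without a database) and reports whether each result is still an IQueryable.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WhereBinding
{
    public static (bool queryableStaysQueryable, bool enumerableIsQueryable) Run()
    {
        // Stand-in for a DbSet<T>: something whose static type is IQueryable<int>.
        IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();

        // Static type IQueryable<int>: binds to Queryable.Where (filters the "question").
        IQueryable<int> viaQueryable = source.Where(x => x > 1);

        // Static type IEnumerable<int>: binds to Enumerable.Where (filters the "answer").
        IEnumerable<int> viaEnumerable = source.AsEnumerable().Where(x => x > 1);

        return (viaQueryable is IQueryable<int>, viaEnumerable is IQueryable<int>);
    }
}
```

With a real provider, everything after AsEnumerable() would run in memory on whatever rows the preceding part of the query returns.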
Let's say I need an extension method which selects only the required properties from different sources. The source could be the database or an in-memory collection. So I have defined an extension method like this:
public static IQueryable<TResult> SelectDynamic<T, TResult>(
    this IQueryable<T> source,
    ...)
This works fine for IQueryables. But, I have to call this function also for IEnumerables.
And in that case, I can call it with the help of .AsQueryable():
myEnumerable.AsQueryable()
.SelectDynamic(...)
.ToList();
Both work fine. And if both work fine, under which conditions would I have to create two different extension methods for the same purpose, one for IEnumerable and another one for IQueryable?
My method has to send a query to the database in the case of IQueryable.
For example, here is the source of the .Select extension method inside the System.Linq namespace:
.Select for IEnumerable
.Select for IQueryable
I am repeating my main question again:
My method must send a query to the database in the case of IQueryable, but not when working with IEnumerable. For now, I am using AsQueryable() for the enumerables, because I don't want to write the same code again for IEnumerable. Can it have some side effects?
If your code only actually works when the objects it's dealing with are loaded in memory, just supply the IEnumerable variant and let your callers decide when they want to convert an IQueryable into an in-memory IEnumerable.
Generally, you won't implement new variations around IQueryable unless you're writing a new database provider.
myEnumerable.AsQueryable() returns a wrapper object: new EnumerableQuery<TElement>(myEnumerable) (source code), unless myEnumerable already implements IQueryable<TElement>, in which case it is returned as-is.
This EnumerableQuery class implements both IEnumerable<T> and IQueryable<T>.
When using the EnumerableQuery result of .AsQueryable() as an IEnumerable, the implementation of the interface method IEnumerable<T>.GetEnumerator() simply returns the original source's enumerator, so there is no change and minimal overhead.
When using the result of .AsQueryable() as an IQueryable, the implementation of the interface property IQueryable.Expression simply returns Expression.Constant(this), ready to be evaluated later as an IEnumerable when the whole expression tree is consumed.
(All the other methods and code paths of EnumerableQuery are not really relevant, when the EnumerableQuery is constructed directly from an IEnumerable, as far as I can tell)
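These observations can be checked directly. The snippet below is a small sketch that inspects what AsQueryable() returns for a List<int>; the class and method names are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class AsQueryableInspection
{
    public static (bool isEnumerableQuery, ExpressionType rootNode, bool sameInstance) Run()
    {
        List<int> list = new List<int> { 1, 2, 3 };

        // A plain collection gets wrapped in an EnumerableQuery<int>.
        IQueryable<int> q = list.AsQueryable();
        bool isEnumerableQuery = q is EnumerableQuery<int>;

        // The root of its expression tree is a constant node.
        ExpressionType rootNode = q.Expression.NodeType;

        // Something that is already an IQueryable<int> is returned unchanged.
        bool sameInstance = object.ReferenceEquals(q, q.AsQueryable());

        return (isEnumerableQuery, rootNode, sameInstance);
    }
}
```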
If I understand you correctly, you have implemented your method SelectDynamic<TResult>() in such a way that you construct an expression tree inside the method that produces the desired result when compiled.
As far as I understand the source code, when you call e.g. myEnumerable.AsQueryable().SelectDynamic().ToList(), the expression tree you constructed is compiled and executed on myEnumerable, and the total overhead should be fairly minimal, since all this work is only done once per query, not once per element.
So I think there is nothing wrong with implementing your IEnumerable extension method like this:
public static IEnumerable<TResult> SelectDynamic<T, TResult>(
    this IEnumerable<T> source, ...)
{
    return source.AsQueryable().SelectDynamic<T, TResult>(...);
}
There is some slight overhead, since the expression tree is compiled each time the query is executed, and I am not sure the result of that compilation is cached. But I think that will not be noticeable in most circumstances, unless you execute this query a thousand times per second.
There should be no other side effects, apart from slight performance issues, in implementing the IEnumerable extension method in this way.
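To make the delegation pattern concrete, here is a sketch with a hypothetical stand-in for SelectDynamic. The real method's parameters are not shown in the question, so `SelectProperty`, its `propertyName` parameter, and the `Customer` class are all assumptions made for the example. The IEnumerable overload simply forwards to the IQueryable one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this example.
public class Customer
{
    public string Name { get; set; }
}

public static class DynamicSelectExtensions
{
    // Hypothetical stand-in for SelectDynamic: projects one property by name.
    public static IQueryable<TResult> SelectProperty<T, TResult>(
        this IQueryable<T> source, string propertyName)
    {
        ParameterExpression p = Expression.Parameter(typeof(T), "p");
        MemberExpression body = Expression.Property(p, propertyName);
        Expression<Func<T, TResult>> selector = Expression.Lambda<Func<T, TResult>>(body, p);
        return source.Select(selector);
    }

    // In-memory variant: reuses the expression-building code via AsQueryable().
    public static IEnumerable<TResult> SelectProperty<T, TResult>(
        this IEnumerable<T> source, string propertyName)
    {
        // The receiver's static type is IQueryable<T>, so the overload above is chosen.
        return source.AsQueryable().SelectProperty<T, TResult>(propertyName);
    }
}
```

A call such as `customers.SelectProperty<Customer, string>("Name")` on a List<Customer> binds to the IEnumerable overload and runs entirely in memory.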
This code causes a NotSupportedException.
var detailList = context.Details.Where(x => x.GetType().GetProperty("Code").GetValue(x,null).ToString() == "00101").ToList();
But this code works.
var detailList = context.Details.AsEnumerable().Where(x => x.GetType().GetProperty("Code").GetValue(x,null).ToString() == "00101").ToList();
MSDN says:
- AsEnumerable() Returns the input typed as IEnumerable
- DbSet Is an IEnumerable
So why do we need to use the AsEnumerable() method?
DbSet is also IQueryable.
IQueryable has its own set of LINQ extension methods that translate expression trees into SQL, and do not support reflection.
By calling AsEnumerable(), you change the compile-time type of the expression to IEnumerable<T>, forcing the extension methods to bind to the standard LINQ ones.
If you prefer to run your query on the server, you should build an expression tree instead of using reflection.
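As a sketch of that suggestion, the reflection-based predicate could be replaced by an expression tree built from the property name. The `Detail` class below is a hypothetical stand-in for the entity in the question, and the helper assumes the property is a string; whether a given provider translates the resulting tree is still up to that provider.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity standing in for the one in the question.
public class Detail
{
    public string Code { get; set; }
}

public static class PropertyEquals
{
    // Builds x => x.<propertyName> == expected; assumes a string property.
    public static Expression<Func<T, bool>> Build<T>(string propertyName, string expected)
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        BinaryExpression body = Expression.Equal(
            Expression.Property(x, propertyName),
            Expression.Constant(expected, typeof(string)));
        return Expression.Lambda<Func<T, bool>>(body, x);
    }
}
```

The result can be passed to Queryable.Where, for example `context.Details.Where(PropertyEquals.Build<Detail>("Code", "00101"))`, so the property lookup happens while building the tree rather than inside the query.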
The first query attempts to have the query provider translate the query into SQL and execute it against the database. It fails to create a valid database query, so it fails with the error mentioned.
Using AsEnumerable types the query as an IEnumerable<T>, rather than an IQueryable<T>, statically, and as such ends up calling the LINQ to objects version of the query methods, pulling the entire table into memory and then performing all of the operations within the application.
When you're querying an IQueryable<T>, your method gets translated via an expression tree into SQL. Calling AsEnumerable() changes the compile-time type to IEnumerable<T>, so the rest of the query runs in LINQ to Objects: when it is enumerated, all the entities are brought from your database into memory, where you can query them via reflection.
Is it IEnumerable<T>? As far as I know, a reference always points to a class instance. What instance type does a LINQ query really point to?
You can find it out by calling .GetType() on your IEnumerable<T> variable and inspecting the type in the debugger.
For different LINQ providers and even different LINQ methods, such types may or may not be different.
What matters to your code is that they all implement IEnumerable<T> which you should work with, or IQueryable<T> which also accepts expressions, meaning your predicates and projections will become syntax trees and may be manipulated by a LINQ provider at runtime, e.g. to be translated into SQL.
Actual classes, if this is what you're asking about, may even be compiler-generated, e.g. a method that uses yield return is translated to such a class.
Either way, they are usually internal and you should never, ever depend on them.
Depending on your original data source, it is either IEnumerable or IQueryable:
The result of a LINQ database query is typically IQueryable<T>, which is derived from IEnumerable<T>, IQueryable, and IEnumerable.
If your database query includes an OrderBy clause, the type is IOrderedQueryable<T>, which is derived from IQueryable<T>.
If your data source is an IEnumerable, the result type is IEnumerable<T>
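The static types listed above, and the fact that the concrete runtime classes are implementation details, can be seen in a small sketch. It uses an in-memory array (with AsQueryable() standing in for a database query, which is an assumption); the runtime type names it returns vary between framework versions, so nothing should depend on them.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ResultTypes
{
    public static string[] RuntimeTypeNames()
    {
        int[] data = { 3, 1, 2 };

        // Static types, as the compiler sees them.
        IEnumerable<int> filtered = data.Where(n => n > 1);
        IOrderedEnumerable<int> sorted = data.OrderBy(n => n);
        IQueryable<int> queryable = data.AsQueryable().Where(n => n > 1);
        IOrderedQueryable<int> orderedQueryable = data.AsQueryable().OrderBy(n => n);

        // Runtime types: usually internal classes, for inspection only.
        return new[]
        {
            filtered.GetType().Name,
            sorted.GetType().Name,
            queryable.GetType().Name,
            orderedQueryable.GetType().Name
        };
    }
}
```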
I don't know but I guess it's an internal type. You don't have to think about the class.
In fact it could be different classes depending on the concrete query. The compiler could convert the LINQ expression to one or another implementation depending on the conditions/processing.
If I am not mistaken, IEnumerable<T>, where T depends on your query.
You can check here for Three parts of Query operation. You can see that the return type of a LINQ query is IEnumerable<int>.
What is the quickest way to find out which .NET Framework LINQ methods (e.g. the IEnumerable LINQ methods) are implemented using deferred execution and which are not?
While coding, I often wonder whether a given method will be executed right away. The only way to find out is to go to the MSDN documentation to make sure. Would there be any quicker way, any directory, any list somewhere on the web, any cheat sheet, any other trick up your sleeve that you can share? If yes, please do so. This will help many LINQ noobs (like me) to make fewer mistakes. The only other option is to check the documentation until one has used the methods enough to remember them (which is hard for me; I tend not to remember "anything" which is documented somewhere and can be looked up :D).
Generally methods that return a sequence use deferred execution:
IEnumerable<X> ---> Select ---> IEnumerable<Y>
and methods that return a single object don't:
IEnumerable<X> ---> First ---> Y
So, methods like Where, Select, Take, Skip, GroupBy and OrderBy use deferred execution because they can, while methods like First, Single, ToList and ToArray don't because they can't.
There are also two types of deferred execution. For example the Select method will only get one item at a time when it's asked to produce an item, while the OrderBy method will have to consume the entire source when asked to return the first item. So, if you chain an OrderBy after a Select, the execution will be deferred until you get the first item, but then the OrderBy will ask the Select for all the items.
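The difference between the two kinds of deferred execution can be observed by counting how many source items are pulled when only the first result is requested. This is a sketch: the `Numbers` source and the `onPull` callback are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingVsBuffering
{
    // Hypothetical source that reports each item it hands out.
    private static IEnumerable<int> Numbers(Action onPull)
    {
        foreach (int n in new[] { 3, 1, 2 })
        {
            onPull();
            yield return n;
        }
    }

    // Returns how many source items were pulled to produce the first result.
    public static int ItemsPulledForFirstResult(bool sorted)
    {
        int pulled = 0;
        IEnumerable<int> query = Numbers(() => pulled++).Select(n => n * 10);
        if (sorted)
        {
            // Still deferred, but must read everything before yielding anything.
            query = query.OrderBy(n => n);
        }

        using (IEnumerator<int> e = query.GetEnumerator())
        {
            e.MoveNext(); // ask for the first item only
        }
        return pulled;
    }
}
```

With Select alone, one item is pulled; with OrderBy chained after it, all three are pulled before the first result is available.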
The guidelines I use:
Always assume any API that returns IEnumerable<T> or IQueryable<T> can and probably will use deferred execution. If you're consuming such an API and need to iterate through the results more than once (e.g. to get a Count), then convert to a collection before doing so (usually by calling the .ToList() extension method).
If you're exposing an enumeration, always expose it as a collection (ICollection<T> or IList<T>) if that is what your clients will normally use. For example, a data access layer will often return a collection of domain objects. Only expose IEnumerable<T> if deferred execution is a reasonable option for the API you're exposing.
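The first guideline can be illustrated by counting how often a deferred source is executed. In this sketch, `ExpensiveSource` is a hypothetical stand-in for an API that returns a deferred sequence, and the `Executions` counter is added only for observation.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MaterializeOnce
{
    public static int Executions;

    // Hypothetical deferred API: the body runs once per enumeration.
    private static IEnumerable<int> ExpensiveSource()
    {
        Executions++;
        yield return 1;
        yield return 2;
    }

    public static int ExecutionsWhenDeferred()
    {
        Executions = 0;
        IEnumerable<int> query = ExpensiveSource().Where(n => n > 0);
        int count = query.Count(); // first enumeration
        int sum = query.Sum();     // second enumeration
        return Executions;
    }

    public static int ExecutionsWhenMaterialized()
    {
        Executions = 0;
        List<int> list = ExpensiveSource().Where(n => n > 0).ToList(); // single enumeration
        int count = list.Count; // reads the list, not the source
        int sum = list.Sum();   // reads the list, not the source
        return Executions;
    }
}
```

Against a database-backed IQueryable, each extra enumeration would typically mean another round trip, which is why converting to a collection first is the safer default.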
Actually, there's more; in addition you need to consider buffered vs non-buffered. OrderBy can be deferred, but when iterated must consume the entire stream.
In general, anything in LINQ that returns IEnumerable tends to be deferred - while Min etc (which return values) are not deferred. The buffering (vs not) can usually be reasoned, but frankly reflector is a pretty quick way of finding out for sure. But note that often this is an implementation detail anyway.
For actual "deferred execution", you want methods that work on an IQueryable. Method chains based on an IQueryable build an expression tree representing your query. Only when you enumerate the query or call a method that produces a concrete result (ToList() and similar, First(), etc.) is the tree evaluated by the LINQ provider (LINQ to Objects is built into the Framework, as are LINQ to SQL and now Entity Framework; other ORMs and persistence-layer frameworks also offer LINQ providers) and the actual result returned. Note that AsEnumerable() does not evaluate anything by itself; it only makes the operators that follow run in LINQ to Objects. Any IEnumerable in the framework can be converted to an IQueryable using the AsQueryable() extension method, and LINQ providers that translate the expression tree, like ORMs, will provide an IQueryable as a jump-off point for a LINQ query against their data.
Even against an IEnumerable, some of the LINQ methods are "lazy". Because the beauty of an IEnumerable is that you don't have to know about all of it, only the current element and whether there's another, LINQ methods that act on an IEnumerable often return an iterator class that spits out an object from its source whenever methods later in the chain ask for one. Any operation that doesn't require knowledge of the entire set can be lazily evaluated (Select and Where are two big ones; there are others). Ones that do require knowing the entire collection (sorting via OrderBy, grouping with GroupBy, and aggregates like Min and Max) must consume their entire source before producing a result; OrderBy and GroupBy buffer it internally, forcing evaluation of all elements through all earlier nodes. Generally, you want these to come late in a method chain if you can help it.
Here's a summary of different ways you can know if your query will be deferred or not:
If you're using query expression syntax instead of query method syntax, then it will be deferred.
If you're using query method syntax, it MIGHT be deferred depending on what it returns.
Hover over the var keyword (if that's what you're using as the type for the variable used to store the query). If it says IEnumerable<T>, then it'll be deferred.
Try to iterate over the query using a foreach. If you get an error saying it cannot iterate over your variable because it does not support GetEnumerator(), you know the query is not deferred.
Source: Essential Linq
If you convert the collection to an IQueryable using .AsQueryable(), your LINQ calls will use deferred execution.
See here: Using IQueryable with Linq