With entity framework, we can do :
MyContext context = ... // a normal EF context
var products = context.Products.Where(p => p.Location == "France") ;
or
var products = context.Products.Where(p => p.CategoryId == 54) ;
Which are both translate in their equivalent SQL query.
OK, but somewhere over there, there's a piece of code that handle this :
public static IEnumerable<T> Where(Func<bool, T> func) {
......
}
From that Where function, how do LINQ to SQL know what's the implementation of func ?
Obvious answer maybe, but I can't really find it.
You should really do a Go To Definition on your code. The function used for LINQ-to-SQL and for Entity Framework is
IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
contained in the System.Linq.Queryable that uses expression trees, that are a constuct able to "describe" pieces of code. A little quote:
Expression trees represent code in a tree-like data structure, where each node is an expression, for example, a method call or a binary operation such as x < y.
For example, your first expression
var products = context.Products.Where(p => p.Location == "France");
is converted by the C# compiler to this code:
ParameterExpression par = Expression.Parameter(typeof(Product), "p");
LambdaExpression lambda = Expression.Lambda(
Expression.Equal(
Expression.Property(par, "Location"),
Expression.Constant("France")),
par);
var products = context.Products.Where(lambda);
Now... while the creation of an expression tree is quite simple, the reverse operation (de-building an expression tree and creating a query) is something VERY VERY complex. Big big headache complex. Nearly magic-level complex :-)
The problem isn't de-building the expression tree. That is easy. You use an ExpressionVisitor and you are done. The problem is merging the various layers of LINQ query and comprehending what the programmer wanted to obtain.
I'll add that IL (the "assembly" of .NET) is high level enough that it is possible to decompile it (see for example IlSpy). There is at least a library, DelegateDecompiler that is able to decompile a delegate to an Expression Tree, so even without Expression Trees, LINQ-to-SQL and EF could have used something similar and decompiled methods directly to SQL language.
Related
I've read this answer and understood from it the specific case it highlights, which is when you have a lambda inside another lambda and you don't want to accidentally have the inner lambda also compile with the outer one. When the outer one is compiled, you want the inner lambda expression to remain an expression tree. There, yes, it makes sense quoting the inner lambda expression.
But that's about it, I believe. Is there any other use case for quoting a lambda expression?
And if there isn't, why do all the LINQ operators, i.e. the extensions on IQueryable<T> that are declared in the Queryable class quote the predicates or lambdas they receive as arguments when they package that information in the MethodCallExpression.
I tried an example (and a few others over the last couple of days) and it doesn't seem to make any sense to quote a lambda in this case.
Here's a method call expression to a method that expects a lambda expression (and not a delegate instance) as its only parameter.
I then compile the MethodCallExpression by wrapping it inside a lambda.
But that doesn't compile the inner LambdaExpression (the argument to the GimmeExpression method) as well. It leaves the inner lambda expression as an expression tree and does not make a delegate instance of it.
In fact, it works well without quoting it.
And if I do quote the argument, it breaks and gives me an error indicating that I am passing in the wrong type of argument to the GimmeExpression method.
What's the deal? What's this quoting all about?
private static void TestMethodCallCompilation()
{
var methodInfo = typeof(Program).GetMethod("GimmeExpression",
BindingFlags.NonPublic | BindingFlags.Static);
var lambdaExpression = Expression.Lambda<Func<bool>>(Expression.Constant(true));
var methodCallExpression = Expression.Call(null, methodInfo, lambdaExpression);
var wrapperLambda = Expression.Lambda(methodCallExpression);
wrapperLambda.Compile().DynamicInvoke();
}
private static void GimmeExpression(Expression<Func<bool>> exp)
{
Console.WriteLine(exp.GetType());
Console.WriteLine("Compiling and executing expression...");
Console.WriteLine(exp.Compile().Invoke());
}
You have to pass the argument as a ConstantExpression:
private static void TestMethodCallCompilation()
{
var methodInfo = typeof(Program).GetMethod("GimmeExpression",
BindingFlags.NonPublic | BindingFlags.Static);
var lambdaExpression = Expression.Lambda<Func<bool>>(Expression.Constant(true));
var methodCallExpression =
Expression.Call(null, methodInfo, Expression.Constant(lambdaExpression));
var wrapperLambda = Expression.Lambda(methodCallExpression);
wrapperLambda.Compile().DynamicInvoke();
}
private static void GimmeExpression(Expression<Func<bool>> exp)
{
Console.WriteLine(exp.GetType());
Console.WriteLine("Compiling and executing expression...");
Console.WriteLine(exp.Compile().Invoke());
}
The reason should be pretty obvious - you're passing a constant value, so it has to be a ConstantExpression. By passing the expression directly, you're explicitly saying "and get the value of exp from this complicated expression tree". And since that expression tree doesn't actually return a value of Expression<Func<bool>>, you get an error.
The way IQueryable works doesn't really have much to do with this. The extension methods on IQueryable have to preserve all information about the expressions - including the types and references of the ParameterExpressions and similar. This is because they don't actually do anything - they just build the expression tree. The real work happens when you call queryable.Provider.Execute(expression). Basically, this is how the polymorphism is preserved even though we're doing composition, rather than inheritance (/interface implementation). But it does mean that the IQueryable extension methods themselves cannot do any shortcuts - they don't know anything about the way the IQueryProvider is actually going to interpret the query, so they can't throw anything away.
The most important benefit you get from this, though, is that you can compose the queries and subqueries. Consider a query like this:
from item in dataSource
where item.SomeRelatedItem.Where(subItem => subItem.SomeValue == 42).Count() > 2
select item;
Now, this is translated to something like this:
dataSource.Where(item => item.SomeRelatedItem.Where(subItem => subItem.SomeValue == 42).Count() > 2);
The outer query is pretty obvious - we'll get a Where with the given predicate. The inner query, however, is actually going to be a Call to Where, taking the actual predicate as an argument.
By making sure that actual invocations of the Where method are actually translated into a Call of the Where method, both of these cases become the same, and your LINQProvider is that one bit simpler :)
I've actually written LINQ providers that don't implement IQueryable, and which actually have some useful logic in the methods like Where. It's a lot simpler and more efficient, but has the drawback described above - the only way to handle subqueries would be to manually Invoke the Call expressions to get the "real" predicate expression. Yikes - that's quite an overhead for a simple LINQ query!
And of course, it helps you compose different queryable providers, although I haven't actually seen (m)any examples of using two completely different providers in a single query.
As for the difference between Expression.Constant and Expression.Quote themselves, they seem rather similar. The crucial difference is that Expression.Constant will treat any closures as actual constants, rather than closures. Expression.Quote on the other hand, will preserve the "closure-ness" of the closures. Why? Because the closure objects themselves are also passed as Expression.Constant :) And since IQueryable trees are doing lambdas of lambdas of lambdas of [...], you really don't want to lose the closure semantics at any point.
A lambda expression is an anonymous method, which under the covers is a delegate so I can do something like this:
delegate bool Foo(int x);
Foo bar = x => x == 1;
Passing this delegate to an Enumerable extension method makes perfect sense, as the typical expected argument is a Func, which is shorthand for a delegate:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate);
However, I am unclear about how it is possible to pass in the delegate to a Queryable extension method like this one:
public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);
This method expects an Expression<TDelegate> argument, but it is perfectly legal to pass in a lambda expression. What is the mechanism that coerces the lambda expression into Expression<TDelegate> so that it may be consumed?
I am familiar with the fact that Queryable methods build out expression trees for parsing by providers, I'm just curious about this one aspect that isn't immediately obvious to me.
UPDATE
I'm becoming less ignorant about my ignorance. Lambda expressions aren't delegates, but can be used to create either delegates or expressions:
Expression<Func<int, bool>> foo = c => c == 1;
Does the compiler infer the type based on the context? I'm guessing that must be the case, as this isn't legal:
var foo = c => c == 1;
This is described in the specification:
4.6 Expression Tree Types
If a conversion exists from a lambda expression to a delegate type D,
a conversion also exists to the expression tree type Expression<D>.
Whereas the conversion of a lambda expression to a delegate type
generates a delegate that references executable code for the lambda
expression, conversion to an expression tree type creates an
expression tree representation of the lambda expression. Expression
trees are efficient in-memory data representations of lambda
expressionsand make the structure of the lambda expressiontransparent
and explicit
So there is a conversion from a lambda to a compatible expression tree type, and the compiler emits the equivalent expression tree instead of creating a delegate.
Quite simply you can't.
However, to make IQueryable methods useful, VS2008 and above include a clever compiler trick. That a lambda expression that is a single statement may be assignable to both a delegate, and an Expression<TDelegate>. The compiler normally will hoist the expression and make a method.
But for an assignment to an Expression<TDelegate> it breaks down the statements into their syntactic meaning and turns that into an expression tree.
e.g.
Func<int,int> func = x=>x*x;
Expression<Func<int,int>> expression = x=>x*x;
The first one will probably be turned into a static method with a garbled name something like::
private static int <B012>SomeMethod(int x){
return x*x;
}
Where as the second statement will be transformed into something like::
ParameterExpression paramX = Expression.Parameter(typeof(int));
Expression<Func<int,int>> expression = Expression.Lambda<Func<int,int>>(
Expression.Multiply(paramX,paramX),paramX);
But you can not do::
expression = func;
This is not valid, as func is a delegate. You can do this though::
func=expression.Compile()
Which compiles the expressions into a func.
**Note the suggested transformations may not be a 100% correct.
The reason they did this was to allow LINQ-to-Objects (Basically Map/Reduce from other language) to share the same friendly syntax as LINQ-To-Providers. So you can write a statement that means the same thing but can change where the filtering and transformation happens.
GetEmployees().Where(e=>e.LastName=="Smith")
Can read the same, but could in theory be describing doing the filtering on this box, or an the database, or parsing an xml file or any number of various things.
I believe this has to do with how the query is built on an IQueryable. The method requires an expression tree because it can be looked inside and structured to match a more optimized (potentially) query or map more closely to the underlying data source. So simply passing in a Func would allow only for execution, whereas an Expression<Func> allows for expression trees which can be observed and executed.
Also, probably more closely answering your exact question, check this SO post out.
I found the following question that can combine multiple Expression<Func<T,bool>> expressions:
How to merge two C# Lambda Expressions without an invoke?
I'm wondering whether, using a similar technique, how you go about merging a .OrderBy Expression<Func<Entity, Key>> with a .Where Expression<Func<Entity, bool>> into one Expression of type, or inheriting from type, System.Linq.Expressions.Expression.
I'm making a really cut down QueryProvider-style class for taking T => T == ... Func's via public methods .Where and .OrderBy. This is with the intention that the expression this class builds gets passed to a QueryTranslator to create suitable SQL.
A QueryTranslator of this style, when called from a QueryProvider, typically takes one System.Linq.Expressions.Expression as an argument which it then translates to SQL.
I can follow through and understand the question above for merging two .Where Funcs. The following blog post was particularly useful for this, and I traced through each of the examples:
http://blogs.msdn.com/b/meek/archive/2008/05/02/linq-to-entities-combining-predicates.aspx
This CodeProject article was also useful:
http://www.codeproject.com/Articles/24255/Exploring-Lambda-Expression-in-C
When combining a .OrderBy Func with a .Where Func however the two have different generic arguments. In the case of the .Where it's a Func<Entity, bool>, and in the case of the .OrderBy it's a Func<Entity, Key>.
Merging these two Expressions, on the surface, isn't as straight forward as merging two with the same generic arguments.
The in-built Func-Expression-producing-engine (unsure of exact term) is able to combine a .Where Func with a .OrderBy Func. I'm curious about what goes on under the hood when these two expressions are merged to become one System.Linq.Expressions.Expression. Is it possible, and if so how would you go about combining an Expression<Func<Entity, bool>> with an Expression<Func<Entity,Key>> assuming the Entity is of the same type in each Expression?
public IQueryable<T> ApplyExpressions<T, TKey>(
Expression<Func<T, TKey>> sortBy,
Expression<Func<T, bool>> filterBy,
IQueryable<T> source)
{
return source.Where(filterBy).OrderBy(sortBy);
}
IQueryable<Customer> query = ApplyExpressions<Customer, int>(
c => c.Age,
c => c.Name.StartsWith("B"),
myDC.Customers)
Expression queryExpression = query.Expression; //tada
I think where this question got ambiguous is because it's kind of unanswerable.
I was trying to build one super-expression, like how the IQueryable<> expression builder works with .Where and .OrderBy expressions all in one expression.
Without a subject, however, my limited understanding of Expressions and Expression Trees seems to suggest it's not possible.
To combine a .Where expression with a .OrderBy expression into one Expression you need MethodCallExpression's to separate the two LambdaExpressions.
The MethodCallExpression needs a subject to call the Method on. This is where the IQueryable<> gets used.
I've altered my Query class to keep all the LambdaExpression's separate, and these then get passed to the QueryTranslator separately. The QT merges the output SQL into the right places.
This is as opposed to an IQueryable<> QueryProvider that analyses one super-Expression and outputs SQL into the right places based on MethodCallExpression's in the Expression.
Is it possible to use a dynamic Linq Expression inside a Query Expression?
Something like this:
from obj1 in ObjectSet1
let res = ObjectSet2.Where(* SomeExpression *)
where ...
select ...
I'm trying to build Expression.Lambda<Func<TSource, bool>> Expression as for SomeExpression.
Is it possible to to use dynamic Linq Expression within the Expression Query or do I need to build the whole Expression tree from scratch?
How can I, if any, use obj1 when I'm building SomeExpression?
Note: I'm using Entity Framework, I can't use SomeExpression.Compile() within the expression tree.
It's not only possible, it's normal. Large expression trees can be generated from smaller trees (this is what LINQ does). You only have to pass in an Expression<Func<TSource, bool>> to Where().
You can see how in my reply to this other thread here - replacing operator in Where clause Lambda with a parameter . The interesting part is in the MakeWhereLambda method.
If you're wanting it to be completely dynamic, one possible option would be to use Expression Tree Serialization:
http://expressiontree.codeplex.com/
I started to use this a while back but that task got reprioritized, so I'm not sure how suitable it is for any given task. That said, it looks pretty good...
Hope this helps.
Nate
When you are building the expression tree at runtime there's no code
emitted. It's a way to represent .NET code at runtime...
Ok...
Now lets say I have this code :
ParameterExpression p = Expression.Parameter (typeof (string), "s");
MemberExpression stringLength = Expression.Property (p, "Length");
ConstantExpression five = Expression.Constant (5);
BinaryExpression comparison = Expression.LessThan (stringLength, five);
Expression<Func<string, bool>> lambda= Expression.Lambda<Func<string, bool>> (comparison, p);
Func<string, bool> runnable = lambda.Compile();
This code Wont be in IL ? of course it will be ! ( maybe the last line wont emit code until compile ...but the first lines I think will emit code !)
So what am i saving here ?
Ok so the first 5 lines did emit code and the last one didn't... big deal.
What am i missing ? Can you please let me see the whole picture ?
With an Expression Tree, you build a description of some code instead of the code itself.
Expression Trees should not be used in the context of writing regular code that 'shouldn't be compiled at compile time'. They should be used in more dynamic scenarios.
The expression tree you show will compile to: s.Length < 5 and you invoke the runnable with bool isStringSmallerThan5 = runnable("MyString").
The whole idea of Expression Trees is that they describe some code and can be compiled at runtime. This means that you can do the following:
BinaryExpression comparison = null;
if (lessThen)
{
comparison = Expression.LessThan(stringLength, five);
}
else
{
comparison = Expression.GreaterThan(stringLength, five);
}
Now you can change the behavior of your code at runtime!
The biggest use of Expression Trees is that they can be interpreted by a provider. For example Linq To Entities uses Expression Trees and compiles them to SQL code that can be run against the database. LinqToXml is another example of what you can do with Expression Trees.
This a nice blog post to get you started.
Expression trees are useful when you receive them in a method since they enable you to make more complex use of the expression content. If you receive a predicate in a method, you can run it against a target and check the result. If you receive the expression tree representing an expression tree, you can parse it and do something useful with it. An example is LINQ which utilizes this many places, but among one in the "Where"-methods. Catching the expression tree rather than the IL makes it relatively straight forward to translate into SQL rather than just do a full 'Select' and run predicates against the materialized result.