Compiled Expression Trees misunderstanding?


I have this expression :
Expression<Func<string, bool>> f = s => s.Length < 5;
ParameterExpression p = Expression.Parameter (typeof (string), "s");
MemberExpression stringLength = Expression.Property (p, "Length");
ConstantExpression five = Expression.Constant (5);
BinaryExpression comparison = Expression.LessThan (stringLength, five);
Expression<Func<string, bool>> lambda= Expression.Lambda<Func<string, bool>> (comparison, p);
// let's test it
Func<string, bool> runnable = lambda.Compile();
Console.WriteLine (runnable ("kangaroo")); // False
Console.WriteLine (runnable ("dog")); //True
I want to ask about .Compile().
What does it compile? And what is the difference between the first execution and later executions?
Compilation should be something that happens once and does not happen again later.
What does it do for me, and how does it help?

When you are building the expression tree at runtime there's no code emitted. It's a way to represent .NET code at runtime.
Once you call the .Compile method on the expression tree the actual IL code is emitted to convert this expression tree into a delegate (Func<string, bool> in your case) that you could invoke at runtime. So the code that this expression tree represents can be executed only after you compile it.
Calling Compile is an expensive operation. Basically you should be calling it once and then caching the resulting delegate that you could use to invoke the code many times.
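As a rough illustration of the "compile once, cache the delegate" advice, here is a minimal sketch. The helper name GetShorterThan, the dictionary cache and the compileCount counter are my own assumptions for the demo, not part of any library; the point is only that the second lookup reuses the delegate instead of calling Compile() again.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Cache of compiled delegates, keyed by the length limit baked into each tree.
var cache = new ConcurrentDictionary<int, Func<string, bool>>();
var compileCount = 0;

// Hypothetical helper: returns a delegate equivalent to s => s.Length < limit.
Func<string, bool> GetShorterThan(int limit) =>
    cache.GetOrAdd(limit, key =>
    {
        compileCount++; // counts how often the expensive step runs (single-threaded demo)
        var s = Expression.Parameter(typeof(string), "s");
        var body = Expression.LessThan(Expression.Property(s, "Length"), Expression.Constant(key));
        return Expression.Lambda<Func<string, bool>>(body, s).Compile();
    });

var first = GetShorterThan(5);
var second = GetShorterThan(5); // served from the cache, no second Compile()

Console.WriteLine(first("kangaroo"));              // False
Console.WriteLine(first("dog"));                   // True
Console.WriteLine(ReferenceEquals(first, second)); // True
Console.WriteLine(compileCount);                   // 1
```

Under concurrent access GetOrAdd may run the factory more than once for the same key, so a real cache might wrap the value in Lazy<T>; for a sketch the simple version is enough.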

The Expression<Func<string,bool>> is only a representation of an expression; it cannot be executed. Calling Compile() gives you a compiled delegate, a piece of code that you can call. Essentially, your program composes a small code snippet at runtime, and then calls it as if it had been processed by the compiler. This is what the last two lines of your code do: as you can see, the compiled snippet can analyze the length of the string that you pass in. When the length is less than five, you get True back; when it's five or more, you get False.
What happens on first execution of the compiled snippet is platform-dependent, and should not be detectable by programmers using the .NET platform.

Compile() takes the expression tree (which is a data representation of some logic) and converts it to IL which can then be executed directly as a delegate.
The only difference between the first execution and later executions is that Compile() may not trigger the JIT compilation from IL to native processor code; that may instead happen on the first execution.

Related

Evaluator.PartialEval reduce provided expression

In one of my projects I have an ExpressionVisitor that translates a provided expression into a query string. But before translating it I need to evaluate all references in the expression to real values. To do that I use the Evaluator.PartialEval method from the EntityFramework project.
Assuming I have this query:
var page = 100;
var query = myService.AsQueryable<Product>()
    //.Where(x=>x.ProductId.StartsWith(p.ProductId))
    .Skip(page)
    .Take(page);
var evaluatedQueryExpr = Evaluator.PartialEval(query.Expression);
As you can see, I have commented out the Where method. In this case evaluatedQueryExpr will not contain the Take and Skip methods.
However, if I use any other method with an Expression before Take or Skip, everything works: Evaluator evaluates the expression correctly and returns it in full.
I found out that the problem occurs on line 80 of the Evaluator class:
return Expression.Constant(fn.DynamicInvoke(null), e.Type);
Could you explain why this happens and suggest a workaround?
Update
here is a project on github
LinqToSolrQueriable inherited from IOrderedQueryable
LinqToSolrProvider inherited from IQueryProvider including line range causing the issue

The good news is that the expression is not really reduced (Skip and Take are still there :), but is simply converted from a MethodCallExpression to a ConstantExpression containing the original expression:
query.Expression:
.Call System.Linq.Queryable.Take(
    .Call System.Linq.Queryable.Skip(
        .Constant<LinqToSolr.Query.LinqToSolrQueriable`1[LinqToSolrTest.Product]>(LinqToSolr.Query.LinqToSolrQueriable`1[LinqToSolrTest.Product]),
        100),
    100)
evaluatedQueryExpr:
.Constant<System.Linq.IQueryable`1[LinqToSolrTest.Product]>(LinqToSolr.Query.LinqToSolrQueriable`1[LinqToSolrTest.Product])
Here the debug display gives you the wrong impression. If you take ConstantExpression.Value, you'll see that it's an IQueryable<Product> whose Expression property is exactly the same as the original query.Expression.
The bad news is that this is not what you expect from PartialEval: in fact it doesn't do anything useful in this case (except potentially breaking your query translation logic).
So why is this happening?
The method you are using from EntityFramework.Extended library is in turn taken (as indicated in the comments) from MSDN Sample Walkthrough: Creating an IQueryable LINQ Provider. It can be noticed that the PartialEval method has two overloads - one with Func<Expression, bool> fnCanBeEvaluated parameter used to identify whether a given expression node can be part of the local function (in other words, to be partially evaluated or not), and one without such parameter (used by you) which simply calls the first passing the following predicate:
private static bool CanBeEvaluatedLocally(Expression expression)
{
    return expression.NodeType != ExpressionType.Parameter;
}
The effect is that it stops evaluation of ParameterExpression nodes and of any expression that directly or indirectly contains a ParameterExpression. The latter explains the behavior you are observing. When the query contains Where (or basically any LINQ operator) with a parameterized lambda expression (hence a parameter) before the Skip / Take calls, it stops evaluation of the containing method calls (which you can see from the above query.Expression debug view: the Where call will be inside the Skip).
Now, this overload is used by the MSDN example to evaluate a concrete nested Where method lambda expression and is not generally applicable to any type of expression, such as IQueryable.Expression. In fact the linked project uses the PartialEval method in a single place inside the QueryCache class, and it calls the other overload, passing a different predicate which, in addition to ParameterExpressions, stops the evaluation of any expression whose result type is IQueryable.
I think that is the solution to your problem as well:
var evaluatedQueryExpr = Evaluator.PartialEval(query.Expression,
    // can't evaluate parameters or queries
    e => e.NodeType != ExpressionType.Parameter &&
         !typeof(IQueryable).IsAssignableFrom(e.Type)
);
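To see why a Skip/Take-only query collapses while one with Where does not, here is a small self-contained sketch that does not use the Evaluator class at all. ContainsParameter is a hypothetical helper of mine that only handles the node kinds occurring in these two queries, and an in-memory Enumerable.Range source stands in for the real provider.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Simplified check covering only the node kinds that occur in the two queries below.
bool ContainsParameter(Expression e) => e switch
{
    ParameterExpression => true,
    LambdaExpression l => l.Parameters.Count > 0 || ContainsParameter(l.Body),
    UnaryExpression u => ContainsParameter(u.Operand), // e.g. the Quote around a predicate
    BinaryExpression b => ContainsParameter(b.Left) || ContainsParameter(b.Right),
    MethodCallExpression c => (c.Object is not null && ContainsParameter(c.Object))
                              || c.Arguments.Any(ContainsParameter),
    _ => false, // constants and anything else
};

var source = Enumerable.Range(0, 1000).AsQueryable();
var paged = source.Skip(100).Take(100);
var filtered = source.Where(n => n % 2 == 0).Skip(100).Take(100);

// No parameter anywhere: the default predicate lets the whole tree be "evaluated locally".
Console.WriteLine(ContainsParameter(paged.Expression));    // False
// The Where lambda introduces a parameter, which blocks evaluation of the enclosing calls.
Console.WriteLine(ContainsParameter(filtered.Expression)); // True

// The stricter predicate from the answer refuses the root node because it yields an IQueryable.
Func<Expression, bool> canEvaluate = e =>
    e.NodeType != ExpressionType.Parameter &&
    !typeof(IQueryable).IsAssignableFrom(e.Type);
Console.WriteLine(canEvaluate(paged.Expression));          // False
```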

Why would you quote a LambdaExpression?

I've read this answer and understood from it the specific case it highlights, which is when you have a lambda inside another lambda and you don't want to accidentally have the inner lambda also compile with the outer one. When the outer one is compiled, you want the inner lambda expression to remain an expression tree. There, yes, it makes sense quoting the inner lambda expression.
But that's about it, I believe. Is there any other use case for quoting a lambda expression?
And if there isn't, why do all the LINQ operators, i.e. the extensions on IQueryable<T> that are declared in the Queryable class quote the predicates or lambdas they receive as arguments when they package that information in the MethodCallExpression.
I tried an example (and a few others over the last couple of days) and it doesn't seem to make any sense to quote a lambda in this case.
Here's a method call expression to a method that expects a lambda expression (and not a delegate instance) as its only parameter.
I then compile the MethodCallExpression by wrapping it inside a lambda.
But that doesn't compile the inner LambdaExpression (the argument to the GimmeExpression method) as well. It leaves the inner lambda expression as an expression tree and does not make a delegate instance of it.
In fact, it works well without quoting it.
And if I do quote the argument, it breaks and gives me an error indicating that I am passing in the wrong type of argument to the GimmeExpression method.
What's the deal? What's this quoting all about?
private static void TestMethodCallCompilation()
{
    var methodInfo = typeof(Program).GetMethod("GimmeExpression",
        BindingFlags.NonPublic | BindingFlags.Static);
    var lambdaExpression = Expression.Lambda<Func<bool>>(Expression.Constant(true));
    var methodCallExpression = Expression.Call(null, methodInfo, lambdaExpression);
    var wrapperLambda = Expression.Lambda(methodCallExpression);
    wrapperLambda.Compile().DynamicInvoke();
}
private static void GimmeExpression(Expression<Func<bool>> exp)
{
    Console.WriteLine(exp.GetType());
    Console.WriteLine("Compiling and executing expression...");
    Console.WriteLine(exp.Compile().Invoke());
}

You have to pass the argument as a ConstantExpression:
private static void TestMethodCallCompilation()
{
    var methodInfo = typeof(Program).GetMethod("GimmeExpression",
        BindingFlags.NonPublic | BindingFlags.Static);
    var lambdaExpression = Expression.Lambda<Func<bool>>(Expression.Constant(true));
    var methodCallExpression =
        Expression.Call(null, methodInfo, Expression.Constant(lambdaExpression));
    var wrapperLambda = Expression.Lambda(methodCallExpression);
    wrapperLambda.Compile().DynamicInvoke();
}
private static void GimmeExpression(Expression<Func<bool>> exp)
{
    Console.WriteLine(exp.GetType());
    Console.WriteLine("Compiling and executing expression...");
    Console.WriteLine(exp.Compile().Invoke());
}
The reason should be pretty obvious - you're passing a constant value, so it has to be a ConstantExpression. By passing the expression directly, you're explicitly saying "and get the value of exp from this complicated expression tree". And since that expression tree doesn't actually return a value of Expression<Func<bool>>, you get an error.
The way IQueryable works doesn't really have much to do with this. The extension methods on IQueryable have to preserve all information about the expressions - including the types and references of the ParameterExpressions and similar. This is because they don't actually do anything - they just build the expression tree. The real work happens when you call queryable.Provider.Execute(expression). Basically, this is how the polymorphism is preserved even though we're doing composition, rather than inheritance (/interface implementation). But it does mean that the IQueryable extension methods themselves cannot do any shortcuts - they don't know anything about the way the IQueryProvider is actually going to interpret the query, so they can't throw anything away.
The most important benefit you get from this, though, is that you can compose the queries and subqueries. Consider a query like this:
from item in dataSource
where item.SomeRelatedItem.Where(subItem => subItem.SomeValue == 42).Count() > 2
select item;
Now, this is translated to something like this:
dataSource.Where(item => item.SomeRelatedItem.Where(subItem => subItem.SomeValue == 42).Count() > 2);
The outer query is pretty obvious - we'll get a Where with the given predicate. The inner query, however, is actually going to be a Call to Where, taking the actual predicate as an argument.
By making sure that actual invocations of the Where method are actually translated into a Call of the Where method, both of these cases become the same, and your LINQProvider is that one bit simpler :)
I've actually written LINQ providers that don't implement IQueryable, and which actually have some useful logic in the methods like Where. It's a lot simpler and more efficient, but has the drawback described above - the only way to handle subqueries would be to manually Invoke the Call expressions to get the "real" predicate expression. Yikes - that's quite an overhead for a simple LINQ query!
And of course, it helps you compose different queryable providers, although I haven't actually seen (m)any examples of using two completely different providers in a single query.
As for the difference between Expression.Constant and Expression.Quote themselves, they seem rather similar. The crucial difference is that Expression.Constant will treat any closures as actual constants, rather than closures. Expression.Quote on the other hand, will preserve the "closure-ness" of the closures. Why? Because the closure objects themselves are also passed as Expression.Constant :) And since IQueryable trees are doing lambdas of lambdas of lambdas of [...], you really don't want to lose the closure semantics at any point.
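The difference in closure handling can be seen in a small experiment. This is only a sketch under my own assumptions: the inner lambda () => x deliberately refers to the parameter of the outer lambda, and the variable names are made up for the demo. The first part also checks that Queryable.Where really hands its predicate over as a Quote node.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// 1. Queryable.Where packages its predicate as a Quote node in the MethodCallExpression.
var query = new[] { 1, 2, 3 }.AsQueryable().Where(n => n > 1);
var whereCall = (MethodCallExpression)query.Expression;
var predicateArg = whereCall.Arguments[1];
Console.WriteLine(predicateArg.NodeType); // Quote

// 2. An inner lambda that refers to the parameter of the outer lambda: x => (() => x)
var x = Expression.Parameter(typeof(int), "x");
var inner = Expression.Lambda<Func<int>>(x);

var viaQuote = Expression.Lambda<Func<int, Expression<Func<int>>>>(
    Expression.Quote(inner), x).Compile();
var viaConstant = Expression.Lambda<Func<int, Expression<Func<int>>>>(
    Expression.Constant(inner, typeof(Expression<Func<int>>)), x).Compile();

// Quote: the returned tree has x bound to the value passed to the outer lambda.
var quotedValue = viaQuote(42).Compile()();
Console.WriteLine(quotedValue); // 42

// Constant: the tree is returned untouched, so x is still an unbound parameter.
var constantFailed = false;
try
{
    viaConstant(42).Compile();
}
catch (InvalidOperationException)
{
    constantFailed = true;
}
Console.WriteLine(constantFailed); // True
```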

Are C# Lambda Expressions Type Safe and when (compile time/runtime) are they checked?

I'm working on LINQ to XML queries and have used anonymous functions as well as lambda expressions. A quick example would be the select method over IEnumerables.
I understand that LINQ queries use deferred execution, which is somewhat similar to the concept of lazy evaluation, but this question came to mind when VS2012's Quick Watch could not handle statements with lambda expressions.
Are Lambda Expressions type-safe in C#?
I couldn't find a direct answer to this, or maybe it's because I do not fully understand type safety. I know OCaml and Java are type safe and Python is weakly typed. Another way to think of this: if the language is type safe, then lambda expressions within that language are nothing special. There is ambiguity in strong/weak typing, but here I mean whether lambda expressions with erroneous types will pass through the compiler and be allowed to execute at runtime. If errors exist that throw exceptions, are they only caught at run time?
When are they checked? Compile time or run time?
As an example, OCaml types are checked at compile time, and the program will not execute until the types are resolved. Python, on the other hand, is less strict and is a dynamic language: it will compile and execute even with a type error, only catching the error at run time. How does C# handle lambda expressions in this sense?
Some related research I've done before asking this question:
How are Java lambdas compiled
This blog posts says LINQ is type-safe
Tutorial on using lambda expressions from CodeProject
Difference between C# Anonymous functions and Lambda Expressions

C# has two kinds of lambda expression:
A lambda expression is an anonymous function that you can use to create delegates or expression tree types.
The first kind of lambda expression is syntactic sugar for an anonymous function:
Func<int, int> myFunc = x => x + 1;
is totally equivalent to:
Func<int, int> myFunc = delegate(int x) { return x + 1; };
so it is clearly type safe, because it is C# code with a different makeup.
The other type of Lambda Expression is the one that generates Expression Trees:
Expression<Func<int, int>> myFunc = x => x + 1;
This is something different. This isn't compiled to "code" but is compiled to some object of type Expression that "describe" the x => x + 1 (and even describe the type of delegate)... it is compiled to:
ParameterExpression par = Expression.Parameter(typeof(int), "x");
Expression<Func<int, int>> myFunc2 = Expression.Lambda<Func<int, int>>(
Expression.Add(par, Expression.Constant(1)),
par);
Now, this code can't be executed directly. It can be converted to executable code through the .Compile() method. In general a .Compile()d expression tree is type safe, but expression trees aren't normally made simply to be compiled. Programs tend to manipulate them to obtain interesting results. They can be used for various tasks: for example, to extract the name of a property or method without putting a string with that name in the code, or to be converted to other languages (Entity Framework and LINQ to SQL convert expression trees to SQL). An expression tree is quite safe (it is possible to manually build an invalid expression at runtime, but you'll get an exception when you construct or .Compile() it, and expression trees accepted by the C# compiler are normally safe to compile), but if the expression tree is used for other things, then errors can occur, even errors connected to type safety.
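Two of the points above can be tried out directly. This is a minimal sketch: NameOf is a hypothetical helper written for this example (not a framework method), and the second half shows that, at least for mismatched operand types, the factory method already rejects the tree before Compile() is ever reached.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: reads the member name out of a simple accessor such as s => s.Length.
string NameOf<T, TProp>(Expression<Func<T, TProp>> accessor) =>
    accessor.Body is MemberExpression member
        ? member.Member.Name
        : throw new ArgumentException("Expected a simple member access.", nameof(accessor));

var propertyName = NameOf((string s) => s.Length);
Console.WriteLine(propertyName); // Length

// Manually building an ill-typed tree: int + string has no Add operator.
var rejected = false;
try
{
    Expression.Add(Expression.Constant(1), Expression.Constant("one"));
}
catch (InvalidOperationException)
{
    rejected = true;
}
Console.WriteLine(rejected); // True
```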
I'll quote from: Why the anonymous type instance cannot accept null values returned by the entity framework query?
var l = (from s in db.Samples
         let action = db.Actions.Where(x => s.SampleID == x.SampleID && x.ActionTypeID == 1).FirstOrDefault()
         where s.SampleID == sampleID
         select new
         {
             SampleID = s.SampleID,
             SampleDate = action.ActionDate,
         }).ToList();
Equivalent more or less to
var l = db.Samples.Select(s => new
{
    s = s,
    action = db.Actions.Where(x => s.SampleID == x.SampleID && x.ActionTypeID == 1).FirstOrDefault()
}).Where(x => x.s.SampleID == sampleID).Select(x => new
{
    SampleID = x.s.SampleID,
    SampleDate = x.action.ActionDate
}).ToList();
Here ActionDate is DateTime, and so is SampleDate. This LINQ expression will be transformed by the compiler to a big Lambda Expression, and executed by Entity Framework SQL Server-side. Now... the problem is that action could become null, and so action.ActionDate could be null (because the expression won't be executed locally there won't be a NullReferenceException), and an exception could be thrown (will be thrown) when null is put in SampleDate (an InvalidCastException I think). So while the expression is type-safe, what the library does with it causes the expression to generate non-type-safe code (an invalid cast)

Lambdas have exactly as much static type checking as any other C# code does. It's built on the same type system, and enforces all of the same compile time type checks. You can of course turn off static type checks (by, say, performing a cast) in a lambda in just the same way that you can in any other C# code.
If a lambda is compiled into executable code (instead of, say, an Expression) and is run, the exact same runtime checks will be performed as if you weren't using a lambda.
In fact, if you're using lambdas compiled into executable code, it will simply be transformed into a new named method, even though it is anonymous in your original code, in one of the earlier passes of the compiler. Once transformed into a regular named method, it then goes through all of the same type checking any other code would.

Imagine that you could write a class that is something like this:
public class Foo {
    public Baz DoSomething(Bar b)
    {
        return new Baz(b);
    }
}
Clearly this is strongly typed at compile time. So now I could make a delegate declaration that is something like this:
public delegate Baz SomeDelegate(Bar b);
and then I could modify Foo and add a property:
...
public SomeDelegate MyCall { get { return DoSomething; } }
...
You need to ask yourself how it is different to do this:
Bar b = new Bar();
Foo aFoo = new Foo();
var myDelegate = aFoo.MyCall;
Baz baz = myDelegate(b);
And
Bar b = new Bar();
Func<Bar, Baz> myDelegate = bar => new Baz(bar);
Baz baz = myDelegate(b);
Because what happens under the hood is pretty darn close to this. A lambda expression can be implemented by creating an anonymous class with a method in it. (FWIW, before there were lambda expressions in Java, I would often simulate them by using a static private inner class.) Semantically, it's more complicated than this because of variables that are free/bound and how to handle that morass gracefully (hint: Java doesn't handle it), but ultimately, lambda expressions in C# are syntactic sugar that gives you a delegate defined inline with about as much type inference as C# can handle, and delegates are strongly typed.

Lambda expression arguments for Enumerable and Queryable extension methods

A lambda expression is an anonymous method, which under the covers is a delegate so I can do something like this:
delegate bool Foo(int x);
Foo bar = x => x == 1;
Passing this delegate to an Enumerable extension method makes perfect sense, as the typical expected argument is a Func, which is shorthand for a delegate:
public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate);
However, I am unclear about how it is possible to pass in the delegate to a Queryable extension method like this one:
public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);
This method expects an Expression<TDelegate> argument, but it is perfectly legal to pass in a lambda expression. What is the mechanism that coerces the lambda expression into Expression<TDelegate> so that it may be consumed?
I am familiar with the fact that Queryable methods build out expression trees for parsing by providers, I'm just curious about this one aspect that isn't immediately obvious to me.
UPDATE
I'm becoming less ignorant about my ignorance. Lambda expressions aren't delegates, but can be used to create either delegates or expressions:
Expression<Func<int, bool>> foo = c => c == 1;
Does the compiler infer the type based on the context? I'm guessing that must be the case, as this isn't legal:
var foo = c => c == 1;

This is described in the specification:
4.6 Expression Tree Types
If a conversion exists from a lambda expression to a delegate type D,
a conversion also exists to the expression tree type Expression<D>.
Whereas the conversion of a lambda expression to a delegate type
generates a delegate that references executable code for the lambda
expression, conversion to an expression tree type creates an
expression tree representation of the lambda expression. Expression
trees are efficient in-memory data representations of lambda
expressions and make the structure of the lambda expression transparent
and explicit.
So there is a conversion from a lambda to a compatible expression tree type, and the compiler emits the equivalent expression tree instead of creating a delegate.
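A quick way to convince yourself of the two conversions is to assign the same lambda text to both target types and look at what you get. This is just a sketch; the variable names are made up.

```csharp
using System;
using System.Linq.Expressions;

// Same lambda text, two different target types.
Func<int, bool> asDelegate = c => c == 1;          // compiled to executable code
Expression<Func<int, bool>> asTree = c => c == 1;  // compiled to a data structure

Console.WriteLine(asDelegate(1));             // True

// The tree can be inspected...
Console.WriteLine(asTree.Body.NodeType);      // Equal
Console.WriteLine(asTree.Parameters[0].Name); // c

// ...and only becomes callable after Compile().
Console.WriteLine(asTree.Compile()(2));       // False
```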

Quite simply, you can't pass a delegate where an Expression<TDelegate> is expected.
However, to make IQueryable methods useful, the C# compiler shipped with VS2008 and above includes a clever trick: a lambda expression with an expression body can be assigned to either a delegate or an Expression<TDelegate>. For a delegate, the compiler normally hoists the lambda into a method.
But for an assignment to an Expression<TDelegate>, it breaks the lambda down into its syntactic structure and turns that into an expression tree.
e.g.
Func<int,int> func = x=>x*x;
Expression<Func<int,int>> expression = x=>x*x;
The first one will probably be turned into a static method with a garbled name, something like:
private static int <B012>SomeMethod(int x){
    return x*x;
}
Whereas the second statement will be transformed into something like:
ParameterExpression paramX = Expression.Parameter(typeof(int));
Expression<Func<int,int>> expression = Expression.Lambda<Func<int,int>>(
    Expression.Multiply(paramX,paramX),paramX);
But you cannot do:
expression = func;
This is not valid, as func is a delegate. You can do this, though:
func = expression.Compile();
Which compiles the expression into a Func.
Note: the suggested transformations may not be 100% correct.
The reason they did this was to allow LINQ-to-Objects (basically Map/Reduce from other languages) to share the same friendly syntax as LINQ-to-Providers. So you can write a statement that means the same thing but can change where the filtering and transformation happen.
GetEmployees().Where(e=>e.LastName=="Smith")
This reads the same either way, but could in theory describe doing the filtering on this box, or on the database, or while parsing an XML file, or any number of other things.
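Here is a small sketch of that "same syntax, different target" idea, using an in-memory array on both sides (an assumption for the demo; a real provider would translate the tree instead of running it locally). The call sites look identical, but overload resolution picks Enumerable.Where for the array and Queryable.Where for the IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

var names = new[] { "Smith", "Jones", "Smith" };

// Binds to Enumerable.Where: the lambda becomes a delegate and runs in-process.
IEnumerable<string> local = names.Where(n => n == "Smith");

// Binds to Queryable.Where: the same lambda text becomes an expression tree
// that the query provider is free to inspect and translate.
IQueryable<string> remote = names.AsQueryable().Where(n => n == "Smith");

Console.WriteLine(local.Count());              // 2
Console.WriteLine(remote.Count());             // 2
Console.WriteLine(remote.Expression.NodeType); // Call
```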

I believe this has to do with how the query is built on an IQueryable. The method requires an expression tree because it can be looked inside and structured to match a more optimized (potentially) query or map more closely to the underlying data source. So simply passing in a Func would allow only for execution, whereas an Expression<Func> allows for expression trees which can be observed and executed.
Also, probably more closely answering your exact question, check this SO post out.

Expression tree emits runtime code?

When you are building the expression tree at runtime there's no code
emitted. It's a way to represent .NET code at runtime...
Ok...
Now let's say I have this code:
ParameterExpression p = Expression.Parameter (typeof (string), "s");
MemberExpression stringLength = Expression.Property (p, "Length");
ConstantExpression five = Expression.Constant (5);
BinaryExpression comparison = Expression.LessThan (stringLength, five);
Expression<Func<string, bool>> lambda= Expression.Lambda<Func<string, bool>> (comparison, p);
Func<string, bool> runnable = lambda.Compile();
Won't this code be in IL? Of course it will! (Maybe the last line won't emit code until Compile is called, but I think the first lines will emit code!)
So what am I saving here?
OK, so the first 5 lines did emit code and the last one didn't... big deal.
What am I missing? Can you please show me the whole picture?

With an Expression Tree, you build a description of some code instead of the code itself.
Expression Trees should not be used in the context of writing regular code that 'shouldn't be compiled at compile time'. They should be used in more dynamic scenarios.
The expression tree you show will compile to: s.Length < 5 and you invoke the runnable with bool isStringSmallerThan5 = runnable("MyString").
The whole idea of Expression Trees is that they describe some code and can be compiled at runtime. This means that you can do the following:
BinaryExpression comparison = null;
if (lessThan)
{
    comparison = Expression.LessThan(stringLength, five);
}
else
{
    comparison = Expression.GreaterThan(stringLength, five);
}
Now you can change the behavior of your code at runtime!
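Put together with the tree from the question, the idea might look like the following sketch. BuildLengthCheck and its lessThan flag are hypothetical names of mine; the flag stands for whatever runtime input decides which comparison you need.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: picks the comparison operator at runtime, then compiles the tree.
Func<string, bool> BuildLengthCheck(bool lessThan)
{
    var s = Expression.Parameter(typeof(string), "s");
    var stringLength = Expression.Property(s, "Length");
    var five = Expression.Constant(5);

    BinaryExpression comparison = lessThan
        ? Expression.LessThan(stringLength, five)
        : Expression.GreaterThan(stringLength, five);

    return Expression.Lambda<Func<string, bool>>(comparison, s).Compile();
}

var shorterThanFive = BuildLengthCheck(lessThan: true);
var longerThanFive = BuildLengthCheck(lessThan: false);

Console.WriteLine(shorterThanFive("dog"));     // True
Console.WriteLine(longerThanFive("dog"));      // False
Console.WriteLine(longerThanFive("kangaroo")); // True
```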
The biggest use of expression trees is that they can be interpreted by a provider. For example, LINQ to Entities uses expression trees and translates them to SQL that can be run against the database. LINQ to SQL is another example of what you can do with expression trees.
This is a nice blog post to get you started.

Expression trees are useful when you receive them in a method, since they let you make more complex use of the expression's content. If you receive a predicate in a method, you can only run it against a target and check the result. If you receive the expression tree representing that predicate, you can parse it and do something useful with it. An example is LINQ, which uses this in many places, among them the Where methods. Capturing the expression tree rather than the IL makes it relatively straightforward to translate the predicate into SQL, rather than doing a full SELECT and running the predicate against the materialized result.
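To give a feel for what "parse it and do something useful" can mean, here is a toy translator that turns a predicate tree into a SQL-like condition string. It is only a sketch under my own assumptions: Translate and Op are hypothetical helpers, only a handful of node types are handled, and real providers do far more (parameters, escaping, joins and so on).

```csharp
using System;
using System.Linq.Expressions;

// Maps the few operators this sketch supports to SQL-like text.
string Op(ExpressionType type) => type switch
{
    ExpressionType.AndAlso => "AND",
    ExpressionType.OrElse => "OR",
    ExpressionType.Equal => "=",
    ExpressionType.GreaterThan => ">",
    ExpressionType.GreaterThanOrEqual => ">=",
    ExpressionType.LessThan => "<",
    ExpressionType.LessThanOrEqual => "<=",
    _ => throw new NotSupportedException($"Operator {type} is not handled in this sketch."),
};

// Walks the tree recursively and emits a condition string.
string Translate(Expression e) => e switch
{
    LambdaExpression l => Translate(l.Body),
    BinaryExpression b => $"({Translate(b.Left)} {Op(b.NodeType)} {Translate(b.Right)})",
    MemberExpression { Expression: ParameterExpression } m => m.Member.Name, // column name
    ConstantExpression c => c.Value is string s ? $"'{s}'" : $"{c.Value}",
    _ => throw new NotSupportedException($"Node {e.NodeType} is not handled in this sketch."),
};

Expression<Func<DateTime, bool>> filter = d => d.Year >= 2020 && d.Month == 12;
var condition = Translate(filter);
Console.WriteLine(condition); // ((Year >= 2020) AND (Month = 12))
```

With a plain Func<DateTime, bool> none of this would be possible; all you could do is call the delegate on each materialized row.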
