Confusion with Enumerable.Select in C# - c#

I am new to the C# and .NET world. I am trying to understand the following statement.
var xyz = Enumerable.Repeat(obj, 1).ToList();
var abc =
xyz.Select(xyzobj => new res {
foo = bar,
xyzs = new [] {xyzobj},
}).ToList();
I understand that Select takes in an object and a transformer function and returns a new form of the object. But here, it takes in a lambda expression with an enum value and another object.
I am a little confused. Is the statement above similar to
var abc = xyz.Select(xyzobj => {
//some work with xyzobj
//and return object.
}).ToList();
Can somebody explain what the above statement actually does? My head just spins with these statements all around in my new work location.
Can somebody direct me to good resources for understanding lambda expressions and enumeration?

There are two main types of lambda expressions in C#.
Expression lambdas like this:
x => foo(x);
This takes a parameter x and performs some transformation on it, foo, returning the result of foo(x) (though technically it does not return a value if the result type of foo(x) is void).
Statement lambdas look like this:
x => {
// code block
}
This takes a parameter, x and performs some action on it, (optionally returning a value if an explicit return is provided). The code block may be composed of multiple statements, meaning you can declare variables, execute loops, etc. This is not supported in the simpler expression lambda syntax. But it's just another type of lambda.
If it helps you can think of the following as being equivalent (although they are not perfectly equivalent if you are trying to parse this as an Expression):
x => foo(x);
x => { return foo(x); }
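As a small sketch of that equivalence (Foo and the LambdaForms class are hypothetical names chosen for illustration), both forms convert to the same delegate type and behave the same, while only the expression form can be assigned to an Expression<TDelegate>:

```csharp
using System;
using System.Linq.Expressions;

static class LambdaForms
{
    // Hypothetical transformation standing in for foo.
    public static int Foo(int x) => x * 2;

    // Expression lambda: the body is a single expression.
    public static readonly Func<int, int> ExpressionForm = x => Foo(x);

    // Statement lambda: the body is a block with an explicit return.
    public static readonly Func<int, int> StatementForm = x => { return Foo(x); };

    // Only the expression form can be converted to an expression tree.
    public static readonly Expression<Func<int, int>> AsTree = x => Foo(x);

    // The following line would not compile, because a statement body
    // cannot be converted to an expression tree by the C# compiler:
    // Expression<Func<int, int>> bad = x => { return Foo(x); };
}
```

Calling ExpressionForm(21), StatementForm(21), or AsTree.Compile()(21) each yields the same result.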
Further Reading
Lambda Expressions (C# Programming Guide)

// Creates an IEnumerable<T> of the type of obj with 1 element and converts it to a List
var xyz = Enumerable.Repeat(obj, 1).ToList();
// Select takes each element in the List xyz
// and passes it to the lambda as the parameter xyzobj.
// Each element is used to create a new res object with object initialization.
// The res object has two properties, foo and xyzs. foo is given the value bar
// (which was defined elsewhere).
// The xyzs property is created using the element from xyz passed into the lambda
var abc = xyz.Select(
xyzobj => new res {
foo = bar,
xyzs = new [] {xyzobj},
}).ToList();
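To make the walkthrough above concrete, here is a minimal runnable sketch. The Res class, its property types, and the SelectDemo.Build helper are assumptions made up for illustration; the real res type from the question is not shown in the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's `res` type.
class Res
{
    public string Foo { get; set; }
    public int[] Xyzs { get; set; }
}

static class SelectDemo
{
    public static List<Res> Build(int obj, string bar)
    {
        // A one-element list containing obj.
        var xyz = Enumerable.Repeat(obj, 1).ToList();

        // For each element, create a new Res via an object initializer.
        // Foo gets the outer value bar; Xyzs wraps the element in an array.
        return xyz.Select(xyzobj => new Res
        {
            Foo = bar,
            Xyzs = new[] { xyzobj },
        }).ToList();
    }
}
```

So Select here is still "element in, transformed element out"; the transformation just happens to be "construct a new object".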

Related

When compiling C# expression trees into methods, is it possible to access "this"?

I am trying to dynamically generate a class that implements a given interface. Because of this, I need to implement some methods. I would like to avoid directly emitting IL instructions, so I am trying to use Expression trees and CompileToMethod. Unfortunately, some of these methods need to access a field of the generated class (as if I wrote this.field into the method I am implementing). Is it possible to access "this" using expression trees? (By "this" I mean the object the method will be called on.)
If yes, what would a method like this look like with expression trees?
int SomeMethod() {
return this.field.SomeOtherMethod();
}
Expression.Constant or ParameterExpression are your friends; examples:
var obj = Expression.Constant(this);
var field = Expression.PropertyOrField(obj, "field");
var call = Expression.Call(field, field.Type.GetMethod("SomeOtherMethod"));
var lambda = Expression.Lambda<Func<int>>(call);
or:
var obj = Expression.Parameter(typeof(SomeType));
var field = Expression.PropertyOrField(obj, "field");
var call = Expression.Call(field, field.Type.GetMethod("SomeOtherMethod"));
var lambda = Expression.Lambda<Func<SomeType, int>>(call, obj);
(in the latter case, you'd pass this in as a parameter, but it means you can store the lambda and re-use it for different target instance objects)
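A self-contained sketch of the parameter variant, assuming made-up types Helper and SomeType (the question does not show the real ones), might look like this:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical types standing in for the question's generated class.
class Helper
{
    public int SomeOtherMethod() => 7;
}

class SomeType
{
    public Helper field = new Helper();
}

static class ThisDemo
{
    public static Func<SomeType, int> Build()
    {
        // The parameter plays the role of "this".
        var self = Expression.Parameter(typeof(SomeType), "self");
        var field = Expression.PropertyOrField(self, "field");
        var call = Expression.Call(field, typeof(Helper).GetMethod("SomeOtherMethod"));

        // Equivalent to: self => self.field.SomeOtherMethod()
        return Expression.Lambda<Func<SomeType, int>>(call, self).Compile();
    }
}
```

The compiled delegate can then be invoked with any SomeType instance, which is what makes it reusable across targets.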
Another option here might be dynamic if your names are fixed:
dynamic obj = someDuckTypedObject;
int i = obj.field.SomeOtherMethod();

In C#, how can I create a value type variable at runtime?

I am attempting to implement a method like:
(Func<T> getFn, Action<T> setFn) MakePair<T>(T initialVal) {
}
It will return two runtime generated lambdas that get and set a dynamically created variable using Expression trees to create the code.
My current solution is to dynamically create an array of the type with one element, and reference that:
(Func<T> getFn, Action<T> setFn) MakePair<T>(T initialVal) {
var dynvar = Array.CreateInstance(typeof(T), 1);
Expression<Func<Array>> f = () => dynvar;
var dynref = Expression.Convert(f.Body, typeof(T).MakeArrayType());
var e0 = Expression.Constant(0);
var getBody = Expression.ArrayIndex(dynref, e0);
var setParam = Expression.Parameter(typeof(T));
var setBody = Expression.Assign(Expression.ArrayAccess(dynref, e0), setParam);
var getFn = Expression.Lambda<Func<T>>(getBody).Compile();
var setFn = Expression.Lambda<Action<T>>(setBody, setParam).Compile();
return (getFn, setFn);
}
Is there a better way to create what may be a value type variable at runtime that can be read/written to than using an array?
Is there a better way to reference the runtime created array other than using a lambda to create the (field?) reference for use in the ArrayIndex/ArrayAccess method calls?
Excessive Background Info
For those who wonder, ultimately this came up in an attempt to create something like Perl's autovivification of lvalues for Perl hashes.
Imagine you have a List<T> with duplicate elements and want to create a Dictionary<T,int> that allows you to look up the count for each unique T in the list. You can use a few lines of code to count (in this case T is int):
var countDict = new Dictionary<int, int>();
foreach (var n in testarray) {
countDict.TryGetValue(n, out int c);
countDict[n] = c + 1;
}
But I want to do this with LINQ, and I want to avoid double-indexing countDict (interestingly, ConcurrentDictionary has AddOrUpdate for this purpose) so I use Aggregate:
var countDict = testarray.Aggregate(new Dictionary<int,int>(), (d, n) => { ++d[n]; return d; });
But this has a couple of issues. First, Dictionary won't create a value for a missing value, so you need a new type of Dictionary that auto-creates missing values using e.g. a seed lambda:
var countDict = testarray.Aggregate(new SeedDictionary<int, Ref<int>>(() => Ref.Of(() => 0)), (d, n) => { var r = d[n]; ++r.Value; return d; });
But you still have the lvalue problem, so you replace the plain int counter with a Ref class. Unfortunately, C# can't create a C++ first class Ref class, but using one based around auto-creating a setter lambda from a getter lambda (using expression trees) is close enough. (Unfortunately C# still won't accept ++d[n].Value; even though it should be valid, so you have to create a temporary.)
But now you have the problem of creating multiple runtime integer variables to hold the counts. I extended the Ref<> class to take a lambda that returns a constant (ConstantExpression) and create a runtime variable and build a getter and setter with the constant being the initial value.
I agree with some of the question commenters that expression trees seem unnecessary, so here is a simple implementation of the shown API without them:
struct Box<T> {
public T Value;
}
(Func<T> getFn, Action<T> setFn) MakePair<T>(T initialVal) {
var box = new Box<T> { Value = initialVal };
return (() => box.Value, v => box.Value = v);
}
As an answer to the stated question (how to define dynref without a lambda), then, is there something wrong with the following modifications to dynvar and dynref?
var dynvar = new T[] { initialVal };
var dynref = Expression.Constant(dynvar);
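Dropping those two lines into the question's method gives roughly the following. This is a sketch that assumes the rest of the method stays as the asker wrote it; the PairFactory class name is only for illustration.

```csharp
using System;
using System.Linq.Expressions;

static class PairFactory
{
    public static (Func<T> getFn, Action<T> setFn) MakePair<T>(T initialVal)
    {
        // One-element array acts as the runtime "variable".
        var dynvar = new T[] { initialVal };

        // The array instance is embedded directly as a constant node.
        var dynref = Expression.Constant(dynvar);
        var e0 = Expression.Constant(0);

        var getBody = Expression.ArrayIndex(dynref, e0);
        var setParam = Expression.Parameter(typeof(T), "value");
        var setBody = Expression.Assign(Expression.ArrayAccess(dynref, e0), setParam);

        var getFn = Expression.Lambda<Func<T>>(getBody).Compile();
        var setFn = Expression.Lambda<Action<T>>(setBody, setParam).Compile();
        return (getFn, setFn);
    }
}
```

Both delegates refer to the same array instance, so a value written through setFn is visible through getFn.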

Does the LINQ Expression API offer no way to create a variable?

I want to validate my assumption that the LINQ Expression API does not have any means for us to create an expression that represents the creation of a local variable.
In other words, you cannot create an expression to represent:
int local;
since that is a variable declaration statement, and the API does not support statement lambdas. The only state that a lambda expression, as represented by the LINQ Expression API (and not a delegate instance) can work with is parameters it receives and the captured variables it receives via a closure.
Is my assumption (based on a few months of practice of the LINQ Expression API) correct?
False. There are some overloads of Expression.Block to do it.
What is true is that you can't create a lambda expression through the use of the C# compiler that has a variable, but that is a limitation of the compiler.
So you can't
Expression<Func<int>> exp = () => {
int v = 1;
return v;
};
but you can
var variable = Expression.Variable(typeof(int));
var lambda = Expression.Lambda<Func<int>>(
Expression.Block(
new[] { variable },
Expression.Assign(variable, Expression.Constant(1)),
variable)); // With lambda expressions, there is an implicit
// return of the last value "loaded" on the stack
"since that is a variable declaration statement, and the API does not support statement lambdas."
This was true in .NET < 4.0. In .NET 4.0 Microsoft added Expression methods to build nearly everything that can be present in the body of a method. Some things are still missing, such as unsafe code keywords/operators, and while the primitives are there, complex constructs like for or lock are not; those can be built on top of the other constructs. Note that 90% of those added things are incompatible with LINQ-to-SQL/EF.
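As an illustration of building a for-style loop from the primitives, here is a sketch using Expression.Variable, Expression.Loop, and a break label. The LoopDemo class and BuildSumTo name are invented for this example.

```csharp
using System;
using System.Linq.Expressions;

static class LoopDemo
{
    // Builds a delegate roughly equivalent to:
    //   n => { int i = 1; int sum = 0;
    //          while (true) { if (i <= n) { sum += i; i++; } else break; }
    //          return sum; }
    public static Func<int, int> BuildSumTo()
    {
        var n = Expression.Parameter(typeof(int), "n");
        var i = Expression.Variable(typeof(int), "i");
        var sum = Expression.Variable(typeof(int), "sum");

        // The label carries the loop's result value out of the loop.
        var exit = Expression.Label(typeof(int), "exit");

        var body = Expression.Block(
            new[] { i, sum },
            Expression.Assign(i, Expression.Constant(1)),
            Expression.Assign(sum, Expression.Constant(0)),
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.LessThanOrEqual(i, n),
                    Expression.Block(
                        Expression.AddAssign(sum, i),
                        Expression.PostIncrementAssign(i)),
                    Expression.Break(exit, sum)),
                exit));

        return Expression.Lambda<Func<int, int>>(body, n).Compile();
    }
}
```

The loop expression is the last expression in the block, so the value passed to Break becomes the value of the whole lambda body.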
Well, you can use Expression.Block to declare a block which contains local variables...
For example:
using System;
using System.Linq.Expressions;
public class Test
{
static void Main()
{
var x = Expression.Variable(typeof(int), "x");
var assignment1 = Expression.Assign(x, Expression.Constant(1, typeof(int)));
var assignment2 = Expression.Assign(x, Expression.Constant(2, typeof(int)));
var block = Expression.Block(new[] { x }, new[] { assignment1, assignment2 });
}
}
That builds an expression tree equivalent to:
{
int x;
x = 1;
x = 2;
}
The C# compiler doesn't use this functionality within lambda expression conversions to expression trees, which are currently still restricted to expression lambdas, as far as I'm aware.

What are C# lambdas compiled into? A stackframe, an instance of an anonymous type, or?

I've read this question, which mostly answers "why" you can't use a lambda when also using implicit type features. But this question is aimed at what construct the compiler produces to actually carry out the code of a lambda. Is it a method call on an anonymous type (something like anonymous types that implement an interface in Java?), or is it just a stack frame with references to closed-over variables that accepts the parameter signature? Some lambdas don't close over anything, so are there two different resulting outputs from the compiler?
Assuming you mean "as a delegate", then it still depends :p if it captures any variables (including "this", which may be implicit) then those variables are actually implemented as fields on a compiler-generated type (not exposed anywhere public), and the statement body becomes a method on that capture class. If there are multiple levels of capture, the outer capture is again a field on the inner capture class. But essentially:
int i = ...
Func<int,int> func = x => 2*x*i;
Is like:
var capture = new SecretType();
capture.i = ...
Func<int,int> func = capture.SecretMethod;
Where:
class SecretType {
public int i;
public int SecretMethod(int x) { return 2*x*i; }
}
This is identical to "anonymous methods", but with different syntax.
Note that methods that do not capture state may be implemented as static methods without a capture class.
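One observable consequence of the capture class is that the lambda and the enclosing method share the same variable, not a copy. A small sketch (CaptureDemo is a made-up name):

```csharp
using System;

static class CaptureDemo
{
    public static (int before, int after, bool hasTarget) Run()
    {
        int i = 3;

        // i is hoisted into a field of a compiler-generated class;
        // the lambda body becomes a method on that class.
        Func<int, int> func = x => 2 * x * i;

        int before = func(5);   // uses i == 3

        // Writing to i here writes to the same hoisted field.
        i = 10;
        int after = func(5);    // uses i == 10

        // For a capturing lambda, the delegate's target is the capture instance.
        return (before, after, func.Target != null);
    }
}
```

Here the first call sees i as 3 and the second sees it as 10, because both the method and the delegate read the same field.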
Expression trees, on the other hand... Are trickier to explain :p
But (I don't have a compiler to hand, so bear with me):
int i = ...
Expression<Func<int,int>> func = x => 2*x*i;
Is something like:
var capture = new SecretType();
capture.i = ...
var p = Expression.Parameter(typeof(int), "x");
Expression<Func<int,int>> func = Expression.Lambda<Func<int,int>>(
Expression.Multiply(
Expression.Multiply(Expression.Constant(2),p),
Expression.PropertyOrField(Expression.Constant(capture), "i")
), p);
(except using the non-existent "memberof" construct, since the compiler can cheat)
Expression trees are complex, but can be deconstructed and inspected - for example to translate into TSQL.
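The hand-built tree above can be tried out with an ordinary class standing in for the compiler's hidden capture type. In this sketch, Capture and TreeDemo are illustrative names, not anything the compiler actually emits:

```csharp
using System;
using System.Linq.Expressions;

// Stand-in for the compiler-generated capture class.
class Capture
{
    public int i;
}

static class TreeDemo
{
    public static Expression<Func<int, int>> Build(Capture capture)
    {
        var p = Expression.Parameter(typeof(int), "x");

        // x => 2 * x * capture.i
        return Expression.Lambda<Func<int, int>>(
            Expression.Multiply(
                Expression.Multiply(Expression.Constant(2), p),
                Expression.PropertyOrField(Expression.Constant(capture), "i")),
            p);
    }
}
```

The tree can be inspected (its Body is a BinaryExpression with NodeType Multiply) or compiled to a delegate; the compiled delegate reads capture.i each time it is called.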
Lambda expressions are indeed anonymous functions, but with more versatility. These two MSDN articles have a lot of information on lambda expressions: how to use them, what precedence the => operator has, how they relate to anonymous functions, and some advanced suggestions for use.
Lambda Expressions (MSDN)
=> Operator (MSDN)
Here are some examples:
public class C
{
private int field = 0;
public void M()
{
int local = 0;
Func<int> f1 = () => 0;
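// f1 captures nothing: it is a delegate that references a compiler-generated method
// (a static method in C with older compilers; newer compilers may place it on a hidden helper class)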
Func<int> f2 = () => this.field;
// f2 is a delegate that references a compiler-generated instance method in C
Func<int> f3 = () => local;
// f3 is a delegate that references an instance method of a compiler-generated nested class in C
}
}
A lambda expression is an unnamed method written in place of a delegate instance.
The compiler converts it to either:
A delegate instance
An expression tree, of type Expression<TDelegate>, representing the code inside in a traversable object model. This allows the lambda expression to be interpreted at runtime.
So the compiler resolves lambda expressions by moving the expression's code into a private method.

Expression Tree parsing, variables ends up as constants

So I'm working on a project where I need to parse an expression tree. I got most of the things working, but I've run into a bit of a problem.
I've been looking at the other questions on StackOverflow on Expression Trees, but can't seem to find an answer to my question, so here goes.
My problem is the difference (or lack of) between constants and variables. Let me start off with an example:
user => user.Email == email
This is clearly not a constant but a variable, but this ends up being a ConstantExpression somewhere in the expression tree. If you take a look at the expression itself, it looks a bit odd:
Expression = {value(stORM.Web.Security.User+<>c__DisplayClassa)}
If we take another example:
task => task.Status != TaskStatus.Done && task.Status != TaskStatus.Failed
Here I'm using an enum (TaskStatus).
So my problem is that in the tree parsing I seem to end up with a ConstantExpression in both cases, and I really need to be able to tell them apart. These are just examples, so what I'm asking for is a generic way of telling these two types of expression from each other, so I can handle them in two different ways in my parsing.
EDIT: okay, my examples might not be clear, so I'll try again. First example:
User user = db.Search<User>(u => u.Email == email);
I'm trying to find a user with the given e-mail address. I'm parsing this into a stored procedure, but that's beside the point I guess.
Second example:
IList<Task> tasks = db.Search(t => t.Status != TaskStatus.Done && t.Status != TaskStatus.Failed);
And here I'm trying to locate all tasks with a status different from Done and Failed.
Again this is being parsed into a stored procedure. In the first example my code needs to determine that the stored procedure needs an input parameter, the value of the email variable. In the second example I don't need any input parameters; I just need to create the SQL for selecting tasks with a status different from Done and Failed.
Thanks again
So from the point of view of expression the value is a constant. It can not be changed by the expression.
What you have is a potentially open closure - i.e. the value can change between executions of the expression, but not during it. So it is a "constant". This is a paradigm difference between the world of functional programming and un-functional :) programming.
Consider
int a =2;
Expression<Func<int, int>> h = x=> x+ a;
Expression<Func<int, int>> j = x => x +2;
a = 1;
The term a is a member access into a compiler-generated closure class that wraps up the variable a (the variable is hoisted into a field of that class rather than living on the stack). The first node is a MemberAccess node; underneath that, the expression is a constant.
For the code above:
((SimpleBinaryExpression)(h.Body)).Right
{value(WindowsFormsApplication6.Form1+<>c__DisplayClass0).a}
CanReduce: false
DebugView: ".Constant<WindowsFormsApplication6.Form1+<>c__DisplayClass0>(WindowsFormsApplication6.Form1+<>c__DisplayClass0).a"
Expression: {value(WindowsFormsApplication6.Form1+<>c__DisplayClass0)}
Member: {Int32 a}
NodeType: MemberAccess
Type: {Name = "Int32" FullName = "System.Int32"}
And the constant underneath that:
((MemberExpression)((SimpleBinaryExpression)(h.Body)).Right).Expression
{value(WindowsFormsApplication6.Form1+<>c__DisplayClass0)}
CanReduce: false
DebugView: ".Constant<WindowsFormsApplication6.Form1+<>c__DisplayClass0>(WindowsFormsApplication6.Form1+<>c__DisplayClass0)"
NodeType: Constant
Type: {Name = "<>c__DisplayClass0" FullName = "WindowsFormsApplication6.Form1+<>c__DisplayClass0"}
Value: {WindowsFormsApplication6.Form1.}
The plain old 2 comes out to a:
((SimpleBinaryExpression)(j.Body)).Right
{2}
CanReduce: false
DebugView: "2"
NodeType: Constant
Type: {Name = "Int32" FullName = "System.Int32"}
Value: 2
So I don't know if that helps you or not. You can kind of tell by looking at the parent node - or the type of object being accessed by the parent node.
Adding as a result of your clarification -
so when you say
user => user.Email == email
You mean look for all users with an email equal to a passed-in parameter. However, that LINQ expression means something quite different.
What you want to say is
Expression<Func<User, string, bool>> exp = (user, email) => user.Email == email;
This way the email will now be a parameter. If you don't like that there is one other thing you can do.
The second example will work just fine: no extra params are needed, and consts will be consts.
t => t.Status != TaskStatus.Done && t.Status != TaskStatus.Failed
Edit: adding another way:
So one of the things that you had to do to get your code working was declare a string email outside the lambda - that is kind of clunky.
You could better identify parameters by conventionally putting them in a specific place, like a static class. Then when going through the lambda you don't have to look at some horrible closure object, but at a nice static class of your own making.
public static class Parameter
{
public static T Input<T>(string name)
{
return default(T);
}
}
Then your code looks like this:
Expression<Func<User, bool>> exp = x => x.Email == Parameter.Input<String>("email");
You can then traverse the tree; when you come to a call to the Parameter static class you can look at the type and the name (in the arguments collection) and off you go....
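One way that traversal might look is sketched below with an ExpressionVisitor. The InputFinder, InputDemo, and User types are assumptions made for illustration; only the Parameter class comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class Parameter
{
    public static T Input<T>(string name)
    {
        return default(T);
    }
}

// Hypothetical entity type for the demo.
class User
{
    public string Email { get; set; }
}

// Collects the names passed to Parameter.Input<T>(...) anywhere in a tree.
class InputFinder : ExpressionVisitor
{
    public List<string> Names { get; } = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(Parameter)
            && node.Method.Name == "Input"
            && node.Arguments[0] is ConstantExpression nameArg)
        {
            Names.Add((string)nameArg.Value);
        }
        return base.VisitMethodCall(node);
    }
}

static class InputDemo
{
    public static List<string> Find()
    {
        Expression<Func<User, bool>> exp =
            x => x.Email == Parameter.Input<string>("email");

        var finder = new InputFinder();
        finder.Visit(exp);
        return finder.Names;
    }
}
```

The parameter's type is available from node.Method.ReturnType if the stored procedure generator needs it.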
The name is a bit unfortunate, it is not actually a constant.
It simply refers to a value outside the Expression.
A captured variable (the first case with email) is typically a ConstantExpression representing the capture class instance, with a MemberExpression to a FieldInfo for the "variable" - as if you had:
private class CaptureClass {
public string email;
}
...
var obj = new CaptureClass();
obj.email = "foo@bar.com";
Here, obj is the constant inside the expression.
So: if you see a MemberExpression (of a field) to a ConstantExpression, it is probably a captured variable. You could also check for CompilerGeneratedAttribute on the capture-class...
A literal constant will typically just be a ConstantExpression; in fact, it would be hard to think of a scenario where you use a constant's member, unless you did something like:
() => "abc".Length
but here .Length is a property (not a field), and the string probably doesn't have [CompilerGenerated].
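The heuristic described above could be sketched as follows. CaptureCheck and its members are invented names, and the check is only as reliable as the "probably" in the text: it relies on the compiler marking its capture classes with CompilerGeneratedAttribute.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Runtime.CompilerServices;

static class CaptureCheck
{
    // True when e looks like "field of a compiler-generated capture object".
    public static bool IsCapturedVariable(Expression e) =>
        e is MemberExpression m
        && m.Member is FieldInfo
        && m.Expression is ConstantExpression c
        && c.Type.IsDefined(typeof(CompilerGeneratedAttribute), false);

    // Reads the current value of a captured variable out of the capture object.
    public static object GetCapturedValue(MemberExpression m)
    {
        var holder = ((ConstantExpression)m.Expression).Value;
        return ((FieldInfo)m.Member).GetValue(holder);
    }

    public static (bool captured, object value, bool literalIsPlainConstant) Demo()
    {
        string email = "a@b.c";
        Expression<Func<string, bool>> withCapture = s => s == email;
        Expression<Func<string, bool>> withLiteral = s => s == "x";

        var right1 = ((BinaryExpression)withCapture.Body).Right;
        var right2 = ((BinaryExpression)withLiteral.Body).Right;

        bool captured = IsCapturedVariable(right1);
        object value = captured ? GetCapturedValue((MemberExpression)right1) : null;
        return (captured, value, right2 is ConstantExpression);
    }
}
```

In the asker's scenario, a node that passes IsCapturedVariable would become a stored procedure input parameter, while a bare ConstantExpression would be inlined into the SQL.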
Just check the Type of the ConstantExpression. Any 'constant' ConstantExpression has a primitive type.
