Why are lambdas convertible to expressions but method groups are not?


LINQPad example:
void Main()
{
One(i => PrintInteger(i));
One(PrintInteger);
Two(i => PrintInteger(i));
// Two(PrintInteger); - won't compile
}
static void One(Action<int> a)
{
a(1);
}
static void Two(Expression<Action<int>> e)
{
e.Compile()(2);
}
static void PrintInteger(int i)
{
Console.WriteLine(i);
}
Uncommenting the Two(PrintInteger); line results in an error:
cannot convert from 'method group' to
'System.Linq.Expressions.Expression<System.Action<int>>'
This is similar to Convert Method Group to Expression, but I'm interested in the "why." I understand that Features cost money, time and effort; I'm wondering if there's a more interesting explanation.

Because, in order to get the expression tree, we need a representation of the method in (uncompiled) source form. Lambda expressions are locally available in the source code and therefore are always available uncompiled. But methods may not be from inside the current assembly, and may thus be available only in compiled form.
Granted, the C# compiler could decompile the assembly’s IL code to retrieve an expression tree, but as you mentioned, implementing features costs money, this particular feature isn’t trivial, and the benefits are unclear.

There is no reason in principle. It could be done this way. The compiler could just create the lambda by itself before converting it (this is obviously always possible - it knows the exact method being called so it can just create a lambda from its parameters).
There is one catch, though. The name of the parameter of your lambda is normally hard-coded into the IL being generated. If there is no lambda, there is no name. But the compiler could either create a dummy name or reuse the names of the method being called (they are always available in the .NET assembly format).
Why didn't the C# team decide to enable this? The only reason that comes to mind is that they wanted to spend their time elsewhere. I applaud them for that decision. I'd rather have LINQ or async than this obscure feature.
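The idea that the compiler "could just create the lambda by itself" can be sketched by hand with the System.Linq.Expressions API. This is only an illustration of the principle, not what the C# compiler does: the helper name FromMethod and the LastPrinted field are invented for the example, and the sketch assumes the delegate wraps an ordinary static or instance method taking a single int.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

static class MethodGroupToExpression
{
    // Hypothetical field, only here so the effect of the call can be observed.
    public static int LastPrinted;

    public static void PrintInteger(int i)
    {
        LastPrinted = i;
        Console.WriteLine(i);
    }

    // Builds by hand the tree that "i => PrintInteger(i)" would produce.
    // The method group is first converted to a delegate (which is allowed),
    // and the delegate's MethodInfo is then wrapped in a call expression.
    public static Expression<Action<int>> FromMethod(Action<int> method)
    {
        MethodInfo info = method.Method;

        // Reuse the parameter name stored in the method's metadata.
        ParameterExpression p =
            Expression.Parameter(typeof(int), info.GetParameters()[0].Name);

        // Static methods have no receiver; instance methods need their target.
        Expression instance = method.Target == null
            ? null
            : Expression.Constant(method.Target);

        return Expression.Lambda<Action<int>>(
            Expression.Call(instance, info, p), p);
    }
}
```

With such a helper, Two(MethodGroupToExpression.FromMethod(PrintInteger)) would compile, because the method group is converted to Action<int> before any expression tree is involved.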

In the One example, you are implicitly creating an Action<int> delegate. It's the same as:
One( new Action<int>( PrintInteger ) );
I believe this is in the language to improve the syntax for subscribing to events.
The same thing doesn't happen for Expression<T>, which is why your second example doesn't compile.
EDIT :
It's called a "method group conversion". It's in the C# spec - section 6.6
An implicit conversion (§6.1) exists from a method group (§7.1) to a compatible delegate type

Related

Using a dynamic variable as a method argument disables (some) compiler checks

Can someone explain to me why the compiler does not check the return type of a function if a dynamic variable is used as an argument to a method call?
class Program
{
static void Main(string[] args)
{
// int a = GetAString(1); // Compiler error CS0029 Cannot implicitly convert type 'string' to 'int'
dynamic x = 1;
int b = GetAString(x); // No compiler error -> Runtime Binder Exception
// int c = (string)GetAString(x); // Compiler error CS0029 Cannot implicitly convert type 'string' to 'int'
}
static string GetAString(int uselessInt)
{
return "abc";
}
}

By using dynamic the compiler will generate a call site anywhere you use a dynamic parameter. This call site will attempt to resolve the method at runtime, and if it cannot find a matching method will raise an exception.
In your example the call site examines x and sees that it is an int. It then looks for any methods called GetAString that take an int and finds your method and generates code to make the call.
Next, it will generate code to attempt to assign the return value to b. All of this is still done at runtime as the use of the dynamic variable has made the entire expression require runtime evaluation. The call site will see if it can generate code to assign a string to an int, and as it cannot it will raise an exception.
As an aside, your example doesn't make a lot of sense, as you seem to want to assign a string to an int. Your GetAString method even returns a non-numeric value, so it is never going to be assignable to an int. If you write:
dynamic x = 1;
string b = GetAString(x);
Then everything should work.
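The behaviour described in this answer can be observed with a small sketch. The class and method names below (DynamicReturnDemo, AssignmentFailsAtRuntime, CastRestoresStaticTyping) are made up for the illustration; GetAString is the method from the question.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

static class DynamicReturnDemo
{
    static string GetAString(int uselessInt)
    {
        return "abc";
    }

    // The call has a dynamic argument, so the whole invocation is typed
    // as dynamic and the string-to-int conversion is only checked at run time.
    public static bool AssignmentFailsAtRuntime()
    {
        dynamic x = 1;
        try
        {
            int b = GetAString(x); // compiles, fails when executed
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    // Casting the result opts back in to static typing: from here on the
    // compiler treats the value as a string again.
    public static string CastRestoresStaticTyping()
    {
        dynamic x = 1;
        string s = (string)GetAString(x);
        return s;
    }
}
```

The cast is also why the third line in the question is rejected at compile time: once the result is statically a string, assigning it to an int is an ordinary CS0029 error.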

In the general case, the candidates aren't necessarily as straightforward as yours. For example, consider these two methods:
string M(string a) => a;
char[] M(char[] a) => a;
What should this code suggest as the type of the last variable?
dynamic d = SomeExpression();
var s = M(d);
At this point, the designers of C# would have to make a choice:
Assert that the return value of a method called with dynamic arguments is also dynamic itself.
Select a type that can be assigned from all methods of the group (e.g. IEnumerable<char>).
The latter option is essentially what you're describing in your question. The C# designers went with the former option. Possible reasons for that design decision could be:
Maybe they thought that if you opt in to dynamic in an expression, then it's more likely than not that you'll want to keep using dynamic on any dependent expressions, until you explicitly opt out of it again.
Maybe they didn't introduce dynamic to enable multiple dispatch, so they didn't want to encourage it further by including provisions for static typing.
Maybe they thought that including those provisions would bloat the specification or make the language harder to understand.
Maybe the former option is simpler to implement (assuming you already have the rest of dynamic implemented) and they decided the other option wasn't worth more time or effort.
Maybe it's just not that straightforward to implement in C#. Value types could require boxing to match the common supertype, which complicates things. Raw pointer types are out of the unified hierarchy altogether.
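To make the first option concrete, here is a small sketch using the two M overloads from above. The wrapper names DynamicOverloadDemo and ResultType are hypothetical. The overload is chosen at run time from the actual type of the argument, and the variable declared with var is itself dynamic.

```csharp
using System;

static class DynamicOverloadDemo
{
    static string M(string a) { return a; }
    static char[] M(char[] a) { return a; }

    // Reports which overload's return type was produced for a given argument.
    public static Type ResultType(object value)
    {
        dynamic d = value;
        var s = M(d);      // s is dynamic, not string or char[]
        object boxed = s;  // dynamic converts implicitly to object
        return boxed.GetType();
    }
}
```

Calling ResultType("abc") goes through the string overload and ResultType(new[] { 'a' }) through the char[] overload, even though the static type of the argument is the same in both calls.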

Why can't the compiler tell the better conversion target in this overload resolution case? (covariance)

Understanding the C# Language Specification on overload resolution is clearly hard, and now I am wondering why this simple case fails:
void Method(Func<string> f)
{
}
void Method(Func<object> f)
{
}
void Call()
{
Method(() => { throw new NotSupportedException(); });
}
This gives compile-time error CS0121, The call is ambiguous between the following methods or properties: followed by my two Method function members (overloads).
What I would have expected was that Func<string> was a better conversion target than Func<object>, and then the first overload should be used.
Since .NET 4 and C# 4 (2010), the generic delegate type Func<out TResult> has been covariant in TResult, and for that reason an implicit conversion exists from Func<string> to Func<object> while clearly no implicit conversion can exist from Func<object> to Func<string>. So it would make Func<string> the better conversion target, and the overload resolution should pick the first overload?
My question is simply: What part of the C# Spec am I missing here?
Addition: This works fine:
void Call()
{
Method(null); // OK!
}

My question is simply: What part of the C# Spec am I missing here?
Summary:
You have found a minor known bug in the implementation.
The bug will be preserved for backwards compatibility reasons.
The C# 3 specification contained an error regarding how the "null" case was to be handled; it was fixed in the C# 4 specification.
You can reproduce the buggy behavior with any lambda where the return type cannot be inferred. For example: Method(() => null);
Details:
The C# 5 specification says that the betterness rule is:
If the expression has a type then choose the better conversion from that type to the candidate parameter types.
If the expression does not have a type and is not a lambda, choose the conversion to the type that is better.
If the expression is a lambda then first consider which parameter type is better; if neither is better and the delegate types have identical parameter lists then consider the relationship between the inferred return type of the lambda and the return types of the delegates.
So the intended behaviour is: first the compiler should check to see if one parameter type is clearly better than the other, regardless of whether the argument has a type. If that doesn't resolve the situation and the argument is a lambda, then check to see which of the inferred return type converted to the parameters' delegate types' return type is better.
The bug in the implementation is the implementation doesn't do that. Rather, in the case where the argument is a lambda it skips the type betterness check entirely and goes straight to the inferred return type betterness check, which then fails because there is no inferred return type.
My intention was to fix this for Roslyn. However, when I went to implement this, we discovered that making the fix caused some real-world code to stop compiling. (I do not recall what the real-world code was and I no longer have access to the database that holds the compatibility issues.) We therefore decided to maintain the existing small bug.
I note that the bug was basically impossible before I added delegate variance in C# 4; in C# 3 it was impossible for two different delegate types to be more or less specific, so the only rule that could apply was the lambda rule. Since there was no test in C# 3 that would reveal the bug, it was easy to write. My bad, sorry.
I note also that when you start throwing expression tree types into the mix, the analysis gets even more complicated. Even though Func<string> is better than Func<object>, Expression<Func<string>> is not convertible to Expression<Func<object>>! It would be nice if the algorithm for betterness was agnostic with respect to whether the lambda was going to an expression tree or a delegate, but it is in some ways not. Those cases get complicated and I don't want to labour the point here.
This minor bug is an object lesson in the importance of implementing what the spec actually says and not what you think it says. Had I been more careful in C# 3 to ensure that the code matched the spec then the code would have failed on the "null" case and it would then have been clear earlier that the C# 3 spec was wrong. And the implementation does the lambda check before the type check, which was a time bomb waiting to go off when C# 4 rolled around and suddenly that became incorrect code. The type check should have been done first regardless.
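Two ways to sidestep the ambiguity follow from the rules quoted above: give the argument a type, or give the lambda a return type that can be inferred. The sketch below assumes the two overloads from the question, changed to return a label so the chosen overload is visible; the names OverloadDemo, WithCast and WithInferredReturn are invented for the example.

```csharp
using System;

static class OverloadDemo
{
    static string Method(Func<string> f) { return "Func<string>"; }
    static string Method(Func<object> f) { return "Func<object>"; }

    // The cast gives the argument a type, so the first betterness rule
    // applies: the identity conversion to Func<string> wins.
    public static string WithCast()
    {
        return Method((Func<string>)(() => { throw new NotSupportedException(); }));
    }

    // Here the lambda has an inferred return type (string), so the
    // lambda rule can compare it against both delegate return types.
    public static string WithInferredReturn()
    {
        return Method(() => "text");
    }
}
```

Both helpers end up calling the Func<string> overload; neither executes the lambda, so the exception in WithCast is never thrown.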

Well, you are right. What causes the problem here is the delegate you are passing as an argument. It has no explicit return type; you are just throwing an exception. An exception is basically an object, but it is not considered the return type of a method. Since there is no return statement following the throw, the compiler is not sure which overload it should use.
Just try this
void Call()
{
Method(() =>
{
throw new NotSupportedException();
return "";
});
}
No problem with choosing an overload now because of explicitly stated type of an object passed to a return call. It does not matter that the return call is unreachable due to the exception throw, but now the compiler knows what overload it should use.
EDIT:
As for the case with passing null, frankly, I don't know the answer.

Delegate as first param to an Extension Method

Ladies and Gents,
I recently tried this experiment:
static class TryParseExtensions
{
public delegate bool TryParseMethod<T>(string s, out T maybeValue);
public static T? OrNull<T>(this TryParseMethod<T> tryParser, string s) where T:struct
{
T result;
return tryParser(s, out result) ? (T?)result : null;
}
}
// compiler error "'int.TryParse(string, out int)' is a 'method', which is not valid in the given context"
var result = int.TryParse.OrNull("1"); // int.TryParse.OrNull<int>("1"); doesn't work either
// compiler error: type cannot be inferred....why?
var result2 = TryParseExtensions.OrNull(int.TryParse, "2");
// works as expected
var result3 = TryParseExtensions.OrNull<int>(int.TryParse, "3");
var result4 = ((TryParseExtensions.TryParseMethod<int>)int.TryParse).OrNull("4");
I am wondering two things:
Why can the compiler not infer the "int" type parameter?
Do I understand correctly that extension methods do not get discovered on delegate types, as I guess they aren't really of that type (but are a "Method") that only happens to match the delegate's signature? As such, a cast solves this. Would it be infeasible to enable scenario 1 to work (not this one specifically of course, but in general)? I guess from a language/compiler perspective and would it actually be useful, or am I just (attempting to) wildly abusing things here?
Looking forward to some insights. Thnx

You have a number of questions here. (In the future I would recommend that when you have multiple questions, split them up into multiple questions rather than one posting with several questions in it; you'll probably get better responses.)
Why can the compiler not infer the "int" type parameter in:
TryParseExtensions.OrNull(int.TryParse, "2");
Good question. Rather than answer that here, I refer you to my 2007 article which explains why this did not work in C# 3.0:
http://blogs.msdn.com/b/ericlippert/archive/2007/11/05/c-3-0-return-type-inference-does-not-work-on-member-groups.aspx
Summing up: fundamentally there is a chicken-and-egg problem here. We must do overload resolution on int.TryParse to determine which overload of TryParse is the intended one (or, if none of them work, what the error is.) Overload resolution always tries to infer from arguments. In this case though, it is precisely the type of the argument that we are attempting to infer.
We could come up with a new overload resolution algorithm that says "well, if there's only one method in the method group then pick that one even if we don't know what the arguments are", but that seems weak. It seems like a bad idea to special-case method groups that have only one method in them because that then penalizes you for adding new overloads; it can suddenly be a breaking change.
As you can see from the comments to that article, we got a lot of good feedback on it. The best feedback we got was basically "well, suppose type inference has already worked out the types of all the arguments and it is the return type that we are attempting to infer; in that case you could do overload resolution". That analysis is correct, and changes to that effect went into C# 4. I talked about that a bit more here:
http://blogs.msdn.com/b/ericlippert/archive/2008/05/28/method-type-inference-changes-part-zero.aspx
Do I understand correctly that extension methods do not get discovered on delegate types, as I guess they aren't really of that type (but are a "Method") that only happens to match the delegate's signature?
Your terminology is a bit off, but your idea is correct. We do not discover extension methods when the "receiver" is a method group. More generally, we do not discover extension methods when the receiver is something that lacks its own type, but rather takes on a type based on its context: method groups, lambdas, anonymous methods and the null literal all have this property. It would be really bizarre to say null.Whatever() and have that call an extension method on String, or even weirder, (x=>x+1).Whatever() and have that call an extension method on Func<int, int>.
The line of the spec which describes this behaviour is :
An implicit identity, reference or boxing conversion [must exist] from [the receiver expression] to the type of the first parameter [...].
Conversions on method groups are not identity, reference or boxing conversions; they are method group conversions.
Would it be infeasible to enable scenario 1 to work (not this one specifically of course, but in general)? I guess from a language/compiler perspective and would it actually be useful, or am I just (attempting to) wildly abusing things here?
It is not infeasible. We've got a pretty smart team here and there's no theoretical reason why it is impossible to do so. It just doesn't seem to us like a feature that adds more value to the language than the cost of the additional complexity.
There are times when it would be useful. For example, I'd like to be able to do this; suppose I have a static Func<A, R> Memoize<A, R>(this Func<A, R> f) {...}:
var fib = (n=>n<2?1:fib(n-1)+fib(n-2)).Memoize();
Instead of what you have to write today, which is:
Func<int, int> fib = null;
fib = n=>n<2?1:fib(n-1)+fib(n-2);
fib = fib.Memoize();
But frankly, the additional complexity the proposed feature adds to the language is not paid for by the small benefit in making the code above less verbose.
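For completeness, here is one possible shape for the Memoize extension method that the answer only supposes to exist, together with the three-line pattern "you have to write today". The dictionary-based cache and the wrapper FibDemo.Fib are assumptions made for this sketch; the answer does not show an implementation.

```csharp
using System;
using System.Collections.Generic;

static class MemoizeExtensions
{
    // A minimal, non-thread-safe memoizer: results are cached per argument.
    public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
    {
        var cache = new Dictionary<A, R>();
        return a =>
        {
            R value;
            if (!cache.TryGetValue(a, out value))
            {
                value = f(a);
                cache[a] = value;
            }
            return value;
        };
    }
}

static class FibDemo
{
    public static int Fib(int n)
    {
        // The lambda must be assigned to a typed variable first, because
        // an extension method cannot be invoked on a bare lambda.
        Func<int, int> fib = null;
        fib = k => k < 2 ? 1 : fib(k - 1) + fib(k - 2);
        fib = fib.Memoize();
        return fib(n);
    }
}
```

Because the lambda captures the variable fib rather than its value, the recursive calls go through the memoized delegate once the third line has run.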

The reason for the first error:
int.TryParse is a method group, not an object instance of any type. Extension methods can only be called on object instances. That's the same reason why the following code is invalid:
var s = int.TryParse;
This is also the reason why the type can't be inferred in the second example: int.TryParse is a method group and not of type TryParseMethod<int>.
I suggest, you use approach three and shorten the name of that extension class. I don't think there is any better way to do it.

Note that your code works if you first declare :
TryParseExtensions.TryParseMethod<int> tryParser = int.TryParse;
and then use tryParser where you used int.TryParse.
The problem is that the compiler doesn't know which overload of int.TryParse you're talking about, so it cannot complete the inference: do you mean TryParse(String, out Int32) or TryParse(String, NumberStyles, IFormatProvider, out Int32)? The compiler can't guess and won't arbitrarily decide for you (fortunately!).
But your delegate type makes clear which overload you're interested in. That's why assigning tryParser is not a problem. You're not speaking anymore of a "method group" but of a well identified method signature inside this group of methods called int.TryParse.

Why do parentheses around lambda statement cause syntax error?

I'm looking for a good explanation why one piece of code fails to compile and the other compiles just fine.
Fails:
richTextBox1.Invoke(new MethodInvoker((() => { richTextBox1.AppendText("test"); })));
Gives the error
Method name expected
on the opening parenthesis right after MethodInvoker(. Apparently, I can't wrap my lambda statements in parentheses.
Compiles:
richTextBox1.Invoke(new MethodInvoker(() => { richTextBox1.AppendText("test"); }));
The question is - why?
I always took it for granted that I could wrap any method param in parentheses if I wanted but apparently that's not the case with lambda expressions. I understand that they are somewhat special, but I still can't see a good reason for this. Maybe I don't understand something about the syntax. I would really like to get it.
By the way, this presents in VS2008, .NET 3.5 SP1, I haven't tested it in VS2010 and .NET 4 yet.

It's not a lambda expression, it's a parenthesized expression that contains a lambda expression. Therefore, the node for this parameter in the abstract syntax tree for this method invocation would be a parenthesized expression, and not a lambda expression as required by the specification. This is why.
There are other places where the Microsoft C# compiler does violate the specification and accept such an expression even though it shouldn't (per the specification) but this is not one of them.
The relevant section of the specification is §6.5.

You are mistaken in the premise that you have written a “method param”. The construct you have created is not a method call, you have written a delegate creation expression (see the C# specification, section 7.6.10.5), which is supposed to have a single argument, which must be either
a method group,
an anonymous function or
a value of either the compile time type dynamic or a delegate-type.
In your case, it is not a method group (the error message is hinting that a method name is expected there), nor an anonymous function (since it is an expression which “somewhere inside” contains an anonymous function), nor a value of the said types.
If you wrote a method invocation, you could, indeed, wrap the parameter in parentheses, even if it contains a lambda expression:
void Method(Action action)
{
}
...
Method((() => { Console.WriteLine("OK"); }));

Because the compiler expects ()=>{} inside the Invoke() method, and in the first example it does not find it. Everything within the parentheses is evaluated first, returning a single object, in which case the compiler expects a reference to a delegate.
Edited
I have solved the same problem with this Extension method:
public delegate void EmptyHandler();
public static void SafeCall(this Control control, EmptyHandler method)
{
if (control.InvokeRequired)
{
control.Invoke(method);
}
else
{
method();
}
}
So you can call
RichTextBox rtb = new RichTextBox();
...
rtb.SafeCall( ()=> rtb.AppendText("test") );

Events in lambda expressions - C# compiler bug?

I was looking at using a lambda expression to allow events to be wired up in a strongly typed manner, but with a listener in the middle, e.g. given the following classes
class Producer
{
public event EventHandler MyEvent;
}
class Consumer
{
public void MyHandler(object sender, EventArgs e) { /* ... */ }
}
class Listener
{
public static void WireUp<TProducer, TConsumer>(
Expression<Action<TProducer, TConsumer>> expr) { /* ... */ }
}
An event would be wired up as:
Listener.WireUp<Producer, Consumer>((p, c) => p.MyEvent += c.MyHandler);
However this gives a compiler error:
CS0832: An expression tree may not contain an assignment operator
Now at first this seems reasonable, particularly after reading the explanation about why expression trees cannot contain assignments. However, in spite of the C# syntax, the += is not an assignment, it is a call to the Producer::add_MyEvent method, as we can see from the CIL that is produced if we just wire the event up normally:
L_0001: newobj instance void LambdaEvents.Producer::.ctor()
L_0007: newobj instance void LambdaEvents.Consumer::.ctor()
L_000f: ldftn instance void LambdaEvents.Consumer::MyHandler(object, class [mscorlib]System.EventArgs)
L_0015: newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
L_001a: callvirt instance void LambdaEvents.Producer::add_MyEvent(class [mscorlib]System.EventHandler)
So it looks to me like this is a compiler bug as it's complaining about assignments not being allowed, but there is no assignment taking place, just a method call. Or am I missing something...?
Edit:
Please note that the question is "Is this behaviour a compiler bug?". Sorry if I wasn't clear about what I was asking.
Edit 2
After reading Inferis' answer, where he says "at that point the += is considered to be assignment" this does make some sense, because at this point the compiler arguably doesn't know that it's going to be turned into CIL.
However I am not permitted to write the explicit method call form:
Listener.WireUp<Producer, Consumer>(
(p, c) => p.add_MyEvent(new EventHandler(c.MyHandler)));
Gives:
CS0571: 'Producer.MyEvent.add': cannot explicitly call operator or accessor
So, I guess the question comes down to what += actually means in the context of C# events. Does it mean "call the add method for this event" or does it mean "add to this event in an as-yet undefined manner". If it's the former then this appears to me to be a compiler bug, whereas if it's the latter then it's somewhat unintuitive but arguably not a bug. Thoughts?

In the spec, section 7.16.3, the += and -= operators are called "Event assignment" which certainly makes it sound like an assignment operator. The very fact that it's within section 7.16 ("Assignment operators") is a pretty big hint :) From that point of view, the compiler error makes sense.
However, I agree that it is overly restrictive as it's perfectly possible for an expression tree to represent the functionality given by the lambda expression.
I suspect the language designers went for the "slightly more restrictive but more consistent in operator description" approach, at the expense of situations like this, I'm afraid.
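As a sketch of the claim that an expression tree can represent this functionality, the tree can be built by hand with the expression API instead of a lambda. Everything below is an illustration under assumptions: the Raise method, the Calls counter and the EventWiring.BuildSubscribe helper are invented for the example, and the event and handler are looked up by name through reflection.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

class Producer
{
    public event EventHandler MyEvent;
    // Hypothetical helper so the event can be raised from outside.
    public void Raise() { MyEvent?.Invoke(this, EventArgs.Empty); }
}

class Consumer
{
    public int Calls; // hypothetical counter to observe the handler
    public void MyHandler(object sender, EventArgs e) { Calls++; }
}

static class EventWiring
{
    // Builds the equivalent of (p, c) => p.MyEvent += c.MyHandler
    // as a call to the event's add accessor.
    public static Action<Producer, Consumer> BuildSubscribe()
    {
        ParameterExpression p = Expression.Parameter(typeof(Producer), "p");
        ParameterExpression c = Expression.Parameter(typeof(Consumer), "c");

        EventInfo evt = typeof(Producer).GetEvent("MyEvent");
        MethodInfo handler = typeof(Consumer).GetMethod("MyHandler");
        MethodInfo createDelegate = typeof(Delegate).GetMethod(
            "CreateDelegate",
            new[] { typeof(Type), typeof(object), typeof(MethodInfo) });

        // (EventHandler)Delegate.CreateDelegate(typeof(EventHandler), c, handler)
        Expression makeHandler = Expression.Convert(
            Expression.Call(
                createDelegate,
                Expression.Constant(typeof(EventHandler), typeof(Type)),
                c,
                Expression.Constant(handler, typeof(MethodInfo))),
            typeof(EventHandler));

        // p.add_MyEvent(handlerDelegate)
        Expression body = Expression.Call(p, evt.GetAddMethod(), makeHandler);

        return Expression.Lambda<Action<Producer, Consumer>>(body, p, c).Compile();
    }
}
```

Invoking the compiled delegate on a Producer and a Consumer attaches the handler, so a later call to Raise increments Calls. The price is that the strong typing the question was after is lost, since the members are named by strings.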

+= is an assignment, no matter what it does (e.g. add an event). From the parser point of view, it is still an assignment.
Did you try
Listener.WireUp<Producer, Consumer>((p, c) => { p.MyEvent += c.MyHandler; } );

Actually, as far as the compiler is concerned at that point, it is an assignment.
The += operator is overloaded, but the compiler doesn't care about that at this point. After all, you're generating an expression through the lambda (which, at some point, will be compiled to actual code) and not real code.
So what the compiler does is say: create an expression in which you add c.MyHandler to the current value of p.MyEvent and store the changed value back into p.MyEvent. And so you're actually doing an assignment, even if in the end you aren't.
Is there a reason you want the WireUp method to take an expression and not just an Action?

Why do you want to use the Expression class? Change Expression<Action<TProducer, TConsumer>> in your code to simply Action<TProducer, TConsumer> and all should work as you want. What you're doing here is forcing the compiler to treat the lambda expression as an expression tree rather than a delegate, and an expression tree indeed cannot contain such assignments (it's treated as an assignment because you're using the += operator, I believe). Now, a lambda expression can be converted into either form (as stated on MSDN). By simply using a delegate (that's all the Action type is), such "assignments" are perfectly valid. I may have misunderstood the problem here (perhaps there is a specific reason why you need to use an expression tree?), but it does seem like the solution is fortunately this simple!
Edit: Right, I understand your problem a bit better now from the comment. Is there any reason you can't just pass p.MyEvent and c.MyHandler as arguments to the WireUp method and attach the event handler within the WireUp method (to me this also seems better from a design point of view)... would that not eliminate the need for an expression tree? I think it's best if you avoid expression trees anyway, as they tend to be rather slow compared to delegates.

I think the problem is, that apart from the Expression<TDelegate> object, the expression tree is not statically typed from the perspective of the compiler. MethodCallExpression and friends do not expose static typing information.
Even though the compiler knows all the types in the expression, this information is thrown away when converting the lambda expression to an expression tree. (Have a look at the code generated for expression trees.)
I would nonetheless consider submitting this to Microsoft.
