Delegate as first param to an Extension Method - c#

Ladies and Gents,
I recently tried this experiment:
static class TryParseExtensions
{
public delegate bool TryParseMethod<T>(string s, out T maybeValue);
public static T? OrNull<T>(this TryParseMethod<T> tryParser, string s) where T:struct
{
T result;
return tryParser(s, out result) ? (T?)result : null;
}
}
// compiler error "'int.TryParse(string, out int)' is a 'method', which is not valid in the given context"
var result = int.TryParse.OrNull("1"); // int.TryParse.OrNull<int>("1"); doesnt work either
// compiler error: type cannot be infered....why?
var result2 = TryParseExtensions.OrNull(int.TryParse, "2");
// works as expected
var result3 = TryParseExtensions.OrNull<int>(int.TryParse, "3");
var result4 = ((TryParseExtensions.TryParseMethod<int>)int.TryParse).OrNull("4");
I am wondering two things:
Why can the compiler not infer the "int" type parameter?
Do I understand correctly that extensions methods do not get discovered on Delegate types, as I guess they arent really of that type (but are a "Method") that only happen to match the delegates signature? As such a cast solves this. Would it be infeasable to enable scenario 1 to work (not this one specifically of course, but in general)? I guess from a language/compiler perspective and would it actually be useful, or am I just (attempting to) wildly abusing things here?
Looking forward to some insights. Thnx

You have a number of questions here. (In the future I would recommend that when you have multiple questions, split them up into multiple questions rather than one posting with several questions in it; you'll probably get better responses.)
Why can the compiler not infer the "int" type parameter in:
TryParseExtensions.OrNull(int.TryParse, "2");
Good question. Rather than answer that here, I refer you to my 2007 article which explains why this did not work in C# 3.0:
http://blogs.msdn.com/b/ericlippert/archive/2007/11/05/c-3-0-return-type-inference-does-not-work-on-member-groups.aspx
Summing up: fundamentally there is a chicken-and-egg problem here. We must do overload resolution on int.TryParse to determine which overload of TryParse is the intended one (or, if none of them work, what the error is.) Overload resolution always tries to infer from arguments. In this case though, it is precisely the type of the argument that we are attempting to infer.
We could come up with a new overload resolution algorithm that says "well, if there's only one method in the method group then pick that one even if we don't know what the arguments are", but that seems weak. It seems like a bad idea to special-case method groups that have only one method in them because that then penalizes you for adding new overloads; it can suddenly be a breaking change.
As you can see from the comments to that article, we got a lot of good feedback on it. The best feedback was got was basically "well, suppose type inference has already worked out the types of all the argument and it is the return type that we are attempting to infer; in that case you could do overload resolution". That analysis is correct, and changes to that effect went into C# 4. I talked about that a bit more here:
http://blogs.msdn.com/b/ericlippert/archive/2008/05/28/method-type-inference-changes-part-zero.aspx
Do I understand correctly that extensions methods do not get discovered on delegate types, as I guess they arent really of that type (but are a "Method") that only happen to match the delegates signature?
Your terminology is a bit off, but your idea is correct. We do not discover extension methods when the "receiver" is a method group. More generally, we do not discover extension methods when the receiver is something that lacks its own type, but rather takes on a type based on its context: method groups, lambdas, anonymous methods and the null literal all have this property. It would be really bizarre to say null.Whatever() and have that call an extension method on String, or even weirder, (x=>x+1).Whatever() and have that call an extension method on Func<int, int>.
The line of the spec which describes this behaviour is :
An implicit identity, reference or boxing conversion [must exist] from [the receiver expression] to the type of the first parameter [...].
Conversions on method groups are not identity, reference or boxing conversions; they are method group conversions.
Would it be infeasable to enable scenario 1 to work (not this one specifically of course, but in general)? I guess from a language/compiler perspective and would it actually be useful, or am I just (attempting to) wildly abusing things here?
It is not infeasible. We've got a pretty smart team here and there's no theoretical reason why it is impossible to do so. It just doesn't seem to us like a feature that adds more value to the language than the cost of the additional complexity.
There are times when it would be useful. For example, I'd like to be able to do this; suppose I have a static Func<A, R> Memoize<A, R>(this Func<A, R> f) {...}:
var fib = (n=>n<2?1:fib(n-1)+fib(n-2)).Memoize();
Instead of what you have to write today, which is:
Func<int, int> fib = null;
fib = n=>n<2?1:fib(n-1)+fib(n-2);
fib = fib.Memoize();
But frankly, the additional complexity the proposed feature adds to the language is not paid for by the small benefit in making the code above less verbose.

The reason for the first error:
int.TryParse is a method group, not an object instance of any type. Extension methods can only be called on object instances. That's the same reason why the following code is invalid:
var s = int.TryParse;
This is also the reason why the type can't be inferred in the second example: int.TryParse is a method group and not of type TryParseMethod<int>.
I suggest, you use approach three and shorten the name of that extension class. I don't think there is any better way to do it.

Note that your code works if you first declare :
TryParseExtensions.TryParseMethod<int> tryParser = int.TryParse;
and then use tryParser where you used int.TryParse.
The problem is that the compiler doesn't know which overload of int.Parse you're speaking about. So it cannot completely infer it : are you speaking about TryParse(String, Int32) or TryParse(String, NumberStyles, IFormatProvider, Int32) ? The compiler can't guess and won't arbitrarily decide for you (fortunately !).
But your delegate type makes clear which overload you're interested in. That's why assigning tryParser is not a problem. You're not speaking anymore of a "method group" but of a well identified method signature inside this group of methods called int.TryParse.

Related

Why can't the compiler tell the better conversion target in this overload resolution case? (covariance)

Understanding the C# Language Specification on overload resolution is clearly hard, and now I am wondering why this simple case fails:
void Method(Func<string> f)
{
}
void Method(Func<object> f)
{
}
void Call()
{
Method(() => { throw new NotSupportedException(); });
}
This gives compile-time error CS0121, The call is ambiguous between the following methods or properties: followed by my two Method function members (overloads).
What I would have expected was that Func<string> was a better conversion target than Func<object>, and then the first overload should be used.
Since .NET 4 and C# 4 (2010), the generic delegate type Func<out TResult> has been covariant in TResult, and for that reason an implicit conversion exists from Func<string> to Func<object> while clearly no implicit conversion can exist from Func<object> to Func<string>. So it would make Func<string> the better conversion target, and the overload resolution should pick the first overload?
My question is simply: What part of the C# Spec am I missing here?
Addition: This works fine:
void Call()
{
Method(null); // OK!
}
My question is simply: What part of the C# Spec am I missing here?
Summary:
You have found a minor known bug in the implementation.
The bug will be preserved for backwards compatibility reasons.
The C# 3 specification contained an error regarding how the "null" case was to be handled; it was fixed in the C# 4 specification.
You can reproduce the buggy behavior with any lambda where the return type cannot be inferred. For example: Method(() => null);
Details:
The C# 5 specification says that the betterness rule is:
If the expression has a type then choose the better conversion from that type to the candidate parameter types.
If the expression does not have a type and is not a lambda, choose the conversion to the type that is better.
If the expression is a lambda then first consider which parameter type is better; if neither is better and the delegate types have identical parameter lists then consider the relationship between the inferred return type of the lambda and the return types of the delegates.
So the intended behaviour is: first the compiler should check to see if one parameter type is clearly better than the other, regardless of whether the argument has a type. If that doesn't resolve the situation and the argument is a lambda, then check to see which of the inferred return type converted to the parameters' delegate types' return type is better.
The bug in the implementation is the implementation doesn't do that. Rather, in the case where the argument is a lambda it skips the type betterness check entirely and goes straight to the inferred return type betterness check, which then fails because there is no inferred return type.
My intention was to fix this for Roslyn. However, when I went to implement this, we discovered that making the fix caused some real-world code to stop compiling. (I do not recall what the real-world code was and I no longer have access to the database that holds the compatibility issues.) We therefore decided to maintain the existing small bug.
I note that the bug was basically impossible before I added delegate variance in C# 4; in C# 3 it was impossible for two different delegate types to be more or less specific, so the only rule that could apply was the lambda rule. Since there was no test in C# 3 that would reveal the bug, it was easy to write. My bad, sorry.
I note also that when you start throwing expression tree types into the mix, the analysis gets even more complicated. Even though Func<string> is better than Func<object>, Expression<Func<string>> is not convertible to Expression<Func<object>>! It would be nice if the algorithm for betterness was agnostic with respect to whether the lambda was going to an expression tree or a delegate, but it is in some ways not. Those cases get complicated and I don't want to labour the point here.
This minor bug is an object lesson in the importance of implementing what the spec actually says and not what you think it says. Had I been more careful in C# 3 to ensure that the code matched the spec then the code would have failed on the "null" case and it would then have been clear earlier that the C# 3 spec was wrong. And the implementation does the lambda check before the type check, which was a time bomb waiting to go off when C# 4 rolled around and suddenly that became incorrect code. The type check should have been done first regardless.
Well, you are right. What causes problem here is the delegate you are passing as an argument. It has no explicit return type, you are just throwing an exception. Exception is basically an object but it is not considered as a return type of a method. Since there is no return call following the exception throw, compiler is not sure what overload it should use.
Just try this
void Call()
{
Method(() =>
{
throw new NotSupportedException();
return "";
});
}
No problem with choosing an overload now because of explicitly stated type of an object passed to a return call. It does not matter that the return call is unreachable due to the exception throw, but now the compiler knows what overload it should use.
EDIT:
As for the case with passing null, frenkly, I don't know the answer.

Overloaded method-group argument confuses overload resolution?

The following call to the overloaded Enumerable.Select method:
var itemOnlyOneTuples = "test".Select<char, Tuple<char>>(Tuple.Create);
fails with an ambiguity error (namespaces removed for clarity):
The call is ambiguous between the following methods or properties:
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>,Func<char,Tuple<char>>)'
and
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>, Func<char,int,Tuple<char>>)'
I can certainly understand why not specifying the type-arguments explicitly would result in an ambiguity (both the overloads would apply), but I don't see one after doing so.
It appears clear enough to me that the intention is to call the first overload, with the method-group argument resolving to Tuple.Create<char>(char). The second overload should not apply because none of the Tuple.Create overloads can be converted to the expected Func<char,int,Tuple<char>> type. I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple, and is hence not convertible to the relevant Func type.
By the way, any of the following makes the compiler happy:
Specifying a type-argument for the method-group argument: Tuple.Create<char> (Perhaps this is actually a type-inference issue?).
Making the argument a lambda-expression instead of a method-group: x => Tuple.Create(x). (Plays well with type-inference on the Select call).
Unsurprisingly, trying to call the other overload of Select in this manner also fails:
var itemIndexTwoTuples = "test".Select<char, Tuple<char, int>>(Tuple.Create);
What's the exact problem here?
First off, I note that this is a duplicate of:
Why is Func<T> ambiguous with Func<IEnumerable<T>>?
What's the exact problem here?
Thomas's guess is essentially correct. Here are the exact details.
Let's go through it a step at a time. We have an invocation:
"test".Select<char, Tuple<char>>(Tuple.Create);
Overload resolution must determine the meaning of the call to Select. There is no method "Select" on string or any base class of string, so this must be an extension method.
There are a number of possible extension methods for the candidate set because string is convertible to IEnumerable<char> and presumably there is a using System.Linq; in there somewhere. There are many extension methods that match the pattern "Select, generic arity two, takes an IEnumerable<char> as the first argument when constructed with the given method type arguments".
In particular, two of the candidates are:
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,Tuple<char>>)
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,int,Tuple<char>>)
Now, the first question we face is are the candidates applicable? That is, is there an implicit conversion from each supplied argument to the corresponding formal parameter type?
An excellent question. Clearly the first argument will be the "receiver", a string, and it will be implicitly convertible to IEnumerable<char>. The question now is whether the second argument, the method group "Tuple.Create", is implicitly convertible to formal parameter types Func<char,Tuple<char>>, and Func<char,int, Tuple<char>>.
When is a method group convertible to a given delegate type? A method group is convertible to a delegate type when overload resolution would have succeeded given arguments of the same types as the delegate's formal parameter types.
That is, M is convertible to Func<A, R> if overload resolution on a call of the form M(someA) would have succeeded, given an expression 'someA' of type 'A'.
Would overload resolution have succeeded on a call to Tuple.Create(someChar)? Yes; overload resolution would have chosen Tuple.Create<char>(char).
Would overload resolution have succeeded on a call to Tuple.Create(someChar, someInt)? Yes, overload resolution would have chosen Tuple.Create<char,int>(char, int).
Since in both cases overload resolution would have succeeded, the method group is convertible to both delegate types. The fact that the return type of one of the methods would not have matched the return type of the delegate is irrelevant; overload resolution does not succeed or fail based on return type analysis.
One might reasonably say that convertibility from method groups to delegate types ought to succeed or fail based on return type analysis, but that's not how the language is specified; the language is specified to use overload resolution as the test for method group conversion, and I think that's a reasonable choice.
Therefore we have two applicable candidates. Is there any way that we can decide which is better than the other? The spec states that the conversion to the more specific type is better; if you have
void M(string s) {}
void M(object o) {}
...
M(null);
then overload resolution chooses the string version because string is more specific than object. Is one of those delegate types more specific than the other? No. Neither is more specific than the other. (This is a simplification of the better-conversion rules; there are actually lots of tiebreakers, but none of them apply here.)
Therefore there is no basis to prefer one over the other.
Again, one could reasonably say that sure, there is a basis, namely, that one of those conversions would produce a delegate return type mismatch error and one of them would not. Again, though, the language is specified to reason about betterness by considering the relationships between the formal parameter types, and not about whether the conversion you've chosen will eventually result in an error.
Since there is no basis upon which to prefer one over the other, this is an ambiguity error.
It is easy to construct similar ambiguity errors. For example:
void M(Func<int, int> f){}
void M(Expression<Func<int, int>> ex) {}
...
M(x=>Q(++x));
That's ambiguous. Even though it is illegal to have a ++ inside an expression tree, the convertibility logic does not consider whether the body of a lambda has something inside it that would be illegal in an expression tree. The conversion logic just makes sure that the types check out, and they do. Given that, there's no reason to prefer one of the M's over the other, so this is an ambiguity.
You note that
"test".Select<char, Tuple<char>>(Tuple.Create<char>);
succeeds. You now know why. Overload resolution must determine if
Tuple.Create<char>(someChar)
or
Tuple.Create<char>(someChar, someInt)
would succeed. Since the first one does and the second one does not, the second candidate is inapplicable and eliminated, and is therefore not around to become ambiguous.
You also note that
"test".Select<char, Tuple<char>>(x=>Tuple.Create(x));
is unambiguous. Lambda conversions do take into account the compatibility of the returned expression's type with the target delegate's return type. It is unfortunate that method groups and lambda expressions use two subtly different algorithms for determining convertibility, but we're stuck with it now. Remember, method group conversions have been in the language a lot longer than lambda conversions; had they been added at the same time, I imagine that their rules would have been made consistent.
I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple.
The return type isn't part of the method signature, so it isn't considered during overload resolution; it's only verified after an overload has been picked. So as far as the compiler knows, Tuple.Create<char, int>(char, int) is a valid candidate, and it is neither better nor worse than Tuple.Create<char>(char), so the compiler can't decide.

Generic methods and method overloading

Method overloading allows us to define many methods with the same name but with a different set of parameters ( thus with the same name but different signature ).
Are these two methods overloaded?
class A
{
public static void MyMethod<T>(T myVal) { }
public static void MyMethod(int myVal) { }
}
EDIT:
Shouldn't statement A<int>.MyMethod(myInt); throw an error, since constructed type A<int> has two methods with the same name and same signature?
Are the two methods overloaded?
Yes.
Shouldn't statement A<int>.MyMethod(myInt); throw an error, since constructed type A<int> has two methods with the same signature?
The question doesn't make sense; A is not a generic type as you have declared it. Perhaps you meant to ask:
Should the statement A.MyMethod(myInt); cause the compiler to report an error, since there are two ambiguous candidate methods?
No. As others have said, overload resolution prefers the non-generic version in this case. See below for more details.
Or perhaps you meant to ask:
Should the declaration of type A be illegal in the first place, since in some sense it has two methods with the same signature, MyMethod and MyMethod<int>?
No. The type A is perfectly legal. The generic arity is part of the signature. So there are not two methods with the same signature because the first has generic arity zero, the second has generic arity one.
Or perhaps you meant to ask:
class G<T>
{
public static void M(T t) {}
public static void M(int t) {}
}
Generic type G<T> can be constructed such that it has two methods with the same signature. Is it legal to declare such a type?
Yes, it is legal to declare such a type. It is usually a bad idea, but it is legal.
You might then retort:
But my copy of the C# 2.0 specification as published by Addison-Wesley states on page 479 "Two function members declared with the same names ... must have have parameter types such that no closed constructed type could have two members with the same name and signature." What's up with that?
When C# 2.0 was originally designed that was the plan. However, then the designers realized that this desirable pattern would be made illegal:
class C<T>
{
public C(T t) { ... } // Create a C<T> from a given T
public C(Stream s) { ... } // Deserialize a C<T> from disk
}
And now we say sorry buddy, because you could say C<Stream>, causing two constructors to unify, the whole class is illegal. That would be unfortunate. Obviously it is unlikely that anyone will ever construct this thing with Stream as the type parameter!
Unfortunately, the spec went to press before the text was updated to the final version. The rule on page 479 is not what we implemented.
Continuing to pose some more questions on your behalf:
So what happens if you call G<int>.M(123) or, in the original example, if you call A.MyMethod(123)?
When overload resolution is faced with two methods that have identical signatures due to generic construction then the one that is generic construction is considered to be "less specific" than the one that is "natural". A less specific method loses to a more specific method.
So why is it a bad idea, if overload resolution works?
The situation with A.MyMethod isn't too bad; it is usually pretty easy to unambiguously work out which method is intended. But the situation with G<int>.M(123) is far worse. The CLR rules make this sort of situation "implementation defined behaviour" and therefore any old thing can happen. Technically, the CLR could refuse to verify a program that constructs type G<int>. Or it could crash. In point of fact it does neither; it does the best it can with the bad situation.
Are there any examples of this sort of type construction causing truly implementation-defined behaviour?
Yes. See these articles for details:
https://ericlippert.com/2006/04/05/odious-ambiguous-overloads-part-one/
https://ericlippert.com/2006/04/06/odious-ambiguous-overloads-part-two/
Yes. MyMethod(int myVal) will be called when the type of the parameter is an int, the generic overload will be called for all other parameter arguments, even when the parameter argument is implicitly convertible to (or is a derived class of) the hardcoded type. Overload resolution will go for the best fit, and the generic overload will resolve to an exact match at compile time.
Note: You can explicitly invoke the generic overload and use an int by providing the type parameter in the method call, as Steven Sudit points out in his answer.
short s = 1;
int i = s;
MyMethod(s); // Generic
MyMethod(i); // int
MyMethod((int)s); // int
MyMethod(1); // int
MyMethod<int>(1); // Generic**
MyMethod(1.0); // Generic
// etc.
Yes, they are. They will allow code as such:
A.MyMethod("a string"); // calls the generic version
A.MyMethod(42); // calls the int version
Yes, they are overloaded. The compiler is supposed to prefer explicit method signatures against generic methods if they are available. Beware, however, that if you can avoid this kind of overload you probably should. There have been bug reports with respect to this sort of overload and unexpected behaviors.
https://connect.microsoft.com/VisualStudio/feedback/details/522202/c-3-0-generic-overload-call-resolution-from-within-generic-function
Yes. They have the same name "MyMethod" but different signatures. The C# specification, however, specifically handles this by saying that the compiler will prefer the non-generic version over the generic version, when both are options.
Yes. Off the top of my head, if you call A.MyMethod(1);, it will always run the second method. You'd have to call A.MyMethod<int>(1); to force it to run the first.

Why does this work?

Why does this work? I'm not complaining, just want to know.
void Test()
{
int a = 1;
int b = 2;
What<int>(a, b);
// Why does this next line work?
What(a, b);
}
void What<T>(T a, T b)
{
}
It works because a and b are integers, so the compiler can infer the generic type argument for What.
In C# 3, the compiler can also infer the type argument even when the types don't match, as long as a widening conversion makes sense. For instance, if c were a long, then What(a, c) would be interpreted as What<long>.
Note that if, say, c were a string, it wouldn't work.
The C# compiler supports type inference for generics, and also commonly seen if you are using the var keyword.
Here int is inferred from the context (a and b) and so <int> is not needed. It keeps code cleaner and easier to read at times.
Sometimes your code may be more clear to read if you let the compiler infer the type, sometimes it may be more clear if you explicitly specify the type. It is a judgement call on your given situation.
It's using type inference for generic methods. Note that this has changed between C# 2 and 3. For example, this wouldn't have worked in C# 2:
What("hello", new object());
... whereas it would in C# 3 (or 4). In C# 2, the type inference was performed on a per-argument basis, and the results had to match exactly. In C# 3, each argument contributes information which is then put together to infer the type arguments. C# 3 also supports multi-phase type inference where the compiler can work out one type argument, then see if it's got any more information on the rest (e.g. due to lambda expressions with implicit parameter types). Basically it keeps going until it can't get any more information, or it finishes - or it sees contradictory information. The type inference in C# isn't as powerful as the Hindley-Milner algorithm, but it works better in other ways (in particular it always makes forward progress).
See section 7.4.2 of the C# 3 spec for more information.
The compiler infers the generic type parameter from the types of the actual parameters that you passed.
This feature makes LINQ calls much simpler. (You don't need to write numbers.Select<int, string>(i => i.ToString()), because the compiler infers the int from numbers and the string from ToString)
The compiler can infer type T to be an int since both parameters passed into What() are of type int. You'll notice a lot of the Linq extensions are defined with generics (as IEnumerable) but are typically used in the manner you show.
If the subject of how this works in C# 3.0 is interesting to you, here's a little video of me explaining it from back in 2006, when we were first designing the version of the feature for C# 3.0.
http://blogs.msdn.com/ericlippert/archive/2006/11/17/a-face-made-for-email-part-three.aspx
See also the "type inference" section of my blog:
http://blogs.msdn.com/ericlippert/archive/tags/Type+Inference/default.aspx
The compiler is smart enough to figure out that the generic type is 'int'

C# 3.0 Func/OrderBy type inference

So odd situation that I ran into today with OrderBy:
Func<SomeClass, int> orderByNumber =
currentClass =>
currentClass.SomeNumber;
Then:
someCollection.OrderBy(orderByNumber);
This is fine, but I was going to create a method instead because it might be usable somewhere else other than an orderBy.
private int ReturnNumber(SomeClass currentClass)
{
return currentClass.SomeNumber;
}
Now when I try to plug that into the OrderBy:
someCollection.OrderBy(ReturnNumber);
It can't infer the type like it can if I use a Func. Seems like to me they should be the same since the method itself is "strongly typed" like the Func.
Side Note: I realize I can do this:
Func<SomeClass, int> orderByNumber = ReturnNumber;
This could also be related to "return-type type inference" not working on Method Groups.
Essentially, in cases (like Where's predicate) where the generic parameters are only in input positions, method group conversion works fine. But in cases where the generic parameter is a return type (like Select or OrderBy projections), the compiler won't infer the appropriate delegate conversion.
ReturnNumber is not a method - instead, it represents a method group containing all methods with the name ReturnNumber but with potentially different arity-and-type signatures. There are some technical issues with figuring out which method in that method group you actually want in a very generic and works-every-time way. Obviously, the compiler could figure it out some, even most, of the time, but a decision was made that putting an algorithm into the compiler which would work only half the time was a bad idea.
The following works, however:
someCollection.OrderBy(new Func<SomeClass, int>(ReturnNumber))

Categories