VS2019 Roslyn Compiler Generic Constraint Method Resolution

We recently found an issue in our code base where VS 2019 compiled the code fine but VS 2017 failed.
I've created an extension method named Union that uses a generic ISet as a generic constraint:
using System;
using System.Collections.Generic;
using System.Linq;
public static class Extensions
{
    public static S Union<S, T>(this S self, IEnumerable<T> other) where S : ISet<T>, new()
    {
        //For simplicity issues since this is a compilation based question
        return default(S);
    }
    public static void Test()
    {
        var values = new[] { 1, 2, 3 };
        var values1 = new[] { 1, 2, 3, 4 };
        values.Union(values1);
    }
}
In VS 2017, the call to Union generates a compilation error stating that int[] is not convertible to ISet<int>.
It was my understanding that method resolution originally ignored generic constraints, but this code compiles in VS 2019.
I haven't seen anything in the release notes stating that they've fixed this bug or added a new feature to improve method resolution for generic methods.
I'm looking for more information about this matter.
Was this a bug fix by Microsoft or an intended feature?

It's part of C# 7.3 (so you can use it in VS 2017 as well if you specify version 7.3). It's documented in the C# 7.3 release notes:
Improved overload candidates
In every release, the overload resolution rules get updated to address situations where ambiguous method invocations have an "obvious" choice. This release adds three new rules to help the compiler pick the obvious choice:
...
When a method group contains some generic methods whose type arguments do not satisfy their constraints, these members are removed from the candidate set.
...
This wasn't a bug before - it was obeying the language specification; I don't know why the specification was originally written the way it was here. Possible reasons include:
Expected implementation complexity
Expected implementation performance
Expected usefulness - anticipation that the previous behavior would be fine or even preferable to the current behavior, without realizing where it would be annoying in reality
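
To make the quoted rule concrete, here is a minimal sketch, assuming a compiler set to C# 7.3 or later. The Demo class and its method names are made up for illustration, and Union is given a simple body so that the result can be observed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Extensions
{
    // Only usable when S is a set type with a public parameterless constructor.
    public static S Union<S, T>(this S self, IEnumerable<T> other) where S : ISet<T>, new()
    {
        var result = new S();
        result.UnionWith(self);
        result.UnionWith(other);
        return result;
    }
}

// Hypothetical demo class, not part of the original question.
public static class Demo
{
    // int[] does not satisfy "S : ISet<int>, new()", so from C# 7.3 on the custom
    // extension is removed from the candidate set and Enumerable.Union is used.
    public static IEnumerable<int> ArrayUnion()
    {
        var values = new[] { 1, 2, 3 };
        var values1 = new[] { 1, 2, 3, 4 };
        return values.Union(values1);
    }

    // HashSet<int> satisfies the constraint, so the custom extension is used here.
    public static HashSet<int> SetUnion()
    {
        var set = new HashSet<int> { 1, 2, 3 };
        return set.Union(new[] { 3, 4 });
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ArrayUnion()));
        Console.WriteLine(string.Join(",", SetUnion().OrderBy(x => x)));
    }
}
```

With a language version below 7.3, ArrayUnion should fail to compile with the constraint error described in the question, while SetUnion is unaffected.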

Related

Are the placeholders of Generics compiled as an actual data type? [duplicate]

I had thought that Generics in C# were implemented such that a new class/method/what-have-you was generated, either at run-time or compile-time, when a new generic type was used, similar to C++ templates (which I've never actually looked into and I very well could be wrong, about which I'd gladly accept correction).
But in my coding I came up with an exact counterexample:
static class Program {
    static void Main()
    {
        Test testVar = new Test();
        GenericTest<Test> genericTest = new GenericTest<Test>();
        int gen = genericTest.Get(testVar);
        RegularTest regTest = new RegularTest();
        int reg = regTest.Get(testVar);
        if (gen == ((object)testVar).GetHashCode())
        {
            Console.WriteLine("Got Object's hashcode from GenericTest!");
        }
        if (reg == testVar.GetHashCode())
        {
            Console.WriteLine("Got Test's hashcode from RegularTest!");
        }
    }
    class Test
    {
        public new int GetHashCode()
        {
            return 0;
        }
    }
    class GenericTest<T>
    {
        public int Get(T obj)
        {
            return obj.GetHashCode();
        }
    }
    class RegularTest
    {
        public int Get(Test obj)
        {
            return obj.GetHashCode();
        }
    }
}
Both of those console lines print.
I know that the actual reason this happens is that the virtual call to Object.GetHashCode() doesn't resolve to Test.GetHashCode() because the method in Test is marked as new rather than override. Therefore, I know if I used "override" rather than "new" on Test.GetHashCode() then the return of 0 would polymorphically override the method GetHashCode in object and this wouldn't be true, but according to my (previous) understanding of C# generics it wouldn't have mattered because every instance of T would have been replaced with Test, and thus the method call would have statically (or at generic resolution time) been resolved to the "new" method.
So my question is this: How are generics implemented in C#? I don't know CIL bytecode, but I do know Java bytecode so I understand how Object-oriented CLI languages work at a low level. Feel free to explain at that level.
As an aside, I thought C# generics were implemented that way because everyone always calls the generic system in C# "True Generics," compared to the type-erasure system of Java.

In GenericTest<T>.Get(T), the C# compiler has already picked that object.GetHashCode should be called (virtually). There's no way this will resolve to the "new" GetHashCode method at runtime (which will have its own slot in the method-table, rather than overriding the slot for object.GetHashCode).
From Eric Lippert's What's the difference, part one: Generics are not templates, the issue is explained (the setup used is slightly different, but the lessons translate well to your scenario):
This illustrates that generics in C# are not like templates in C++.
You can think of templates as a fancy-pants search-and-replace
mechanism.[...] That’s not how generic types work; generic types are,
well, generic. We do the overload resolution once and bake in the
result. [...] The IL we’ve generated for the generic type already has
the method its going to call picked out. The jitter does not say
“well, I happen to know that if we asked the C# compiler to execute
right now with this additional information then it would have picked a
different overload. Let me rewrite the generated code to ignore the
code that the C# compiler originally generated...” The jitter knows
nothing about the rules of C#.
And a workaround for your desired semantics:
Now, if you do want overload resolution to be re-executed at runtime based on the runtime types of
the arguments, we can do that for you; that’s what the new “dynamic”
feature does in C# 4.0. Just replace “object” with “dynamic” and when
you make a call involving that object, we’ll run the overload
resolution algorithm at runtime and dynamically spit code that calls
the method that the compiler would have picked, had it known all the
runtime types at compile time.
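
The difference between the baked-in call and the dynamic workaround can be sketched as follows. This is an illustration under assumptions: GetDynamic is a hypothetical method added for the comparison, and the comparison against RuntimeHelpers.GetHashCode relies on Test not overriding object.GetHashCode.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Test
{
    // Hides object.GetHashCode instead of overriding it.
    public new int GetHashCode()
    {
        return 0;
    }
}

public class GenericTest<T>
{
    // Bound once at compile time: T is only known to be an object here,
    // so this is a virtual call to object.GetHashCode.
    public int Get(T obj)
    {
        return obj.GetHashCode();
    }

    // Hypothetical variant: member lookup is redone at run time
    // against the actual type of obj.
    public int GetDynamic(T obj)
    {
        return ((dynamic)obj).GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var test = new Test();
        var generic = new GenericTest<Test>();

        // The generic call ends up in object's implementation, not in Test's "new" method.
        Console.WriteLine(generic.Get(test) == RuntimeHelpers.GetHashCode(test));

        // The dynamic call sees the runtime type Test and finds the hiding method.
        Console.WriteLine(generic.GetDynamic(test) == 0);
    }
}
```

The dynamic version pays for a runtime binder call on every invocation, so it is a workaround for the semantics rather than a general replacement for generics.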

Why can't List<int> be converted to TCollection in xunit.net theory?

I am trying to write a generic theory in xunit.net that uses collections via the MemberDataAttribute. Please have a look at the following code:
[Theory]
[MemberData("TestData")]
public void AddRangeTest<T>(List<T> items)
{
    var sut = new MyCustomList<T>();
    sut.AddRange(items);
    Assert.Equal(items, sut);
}
public static readonly IEnumerable<object[]> TestData = new[]
{
    new object[] { new List<int> { 1, 2, 3 } },
    new object[] { new List<string> { "Foo", "Bar" } }
};
When I try to execute this theory, the following exceptions are thrown: "System.ArgumentException: Object of type 'List`1[System.String] / List`1[System.Int32]' cannot be converted to type 'List`1[System.Object]'." (I shortened and merged the text of both exceptions).
I get that this could maybe be related to the parameter type as it is not directly a generic type but a type that uses a nested generic. Thus I transformed the test in the following way:
[Theory]
[MemberData("TestData")]
public void AddRangeTest2<T, TCollection>(TCollection items)
    where TCollection : IEnumerable<T>
{
    var sut = new MyCustomList<T>();
    sut.AddRange(items);
    Assert.Equal(items, sut);
}
In this case, I introduced a generic type parameter called TCollection that is constrained to be an IEnumerable<T>. When I execute this test, the run with List<string> works, but the run with List<int> produces the following exception: "System.ArgumentException: GenericArguments[1], 'List`1[System.Int32]', on 'AddRangeTest2' violates the constraint of type 'TCollection'." (Again, I shortened the exception text to the relevant points).
My actual question is: why is it possible to use List<string> but not List<int> in a generic theory? Both of these types satisfy the constraint on the TCollection generic in my opinion.
If I change List<int> to List<object>, then everything works - thus I assume that it has something to do with Value Types / Reference Types and Co- / Contravariance.
I use xunit 2.0.0-rc2 although I also observed this behavior with rc1 - thus I think it is version-independent (and possibly not even a problem of xunit). If you want to have a look at the full source code, you can download it from my dropbox (I created it with VS 2013). Please consider this code to be an MCVE.
Thank you so much in advance for your help!
Edit after this question was closed: my question is not answered by Cannot convert from List<DerivedClass> to List<BaseClass> because I do not use inheritance-based casts here at all - instead I believe that xunit.net uses generics that are resolved via reflection.

Whatever you do, it still boils down to the fact that an IEnumerable<int> to IEnumerable<object> conversion is going on behind the scenes.
You seem to be under the assumption that the generic parameters are filled in based on the concrete input given.
That is not the case. The generic type parameters are deduced from the variable that is referenced in MemberData, not from the actual content it holds. Thus, your test method will always be called with the following signature:
AddRangeTest2<object, object[]>(object[] items)
The above signature implicitly supports covariance for reference types (meaning: you can pass string[] to this method, but not int[]).
Covariance for value types is not supported.
That said, it would be cool if xUnit were clever enough to specify the correct generic type arguments based on the actual input. I don't have experience with xUnit myself; perhaps there is already a way? If not, file a feature request on CodePlex, or implement it yourself and send a pull request ;)
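
The reference-type versus value-type difference can be reproduced without xUnit. The sketch below is an illustration only: VarianceDemo, Count and CanCloseOverObject are hypothetical names, and it assumes that T ends up as object, which the exception text in the question suggests but which is not verified against xUnit's source.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical demo class; it only mirrors the constraint used by AddRangeTest2.
public static class VarianceDemo
{
    public static int Count<T, TCollection>(TCollection items) where TCollection : IEnumerable<T>
    {
        var n = 0;
        foreach (var item in items) n++;
        return n;
    }

    // Tries to construct Count<object, collectionType> via reflection.
    public static bool CanCloseOverObject(Type collectionType)
    {
        MethodInfo open = typeof(VarianceDemo).GetMethod(nameof(Count));
        try
        {
            open.MakeGenericMethod(typeof(object), collectionType);
            return true;
        }
        catch (ArgumentException)
        {
            // MakeGenericMethod rejects type arguments that violate a constraint.
            return false;
        }
    }

    public static void Main()
    {
        object strings = new List<string> { "Foo", "Bar" };
        object ints = new List<int> { 1, 2, 3 };

        // Covariance applies to reference type arguments only.
        Console.WriteLine(strings is IEnumerable<object>);
        Console.WriteLine(ints is IEnumerable<object>);

        Console.WriteLine(CanCloseOverObject(typeof(List<string>)));
        Console.WriteLine(CanCloseOverObject(typeof(List<int>)));
    }
}
```

List<string> passes because IEnumerable<string> is variance-convertible to IEnumerable<object>; List<int> fails because no such conversion exists for value types.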

Why does the C# compiler crash on this code?

Why does the code below crash the .NET compiler? It was tested on csc.exe version 4.0.
For an online demo on a different compiler version, see https://dotnetfiddle.net/FMn59S. It crashes in the same manner (although that site says dynamic is not supported):
Compilation error (line 0, col 0): Internal Compiler Error (0xc0000005 at address xy): likely culprit is 'TRANSFORM'.
The extension method works fine on List<dynamic> though.
using System;
using System.Collections.Generic;
static class F {
    public static void M<T>(this IEnumerable<T> enumeration, Action<T> action){}
    static void U(C.K d) {
        d.M(kvp => Console.WriteLine(kvp));
    }
}
class C {
    public class K : Dictionary<string, dynamic>{}
}
Update: this doesn't crash the compiler
static void U(Dictionary<string, dynamic> d)
{
    d.M(kvp => Console.WriteLine(kvp));
}
Update 2: the same bug was reported in http://connect.microsoft.com/VisualStudio/feedback/details/892372/compiler-error-with-dynamic-dictinoaries. The bug was reported for FirstOrDefault, but it seems the compiler crashes on any extension method applied to a class derived from Dictionary<T1,T2>, where at least one of the type arguments is dynamic. See an even more general description of the problem below by Erik Funkenbusch.
Update 3: another non-standard behaviour. When I try to call the extension method as a static method, that is, F.M(d, kvp => Console.WriteLine(kvp));, the compiler doesn't crash, but it cannot find the overload:
Argument 1: cannot convert from 'C.K' to 'System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<string,dynamic>>'
Update 4 - SOLUTION (kind of): Hans sketched a second workaround, which is semantically equivalent to the original code but works only for the extension method call and not for the standard call. Since the bug is likely caused by the compiler failing to cast a class derived from a generic class with multiple type parameters (one of them being dynamic) to its supertype, the solution is to provide an explicit cast. See https://dotnetfiddle.net/oNvlcL:
((Dictionary<string, dynamic>)d).M(kvp => Console.WriteLine(kvp));
M((Dictionary<string, dynamic>)d, kvp => Console.WriteLine(kvp));

It is dynamic that is triggering the instability; the crash disappears when you replace it with object.
That is one workaround. The other is to help the compiler infer the correct T:
static void U(C.K d) {
    d.M(new Action<KeyValuePair<string, dynamic>>(kvp => Console.WriteLine(kvp)));
}
The feedback report that you found is a strong match, no need to file your own I'd say.

Well, as to WHY it crashes the compiler: it's because you've encountered a bug that... crashes the compiler.
The VS2013 compiler says "Internal Compiler Error (0xc0000005 at address 012DC5B5): likely culprit is 'TRANSFORM'", so clearly it's a bug.
0xC0000005 is an access violation, typically caused by dereferencing a null pointer or by referencing unallocated or freed memory.
EDIT:
The problem is also present in pretty much any kind of multi-parameter generic type where any parameter is dynamic. For instance, it crashes on:
List<Tuple<string, dynamic>>{}
It also crashes on
List<KeyValuePair<dynamic, string>>{}
But does not crash on
List<dynamic>{}
but does crash on
List<List<dynamic>>{}

The type arguments for method cannot be inferred from the usage

namespace TestLibrary
{
    [TestFixture]
    public class Class1
    {
        public delegate T Initializer<T>();
        public static T MyGenericMethod<T>(Initializer<T> initializer) where T : class
        {
            return initializer != null ? initializer() : null;
        }
        [Test]
        public void Test()
        {
            var result = MyGenericMethod(MyInitializer);
            Assert.IsNotNull(result);
        }
        private object MyInitializer()
        {
            return new object();
        }
    }
}
This is a functioning piece of code when built in Visual Studio 2010. If I try to build it using MSBuild from the command line...
"c:\Windows\Microsoft.NET\Framework\v3.5\MSBuild.exe" Solution1.sln
... I get a very familiar error message:
The type arguments for method 'Method name' cannot be inferred from
the usage. Try specifying the type arguments explicitly.
Any ideas?

This appears to be a difference between the compiler versions used by VS 2010 and MSBuild 3.5. This makes sense as type inference was improved a lot in later compiler versions.
If you need to use MSBuild 3.5, you'll need to correct your code:
var result = MyGenericMethod<object>(MyInitializer);
However, you should be able to use MSBuild v4 and target the 3.5 framework. You can also target this framework in VS 2010. Based on the fact that when targeting 3.5 using VS 2010 the code compiles, I think it will likely work via MSBuild v4.
Courtesy of Radex in the comments:
"c:\Windows\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe" Solution1.sln /p:TargetFrameworkVersion=v3.5
Just to clarify, this is my educated-guess answer based on the comments.
Not sure if this is relevant, but I found this on MSDN: http://msdn.microsoft.com/en-us/library/ee855831.aspx
Method group type inference
The compiler can infer both generic and non-generic delegates for
method groups, which might introduce ambiguity.
In C# 2008, the compiler cannot infer generic delegates for method
groups. Therefore, it always uses a non-generic delegate if one
exists.
In C# 2010, both generic and non-generic delegates are inferred for
method groups, and the compiler is equally likely to infer either.
This can introduce ambiguity if you have generic and non-generic
versions of a delegate and both satisfy the requirements. For example,
the following code compiles in C# 2008 and calls a method that uses a
non-generic delegate. In C# 2010, this code produces a compiler error
that reports an ambiguous call.
Further reading:
http://togaroga.com/2009/11/smarter-type-inference-with-c-4/

c#: Why isn't this ambiguous enum reference resolved using the method signature?

Consider the following code:
namespace ConsoleApplication
{
    using NamespaceOne;
    using NamespaceTwo;
    class Program
    {
        static void Main(string[] args)
        {
            // Compilation error. MyEnum is an ambiguous reference
            MethodNamespace.MethodClass.Frobble(MyEnum.foo);
        }
    }
}
namespace MethodNamespace
{
    public static class MethodClass
    {
        public static void Frobble(NamespaceOne.MyEnum val)
        {
            System.Console.WriteLine("Frobbled a " + val.ToString());
        }
    }
}
namespace NamespaceOne
{
    public enum MyEnum
    {
        foo, bar, bat, baz
    }
}
namespace NamespaceTwo
{
    public enum MyEnum
    {
        foo, bar, bat, baz
    }
}
The compiler complains that MyEnum is an ambiguous reference in the call to Frobble(). Since there is no ambiguity in what method is being called, one might expect the compiler to resolve the type reference based on the method signature. Why doesn't it?
Please note that I'm not saying that the compiler should do this. I'm confident that there is a very good reason that it doesn't. I would simply like to know what that reason is.

Paul is correct. In most situations in C# we reason "from inside to outside".
there is no ambiguity in what method is being called,
That it is unambiguous to you is irrelevant to the compiler. The task of overload resolution is to determine whether the method group Frobble can be resolved to a specific method given known arguments. If we can't determine what the argument types are then we don't even try to do overload resolution.
Method groups that just happen to contain only one method are not special in this regard. We still have to have good arguments before overload resolution can succeed.
There are cases where we reason from "outside to inside", namely, when doing type analysis of lambdas. Doing so makes the overload resolution algorithm exceedingly complicated and gives the compiler a problem to solve that is at least NP-HARD in bad cases. But in most scenarios we want to avoid that complexity and expense; expressions are analyzed by analyzing child subexpressions before their parents, not the other way around.
More generally: C# is not a "when the program is ambiguous use heuristics to make guesses about what the programmer probably meant" language. It is a "inform the developer that their program is unclear and possibly broken" language. The portions of the language that are designed to try to resolve ambiguous situations -- like overload resolution or method type inference or implicitly typed arrays -- are carefully designed so that the algorithms have clear rules that take versioning and other real-world aspects into account. Bailing out as soon as one part of the program is ambiguous is one way we achieve this design goal.
If you prefer a more "forgiving" language that tries to figure out what you meant, VB or JScript might be better languages for you. They are more "do what I meant not what I said" languages.

I believe it's because the C# compiler won't typically backtrack.

NamespaceOne and NamespaceTwo are defined in the same code file. That is equivalent to putting them in different code files and referencing them via using directives.
In that case you can see why the names clash. You have identically named enums in two different namespaces, and the compiler can't guess which one is meant, even though Frobble has a NamespaceOne.MyEnum parameter. Instead of
MethodNamespace.MethodClass.Frobble(MyEnum.foo)
use
MethodNamespace.MethodClass.Frobble(NamespaceOne.MyEnum.foo)
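
If both namespaces have to stay imported, a using alias is another way to settle which MyEnum the short name means. The sketch below is an illustration rather than part of the original answers; Frobble returns its message instead of printing it, and Run is a hypothetical helper, so that the result is easy to check.

```csharp
using System;

namespace NamespaceOne { public enum MyEnum { foo, bar, bat, baz } }
namespace NamespaceTwo { public enum MyEnum { foo, bar, bat, baz } }

namespace MethodNamespace
{
    public static class MethodClass
    {
        public static string Frobble(NamespaceOne.MyEnum val)
        {
            return "Frobbled a " + val;
        }
    }
}

namespace ConsoleApplication
{
    using NamespaceOne;
    using NamespaceTwo;
    // The alias is consulted before the names imported by the two directives above,
    // so the simple name MyEnum is no longer ambiguous in this namespace declaration.
    using MyEnum = NamespaceOne.MyEnum;

    public static class Program
    {
        public static string Run()
        {
            return MethodNamespace.MethodClass.Frobble(MyEnum.foo);
        }

        public static void Main()
        {
            Console.WriteLine(Run()); // prints "Frobbled a foo"
        }
    }
}
```

The ambiguity is still resolved by the programmer, in line with the "inside to outside" reasoning described above: the argument's type is fixed before overload resolution runs.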
