Covariance and CS0266 [duplicate] - c#

IEnumerable<T> is covariant, but the covariance applies only to reference types, not to value types. The following simple code compiles successfully:
IEnumerable<string> strList = new List<string>();
IEnumerable<object> objList = strList;
But changing string to int produces a compile error:
IEnumerable<int> intList = new List<int>();
IEnumerable<object> objList = intList;
The reason is explained in MSDN:
Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type.
I have searched and found that some questions mention boxing between value types and reference types as the reason. But it is still not clear to me why boxing is the reason.
Could someone please give a simple and detailed explanation of why covariance and contravariance do not support value types, and how boxing affects this?

Basically, variance applies when the CLR can ensure that it doesn't need to make any representational change to the values. References all look the same - so you can use an IEnumerable<string> as an IEnumerable<object> without any change in representation; the native code itself doesn't need to know what you're doing with the values at all, so long as the infrastructure has guaranteed that it will definitely be valid.
For value types, that doesn't work - to treat an IEnumerable<int> as an IEnumerable<object>, the code using the sequence would have to know whether to perform a boxing conversion or not.
You might want to read Eric Lippert's blog post on representation and identity for more on this topic in general.
EDIT: Having reread Eric's blog post myself, it's at least as much about identity as representation, although the two are linked. In particular:
This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
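If you do need an IEnumerable<object> from a sequence of ints, the boxing has to be requested explicitly. Here is a minimal sketch using LINQ's Enumerable.Cast; the class and method names (BoxingWorkaround, ToObjects) are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BoxingWorkaround
{
    // Boxes each int explicitly. The result is a new sequence,
    // not a reinterpretation of the original one.
    public static IEnumerable<object> ToObjects(IEnumerable<int> source)
    {
        return source.Cast<object>();
    }

    public static void Main()
    {
        IEnumerable<int> intList = new List<int> { 1, 2, 3 };

        // IEnumerable<object> objList = intList;   // compile error: no variance for value types
        IEnumerable<object> objList = ToObjects(intList);

        foreach (object o in objList)
        {
            Console.WriteLine(o.GetType().Name + ": " + o);
        }
    }
}
```

The point is that the conversion now does real work per element, which is exactly what a variant reference conversion is not allowed to do.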

It is perhaps easier to understand if you think about the underlying representation (even though this really is an implementation detail). Here is a collection of strings:
IEnumerable<string> strings = new[] { "A", "B", "C" };
You can think of the strings as having the following representation:
[0] : string reference -> "A"
[1] : string reference -> "B"
[2] : string reference -> "C"
It is a collection of three elements, each being a reference to a string. You can cast this to a collection of objects:
IEnumerable<object> objects = (IEnumerable<object>) strings;
Basically it is the same representation except now the references are object references:
[0] : object reference -> "A"
[1] : object reference -> "B"
[2] : object reference -> "C"
The representation is the same. The references are just treated differently; you can no longer access the string.Length property but you can still call object.GetHashCode(). Compare this to a collection of ints:
IEnumerable<int> ints = new[] { 1, 2, 3 };
[0] : int = 1
[1] : int = 2
[2] : int = 3
To convert this to an IEnumerable<object> the data has to be converted by boxing the ints:
[0] : object reference -> 1
[1] : object reference -> 2
[2] : object reference -> 3
This conversion requires more than a cast.
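The difference can also be observed from code. The sketch below (IdentityDemo and its method names are invented for this example) checks whether the converted sequence is still the same object as the original one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IdentityDemo
{
    public static bool ReferenceConversionKeepsIdentity()
    {
        IEnumerable<string> strings = new[] { "A", "B", "C" };
        IEnumerable<object> objects = strings;            // covariant reference conversion
        return object.ReferenceEquals(strings, objects);  // same array, viewed through another type
    }

    public static bool BoxingKeepsIdentity()
    {
        IEnumerable<int> ints = new[] { 1, 2, 3 };
        // Every element is boxed and stored in a newly allocated object[].
        IEnumerable<object> boxed = ints.Select(i => (object)i).ToArray();
        return object.ReferenceEquals(ints, boxed);
    }

    public static void Main()
    {
        Console.WriteLine(ReferenceConversionKeepsIdentity()); // True
        Console.WriteLine(BoxingKeepsIdentity());              // False
    }
}
```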

I think everything starts from the definition of the LSP (Liskov Substitution Principle), which claims:
if q(x) is a property provable about objects x of type T then q(y) should be true for objects y of type S where S is a subtype of T.
But value types, for example int, cannot be substituted for object in C#.
The proof is very simple:
int myInt = new int();
object obj1 = myInt;
object obj2 = myInt;
return ReferenceEquals(obj1, obj2);
This returns false even though we assign the same "reference" to both objects, because each assignment boxes myInt into a separate new object.

It does come down to an implementation detail: Value types are implemented differently to reference types.
If you force value types to be treated as reference types (i.e. box them, e.g. by referring to them via an interface) you can get variance.
The easiest way to see the difference is to consider an array: an array of value types stores the values themselves contiguously in memory, whereas an array of reference types stores only the references (pointers) contiguously; the objects being pointed to are allocated separately.
The other (related) issue(*) is that (almost) all Reference types have the same representation for variance purposes and much code does not need to know of the difference between types, so co- and contra-variance is possible (and easily implemented -- often just by omission of extra type checking).
(*) It may be seen to be the same issue...
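As a rough illustration of the "box them via an interface" remark: if the ints are stored as IComparable from the start, the elements are already references and covariance applies. BoxedVariance and AsObjects are names made up for this sketch.

```csharp
using System;
using System.Collections.Generic;

static class BoxedVariance
{
    public static IEnumerable<object> AsObjects()
    {
        // Each int is boxed when it is added as an IComparable,
        // so the list holds references rather than raw ints.
        IEnumerable<IComparable> comparables = new List<IComparable> { 1, 2, 3 };

        // IComparable is a reference type, so the covariant conversion is allowed.
        return comparables;
    }

    public static void Main()
    {
        foreach (object o in AsObjects())
        {
            Console.WriteLine(o);
        }
    }
}
```

The price is that the boxing has already been paid when the list was filled.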

Related

Having trouble understanding the syntax of the <> in the code List<string> [duplicate]

I looked at some sample code using C# generics. Why and when should I use them?
All the examples were complex. I need a simple, clear example that gets me started with C# generics.
A very simple example is the generic List<T> class. It can hold a number of objects of any type. For example, you can declare a list of strings (new List<string>()) or a list of Animals (new List<Animal>()), because it is generic.
What if you couldn't use generics? You could use the ArrayList class, but the downside is that its element type is object. So when you iterate over the list, you have to cast every item to its correct type (either string or Animal), which is more code and has a performance penalty. Plus, since an ArrayList holds objects, it isn't type-safe. You could still add an Animal to an ArrayList of strings:
ArrayList arrayList = new ArrayList();
arrayList.Add(new Animal());
arrayList.Add("");
So when iterating an ArrayList you'd have to check the type to make sure the instance is of a specific type, which results in poor code:
foreach (object o in arrayList)
{
    if (o is Animal)
        ((Animal)o).Speak();
}
With a generic List<string>, this is simply not possible:
List<string> stringList = new List<String>();
stringList.Add("Hello");
stringList.Add("Second String");
stringList.Add(new Animal()); // compile error: cannot convert from Animal to string
To summarize other answers with some emphasis:
1) Generics enable you to write 'generic' code (i.e., code that works for multiple types). If you have 'generic' behavior that must work for differing data types, you only need to write that code once. List is a great example: you may need lists of customers, products, orders, suppliers... all using the same code instantiated for each type.
// snippet
List<Customer> customers = new List<Customer>();
Customer thisCustomer = new Customer();
customers.Add(thisCustomer);
List<Order> orders = new List<Order>();
Order thatOrder = new Order();
orders.Add(thatOrder);
// etc.
2) amazingly, generics still enable type safety! So if you try this, you will rightly get an error:
// continued for snippet above
Order anotherOrder = new Order();
customers.Add(anotherOrder); // FAIL!
And you would want that to be an error, so that later on your customer processing code doesn't have to handle a bogus order showing up in the customers list.
Duplication is the root of all evil. One case of code duplication occurs when you have to perform the same operation on different types of data. Generics let you avoid it by allowing you to code around a 'generic' type and later substitute it with specific types.
The other solution to this problem is to use variables of type 'System.Object', to which objects of any type can be assigned. This approach involves boxing and unboxing operations between value and reference types, which hurt performance. Also, type casting keeps the code from being clean.
Generics are supported in MSIL and the CLR, which makes them perform really well.
You should read these articles about generics -
http://msdn.microsoft.com/en-us/library/512aeb7t(VS.80).aspx
http://msdn.microsoft.com/en-us/library/ms379564(VS.80).aspx#csharp_generics_topic1
In a nutshell, generics allow you to write classes that work with objects of any type, but without having to cast the data to Object. There are performance benefits for this, but it also makes your code more readable, maintainable, and less error-prone.
You should always use generics as opposed to the .NET 1.1 style classes when possible.
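To make the "write it once, use it with any type" point concrete, here is a minimal sketch of a user-defined generic class. Pair<T> and its members are hypothetical names invented for this illustration; they are not part of the .NET library.

```csharp
using System;

// Hypothetical example type: one implementation, usable with any T.
class Pair<T>
{
    public T First { get; }
    public T Second { get; }

    public Pair(T first, T second)
    {
        First = first;
        Second = second;
    }

    // Returns a new pair with the two items exchanged.
    public Pair<T> Swapped()
    {
        return new Pair<T>(Second, First);
    }
}

static class PairDemo
{
    public static void Main()
    {
        var numbers = new Pair<int>(1, 2);
        var words = new Pair<string>("left", "right");

        int n = numbers.Swapped().First;    // no cast and no boxing
        string w = words.Swapped().First;   // no cast

        Console.WriteLine(n);   // 2
        Console.WriteLine(w);   // right
    }
}
```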
I have a use case here from Matt Milner, instructor at LinkedIn Learning. It's a little verbose, not simple as you requested, but I found it useful for diving deeper on why generics are necessary.
Generics Over Value and Reference Types (on C#)
Value Types: bool, byte, char, decimal, double, enum, float, int, long, sbyte, short, struct, uint, ulong, ushort.
Reference Types: String, Arrays (even if their elements are value types), Class, Delegate.
Part 1: On Value Types
Let's say you have this method:
static void Swap(object first, object second)
{
    object temp = second;
    second = first;
    first = temp;
}
And you use it like so:
int x = 5, y = 7;
Swap(x, y);
System.Console.WriteLine($"X: {x} and Y: {y}");
Which prints out:
X: 5 and Y: 7
No swap here. Why? Because:
int is a value type.
When x and y are passed to the method they are copied (and boxed, since the parameters are of type object). The method swaps only its own local copies, so the caller's variables are untouched. They have been passed "by value", not "by reference".
Note: C# allows treating int values as objects, but this involves boxing and unboxing. Keep in mind that boxing allocates an object on the heap, which makes it a comparatively expensive operation.
Should this problem be solved by using reference types? Let's try it out.
Part 2: On Reference Types
We are going to use a custom class, previously defined in some library:
var p1 = new Person
{
    FirstName = "Matt",
    LastName = "Milner"
};
var p2 = new Person
{
    FirstName = "Amanda",
    LastName = "Owner"
};
Let's swap them.
Swap(p1, p2);
System.Console.WriteLine($"Person 1 is: {p1.FirstName}");
We would expect to read Amanda, but instead we get:
Person 1 is: Matt
Why? We might expect the change to occur, since class instances are reference types. But this is not the case, because what the method receives is a copy of each reference; assigning to first and second inside the method changes only those local copies, not p1 and p2.
What if we modify the method with ref?
Part 3: Using ref?
static void Swap(ref object first, ref object second)
{
    object temp = second;
    second = first;
    first = temp;
}
This should allow us to change not just the contents of the objects, but also which objects the variables point to.
Swap(ref p1, ref p2);
System.Console.WriteLine($"Person 1 is: {p1.FirstName}");
This won't work. Why?
Because a ref Person cannot be converted to a ref object: the type of a ref argument must match the parameter type exactly.
You'll get a compile error before the program even runs.
Part 4: Using Generics
This problem is solved with generics.
They can be used with any given type (int, arrays, etc.) without running into these problems.
static void Swap<T>(ref T first, ref T second)
{
    T temp = second;
    second = first;
    first = temp;
}
Here T is a generic type parameter.
T completely replaces the object declarations in the method, and lets the caller tell the method which type is being passed in when it is used. Like so:
Swap<Person>(ref p1, ref p2);
Swap<int>(ref x, ref y);
System.Console.WriteLine($"Person 1 is: {p1.FirstName}");
System.Console.WriteLine($"X: {x} and Y: {y}");
Which prints out:
Person 1 is: Amanda
X: 7 and Y: 5
Summing Up
For both reference and value types:
static void Swap(object first, object second); // No swap
static void Swap(ref object first, ref object second); // No swap: does not compile when called with ref Person or ref int
static void Swap<T>(ref T first, ref T second); // Swap
The generic version lets each call specify the type the method works with.
At compile time, the calling code supplies the type being used.
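For anyone who wants to run the final version end to end, here is a self-contained sketch. The Person class below is a hypothetical stand-in for the library class mentioned earlier, and the calls rely on the compiler inferring T from the arguments, so the explicit <Person> and <int> can be left out.

```csharp
using System;

// Hypothetical stand-in for the Person class "previously defined in some library".
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

static class SwapDemo
{
    public static void Swap<T>(ref T first, ref T second)
    {
        T temp = second;
        second = first;
        first = temp;
    }

    public static void Main()
    {
        int x = 5, y = 7;
        var p1 = new Person { FirstName = "Matt", LastName = "Milner" };
        var p2 = new Person { FirstName = "Amanda", LastName = "Owner" };

        Swap(ref x, ref y);     // T is inferred as int
        Swap(ref p1, ref p2);   // T is inferred as Person

        Console.WriteLine($"Person 1 is: {p1.FirstName}");  // Person 1 is: Amanda
        Console.WriteLine($"X: {x} and Y: {y}");            // X: 7 and Y: 5
    }
}
```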
Note: I still need to check for myself why reference types behave this way, which as far as I know is related to how object memory addresses are passed, referenced and copied. For the most part I just followed along with the explanation from Matt Milner.

Why can I cast an A[] to a B[] if A and B are enum types?

Enums make code more readable and easier to understand in many cases. But I can't understand when I would use a line like the one below:
public enum A
{
    apple, orange, egg
}
public enum B
{
    apple, orange, egg
}
public static void main()
{
    A[] aa = (A[])(Array) new B[100];
}
Can anyone give me a source code sample where this kind of enum array cast would be used?
The CLR has more generous casting rules than C#. Apparently, the CLR allows conversion between arrays of enums if the underlying types have the same size. In the question that I linked the case was (sbyte[])(object)(byte[]), which is similarly surprising.
This is in the ECMA spec Partition I I.8.7 Assignment compatibility.
underlying types – in the CTS enumerations are alternate names for existing types (§I.8.5.2), termed their underlying type. Except for signature matching (§I.8.5.2)
III.4.3 castclass says
If the actual type (not the verifier tracked type) of obj is verifier-assignable-to the type typeTok the cast succeeds
castclass is the instruction that the C# compiler uses to perform this cast.
The fact that the enum members are the same in your example has nothing to do with the problem. Also note that no new object is being created. It really is just a cast of the object reference.
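The "no new object" point can be checked directly. This is a small sketch reusing the A and B enums from the question; EnumArrayCast and CastToA are names invented for the example.

```csharp
using System;

enum A { apple, orange, egg }
enum B { apple, orange, egg }

static class EnumArrayCast
{
    // The detour through Array hides the conversion from the C# compiler,
    // so the check is left to the CLR at run time.
    public static A[] CastToA(B[] source)
    {
        return (A[])(Array)source;
    }

    public static void Main()
    {
        B[] bb = new B[100];
        A[] aa = CastToA(bb);

        // Both variables refer to the very same array object.
        Console.WriteLine(object.ReferenceEquals(aa, bb));

        // The run-time type of the array is unchanged by the cast.
        Console.WriteLine(aa.GetType());
    }
}
```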

Generics supports only reference conversions not boxing conversions

While reading C# in a Nutshell about boxing and unboxing on page 91, the author writes this:
Boxing conversions are crucial in providing a unified type system. The system is not perfect, however: we will see in Generics that variance with arrays and generics supports only reference conversions and not boxing conversions.
And quoted example code:
object[] a1 = new string[3]; // legal
object[] a2 = new int[3];    // error
Can somebody explain what the author is trying to deliver and why the first line is legal and second line is not?
Well, there's a reference conversion between string and object, in that every string reference can be treated as an object reference. This can be done transparently, with no modification to the value at all. That's why array variance can be done cheaply - for reads, anyway. Reading from an object[] value (as is known at compile-time) which happens to really be a string[] at execution time is basically free - you read the value, and you can treat it as an object reference. Writing is more painful - every write to an object[] has to check that the value you're writing is genuinely compatible with the array you're trying to store it in.
The important thing here is that the representation of a string reference is the same as the representation of an object reference.
There's a boxing conversion between int and object which lets this work:
int x = 10;
object y = x;
... but that conversion involves more action; the CLR has to create an object which contains the relevant integer. The representation of an int is not the same as the representation of an object reference. Checking whether that's necessary - and doing it as you go - would be relatively painful (from a performance perspective) when reading from an array, so it's not valid to treat an int[] as an object[].
The same is true for generic variance. This is fine:
List<string> strings = new List<string>();
IEnumerable<object> objects = strings; // IEnumerable<T> is covariant in T
But this isn't:
List<int> ints = new List<int>();
IEnumerable<object> objects = ints; // Covariance doesn't apply here
For more on representation and identity, see Eric Lippert's blog post on the topic. (It doesn't talk about variance very much, but it's all related...)
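The run-time check on writes mentioned above is easy to observe. In this sketch (ArrayVarianceDemo and WriteFails are made-up names), storing a boxed int into an object[] that is really a string[] throws ArrayTypeMismatchException.

```csharp
using System;

static class ArrayVarianceDemo
{
    public static bool WriteFails()
    {
        object[] a1 = new string[3];   // legal: reference conversion, the array is still a string[]

        try
        {
            a1[0] = 42;                // compiles, but the CLR checks the element type on every write
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;               // a boxed int is not a string
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteFails());   // True
    }
}
```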

C# Casting Performance Implications

When using the 'as' keyword in C# to make a cast which fails, null gets returned. What's going on in the background? Is it simply suppressing an exception so I don't have to write handling code for a failure?
I'm interested in the performance characteristics of it compared to a typical cast wrapped in a try-catch.
It's using the IL instruction isinst to perform the cast instead of the castclass instruction that is used when casting. This is a special instruction which performs the cast if it is valid, else leaves null on the stack if it isn't. So no, it doesn't just suppress an exception, and is orders of magnitude faster than doing so.
Note that there are some differences in behaviour between as (isinst) and a cast expression. The main one is that as does not take user-defined conversion operators into account; it only considers the inheritance hierarchy. A cast expression that uses a user-defined operator compiles to a call to that operator rather than to castclass. For example, if you define the following two classes with no inheritance relationship but an explicit conversion operator:
class A
{
    public int Foo;
}
class B
{
    public int Foo;
    public static explicit operator B(A a)
    {
        return new B { Foo = a.Foo };
    }
}
Then the following will succeed:
var a = new A { Foo = 3 };
var b = (B)a;
Console.WriteLine(b.Foo); // prints 3
However the following does not compile, with the error 'Cannot convert type 'A' to 'B' via a reference conversion, boxing conversion, unboxing conversion, wrapping conversion, or null type conversion'
var a = new A { Foo = 3 };
var b = a as B;
So if you do have any user-defined casts set up (which are typically a bad idea on reference types, for this reason and others) then you should be aware of this difference.
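To tie this back to the original question, here is a short sketch of the two failure modes side by side. AsVersusCast, TryAs and CastThrows are hypothetical names; StringBuilder is used only because it is unrelated to string.

```csharp
using System;
using System.Text;

static class AsVersusCast
{
    // 'as' yields null when the object is not a StringBuilder; nothing is thrown.
    public static StringBuilder TryAs(object value)
    {
        return value as StringBuilder;
    }

    // A cast expression throws InvalidCastException for an incompatible object.
    public static bool CastThrows(object value)
    {
        try
        {
            StringBuilder sb = (StringBuilder)value;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object text = "not a StringBuilder";
        Console.WriteLine(TryAs(text) == null);   // True
        Console.WriteLine(CastThrows(text));      // True
    }
}
```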
And to add to Greg's excellent post...
The first time a new Type is referenced at runtime, the CLR loads into memory a structure called COREINFO_CLASS_STRUCT (or something similar) that contains, among other things, a pointer to the COREINFO_CLASS_STRUCT object for the base class that this type derives from. This effectively creates a linked list of COREINFO_CLASS_STRUCT objects for the inheritance chain for the Type, which terminates in the COREINFO_CLASS_STRUCT for System.Object. When you execute isinst (or its analogous instruction castclass), it simply has to find the COREINFO_CLASS_STRUCT memory structure for the concrete type of the object you are examining, and traverse this linked list to see if the Type you are trying to cast to is in the list.
It also contains a pointer to a separate array which contains all the interfaces implemented by the Type, which must be searched separately if you are trying to cast to an interface.
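As an analogy only, the same walk can be written with reflection. This sketch does not touch the CLR's internal structures; it just follows Type.BaseType and searches Type.GetInterfaces(), and the names HierarchyWalk, IsInBaseChain and ImplementsInterface are invented for the example.

```csharp
using System;
using System.Collections.Generic;

static class HierarchyWalk
{
    // Follows the base-type chain from the concrete type up to System.Object.
    public static bool IsInBaseChain(Type concrete, Type target)
    {
        for (Type t = concrete; t != null; t = t.BaseType)
        {
            if (t == target)
            {
                return true;
            }
        }
        return false;
    }

    // Interfaces are kept in a separate list and have to be searched separately.
    public static bool ImplementsInterface(Type concrete, Type iface)
    {
        return Array.IndexOf(concrete.GetInterfaces(), iface) >= 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsInBaseChain(typeof(ArgumentException), typeof(Exception)));        // True
        Console.WriteLine(IsInBaseChain(typeof(string), typeof(Exception)));                   // False
        Console.WriteLine(ImplementsInterface(typeof(List<int>), typeof(IEnumerable<int>)));   // True
    }
}
```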
