Does C# have an equivalent to Scala's structural typing?

In Scala, I can define structural types as follows:
type Pressable = { def press(): Unit }
This means that I can define a function or method which takes as an argument something that is Pressable, like this:
def foo(i: Pressable) { // etc.
The object which I pass to this function must have defined for it a method called press() that matches the type signature defined in the type - takes no arguments, returns Unit (Scala's version of void).
I can even use the structural type inline:
def foo(i: { def press(): Unit }) { // etc.
It basically allows the programmer to have all the benefits of duck typing while still having the benefit of compile-time type checking.
Does C# have something similar? I've Googled but can't find anything, though I'm not familiar with C# in any depth. If there isn't, are there any plans to add this?

No, and no plans that I know of. Only named (rather than structural) subtyping (e.g. interfaces).
(Others may want to see also
http://en.wikipedia.org/wiki/Nominative_type_system
http://en.wikipedia.org/wiki/Structural_type_system
)
(A few people may point out some exotic corner cases, like the foreach statement using structural typing for GetEnumerator, but this is the exception rather than the rule.)
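To make the foreach corner case concrete, here is a minimal sketch. Countdown, Enumerator and ForeachDemo are hypothetical names invented for illustration. The type implements neither IEnumerable nor IEnumerator, yet foreach accepts it because it exposes a suitable GetEnumerator method.

```csharp
using System;

// Hypothetical type: implements no interfaces at all.
public class Countdown
{
    private readonly int _start;
    public Countdown(int start) { _start = start; }

    // foreach only requires an accessible GetEnumerator() whose result
    // has a bool MoveNext() method and a Current property.
    public Enumerator GetEnumerator() { return new Enumerator(_start); }

    public class Enumerator
    {
        private int _next;
        public Enumerator(int start) { _next = start + 1; }
        public int Current { get { return _next; } }
        public bool MoveNext() { _next--; return _next > 0; }
    }
}

public static class ForeachDemo
{
    public static int Sum(Countdown c)
    {
        int total = 0;
        // Compiles through the pattern, not through an interface.
        foreach (int n in c) total += n;
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new Countdown(3)));   // 3 + 2 + 1
    }
}
```

Removing MoveNext or Current from Enumerator turns the foreach into a compile-time error, which is the statically checked flavour of duck typing the question asks about, limited to this one statement.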

There isn't a way to define a structural type that requires a particular method. There is a library that adds duck typing support to C# that can be found here.
This is the example from the Duck Typing project. Please note that the duck typing happens at runtime and can fail. It is also my understanding that this library generates proxies for the types that are duck typed, which is a far cry from the elegant compile-time support that Scala enjoys. This is most likely as good as it gets with this generation of C#.
public interface ICanAdd
{
int Add(int x, int y);
}
// Note that MyAdder does NOT implement ICanAdd,
// but it does define an Add method like the one in ICanAdd:
public class MyAdder
{
public int Add(int x, int y)
{
return x + y;
}
}
public class Program
{
void Main()
{
MyAdder myAdder = new MyAdder();
// Even though ICanAdd is not implemented by MyAdder,
// we can duck cast it because it implements all the members:
ICanAdd adder = DuckTyping.Cast<ICanAdd>(myAdder);
// Now we can call adder as you would any ICanAdd object.
// Transparently, this call is being forwarded to myAdder.
int sum = adder.Add(2, 2);
}
}
This is the C# way of achieving the same thing using the good ol boring interfaces.
interface IPressable {
void Press();
}
class Foo {
public void Bar(IPressable pressable) {
pressable.Press();
}
}
// Thingy may implement other interfaces as well (IPushable, etc.).
class Thingy : IPressable {
public void Press() {
}
}
static class Program {
public static void Main() {
IPressable pressable = new Thingy();
new Foo().Bar(pressable);
}
}

As others noted, this is not really available in .NET (as this is more a matter of the runtime than a language). However, .NET 4.0 supports a similar thing for imported COM interfaces, and I believe this could be used for implementing structural typing for .NET. See this blog post:
Faking COM to fool the C# compiler
I haven't tried playing with this myself yet, but I think it might enable compiler authors to write languages with structural typing for .NET. (The idea is that you (or a compiler) would define an interface behind the scenes, but it would work, because the interfaces would be treated as equivalent thanks to the COM equivalence feature.)
Also, C# 4.0 supports the dynamic keyword which, I think, could be interpreted as structural typing (with no static type checking). The keyword allows you to call methods on any object without knowing (at compile time) whether the object will have the required methods. This is essentially the same thing as the "Duck typing" project mentioned by Igor (but that is, of course, not proper structural typing).
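As a minimal sketch of the dynamic approach described above: Button, Key and TryPress are hypothetical names, and the only check happens at run time, where a missing Press() surfaces as a RuntimeBinderException.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

// Two unrelated types that happen to share a Press() method.
public class Button { public void Press() { Console.WriteLine("Button pressed"); } }
public class Key { public void Press() { Console.WriteLine("Key pressed"); } }

public static class DynamicDemo
{
    // No interface is required: the Press() call is bound at run time.
    public static bool TryPress(dynamic target)
    {
        try
        {
            target.Press();
            return true;
        }
        catch (RuntimeBinderException)
        {
            // The runtime type has no accessible Press() method.
            return false;
        }
    }

    public static void Main()
    {
        TryPress(new Button());
        TryPress(new Key());
        Console.WriteLine(TryPress("not pressable"));   // string has no Press()
    }
}
```

Unlike the Scala version, passing a string here compiles without complaint, so the error moves from the compiler to the catch block.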

The awaitable pattern in C# can perhaps be interpreted as a limited, ad hoc instance of structural subtyping / existential typing. The compiler will only await objects that have access to a GetAwaiter() method that returns any INotifyCompletion object with a specific set of methods and properties. Since neither the 'awaitable' object nor the 'awaiter' object needs to implement any interface (except INotifyCompletion in the case of the latter), await is similar to a method that accepts structurally typed awaitable objects.
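A minimal sketch of that pattern follows, assuming C# 5 or later. Answer, AnswerAwaiter and AwaitDemo are hypothetical names. Answer implements no interface; the compiler accepts it in an await expression purely because of the shape of GetAwaiter() and of the awaiter it returns.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical awaitable: no interface, only the right method shape.
public class Answer
{
    public AnswerAwaiter GetAwaiter() { return new AnswerAwaiter(); }
}

// The awaiter must implement INotifyCompletion and expose
// IsCompleted and GetResult() by name.
public class AnswerAwaiter : INotifyCompletion
{
    public bool IsCompleted { get { return true; } }   // already finished
    public int GetResult() { return 42; }
    public void OnCompleted(Action continuation) { continuation(); }
}

public static class AwaitDemo
{
    public static async Task<int> ComputeAsync()
    {
        // Accepted because Answer matches the awaitable pattern.
        return await new Answer();
    }

    public static void Main()
    {
        Console.WriteLine(ComputeAsync().Result);
    }
}
```

Because IsCompleted is always true in this sketch, the async method runs to completion synchronously and OnCompleted is never invoked.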

Related

Are the placeholders of Generics compiled as an actual data type? [duplicate]

I had thought that Generics in C# were implemented such that a new class/method/what-have-you was generated, either at run-time or compile-time, when a new generic type was used, similar to C++ templates (which I've never actually looked into and I very well could be wrong, about which I'd gladly accept correction).
But in my coding I came up with an exact counterexample:
static class Program {
static void Main()
{
Test testVar = new Test();
GenericTest<Test> genericTest = new GenericTest<Test>();
int gen = genericTest.Get(testVar);
RegularTest regTest = new RegularTest();
int reg = regTest.Get(testVar);
if (gen == ((object)testVar).GetHashCode())
{
Console.WriteLine("Got Object's hashcode from GenericTest!");
}
if (reg == testVar.GetHashCode())
{
Console.WriteLine("Got Test's hashcode from RegularTest!");
}
}
class Test
{
public new int GetHashCode()
{
return 0;
}
}
class GenericTest<T>
{
public int Get(T obj)
{
return obj.GetHashCode();
}
}
class RegularTest
{
public int Get(Test obj)
{
return obj.GetHashCode();
}
}
}
Both of those console lines print.
I know that the actual reason this happens is that the virtual call to Object.GetHashCode() doesn't resolve to Test.GetHashCode() because the method in Test is marked as new rather than override. Therefore, I know if I used "override" rather than "new" on Test.GetHashCode() then the return of 0 would polymorphically override the method GetHashCode in object and this wouldn't be true, but according to my (previous) understanding of C# generics it wouldn't have mattered because every instance of T would have been replaced with Test, and thus the method call would have statically (or at generic resolution time) been resolved to the "new" method.
So my question is this: How are generics implemented in C#? I don't know CIL bytecode, but I do know Java bytecode so I understand how Object-oriented CLI languages work at a low level. Feel free to explain at that level.
As an aside, I thought C# generics were implemented that way because everyone always calls the generic system in C# "True Generics," compared to the type-erasure system of Java.
In GenericTest<T>.Get(T), the C# compiler has already picked that object.GetHashCode should be called (virtually). There's no way this will resolve to the "new" GetHashCode method at runtime (which will have its own slot in the method-table, rather than overriding the slot for object.GetHashCode).
From Eric Lippert's What's the difference, part one: Generics are not templates, the issue is explained (the setup used is slightly different, but the lessons translate well to your scenario):
This illustrates that generics in C# are not like templates in C++.
You can think of templates as a fancy-pants search-and-replace
mechanism.[...] That’s not how generic types work; generic types are,
well, generic. We do the overload resolution once and bake in the
result. [...] The IL we’ve generated for the generic type already has
the method its going to call picked out. The jitter does not say
“well, I happen to know that if we asked the C# compiler to execute
right now with this additional information then it would have picked a
different overload. Let me rewrite the generated code to ignore the
code that the C# compiler originally generated...” The jitter knows
nothing about the rules of C#.
And a workaround for your desired semantics:
Now, if you do want overload resolution to be re-executed at runtime based on the runtime types of
the arguments, we can do that for you; that’s what the new “dynamic”
feature does in C# 4.0. Just replace “object” with “dynamic” and when
you make a call involving that object, we’ll run the overload
resolution algorithm at runtime and dynamically spit code that calls
the method that the compiler would have picked, had it known all the
runtime types at compile time.
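Applied to the question's code, the workaround from the quote might look like the following sketch. GetStatic and GetDynamic are hypothetical method names added for comparison. The first is bound once when the generic class is compiled; the second defers binding to run time, where the hiding GetHashCode on Test should be selected.

```csharp
using System;

public class Test
{
    // Hides object.GetHashCode() instead of overriding it.
    public new int GetHashCode() { return 0; }
}

public class GenericTest<T>
{
    // Bound at compile time: always a virtual call to object.GetHashCode().
    public int GetStatic(T obj) { return obj.GetHashCode(); }

    // Bound at run time against the actual type of obj.
    public int GetDynamic(T obj) { return ((dynamic)obj).GetHashCode(); }
}

public static class GenericsDemo
{
    public static void Main()
    {
        var t = new Test();
        var g = new GenericTest<Test>();

        // Same value as calling through an object reference.
        Console.WriteLine(g.GetStatic(t) == ((object)t).GetHashCode());

        // The runtime binder sees a Test and picks the hiding method.
        Console.WriteLine(g.GetDynamic(t));
    }
}
```

The dynamic call costs a runtime binding step and gives up compile-time checking, so it is a workaround for this specific situation rather than a general replacement for generics.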

How does variable know about the type it implements?

As far as I know, each variable knows about its runtime type.
Here is an example:
void Main()
{
C c = new C();
c.M();
I i = (I)c;
i.M();
}
public interface I
{
void M();
}
public class C : I
{
void I.M()
{
Console.WriteLine("I.M");
}
public void M()
{
Console.WriteLine("M");
}
}
If I understand it right, i still knows that its type is C. So, what is the mechanism that lets i decide on using I.M instead of M?
Internally each object has its own TypeHandle, see object internal structure below:
MSDN - Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects
You want to know how runtime method binding works, that is, how does the runtime know to call one method M instead of another when there was not enough information at compile time to encode into the program precisely which method to call?
Here's a good exercise: try to write a program that has that behaviour without using the feature as it is already written in the runtime. By doing that you will gain insight into how the people implementing the runtime did it.
I go through this exercise for virtual functions here:
http://blogs.msdn.com/b/ericlippert/archive/2011/03/17/implementing-the-virtual-method-pattern-in-c-part-one.aspx
Read that series and you'll see how you could emulate virtual dispatch in a language that does not have it. The basic idea that I show in the articles is more or less how virtual dispatch actually works in C#. Interface dispatch is a bit trickier in practice but the concept is basically the same.

What is the meaning of in modifier in the parameter list?

I saw the following usage of in:
Covariance and contravariance real world example
interface IGobbler<in T> {
void gobble(T t);
}
I don't understand what this usage of in stands for. Is it related to ref and out?
The in and out modifiers in 4.0 are necessary to enforce (or rather: enable) covariance and contravariance.
If you add in, you are only allowed to use T in inwards (contravariant) positions - so things like Add(T obj) are fine, but T this[int index] {get;} is not, as this is an outwards (covariant) position.
This is essential for the variance features in 4.0. With variance, ref and out parameters are both unavailable (they are both in and out, and as such: neither).
Ignore what you know about ref and out because it's unrelated to this context. In this case in means that the T will only appear on the right hand side of function names (i.e. in formal parameter lists like void gobble(T t)). If it said out, then T would only appear to the left of function names (i.e. return values like T foo(int x)). The default (not specifying anything) allows T to appear in either place.
In C# 4.0, contravariance allows, for example, IComparer<X> to be cast to IComparer<Y> even if Y is a derived type of X. To achieve this, IComparer should be marked with the in modifier.
public interface IComparer<in T> {
int Compare(T left, T right);
}
Have a look here for an example and explanation:
http://www.csharphelp.com/2010/02/c-4-0-covariance-and-contravariance-of-generics/
The in modifier tells you that the type is contravariant and can be implicitly converted to a narrower type. Notice in the example below that even though gobble takes a Shape, it can be assigned to an Action<Rectangle> because we've declared it to be contravariant. This is because anyone calling the delegate and passing it a Rectangle can obviously also pass the Rectangle to a method that takes a Shape as well.
There are some rules for when you use in, and out, but that's what it enables in a nutshell.
For example:
using System;
public class Shape { }
public class Rectangle : Shape { }
public interface IGobbler<in T>
{
void gobble(T shape);
}
public class Gobbler : IGobbler<Shape>
{
public void gobble(Shape r) { }
}
public static class Program
{
public static void Main()
{
var g = new Gobbler();
// because of the 'in' keyword, a gobbler of Shape converts implicitly
// to a gobbler of the narrower type Rectangle
IGobbler<Rectangle> rg = g;
// the method group converts to Action<Rectangle> for the same reason
Action<Rectangle> r = g.gobble;
}
}
The in and out modifiers here do not have anything to do with ref and out.
The in keyword is used to describe that an instance of the interface will consume an instance of T. In the example you linked, the line
IGobbler<Donkey> dg = new QuadrupedGobbler();
creates a gobbler that you can feed Donkeys to, even though Donkey isn't QuadrupedCreature itself, but derives from it. So you are able to use a more specialized type instead of the base class as the argument.
The out keyword works much the same way, except it's used to describe a thing that produces stuff instead of consuming it.
In the same example, the line
ISpewer<Rodent> rs = new MouseSpewer();
creates an ISpewer, which when called spews a mouse. Mouse is not the same type as Rodent, but derives from it, so you are able to use a producing class that produces a more specialized instance than what the interface declares.
Notice how the most specialized class is swapped in the two cases. When using the in keyword, you use the specialized class as the generic argument on the interface, whereas in the out case, you use the base class as the generic argument to tell the compiler that even though you create a more specialized class, it should treat it like the base class.
I like to think of it as consumption and production, as these are familiar metaphors for most developers. A method that takes an IGobbler<Cow> can also accept an IGobbler<Animal>, because a Gobbler that can gobble (consume) any Animal can also gobble a Cow. The Gobbler here is a consumer of a specific type of Animal, so it uses the in tag.
The above case (contravariance) can seem counter-intuitive, but think about it from the perspective of the RestaurantOwner who wants a Gobbler<Cow>. If a Gobbler will only gobble Pigs and the RestaurantOwner tries to feed him a Cow, it won't work. He can only accept Gobblers that are less picky, so a Gobbler<Animal> or Gobbler<Herbivore> works fine.
On the other hand, suppose you have a Farmer<Animal> that sells Animals (having a Farm method that returns IEnumerable<Animal>). If you have a Purchaser that wants to Buy(IEnumerable<Animal>), then it can accept Farmer<Cow>.Farm(), as the Purchaser is willing to buy any produced Animal and Cows are Animals. The Farmer here is a producer of a specific type of Animal, so it uses the out tag.
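The consumer and producer metaphors can be sketched side by side. All names below (Animal, Cow, IGobbler, ISpewer, AnimalGobbler, CowSpewer) are hypothetical and only loosely follow the examples discussed in this thread.

```csharp
using System;

public class Animal { }
public class Cow : Animal { }

public interface IGobbler<in T> { void Gobble(T food); }   // consumes T
public interface ISpewer<out T> { T Spew(); }              // produces T

public class AnimalGobbler : IGobbler<Animal>
{
    public int Eaten;
    public void Gobble(Animal food) { Eaten++; }
}

public class CowSpewer : ISpewer<Cow>
{
    public Cow Spew() { return new Cow(); }
}

public static class VarianceDemo
{
    public static void Main()
    {
        var gobbler = new AnimalGobbler();

        // Contravariance: something that eats any Animal can stand in
        // wherever a Cow eater is required.
        IGobbler<Cow> cowGobbler = gobbler;
        cowGobbler.Gobble(new Cow());

        // Covariance: something that produces Cows can stand in
        // wherever an Animal producer is required.
        ISpewer<Animal> animalSpewer = new CowSpewer();
        Animal a = animalSpewer.Spew();

        Console.WriteLine(gobbler.Eaten + " " + a.GetType().Name);
    }
}
```

Swapping the two assignments (IGobbler<Animal> from a Cow gobbler, or ISpewer<Cow> from an Animal spewer) is rejected by the compiler, which is exactly the safety the in and out annotations buy.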
The IN keyword tells the compiler that we only want to use T as an input value.
It will not allow casting from say, IGobbler to IGobbler

Calling constructor overload when both overload have same signature

Consider the following class,
class Foo
{
public Foo(int count)
{
/* .. */
}
public Foo(int count)
{
/* .. */
}
}
Above code is invalid and won't compile. Now consider the following code,
class Foo<T>
{
public Foo(int count)
{
/* .. */
}
public Foo(T t)
{
/* .. */
}
}
static void Main(string[] args)
{
Foo<int> foo = new Foo<int>(1);
}
The above code is valid and compiles fine. It calls Foo(int count).
My question is, if the first one is invalid, how can the second one be valid? I know class Foo<T> is valid because T and int are different types. But when it is used like Foo<int> foo = new Foo<int>(1), T becomes int and both constructors will have the same signature, right? Why doesn't the compiler show an error rather than choosing an overload to execute?
There is no ambiguity, because the compiler will choose the most specific overload of Foo(...) that matches. Since a method with a generic type parameter is considered less specific than a corresponding non-generic method, Foo(T) is therefore less specific than Foo(int) when T == int. Accordingly, you are invoking the Foo(int) overload.
Your first case (with two Foo(int) definitions) is an error because the compiler will allow only one definition of a method with precisely the same signature, and you have two.
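A small sketch that makes the chosen overload observable. The Chosen field is a hypothetical addition for illustration; the two constructors match the ones in the question.

```csharp
using System;

public class Foo<T>
{
    // Records which constructor overload resolution selected.
    public readonly string Chosen;

    public Foo(int count) { Chosen = "Foo(int)"; }
    public Foo(T t) { Chosen = "Foo(T)"; }
}

public static class OverloadDemo
{
    public static void Main()
    {
        // Both constructors are applicable; the non-generic parameter
        // type is considered more specific, so Foo(int) is chosen.
        Console.WriteLine(new Foo<int>(1).Chosen);

        // Only Foo(T) is applicable for a string argument.
        Console.WriteLine(new Foo<string>("x").Chosen);
    }
}
```

One consequence is that with Foo<int> there is no way to reach the Foo(T) constructor through an ordinary new expression, since the tie-break always favours Foo(int).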
Your question was hotly debated when C# 2.0 and the generic type system in the CLR were being designed. So hotly, in fact, that the "bound" C# 2.0 specification published by A-W actually has the wrong rule in it! There are four possibilities:
1) Make it illegal to declare a generic class that could POSSIBLY be ambiguous under SOME construction. (This is what the bound spec incorrectly says is the rule.) So your Foo<T> declaration would be illegal.
2) Make it illegal to construct a generic class in a manner which creates an ambiguity. Declaring Foo<T> would be legal, constructing Foo<double> would be legal, but constructing Foo<int> would be illegal.
3) Make it all legal and use overload resolution tricks to work out whether the generic or nongeneric version is better. (This is what C# actually does.)
4) Do something else I haven't thought of.
Rule #1 is a bad idea because it makes some very common and harmless scenarios impossible. Consider for example:
class C<T>
{
public C(T t) { ... } // construct a C that wraps a T
public C(Stream state) { ... } // construct a C based on some serialized state from disk
}
You want that to be illegal just because C<Stream> is ambiguous? Yuck. Rule #1 is a bad idea, so we scrapped it.
Unfortunately, it is not as simple as that. IIRC the CLI rules say that an implementation is allowed to reject as illegal constructions that actually do cause signature ambiguities. That is, the CLI rules are something like Rule #2, whereas C# actually implements Rule #3. Which means that there could in theory be legal C# programs that translate into illegal code, which is deeply unfortunate.
For some more thoughts on how these sorts of ambiguities make our lives wretched, here are a couple of articles I wrote on the subject:
http://blogs.msdn.com/ericlippert/archive/2006/04/05/569085.aspx
http://blogs.msdn.com/ericlippert/archive/2006/04/06/odious-ambiguous-overloads-part-two.aspx
Eric Lippert blogged about this recently.
The fact is that they do not both have the same signature - one is using generics while the other is not.
With those methods in place you could also call it using a non-int object:
Foo<string> foo = new Foo<string>("Hello World");

Is C# a single dispatch or multiple dispatch language?

I'm trying to understand what single and multiple dispatch are, exactly.
I just read this:
http://en.wikipedia.org/wiki/Multiple_dispatch
And from that definition it seems to me that C# and VB.Net are multiple-dispatch, even though the choice of which overload to call is made at compile-time.
Am I correct here, or am I missing something?
Thanks!
OK, I understood the subtle difference where function overloading is different from multiple-dispatch.
Basically, the difference is whether which method to call is chosen at run-time or compile-time. Now, I know everybody's said this, but without a clear example this sounds VERY obvious, given that C# is statically typed and multiple-dispatch languages (apparently to me, at least) seem to be dynamically typed. Up to now, with just that definition multiple-dispatch and function overloading sounded exactly the same to me.
The case where this makes a real difference is when:
you have two overloads of a method that differ in the type of a parameter (CaptureSpaceShip(IRebelAllianceShip ship) and CaptureSpaceShip(XWing ship)),
the two types (IRebelAllianceShip and XWing) are polymorphic, and
you call the method with a reference declared as the higher type, which actually points to an object of the lower type.
Full Example:
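void CaptureSpaceShip(IRebelAllianceShip ship) { Console.WriteLine("IRebelAllianceShip overload"); }
void CaptureSpaceShip(XWing ship) { Console.WriteLine("XWing overload"); }
void Main() {
IRebelAllianceShip theShip = new XWing();
// bound at compile time from the declared type of theShip
CaptureSpaceShip(theShip);
}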
XWing obviously implements IRebelAllianceShip.
In this case, the first method will be called, whereas if C# implemented multiple-dispatch, the second method would be called.
Sorry about the doc rehash... This seems to me the clearest way to explain this difference, rather than just reading the definitions for each dispatch method.
For a more formal explanation:
http://en.wikipedia.org/wiki/Double_dispatch#Double_dispatch_is_more_than_function_overloading
For those that find this article using a search engine, C# 4.0 introduces the dynamic keyword. The code would look like the following.
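void CaptureSpaceShip(IRebelAllianceShip ship) { Console.WriteLine("IRebelAllianceShip overload"); }
void CaptureSpaceShip(XWing ship) { Console.WriteLine("XWing overload"); }
void Main() {
IRebelAllianceShip theShip = new XWing();
// bound at run time from the actual type of theShip, so the XWing overload is called
CaptureSpaceShip((dynamic)theShip);
}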
C# is single dispatch, but there are some blog posts which, judging by their titles, look like they are trying to emulate multimethods. If I can get one of the articles to load I will update my answer here.
Maybe somebody will be interested in a good C# example of multiple dispatch using the dynamic keyword (MSDN blog)
class Animal
{
}
class Cat : Animal
{
}
class Dog : Animal
{
}
class Mouse : Animal
{
}
We can create several overloads of the same method, specialized according to different combinations of their parameter types:
void ReactSpecialization(Animal me, Animal other)
{
Console.WriteLine("{0} is not interested in {1}.", me, other);
}
void ReactSpecialization(Cat me, Dog other)
{
Console.WriteLine("Cat runs away from dog.");
}
void ReactSpecialization(Cat me, Mouse other)
{
Console.WriteLine("Cat chases mouse.");
}
void ReactSpecialization(Dog me, Cat other)
{
Console.WriteLine("Dog chases cat.");
}
And now the magic part:
void React(Animal me, Animal other)
{
ReactSpecialization(me as dynamic, other as dynamic);
}
This works because of the "as dynamic" cast, which tells the C# compiler, rather than just calling ReactSpecialization(Animal, Animal), to dynamically examine the type of each parameter and make a runtime choice about which method overload to invoke.
To prove it really works:
void Test()
{
Animal cat = new Cat();
Animal dog = new Dog();
Animal mouse = new Mouse();
React(cat, dog);
React(cat, mouse);
React(dog, cat);
React(dog, mouse);
}
Output:
Cat runs away from dog.
Cat chases mouse.
Dog chases cat.
Dog is not interested in Mouse.
Wikipedia says that C# 4.0 (dynamic) is a "multiple dispatch" language.
I also think that languages such as Java, C# (prior to 4.0), C++ are single dispatch.
C# does not support multiple dispatch. The Visitor design pattern emulates something that could be described as multiple dispatch, even though the Visitor pattern mainly focuses on separating the algorithm from a class hierarchy.
According to the cited Wikipedia article, multiple dispatch, by definition, is based on the runtime types of the objects involved, so C# and VB.net don't use it, because the decision is made, as you state, at compile-time.
The GoF Visitor Pattern is an example of how to do double dispatch. Scott Meyers' "More Effective C++" shows you how to do it in C++. Here's a link from Dr Dobbs that talks about how to do double dispatch in both Java and C++.
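For readers who want to see the Visitor approach in C#, here is a minimal sketch with hypothetical names (IAnimalVisitor, SoundVisitor, and a two-class Animal hierarchy). The first dispatch is the virtual Accept call, resolved on the runtime type of the animal; the second is the interface call on the visitor.

```csharp
using System;

public interface IAnimalVisitor<TResult>
{
    TResult VisitCat(Cat cat);
    TResult VisitDog(Dog dog);
}

public abstract class Animal
{
    // First dispatch: virtual call resolved on the animal's runtime type.
    public abstract TResult Accept<TResult>(IAnimalVisitor<TResult> visitor);
}

public class Cat : Animal
{
    public override TResult Accept<TResult>(IAnimalVisitor<TResult> visitor)
    {
        // Second dispatch: interface call resolved on the visitor's type.
        return visitor.VisitCat(this);
    }
}

public class Dog : Animal
{
    public override TResult Accept<TResult>(IAnimalVisitor<TResult> visitor)
    {
        return visitor.VisitDog(this);
    }
}

public class SoundVisitor : IAnimalVisitor<string>
{
    public string VisitCat(Cat cat) { return "Meow"; }
    public string VisitDog(Dog dog) { return "Woof"; }
}

public static class VisitorDemo
{
    public static void Main()
    {
        Animal animal = new Dog();   // static type is Animal
        Console.WriteLine(animal.Accept(new SoundVisitor()));
    }
}
```

Compared with the dynamic version shown earlier, this keeps everything statically checked, at the price of having to touch the visitor interface whenever a new Animal subclass is added.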
I understand that this is an old question..
In .Net 4.0 you can use dynamic keyword for multi methods... Take a look at the following for an example .Net 4.0 Optimized code for refactoring existing "if" conditions and "is" operator
