How do interfaces like IEnumerable work without proper implementation? - c#

From what I understand on interfaces, is that in order to use them, you must declare that a class is implementing it by adding the name of the interface after a colon and then, implement the methods.
I'm currently learning about Enumerators, IEnumerable etc. and this got me confused. Here's an example of what I mean:
static IEnumerable<int> Fibs(int fibCount)
{
for (int i = 0, prevFib = 1, curFib = 1; i < fibCount; i++) {
yield return prevFib;
int newFib = prevFib + curFib;
prevFib = curFib;
curFib = newFib;
}
}
IEnumerable seems to be a normal interface as any other, I even checked the method definition and that's what it pretty much seems like.
How is it possible that I can use an interface as a type/return type in the method definition and when/how do I know I should use certain interfaces as types like in this example?
EDIT: I really doubt it has anything to do with the yield keyword since a lot of interfaces are used as properties this way for example in MVC in Models and passed like it to Views. Example:
public IEnumerable<Category> Categories {get;set;}

There is extra magic when you use yield keyword, i.e. create an iterator block. The compiler makes a state machine for you.
So C# has a special feature here, and it only has this feature with IEnumerable<>. So it is the C# language which is magical.
The interface IEnumerable<> in itself is a boring ordinary type. No magic in it.
Note: Technically, the yield magic works when the "formal" return type is either IEnumerable<>, IEnumerator<>, IEnumerable, or IEnumerator, but usually you use the first of these. Do not go non-generic, of course.

IEnumerable is a special case. The yield return statement instructs the compiler to add the code that implements IEnumerable.
As for your edit:
If an interface is used a the type of a property, any object of a class that implements this interface can be assigned and the property will return an object that implements this interface. In your example, any collection of categories that implements IEnumerable<Category> can be assigned to the property, e.g. a List<Category>. In comparison to using just List<Category>, using the interface allows a wider range of objects to be assigned. The interface defines the abstract requirements that are relevant for the property.

In C#, interfaces are a special "kind" of type that differs from a class in a few key ways:
You cannot include any implementation code of their methods.
A single class can "inherit from" (called "implementing") as many as it wants.
You cannot instantiate a new instance of an interface.
(There is also some language features associated specifically with interfaces, such as explicit implementations, but those aren't important for this discussion.)
Beyond that, interfaces can be use pretty much anywhere you can use any other reference type. That includes defining fields, properties, or local variables with interface types, or using them as parameter types or return types on methods.
The trick is, if you define a property as, say, an IEnumerable<int>, and you want to set it's value, you cannot do this:
public IEnumerable<int> Numbers { get; set; }
...
this.Numbers = new IEnumerable<int>();
That's an error. You can't create a new instance of an interface, because it's just a "template" -- there's nothing "behind" it to actually do anything. However, you can do this:
public IEnumerable<int> Numbers { get; set; }
...
this.Numbers = new List<int>();
Because List<T> implements IEnumerable<T>, the compiler will automatically do the type conversion to make the assignment work. Any concrete class that implements IEnumerable<> can be assigned to a property of type IEnumerable<>, which is why you see interface property types so often. It allows you to change the underlying concrete type (maybe you want to change List<T> to ObservableCollection<T>, but the users of your class neither know nor care when you do.
The same goes for methods with an interface return type, except there's an extra option here that C# throws in as a bonus:
public IEnumerable<string> GetName()
{
// this fails.
return new IEnumerable<string>();
// this works.
return new List<String>();
// this also works because magic!~
yield return "hello";
yield return "there";
yield return "!";
}
That last case is a special form of "syntactic sugar" that C# provides because it's such a common requirement. As other people have mentioned, the compiler specifically looks for yield return statements on methods that return IEnumerable or IEnumerator (both generic and non-generic versions) and does some hefty rewriting of the code.
Behind the scenes, C# is creating a hidden class, which implements IEnumerable<string>, and implementing it's GetEnumerator method to return an IEnumerator<string> object that provides those three string values. That would be a lot of boilerplate code for you to write, although you could certainly write it yourself. In previous versions of C#, there was no yield and you did have to write it yourself.
If you really want to know, you can find the C# equivalent here, among other places. In essence, it takes the method that contains your yield statements and creates a IEnumerator<>.MoveNext method out of it, but turning it into a state machine. It uses the equivalent of labels and goto statements to jump back to the correct place each time a consumer calls MoveNext on the same instance. Also, as I understand it, it does things that you can't actually do in C# (it jumps into and out of loops) but that are legal in IL code, so it's implementation is more efficient that what you could write yourself.
But once you get past the secret cause of the yield keyword, you're still doing the same thing. You're still creating a class, that implements IEnumerable<>, and using that as the return value for your method.

For your specific example, even though the method return type is IEnumerable<int>, the actual return type will be a type that implements IEnumerable<int>. As others have mentioned, the ‘yield return’ results in a type which implements IEnumerable<int>. In this specific case, you don’t know what that type is. When I run this through the debugger and do a result.GetType() (where result is returned from the Fibs method), I see that result is of type <Fibs>d__0, which sounds a little strange to me. So instead of worrying about that strange-sounding type, we can just treat it like an IEnumerable<int>, because all we really want to be able to do is to iterate over it, which is the behavior that IEnumerable<int> exposes.
That’s for your specific example, but the idea is the same elsewhere. By using an interface, you are saying that you don’t care exactly what type is used/returned, as long as it exposes certain behavior or properties. For example if I have an interface IFoo that exposes a method DoSomething(), and I have a method that returns IFoo, then I can return anything that implements IFoo, but no matter what I return, the caller of the method is sure that it can DoSomething() with the object. Similarly, if I have a method that takes an IFoo, then the method is sure that it will be able to DoSomething() with that parameter.
For me interfaces also help me design my classes. For example I write the interface first to determine what’s important for the class to be able to do and have, and then I create the concrete class that implements that interface. And when I’m writing code that uses the class, I ask for the interface rather than the concrete type.
And there’s all sorts of other reasons to use interfaces, for example to create mocks for testing, for dependency injection, for fluent API design, and probably a hundred other reasons.

Related

C# interface cannot be instantiated so why can I use IEnumerable without specifying concrete type

I understand that interfaces cannot be instantiated but that an instance of an object that implements the interface can be created.
But in this example, I am working with an interface variable and it's never instantiated to an object of a class.
public static IEnumerable<Person> GetPeople()
{
IEnumerable<Person> people = AMethod<IEnumerable<Person>>();
return people;
}
var people = MyClass.GetPeople();
// ... operations on the people variable
So at no point am I referring to an array or a list or another type that implements IEnumerable. When I debug this code the debugger shows that the people variable is of type IEnumerable< Person >. But how could it be if interfaces are not "the real thing" but only a contract? Like in memory is this an array, a list, something else? I am following a course on interfaces but I can't wrap my head around this.
AMethod<IEnumerable<Person>>()
This is calling a method that looks something like this:
public T AMethod <T> {
// Do something to create an instance of T
}
IEnumerable<Person> is what you're passing to the <T> of the AMethod signature. It's a generic method.
The result of that is that something concrete (like Person[], List<Person> etc) is being created to fit the interface IEnumerable<Person>. But it's created by AMethod, not any of the code you've posted.
If you're curious about exactly what type people actually is, and can't just look at AMethod, calling Type concreteType = people.GetType(); will tell you.
The useful thing with an abstraction like IEnumerable<T> is that is implemented by many classes.
For instance:
a List<T> is an IEnumerable<T>.
An ImmutableArray<T> is an IEnumerable<T>.
An array T[] is an IEnumerable<T> (though it's a bit of compiler magic, see comments).
Saying that a method returns an IEnumerable<T> just mean this method will return something that implements IEnumerable, you don't need to care which one, but you can rely on the fact that the interface is implemented and thus use foreach and linq methods on it.
If all you need is to be able to enumerate something with foreach, or apply linq methods, you can hide the actual implementation (array or list or whatever) and work with the IEnumerable instead.
On a general note, why do you want to use an abstraction? Here is a short start answer.
Imagine you rely on a specific implementation, like List<T>, and later you pass this to other functions as List<T> parameters etc... And your code grows like this.
One day you realize that List<T> is not a good choice for some reason (for performance, or because you now use another library that returns some other kind of class).
Now you need to untangle everything that was relying on this List assumption.

How do arrays in C# partially implement IList<T>?

So as you may know, arrays in C# implement IList<T>, among other interfaces. Somehow though, they do this without publicly implementing the Count property of IList<T>! Arrays have only a Length property.
Is this a blatant example of C#/.NET breaking its own rules about the interface implementation or am I missing something?
So as you may know, arrays in C# implement IList<T>, among other interfaces
Well, yes, erm no, not really. This is the declaration for the Array class in the .NET 4 framework:
[Serializable, ComVisible(true)]
public abstract class Array : ICloneable, IList, ICollection, IEnumerable,
IStructuralComparable, IStructuralEquatable
{
// etc..
}
It implements System.Collections.IList, not System.Collections.Generic.IList<>. It can't, Array is not generic. Same goes for the generic IEnumerable<> and ICollection<> interfaces.
But the CLR creates concrete array types on the fly, so it could technically create one that implements these interfaces. This is however not the case. Try this code for example:
using System;
using System.Collections.Generic;
class Program {
static void Main(string[] args) {
var goodmap = typeof(Derived).GetInterfaceMap(typeof(IEnumerable<int>));
var badmap = typeof(int[]).GetInterfaceMap(typeof(IEnumerable<int>)); // Kaboom
}
}
abstract class Base { }
class Derived : Base, IEnumerable<int> {
public IEnumerator<int> GetEnumerator() { return null; }
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
The GetInterfaceMap() call fails for a concrete array type with "Interface not found". Yet a cast to IEnumerable<> works without a problem.
This is quacks-like-a-duck typing. It is the same kind of typing that creates the illusion that every value type derives from ValueType which derives from Object. Both the compiler and the CLR have special knowledge of array types, just as they do of value types. The compiler sees your attempt at casting to IList<> and says "okay, I know how to do that!". And emits the castclass IL instruction. The CLR has no trouble with it, it knows how to provide an implementation of IList<> that works on the underlying array object. It has built-in knowledge of the otherwise hidden System.SZArrayHelper class, a wrapper that actually implements these interfaces.
Which it doesn't do explicitly like everybody claims, the Count property you asked about looks like this:
internal int get_Count<T>() {
//! Warning: "this" is an array, not an SZArrayHelper. See comments above
//! or you may introduce a security hole!
T[] _this = JitHelpers.UnsafeCast<T[]>(this);
return _this.Length;
}
Yes, you can certainly call that comment "breaking the rules" :) It is otherwise darned handy. And extremely well hidden, you can check this out in SSCLI20, the shared source distribution for the CLR. Search for "IList" to see where the type substitution takes place. The best place to see it in action is clr/src/vm/array.cpp, GetActualImplementationForArrayGenericIListMethod() method.
This kind of substitution in the CLR is pretty mild compared to what happens in the language projection in the CLR that allows writing managed code for WinRT (aka Metro). Just about any core .NET type gets substituted there. IList<> maps to IVector<> for example, an entirely unmanaged type. Itself a substitution, COM doesn't support generic types.
Well, that was a look at what happens behind the curtain. It can be very uncomfortable, strange and unfamiliar seas with dragons living at the end of the map. It can be very useful to make the Earth flat and model a different image of what's really going on in managed code. Mapping it to everybody favorite answer is comfortable that way. Which doesn't work so well for value types (don't mutate a struct!) but this one is very well hidden. The GetInterfaceMap() method failure is the only leak in the abstraction that I can think of.
New answer in the light of Hans's answer
Thanks to the answer given by Hans, we can see the implementation is somewhat more complicated than we might think. Both the compiler and the CLR try very hard to give the impression that an array type implements IList<T> - but array variance makes this trickier. Contrary to the answer from Hans, the array types (single-dimensional, zero-based anyway) do implement the generic collections directly, because the type of any specific array isn't System.Array - that's just the base type of the array. If you ask an array type what interfaces it supports, it includes the generic types:
foreach (var type in typeof(int[]).GetInterfaces())
{
Console.WriteLine(type);
}
Output:
System.ICloneable
System.Collections.IList
System.Collections.ICollection
System.Collections.IEnumerable
System.Collections.IStructuralComparable
System.Collections.IStructuralEquatable
System.Collections.Generic.IList`1[System.Int32]
System.Collections.Generic.ICollection`1[System.Int32]
System.Collections.Generic.IEnumerable`1[System.Int32]
For single-dimensional, zero-based arrays, as far as the language is concerned, the array really does implement IList<T> too. Section 12.1.2 of the C# specification says so. So whatever the underlying implementation does, the language has to behave as if the type of T[] implements IList<T> as with any other interface. From this perspective, the interface is implemented with some of the members being explicitly implemented (such as Count). That's the best explanation at the language level for what's going on.
Note that this only holds for single-dimensional arrays (and zero-based arrays, not that C# as a language says anything about non-zero-based arrays). T[,] doesn't implement IList<T>.
From a CLR perspective, something funkier is going on. You can't get the interface mapping for the generic interface types. For example:
typeof(int[]).GetInterfaceMap(typeof(ICollection<int>))
Gives an exception of:
Unhandled Exception: System.ArgumentException: Interface maps for generic
interfaces on arrays cannot be retrived.
So why the weirdness? Well, I believe it's really due to array covariance, which is a wart in the type system, IMO. Even though IList<T> is not covariant (and can't be safely), array covariance allows this to work:
string[] strings = { "a", "b", "c" };
IList<object> objects = strings;
... which makes it look like typeof(string[]) implements IList<object>, when it doesn't really.
The CLI spec (ECMA-335) partition 1, section 8.7.1, has this:
A signature type T is compatible-with a signature type U if and only if at least one of the following holds
...
T is a zero-based rank-1 array V[], and U is IList<W>, and V is array-element-compatible-with W.
(It doesn't actually mention ICollection<W> or IEnumerable<W> which I believe is a bug in the spec.)
For non-variance, the CLI spec goes along with the language spec directly. From section 8.9.1 of partition 1:
Additionally, a created vector with element type T, implements the interface System.Collections.Generic.IList<U>, where U := T. (§8.7)
(A vector is a single-dimensional array with a zero base.)
Now in terms of the implementation details, clearly the CLR is doing some funky mapping to keep the assignment compatibility here: when a string[] is asked for the implementation of ICollection<object>.Count, it can't handle that in quite the normal way. Does this count as explicit interface implementation? I think it's reasonable to treat it that way, as unless you ask for the interface mapping directly, it always behaves that way from a language perspective.
What about ICollection.Count?
So far I've talked about the generic interfaces, but then there's the non-generic ICollection with its Count property. This time we can get the interface mapping, and in fact the interface is implemented directly by System.Array. The documentation for the ICollection.Count property implementation in Array states that it's implemented with explicit interface implementation.
If anyone can think of a way in which this kind of explicit interface implementation is different from "normal" explicit interface implementation, I'd be happy to look into it further.
Old answer around explicit interface implementation
Despite the above, which is more complicated because of the knowledge of arrays, you can still do something with the same visible effects through explicit interface implementation.
Here's a simple standalone example:
public interface IFoo
{
void M1();
void M2();
}
public class Foo : IFoo
{
// Explicit interface implementation
void IFoo.M1() {}
// Implicit interface implementation
public void M2() {}
}
class Test
{
static void Main()
{
Foo foo = new Foo();
foo.M1(); // Compile-time failure
foo.M2(); // Fine
IFoo ifoo = foo;
ifoo.M1(); // Fine
ifoo.M2(); // Fine
}
}
IList<T>.Count is implemented explicitly:
int[] intArray = new int[10];
IList<int> intArrayAsList = (IList<int>)intArray;
Debug.Assert(intArrayAsList.Count == 10);
This is done so that when you have a simple array variable, you don't have both Count and Length directly available.
In general, explicit interface implementation is used when you want to ensure that a type can be used in a particular way, without forcing all consumers of the type to think about it that way.
Edit: Whoops, bad recall there. ICollection.Count is implemented explicitly. The generic IList<T> is handled as Hans descibes below.
Explicit interface implementation. In short, you declare it like void IControl.Paint() { } or int IList<T>.Count { get { return 0; } }.
It's no different than an explicit interface implementation of IList. Just because you implement the interface doesn't mean its members need to appear as class members. It does implement the Count property, it just doesn't expose it on X[].
With reference-sources being available:
//----------------------------------------------------------------------------------------
// ! READ THIS BEFORE YOU WORK ON THIS CLASS.
//
// The methods on this class must be written VERY carefully to avoid introducing security holes.
// That's because they are invoked with special "this"! The "this" object
// for all of these methods are not SZArrayHelper objects. Rather, they are of type U[]
// where U[] is castable to T[]. No actual SZArrayHelper object is ever instantiated. Thus, you will
// see a lot of expressions that cast "this" "T[]".
//
// This class is needed to allow an SZ array of type T[] to expose IList<T>,
// IList<T.BaseType>, etc., etc. all the way up to IList<Object>. When the following call is
// made:
//
// ((IList<T>) (new U[n])).SomeIListMethod()
//
// the interface stub dispatcher treats this as a special case, loads up SZArrayHelper,
// finds the corresponding generic method (matched simply by method name), instantiates
// it for type <T> and executes it.
//
// The "T" will reflect the interface used to invoke the method. The actual runtime "this" will be
// array that is castable to "T[]" (i.e. for primitivs and valuetypes, it will be exactly
// "T[]" - for orefs, it may be a "U[]" where U derives from T.)
//----------------------------------------------------------------------------------------
sealed class SZArrayHelper {
// It is never legal to instantiate this class.
private SZArrayHelper() {
Contract.Assert(false, "Hey! How'd I get here?");
}
/* ... snip ... */
}
Specifically this part:
the interface stub dispatcher treats this as a special case, loads up
SZArrayHelper, finds the corresponding generic method (matched simply
by method name), instantiates it for type and executes it.
(Emphasis mine)
Source (scroll up).

Generic list of lists, converting List<List<T>> to IList<IList<T>>

We're using a class library that performs calculations on 3D measurement data, it exposes a method:
MeasurementResults Calculate(IList<IList<Measurement>> data)
I would like to allow calling this method with any indexable list of lists (of Measurement of course), for example both:
Measurement[][] array;
List<List<Measurement>> list;
Calling the method using the array works fine, which is a bit strange. Is there some compiler trick at work here? Trying to call with the List gives the familiar error:
cannot convert from 'List<List<Measurement>>' to 'IList<IList<Measurement>>'
So, I have written a facade class (containing some other things as well), with a method that splits the generic definition between the argument and method and converts to the IList type if necessary:
MeasurementResults Calculate<T>(IList<T> data) where T : IList<Measurement>
{
IList<IList<Measurement>> converted = data as IList<IList<Measurement>>;
if(converted == null)
converted = data.Select(o => o as IList<Measurement>).ToList();
return Calculate(converted);
}
Is this a good way to solve the problem, or do you have a better idea?
Also, while testing different solutions to the problem, I found out that if the class library method had been declared with IEnumerable instead of IList, it is ok to call the method using both the array and the List:
MeasurementResults Calculate(IEnumerable<IEnumerable<Measurement>> data)
I suspect that there is some compiler trick at work again, I wonder why they haven't made IList work with List while they were at it?
It's okay to do this with IEnumerable<T>, because that is covariant in T. It wouldn't be safe to do that with IList<T>, as that is used for both "input" and "output".
In particular, consider:
List<List<Foo>> foo = new List<List<Foo>>();
List<IList<Foo>> bar = foo;
bar.Add(new Foo[5]); // Arrays implement IList<T>
// Eh?
List<Foo> firstList = foo[0];
See MSDN for more information about generic covariance and contravariance.
Suppose we have a method
void Bar(IList<IList<int>> foo)
Within Bar, it should then be perfectly admissible to add a int[] to foo - after all, int[] implements IList<int>, doesn't it?
But if we had called Bar with a List<List<int>>, we would now be trying to add a int[] to something that only accepts a List<int>! which would be bad. So the compiler doesn't let you do this.
Also, while testing different solutions to the problem, I found out that if the class library method had been declared with IEnumerable instead of IList, it is ok to call the method using both the array and the List:
Indeed, because if the behavioural contract just says "I can output ints", nothing can go wrong. The key terms for further research are covariance and contravariance, and no one* can ever remember which is which.
In your particular case, if Calculate is only ever reading its input, changing it to consume IEnumerable is absolutely the right thing to do - it both allows you to pass in any qualifying objects, and it further communicates to anyone reading the signature that this method is intentionally designed to only consume, not mutate, its input.
* well, mostly no one
There are a few things with you implementation that I would change. First of all it might insert nulls into the result where there were objects in the original. Personally I would prefer it to throw.
MeasurementResults Calculate<T>(IList<T> data) where T : IList<Measurement>
{
return Calculate(data as IList<IList<Measurement>>
?? data.Cast<IList<Measurement>>().ToList());
}
Is a short example a like what I would personally do in your case, though I doubt I would actually implement the method. The interface was written like that for a reason (I hope) and I would try to use that knowledge in my implementation on top instead of fighting it if possible. Any changes made to the list in the method will not be reflected in the original list if you use your code (or anything like that). It's a new list being passed to the method and since the signature asks for an IList it's possible that changes can be made.
Aside from not potentially having nulls in the list instead of objects I've change to use the cast method since that's basically what you are trying to accomplish. (Cast does not allow for custom conversion operators).

Casting from interface to underlying type

Reviewing an earlier question on SO, I started thinking about the situation where a class exposes a value, such as a collection, as an interface implemented by the value's type. In the code example below, I am using a List, and exposing that list as IEnumerable.
Exposing the list through the IEnumerable interface defines the intent that the list only be enumerated over, not modified. However, since the instance can be re-cast back to a list, the list itself can of course be modified.
I also include in the sample a version of the method that prevents modification by copying the list item references to a new list each time the method is called, thereby preventing changes to the underlying list.
So my question is, should all code exposing a concrete type as an implemented interface do so by means of a copy operation? Would there be value in a language construct that explicitly indicates "I want to expose this value through an interface, and calling code should only be able to use this value through the interface"? What techniques do others use to prevent unintended side-effects like these when exposing concrete values through their interfaces.
Please note, I understand that the behavior illustrated is expected behavior. I am not claiming this behavior is wrong, just that it does allow use of functionality other than the expressed intent. Perhaps I am assigning too much significance to the interface - thinking of it as a functionality constraint. Thoughts?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace TypeCastTest
{
class Program
{
static void Main(string[] args)
{
// Demonstrate casting situation
Automobile castAuto = new Automobile();
List<string> doorNamesCast = (List<string>)castAuto.GetDoorNamesUsingCast();
doorNamesCast.Add("Spare Tire");
// Would prefer this prints 4 names,
// actually prints 5 because IEnumerable<string>
// was cast back to List<string>, exposing the
// Add method of the underlying List object
// Since the list was cast to IEnumerable before being
// returned, the expressed intent is that calling code
// should only be able to enumerate over the collection,
// not modify it.
foreach (string doorName in castAuto.GetDoorNamesUsingCast())
{
Console.WriteLine(doorName);
}
Console.WriteLine();
// --------------------------------------
// Demonstrate casting defense
Automobile copyAuto = new Automobile();
List<string> doorNamesCopy = (List<string>)copyAuto.GetDoorNamesUsingCopy();
doorNamesCopy.Add("Spare Tire");
// This returns only 4 names,
// because the IEnumerable<string> that is
// returned is from a copied List<string>, so
// calling the Add method of the List object does
// not modify the underlying collection
foreach (string doorName in copyAuto.GetDoorNamesUsingCopy())
{
Console.WriteLine(doorName);
}
Console.ReadLine();
}
}
public class Automobile
{
private List<string> doors = new List<string>();
public Automobile()
{
doors.Add("Driver Front");
doors.Add("Passenger Front");
doors.Add("Driver Rear");
doors.Add("Passenger Rear");
}
public IEnumerable<string> GetDoorNamesUsingCopy()
{
return new List<string>(doors).AsEnumerable<string>();
}
public IEnumerable<string> GetDoorNamesUsingCast()
{
return doors.AsEnumerable<string>();
}
}
}
One way you can prevent this is by using AsReadOnly() to prevent any such nefariousness. I think the real answer is though, you should never be relying on anything other than the exposed interface/contract in terms of the return types, etc. Doing anything else defies encapsulation, prevents you from swapping out your implementations for others that don't use a List but instead just a T[], etc, etc.
Edit:
And down-casting like you mention is basically a violation of the Liskov Substition Principle, to get all technical and stuff.
In a situation like this, you could define your own collection class which implements IEnumerable<T>. Internally, your collection could keep a List<T> and then you could just return the enumerator of the underlying list:
public class MyList : IEnumerable<string>
{
private List<string> internalList;
// ...
IEnumerator<string> IEnumerable<string>.GetEnumerator()
{
return this.internalList.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.internalList.GetEnumerator();
}
}
An interface is a constraint on the implementation of a minimum set of things it must do (even if "doing" is no more than throwing a NotSupportedException; or even a NotImplementedException). It is not a constraint that either prevents the implementation from doing more, or on the calling code.
One thing I've learned working with .NET (and with some people who are quick to jump to a hack solution) is that if nothing else, reflection will often allow people to by pass your "protections."
Interfaces are not iron shackles of programming, they're a promise that your code makes to any other code saying "I can definitely do these things." If you "cheat" and cast the interface object into some other object because you, the programmer, know something that the program doesn't, then you're breaking that contract. The consequence is poorer maintainability and a reliance that no one ever mess up anything in that chain of execution, lest some other object get sent down that doesn't cast correctly.
Other tricks like making things readonly or hiding the actual list behind a wrapper are only stop-gaps. You could easily dig into the type using reflection to pull out the private list if you really wanted it. And I think there are attributes you can apply to types to prevent people from reflecting into them.
Likewise, readonly lists aren't really. I could probably figure out a way to modify the list itself. And I can almost certainly modify the items on the list. So a readonly isn't enough, nor is a copy or an array. You need a deep copy (clone) of the original list in order to actually protect the data, to some degree.
But the real question is, why are you fighting so hard against the contract that you wrote. Sometimes reflection hacking is a handy workaround when someone else's library is poorly designed and didn't expose something that it needs to (or a bug requires that you go digging to fix it.) But when you have control over the interface AND the consumer of the interface, there's no excuse to not make the publicly exposed interface as robust as you need it to be to get your work done.
Or in short: If you need a list, don't return IEnumerable, return a List. If you've got an IEnumerable but you actually needed a list, then its safer to make a new list from that IEnum and use that. There are very few reasons (and even fewer, maybe no, good reasons) to cast up to a type simply because "I know it's actually a list, so this will work."
Yeah, you can take steps to try and prevent people from doing that, but 1) the harder you fight people who insist on breaking the system, the harder they will try to break it and 2) they're only looking for more rope, and eventually they'll get enough to hang themselves.

What is cool about generics, why use them?

I thought I'd offer this softball to whomever would like to hit it out of the park. What are generics, what are the advantages of generics, why, where, how should I use them? Please keep it fairly basic. Thanks.
Allows you to write code/use library methods which are type-safe, i.e. a List<string> is guaranteed to be a list of strings.
As a result of generics being used the compiler can perform compile-time checks on code for type safety, i.e. are you trying to put an int into that list of strings? Using an ArrayList would cause that to be a less transparent runtime error.
Faster than using objects as it either avoids boxing/unboxing (where .net has to convert value types to reference types or vice-versa) or casting from objects to the required reference type.
Allows you to write code which is applicable to many types with the same underlying behaviour, i.e. a Dictionary<string, int> uses the same underlying code as a Dictionary<DateTime, double>; using generics, the framework team only had to write one piece of code to achieve both results with the aforementioned advantages too.
I really hate to repeat myself. I hate typing the same thing more often than I have to. I don't like restating things multiple times with slight differences.
Instead of creating:
class MyObjectList {
MyObject get(int index) {...}
}
class MyOtherObjectList {
MyOtherObject get(int index) {...}
}
class AnotherObjectList {
AnotherObject get(int index) {...}
}
I can build one reusable class... (in the case where you don't want to use the raw collection for some reason)
class MyList<T> {
T get(int index) { ... }
}
I'm now 3x more efficient and I only have to maintain one copy. Why WOULDN'T you want to maintain less code?
This is also true for non-collection classes such as a Callable<T> or a Reference<T> that has to interact with other classes. Do you really want to extend Callable<T> and Future<T> and every other associated class to create type-safe versions?
I don't.
Not needing to typecast is one of the biggest advantages of Java generics, as it will perform type checking at compile-time. This will reduce the possibility of ClassCastExceptions which can be thrown at runtime, and can lead to more robust code.
But I suspect that you're fully aware of that.
Every time I look at Generics it gives
me a headache. I find the best part of
Java to be it's simplicity and minimal
syntax and generics are not simple and
add a significant amount of new
syntax.
At first, I didn't see the benefit of generics either. I started learning Java from the 1.4 syntax (even though Java 5 was out at the time) and when I encountered generics, I felt that it was more code to write, and I really didn't understand the benefits.
Modern IDEs make writing code with generics easier.
Most modern, decent IDEs are smart enough to assist with writing code with generics, especially with code completion.
Here's an example of making an Map<String, Integer> with a HashMap. The code I would have to type in is:
Map<String, Integer> m = new HashMap<String, Integer>();
And indeed, that's a lot to type just to make a new HashMap. However, in reality, I only had to type this much before Eclipse knew what I needed:
Map<String, Integer> m = new Ha Ctrl+Space
True, I did need to select HashMap from a list of candidates, but basically the IDE knew what to add, including the generic types. With the right tools, using generics isn't too bad.
In addition, since the types are known, when retrieving elements from the generic collection, the IDE will act as if that object is already an object of its declared type -- there is no need to casting for the IDE to know what the object's type is.
A key advantage of generics comes from the way it plays well with new Java 5 features. Here's an example of tossing integers in to a Set and calculating its total:
Set<Integer> set = new HashSet<Integer>();
set.add(10);
set.add(42);
int total = 0;
for (int i : set) {
total += i;
}
In that piece of code, there are three new Java 5 features present:
Generics
Autoboxing and unboxing
For-each loop
First, generics and autoboxing of primitives allow the following lines:
set.add(10);
set.add(42);
The integer 10 is autoboxed into an Integer with the value of 10. (And same for 42). Then that Integer is tossed into the Set which is known to hold Integers. Trying to throw in a String would cause a compile error.
Next, for for-each loop takes all three of those:
for (int i : set) {
total += i;
}
First, the Set containing Integers are used in a for-each loop. Each element is declared to be an int and that is allowed as the Integer is unboxed back to the primitive int. And the fact that this unboxing occurs is known because generics was used to specify that there were Integers held in the Set.
Generics can be the glue that brings together the new features introduced in Java 5, and it just makes coding simpler and safer. And most of the time IDEs are smart enough to help you with good suggestions, so generally, it won't a whole lot more typing.
And frankly, as can be seen from the Set example, I feel that utilizing Java 5 features can make the code more concise and robust.
Edit - An example without generics
The following is an illustration of the above Set example without the use of generics. It is possible, but isn't exactly pleasant:
Set set = new HashSet();
set.add(10);
set.add(42);
int total = 0;
for (Object o : set) {
total += (Integer)o;
}
(Note: The above code will generate unchecked conversion warning at compile-time.)
When using non-generics collections, the types that are entered into the collection is objects of type Object. Therefore, in this example, a Object is what is being added into the set.
set.add(10);
set.add(42);
In the above lines, autoboxing is in play -- the primitive int value 10 and 42 are being autoboxed into Integer objects, which are being added to the Set. However, keep in mind, the Integer objects are being handled as Objects, as there are no type information to help the compiler know what type the Set should expect.
for (Object o : set) {
This is the part that is crucial. The reason the for-each loop works is because the Set implements the Iterable interface, which returns an Iterator with type information, if present. (Iterator<T>, that is.)
However, since there is no type information, the Set will return an Iterator which will return the values in the Set as Objects, and that is why the element being retrieved in the for-each loop must be of type Object.
Now that the Object is retrieved from the Set, it needs to be cast to an Integer manually to perform the addition:
total += (Integer)o;
Here, a typecast is performed from an Object to an Integer. In this case, we know this will always work, but manual typecasting always makes me feel it is fragile code that could be damaged if a minor change is made else where. (I feel that every typecast is a ClassCastException waiting to happen, but I digress...)
The Integer is now unboxed into an int and allowed to perform the addition into the int variable total.
I hope I could illustrate that the new features of Java 5 is possible to use with non-generic code, but it just isn't as clean and straight-forward as writing code with generics. And, in my opinion, to take full advantage of the new features in Java 5, one should be looking into generics, if at the very least, allows for compile-time checks to prevent invalid typecasts to throw exceptions at runtime.
If you were to search the Java bug database just before 1.5 was released, you'd find seven times more bugs with NullPointerException than ClassCastException. So it doesn't seem that it is a great feature to find bugs, or at least bugs that persist after a little smoke testing.
For me the huge advantage of generics is that they document in code important type information. If I didn't want that type information documented in code, then I'd use a dynamically typed language, or at least a language with more implicit type inference.
Keeping an object's collections to itself isn't a bad style (but then the common style is to effectively ignore encapsulation). It rather depends upon what you are doing. Passing collections to "algorithms" is slightly easier to check (at or before compile-time) with generics.
Generics in Java facilitate parametric polymorphism. By means of type parameters, you can pass arguments to types. Just as a method like String foo(String s) models some behaviour, not just for a particular string, but for any string s, so a type like List<T> models some behaviour, not just for a specific type, but for any type. List<T> says that for any type T, there's a type of List whose elements are Ts. So List is a actually a type constructor. It takes a type as an argument and constructs another type as a result.
Here are a couple of examples of generic types I use every day. First, a very useful generic interface:
public interface F<A, B> {
public B f(A a);
}
This interface says that for some two types, A and B, there's a function (called f) that takes an A and returns a B. When you implement this interface, A and B can be any types you want, as long as you provide a function f that takes the former and returns the latter. Here's an example implementation of the interface:
F<Integer, String> intToString = new F<Integer, String>() {
public String f(int i) {
return String.valueOf(i);
}
}
Before generics, polymorphism was achieved by subclassing using the extends keyword. With generics, we can actually do away with subclassing and use parametric polymorphism instead. For example, consider a parameterised (generic) class used to calculate hash codes for any type. Instead of overriding Object.hashCode(), we would use a generic class like this:
public final class Hash<A> {
private final F<A, Integer> hashFunction;
public Hash(final F<A, Integer> f) {
this.hashFunction = f;
}
public int hash(A a) {
return hashFunction.f(a);
}
}
This is much more flexible than using inheritance, because we can stay with the theme of using composition and parametric polymorphism without locking down brittle hierarchies.
Java's generics are not perfect though. You can abstract over types, but you can't abstract over type constructors, for example. That is, you can say "for any type T", but you can't say "for any type T that takes a type parameter A".
I wrote an article about these limits of Java generics, here.
One huge win with generics is that they let you avoid subclassing. Subclassing tends to result in brittle class hierarchies that are awkward to extend, and classes that are difficult to understand individually without looking at the entire hierarchy.
Wereas before generics you might have classes like Widget extended by FooWidget, BarWidget, and BazWidget, with generics you can have a single generic class Widget<A> that takes a Foo, Bar or Baz in its constructor to give you Widget<Foo>, Widget<Bar>, and Widget<Baz>.
Generics avoid the performance hit of boxing and unboxing. Basically, look at ArrayList vs List<T>. Both do the same core things, but List<T> will be a lot faster because you don't have to box to/from object.
The best benefit to Generics is code reuse. Lets say that you have a lot of business objects, and you are going to write VERY similar code for each entity to perform the same actions. (I.E Linq to SQL operations).
With generics, you can create a class that will be able to operate given any of the types that inherit from a given base class or implement a given interface like so:
public interface IEntity
{
}
public class Employee : IEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int EmployeeID { get; set; }
}
public class Company : IEntity
{
public string Name { get; set; }
public string TaxID { get; set }
}
public class DataService<ENTITY, DATACONTEXT>
where ENTITY : class, IEntity, new()
where DATACONTEXT : DataContext, new()
{
public void Create(List<ENTITY> entities)
{
using (DATACONTEXT db = new DATACONTEXT())
{
Table<ENTITY> table = db.GetTable<ENTITY>();
foreach (ENTITY entity in entities)
table.InsertOnSubmit (entity);
db.SubmitChanges();
}
}
}
public class MyTest
{
public void DoSomething()
{
var dataService = new DataService<Employee, MyDataContext>();
dataService.Create(new Employee { FirstName = "Bob", LastName = "Smith", EmployeeID = 5 });
var otherDataService = new DataService<Company, MyDataContext>();
otherDataService.Create(new Company { Name = "ACME", TaxID = "123-111-2233" });
}
}
Notice the reuse of the same service given the different Types in the DoSomething method above. Truly elegant!
There's many other great reasons to use generics for your work, this is my favorite.
I just like them because they give you a quick way to define a custom type (as I use them anyway).
So for example instead of defining a structure consisting of a string and an integer, and then having to implement a whole set of objects and methods on how to access an array of those structures and so forth, you can just make a Dictionary
Dictionary<int, string> dictionary = new Dictionary<int, string>();
And the compiler/IDE does the rest of the heavy lifting. A Dictionary in particular lets you use the first type as a key (no repeated values).
Typed collections - even if you don't want to use them you're likely to have to deal with them from other libraries , other sources.
Generic typing in class creation:
public class Foo < T> {
public T get()...
Avoidance of casting - I've always disliked things like
new Comparator {
public int compareTo(Object o){
if (o instanceof classIcareAbout)...
Where you're essentially checking for a condition that should only exist because the interface is expressed in terms of objects.
My initial reaction to generics was similar to yours - "too messy, too complicated". My experience is that after using them for a bit you get used to them, and code without them feels less clearly specified, and just less comfortable. Aside from that, the rest of the java world uses them so you're going to have to get with the program eventually, right?
To give a good example. Imagine you have a class called Foo
public class Foo
{
public string Bar() { return "Bar"; }
}
Example 1
Now you want to have a collection of Foo objects. You have two options, LIst or ArrayList, both of which work in a similar manner.
Arraylist al = new ArrayList();
List<Foo> fl = new List<Foo>();
//code to add Foos
al.Add(new Foo());
f1.Add(new Foo());
In the above code, if I try to add a class of FireTruck instead of Foo, the ArrayList will add it, but the Generic List of Foo will cause an exception to be thrown.
Example two.
Now you have your two array lists and you want to call the Bar() function on each. Since hte ArrayList is filled with Objects, you have to cast them before you can call bar. But since the Generic List of Foo can only contain Foos, you can call Bar() directly on those.
foreach(object o in al)
{
Foo f = (Foo)o;
f.Bar();
}
foreach(Foo f in fl)
{
f.Bar();
}
Haven't you ever written a method (or a class) where the key concept of the method/class wasn't tightly bound to a specific data type of the parameters/instance variables (think linked list, max/min functions, binary search, etc.).
Haven't you ever wish you could reuse the algorthm/code without resorting to cut-n-paste reuse or compromising strong-typing (e.g. I want a List of Strings, not a List of things I hope are strings!)?
That's why you should want to use generics (or something better).
The primary advantage, as Mitchel points out, is strong-typing without needing to define multiple classes.
This way you can do stuff like:
List<SomeCustomClass> blah = new List<SomeCustomClass>();
blah[0].SomeCustomFunction();
Without generics, you would have to cast blah[0] to the correct type to access its functions.
Don't forget that generics aren't just used by classes, they can also be used by methods. For example, take the following snippet:
private <T extends Throwable> T logAndReturn(T t) {
logThrowable(t); // some logging method that takes a Throwable
return t;
}
It is simple, but can be used very elegantly. The nice thing is that the method returns whatever it was that it was given. This helps out when you are handling exceptions that need to be re-thrown back to the caller:
...
} catch (MyException e) {
throw logAndReturn(e);
}
The point is that you don't lose the type by passing it through a method. You can throw the correct type of exception instead of just a Throwable, which would be all you could do without generics.
This is just a simple example of one use for generic methods. There are quite a few other neat things you can do with generic methods. The coolest, in my opinion, is type inferring with generics. Take the following example (taken from Josh Bloch's Effective Java 2nd Edition):
...
Map<String, Integer> myMap = createHashMap();
...
public <K, V> Map<K, V> createHashMap() {
return new HashMap<K, V>();
}
This doesn't do a lot, but it does cut down on some clutter when the generic types are long (or nested; i.e. Map<String, List<String>>).
Generics allow you to create objects that are strongly typed, yet you don't have to define the specific type. I think the best useful example is the List and similar classes.
Using the generic list you can have a List List List whatever you want and you can always reference the strong typing, you don't have to convert or anything like you would with a Array or standard List.
the jvm casts anyway... it implicitly creates code which treats the generic type as "Object" and creates casts to the desired instantiation. Java generics are just syntactic sugar.
I know this is a C# question, but generics are used in other languages too, and their use/goals are quite similar.
Java collections use generics since Java 1.5. So, a good place to use them is when you are creating your own collection-like object.
An example I see almost everywhere is a Pair class, which holds two objects, but needs to deal with those objects in a generic way.
class Pair<F, S> {
public final F first;
public final S second;
public Pair(F f, S s)
{
first = f;
second = s;
}
}
Whenever you use this Pair class you can specify which kind of objects you want it to deal with and any type cast problems will show up at compile time, rather than runtime.
Generics can also have their bounds defined with the keywords 'super' and 'extends'. For example, if you want to deal with a generic type but you want to make sure it extends a class called Foo (which has a setTitle method):
public class FooManager <F extends Foo>{
public void setTitle(F foo, String title) {
foo.setTitle(title);
}
}
While not very interesting on its own, it's useful to know that whenever you deal with a FooManager, you know that it will handle MyClass types, and that MyClass extends Foo.
From the Sun Java documentation, in response to "why should i use generics?":
"Generics provides a way for you to communicate the type of a collection to the compiler, so that it can be checked. Once the compiler knows the element type of the collection, the compiler can check that you have used the collection consistently and can insert the correct casts on values being taken out of the collection... The code using generics is clearer and safer.... the compiler can verify at compile time that the type constraints are not violated at run time [emphasis mine]. Because the program compiles without warnings, we can state with certainty that it will not throw a ClassCastException at run time. The net effect of using generics, especially in large programs, is improved readability and robustness. [emphasis mine]"
Generics let you use strong typing for objects and data structures that should be able to hold any object. It also eliminates tedious and expensive typecasts when retrieving objects from generic structures (boxing/unboxing).
One example that uses both is a linked list. What good would a linked list class be if it could only use object Foo? To implement a linked list that can handle any kind of object, the linked list and the nodes in a hypothetical node inner class must be generic if you want the list to contain only one type of object.
If your collection contains value types, they don't need to box/unbox to objects when inserted into the collection so your performance increases dramatically. Cool add-ons like resharper can generate more code for you, like foreach loops.
Another advantage of using Generics (especially with Collections/Lists) is you get Compile Time Type Checking. This is really useful when using a Generic List instead of a List of Objects.
Single most reason is they provide Type safety
List<Customer> custCollection = new List<Customer>;
as opposed to,
object[] custCollection = new object[] { cust1, cust2 };
as a simple example.
In summary, generics allow you to specify more precisily what you intend to do (stronger typing).
This has several benefits for you:
Because the compiler knows more about what you want to do, it allows you to omit a lot of type-casting because it already knows that the type will be compatible.
This also gets you earlier feedback about the correctnes of your program. Things that previously would have failed at runtime (e.g. because an object couldn't be casted in the desired type), now fail at compile-time and you can fix the mistake before your testing-department files a cryptical bug report.
The compiler can do more optimizations, like avoiding boxing, etc.
A couple of things to add/expand on (speaking from the .NET point of view):
Generic types allow you to create role-based classes and interfaces. This has been said already in more basic terms, but I find you start to design your code with classes which are implemented in a type-agnostic way - which results in highly reusable code.
Generic arguments on methods can do the same thing, but they also help apply the "Tell Don't Ask" principle to casting, i.e. "give me what I want, and if you can't, you tell me why".
I use them for example in a GenericDao implemented with SpringORM and Hibernate which look like this
public abstract class GenericDaoHibernateImpl<T>
extends HibernateDaoSupport {
private Class<T> type;
public GenericDaoHibernateImpl(Class<T> clazz) {
type = clazz;
}
public void update(T object) {
getHibernateTemplate().update(object);
}
#SuppressWarnings("unchecked")
public Integer count() {
return ((Integer) getHibernateTemplate().execute(
new HibernateCallback() {
public Object doInHibernate(Session session) {
// Code in Hibernate for getting the count
}
}));
}
.
.
.
}
By using generics my implementations of this DAOs force the developer to pass them just the entities they are designed for by just subclassing the GenericDao
public class UserDaoHibernateImpl extends GenericDaoHibernateImpl<User> {
public UserDaoHibernateImpl() {
super(User.class); // This is for giving Hibernate a .class
// work with, as generics disappear at runtime
}
// Entity specific methods here
}
My little framework is more robust (have things like filtering, lazy-loading, searching). I just simplified here to give you an example
I, like Steve and you, said at the beginning "Too messy and complicated" but now I see its advantages
Obvious benefits like "type safety" and "no casting" are already mentioned so maybe I can talk about some other "benefits" which I hope it helps.
First of all, generics is a language-independent concept and , IMO, it might make more sense if you think about regular (runtime) polymorphism at the same time.
For example, the polymorphism as we know from object oriented design has a runtime notion in where the caller object is figured out at runtime as program execution goes and the relevant method gets called accordingly depending on the runtime type. In generics, the idea is somewhat similar but everything happens at compile time. What does that mean and how you make use of it?
(Let's stick with generic methods to keep it compact) It means that you can still have the same method on separate classes (like you did previously in polymorphic classes) but this time they're auto-generated by the compiler depend on the types set at compile time. You parametrise your methods on the type you give at compile time. So, instead of writing the methods from scratch for every single type you have as you do in runtime polymorphism (method overriding), you let compilers do the work during compilation. This has an obvious advantage since you don't need to infer all possible types that might be used in your system which makes it far more scalable without a code change.
Classes work the pretty much same way. You parametrise the type and the code is generated by the compiler.
Once you get the idea of "compile time", you can make use "bounded" types and restrict what can be passed as a parametrised type through classes/methods. So, you can control what to be passed through which is a powerful thing especially you've a framework being consumed by other people.
public interface Foo<T extends MyObject> extends Hoo<T>{
...
}
No one can set sth other than MyObject now.
Also, you can "enforce" type constraints on your method arguments which means you can make sure both your method arguments would depend on the same type.
public <T extends MyObject> foo(T t1, T t2){
...
}
Hope all of this makes sense.
I once gave a talk on this topic. You can find my slides, code, and audio recording at http://www.adventuresinsoftware.com/generics/.
Using generics for collections is just simple and clean. Even if you punt on it everywhere else, the gain from the collections is a win to me.
List<Stuff> stuffList = getStuff();
for(Stuff stuff : stuffList) {
stuff.do();
}
vs
List stuffList = getStuff();
Iterator i = stuffList.iterator();
while(i.hasNext()) {
Stuff stuff = (Stuff)i.next();
stuff.do();
}
or
List stuffList = getStuff();
for(int i = 0; i < stuffList.size(); i++) {
Stuff stuff = (Stuff)stuffList.get(i);
stuff.do();
}
That alone is worth the marginal "cost" of generics, and you don't have to be a generic Guru to use this and get value.
Generics also give you the ability to create more reusable objects/methods while still providing type specific support. You also gain a lot of performance in some cases. I don't know the full spec on the Java Generics, but in .NET I can specify constraints on the Type parameter, like Implements a Interface, Constructor , and Derivation.
Enabling programmers to implement generic algorithms - By using generics, programmers can implement generic algorithms that work on collections of different types, can be customized, and are type-safe and easier to read.
Stronger type checks at compile time - A Java compiler applies strong type checking to generic code and issues errors if the code violates type safety. Fixing compile-time errors is easier than fixing runtime errors, which can be difficult to find.
Elimination of casts.

Categories