What is cool about generics, why use them? - c#

I thought I'd offer this softball to whomever would like to hit it out of the park. What are generics, what are the advantages of generics, why, where, how should I use them? Please keep it fairly basic. Thanks.

Allows you to write code/use library methods which are type-safe, i.e. a List<string> is guaranteed to be a list of strings.
As a result of generics being used the compiler can perform compile-time checks on code for type safety, i.e. are you trying to put an int into that list of strings? Using an ArrayList would cause that to be a less transparent runtime error.
Faster than using objects, as it avoids either boxing/unboxing (where .NET has to convert value types to reference types or vice versa) or casting from object to the required reference type.
Allows you to write code which is applicable to many types with the same underlying behaviour, i.e. a Dictionary<string, int> uses the same underlying code as a Dictionary<DateTime, double>; using generics, the framework team only had to write one piece of code to achieve both results with the aforementioned advantages too.
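The compile-time versus runtime difference described above can be sketched in Java as well, with a raw List standing in for ArrayList. This is only an illustration; the class and method names (RawVersusGeneric, rawListFailsAtRuntime, totalLength) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class RawVersusGeneric {
    // Raw list: the compiler accepts anything, so the mistake only
    // surfaces as a ClassCastException when an element is read back.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List names = new ArrayList();
        names.add("Alice");
        names.add(42); // compiles, but this element is not a String
        try {
            for (Object o : names) {
                String s = (String) o; // throws for the Integer element
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic list: names.add(42) would be rejected by the compiler.
    static int totalLength() {
        List<String> names = new ArrayList<String>();
        names.add("Alice");
        names.add("Bob");
        int total = 0;
        for (String s : names) { // no cast needed
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime());
        System.out.println(totalLength());
    }
}
```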

I really hate to repeat myself. I hate typing the same thing more often than I have to. I don't like restating things multiple times with slight differences.
Instead of creating:
class MyObjectList {
MyObject get(int index) {...}
}
class MyOtherObjectList {
MyOtherObject get(int index) {...}
}
class AnotherObjectList {
AnotherObject get(int index) {...}
}
I can build one reusable class... (in the case where you don't want to use the raw collection for some reason)
class MyList<T> {
T get(int index) { ... }
}
I'm now 3x more efficient and I only have to maintain one copy. Why WOULDN'T you want to maintain less code?
This is also true for non-collection classes such as a Callable<T> or a Reference<T> that has to interact with other classes. Do you really want to extend Callable<T> and Future<T> and every other associated class to create type-safe versions?
I don't.
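A minimal runnable sketch of that one reusable class might look like the following. The backing ArrayList and the add and size methods are assumptions added so the example can actually be exercised; only get(int) comes from the outline above.

```java
import java.util.ArrayList;

// One class serves every element type; T is chosen by the caller.
class MyList<T> {
    private final ArrayList<T> items = new ArrayList<T>();

    void add(T item) {
        items.add(item);
    }

    T get(int index) {
        return items.get(index);
    }

    int size() {
        return items.size();
    }
}

class MyListDemo {
    public static void main(String[] args) {
        MyList<String> names = new MyList<String>();
        names.add("first");
        String s = names.get(0); // no cast

        MyList<Integer> numbers = new MyList<Integer>();
        numbers.add(7);
        int n = numbers.get(0); // unboxed, still no cast

        System.out.println(s + " " + n);
    }
}
```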

Not needing to typecast is one of the biggest advantages of Java generics, as it will perform type checking at compile-time. This will reduce the possibility of ClassCastExceptions which can be thrown at runtime, and can lead to more robust code.
But I suspect that you're fully aware of that.
Every time I look at Generics it gives me a headache. I find the best part of Java to be its simplicity and minimal syntax and generics are not simple and add a significant amount of new syntax.
At first, I didn't see the benefit of generics either. I started learning Java from the 1.4 syntax (even though Java 5 was out at the time) and when I encountered generics, I felt that it was more code to write, and I really didn't understand the benefits.
Modern IDEs make writing code with generics easier.
Most modern, decent IDEs are smart enough to assist with writing code with generics, especially with code completion.
Here's an example of making a Map<String, Integer> with a HashMap. The code I would have to type in is:
Map<String, Integer> m = new HashMap<String, Integer>();
And indeed, that's a lot to type just to make a new HashMap. However, in reality, I only had to type this much before Eclipse knew what I needed:
Map<String, Integer> m = new Ha Ctrl+Space
True, I did need to select HashMap from a list of candidates, but basically the IDE knew what to add, including the generic types. With the right tools, using generics isn't too bad.
In addition, since the types are known, when retrieving elements from the generic collection, the IDE will act as if that object is already an object of its declared type -- there is no need for a cast for the IDE to know what the object's type is.
A key advantage of generics comes from the way it plays well with new Java 5 features. Here's an example of tossing integers into a Set and calculating its total:
Set<Integer> set = new HashSet<Integer>();
set.add(10);
set.add(42);
int total = 0;
for (int i : set) {
total += i;
}
In that piece of code, there are three new Java 5 features present:
Generics
Autoboxing and unboxing
For-each loop
First, generics and autoboxing of primitives allow the following lines:
set.add(10);
set.add(42);
The integer 10 is autoboxed into an Integer with the value of 10. (And same for 42). Then that Integer is tossed into the Set which is known to hold Integers. Trying to throw in a String would cause a compile error.
Next, the for-each loop takes all three of those:
for (int i : set) {
total += i;
}
First, the Set containing Integers is used in a for-each loop. Each element is declared to be an int, and that is allowed as the Integer is unboxed back to the primitive int. And the fact that this unboxing occurs is known because generics were used to specify that there were Integers held in the Set.
Generics can be the glue that brings together the new features introduced in Java 5, and it just makes coding simpler and safer. And most of the time IDEs are smart enough to help you with good suggestions, so generally, it won't be a whole lot more typing.
And frankly, as can be seen from the Set example, I feel that utilizing Java 5 features can make the code more concise and robust.
Edit - An example without generics
The following is an illustration of the above Set example without the use of generics. It is possible, but isn't exactly pleasant:
Set set = new HashSet();
set.add(10);
set.add(42);
int total = 0;
for (Object o : set) {
total += (Integer)o;
}
(Note: The above code will generate unchecked warnings at compile time.)
When using non-generic collections, the elements entered into the collection are handled as objects of type Object. Therefore, in this example, an Object is what is being added into the set.
set.add(10);
set.add(42);
In the above lines, autoboxing is in play -- the primitive int values 10 and 42 are being autoboxed into Integer objects, which are being added to the Set. However, keep in mind, the Integer objects are being handled as Objects, as there is no type information to help the compiler know what type the Set should expect.
for (Object o : set) {
This is the part that is crucial. The reason the for-each loop works is because the Set implements the Iterable interface, which returns an Iterator with type information, if present. (Iterator<T>, that is.)
However, since there is no type information, the Set will return an Iterator which will return the values in the Set as Objects, and that is why the element being retrieved in the for-each loop must be of type Object.
Now that the Object is retrieved from the Set, it needs to be cast to an Integer manually to perform the addition:
total += (Integer)o;
Here, a typecast is performed from an Object to an Integer. In this case, we know this will always work, but manual typecasting always makes me feel it is fragile code that could be damaged if a minor change is made elsewhere. (I feel that every typecast is a ClassCastException waiting to happen, but I digress...)
The Integer is now unboxed into an int and allowed to perform the addition into the int variable total.
I hope I could illustrate that the new features of Java 5 can be used with non-generic code, but it just isn't as clean and straightforward as writing code with generics. And, in my opinion, to take full advantage of the new features in Java 5, one should look into generics, if only because they allow compile-time checks that prevent invalid typecasts from throwing exceptions at runtime.

If you were to search the Java bug database just before 1.5 was released, you'd find seven times more bugs with NullPointerException than ClassCastException. So it doesn't seem that it is a great feature to find bugs, or at least bugs that persist after a little smoke testing.
For me the huge advantage of generics is that they document in code important type information. If I didn't want that type information documented in code, then I'd use a dynamically typed language, or at least a language with more implicit type inference.
Keeping an object's collections to itself isn't a bad style (but then the common style is to effectively ignore encapsulation). It rather depends upon what you are doing. Passing collections to "algorithms" is slightly easier to check (at or before compile-time) with generics.

Generics in Java facilitate parametric polymorphism. By means of type parameters, you can pass arguments to types. Just as a method like String foo(String s) models some behaviour, not just for a particular string, but for any string s, so a type like List<T> models some behaviour, not just for a specific type, but for any type. List<T> says that for any type T, there's a type of List whose elements are Ts. So List is actually a type constructor. It takes a type as an argument and constructs another type as a result.
Here are a couple of examples of generic types I use every day. First, a very useful generic interface:
public interface F<A, B> {
public B f(A a);
}
This interface says that for some two types, A and B, there's a function (called f) that takes an A and returns a B. When you implement this interface, A and B can be any types you want, as long as you provide a function f that takes the former and returns the latter. Here's an example implementation of the interface:
F<Integer, String> intToString = new F<Integer, String>() {
// The parameter must be Integer (not int) to match f(A a) with A = Integer
public String f(Integer i) {
return String.valueOf(i);
}
};
Before generics, polymorphism was achieved by subclassing using the extends keyword. With generics, we can actually do away with subclassing and use parametric polymorphism instead. For example, consider a parameterised (generic) class used to calculate hash codes for any type. Instead of overriding Object.hashCode(), we would use a generic class like this:
public final class Hash<A> {
private final F<A, Integer> hashFunction;
public Hash(final F<A, Integer> f) {
this.hashFunction = f;
}
public int hash(A a) {
return hashFunction.f(a);
}
}
This is much more flexible than using inheritance, because we can stay with the theme of using composition and parametric polymorphism without locking down brittle hierarchies.
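To make the composition point concrete, here is a small self-contained sketch that repeats F and Hash and then builds two different hashing strategies for String without subclassing anything. The two strategies (BY_LENGTH and CASE_INSENSITIVE) and the HashDemo class are invented for illustration.

```java
import java.util.Locale;

interface F<A, B> {
    B f(A a);
}

final class Hash<A> {
    private final F<A, Integer> hashFunction;

    Hash(F<A, Integer> f) {
        this.hashFunction = f;
    }

    int hash(A a) {
        return hashFunction.f(a);
    }
}

class HashDemo {
    // Hypothetical strategy 1: hash a string by its length.
    static final Hash<String> BY_LENGTH = new Hash<String>(new F<String, Integer>() {
        public Integer f(String s) {
            return s.length();
        }
    });

    // Hypothetical strategy 2: hash a string ignoring case.
    static final Hash<String> CASE_INSENSITIVE = new Hash<String>(new F<String, Integer>() {
        public Integer f(String s) {
            return s.toLowerCase(Locale.ROOT).hashCode();
        }
    });

    public static void main(String[] args) {
        System.out.println(BY_LENGTH.hash("hello"));
        System.out.println(CASE_INSENSITIVE.hash("ABC") == CASE_INSENSITIVE.hash("abc"));
    }
}
```

Both strategies coexist for the same type, which overriding String.hashCode() could not offer.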
Java's generics are not perfect though. You can abstract over types, but you can't abstract over type constructors, for example. That is, you can say "for any type T", but you can't say "for any type T that takes a type parameter A".
I wrote an article about these limits of Java generics, here.
One huge win with generics is that they let you avoid subclassing. Subclassing tends to result in brittle class hierarchies that are awkward to extend, and classes that are difficult to understand individually without looking at the entire hierarchy.
Whereas before generics you might have classes like Widget extended by FooWidget, BarWidget, and BazWidget, with generics you can have a single generic class Widget<A> that takes a Foo, Bar or Baz in its constructor to give you Widget<Foo>, Widget<Bar>, and Widget<Baz>.

Generics avoid the performance hit of boxing and unboxing. Basically, look at ArrayList vs List<T>. Both do the same core things, but List<T> will be a lot faster because you don't have to box to/from object.

The best benefit of generics is code reuse. Let's say that you have a lot of business objects, and you are going to write VERY similar code for each entity to perform the same actions (e.g. LINQ to SQL operations).
With generics, you can create a class that will be able to operate given any of the types that inherit from a given base class or implement a given interface like so:
public interface IEntity
{
}
public class Employee : IEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int EmployeeID { get; set; }
}
public class Company : IEntity
{
public string Name { get; set; }
public string TaxID { get; set; }
}
public class DataService<ENTITY, DATACONTEXT>
where ENTITY : class, IEntity, new()
where DATACONTEXT : DataContext, new()
{
public void Create(List<ENTITY> entities)
{
using (DATACONTEXT db = new DATACONTEXT())
{
Table<ENTITY> table = db.GetTable<ENTITY>();
foreach (ENTITY entity in entities)
table.InsertOnSubmit (entity);
db.SubmitChanges();
}
}
}
public class MyTest
{
public void DoSomething()
{
var dataService = new DataService<Employee, MyDataContext>();
// Create expects a List<ENTITY>, so wrap the new entity in a list
dataService.Create(new List<Employee> { new Employee { FirstName = "Bob", LastName = "Smith", EmployeeID = 5 } });
var otherDataService = new DataService<Company, MyDataContext>();
otherDataService.Create(new List<Company> { new Company { Name = "ACME", TaxID = "123-111-2233" } });
}
}
Notice the reuse of the same service given the different Types in the DoSomething method above. Truly elegant!
There are many other great reasons to use generics for your work; this is my favorite.

I just like them because they give you a quick way to define a custom type (as I use them anyway).
So for example, instead of defining a structure consisting of a string and an integer, and then having to implement a whole set of objects and methods on how to access an array of those structures and so forth, you can just make a Dictionary:
Dictionary<int, string> dictionary = new Dictionary<int, string>();
And the compiler/IDE does the rest of the heavy lifting. A Dictionary in particular lets you use the first type as a key (no repeated values).

Typed collections - even if you don't want to use them you're likely to have to deal with them from other libraries and other sources.
Generic typing in class creation:
public class Foo < T> {
public T get()...
Avoidance of casting - I've always disliked things like
new Comparable() {
public int compareTo(Object o){
if (o instanceof classIcareAbout)...
Where you're essentially checking for a condition that should only exist because the interface is expressed in terms of objects.
My initial reaction to generics was similar to yours - "too messy, too complicated". My experience is that after using them for a bit you get used to them, and code without them feels less clearly specified, and just less comfortable. Aside from that, the rest of the Java world uses them so you're going to have to get with the program eventually, right?

To give a good example. Imagine you have a class called Foo
public class Foo
{
public string Bar() { return "Bar"; }
}
Example 1
Now you want to have a collection of Foo objects. You have two options, List<Foo> or ArrayList, both of which work in a similar manner.
ArrayList al = new ArrayList();
List<Foo> fl = new List<Foo>();
//code to add Foos
al.Add(new Foo());
fl.Add(new Foo());
In the above code, if I try to add an instance of a FireTruck class instead of Foo, the ArrayList will accept it, but the generic List<Foo> will reject it with a compile-time error.
Example two.
Now you have your two lists and you want to call the Bar() function on each. Since the ArrayList is filled with Objects, you have to cast them before you can call Bar(). But since the generic List of Foo can only contain Foos, you can call Bar() directly on those.
foreach(object o in al)
{
Foo f = (Foo)o;
f.Bar();
}
foreach(Foo f in fl)
{
f.Bar();
}

Haven't you ever written a method (or a class) where the key concept of the method/class wasn't tightly bound to a specific data type of the parameters/instance variables (think linked list, max/min functions, binary search, etc.)?
Haven't you ever wished you could reuse the algorithm/code without resorting to cut-n-paste reuse or compromising strong typing (e.g. I want a List of Strings, not a List of things I hope are strings!)?
That's why you should want to use generics (or something better).
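As a hedged illustration of the max/min case mentioned above, a single generic method in Java can serve every comparable element type. The class name GenericAlgorithms and the decision to throw on an empty list are choices made for this sketch.

```java
import java.util.Arrays;
import java.util.List;

class GenericAlgorithms {
    // Works for any element type that can be compared with itself.
    static <T extends Comparable<? super T>> T max(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("empty list");
        }
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Integer biggest = max(Arrays.asList(3, 9, 4));           // T is Integer
        String last = max(Arrays.asList("pear", "apple", "fig")); // T is String
        System.out.println(biggest + " " + last);
    }
}
```

The algorithm is written once, and the result comes back as an Integer or a String rather than as an Object that has to be cast.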

The primary advantage, as Mitchel points out, is strong-typing without needing to define multiple classes.
This way you can do stuff like:
List<SomeCustomClass> blah = new List<SomeCustomClass>();
blah[0].SomeCustomFunction();
Without generics, you would have to cast blah[0] to the correct type to access its functions.

Don't forget that generics aren't just used by classes, they can also be used by methods. For example, take the following snippet:
private <T extends Throwable> T logAndReturn(T t) {
logThrowable(t); // some logging method that takes a Throwable
return t;
}
It is simple, but can be used very elegantly. The nice thing is that the method returns whatever it was that it was given. This helps out when you are handling exceptions that need to be re-thrown back to the caller:
...
} catch (MyException e) {
throw logAndReturn(e);
}
The point is that you don't lose the type by passing it through a method. You can throw the correct type of exception instead of just a Throwable, which would be all you could do without generics.
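A self-contained sketch of this idea follows. The in-memory LOG list stands in for the real logging call, and MyException and rethrow are hypothetical names for the example.

```java
import java.util.ArrayList;
import java.util.List;

class LogAndReturnDemo {
    // Stand-in for a real logger (an assumption for this sketch).
    static final List<String> LOG = new ArrayList<String>();

    // Returns exactly the type it was given, not just Throwable.
    static <T extends Throwable> T logAndReturn(T t) {
        LOG.add(t.getClass().getSimpleName() + ": " + t.getMessage());
        return t;
    }

    static class MyException extends Exception {
        MyException(String message) {
            super(message);
        }
    }

    // Only MyException is declared: the compiler knows that
    // logAndReturn(e) has the static type MyException.
    static void rethrow() throws MyException {
        try {
            throw new MyException("boom");
        } catch (MyException e) {
            throw logAndReturn(e);
        }
    }

    public static void main(String[] args) {
        try {
            rethrow();
        } catch (MyException e) {
            System.out.println(LOG);
        }
    }
}
```

If logAndReturn were declared to take and return Throwable, rethrow would have to declare throws Throwable or cast the result.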
This is just a simple example of one use for generic methods. There are quite a few other neat things you can do with generic methods. The coolest, in my opinion, is type inferring with generics. Take the following example (taken from Josh Bloch's Effective Java 2nd Edition):
...
Map<String, Integer> myMap = createHashMap();
...
public <K, V> Map<K, V> createHashMap() {
return new HashMap<K, V>();
}
This doesn't do a lot, but it does cut down on some clutter when the generic types are long (or nested; e.g. Map<String, List<String>>).

Generics allow you to create objects that are strongly typed, yet you don't have to define the specific type. I think the most useful example is the List and similar classes.
Using the generic list you can have a List of whatever type you want and you can always rely on the strong typing; you don't have to convert or anything like you would with an Array or standard List.

The JVM casts anyway: the compiler implicitly generates code that treats the generic type as Object and inserts casts to the desired instantiation. Java generics are just syntactic sugar.
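Both halves of that claim can be observed directly. The sketch below (class and method names are made up) shows that two differently parameterized lists share one runtime class, and that the compiler-inserted cast is what fails when a raw reference is used to sneak in the wrong element type.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // List<String> and List<Integer> are the same class at runtime.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean hiddenCastFails() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;          // same object, viewed without type information
        raw.add(Integer.valueOf(1)); // unchecked, but it compiles and runs
        try {
            String s = strings.get(0); // the implicit cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(hiddenCastFails());
    }
}
```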

I know this is a C# question, but generics are used in other languages too, and their use/goals are quite similar.
Java collections have used generics since Java 1.5. So, a good place to use them is when you are creating your own collection-like object.
An example I see almost everywhere is a Pair class, which holds two objects, but needs to deal with those objects in a generic way.
class Pair<F, S> {
public final F first;
public final S second;
public Pair(F f, S s)
{
first = f;
second = s;
}
}
Whenever you use this Pair class you can specify which kind of objects you want it to deal with and any type cast problems will show up at compile time, rather than runtime.
Generics can also have their bounds defined with the keywords 'super' and 'extends'. For example, if you want to deal with a generic type but you want to make sure it extends a class called Foo (which has a setTitle method):
public class FooManager <F extends Foo>{
public void setTitle(F foo, String title) {
foo.setTitle(title);
}
}
While not very interesting on its own, it's useful to know that whenever you deal with a FooManager<MyClass>, you know that it will handle MyClass types, and that MyClass extends Foo.
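A runnable version of the bounded example might look like this. The body of Foo, the empty MyClass subclass, and the getTitle accessor are assumptions filled in so the sketch compiles on its own.

```java
class Foo {
    private String title = "";

    void setTitle(String title) {
        this.title = title;
    }

    String getTitle() {
        return title;
    }
}

// Hypothetical subclass used as the type argument.
class MyClass extends Foo {
}

class FooManager<F extends Foo> {
    // foo.setTitle is available because F is known to extend Foo.
    void setTitle(F foo, String title) {
        foo.setTitle(title);
    }
}

class FooManagerDemo {
    public static void main(String[] args) {
        FooManager<MyClass> manager = new FooManager<MyClass>();
        MyClass item = new MyClass();
        manager.setTitle(item, "Hello");
        System.out.println(item.getTitle());
        // FooManager<String> bad; // rejected by the compiler: String does not extend Foo
    }
}
```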

From the Sun Java documentation, in response to "why should I use generics?":
"Generics provides a way for you to communicate the type of a collection to the compiler, so that it can be checked. Once the compiler knows the element type of the collection, the compiler can check that you have used the collection consistently and can insert the correct casts on values being taken out of the collection... The code using generics is clearer and safer.... the compiler can verify at compile time that the type constraints are not violated at run time [emphasis mine]. Because the program compiles without warnings, we can state with certainty that it will not throw a ClassCastException at run time. The net effect of using generics, especially in large programs, is improved readability and robustness. [emphasis mine]"

Generics let you use strong typing for objects and data structures that should be able to hold any object. It also eliminates tedious and expensive typecasts when retrieving objects from generic structures (boxing/unboxing).
One example that uses both is a linked list. What good would a linked list class be if it could only use object Foo? To implement a linked list that can handle any kind of object, the linked list and the nodes in a hypothetical node inner class must be generic if you want the list to contain only one type of object.
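A minimal sketch of such a list in Java, with a generic nested node class, could look like the following. The names SimpleLinkedList and Node and the small add/get/size API are choices made for illustration.

```java
class SimpleLinkedList<T> {
    // The node is generic too, so every link carries the element type.
    private static final class Node<E> {
        final E value;
        Node<E> next;

        Node(E value) {
            this.value = value;
        }
    }

    private Node<T> head;
    private Node<T> tail;
    private int size;

    // Appends a value to the end of the list.
    void add(T value) {
        Node<T> node = new Node<T>(value);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    // Walks from the head to the requested position.
    T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node<T> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    int size() {
        return size;
    }
}

class SimpleLinkedListDemo {
    public static void main(String[] args) {
        SimpleLinkedList<String> words = new SimpleLinkedList<String>();
        words.add("a");
        words.add("b");
        String second = words.get(1); // no cast, and only Strings can be added
        System.out.println(second + " " + words.size());
    }
}
```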

If your collection contains value types, they don't need to box/unbox to objects when inserted into the collection so your performance increases dramatically. Cool add-ons like ReSharper can generate more code for you, like foreach loops.

Another advantage of using Generics (especially with Collections/Lists) is you get Compile Time Type Checking. This is really useful when using a Generic List instead of a List of Objects.

The single most important reason is that they provide type safety:
List<Customer> custCollection = new List<Customer>();
as opposed to,
object[] custCollection = new object[] { cust1, cust2 };
as a simple example.

In summary, generics allow you to specify more precisely what you intend to do (stronger typing).
This has several benefits for you:
Because the compiler knows more about what you want to do, it allows you to omit a lot of type-casting because it already knows that the type will be compatible.
This also gets you earlier feedback about the correctness of your program. Things that previously would have failed at runtime (e.g. because an object couldn't be cast to the desired type) now fail at compile time, and you can fix the mistake before your testing department files a cryptic bug report.
The compiler can do more optimizations, like avoiding boxing, etc.

A couple of things to add/expand on (speaking from the .NET point of view):
Generic types allow you to create role-based classes and interfaces. This has been said already in more basic terms, but I find you start to design your code with classes which are implemented in a type-agnostic way - which results in highly reusable code.
Generic arguments on methods can do the same thing, but they also help apply the "Tell Don't Ask" principle to casting, i.e. "give me what I want, and if you can't, you tell me why".

I use them, for example, in a GenericDao implemented with Spring ORM and Hibernate, which looks like this:
public abstract class GenericDaoHibernateImpl<T>
extends HibernateDaoSupport {
private Class<T> type;
public GenericDaoHibernateImpl(Class<T> clazz) {
type = clazz;
}
public void update(T object) {
getHibernateTemplate().update(object);
}
@SuppressWarnings("unchecked")
public Integer count() {
return ((Integer) getHibernateTemplate().execute(
new HibernateCallback() {
public Object doInHibernate(Session session) {
// Code in Hibernate for getting the count
}
}));
}
.
.
.
}
By using generics, my implementations of these DAOs force the developer to pass them just the entities they are designed for, simply by subclassing the GenericDao:
public class UserDaoHibernateImpl extends GenericDaoHibernateImpl<User> {
public UserDaoHibernateImpl() {
super(User.class); // This is for giving Hibernate a .class
// work with, as generics disappear at runtime
}
// Entity specific methods here
}
My little framework is more robust (it has things like filtering, lazy-loading, searching). I just simplified it here to give you an example.
I, like Steve and you, said at the beginning "Too messy and complicated", but now I see its advantages.
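The reason the constructor takes a Class<T> can be shown without Spring or Hibernate. In the sketch below, a plain List<Object> stands in for the persistence layer, and all names (InMemoryDao, User, UserDao, save, findAll) are hypothetical. The class token is what lets the DAO pick out its own entities at runtime, since T itself is erased.

```java
import java.util.ArrayList;
import java.util.List;

abstract class InMemoryDao<T> {
    private final Class<T> type;      // runtime token, because T is erased
    private final List<Object> store; // stand-in for a database session

    InMemoryDao(Class<T> type, List<Object> store) {
        this.type = type;
        this.store = store;
    }

    // Only a T can be saved through this DAO; the compiler enforces it.
    void save(T entity) {
        store.add(entity);
    }

    // Uses the class token to filter and cast without unchecked warnings.
    List<T> findAll() {
        List<T> result = new ArrayList<T>();
        for (Object row : store) {
            if (type.isInstance(row)) {
                result.add(type.cast(row));
            }
        }
        return result;
    }
}

class User {
    final String name;

    User(String name) {
        this.name = name;
    }
}

class UserDao extends InMemoryDao<User> {
    UserDao(List<Object> store) {
        super(User.class, store);
    }
}

class DaoDemo {
    public static void main(String[] args) {
        List<Object> store = new ArrayList<Object>();
        store.add("not a user");
        UserDao dao = new UserDao(store);
        dao.save(new User("ann"));
        System.out.println(dao.findAll().size());
    }
}
```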

Obvious benefits like "type safety" and "no casting" have already been mentioned, so maybe I can talk about some other "benefits", which I hope helps.
First of all, generics is a language-independent concept and, IMO, it might make more sense if you think about regular (runtime) polymorphism at the same time.
For example, polymorphism as we know it from object-oriented design has a runtime notion: the type of the receiving object is figured out at runtime as the program executes, and the relevant method gets called depending on that runtime type. In generics, the idea is somewhat similar, but everything happens at compile time. What does that mean and how do you make use of it?
(Let's stick with generic methods to keep it compact.) It means that you can still have the same method on separate classes (like you did previously in polymorphic classes), but this time they're auto-generated by the compiler depending on the types set at compile time. You parametrise your methods on the type you give at compile time. So, instead of writing the methods from scratch for every single type you have, as you do in runtime polymorphism (method overriding), you let the compiler do the work during compilation. This has an obvious advantage: you don't need to anticipate all possible types that might be used in your system, which makes it far more scalable without a code change.
Classes work pretty much the same way. You parametrise the type and the code is generated by the compiler.
Once you get the idea of "compile time", you can make use of "bounded" types and restrict what can be passed as a parametrised type through classes/methods. So, you can control what gets passed through, which is a powerful thing, especially if you have a framework being consumed by other people.
public interface Foo<T extends MyObject> extends Hoo<T>{
...
}
No one can pass anything other than MyObject (or a subtype of it) now.
Also, you can "enforce" type constraints on your method arguments, which means you can make sure both your method arguments depend on the same type.
public <T extends MyObject> void foo(T t1, T t2){
...
}
Hope all of this makes sense.

I once gave a talk on this topic. You can find my slides, code, and audio recording at http://www.adventuresinsoftware.com/generics/.

Using generics for collections is just simple and clean. Even if you punt on it everywhere else, the gain from the collections is a win to me.
List<Stuff> stuffList = getStuff();
for(Stuff stuff : stuffList) {
stuff.doSomething(); // "do" is a reserved word in Java, so it cannot be a method name
}
vs
List stuffList = getStuff();
Iterator i = stuffList.iterator();
while(i.hasNext()) {
Stuff stuff = (Stuff)i.next();
stuff.doSomething();
}
or
List stuffList = getStuff();
for(int i = 0; i < stuffList.size(); i++) {
Stuff stuff = (Stuff)stuffList.get(i);
stuff.doSomething();
}
That alone is worth the marginal "cost" of generics, and you don't have to be a generic Guru to use this and get value.

Generics also give you the ability to create more reusable objects/methods while still providing type-specific support. You also gain a lot of performance in some cases. I don't know the full spec of Java generics, but in .NET I can specify constraints on the type parameter, such as implementing an interface, having a constructor, or deriving from a base class.

Enabling programmers to implement generic algorithms - By using generics, programmers can implement generic algorithms that work on collections of different types, can be customized, and are type-safe and easier to read.
Stronger type checks at compile time - A Java compiler applies strong type checking to generic code and issues errors if the code violates type safety. Fixing compile-time errors is easier than fixing runtime errors, which can be difficult to find.
Elimination of casts.

Related

How do interfaces like IEnumerable work without proper implementation?

From what I understand about interfaces, in order to use them you must declare that a class implements one by adding the name of the interface after a colon, and then implement its methods.
I'm currently learning about Enumerators, IEnumerable etc. and this got me confused. Here's an example of what I mean:
static IEnumerable<int> Fibs(int fibCount)
{
for (int i = 0, prevFib = 1, curFib = 1; i < fibCount; i++) {
yield return prevFib;
int newFib = prevFib + curFib;
prevFib = curFib;
curFib = newFib;
}
}
IEnumerable seems to be a normal interface as any other, I even checked the method definition and that's what it pretty much seems like.
How is it possible that I can use an interface as a type/return type in the method definition and when/how do I know I should use certain interfaces as types like in this example?
EDIT: I really doubt it has anything to do with the yield keyword since a lot of interfaces are used as properties this way for example in MVC in Models and passed like it to Views. Example:
public IEnumerable<Category> Categories {get;set;}
There is extra magic when you use yield keyword, i.e. create an iterator block. The compiler makes a state machine for you.
So C# has a special feature here, and it only has this feature with IEnumerable<>. So it is the C# language which is magical.
The interface IEnumerable<> in itself is a boring ordinary type. No magic in it.
Note: Technically, the yield magic works when the "formal" return type is either IEnumerable<>, IEnumerator<>, IEnumerable, or IEnumerator, but usually you use the first of these. Do not go non-generic, of course.
IEnumerable is a special case. The yield return statement instructs the compiler to add the code that implements IEnumerable.
As for your edit:
If an interface is used as the type of a property, any object of a class that implements this interface can be assigned and the property will return an object that implements this interface. In your example, any collection of categories that implements IEnumerable<Category> can be assigned to the property, e.g. a List<Category>. In comparison to using just List<Category>, using the interface allows a wider range of objects to be assigned. The interface defines the abstract requirements that are relevant for the property.
In C#, interfaces are a special "kind" of type that differs from a class in a few key ways:
You cannot include any implementation code of their methods.
A single class can "inherit from" (called "implementing") as many as it wants.
You cannot instantiate a new instance of an interface.
(There are also some language features associated specifically with interfaces, such as explicit implementations, but those aren't important for this discussion.)
Beyond that, interfaces can be used pretty much anywhere you can use any other reference type. That includes defining fields, properties, or local variables with interface types, or using them as parameter types or return types on methods.
The trick is, if you define a property as, say, an IEnumerable<int>, and you want to set its value, you cannot do this:
public IEnumerable<int> Numbers { get; set; }
...
this.Numbers = new IEnumerable<int>();
That's an error. You can't create a new instance of an interface, because it's just a "template" -- there's nothing "behind" it to actually do anything. However, you can do this:
public IEnumerable<int> Numbers { get; set; }
...
this.Numbers = new List<int>();
Because List<T> implements IEnumerable<T>, the compiler will automatically do the type conversion to make the assignment work. Any concrete class that implements IEnumerable<> can be assigned to a property of type IEnumerable<>, which is why you see interface property types so often. It allows you to change the underlying concrete type (maybe you want to change List<T> to ObservableCollection<T>), but the users of your class neither know nor care when you do.
The same goes for methods with an interface return type, except there's an extra option here that C# throws in as a bonus:
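public IEnumerable<string> GetName()
{
// The three bodies below are alternatives; a real method would contain only one of them,
// since return and yield return cannot be mixed in the same method.
// this fails.
return new IEnumerable<string>();
// this works.
return new List<String>();
// this also works because magic!~
yield return "hello";
yield return "there";
yield return "!";
}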
That last case is a special form of "syntactic sugar" that C# provides because it's such a common requirement. As other people have mentioned, the compiler specifically looks for yield return statements on methods that return IEnumerable or IEnumerator (both generic and non-generic versions) and does some hefty rewriting of the code.
Behind the scenes, C# is creating a hidden class, which implements IEnumerable<string>, and implementing its GetEnumerator method to return an IEnumerator<string> object that provides those three string values. That would be a lot of boilerplate code for you to write, although you could certainly write it yourself. Before C# 2.0 there was no yield, and you did have to write it yourself.
If you really want to know, you can find the C# equivalent here, among other places. In essence, it takes the method that contains your yield statements and creates an IEnumerator<>.MoveNext method out of it by turning it into a state machine. It uses the equivalent of labels and goto statements to jump back to the correct place each time a consumer calls MoveNext on the same instance. Also, as I understand it, it does things that you can't actually do in C# (it jumps into and out of loops) but that are legal in IL code, so its implementation is more efficient than what you could write yourself.
But once you get past the secret sauce of the yield keyword, you're still doing the same thing. You're still creating a class that implements IEnumerable<>, and using that as the return value for your method.
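For a feel of what that hidden class looks like, here is a hand-written sketch that produces the same three strings. The names HelloEnumerable and HelloEnumerator are made up for this example, and the code the compiler really generates differs in its details; this only shows the state-machine idea.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written stand-in for the hidden class the compiler generates.
public class HelloEnumerable : IEnumerable<string>
{
    public IEnumerator<string> GetEnumerator()
    {
        return new HelloEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    // A tiny state machine: each MoveNext call advances to the next value.
    private sealed class HelloEnumerator : IEnumerator<string>
    {
        private int state = 0;
        private string current;

        public string Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }

        public bool MoveNext()
        {
            switch (state)
            {
                case 0: current = "hello"; state = 1; return true;
                case 1: current = "there"; state = 2; return true;
                case 2: current = "!"; state = 3; return true;
                default: return false;
            }
        }

        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }
}
```

A foreach over new HelloEnumerable() visits "hello", "there" and "!" in that order, which is what the three yield return statements give you for free.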
For your specific example, even though the method return type is IEnumerable<int>, the actual return type will be a type that implements IEnumerable<int>. As others have mentioned, the ‘yield return’ results in a type which implements IEnumerable<int>. In this specific case, you don’t know what that type is. When I run this through the debugger and do a result.GetType() (where result is returned from the Fibs method), I see that result is of type <Fibs>d__0, which sounds a little strange to me. So instead of worrying about that strange-sounding type, we can just treat it like an IEnumerable<int>, because all we really want to be able to do is to iterate over it, which is the behavior that IEnumerable<int> exposes.
That’s for your specific example, but the idea is the same elsewhere. By using an interface, you are saying that you don’t care exactly what type is used/returned, as long as it exposes certain behavior or properties. For example if I have an interface IFoo that exposes a method DoSomething(), and I have a method that returns IFoo, then I can return anything that implements IFoo, but no matter what I return, the caller of the method is sure that it can DoSomething() with the object. Similarly, if I have a method that takes an IFoo, then the method is sure that it will be able to DoSomething() with that parameter.
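A minimal sketch of that idea follows. IFoo and DoSomething() come from the paragraph above; the string return type and the LoudFoo, QuietFoo and FooFactory names are assumptions made just for this example.

```csharp
public interface IFoo
{
    string DoSomething();
}

public class LoudFoo : IFoo
{
    public string DoSomething() { return "LOUD"; }
}

public class QuietFoo : IFoo
{
    public string DoSomething() { return "quiet"; }
}

public static class FooFactory
{
    // Callers only learn that they get an IFoo, not which class it is.
    public static IFoo Create(bool loud)
    {
        if (loud) { return new LoudFoo(); }
        return new QuietFoo();
    }

    // Accepts anything that implements IFoo and can rely on DoSomething().
    public static string Use(IFoo foo)
    {
        return foo.DoSomething();
    }
}
```

FooFactory.Use(FooFactory.Create(true)) and FooFactory.Use(FooFactory.Create(false)) both compile against IFoo alone; which concrete class does the work is an implementation detail.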
For me interfaces also help me design my classes. For example I write the interface first to determine what’s important for the class to be able to do and have, and then I create the concrete class that implements that interface. And when I’m writing code that uses the class, I ask for the interface rather than the concrete type.
And there are all sorts of other reasons to use interfaces, for example to create mocks for testing, for dependency injection, for fluent API design, and probably a hundred other reasons.

Casting from interface to underlying type

Reviewing an earlier question on SO, I started thinking about the situation where a class exposes a value, such as a collection, as an interface implemented by the value's type. In the code example below, I am using a List, and exposing that list as IEnumerable.
Exposing the list through the IEnumerable interface defines the intent that the list only be enumerated over, not modified. However, since the instance can be re-cast back to a list, the list itself can of course be modified.
I also include in the sample a version of the method that prevents modification by copying the list item references to a new list each time the method is called, thereby preventing changes to the underlying list.
So my question is, should all code exposing a concrete type as an implemented interface do so by means of a copy operation? Would there be value in a language construct that explicitly indicates "I want to expose this value through an interface, and calling code should only be able to use this value through the interface"? What techniques do others use to prevent unintended side-effects like these when exposing concrete values through their interfaces?
Please note, I understand that the behavior illustrated is expected behavior. I am not claiming this behavior is wrong, just that it does allow use of functionality other than the expressed intent. Perhaps I am assigning too much significance to the interface - thinking of it as a functionality constraint. Thoughts?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace TypeCastTest
{
class Program
{
static void Main(string[] args)
{
// Demonstrate casting situation
Automobile castAuto = new Automobile();
List<string> doorNamesCast = (List<string>)castAuto.GetDoorNamesUsingCast();
doorNamesCast.Add("Spare Tire");
// Would prefer this prints 4 names,
// actually prints 5 because IEnumerable<string>
// was cast back to List<string>, exposing the
// Add method of the underlying List object
// Since the list was cast to IEnumerable before being
// returned, the expressed intent is that calling code
// should only be able to enumerate over the collection,
// not modify it.
foreach (string doorName in castAuto.GetDoorNamesUsingCast())
{
Console.WriteLine(doorName);
}
Console.WriteLine();
// --------------------------------------
// Demonstrate casting defense
Automobile copyAuto = new Automobile();
List<string> doorNamesCopy = (List<string>)copyAuto.GetDoorNamesUsingCopy();
doorNamesCopy.Add("Spare Tire");
// This returns only 4 names,
// because the IEnumerable<string> that is
// returned is from a copied List<string>, so
// calling the Add method of the List object does
// not modify the underlying collection
foreach (string doorName in copyAuto.GetDoorNamesUsingCopy())
{
Console.WriteLine(doorName);
}
Console.ReadLine();
}
}
public class Automobile
{
private List<string> doors = new List<string>();
public Automobile()
{
doors.Add("Driver Front");
doors.Add("Passenger Front");
doors.Add("Driver Rear");
doors.Add("Passenger Rear");
}
public IEnumerable<string> GetDoorNamesUsingCopy()
{
return new List<string>(doors).AsEnumerable<string>();
}
public IEnumerable<string> GetDoorNamesUsingCast()
{
return doors.AsEnumerable<string>();
}
}
}
One way you can prevent this is by using AsReadOnly() to prevent any such nefariousness. I think the real answer, though, is that you should never rely on anything other than the exposed interface/contract in terms of return types and so on. Doing anything else defies encapsulation and prevents you from swapping out your implementation for one that doesn't use a List but, say, just a T[].
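Here is a small sketch of the AsReadOnly() approach, reusing the Automobile idea from the question. GetDoorNames and ReadOnlyDemo are names chosen for this example only.

```csharp
using System;
using System.Collections.Generic;

public class Automobile
{
    private readonly List<string> doors = new List<string>
    {
        "Driver Front", "Passenger Front", "Driver Rear", "Passenger Rear"
    };

    // Returns a read-only wrapper around the list rather than the list itself.
    public IEnumerable<string> GetDoorNames()
    {
        return doors.AsReadOnly();
    }
}

public static class ReadOnlyDemo
{
    public static void Run()
    {
        Automobile auto = new Automobile();
        IEnumerable<string> names = auto.GetDoorNames();

        // The wrapper is a ReadOnlyCollection<string>, not a List<string>,
        // so the "as" cast yields null.
        List<string> asList = names as List<string>;
        Console.WriteLine(asList == null);

        // The wrapper does implement IList<string>, but Add is rejected at runtime.
        IList<string> asIList = (IList<string>)names;
        try
        {
            asIList.Add("Spare Tire");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("Add is not supported");
        }
    }
}
```

Note that the wrapper is a live view rather than a copy: changes made to the private list inside Automobile still show through it, but callers cannot modify the list through the wrapper.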
Edit:
And down-casting like you mention is basically a violation of the Liskov Substitution Principle, to get all technical and stuff.
In a situation like this, you could define your own collection class which implements IEnumerable<T>. Internally, your collection could keep a List<T> and then you could just return the enumerator of the underlying list:
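using System.Collections;
using System.Collections.Generic;
public class MyList : IEnumerable<string>
{
private List<string> internalList = new List<string>();
// ...
// Both interface methods hand out the inner list's enumerator, never the list itself.
IEnumerator<string> IEnumerable<string>.GetEnumerator()
{
return this.internalList.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.internalList.GetEnumerator();
}
}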
An interface is a constraint on the implementation of a minimum set of things it must do (even if "doing" is no more than throwing a NotSupportedException; or even a NotImplementedException). It is not a constraint that either prevents the implementation from doing more, or on the calling code.
One thing I've learned working with .NET (and with some people who are quick to jump to a hack solution) is that if nothing else, reflection will often allow people to bypass your "protections."
Interfaces are not iron shackles of programming, they're a promise that your code makes to any other code saying "I can definitely do these things." If you "cheat" and cast the interface object into some other object because you, the programmer, know something that the program doesn't, then you're breaking that contract. The consequence is poorer maintainability and a reliance that no one ever mess up anything in that chain of execution, lest some other object get sent down that doesn't cast correctly.
Other tricks like making things readonly or hiding the actual list behind a wrapper are only stop-gaps. You could easily dig into the type using reflection to pull out the private list if you really wanted it. And I think there are attributes you can apply to types to prevent people from reflecting into them.
Likewise, readonly lists aren't really. I could probably figure out a way to modify the list itself. And I can almost certainly modify the items on the list. So a readonly isn't enough, nor is a copy or an array. You need a deep copy (clone) of the original list in order to actually protect the data, to some degree.
But the real question is, why are you fighting so hard against the contract that you wrote. Sometimes reflection hacking is a handy workaround when someone else's library is poorly designed and didn't expose something that it needs to (or a bug requires that you go digging to fix it.) But when you have control over the interface AND the consumer of the interface, there's no excuse to not make the publicly exposed interface as robust as you need it to be to get your work done.
Or in short: If you need a list, don't return IEnumerable, return a List. If you've got an IEnumerable but you actually needed a list, then it's safer to make a new list from that IEnumerable and use that. There are very few reasons (and even fewer, maybe no, good reasons) to cast down to a more specific type simply because "I know it's actually a list, so this will work."
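A short sketch of the "make a new list" option. DoorNameHelper and WithSpare are hypothetical names; the point is only that the incoming sequence is copied, never cast.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DoorNameHelper
{
    // Builds an independent list from any sequence.
    // The source is never cast and never modified.
    public static List<string> WithSpare(IEnumerable<string> doorNames)
    {
        List<string> copy = doorNames.ToList();
        copy.Add("Spare Tire");
        return copy;
    }
}
```

This works whether the caller passes a List<string>, an array, or the result of an iterator method, which is exactly what a cast back to List<string> cannot promise.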
Yeah, you can take steps to try and prevent people from doing that, but 1) the harder you fight people who insist on breaking the system, the harder they will try to break it and 2) they're only looking for more rope, and eventually they'll get enough to hang themselves.

Are there drawbacks to creating a class that encapsulates Generic Collection?

A part of my (C# 3.0 .NET 3.5) application requires several lists of strings to be maintained. I declare them, unsurprisingly, as List<string> and everything works, which is nice.
The strings in these Lists are actually (and always) Fund IDs. I'm wondering if it might be more intention-revealing to be more explicit, e.g.:
public class FundIdList : List<string> { }
... and this works as well. Are there any obvious drawbacks to this, either technically or philosophically?
I would start by going in the other direction: wrapping the string up into a class/struct called FundId. The advantage of doing so, I think, is greater than the generic list versus specialised list.
1. Your code becomes type-safe: there is a lot less scope for you to pass a string representing something else into a method that expects a fund identifier.
2. You can constrain the strings that are valid in the constructor to FundId, i.e. enforce a maximum length, check that the code is in the expected format, &c.
3. You have a place to add methods/functions relating to that type. For example, if fund codes starting 'I' are internal funds you could add a property called IsInternal that formalises that.
As for FundIdList, the advantage to having such a class is similar to point 3 above for the FundId: you have a place to hook in methods/functions that operate on the list of FundIds (e.g. aggregate functions). Without such a place, you'll find that static helper methods start to crop up throughout the code, or in some static helper class.
List<> has no virtual or protected members - such classes should almost never be subclassed. Also, although it's possible you need the full functionality of List<string>, if you do - is there much point to making such a subclass?
Subclassing has a variety of downsides. If you declare your local type to be FundIdList, then you won't be able to assign to it by e.g. using LINQ and .ToList(), since your type is more specific. I've seen people decide they need extra functionality in such lists, and then add it to the subclassed list class. This is problematic, because the List implementation ignores such extra bits and may violate your constraints. For example, if you demand uniqueness and declare a new Add method, anyone that simply (and legally) upcasts to List<string>, for instance by passing the list as a parameter typed as such, will use the default list Add, not your new Add. You can only add functionality, never remove it - and there are no protected or virtual members that require subclassing to exploit.
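The uniqueness problem is easy to reproduce. In this sketch, UniqueFundIdList and HidingDemo are invented names, and the fund IDs are placeholder values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subclass that tries to enforce uniqueness by hiding Add.
public class UniqueFundIdList : List<string>
{
    public new void Add(string fundId)
    {
        if (!Contains(fundId))
        {
            base.Add(fundId);
        }
    }
}

public static class HidingDemo
{
    // Any code that only knows about List<string> calls List<string>.Add.
    public static void AddTwice(List<string> list, string fundId)
    {
        list.Add(fundId);
        list.Add(fundId);
    }

    public static int CountAfterUpcast()
    {
        UniqueFundIdList funds = new UniqueFundIdList();
        funds.Add("F001");
        funds.Add("F001");          // ignored by the hiding Add
        AddTwice(funds, "F002");    // bypasses the hiding Add, adds a duplicate
        return funds.Count;
    }
}
```

CountAfterUpcast() ends up with three items, two of them "F002", because List<string>.Add is not virtual and the hiding method is only picked when the static type is UniqueFundIdList.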
So you can't really add any functionality you couldn't with an extension method, and your types aren't fully compatible anymore which limits what you can do with your list.
I prefer declaring a struct FundId containing a string and implementing whatever guarantees concerning that string you need there, and then working with a List<FundId> rather than a List<string>.
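Roughly what such a struct could look like. The validation rule (non-empty, at most 10 characters) is an invented placeholder, and IsInternal follows the 'I' prefix idea mentioned earlier; neither is a real rule from the question.

```csharp
using System;

// Sketch of a FundId value type wrapping a string.
public struct FundId : IEquatable<FundId>
{
    private readonly string value;

    public FundId(string value)
    {
        // Placeholder validation: adjust to whatever a real fund id requires.
        if (string.IsNullOrEmpty(value) || value.Length > 10)
        {
            throw new ArgumentException("Invalid fund id.", "value");
        }
        this.value = value;
    }

    // Example of a domain property that lives with the type.
    public bool IsInternal
    {
        get { return value != null && value.StartsWith("I", StringComparison.Ordinal); }
    }

    public bool Equals(FundId other)
    {
        return string.Equals(value, other.value, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return obj is FundId && Equals((FundId)obj);
    }

    public override int GetHashCode()
    {
        return value == null ? 0 : value.GetHashCode();
    }

    public override string ToString()
    {
        return value ?? string.Empty;
    }
}
```

One thing to keep in mind with a struct: default(FundId) skips the constructor, so the wrapped string can still be null there, which is why the members above guard against it.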
Finally, do you really mean List<>? I see many people use List<> for things for which IEnumerable<> or plain arrays are more suitable. Exposing your internal List in an API is particularly tricky since that means any API user can add/remove/change items. Even if you copy your list first, such a return value is still misleading, since people might expect to be able to add/remove/change items. And if you're not exposing the List in an API but merely using it for internal bookkeeping, then it's not nearly as interesting to declare and use a type that adds no functionality, only documentation.
Conclusion
Only use List<> for internals, and don't subclass it if you do. If you want some explicit type-safety, wrap string in a struct (not a class, since a struct is more efficient here and has better semantics: there's no confusion between a null FundId and a null string, and object equality and hashcode work as expected with structs but need to be manually specified for classes). Finally, expose IEnumerable<> if you need to support enumeration, or if you need indexing as well use the simple ReadOnlyCollection<> wrapper around your list rather than let the API client fiddle with internal bits. If you really need a mutable list API, ObservableCollection<> at least lets you react to changes the client makes.
Personally I would leave it as a List<string>, or possibly create a FundId class that wraps a string and then store a List<FundId>.
The List<FundId> option would enforce type correctness and allow you to put some validation on FundIds.
Just leave it as a List<string>; your variable name is enough to tell others that it's storing FundIDs.
var fundIDList = new List<string>();
When do I need to inherit List<T>?
Inherit it if you have really special actions/operations to do to a fund id list.
public class FundIdList : List<string>
{
public void SpecialAction()
{
//can only do with a fund id list
//sorry I can't give an example :(
}
}
Unless I was going to want someone to do everything they could to List<string>, without any intervention on the part of FundIdList I would prefer to implement IList<string> (or an interface higher up the hierarchy if I didn't care about most of that interface's members) and delegate calls to a private List<string> when appropriate.
And if I did want someone to have that degree of control, I'd probably just give them a List<string> in the first place. Presumably you have something to make sure such strings actually are "Fund IDs", which you can't guarantee any more when you publicly use inheritance.
Actually, this sounds (and often does with List<T>) like a natural case for private inheritance. Alas, C# doesn't have private inheritance, so composition is the way to go.
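A sketch of the composition approach, using ICollection<string> as the "interface higher up the hierarchy" to keep it short. The rule enforced in Add (no empty IDs, no duplicates) is a placeholder for whatever actually makes a string a fund ID.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// FundIdList owns a List<string> and exposes only ICollection<string>.
public class FundIdList : ICollection<string>
{
    private readonly List<string> items = new List<string>();

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return false; } }

    // Every add goes through here, whatever the static type of the reference.
    public void Add(string fundId)
    {
        if (string.IsNullOrEmpty(fundId))
        {
            throw new ArgumentException("Fund id must not be empty.", "fundId");
        }
        if (!items.Contains(fundId))
        {
            items.Add(fundId);
        }
    }

    // The remaining members simply delegate to the private list.
    public void Clear() { items.Clear(); }
    public bool Contains(string fundId) { return items.Contains(fundId); }
    public void CopyTo(string[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public bool Remove(string fundId) { return items.Remove(fundId); }

    public IEnumerator<string> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Because there is no base-class Add to fall back to, treating the object as an ICollection<string> still runs the same checks, which is the guarantee public inheritance from List<string> cannot give.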

Does the C# 4.0 "dynamic" keyword make Generics redundant?

I'm very excited about the dynamic features in C# (C#4 dynamic keyword - why not?), especially because in certain Library parts of my code I use a lot of reflection.
My question is twofold:
1. does "dynamic" replace Generics, as in the case below?
Generics method:
public static void Do_Something_If_Object_Not_Null<SomeType>(SomeType ObjToTest) {
//test object is not null, regardless of its Type
if (!EqualityComparer<SomeType>.Default.Equals(ObjToTest, default(SomeType))) {
//do something
}
}
dynamic method(??):
public static void Do_Something_If_Object_Not_Null(dynamic ObjToTest) {
//test object is not null, regardless of its Type?? but how?
if (ObjToTest != null) {
//do something
}
}
2. does "dynamic" now allow for methods to return Anonymous types, as in the case below?:
public static List<dynamic> ReturnAnonymousType() {
return MyDataContext.SomeEntities.Entity.Select(e => new { e.Property1, e.Property2 }).ToList();
}
cool, cheers
EDIT:
Having thought through my question a little more, and in light of the answers, I see I completely messed up the main generic/dynamic question. They are indeed completely different. So yeah, thanks for all the info.
What about point 2 though?
dynamic might simplify a limited number of reflection scenarios (where you know the member-name up front, but there is no interface) - in particular, it might help with generic operators (although other answers exist) - but other than the generic operators trick, there is little crossover with generics.
Generics allow you to know (at compile time) about the type you are working with - conversely, dynamic doesn't care about the type.
In particular - generics allow you to specify and prove a number of conditions about a type - e.g. it might implement some interface, or have a public parameterless constructor. dynamic doesn't help with either: it doesn't support interfaces, and worse than simply not caring about interfaces, it means that we can't even see explicit interface implementations with dynamic.
Additionally, dynamic is really a special case of object, so boxing comes into play, but with a vengeance.
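As an illustration of "specify and prove", here is a sketch with both kinds of constraint mentioned above. ConstraintDemo and MaxOrNew are invented names.

```csharp
using System;
using System.Collections.Generic;

public static class ConstraintDemo
{
    // The constraints let the compiler prove that T can be compared
    // with CompareTo and created with "new T()".
    public static T MaxOrNew<T>(IEnumerable<T> items) where T : IComparable<T>, new()
    {
        T best = new T();
        bool found = false;
        foreach (T item in items)
        {
            if (!found || item.CompareTo(best) > 0)
            {
                best = item;
                found = true;
            }
        }
        // For an empty sequence this is just a freshly constructed T.
        return best;
    }
}
```

Calling MaxOrNew with an int[] compiles and runs without boxing, while calling it with a string[] is rejected at compile time because string has no public parameterless constructor. With dynamic, that kind of mistake would only surface when the code runs.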
In reality, you should limit your use of dynamic to a few cases:
COM interop
DLR interop
maybe some light duck typing
maybe some generic operators
For all other cases, generics and regular C# are the way to go.
To answer your question. No.
Generics gives you "algorithm reuse" - you write code independent of a data type. The dynamic keyword doesn't do anything related to this. I define List<T> and then I can use it for lists of strings, ints, etc.
Type safety: The whole compile-time checking debate. Dynamic variables will not alert you with compile-time warnings/errors in case you make a mistake; they will just blow up at runtime if the method you attempt to invoke is missing. This is the static vs dynamic typing debate.
Performance: Generics significantly improves the performance of algorithms/code using value types. It prevents the whole boxing-unboxing cycle that cost us pre-generics. Dynamic doesn't do anything for this either.
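The type-safety and boxing points can be seen side by side in a small sketch. TypeSafetyDemo is an invented name; ArrayList is the pre-generics collection from System.Collections.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetyDemo
{
    public static int SumGeneric()
    {
        List<int> numbers = new List<int> { 1, 2, 3 };
        // numbers.Add("four");   // would not compile: string is not int
        int sum = 0;
        foreach (int n in numbers) { sum += n; }   // no boxing, no casts
        return sum;
    }

    public static bool NonGenericFailsAtRuntime()
    {
        ArrayList numbers = new ArrayList { 1, 2, 3 };   // each int is boxed
        numbers.Add("four");                             // compiles fine
        try
        {
            int sum = 0;
            foreach (object o in numbers) { sum += (int)o; }   // unbox; throws on "four"
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

The generic version rejects the bad Add before the program ever runs; the ArrayList version accepts it and only fails when the string is unboxed as an int.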
What the dynamic keyword would give you is
simpler code (when you are interoperating with Excel, let's say). You don't need to specify the name of the classes or the object model. If you invoke the right methods, the runtime will take care of invoking that method if it exists in the object at that time. The compiler lets you get away with it even if the method is not defined. However, it implies that this will be slower than making a compiler-verified/statically typed method call, since the runtime has to perform checks before invoking a field/method on a dynamic variable.
The dynamic variable can hold different types of objects at different points of time - You're not bound to a specific family or type of objects.
To answer your first question, generics are resolved at compile time, dynamic types at runtime. So there is a definite difference in type safety and speed.
Dynamic classes and Generics are completely different concepts. With generics you define types at compile time. They don't change, they are not dynamic. You just put a "placeholder" to some class or method to make the calling code define the type.
Dynamic method calls are resolved at runtime. You don't have compile-time type safety there. Using dynamic is similar to holding an object reference and calling methods by their string names using reflection.
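A quick sketch of what "no compile-time type safety" means in practice. DynamicDemo and NoSuchMethod are made-up names; RuntimeBinderException is the exception the C# runtime binder throws when a member cannot be found.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicDemo
{
    public static bool MissingMemberFailsAtRuntime()
    {
        dynamic text = "hello";
        int length = text.Length;      // resolved at runtime, works
        try
        {
            text.NoSuchMethod();       // compiles, but string has no such method
            return false;
        }
        catch (RuntimeBinderException)
        {
            // The mistake is only discovered here, while the program runs.
            return length == 5;
        }
    }
}
```

With a statically typed string variable, the NoSuchMethod call would be a compile error instead.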
Answer to the second question: You can return anonymous types in C# 3.0. Cast the type to object, return it and use reflection to access its members. The dynamic keyword is just syntactic sugar for that.
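Both routes side by side, as a sketch. AnonymousDemo and the Id/Units properties are invented for this example.

```csharp
using System;

public static class AnonymousDemo
{
    // The anonymous type cannot be named, so the method returns object.
    public static object MakeFund()
    {
        return new { Id = "F001", Units = 42 };
    }

    // C# 3.0 style: read a property through reflection.
    public static string ReadIdWithReflection()
    {
        object fund = MakeFund();
        return (string)fund.GetType().GetProperty("Id").GetValue(fund, null);
    }

    // C# 4.0 style: let the runtime binder find the property.
    public static string ReadIdWithDynamic()
    {
        dynamic fund = MakeFund();
        return fund.Id;
    }
}
```

One caveat: anonymous types are internal to the assembly that creates them, so the dynamic version only works when the calling code lives in the same assembly. Either way, a misspelled property name is only caught at runtime.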

Will C#4 allow "dynamic casting"? If not, should C# support it?

I don't mean dynamic casting in the sense of casting a lower interface or base class to a more derived class, I mean taking an interface definition that I've created, and then dynamically casting to that interface a different object NOT derived from that interface but supporting all the calls.
For example,
interface IMyInterface
{
bool Visible
{
get;
}
}
TextBox myTextBox = new TextBox();
IMyInterface i = (dynamic<IMyInterface>)myTextBox;
This could be achieved at compile time for known types, and runtime for instances declared with dynamic. The interface definition is known, as is the type (in this example) so the compiler can determine if the object supports the calls defined by the interface and perform some magic for us to have the cast.
My guess is that this is not supported in C#4 (I was unable to find a reference to it), but I'd like to know for sure. And if it isn't, I'd like to discuss if it should be included in a future variant of the language or not, and the reasons for and against. To me, it seems like a nice addition to enable greater polymorphism in code without having to create whole new types to wrap existing framework types.
Update
Lest someone accuse me of plagiarism, I was not aware of Jon Skeet having already proposed this. However, nice to know we thought of exceedingly similar syntax, which suggests it might be intuitive at least. Meanwhile, "have an original idea" remains on my bucket list for another day.
I think Jon Skeet has had such a proposal (http://msmvps.com/blogs/jon_skeet/archive/2008/10/30/c-4-0-dynamic-lt-t-gt.aspx), but so far, I haven't heard that C# 4.0 is going to have it.
I think that's problematic. You are introducing coupling between two classes which are not coupled.
Consider the following code.
public interface IFoo
{
int MethodA();
int MethodB();
}
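public class Bar
{
// Placeholder bodies; Bar matches IFoo's shape but does not implement it.
public int MethodA() { return 1; }
public int MethodB() { return 2; }
}
public class SomeClass
{
public int MethodFoo(IFoo someFoo) { return someFoo.MethodA() + someFoo.MethodB(); }
}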
should this then be legal?
int blah = someClass.MethodFoo((dynamic<IFoo>)bar);
It seems like it should be legal, because the compiler should be able to dynamically type bar as something that implements IFoo.
However, at this point you are coupling IFoo and Bar through a call in a completely separate part of your code.
If you edit Bar because it no longer needs MethodB, suddenly someClass.MethodFoo doesn't work anymore, even though Bar and IFoo are not related.
In the same way, if you add MethodC() to IFoo, your code would break again, even though IFoo and Bar are ostensibly not related.
The fact is, although this would be useful in select cases where there are similarities amongst objects that you do not control, there is a reason that interfaces have to be explicitly attached to objects, and the reason is so that the compiler can guarantee that the object implements it.
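For comparison, this is the explicit route available today: a hand-written adapter. BarAsFoo is a hypothetical name, and Bar's method bodies are placeholders standing in for a type you do not control.

```csharp
public interface IFoo
{
    int MethodA();
    int MethodB();
}

// Stands in for a type you do not control; the bodies are placeholders.
public class Bar
{
    public int MethodA() { return 1; }
    public int MethodB() { return 2; }
}

// Hypothetical adapter: the coupling between Bar and IFoo lives in one
// named class, and the compiler checks it.
public class BarAsFoo : IFoo
{
    private readonly Bar bar;

    public BarAsFoo(Bar bar) { this.bar = bar; }

    public int MethodA() { return bar.MethodA(); }
    public int MethodB() { return bar.MethodB(); }
}
```

If MethodB is later removed from Bar, or MethodC is added to IFoo, the break shows up as a compile error in BarAsFoo rather than at some distant call site.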
There is no need for C# to support this, as it can be implemented very cleanly as a library.
I've seen three or four separate implementations (I started writing one myself before I found them). Here's the most thorough treatment I've seen:
http://bartdesmet.net/blogs/bart/archive/2008/11/10/introducing-the-c-ducktaper-bridging-the-dynamic-world-with-the-static-world.aspx
It will probably be even easier to implement once the DLR is integrated into the runtime.
Because the wrapper/forwarder class for a given interface can be generated once and then cached, and then a given object of unknown type can be wrapped once, there is a lot of scope for caching of call sites, etc. so the performance should be excellent.
In contrast, I think the dynamic keyword, which is a language feature, and a hugely complex one, is an unnecessary and potentially disastrous digression, shoe-horned into a language that previously had a very clear static typing philosophy, which gave it an obvious direction for future improvement. They should have stuck to that and made the type inference work better and better until typing became more invisible. There are so many areas where they could evolve the language, without breaking existing programs, and yet they don't, simply due to resource constraints (e.g. the reason var can't be used in more places is because they would have to rewrite the compiler and they don't have time).
They are still doing good stuff in C# 4.0 (the variance features) but there is so much else that could be done to make the type system smarter, more automatic, more powerful at detecting problems at compile time. Instead, we're essentially getting a gimmick.
The open source framework Impromptu-Interface does this using C# 4 and the DLR.
using ImpromptuInterface;
interface IMyInterface
{
bool Visible
{
get;
}
}
TextBox myTextBox = new TextBox();
IMyInterface i = myTextBox.ActLike<IMyInterface>();
Since it uses the DLR it will also work with ExpandoObject and DynamicObject.
