Why doesn't IEnumerable<T> implement Add(T)? - c#

Just now find it by chance, Add(T) is defined in ICollection<T>, instead of IEnumerable<T>. And extension methods in Enumerable.cs don't contain Add(T), which I think is really weird. Since an object is enumerable, it must "looks like" a collection of items. Can anyone tell me why?

An IEnumerable<T> is just a sequence of elements; see it as a forward only cursor. Because a lot of those sequences are generating values, streams of data, or record sets from a database, it makes no sense to Add items to them.

IEnumerable is for reading, not for writing.

An enumerable is exactly that - something you can enumerate over and discover all the items. It does not imply that you can add to it.
Being able to enumerate is universal to many types of objects. For example, it is shared by arrays and collections. But you can't 'add' to an array without messing about with it's structure - whereas a Collection is specifically built to be added to and removed from.
Technically you can 'add' to an enumerable, however - by using Concat<> - however all this does is create an enumerator that enumerates from one enumerable to the next - giving the illusion of a single contigious set.

Each ICollection should be IEnumerable (I think, and the .NET Framework team seems to agree with me ;-)), but the other way around does not always make sense. There is a hierarchy of "collection like objects" in this world, and your assumption that an enumerable would be a collection you can add items to does not hold true in that hierarchy.
Example: a list of primary color names would be an IEnumerable returning "Red", "Blue" and "Green". It would make no logical sense at all to be able to do a primaryColors.Add("Bright Purple") on a "collection" filled like this:
...whatever...
{
...
var primaryColors = EnumeratePrimaryColors();
...
}
private static IEnumerable<string> EnumeratePrimaryColors() {
yield return "Red";
yield return "Blue";
yield return "Green";
}

As its name says, you can enumerate (loop) over an IEnumerable, and that's about it.
When you want to be able to Add something to it, it wouldn't be just an enumerable anymore, since it has extra features.
For instance, an array is an IEnumerable, but an array has a fixed length, so you can't add new items to it.
IEnumerable is just the 'base' for all kind of collections (even readonly collections - which have obviously no Add() method).
The more functionality you'd add to such 'base interface', the more specific it would be.

The name says it all. IEnumerable is for enumerating items only. ICollection is the actual collection of items and thus supports the Add method.

Related

How to count the items of an IEnumerable?

Given an instance IEnumerable o how can I get the item Count? (without enumerating through all the items)
For example, if the instance is of ICollection, ICollection<T> and IReadOnlyCollection<T>, each of these interfaces have their own Count method.
Is getting the Count property by reflection the only way?
Instead, can I check and cast o to ICollection<T> for example, so I can then call Count ?
It depends how badly you want to avoid enumerating the items if the count is not available otherwise.
If you can enumerate the items, you can use the LINQ method Enumerable.Count. It will look for a quick way to get the item count by casting into one of the interfaces. If it can't, it will enumerate.
If you want to avoid enumeration at any cost, you will have to perform a type cast. In a real life scenario you often will not have to consider all the interfaces you have named, since you usually use one of them (IReadOnlyCollection is rare and ICollection only used in legacy code). If you have to consider all of the interfaces, try them all in a separate method, which can be an extension:
static class CountExtensions {
public static int? TryCount<T>(this IEnumerable<T> items) {
switch (items) {
case ICollection<T> genCollection:
return genCollection.Count;
case ICollection legacyCollection:
return legacyCollection.Count;
case IReadOnlyCollection<T> roCollection:
return roCollection.Count;
default:
return null;
}
}
}
Access the extension method with:
int? count = myEnumerable.TryCount();
IEnumerable doesn't promise a count . What if it was a random sequence or a real time data feed from a sensor? It is entirely possible for the collection to be infinitely sized. The only way to count them is to start at zero and increment for each element that the enumerator provides. Which is exactly what LINQ does, so don't reinvent the wheel. LINQ is smart enough to use .Count properties of collections that support this.
The only way to really cover all your possible types for a collection is to use the generic interface and call the Count-method. This also covers other types such as streams or just iterators. Furthermore it will use the Count-property as of Count property vs Count() method? to avoid unneccessary overhead.
If you however have a non-generic collection you´d have to use reflection to use the correct property. However this is cumbersome and may fail if your collection doesn´t even have the property (e.g. an endless stream or just an iterator). On the other hand IEnumerable<T>.Count() will handle those types with the optimization mentioned above. Only if neccessary it will iterate the entire collection.

Is IList really a better alternative to an array

Today I read a post from Eric Lippert that describes the harm of arrays. It´s mentioned that when we need a collection of values we should provide values, not a variable that points to a list of values. Thus Eric suggests that whenever we want to return a collection of items in a method, we should return an IList<T>, which provides the same as an array, namely it:
enables indexed access
enables iterating the items
is strongly typed
However in contrast to an array the list also provides members that add or remove items and thus modify the collection-object. We could of course wrap the collection into a ReadOnlyCollection and return an IEnumerable<T> but then we´d lose the indexed accessability. Moreover a caller can´t know if the ReSharper-warning "possible iterations of the same enumerations" applies as he doesn´t know that internally that enumeration is just a list wrapped within a ReadOnlyCollection. So a caller can´t know if the collection was already materialized or not.
So what I want is a collection of items where the collection itself is immutable (the items however don´t have to be, they won´t on an IList neither), meaning we can´t add/remove/insert items to the underlying list. However it seems weird to me returning a ReadOnlyCollection from my API, at least I´ve never seen an API doing so.
Thus array seems perfect for my needs, doesn´t it?
We could of course wrap the collection into a ReadOnlyCollection and return an IEnumerable<T>
Why do that? ReadOnlyCollection<T> implements IList<T>, so unless there's some better approach, declaring a return type of IList<T> and returning an instance of ReadOnlyCollection<T> seems like a good way to go.
However, it just so happens that in current .NET Framework versions, there is a slightly better way: return an instance of ReadOnlyCollection<T>, but specify a return type of IReadOnlyList<T>. While IList<T> doesn't really promise to allow modification by the caller, IReadOnlyList<T> is explicit about the intent.

Why IReadOnlyCollection has ElementAt but not IndexOf

I am working with a IReadOnlyCollection of objects.
Now I'm a bit surprised, because I can use linq extension method ElementAt(). But I don't have access to IndexOf().
This to me looks a bit illogical: I can get the element at a given position, but I cannot get the position of that very same element.
Is there a specific reason for it?
I've already read -> How to get the index of an element in an IEnumerable? and I'm not totally happy with the response.
IReadOnlyCollection is a collection, not a list, so strictly speaking, it should not even have ElementAt(). This method is defined in IEnumerable as a convenience, and IReadOnlyCollection has it because it inherits it from IEnumerable. If you look at the source code, it checks whether the IEnumerable is in fact an IList, and if so it returns the element at the requested index, otherwise it proceeds to do a linear traversal of the IEnumerable until the requested index, which is inefficient.
So, you might ask why IEnumerable has an ElementAt() but not IndexOf(), but I do not find this question very interesting, because it should not have either of these methods. An IEnumerable is not supposed to be indexable.
Now, a very interesting question is why IReadOnlyList has no IndexOf() either.
IReadOnlyList<T> has no IndexOf() for no good reason whatsoever.
If you really want to find a reason to mention, then the reason is historical:
Back in the mid-nineties when C# was laid down, people had not quite started to realize the benefits of immutability and readonlyness, so the IList<T> interface that they baked into the language was, unfortunately, mutable.
The right thing would have been to come up with IReadOnlyList<T> as the base interface, and make IList<T> extend it, adding mutation methods only, but that's not what happened.
IReadOnlyList<T> was invented a considerable time after IList<T>, and by that time it was too late to redefine IList<T> and make it extend IReadOnlyList<T>. So, IReadOnlyList<T> was built from scratch.
They could not make IReadOnlyList<T> extend IList<T>, because then it would have inherited the mutation methods, so they based it on IReadOnlyCollection<T> and IEnumerable<T> instead. They added the this[i] indexer, but then they either forgot to add other methods like IndexOf(), or they intentionally omitted them since they can be implemented as extension methods, thus keeping the interface simpler. But they did not provide any such extension methods.
So, here, is an extension method that adds IndexOf() to IReadOnlyList<T>:
using Collections = System.Collections.Generic;
public static int IndexOf<T>( this Collections.IReadOnlyList<T> self, T elementToFind )
{
int i = 0;
foreach( T element in self )
{
if( Equals( element, elementToFind ) )
return i;
i++;
}
return -1;
}
Be aware of the fact that this extension method is not as powerful as a method built into the interface would be. For example, if you are implementing a collection which expects an IEqualityComparer<T> as a construction (or otherwise separate) parameter, this extension method will be blissfully unaware of it, and this will of course lead to bugs. (Thanks to Grx70 for pointing this out in the comments.)
It is because the IReadOnlyCollection (which implements IEnumerable) does not necessarily implement indexing, which often required when you want to numerically order a List. IndexOf is from IList.
Think of a collection without index like Dictionary for example, there is no concept of numeric index in Dictionary. In Dictionary, the order is not guaranteed, only one to one relation between key and value. Thus, collection does not necessarily imply numeric indexing.
Another reason is because IEnumerable is not really two ways traffic. Think of it this way: IEnumerable may enumerate the items x times as you specify and find the element at x (that is, ElementAt), but it cannot efficiently know if any of its element is located in which index (that is, IndexOf).
But yes, it is still pretty weird even you think it this way as would expect it to have either both ElementAt and IndexOf or none.
IndexOf is a method defined on List, whereas IReadOnlyCollection inherits just IEnumerable.
This is because IEnumerable is just for iterating entities. However an index doesn't apply to this concept, because the order is arbitrary and is not guaranteed to be identical between calls to IEnumerable. Furthermore the interface simply states that you can iterate a collection, whereas List states you can perform adding and removing also.
The ElementAt method sure does exactly this. However I won't use it as it reiterates the whole enumeration to find one single element. Better use First or just a list-based approach.
Anyway the API design seems odd to me as it allows an (inefficient) approach on getting an element at n-th position but does not allow to get the index of an arbitrary element which would be the same inefficient search leading to up to n iterations. I'd agree with Ian on either both (which I wouldn't recommend) or neither.
IReadOnlyCollection<T> has ElementAt<T>() because it is an extension to IEnumerable<T>, which has that method. ElementAt<T>() iterates over the IEnumerable<T> a specified number of iterations and returns value as that position.
IReadOnlyCollection<T> lacks IndexOf<T>() because, as an IEnumerable<T>, it does not have any specified order and thus the concept of an index does not apply. Nor does IReadOnlyCollection<T> add any concept of order.
I would recommend IReadOnlyList<T> when you want an indexable version of IReadOnlyCollection<T>. This allows you to correctly represent an unchangeable collection of objects with an index.
This extension method is almost the same as Mike's. The only difference is that it includes a predicate, so you can use it like this: var index = list.IndexOf(obj => obj.Id == id)
public static int IndexOf<T>(this IReadOnlyList<T> self, Func<T, bool> predicate)
{
for (int i = 0; i < self.Count; i++)
{
if (predicate(self[i]))
return i;
}
return -1;
}

IEnumerable to List

Why IEnumerable.ToList() won't work if like:
var _listReleases= new List<string>;
_listReleases.Add("C#")
_listReleases.Add("Javascript");
_listReleases.Add("Python");
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
_listReleases.Clear();
_listReleases.AddRange(sortedItems); // won't work
_listReleases.AddRange(sortedItems.ToList()); // won't work
Note: _listRelealse will be null
It doesn't work because of this line:
_listReleases.Clear();
First of all, _listReleases is not null at this point. It's merely empty, which is a completely different thing.
But to explain why this doesn't work as you expect: the IEnumerable interface type does not actually allocate or reserve storage for anything. It represents an object that you can use with a foreach loop, and nothing more. It does not actually need to store the items in the collection itself.
Sometimes, an IEnumerable reference does have those items in the same object, but it doesn't have to. That's what's going on here. The OrderBy() extension method only creates an object that knows how to look at the original list and return the items in a specific order. But this does not have storage for those items. It still depends on it's original data source.
The best solution for this situation is to stop using the _listReleases variable at this point, and instead just use the sortedItems variable. As long the former is not garabage collected, the latter will do what you need. But if you really want the _listReleases variable, you can do it like this:
_listReleases = sortedItems.ToList();
Now back to IEnumerables. There are some nice benefits to this property of not requiring immediate storage of the items themselves, and merely abstracting the ability to iterate over a collection:
Lazy Evaluation - That the work required to produce those items is not done until called for (and often, that means it won't need to be done all all, greatly improving performance).
Composition - An IEnumerable object can be modified during a program to incorprate new sets of rules or operations into the final result. This reduces program complexity and improves maintainability by allowing you to break apart a complex set of sorting or filtering requirements into it's component parts. This also makes it much easier to build a program where these rules can be easily determined by the user at run time, instead of in advance by the programmer at compile time.
Memory Efficiency - An IEnumerable makes it possible to iterate collections of data from sources such as a database in ways that only need to keep the current record loaded into memory at any given time. This feature can also be used to create unbounded collections: sets of items that may stretch on to infinity. You can build an IEnumerable with the BigInteger type to calculate the next prime on to infinity, if asked for. Moreover, you could use that collection in a useful way without crashing or hanging the program by combining this with the composition feature, so the program will know when to stop.
LINQ is lazily evaluated. When you run this line:
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
You aren't actually ordering the items right then and there. Instead you're building an enumerable that will, when enumerated, return the objects that are currently in _listReleases in order. So when you Clear() the list, it no longer has any items to order.
You need to force it to evaluate before you clear _listReleases. An easy way to do this is to add a ToList() call. Also, the type IEnumerable isn't compatible with AddRange won't accept it. You can just use var to implicitly type it to List<string>, which will work because List<T> : IEnumerable<T> (it implements the interface).
var sortedItems = _listReleases.OrderBy(x => x).ToList();
_listReleases.Clear();
_listReleases.AddRange(sortedItems);
You should also note that methods like ToList() are extension methods for IEnumerable<T>, not IEnumerable, so ((IEnumerable)something).ToList() won't work. Unlike, say, Java, Something<T> and Something are completely distinct types in C#.

Why create an IEnumerable?

I don't understand why I'd create an IEnumerable. Or why it's important.
I'm looking at the example for IEnumerable:
http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx
But I can basically do the same thing if I just went:
List<Person> people = new List<Person>();
so what's IEnumerable good for? Can you give me a situation where I'd need to create a class that implements IEnumerable?
IEnumerable is an interface, it exposes certain things to the outside. While you are completely right, you could just use a List<T>, but List<T> is very deep in the inheritance tree. What exactly does a List<T>? It stores items, it offers certain methods to Add and Remove. Now, what if you only need the "item-keeping" feature of a List<T>? That's what an IEnumerable<T> is - an abstract way of saying "I want to get a list of items I can iterate over". A list is "I want to get a collection which I can modify, can access by index and iterate". List<T> offers a lot more functionality than IEnumerable<T> does, but it takes up more memory. So if a method is taking an IEnumerable<T>, it doesn't care what exactly it gets, as long as the object offers the possibilites of IEnumerable<T>.
Also, you don't have to create your own IEnumerable<T>, a List<T> IS an IEnumerable<T>!
Lists are, of course IEnumerable - As a general rule, you want to be specific on what you output but broad on what you accept as input eg:
You have a sub which loops through a list of objects and writes something to the console...
You could declare the parameter is as either IEnumerable<T> or IList<T> (or even List<T>). Since you don't need to add to the input list, all you actually need to do is enumerate - so use IEnumerable - then your method will also accept other types which implement IEnumerable including IQueryable, Linked Lists, etc...
You're making your methods more generic for no cost.
Today, you generally wouldn't use IEnumerable anymore unless you were supporting software on an older version of the framework. Today, you'd normally use IEnumerable<T>. Amongst other benefits, IEnumerable fully implements all of the LINQ operations/extensions so that you can easily query any List type that implements IEnumerable<T> using LINQ.
Additionally, it doesn't tie the consumer of your code to a particular collection implementation.
It's rare that nowdays you need to create your own container classes, as you are right there alreay exists many good implementations.
However if you do create your own container class for some specific reason, you may like to implement IEnumerable or IEnumerable<T> because they are a standard "contract" for itteration and by providing an implementation you can take advantage of methods/apis that want an IEnumerable or IEnumerable<T> Linq for example will give you a bunch of useful extension methods for free.
An IList can be thought of as a particular implementation of IEnumerable. (One that can be added to and removed from easily.) There are others, such as IDictionary, which performs an entirely different function but can still be enumerated over. Generally, I would use IEnumerable as a more generic type reference when I only need an enumeration to satisfy a requirement and don't particularly care what kind it is. I can pass it an IList and more often than not I do just that, but the flexibility exists to pass it other enumerations as well.
Here is one situation that I think I have to implement IEnumerable but not using List<>
I want to get all items from a remote server. Let say I have one million items going to return. If you use List<> approach, you need to cache all one million items in the memory first. In some cases, you don't really want to do that because you don't want to use up too much memory. Using IEnumerable allows you to display the data on the screen and then dispose it right away. Therefore, using IEnumerable approach, the memory footprint of the program is much smaller.
It's my understanding that IEnumerable is provided to you as an interface for creating your own enumerable class types.
I believe a simple example of this would be recreating the List type, if you wanted to have your own set of features (or lack thereof) for it.
What if you want to enumerate over a collection that is potentially of infinite size, such as the Fibonacci numbers? You couldn't do that easily with a list, but if you had a class that implemented IEnumerable or IEnumerable<T>, it becomes easy.
When a built in container fits your needs you should definitely use that, and than IEnumerable comes for free. When for whatever reason you have to implement your own container, for example if it must be backed by a DB, than you should make sure to implement both IEnumerable and IEnumerable<T> for two reasons:
It makes foreach work, which is awesome
It enables almost all LINQ goodness. For example you will be able to filter your container down to objects that match a condition with an elegant one liner.
IEnumerable provides means for your API users (including yourself) to use your collection by the means of a foreach. For example, i implemented IENumerable in my Binary Tree class so i could just foreach over all of the items in the tree without having to Ctrl+C Ctrl+V all the logic required to traverse the tree InOrder.
Hope it helps :)
IEnumerable is useful if you have a collection or method which can return a bunch of things, but isn't a Dictionary, List, array, or other such predefined collection. It is especially useful in cases where the set of things to be returned might not be available when one starts outputting it. For example, an object to access records in a database might implement iEnumerable. While it might be possible for such an object to read all appropriate records into an array and return that, that may be impractical if there are a lot of records. Instead, the object could return an enumerator which could read the records in small groups and return them individually.

Categories