I searched the internet for "What is the IEnumerable interface in C#? What problem does it solve? What if we don't use it?" but did not get much out of it. Lots of posts explain how to implement it.
I've also found the following example:
List<string> list = new List<string>();
list.Add("Sourav");
list.Add("Ram");
list.Add("Sachin");
IEnumerable<string> names = from n in list where (n.StartsWith("S")) select n;
// var names = from n in list where (n.StartsWith("S")) select n;
foreach (string name in names)
{
    Console.WriteLine(name);
}
The above example outputs:
Sourav
Sachin
I wanted to know: what is the advantage of using IEnumerable in the above example? I can achieve the same using 'var' (the commented line).
I would appreciate it if anyone could help me understand this. What is the benefit of using IEnumerable, ideally with an example? What if we don't use it?
Beyond reading the documentation, I'd describe IEnumerable<T> as a collection of Ts. It can be iterated over and many other operations can be carried out on it (such as Where(), Any() and Count()), but it's not designed for adding and removing elements. That's what a List<T> is for.
It's useful because it's a fundamental interface for many collections; various data access layers and ORMs use it, and many extension methods are automatically available for it.
Many concrete implementations of lists, arrays, bags, queues and stacks all implement it, allowing a wide variety of collections to use its extension methods.
Also, collections implementing either IEnumerable or IEnumerable<T> can be used in a foreach loop.
From MSDN:
for each element in an array or an object collection that implements the System.Collections.IEnumerable or System.Collections.Generic.IEnumerable<T> interface.
In your code example you've got a variable called names which will be an IEnumerable<string>. It's important to understand that it will be an IEnumerable<string> regardless of whether you use the var keyword or not. var just allows you to avoid writing the type out explicitly each time.
TL;DR
It's a common base interface for many different types of collections which lets you use your collection in foreach loops and provides a lot of extra extension methods for free.
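To make the point about var concrete, here is a minimal sketch (the NamesDemo class and StartingWithS method are made-up names for illustration). It declares the same query once with the explicit type and once with var, and it accepts any IEnumerable<string>, so a List<string> and an array both work:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NamesDemo
{
    // Accepts anything enumerable: a List<string>, an array, a LINQ query, ...
    public static List<string> StartingWithS(IEnumerable<string> source)
    {
        // Both variables have the static type IEnumerable<string>;
        // 'var' only saves typing the type name.
        IEnumerable<string> explicitlyTyped = source.Where(n => n.StartsWith("S"));
        var implicitlyTyped = source.Where(n => n.StartsWith("S"));

        // Both queries yield the same elements.
        if (!explicitlyTyped.SequenceEqual(implicitlyTyped))
            throw new InvalidOperationException("The two queries differ.");

        return explicitlyTyped.ToList();
    }
}
```

Calling NamesDemo.StartingWithS with the list from the question returns "Sourav" and "Sachin"; passing a string[] instead works without any change to the method.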
IEnumerable and the much preferred IEnumerable<T> are the standard way to handle the 'sequence of elements' pattern.
The idea is that each type implementing IEnumerable<T> looks as if it carried a label: "ENUMERATE ME". No matter what's there (a queue of order items, a collection of controls, records from a SQL query, XML element subnodes and so on), it's all the same from the enumerable's point of view: you've got a sequence and you can do something for each item in the sequence.
Note that IEnumerable is somewhat limited: there's no count, no indexed access, no guarantee of repeatable results, and no way to check whether the enumerable is empty other than getting the enumerator and checking whether it yields anything. This simplicity allows it to cover almost all use cases, from collections to ad-hoc sequences (custom iterators, LINQ queries etc).
MSDN
"The disadvantage of omitting IEnumerable and IEnumerator is that the collection class is no longer interoperable with the foreach statements, or equivalent statements, of other common language runtime languages."
So you need to implement this interface so your custom collection type can be used with other CLR languages. It seems like a CLS requirement.
Related
Given an instance IEnumerable o how can I get the item Count? (without enumerating through all the items)
For example, if the instance is an ICollection, ICollection<T> or IReadOnlyCollection<T>, each of these interfaces has its own Count property.
Is getting the Count property by reflection the only way?
Instead, can I check and cast o to ICollection<T> for example, so I can then call Count ?
It depends how badly you want to avoid enumerating the items if the count is not available otherwise.
If you can enumerate the items, you can use the LINQ method Enumerable.Count. It will look for a quick way to get the item count by casting into one of the interfaces. If it can't, it will enumerate.
If you want to avoid enumeration at any cost, you will have to perform a type cast. In a real life scenario you often will not have to consider all the interfaces you have named, since you usually use one of them (IReadOnlyCollection is rare and ICollection only used in legacy code). If you have to consider all of the interfaces, try them all in a separate method, which can be an extension:
using System.Collections;
using System.Collections.Generic;

static class CountExtensions
{
    // Returns the count without enumerating, or null if no Count property is available.
    public static int? TryCount<T>(this IEnumerable<T> items)
    {
        switch (items)
        {
            case ICollection<T> genCollection:
                return genCollection.Count;
            case ICollection legacyCollection:
                return legacyCollection.Count;
            case IReadOnlyCollection<T> roCollection:
                return roCollection.Count;
            default:
                return null;
        }
    }
}
Access the extension method with:
int? count = myEnumerable.TryCount();
IEnumerable doesn't promise a count. What if it was a random sequence or a real-time data feed from a sensor? It is entirely possible for the sequence to be infinitely sized. The only way to count the elements is to start at zero and increment for each element that the enumerator provides, which is exactly what LINQ does, so don't reinvent the wheel. LINQ is smart enough to use the Count property of collections that support it.
The only way to really cover all possible collection types is to use the generic interface and call the Count() extension method. This also covers other types such as streams or plain iterators. Furthermore, as discussed in "Count property vs Count() method?", it will use the Count property where one is available to avoid unnecessary overhead.
If, however, you have a non-generic collection, you'd have to use reflection to find the correct property. This is cumbersome and may fail if your collection doesn't even have the property (e.g. an endless stream or just an iterator). On the other hand, IEnumerable<T>.Count() will handle those types with the optimization mentioned above. Only if necessary will it iterate the entire collection.
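As a rough illustration of the difference described in these answers, the sketch below uses a made-up CountDemo class whose iterator counts how many elements it has produced. Calling Count() on the iterator has to walk the whole sequence, whereas Count() on a List<T> can, per the Enumerable.Count documentation, use the ICollection<T>.Count property instead:

```csharp
using System.Collections.Generic;
using System.Linq;

static class CountDemo
{
    // Number of elements the iterator below has produced so far.
    public static int Produced;

    // A hypothetical lazy sequence that has no Count property.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Count() must enumerate an iterator to find out how long it is.
    public static int CountIterator(int n)
    {
        return Numbers(n).Count();
    }

    // For a List<T>, Count() can rely on the list's own Count property.
    public static int CountList(List<int> list)
    {
        return list.Count();
    }
}
```

After CountDemo.CountIterator(5), the Produced counter has gone up by five, which shows that every element was visited just to count them.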
Why won't IEnumerable.ToList() work in code like this:
var _listReleases = new List<string>();
_listReleases.Add("C#");
_listReleases.Add("Javascript");
_listReleases.Add("Python");
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
_listReleases.Clear();
_listReleases.AddRange(sortedItems); // won't work
_listReleases.AddRange(sortedItems.ToList()); // won't work
Note: _listReleases will be null
It doesn't work because of this line:
_listReleases.Clear();
First of all, _listReleases is not null at this point. It's merely empty, which is a completely different thing.
But to explain why this doesn't work as you expect: the IEnumerable interface type does not actually allocate or reserve storage for anything. It represents an object that you can use with a foreach loop, and nothing more. It does not actually need to store the items in the collection itself.
Sometimes, an IEnumerable reference does have those items in the same object, but it doesn't have to. That's what's going on here. The OrderBy() extension method only creates an object that knows how to look at the original list and return the items in a specific order. But this object does not have storage for those items. It still depends on its original data source.
The best solution for this situation is to stop using the _listReleases variable at this point, and instead just use the sortedItems variable. As long as the former is not garbage collected, the latter will do what you need. But if you really want the _listReleases variable, you can do it like this (in place of the Clear() and AddRange() calls):
_listReleases = sortedItems.ToList();
Now back to IEnumerables. There are some nice benefits to this property of not requiring immediate storage of the items themselves, and merely abstracting the ability to iterate over a collection:
Lazy Evaluation - The work required to produce those items is not done until called for (and often, that means it won't need to be done at all, greatly improving performance).
Composition - An IEnumerable object can be modified during a program to incorporate new sets of rules or operations into the final result. This reduces program complexity and improves maintainability by allowing you to break apart a complex set of sorting or filtering requirements into its component parts. This also makes it much easier to build a program where these rules can be easily determined by the user at run time, instead of in advance by the programmer at compile time.
Memory Efficiency - An IEnumerable makes it possible to iterate collections of data from sources such as a database in ways that only need to keep the current record loaded into memory at any given time. This feature can also be used to create unbounded collections: sets of items that may stretch on to infinity. You can build an IEnumerable with the BigInteger type to calculate the next prime on to infinity, if asked for. Moreover, you could use that collection in a useful way without crashing or hanging the program by combining this with the composition feature, so the program will know when to stop.
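The three properties above can be sketched together in a few lines. This is only an illustration under simple assumptions: LazyDemo, Naturals and FirstEvenSquares are invented names, and the sequence uses int rather than BigInteger to keep it short. The generator is unbounded, the rules are composed one at a time, and no work happens until ToList() pulls the results:

```csharp
using System.Collections.Generic;
using System.Linq;

static class LazyDemo
{
    // How many values the generator has been asked for so far.
    public static int Generated;

    // An unbounded sequence 1, 2, 3, ...; the consumer decides when to stop.
    public static IEnumerable<int> Naturals()
    {
        for (int i = 1; ; i++)
        {
            Generated++;
            yield return i;
        }
    }

    // Rules are added step by step; nothing is evaluated until ToList().
    public static List<int> FirstEvenSquares(int count)
    {
        IEnumerable<int> query = Naturals();      // no work yet
        query = query.Where(n => n % 2 == 0);     // still no work
        query = query.Select(n => n * n);         // still no work
        return query.Take(count).ToList();        // evaluated here, and only as far as needed
    }
}
```

LazyDemo.FirstEvenSquares(3) returns 4, 16 and 36, and the Generated counter stays small because Take() stops pulling from the unbounded source once it has enough items.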
LINQ is lazily evaluated. When you run this line:
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
You aren't actually ordering the items right then and there. Instead you're building an enumerable that will, when enumerated, return the objects that are currently in _listReleases in order. So when you Clear() the list, it no longer has any items to order.
You need to force it to evaluate before you clear _listReleases. An easy way to do this is to add a ToList() call. Also, AddRange won't accept the non-generic type IEnumerable. You can just use var to implicitly type the variable as List<string>, which will work because List<T> : IEnumerable<T> (it implements the interface).
var sortedItems = _listReleases.OrderBy(x => x).ToList();
_listReleases.Clear();
_listReleases.AddRange(sortedItems);
You should also note that methods like ToList() are extension methods for IEnumerable<T>, not IEnumerable, so ((IEnumerable)something).ToList() won't work. Unlike, say, Java, Something<T> and Something are completely distinct types in C#.
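If you are stuck with a non-generic IEnumerable, one way around this (a sketch, with NonGenericDemo and ToStringList as made-up names) is the Cast<T>() extension method, which is defined on the non-generic interface and gives you an IEnumerable<T> to call ToList() on:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class NonGenericDemo
{
    public static List<string> ToStringList(IEnumerable items)
    {
        // items.ToList() would not compile: ToList() is defined for IEnumerable<T> only.
        // Cast<string>() wraps the non-generic sequence as an IEnumerable<string>;
        // it throws InvalidCastException during enumeration if an element is not a string.
        return items.Cast<string>().ToList();
    }
}
```

OfType<string>() is the more forgiving alternative: it skips elements of other types instead of throwing.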
I have got a question about the order in IEnumerable.
As far as I am aware, iterating through an IEnumerable can be written in the following way:
IEnumerator enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
{
    object obj = enumerator.Current;
    ...
}
Now, assume that one needs to operate on a sorted collection. Can IEnumerable be used in this case or is it better to try other means (e.g. IList) with indexing support?
In other words: does the contract of IEnumerable make any guarantees about the order in general?
So, IEnumerable is not a proper choice for a generic interface that guarantees ordering. The new question is what interface or class should be used for an immutable collection with order. ReadOnlyCollection? IList? Both of them contain an Add() method (even if it is not implemented in the former).
My own thoughts: IEnumerable does not provide any guarantees about the ordering. A correct implementation could return the same elements in a different order in different enumerations (consider an SQL query).
I am aware of LINQ First(), but if IEnumerable does not say a word about its ordering, this extension is pretty useless.
IEnumerable/IEnumerable<T> makes no guarantees about ordering, but the implementations that use IEnumerable/IEnumerable<T> may or may not guarantee ordering.
For instance, if you enumerate List<T>, order is guaranteed, but if you enumerate HashSet<T> no such guarantee is provided, yet both will be enumerated using the IEnumerable<T> interface.
Implementation detail. IEnumerable will enumerate the items; how that is done is up to the implementation. MOST lists etc. run along their natural order (index 0 upward etc.).
does the contract of IEnumerable guarantee us some order in general case?
No, it guarantees enumeration only (every item one time etc.). IEnumerable has no guaranteed order because it is also usable on unordered items.
I know about LINQ First(), but if IEnumerable does not say a word about its order, this extension is rather useless.
No, it is not, because you may have an intrinsic order. You give SQL as an example: the result is an IEnumerable, but if I have enforced ordering beforehand (by using OrderBy()), then the IEnumerable is ordered per the definition of LINQ. AsEnumerable().First() then gets me the first item in that order.
Perhaps you are looking for the IOrderedEnumerable interface? It is returned by extension methods like OrderBy() and allows for subsequent sorting with ThenBy().
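A short sketch of how that looks in practice (OrderingDemo and FirstByLengthThenAlphabet are hypothetical names): once the query itself defines the order, First() is meaningful even if the source is an unordered collection such as a HashSet<string>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrderingDemo
{
    public static string FirstByLengthThenAlphabet(IEnumerable<string> words)
    {
        // OrderBy returns IOrderedEnumerable<string>, which is what makes ThenBy available.
        IOrderedEnumerable<string> ordered = words
            .OrderBy(w => w.Length)
            .ThenBy(w => w, StringComparer.Ordinal);

        // The order is now defined by the query, not by the underlying collection.
        return ordered.First();
    }
}
```

For the words "pear", "fig", "kiwi", "apple" and "ant", this returns "ant": the shortest words are "fig" and "ant", and "ant" sorts first of the two.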
You mix two points: enumerating and ordering.
When you enumerate over IEnumerable you should not care about order. You work with the interface, and its implementation should care about order.
For instance:
void Enumerate(IEnumerable sequence)
{
    // loop
}
SortedList<T> sortedList = ...
Enumerate(sortedList);
Inside the method it's still a list with a fixed order, but the method doesn't know about the particular interface implementation and its peculiarities.
I am confused about which collection type I should return from my public API methods and properties.
The collections that I have in mind are IList<T>, ICollection<T> and Collection<T>.
Is returning one of these types always preferred over the others, or does it depend on the specific situation?
ICollection<T> is an interface that exposes collection semantics such as Add(), Remove(), and Count.
Collection<T> is a concrete implementation of the ICollection<T> interface.
IList<T> is essentially an ICollection<T> with random order-based access.
In this case you should decide whether or not your results require list semantics such as order based indexing (then use IList<T>) or whether you just need to return an unordered "bag" of results (then use ICollection<T>).
Generally you should return a type that is as general as possible, i.e. one that knows just enough of the returned data that the consumer needs to use. That way you have greater freedom to change the implementation of the API, without breaking the code that is using it.
Consider also the IEnumerable<T> interface as return type. If the result is only going to be iterated, the consumer doesn't need more than that.
The main difference between the IList<T> and ICollection<T> interfaces is that IList<T> allows you to access elements via an index. IList<T> describes array-like types. Elements in an ICollection<T> can only be accessed through enumeration. Both allow the insertion and deletion of elements.
If you only need to enumerate a collection, then IEnumerable<T> is to be preferred. It has several advantages over the others:
It disallows changes to the collection (but not to the elements, if they are of reference type).
It allows the largest possible variety of sources, including enumerations that are generated algorithmically and are not collections at all.
It allows lazy evaluation and can be queried with LINQ.
Collection<T> is a base class that is mainly useful to implementers of collections. If you expose it in interfaces (APIs), many useful collections not deriving from it will be excluded.
One disadvantage of IList<T> is that arrays implement it but do not allow you to add or remove items (i.e. you cannot change the array length). An exception will be thrown if you call IList<T>.Add(item) on an array. The situation is somewhat defused as IList<T> has a Boolean property IsReadOnly that you can check before attempting to do so. But in my eyes, this is still a design flaw in the library. Therefore, I use List<T> directly, when the possibility to add or remove items is required.
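The array behaviour described here is easy to check. The following sketch (ArrayAsListDemo, TryAdd and AddThrows are names invented for this example) guards Add with IsReadOnly and also shows the exception that Add raises on an array:

```csharp
using System;
using System.Collections.Generic;

static class ArrayAsListDemo
{
    // Returns true if the item was added, false if the list reports itself as read-only.
    public static bool TryAdd(IList<int> list, int item)
    {
        if (list.IsReadOnly)
            return false;   // arrays report IsReadOnly == true through IList<T>
        list.Add(item);
        return true;
    }

    // Returns true if calling Add on the list throws NotSupportedException.
    public static bool AddThrows(IList<int> list)
    {
        try
        {
            list.Add(0);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

TryAdd(new int[3], 1) returns false, TryAdd(new List<int>(), 1) returns true, and AddThrows(new int[3]) returns true.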
Which one should I choose? Let's consider just List<T> and IEnumerable<T> as examples for specialized / generalized types:
Method input parameter
IEnumerable<T>: Greatest flexibility for the caller. Restrictive for the implementer, read-only.
List<T>: Restrictive for the caller. Gives flexibility to the implementer, who can manipulate the collection.
Method output parameter or return value
IEnumerable<T>: Restrictive for the caller, read-only. Greatest flexibility for the implementer. Allows returning just about any collection or implementing an iterator (yield return).
List<T>: Greatest flexibility for the caller, who can manipulate the returned collection. Restrictive for the implementer.
Well, at this point you may be disappointed because I don't give you a simple answer. A statement like "always use this for input and that for output" would not be constructive. The reality is that it depends on use case. A method like void AddMissingEntries(TColl collection) will have to provide a collection type having an Add method or may even require a HashSet<T> for efficiency. A method void PrintItems(TColl collection) can happily live with an IEnumerable<T>.
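As a sketch of those two cases (the signatures are my own guesses at what such methods could look like, and FormatItems returns a string instead of printing so the result is easy to check), the read-only method asks only for IEnumerable<T>, while the method that has to add items asks for ICollection<T>:

```csharp
using System.Collections.Generic;
using System.Text;

static class ParameterDemo
{
    // Only reads its input, so IEnumerable<T> is enough and any sequence can be passed in.
    public static string FormatItems<T>(IEnumerable<T> items)
    {
        var sb = new StringBuilder();
        foreach (T item in items)
            sb.Append(item).Append(';');
        return sb.ToString();
    }

    // Has to add items, so it needs at least ICollection<T> for the target.
    public static void AddMissingEntries<T>(ICollection<T> target, IEnumerable<T> candidates)
    {
        foreach (T candidate in candidates)
        {
            if (!target.Contains(candidate))
                target.Add(candidate);
        }
    }
}
```

Passing a HashSet<T> as the target keeps Contains cheap, which is the efficiency concern mentioned above; a List<T> works too, only more slowly for large inputs.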
IList<T> is the base interface for all generic lists. Since it is an ordered collection, the implementation can decide on the ordering, ranging from sorted order to insertion order. Moreover, IList<T> has an Item property (the indexer) that allows methods to read and edit entries in the list based on their index.
This makes it possible to insert a value into, or remove a value from, the list at a given index.
Also, since IList<T> : ICollection<T>, all the methods from ICollection<T> are available here as well.
ICollection<T> is the base interface for all generic collections. It defines the size, enumeration, and methods to add, remove, and look up items. You can add an item to or remove an item from a collection, but you cannot choose the position at which it happens because there is no indexer.
Collection<T> provides an implementation for IList<T>, IList and IReadOnlyList<T>.
If you use a narrower interface type such as ICollection<T> instead of IList<T>, you protect your code against breaking changes. If you use a wider interface type such as IList<T>, you are in more danger of breaking changes.
Quoting from a source,
ICollection, ICollection<T>: You want to modify the collection or you care about its size.
IList, IList<T>: You want to modify the collection and you care about the ordering and / or positioning of the elements in the collection.
Returning an interface type is more general, so (lacking further information on your specific use case) I'd lean towards that. If you want to expose indexing support, choose IList<T>, otherwise ICollection<T> will suffice. Finally, if you want to indicate that the returned types are read only, choose IEnumerable<T>.
And, in case you haven't read it before, Brad Abrams and Krzysztof Cwalina wrote a great book titled "Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries".
There are some subjects that come from this question:
interfaces versus classes
which specific class, among several similar ones (collection, list, array)?
Common classes versus subitem ("generics") collections
You may want to highlight that it's an object-oriented API.
interfaces versus classes
If you don't have much experience with interfaces, I recommend sticking to classes.
I often see developers jumping to interfaces even when it's not necessary, and ending up with a poor interface design instead of a good class design, which, by the way, can eventually be migrated to a good interface design.
You'll see a lot of interfaces in APIs, but don't rush to them if you don't need them.
You will eventually learn how to apply interfaces to your code.
which specific class, among several similar ones (collection, list, array)?
There are several classes in C# (.NET) that can be interchanged. As already mentioned, if you need something from a more specific class, such as "CanBeSortedClass", then make it explicit in your API.
Does your API user really need to know that your class can be sorted, or that some format can be applied to the elements? Then use "CanBeSortedClass" or "ElementsCanBePaintedClass".
Otherwise, use a more general class such as "GenericBrandClass".
Common collection classes versus subitem ("generics") collections
You'll find that there are classes that contain other elements, and you can specify that all elements should be of a specific type.
Generic collections are classes where you can use the same collection in several places in your code, without having to create a new collection class for each new element type, like this: Collection<T>.
Is your API user going to need a very specific type, the same for all elements?
Use something like List<WashingtonApple>.
Is your API user going to need several related types?
Expose List<Fruit> in your API, and use List<Orange>, List<Banana> and List<Strawberry> internally, where Orange, Banana and Strawberry are descendants of Fruit.
Is your API user going to need a collection of arbitrary items?
Use List<object>, where all items are objects.
Cheers.
I don't understand why I'd create an IEnumerable. Or why it's important.
I'm looking at the example for IEnumerable:
http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx
But I can basically do the same thing if I just went:
List<Person> people = new List<Person>();
so what's IEnumerable good for? Can you give me a situation where I'd need to create a class that implements IEnumerable?
IEnumerable is an interface; it exposes certain things to the outside. You are completely right that you could just use a List<T>, but List<T> is very deep in the inheritance tree. What exactly does a List<T> do? It stores items, and it offers certain methods to Add and Remove. Now, what if you only need the "item-keeping" feature of a List<T>? That's what an IEnumerable<T> is: an abstract way of saying "I want to get a list of items I can iterate over". A list is "I want to get a collection which I can modify, access by index and iterate". List<T> offers a lot more functionality than IEnumerable<T> does, but it also has to hold all of its items in memory. So if a method takes an IEnumerable<T>, it doesn't care what exactly it gets, as long as the object offers the capabilities of IEnumerable<T>.
Also, you don't have to create your own IEnumerable<T>, a List<T> IS an IEnumerable<T>!
Lists are, of course, IEnumerable. As a general rule, you want to be specific on what you output but broad on what you accept as input, e.g.:
You have a method which loops through a list of objects and writes something to the console...
You could declare the parameter as either IEnumerable<T> or IList<T> (or even List<T>). Since you don't need to add to the input list, all you actually need to do is enumerate, so use IEnumerable<T>. Then your method will also accept other types which implement IEnumerable<T>, including IQueryable<T>, linked lists, etc.
You're making your methods more generic for no cost.
Today, you generally wouldn't use the non-generic IEnumerable anymore unless you were supporting software on an older version of the framework. Today, you'd normally use IEnumerable<T>. Amongst other benefits, the LINQ operations are defined as extension methods on IEnumerable<T>, so you can easily query any collection type that implements IEnumerable<T> using LINQ.
Additionally, it doesn't tie the consumer of your code to a particular collection implementation.
It's rare nowadays that you need to create your own container classes; as you rightly say, many good implementations already exist.
However, if you do create your own container class for some specific reason, you may like to implement IEnumerable or IEnumerable<T> because they are a standard "contract" for iteration, and by providing an implementation you can take advantage of methods/APIs that want an IEnumerable or IEnumerable<T>. LINQ, for example, will give you a bunch of useful extension methods for free.
An IList can be thought of as a more specialized kind of IEnumerable (one that can be added to and removed from easily). There are others, such as IDictionary, which performs an entirely different function but can still be enumerated over. Generally, I would use IEnumerable as a more generic type reference when I only need an enumeration to satisfy a requirement and don't particularly care what kind it is. I can pass it an IList and more often than not I do just that, but the flexibility exists to pass it other enumerations as well.
Here is one situation where I think I have to implement IEnumerable rather than use List<>:
I want to get all items from a remote server. Let's say one million items are going to be returned. If you use the List<> approach, you need to cache all one million items in memory first. In some cases, you don't really want to do that because you don't want to use up too much memory. Using IEnumerable allows you to display the data on the screen and then dispose of it right away. Therefore, with the IEnumerable approach, the memory footprint of the program is much smaller.
It's my understanding that IEnumerable is provided to you as an interface for creating your own enumerable class types.
I believe a simple example of this would be recreating the List type, if you wanted to have your own set of features (or lack thereof) for it.
What if you want to enumerate over a collection that is potentially of infinite size, such as the Fibonacci numbers? You couldn't do that easily with a list, but if you had a class that implemented IEnumerable or IEnumerable<T>, it becomes easy.
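A minimal sketch of that idea, using an iterator method rather than a hand-written class (FibonacciDemo and FirstN are names chosen for this example, and BigInteger is used so the values never overflow):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

static class FibonacciDemo
{
    // Unbounded sequence 0, 1, 1, 2, 3, 5, ...; the caller decides when to stop.
    public static IEnumerable<BigInteger> Fibonacci()
    {
        BigInteger a = 0, b = 1;
        while (true)
        {
            yield return a;
            BigInteger next = a + b;
            a = b;
            b = next;
        }
    }

    // Take(n) ends the enumeration after n items, so the infinite loop is never a problem.
    public static List<BigInteger> FirstN(int n)
    {
        return Fibonacci().Take(n).ToList();
    }
}
```

FibonacciDemo.FirstN(8) gives 0, 1, 1, 2, 3, 5, 8, 13. Calling ToList() directly on Fibonacci() without Take() would never finish, which is the flip side of an unbounded sequence.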
When a built-in container fits your needs you should definitely use that, and then IEnumerable comes for free. When for whatever reason you have to implement your own container, for example if it must be backed by a DB, then you should make sure to implement both IEnumerable and IEnumerable<T> for two reasons:
It makes foreach work, which is awesome
It enables almost all LINQ goodness. For example you will be able to filter your container down to objects that match a condition with an elegant one liner.
IEnumerable gives your API users (including yourself) a way to consume your collection with foreach. For example, I implemented IEnumerable in my binary tree class so I could just foreach over all of the items in the tree without having to copy and paste all the logic required to traverse the tree in order.
Hope it helps :)
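For readers who want to see what such a class might look like, here is a small sketch. It is not the poster's code: BinaryTree<T> is a made-up, unbalanced binary search tree whose GetEnumerator performs the in-order traversal once, so callers can simply use foreach or LINQ.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// A minimal, unbalanced binary search tree (hypothetical example class).
class BinaryTree<T> : IEnumerable<T> where T : IComparable<T>
{
    private sealed class Node
    {
        public T Value;
        public Node Left, Right;
        public Node(T value) { Value = value; }
    }

    private Node root;

    public void Add(T value)
    {
        if (root == null) { root = new Node(value); return; }
        Node current = root;
        while (true)
        {
            if (value.CompareTo(current.Value) < 0)
            {
                if (current.Left == null) { current.Left = new Node(value); return; }
                current = current.Left;
            }
            else
            {
                if (current.Right == null) { current.Right = new Node(value); return; }
                current = current.Right;
            }
        }
    }

    // In-order traversal: left subtree, then the node, then the right subtree.
    public IEnumerator<T> GetEnumerator()
    {
        var stack = new Stack<Node>();
        Node current = root;
        while (current != null || stack.Count > 0)
        {
            while (current != null) { stack.Push(current); current = current.Left; }
            current = stack.Pop();
            yield return current.Value;
            current = current.Right;
        }
    }

    // The non-generic interface simply forwards to the generic enumerator.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

After adding 5, 2, 8 and 1, a foreach over the tree visits 1, 2, 5, 8. The traversal logic lives in one place, and every caller gets it through the standard interface.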
IEnumerable is useful if you have a collection or method which can return a bunch of things, but isn't a Dictionary, List, array, or other such predefined collection. It is especially useful in cases where the set of things to be returned might not be available when one starts outputting it. For example, an object to access records in a database might implement IEnumerable. While it might be possible for such an object to read all appropriate records into an array and return that, that may be impractical if there are a lot of records. Instead, the object could return an enumerator which could read the records in small groups and return them individually.