I ended up at this post while searching for a solution to my problem, which led me to propose a new answer there and to be confronted with the following question:
Considering that ICollection implements IEnumerable, and that all LINQ extensions apply to both interfaces, is there any scenario where I would benefit from working with an IEnumerable instead of an ICollection?
The non generic IEnumerable, for instance, does not provide a Count extension.
Both ICollection interfaces do.
Given that any ICollection provides all the functionality of IEnumerable, since it inherits from it, why would I opt for IEnumerable in place of ICollection?
Backward compatibility with previous frameworks where ICollection was not available?
I think there are actually two questions to answer here.
When would I want IEnumerable<T>?
Unlike other collection types and the language in general, queries on IEnumerables are executed using lazy evaluation. That means you can potentially perform several related queries in only one enumeration.
It's worth noting that lazy evaluation doesn't interact nicely with side effects, because multiple enumeration could then give different results; you need to keep this in mind when using it. In a language like C#, lazy evaluation can be a very powerful tool but also a source of unexplained behaviour if you aren't careful.
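To make the multiple-enumeration point concrete, here is a minimal sketch. LazyDemo, Numbers and the Calls counter are hypothetical names invented for this illustration; the counter exists only to make the side effect visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical counter, used only to make the side effect observable.
    public static int Calls;

    // Iterator method: the body does not run until the result is enumerated.
    public static IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 3; i++)
        {
            Calls++;            // side effect, repeated on every enumeration
            yield return i;
        }
    }

    public static void Main()
    {
        IEnumerable<int> query = Numbers().Where(n => n > 1); // nothing has run yet
        Console.WriteLine(Calls);          // 0
        Console.WriteLine(query.Count());  // 2
        Console.WriteLine(query.Sum());    // 5, but the source was enumerated a second time
        Console.WriteLine(Calls);          // 6

        List<int> snapshot = query.ToList(); // enumerate once more and keep the results
        Console.WriteLine(snapshot.Count);   // 2, and reading snapshot never re-runs Numbers()
    }
}
```

If the source had been a database query or a random generator, the two enumerations could have returned different data, which is why materializing with ToList() is the usual defence.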
When would I not want ICollection<T>?
ICollection<T> is a mutable interface: it exposes, amongst other things, Add and Remove methods. Unless you want external code to be mutating your object's contents, you don't want to be returning it. Likewise, you generally don't want to be passing it in as an argument, for the same reason.
If you do want explicit mutability of the collection, by all means use ICollection<T>.
Additionally, unlike IEnumerable<T> or IReadOnlyCollection<T>, ICollection<T> is not covariant, which reduces the flexibility of the type in certain use cases.
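A short sketch of what covariance buys you in practice. Fruit, Apple and VarianceDemo are hypothetical types made up for the example.

```csharp
using System;
using System.Collections.Generic;

public class Fruit { }
public class Apple : Fruit { }

public static class VarianceDemo
{
    // Accepts any sequence of Fruit, including a sequence of a derived type.
    public static int CountAll(IEnumerable<Fruit> fruits)
    {
        int n = 0;
        foreach (Fruit f in fruits) n++;
        return n;
    }

    public static void Main()
    {
        List<Apple> apples = new List<Apple> { new Apple(), new Apple() };

        IEnumerable<Fruit> asSequence = apples;          // OK: IEnumerable<out T> is covariant
        IReadOnlyCollection<Fruit> asReadOnly = apples;  // OK: IReadOnlyCollection<out T> is covariant
        // ICollection<Fruit> asCollection = apples;     // compile error: ICollection<T> is invariant

        Console.WriteLine(CountAll(asSequence));  // 2
        Console.WriteLine(asReadOnly.Count);      // 2
    }
}
```

The invariance is not arbitrary: if the commented-out line compiled, a caller could Add an Orange to what is really a List<Apple>.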
Non-generic versions
Things change a bit when it comes to the non-generic versions of these interfaces. In this case, the only real difference between the two is the lazy evaluation offered by IEnumerable and the eager evaluation of ICollection.
I personally would tend to avoid the non-generic versions due to the lack of type safety and poor performance from boxing/unboxing in the case of value types.
Summary
If you want lazy evaluation, use IEnumerable<T>. For eager evaluation and immutability use IReadOnlyCollection<T>. For explicit mutability use ICollection<T>.
IEnumerable provides a read-only interface to a collection and ICollection allows modification. Also IEnumerable needs just to know how to iterate over elements. ICollection has to provide more information.
This is semantically different. You don't always want to provide a functionality for modification of a collection.
There is an IReadOnlyCollection<T>, but it doesn't inherit from ICollection<T>. This is by design in .NET: the read-only interface is a separate, stripped-down interface.
The point made by Tim is quite important. The internal working of Count might be dramatically different. An IEnumerable does not need to know how many elements it spans. An ICollection has a Count property, so it has to know how many elements it contains. That is another crucial difference.
The idea is to use the simplest contract (interface) which fulfills the requirements: ICollection is a collection, IEnumerable is a sequence. A sequence could have deferred execution, it could be infinite, etc. The interface IEnumerable just tells you that you can enumerate the sequence, that is all. This is different from ICollection, which represents an actual collection containing a finite number of items.
As you can see, these are quite different. You cannot ignore the semantics of these contracts, and just focus on which interface inherits which other one.
If your algorithm only involves enumeration of the input data, then it should take an IEnumerable. If you are, by contract, dealing with collections (i.e., you expect collections and nothing else), then you should use ICollection.
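As a rough illustration of "sequence versus collection", the sketch below uses an infinite sequence. All names (SequenceDemo, Naturals, SumFirst, Average) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceDemo
{
    // An infinite sequence: perfectly valid as IEnumerable<int>,
    // but it could never be an ICollection<int> because it has no Count.
    public static IEnumerable<int> Naturals()
    {
        int i = 0;
        while (true) yield return i++;
    }

    // Only enumerates, so IEnumerable<T> is the simplest contract that works.
    public static int SumFirst(IEnumerable<int> source, int count)
    {
        return source.Take(count).Sum();
    }

    // Relies on a finite, known size, so it asks for ICollection<T>.
    public static double Average(ICollection<int> items)
    {
        if (items.Count == 0) return 0;
        return (double)items.Sum() / items.Count;
    }

    public static void Main()
    {
        Console.WriteLine(SumFirst(Naturals(), 5));          // 10
        Console.WriteLine(Average(new List<int> { 2, 4 }));  // 3
        // Average(Naturals()) does not compile: a bare sequence is not a collection.
    }
}
```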
Related
What is the difference between returning IList vs List, or IEnumerable vs List?
I want to know which is better to return.
When we need to use one, what effect will it have on performance?
There is no type that is always better to return. It's a decision you should make based on your design/performance/etc. goals.
IEnumerable<T> is nice to use when you want to represent a sequence of items that you can iterate over, but you don't want to allow modifications (Add, Remove, etc.).
IList<T> gives you everything you could get using IEnumerable<T>, plus operations that give you more control over a collection: Add, Remove, Count, index access, etc.
List<T> is a concrete implementation of IList<T>. I would say that it's almost always better to expose the IList<T> interface from your methods rather than the List<T> implementation. And it's not just about lists: it's a basic design principle to prefer interfaces over concrete implementations.
Ok, now about non-generic versions IEnumerable, IList, List:
They actually come from very early versions of the .NET Framework, and life is much better using the generic equivalents.
And a few words about performance:
IEnumerable<T> (with IEnumerator<T>) is actually an iterator, which allows you to defer some computations until later. It means that there is no need to allocate memory right away for storing large amounts of data (of course, that's not the case when you have, say, an array behind the iterator). You can compute data gradually as needed. But it also means that these computations might be performed over and over again (say, with every foreach loop). On the other hand, with List<T> you have fixed data in memory, with cheap index and Count operations. As you see, it's all about compromise.
Using concrete classes in the parameters and results of methods creates a strong dependency, while using interfaces doesn't. What does this mean?
If in the future you change the implementation of your class and use SynchronizedCollection, LinkedList, or something else instead of List, then you have to change your method signatures, specifically the type of the return value.
After that you not only have to rebuild the assemblies that use this class, but you may have to rewrite them.
However, if you're using one of the IEnumerable, IReadOnlyCollection, ICollection, or IList interfaces, you won't have to rewrite and recompile client assemblies. Thus, interfaces are always preferred over classes in parameters and results. (But remember, we're talking about dependencies between different assemblies. Within the same assembly this rule is not so important.)
The question is, which interface to use? It depends on the requirements of the client classes (use cases). For example, if you're processing elements one by one, use IEnumerable<T>, and if you need a count of elements, use IReadOnlyCollection<T>. Both of these interfaces are covariant, which is convenient for type conversions.
If you need write abilities (Add, Remove, Clear) or non-covariant read-only abilities (Contains), use ICollection<T>. Finally, if you need random indexed access, use IList<T>.
As for performance, invoking an interface method is a bit slower, but the difference is insignificant. You shouldn't care about this.
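A small sketch of the dependency argument, assuming two hypothetical versions of the same API (NamesV1 and NamesV2) that switch the backing type while keeping the declared return type:

```csharp
using System;
using System.Collections.Generic;

public static class Catalog
{
    // Version 1 of a hypothetical API: backed by a List<string>.
    public static IReadOnlyCollection<string> NamesV1()
    {
        return new List<string> { "a", "b" };
    }

    // Version 2: backed by a LinkedList<string>. The signature is unchanged,
    // so client code written against IReadOnlyCollection<string> keeps working.
    public static IReadOnlyCollection<string> NamesV2()
    {
        return new LinkedList<string>(new[] { "a", "b" });
    }

    // Client code: needs enumeration and a count, nothing more.
    public static string Describe(IReadOnlyCollection<string> names)
    {
        return names.Count + ": " + string.Join(",", names);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(NamesV1())); // 2: a,b
        Console.WriteLine(Describe(NamesV2())); // 2: a,b
    }
}
```

Had the return type been List<string>, version 2 would have been a breaking change for every caller.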
I need to design my own custom GenericCollection class. Now I have plenty of options to derive it using IEnumerable, ICollection, and IList, where the latter offer some added functionality.
I am a little confused: if I go with IEnumerable<T>, I might need to declare an object to actually hold the collection, like _list in this case.
public class GenericCollection<T> : IEnumerable<T>
{
private List<T> _list;
//...
}
But if I go with ICollection<T> or IList<T>, I do not need to declare the List object, as it is implicitly available.
public class GenericCollection<T> : IList<T>
{
// no need for List object
//private List<T> _list;
//...
}
What is the difference between these two approaches with respect to performance?
In which scenario is each one preferred, especially when it comes to designing your own collection? I am interested in a lightweight collection with good performance. I think this can be achieved using IEnumerable<T>, but how exactly, and what are the strong reasons to go with it?
I have reviewed some existing posts but none is giving required information.
Returning 'IList' vs 'ICollection' vs 'Collection'
IEnumerable, ICollection, and IList (generally, any type with an I prefix) are just interfaces. They let you expose what your class will do, but unlike if you inherit a class, interfaces do not provide you a default implementation of any of the things they say you must do.
As far as choosing which interface, here's a quick guide:
An IList is an ICollection that can be accessed by index.
An ICollection is an IEnumerable with easy access to things like Add, Remove, and Count.
An IEnumerable is anything that can be enumerated, even if the list of those things doesn't exist until you enumerate it.
Some classes that you might want to extend (or keep as a private field that runs most of the logic) for your collection are List<T>, Collection<T> (which implements IList<T>, but with easier access to overriding the implementation; see "Collection<T> versus List<T> what should you use on your interfaces?" for the big differences between these two), ObservableCollection<T>, or collections that are not lists, like Dictionary<T, U> and HashSet<T>. For more info on any of these, look up the MSDN documentation on the class.
First off, you don't have to actually choose between these interfaces; if necessary, you can implement all three. Second, implementing IEnumerable does not require you to make the underlying list public. You can implement just the methods that use the enumerator of the underlying list.
Performance-wise, I doubt there'll be much of an impact; focus on what you need functionally. The only way to know for sure is to measure.
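For what it's worth, the wrapping approach can be sketched as below. This is only one possible shape for the GenericCollection<T> from the question; the Add method and Count property are assumptions about what the class needs. Note that implementing IList<T> instead would not remove the need for storage: an interface supplies no implementation, so you would still need a backing field (or a base class such as Collection<T>).

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class GenericCollection<T> : IEnumerable<T>
{
    // The backing list stays private; callers can only enumerate it.
    private readonly List<T> _list = new List<T>();

    // Assumed members: expose only the operations you actually want to offer.
    public void Add(T item) { _list.Add(item); }
    public int Count { get { return _list.Count; } }

    // Delegate enumeration to the underlying list's enumerator.
    public IEnumerator<T> GetEnumerator() { return _list.GetEnumerator(); }

    // The non-generic version is required by IEnumerable and just forwards.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class GenericCollectionDemo
{
    public static void Main()
    {
        var items = new GenericCollection<int>();
        items.Add(1);
        items.Add(2);

        int sum = 0;
        foreach (int item in items) sum += item;

        Console.WriteLine(items.Count); // 2
        Console.WriteLine(sum);         // 3
    }
}
```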
The performance is unlikely to depend on which interfaces are implemented. Rather, it depends on how many instructions the processor has to run to achieve a certain goal. If you implement IEnumerable and wrap a List, you are likely to end up writing Add/Remove/this[] methods that just forward the calls to the List, which adds some overhead. Hence, although I didn't take any measurements, the inheritance approach would likely be very slightly faster.
However, such details usually matter only for real-time applications with an extreme need to save every possible CPU cycle. Eric Lippert has a great article about paying attention to such details: http://blogs.msdn.com/b/ericlippert/archive/2003/10/17/53237.aspx. Generally, you are likely to be better off using the approach that better fits business logic and architecture of your application, rather than performance details.
I am confused about which collection type I should return from my public API methods and properties.
The collections that I have in mind are IList, ICollection and Collection.
Is returning one of these types always preferred over the others, or does it depend on the specific situation?
ICollection<T> is an interface that exposes collection semantics such as Add(), Remove(), and Count.
Collection<T> is a concrete implementation of the ICollection<T> interface.
IList<T> is essentially an ICollection<T> with random order-based access.
In this case you should decide whether or not your results require list semantics such as order based indexing (then use IList<T>) or whether you just need to return an unordered "bag" of results (then use ICollection<T>).
Generally you should return a type that is as general as possible, i.e. one that knows just enough of the returned data that the consumer needs to use. That way you have greater freedom to change the implementation of the API, without breaking the code that is using it.
Consider also the IEnumerable<T> interface as return type. If the result is only going to be iterated, the consumer doesn't need more than that.
The main difference between the IList<T> and ICollection<T> interfaces is that IList<T> allows you to access elements via an index. IList<T> describes array-like types. Elements in an ICollection<T> can only be accessed through enumeration. Both allow the insertion and deletion of elements.
If you only need to enumerate a collection, then IEnumerable<T> is to be preferred. It has several advantages over the others:
It disallows changes to the collection (but not to the elements, if they are of reference type).
It allows the largest possible variety of sources, including enumerations that are generated algorithmically and are not collections at all.
It allows lazy evaluation and can be queried with LINQ.
Collection<T> is a base class that is mainly useful to implementers of collections. If you expose it in interfaces (APIs), many useful collections not deriving from it will be excluded.
One disadvantage of IList<T> is that arrays implement it but do not allow you to add or remove items (i.e. you cannot change the array length). An exception will be thrown if you call IList<T>.Add(item) on an array. The situation is somewhat defused as IList<T> has a Boolean property IsReadOnly that you can check before attempting to do so. But in my eyes, this is still a design flaw in the library. Therefore, I use List<T> directly, when the possibility to add or remove items is required.
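The array behaviour described above can be checked with a short sketch. ArrayAsListDemo and TryAdd are hypothetical names; the IsReadOnly check and the NotSupportedException are the behaviour I would expect from the framework when an array is viewed through IList<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayAsListDemo
{
    // Checks IsReadOnly before adding, so it works for arrays and lists alike.
    public static bool TryAdd(IList<int> list, int item)
    {
        if (list.IsReadOnly) return false;
        list.Add(item);
        return true;
    }

    public static void Main()
    {
        IList<int> array = new int[3];
        IList<int> list = new List<int>();

        Console.WriteLine(TryAdd(array, 1)); // False
        Console.WriteLine(TryAdd(list, 1));  // True

        try
        {
            array.Add(1); // arrays cannot grow
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("Add is not supported on an array");
        }
    }
}
```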
Which one should I choose? Let's consider just List<T> and IEnumerable<T> as examples for specialized / generalized types:
Method input parameter
IEnumerable<T>: Greatest flexibility for the caller. Restrictive for the implementer, read-only.
List<T>: Restrictive for the caller. Gives flexibility to the implementer, can manipulate the collection.
Method output parameter or return value
IEnumerable<T>: Restrictive for the caller, read-only. Greatest flexibility for the implementer. Allows returning just about any collection or implementing an iterator (yield return).
List<T>: Greatest flexibility for the caller, can manipulate the returned collection. Restrictive for the implementer.
Well, at this point you may be disappointed because I don't give you a simple answer. A statement like "always use this for input and that for output" would not be constructive. The reality is that it depends on use case. A method like void AddMissingEntries(TColl collection) will have to provide a collection type having an Add method or may even require a HashSet<T> for efficiency. A method void PrintItems(TColl collection) can happily live with an IEnumerable<T>.
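To illustrate, the two methods might look like this. The second parameter of AddMissingEntries and the FormatItems helper are assumptions I added so the sketch is complete; they are not part of the signatures quoted above.

```csharp
using System;
using System.Collections.Generic;

public static class Requirements
{
    // Needs Contains and Add, so the narrowest usable contract is ICollection<T>.
    // The 'candidates' parameter is an assumption made for this example.
    public static void AddMissingEntries<T>(ICollection<T> collection, IEnumerable<T> candidates)
    {
        foreach (T candidate in candidates)
        {
            if (!collection.Contains(candidate)) collection.Add(candidate);
        }
    }

    // Hypothetical helper: only reads, so IEnumerable<T> is enough.
    public static string FormatItems<T>(IEnumerable<T> items)
    {
        return string.Join(", ", items);
    }

    // Only enumerates, so it can happily live with IEnumerable<T>.
    public static void PrintItems<T>(IEnumerable<T> items)
    {
        Console.WriteLine(FormatItems(items));
    }

    public static void Main()
    {
        // A HashSet<T> makes the Contains check cheap, as suggested above.
        var set = new HashSet<int> { 1, 2 };
        AddMissingEntries(set, new[] { 2, 3 });
        Console.WriteLine(set.Count);   // 3

        PrintItems(new[] { "a", "b" }); // a, b
    }
}
```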
IList<T> is the base interface for all generic lists. Since it is an ordered collection, the implementation can decide on the ordering, ranging from sorted order to insertion order. Moreover, IList<T> has an Item property (the indexer) that allows methods to read and edit entries in the list based on their index.
This makes it possible to insert a value into, or remove a value from, the list at a given index.
Also since IList<T> : ICollection<T>, all the methods from ICollection<T> are also available here for implementation.
ICollection<T> is the base interface for all generic collections. It defines size, enumeration, and add/remove methods (the synchronization members belong to the non-generic ICollection). You can add an item to or remove an item from a collection, but you cannot choose at which position it happens, because there is no indexer.
Collection<T> provides an implementation for IList<T>, IList and IReadOnlyList<T>.
If you use a narrower interface type such as ICollection<T> instead of IList<T>, you protect your code against breaking changes. If you use a wider interface type such as IList<T>, you are more at risk of breaking changes.
Quoting from a source,
ICollection, ICollection<T>: You want to modify the collection or you care about its size.
IList, IList<T>: You want to modify the collection and you care about the ordering and / or positioning of the elements in the collection.
Returning an interface type is more general, so (lacking further information on your specific use case) I'd lean towards that. If you want to expose indexing support, choose IList<T>, otherwise ICollection<T> will suffice. Finally, if you want to indicate that the returned types are read only, choose IEnumerable<T>.
And, in case you haven't read it before, Brad Abrams and Krzysztof Cwalina wrote a great book titled "Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries" (you can download a digest from here).
There are some subjects that come from this question:
interfaces versus classes
which specific class, from several alike classes, collection, list, array ?
Common classes versus subitem ("generics") collections
You may want to highlight that it's an object-oriented API.
interfaces versus classes
If you don't have much experience with interfaces, I recommend sticking to classes.
I often see developers jumping to interfaces even when it's not necessary,
and ending up with a poor interface design instead of a good class design,
which, by the way, can eventually be migrated to a good interface design ...
You'll see a lot of interfaces in APIs, but don't rush into them
if you don't need them.
You will eventually learn how to apply interfaces to your code.
which specific class, from several alike classes, collection, list, array ?
There are several classes in C# (.NET) that can be interchanged. As already mentioned, if you need something from a more specific class, such as "CanBeSortedClass", then make it explicit in your API.
Does your API user really need to know that your class can be sorted, or that some format can be applied to the elements? Then use "CanBeSortedClass" or "ElementsCanBePaintedClass";
otherwise use "GenericBrandClass", that is, a more general class.
Common collection classes versus subitem ("generics") collections
You'll find that there are classes that contain other elements,
and you can specify that all elements should be of a specific type.
Generic collections are classes that let you use the same collection
for several code applications, without having to create a new collection
for each new subitem type, like this: Collection.
Is your API user going to need a very specific type, the same for all elements?
Use something like List<WashingtonApple>.
Is your API user going to need several related types?
Expose List<Fruit> in your API, and use List<Orange>, List<Banana>, List<Strawberry> internally, where Orange, Banana and Strawberry are descendants of Fruit.
Is your API user going to need a generic type collection?
Use List, where all items are objects.
Cheers.
I have always been taught that programming against an interface is better, so I would set the parameters on my methods to IList<T> rather than List<T>.
But this means I have to cast to List<T> just to use some methods, one comes to mind is Find for example.
Why is this? Should I continue to program against interfaces, but continue to cast or revert?
I am a little bit confused why Find (for example) isn't available on the IList<T> which List<T> inherits from.
Personally I would use IList<T> rather than List<T>, but then use LINQ (Select, Where etc) instead of the List-specific methods.
Casting to List<T> removes much of the point of using IList<T> in the first place - and actually makes it more dangerous, as the implementation may be something other than List<T> at execution time.
In the case of lists you could continue programming against interfaces and use LINQ to filter your objects. You could even work with IEnumerable<T> which is even higher in the object hierarchy.
But more generally if the consumer of your API needs to call a specific method you probably haven't chosen the proper interface to expose.
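As a sketch of the LINQ route, FirstOrDefault covers the typical use of List<T>.Find without requiring a cast. FindDemo and FirstLongName are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FindDemo
{
    // Works for any IList<string> (List<T>, array, ReadOnlyCollection<T>, ...).
    // Returns null when nothing matches, like List<T>.Find does for reference types.
    public static string FirstLongName(IList<string> names, int minLength)
    {
        return names.FirstOrDefault(n => n.Length >= minLength);
    }

    public static void Main()
    {
        IList<string> fromList = new List<string> { "al", "bob", "carol" };
        IList<string> fromArray = new[] { "al", "bob", "carol" };

        Console.WriteLine(FirstLongName(fromList, 3));          // bob
        Console.WriteLine(FirstLongName(fromArray, 4));         // carol
        Console.WriteLine(FirstLongName(fromArray, 9) == null); // True
    }
}
```

Since FirstOrDefault only needs enumeration, the parameter could even be relaxed to IEnumerable<string>.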
"I am a little bit confused why Find (for example) isn't available on the IList which List inherits from."
While I'm not privy to the decision process of the designers, there are a few things they were probably thinking.
1) Not putting these methods on IList keeps the intent of the contract clearer. According to MSDN, IList "Represents a collection of objects that can be individually accessed by index." Adding Find would change the contract to a searchable, indexable collection.
2) Every method you put on an interface makes it harder to implement the interface. If all of those methods were on IList, it would be much more tedious to implement IList. Especially since:
3) Most implementations of these methods would be the same. Find and several of the others on List would really be better placed on a helper class. Take for example, ReadOnlyCollection, Collection, ObservableCollection, and ReadOnlyObservableCollection. If I had to implement Find on all of those (pre-LINQ), I would make a helper class that takes IEnumerable and a predicate and just loop over the collections and have the implementations call the helper method.
4) LINQ (not so much a reason why it didn't happen, more a reason why it isn't needed in the future). With LINQ and extension methods, all IEnumerables now "have" Find as an extension method (only they called it Where).
I think it's because IList can be different collection types (i.e. an IEnumerable of some sort, an array or so).
You can use the Where extension method from System.Linq. Avoid casting back to List from IList.
If you find that the IList<T> parameter being passed between various classes is consistently being recast into List<T>, this indicates that there is a fundamental problem with your design.
From what you're describing, it's clear that you want to use polymorphism, but recasting on a consistent basis to List<T> would mean that IList<T> does not have the level of polymorphism you need.
On the other side of the coin, you simply might be targeting the wrong polymorphic method (e.g., Find rather than FirstOrDefault).
In either case, you should review your design and see what exactly you want to accomplish, and make the choice of List<T> or IList<T> based on the actual requirements, rather than conformity to style.
If you expose your method with an IList<> parameter, someone can pass, for example, a ReadOnlyCollection<>, which is an IList<> but is not a List<>. So if your method casts the parameter to List<>, your API will crash at runtime.
If you expose a public method with an IList<> parameter, you cannot assume that it is a specific implementation of IList<>. You must use it as an IList<> and nothing more.
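A minimal sketch of that failure mode, with hypothetical method names (LastViaCast shows the anti-pattern, Last uses only the IList<T> contract):

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CastDemo
{
    // Anti-pattern: claims to accept any IList<int> but really needs a List<int>.
    public static int LastViaCast(IList<int> items)
    {
        List<int> list = (List<int>)items; // InvalidCastException for non-List inputs
        return list[list.Count - 1];
    }

    // Uses only what IList<T> promises: Count and the indexer.
    public static int Last(IList<int> items)
    {
        return items[items.Count - 1];
    }

    public static void Main()
    {
        IList<int> readOnly = new ReadOnlyCollection<int>(new List<int> { 1, 2, 3 });

        Console.WriteLine(Last(readOnly)); // 3

        try
        {
            LastViaCast(readOnly);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("cast failed"); // this branch runs
        }
    }
}
```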
If the list is part of an API or service that is exposed, then it is probably better to expose it as an IList, to allow the implementation to change internally.
There is already much discussion on this topic.
No, in this case it makes no sense to program to interfaces, because your List is NOT just an IList: it has extra methods on it.
I just realized that maybe I was mistaken all this time in exposing T[] to my views, instead of IEnumerable<T>.
Usually, for this kind of code:
foreach (var item in items) {}
should items be T[] or IEnumerable<T>?
Then, if I need to get the count of the items, would the array's Length be faster than IEnumerable<T>.Count()?
IEnumerable<T> is generally a better choice here, for the reasons listed elsewhere. However, I want to bring up one point about Count(). Quintin is incorrect when he says that the type itself implements Count(). It's actually implemented in Enumerable.Count() as an extension method, which means other types don't get to override it to provide more efficient implementations.
By default, Count() has to iterate over the whole sequence to count the items. However, it does know about ICollection<T> and ICollection, and is optimised for those cases. (In .NET 3.5 IIRC it's only optimised for ICollection<T>.) Now the array does implement that, so Enumerable.Count() defers to ICollection<T>.Count and avoids iterating over the whole sequence. It's still going to be slightly slower than calling Length directly, because Count() has to discover that it implements ICollection<T> to start with - but at least it's still O(1).
The same kind of thing is true for performance in general: the JITted code may well be somewhat tighter when iterating over an array rather than a general sequence. You'd basically be giving the JIT more information to play with, and even the C# compiler itself treats arrays differently for iteration (using the indexer directly).
However, these performance differences are going to be inconsequential for most applications - I'd definitely go with the more general interface until I had good reason not to.
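A quick sketch of the difference between counting a collection and counting a lazy sequence. CountDemo, Lazy and the Yielded counter are hypothetical; the counter only makes the enumeration visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Hypothetical counter: how many items the iterator has produced so far.
    public static int Yielded;

    // A lazy sequence with no Count of its own.
    public static IEnumerable<int> Lazy(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3 };

        Console.WriteLine(array.Length);                    // 3, read directly
        Console.WriteLine(((ICollection<int>)array).Count); // 3, the property Count() can use
        Console.WriteLine(array.Count());                   // 3, via the extension method

        Console.WriteLine(Lazy(3).Count()); // 3, but only after walking the sequence
        Console.WriteLine(Yielded);         // 3
    }
}
```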
It's partially inconsequential, but standard theory would dictate "Program against an interface, not an implementation". With the interface model you can change the actual datatype being passed without affecting the caller, as long as it conforms to the same interface.
The contrast to that is that you might have a reason for exposing an array specifically and in which case would want to express that.
For your example I think IEnumerable<T> would be desirable. It's also worth noting that, for testing purposes, using an interface can reduce the headache you would otherwise incur from particular classes you have to re-create all the time. Collections generally aren't as bad, but having an interface contract you can mock easily is very nice.
Added for edit:
This is more inconsequential because the underlying datatype is what will implement the Count() method; for an array it should access the known length, so I would not worry about any perceived overhead of the method.
See Jon Skeet's answer for an explanation of the Count() implementation.
T[] (one-dimensional, zero-based) also implements ICollection<T> and IList<T>, along with IEnumerable<T>.
Therefore, if you want looser coupling in your application, IEnumerable<T> is preferable, unless you want indexed access inside the foreach.
Since Array class implements the System.Collections.Generic.IList<T>, System.Collections.Generic.ICollection<T>, and System.Collections.Generic.IEnumerable<T> generic interfaces, I would use IEnumerable, unless you need to use these interfaces.
http://msdn.microsoft.com/en-us/library/system.array.aspx
Your gut feeling is correct, if all the view cares about, or should care about, is having an enumerable, that's all it should demand in its interfaces.
What is it logically (conceptually) from the outside?
If it's an array, then return the array. If the only point is to enumerate, then return IEnumerable. Otherwise IList or ICollection may be the way to go.
If you want to offer lots of functionality but not allow it to be modified, then perhaps use a List<T> internally and return the ReadOnlyCollection<T> returned from its AsReadOnly() method.
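A minimal sketch of that pattern; the Roster class and its members are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Roster
{
    // Mutable list kept private to the class.
    private readonly List<string> _names = new List<string>();

    public void Add(string name) { _names.Add(name); }

    // AsReadOnly wraps the list in a ReadOnlyCollection<string>.
    // The wrapper is a live view of the list, not a copy.
    public ReadOnlyCollection<string> Names
    {
        get { return _names.AsReadOnly(); }
    }
}

public static class RosterDemo
{
    public static void Main()
    {
        var roster = new Roster();
        roster.Add("Ann");

        ReadOnlyCollection<string> view = roster.Names;
        roster.Add("Bob");

        Console.WriteLine(view.Count); // 2: the wrapper reflects later changes
        Console.WriteLine(view[0]);    // Ann

        try
        {
            ((IList<string>)view).Add("Eve"); // mutation through the interface is rejected
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("read-only");
        }
    }
}
```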
Given that changing the code from an array to IEnumerable at a later date is easy, but changing it the other way is not, I would go with an IEnumerable until you know you need the small speed benefit of returning an array.