I am working with an IReadOnlyCollection of objects.
Now I'm a bit surprised, because I can use the LINQ extension method ElementAt(), but I don't have access to IndexOf().
This to me looks a bit illogical: I can get the element at a given position, but I cannot get the position of that very same element.
Is there a specific reason for it?
I've already read "How to get the index of an element in an IEnumerable?" and I'm not totally happy with the answers.
IReadOnlyCollection is a collection, not a list, so strictly speaking, it should not even have ElementAt(). This method is a LINQ extension method, defined in System.Linq.Enumerable for any IEnumerable<T> as a convenience, and IReadOnlyCollection<T> gets it because it extends IEnumerable<T>. If you look at the source code, it checks whether the IEnumerable<T> is in fact an IList<T>, and if so it returns the element at the requested index; otherwise it does a linear traversal of the IEnumerable<T> up to the requested index, which is inefficient.
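The behaviour described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the actual framework source; the names ElementAtDemo and MyElementAt are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ElementAtDemo
{
    // Hypothetical re-implementation, for illustration only.
    public static T MyElementAt<T>(this IEnumerable<T> source, int index)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));

        // Fast path: a list can jump straight to the element.
        if (source is IList<T> list)
            return list[index];

        // Slow path: walk the sequence until the requested position.
        foreach (T item in source)
        {
            if (index == 0) return item;
            index--;
        }

        // The sequence ended before the requested position was reached.
        throw new ArgumentOutOfRangeException(nameof(index));
    }
}
```

Calling MyElementAt(1) on a Queue<string> takes the slow path, while the same call on a List<string> uses the indexer.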
So, you might ask why IEnumerable has an ElementAt() but not IndexOf(), but I do not find this question very interesting, because it should not have either of these methods. An IEnumerable is not supposed to be indexable.
Now, a very interesting question is why IReadOnlyList has no IndexOf() either.
IReadOnlyList<T> has no IndexOf() for no good reason whatsoever.
If you really want to find a reason to mention, then the reason is historical:
Back when generics were added to .NET (version 2.0, released in 2005), people had not quite started to realize the benefits of immutability and readonlyness, so the IList<T> interface that was baked into the framework was, unfortunately, mutable.
The right thing would have been to come up with IReadOnlyList<T> as the base interface, and make IList<T> extend it, adding mutation methods only, but that's not what happened.
IReadOnlyList<T> was invented a considerable time after IList<T>, and by that time it was too late to redefine IList<T> and make it extend IReadOnlyList<T>. So, IReadOnlyList<T> was built from scratch.
They could not make IReadOnlyList<T> extend IList<T>, because then it would have inherited the mutation methods, so they based it on IReadOnlyCollection<T> and IEnumerable<T> instead. They added the this[i] indexer, but then they either forgot to add other methods like IndexOf(), or they intentionally omitted them since they can be implemented as extension methods, thus keeping the interface simpler. But they did not provide any such extension methods.
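So, here is an extension method that adds IndexOf() to IReadOnlyList<T>:
using Collections = System.Collections.Generic;
// Extension methods must live in a non-generic static class.
public static class ReadOnlyListExtensions
{
    public static int IndexOf<T>( this Collections.IReadOnlyList<T> self, T elementToFind )
    {
        int i = 0;
        foreach( T element in self )
        {
            // The static object.Equals handles nulls on either side.
            if( Equals( element, elementToFind ) )
                return i;
            i++;
        }
        return -1;
    }
}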
Be aware of the fact that this extension method is not as powerful as a method built into the interface would be. For example, if you are implementing a collection which expects an IEqualityComparer<T> as a construction (or otherwise separate) parameter, this extension method will be blissfully unaware of it, and this will of course lead to bugs. (Thanks to Grx70 for pointing this out in the comments.)
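One way to soften the comparer problem is to let the caller pass the comparer explicitly. The overload below is only a sketch; the class name is hypothetical, and the caller is assumed to know which IEqualityComparer<T> the collection uses.

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyListComparerExtensions
{
    // Hypothetical overload: the caller supplies the equality comparer.
    public static int IndexOf<T>(
        this IReadOnlyList<T> self, T elementToFind, IEqualityComparer<T> comparer)
    {
        if (self == null) throw new ArgumentNullException(nameof(self));

        // Fall back to the default comparer when none is given.
        comparer = comparer ?? EqualityComparer<T>.Default;

        for (int i = 0; i < self.Count; i++)
        {
            if (comparer.Equals(self[i], elementToFind))
                return i;
        }
        return -1;
    }
}
```

For example, with IReadOnlyList<string> names = new[] { "Alice", "Bob" }, the call names.IndexOf("bob", StringComparer.OrdinalIgnoreCase) finds "Bob" at index 1, which a default comparison would miss.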
It is because IReadOnlyCollection (which extends IEnumerable) does not necessarily support indexing, which is what you need when you want to address the items of a List by number. IndexOf comes from IList.
Think of a collection without an index, like Dictionary for example: there is no concept of a numeric index in Dictionary. In a Dictionary, the order is not guaranteed, only the one-to-one relation between key and value. Thus, a collection does not necessarily imply numeric indexing.
Another reason is that IEnumerable is not really two-way traffic. Think of it this way: IEnumerable can step through the items x times, as you specify, and find the element at position x (that is, ElementAt), but it cannot efficiently tell you at which index a given element is located (that is, IndexOf).
But yes, it is still pretty weird even if you think of it this way, as you would expect it to have either both ElementAt and IndexOf or neither.
IndexOf is a method defined on List, whereas IReadOnlyCollection inherits just IEnumerable.
This is because IEnumerable is just for iterating entities. However an index doesn't apply to this concept, because the order is arbitrary and is not guaranteed to be identical between calls to IEnumerable. Furthermore the interface simply states that you can iterate a collection, whereas List states you can perform adding and removing also.
The ElementAt method sure does exactly this. However I won't use it as it reiterates the whole enumeration to find one single element. Better use First or just a list-based approach.
Anyway the API design seems odd to me as it allows an (inefficient) approach on getting an element at n-th position but does not allow to get the index of an arbitrary element which would be the same inefficient search leading to up to n iterations. I'd agree with Ian on either both (which I wouldn't recommend) or neither.
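To make the symmetry concrete, here is a sketch of what such an IndexOf over a plain IEnumerable<T> could look like. It is the same linear walk that ElementAt does in the worst case. The names EnumerableIndexOfDemo and IndexOfItem are hypothetical and not part of LINQ.

```csharp
using System.Collections.Generic;

public static class EnumerableIndexOfDemo
{
    // Hypothetical helper: linear search, up to n iterations.
    public static int IndexOfItem<T>(this IEnumerable<T> source, T value)
    {
        var comparer = EqualityComparer<T>.Default;
        int index = 0;
        foreach (T item in source)
        {
            if (comparer.Equals(item, value))
                return index;
            index++;
        }
        return -1; // not found
    }
}
```

The returned index is only meaningful if the sequence enumerates in the same order every time.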
IReadOnlyCollection<T> has ElementAt<T>() because it extends IEnumerable<T>, for which that extension method is defined. Unless the source is a list, ElementAt<T>() iterates over the IEnumerable<T> the specified number of times and returns the value at that position.
IReadOnlyCollection<T> lacks IndexOf<T>() because, as an IEnumerable<T>, it does not have any specified order and thus the concept of an index does not apply. Nor does IReadOnlyCollection<T> add any concept of order.
I would recommend IReadOnlyList<T> when you want an indexable version of IReadOnlyCollection<T>. This allows you to correctly represent an unchangeable collection of objects with an index.
This extension method is almost the same as Mike's. The only difference is that it takes a predicate, so you can use it like this: var index = list.IndexOf(obj => obj.Id == id);
using System;
using System.Collections.Generic;
public static class ReadOnlyListPredicateExtensions
{
    // Returns the index of the first element matching the predicate, or -1.
    public static int IndexOf<T>(this IReadOnlyList<T> self, Func<T, bool> predicate)
    {
        for (int i = 0; i < self.Count; i++)
        {
            if (predicate(self[i]))
                return i;
        }
        return -1;
    }
}
Given an instance IEnumerable o how can I get the item Count? (without enumerating through all the items)
For example, if the instance is an ICollection, ICollection<T> or IReadOnlyCollection<T>, each of these interfaces has its own Count property.
Is getting the Count property by reflection the only way?
Instead, can I check and cast o to ICollection<T> for example, so I can then call Count ?
It depends how badly you want to avoid enumerating the items if the count is not available otherwise.
If you can enumerate the items, you can use the LINQ method Enumerable.Count. It will look for a quick way to get the item count by casting into one of the interfaces. If it can't, it will enumerate.
If you want to avoid enumeration at any cost, you will have to perform a type cast. In a real life scenario you often will not have to consider all the interfaces you have named, since you usually use one of them (IReadOnlyCollection is rare and ICollection only used in legacy code). If you have to consider all of the interfaces, try them all in a separate method, which can be an extension:
using System.Collections;
using System.Collections.Generic;
static class CountExtensions {
    // Returns the count if it is available without enumerating, otherwise null.
    public static int? TryCount<T>(this IEnumerable<T> items) {
        switch (items) {
            case ICollection<T> genCollection:
                return genCollection.Count;
            case ICollection legacyCollection:
                return legacyCollection.Count;
            case IReadOnlyCollection<T> roCollection:
                return roCollection.Count;
            default:
                return null;
        }
    }
}
Access the extension method with:
int? count = myEnumerable.TryCount();
IEnumerable doesn't promise a count. What if it was a random sequence or a real-time data feed from a sensor? It is entirely possible for the sequence to be infinite. The only way to count the items is to start at zero and increment for each element that the enumerator provides, which is exactly what LINQ does, so don't reinvent the wheel. LINQ is smart enough to use the Count property of collections that support it.
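A small experiment makes the difference visible. The sketch below uses a hypothetical iterator that counts how many items it has produced, so you can observe whether Count() had to enumerate; the names CountDemo, Numbers and ItemsProduced are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Incremented every time the iterator yields an item.
    public static int ItemsProduced;

    // A lazy sequence: it has no Count property to fall back on.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }
}
```

Calling CountDemo.Numbers(5).Count() has to pull all five items through the iterator. If you first materialize the sequence with ToList() and then call Count() on the list, the iterator is not touched again.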
The only way to really cover all possible collection types is to use the generic interface and call the Count() method. This also covers other types such as streams or plain iterators. Furthermore, it will use the Count property where available (see "Count property vs Count() method?") to avoid unnecessary overhead.
If, however, you have a non-generic collection, you'd have to use reflection to find the correct property. This is cumbersome and may fail if your collection doesn't even have the property (e.g. an endless stream or just an iterator). On the other hand, IEnumerable<T>.Count() will handle those types with the optimization mentioned above. Only if necessary will it iterate the entire collection.
Following the rule that a public API should never return a list, I'm blindly converting all code that returned lists to return ICollection<T> instead:
public IList<T> CommaSeparate(String value) {...}
becomes
public ICollection<T> CommaSeparate(String value) {...}
And although an ICollection has a Count, there is no way to get items by index.
And although an ICollection exposes an enumerator (allowing foreach), I see no guarantee that the order of enumeration starts at the "top" of the list, as opposed to the "bottom".
I could mitigate this by avoiding the use of ICollection, and instead use Collection:
public Collection<T> CommaSeparate(String value) {...}
This allows the use of an Items[index] syntax.
Unfortunately, my internal implementation constructs an array, which can be cast to IList or ICollection, but not to Collection.
Is there a way to access items of a collection in order?
This begs the wider question: Does an ICollection even have an order?
Conceptually, imagine I want to parse a command line string. It is critical that the order of items be maintained.
Conceptually, I require a contract that indicates an "ordered" set of string tuples. In the case of an API contract, to indicate order, which of the following is correct:
IEnumerable<String> Grob(string s)
ICollection<String> Grob(string s)
IList<String> Grob(string s)
Collection<String> Grob(string s)
List<String> Grob(string s)
The ICollection<T> interface doesn't specify anything about an order. The objects will be in the order specified by the object returned. For example, if you return the Values collection of a SortedDictionary, the objects will be in the order defined by the dictionary's comparer.
If you need the method to return, by contract, a type whose contract requires a certain order, then you should express that in the method's signature by returning the more specific type.
Regardless of the runtime type of the object returned, consider the behavior when the static reference is IList<T> or ICollection<T>: When you call GetEnumerator() (perhaps implicitly in a foreach loop), you're going to call the same method and get the same object regardless of the static type of the reference. It will therefore behave the same way regardless of the CommaSeparate() method's return type.
Additional thought:
As someone else pointed out, the FXCop rule warns against using List<T>, not IList<T>; the question you linked to is asking why FXCop doesn't recommend using IList<T> in place of List<T>, which is another matter. If I imagine that you are parsing a command-line string where order is important, I would stick with IList<T> if I were you.
ICollection does not have a guaranteed order, but the class that actually implements it may (or may not).
If you want to return an ordered collection, then return an IList<T> and don't get too hung up on FxCop's generally sound, but very generic, advice.
No, ICollection does not imply an order.
The ICollection instance has the "order" of whatever class that implements it. That is, referencing a List<T> as an ICollection will not alter its order at all.
Likewise, if you access an unordered collection as an ICollection, it will not impose an order on the unordered collection either.
So, to your question:
Does an ICollection even have an order?
The answer is: it depends solely on the class that implements it.
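As a quick illustration, the same method can receive two different ICollection<string> implementations and enumerate them in different orders. The OrderDemo class and its Join helper are hypothetical names used only for this sketch.

```csharp
using System.Collections.Generic;

public static class OrderDemo
{
    // Enumerates whatever ICollection<string> it is given and joins the items.
    public static string Join(ICollection<string> items)
    {
        return string.Join(",", items);
    }
}
```

Passing new List<string> { "b", "a", "c" } yields the items in insertion order, while passing new SortedSet<string> { "b", "a", "c" } yields them sorted. The interface is the same in both calls; only the implementing class decides the order.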
ICollection<T> may have an order, but the actual ordering depends on the class implementing it.
It does not have an accessor for an item at a given index. IList<T> specializes this interface to provide access by index.
An ICollection<T> is just an interface; whether it's ordered or not is entirely dependent upon the implementation underlying it (which is supposed to be opaque).
If you want to be able to access it by index, you'd want to return things as an IList<T>, which is both IEnumerable<T> and ICollection<T>. One should bear in mind, though, that depending on the underlying implementation, getting at an arbitrary item in the collection could require visiting about N/2 items on average, that is, O(N) time.
My inclination would be to avoid the 'collection' interfaces altogether and instead use a custom type representing the collection in terms of the problem domain and exposing the appropriate logical operations suitable for that type.
ICollection is just an interface—there is no implementation or explicit specification about ordering. That means if you return something that enumerates in an ordered manner, whatever is consuming your ICollection will do so in an ordered manner.
Order is only implied by the underlying, implementing, object. There is no specification in ICollection that says it should be ordered or not. Enumerating over a result multiple times will invoke the underlying object's enumerator, which is the only place that those rules would be set. An object doesn't change the way it is enumerated just because it inherits this interface. Had the interface specified that it is an ordered result, then you could safely rely on the ordered result of the implementing object.
It depends on the implementation of the instance. An ICollection that happens to be a List has an order; an ICollection that happens to be, say, a HashSet does not guarantee one.
All ICollections implement IEnumerable, which returns the items one at a time, ordered or otherwise.
EDIT: In reply to your additional example about command line parsing in the question, I would argue that the appropriate return type depends on what you are doing with those arguments afterward, but IEnumerable is probably the right one.
My reasoning is that IList, ICollection, and their concrete implementations permit modification of the list returned from Grob, which you probably don't want. Since .NET doesn't have an Indexed Sequence interface, IEnumerable is the best bet to prevent your callers from doing something weird like trying to modify the parameter list that they get back.
If you expect that all present and future versions of your method will have no difficulty returning an object that will be able to quickly and easily retrieve the Nth item, use type IList<T> to return a reference to something that implements both IList<T> and non-generic ICollection. If you expect that some present or future versions might not be able to quickly and easily return the Nth item, but would be able to instantly report the number of items, use type ICollection<T> to return a reference to something that implements ICollection<T> and non-generic ICollection. If you expect that present or future versions may have trouble even knowing how many items there are, return IEnumerable<T>. The question of sequencing is irrelevant; the ability to access the Nth thing implies that a defined sequence exists, but ICollection<T> says neither more nor less about sequencing than IEnumerable<T>.
I'm writing a cache-eject method that essentially looks like this:
while ( myHashSet.Count > MAX_ALLOWED_CACHE_MEMBERS )
{
EjectOldestItem( myHashSet );
}
My question is about how Count is determined: is it just a private or protected int, or is it calculated by counting the elements each time it's called?
From http://msdn.microsoft.com/en-us/library/ms132433.aspx:
Retrieving the value of this property is an O(1) operation.
This guarantees that accessing the Count won't iterate over the whole collection.
Edit: as many other posters suggested, IEnumerable<...>.Count() is however not guaranteed to be O(1). Use with care!
IEnumerable<...>.Count() is an extension method defined in System.Linq.Enumerable. The current implementation makes an explicit test whether the counted IEnumerable<T> is in fact an instance of ICollection<T>, and makes use of ICollection<T>.Count if possible. Otherwise it traverses the IEnumerable<T> (possibly triggering lazy evaluation) and counts the items one by one.
I've not however found in the documentation whether it's guaranteed that IEnumerable<...>.Count() uses O(1) if possible, I only checked the implementation in .NET 3.5 with Reflector.
Necessary late addition: many popular containers are not derived from Collection<T>, but nevertheless their Count property is O(1) (that is, won't iterate over the whole collection). Examples are HashSet<T>.Count (this one is most likely what the OP wanted to ask about), Dictionary<K, V>.Count, LinkedList<T>.Count, List<T>.Count, Queue<T>.Count, Stack<T>.Count and so on.
All these collections implement ICollection<T> or just ICollection, so their Count is an implementation of ICollection<T>.Count (or ICollection.Count). An implementation of ICollection<T>.Count is not required to be an O(1) operation, but the ones mentioned above are, according to the documentation.
(Side note: some containers, for instance Queue<T>, implement non-generic ICollection but not ICollection<T>, so they "inherit" the Count property only from ICollection.)
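You can check which of the two interfaces a given container implements with a couple of type tests. The helper below is just a sketch with hypothetical names (InterfaceProbe, HasGenericCount, HasLegacyCount).

```csharp
using System.Collections;
using System.Collections.Generic;

public static class InterfaceProbe
{
    // True when Count is available through ICollection<T>.
    public static bool HasGenericCount<T>(IEnumerable<T> source)
    {
        return source is ICollection<T>;
    }

    // True when Count is available through the non-generic ICollection.
    public static bool HasLegacyCount<T>(IEnumerable<T> source)
    {
        return source is ICollection;
    }
}
```

For a Queue<int>, HasGenericCount is false and HasLegacyCount is true; for a List<int>, both are true.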
Your question does not specify a specific Collection class so...
It depends on the Collection class. ArrayList has an internal variable that tracks the count, as does List. However, it is implementation specific, and depending on the type of the collection, it could theoretically get recalculated on each call.
It is an internal value, and is not calculated. The documentation states that getting the value is an O(1) operation.
As others have noted, Count is maintained when modifying the collection. This is nearly always the case with every collection type in the framework. This is considerably different from using the Count() extension method on an IEnumerable that is not backed by a collection, which will enumerate the sequence each time.
Also, with the newer collection classes the Count property is not virtual which means that the jitter can inline the call to the Count accessor which makes it practically the same as accessing a field. In other words, very quick.
In case of a HashSet it's just an internal int field and even SortedSet (a binary tree based set for .net 4) has its count in an internal field.
According to Reflector, it is implemented as
public int Count{ get; }
so it is defined by the derived type
Just a quick note. Beware that there are two ways to count a collection in .NET 3.5 when System.Linq is used. For a normal collection, the first choice should be to use the Count property, for the reasons already described in other answers.
An alternative method, via the LINQ .Count() extension method, is also available. The intriguing thing about .Count() is that it can be called on ANY enumerable, regardless of whether the underlying class implements ICollection or not, or whether it has a Count property. If you ever do call .Count(), however, be aware that unless the underlying object implements ICollection<T> or ICollection, it WILL iterate over the sequence to compute the count. That results in O(n) complexity.
The only reason I wanted to note this is, using IntelliSense, it is often easy to accidentally end up using the Count() extension rather than the Count property.
It's an internal int that get incremented each time a new item is added to the collection.
When I'm writing my DAL or other code that returns a set of items, should I always make my return statement:
public IEnumerable<FooBar> GetRecentItems()
or
public IList<FooBar> GetRecentItems()
Currently, in my code I have been trying to use IEnumerable as much as possible but I'm not sure if this is best practice? It seemed right because I was returning the most generic datatype while still being descriptive of what it does, but perhaps this isn't correct to do.
Framework design guidelines recommend using the class Collection when you need to return a collection that is modifiable by the caller or ReadOnlyCollection for read only collections.
The reason this is preferred to a simple IList is that IList does not inform the caller if its read only or not.
If you return an IEnumerable<T> instead, certain operations may be a little trickier for the caller to perform. Also you no longer will give the caller the flexibility to modify the collection, something that you may or may not want.
Keep in mind that LINQ contains a few tricks up its sleeve and will optimize certain calls based on the type they are performed on. So, for example, if you perform a Count and the underlying collection is a List it will NOT walk through all the elements.
Personally, for an ORM I would probably stick with Collection<T> as my return value.
It really depends on why you are using that specific interface.
For example, IList<T> has several methods that aren't present in IEnumerable<T>:
IndexOf(T item)
Insert(int index, T item)
RemoveAt(int index)
and Properties:
T this[int index] { get; set; }
If you need these methods in any way, then by all means return IList<T>.
Also, if the method that consumes your result is expecting an IList<T>, returning one saves the caller from casting or copying the sequence (for example with ToList()).
In general, you should require the most generic and return the most specific thing that you can. So if you have a method that takes a parameter, and you only really need what's available in IEnumerable, then that should be your parameter type. If your method could return either an IList or an IEnumerable, prefer returning IList. This ensures that it is usable by the widest range of consumers.
Be loose in what you require, and explicit in what you provide.
That depends...
Returning the least derived type (IEnumerable) will leave you the most leeway to change the underlying implementation down the track.
Returning a more derived type (IList) provides the users of your API with more operations on the result.
I would always suggest returning the least derived type that has all the operations your users are going to need... so basically, you first have to determine what operations on the result make sense in the context of the API you are defining.
One thing to consider is that if you're using a deferred-execution LINQ statement to generate your IEnumerable<T>, calling .ToList() before you return from your method means that your items may be iterated twice - once to create the List, and once when the caller loops through, filters, or transforms your return value. When practical, I like to avoid converting the results of LINQ-to-Objects to a concrete List or Dictionary until I have to. If my caller needs a List, that's a single easy method call away - I don't need to make that decision for them, and that makes my code slightly more efficient in the cases where the caller is just doing a foreach.
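The effect of deferred execution can be observed with a counter inside the projection. The sketch below is only an illustration; DeferredDemo, Squares and Evaluations are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the projection has actually run.
    public static int Evaluations;

    // Returns a lazy query: nothing is computed until someone enumerates it.
    public static IEnumerable<int> Squares(IEnumerable<int> source)
    {
        return source.Select(x => { Evaluations++; return x * x; });
    }
}
```

Calling DeferredDemo.Squares(new[] { 1, 2, 3 }) evaluates nothing by itself. Taking First() runs the projection once, whereas ToList() runs it for every element up front, whether or not the caller ends up needing them all.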
List<T> offers the calling code many more features, such as modifying the returned object and access by index. So the question boils down to: in your application's specific use case, do you WANT to support such uses (presumably by returning a freshly constructed collection!), for the caller's convenience -- or do you want speed for the simple case when all the caller needs is to loop through the collection and you can safely return a reference to a real underlying collection without fearing this will get it erroneously changed, etc?
Only you can answer this question, and only by understanding well what your callers will want to do with the return value, and how important performance is here (how big are the collections you would be copying, how likely is this to be a bottleneck, etc).
I think you can use either, but each has a use. Basically List is IEnumerable but you have count functionality, add element, remove element
IEnumerable is not efficient for counting elements
If the collection is intended to be readonly, or the modification of the collection is controlled by the Parent then returning an IList just for Count is not a good idea.
In LINQ, there is a Count() extension method on IEnumerable<T> which will shortcut to the Count property if the underlying object implements ICollection<T>, so the performance difference is negligible.
Generally I feel (opinion) it is better practice to return IEnumerable where possible. If you need to do additions, then add these methods to the parent class; otherwise the consumer ends up managing the collection within Model, which violates the principles, e.g. manufacturer.Models.Add(model) violates the Law of Demeter. Of course these are just guidelines and not hard and fast rules, but until you have a full grasp of their applicability, following them blindly is better than not following them at all.
public interface IManufacturer
{
IEnumerable<Model> Models {get;}
void AddModel(Model model);
}
(Note: If using NHibernate you might need to map to a private IList using different accessors.)
It's not so simple when you are talking about return values instead of input parameters. When it's an input parameter, you know exactly what you need to do. So, if you need to be able to iterate over the collection, you take an IEnumerable, whereas if you need to add or remove, you take an IList.
In the case of a return value, it's tougher. What does your caller expect? If you return an IEnumerable, then he will not know a priori that he can make an IList out of it. But, if you return an IList, he will know that he can iterate over it. So, you have to take into account what your caller is going to do with the data. The functionality that your caller needs/expects is what should govern when making the decision on what to return.
TL;DR: summary
If you develop in-house software, use the specific type (like List) for return values and the most generic type for input parameters, even in the case of collections.
If a method is part of a redistributable library’s public API, use interfaces instead of concrete collection types for both return values and input parameters.
If a method returns a read-only collection, show that by using IReadOnlyList or IReadOnlyCollection as the return value type.
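As a sketch of the last point, a class can keep a mutable list internally and expose it as IReadOnlyList<T>. The Inventory class below is a hypothetical example, not taken from any of the answers.

```csharp
using System.Collections.Generic;

public class Inventory
{
    private readonly List<string> _items = new List<string>();

    // Callers can index and count, but the type offers no Add/Remove.
    // AsReadOnly() wraps the list so that a cast back to List<T> fails.
    public IReadOnlyList<string> Items
    {
        get { return _items.AsReadOnly(); }
    }

    // Mutation goes through the owning class.
    public void Add(string item)
    {
        _items.Add(item);
    }
}
```

Returning _items directly would also compile, since List<T> implements IReadOnlyList<T>, but a caller could then cast the result back to List<string> and modify it.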
More
As all have said, it depends.
If you don't want Add/Remove functionality at the calling layer, then I will vote for IEnumerable, as it provides only iteration and basic functionality, which I like from a design perspective.
As for returning IList, my vote is always against it, but it's mainly a matter of what you like and what you don't.
In performance terms I think they are more or less the same.
If you do not need counting in your external code, it is always better to return IEnumerable, because later you can change your implementation (without impacting external code), for example to yield-based iterator logic that conserves memory (a very good language feature, by the way).
However if you need items count, don't forget that there is another layer between IEnumerable and IList - ICollection.
I might be a bit off here, seeing that no one else suggested it so far, but why don't you return an (I)Collection<T>?
From what I remember, Collection<T> was the preferred return type over List<T> because it abstracts away the implementation. They all implement IEnumerable, but that sounds to me a bit too low-level for the job.
I think you can use either, but each has a use. Basically List is IEnumerable but you have count functionality, Add element, remove element
IEnumerable is not efficient for counting elements, or getting a specific element in the collection.
List is a collection which is ideally suited to finding specific elements, easy to add elements, or remove them.
Generally I try to use List where possible as this gives me more flexibility.
Use
List<FooBar> GetRecentItems()
rather than
IList<FooBar> GetRecentItems()
I think the general rule is to use the more specific class to return, to avoid doing unneeded work and give your caller more options.
That said, I think it's more important to consider the code in front of you which you are writing than the code the next guy will write (within reason.) This is because you can make assumptions about the code that already exists.
Remember that moving UP to a collection from IEnumerable in an interface will work, moving down to IEnumerable from a collection will break existing code.
If these opinions all seem conflicted, it's because the decision is subjective.
IEnumerable<T> contains a small subset of what is inside List<T>, which contains the same stuff as IEnumerable<T> but more! You only use IEnumerable<T> if you want a smaller set of features. Use List<T> if you plan to use a larger, richer set of features.
The Pizza Explanation
Here is a much more comprehensive explanation of why you would use an Interface like IEnumerable<T> versus List<T>, or vice versa, when instantiating objects in C-family languages like Microsoft C#.
Think of Interfaces like IEnumerable<T> and IList<T> as the individual ingredients in a pizza (pepperoni, mushrooms, black olives...) and concrete classes like List<T> as the pizza. List<T> is in fact a Supreme Pizza that always contains all the Interface ingredients combined (ICollection, IEnumerable, IList, etc).
What you get as far as a pizza and its toppings is determined by how you "type" your list when you create its object reference in memory. You have to declare the type of pizza you are cooking as follows:
// Pepperoni Pizza: the variable is typed as IEnumerable<T>, so only
// that one Interface's members are visible (a pizza with one topping),
// even though the object behind it is still a full List<T>.
IEnumerable<string> pepperoniPizza = new List<string>();
// Supreme Pizza: the variable is typed as List<T> itself, so the
// members of ALL 8 Interfaces are reachable through it,
// a pizza with ALL the ingredients!!
List<string> supremePizza = new List<string>();
Note you cannot instantiate an Interface as itself (or eat raw pepperoni). When you declare a List<T> variable as one Interface type like IEnumerable<T>, you only have access to that Interface's members and get the pepperoni pizza with one topping. You cannot see the other Interface members of List<T> through that variable without a cast.
When the variable is declared as List<T>, you have access to the members of all 8 Interfaces it implements (the Supreme Pizza toppings)! A variable declared as IList<T> sits in between: it exposes the members of IList<T>, ICollection<T> and IEnumerable<T>.
Here is the List<T> class, showing you WHY that is. Notice that List<T> in the .NET Library implements all the other Interfaces, including IList!! But IEnumerable<T> declares just a small subset of those List members.
public class List<T> :
ICollection<T>,
IEnumerable<T>,
IEnumerable,
IList<T>,
IReadOnlyCollection<T>,
IReadOnlyList<T>,
ICollection,
IList
{
// List<T> types implement all these goodies and more!
public List();
public List(IEnumerable<T> collection);
public List(int capacity);
public T this[int index] { get; set; }
public int Count { get; }
public int Capacity { get; set; }
public void Add(T item);
public void AddRange(IEnumerable<T> collection);
public ReadOnlyCollection<T> AsReadOnly();
public bool Exists(Predicate<T> match);
public T Find(Predicate<T> match);
public void ForEach(Action<T> action);
public void RemoveAt(int index);
public void Sort(Comparison<T> comparison);
// ......and much more....
}
So why NOT instantiate List<T> as List<T> ALL THE TIME?
Declaring a List<T> variable as List<T> gives you access to all Interface members! But you might not need everything. Choosing one Interface type does not make the object itself any smaller, but it exposes fewer members to the code that uses it, which keeps your application's dependencies tight. Who needs Supreme Pizza every time?
But there is a second reason for using Interface types: Flexibility. Because other types in .NET, including your own custom ones, might use the same "popular" Interface type, it means you can later substitute your List<T> type with any other type that implements, say IEnumerable<T>. If your variable is an Interface type, you can now switch out the object created with something other than List<T>. Dependency Injection is a good example of this type of flexibility using Interfaces rather than concrete types, and why you might want to create objects using Interfaces.
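The flexibility argument can be sketched with a method that only asks for IEnumerable<T>. The WordStats class and TotalLength method are hypothetical names for this illustration.

```csharp
using System.Collections.Generic;

public static class WordStats
{
    // Only needs iteration, so it accepts the smallest useful interface.
    public static int TotalLength(IEnumerable<string> words)
    {
        int total = 0;
        foreach (string word in words)
        {
            total += word.Length;
        }
        return total;
    }
}
```

The same method accepts a List<string>, a string[] or a Queue<string> without any change, because all of them implement IEnumerable<string>. Had the parameter been typed as List<string>, only the first call would compile.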
EDIT:
From the answers given, it's been made rather clear to me how the design I'm asking about below should actually be implemented. With those suggestions in mind (and in response to a comment politely pointing out that my example code does not even compile), I've edited the following code to reflect what the general consensus seems to be. The question that remains may no longer make sense in light of the code, but I'm leaving it as it is for posterity.
Suppose I have three overloads of a function, one taking IEnumerable<T>, one taking ICollection<T>, and one taking IList<T>, something like the following:
public static T GetMiddle<T>(IEnumerable<T> values) {
    IList<T> list = values as IList<T>;
    if (list != null) return GetMiddle(list);
    int count = GetCount<T>(values);
    T middle = default(T);
    int index = 0;
    foreach (T value in values) {
        if (index++ >= count / 2) {
            middle = value;
            break;
        }
    }
    return middle;
}
private static T GetMiddle<T>(IList<T> values) {
    int middleIndex = values.Count / 2;
    return values[middleIndex];
}
private static int GetCount<T>(IEnumerable<T> values) {
    // if values is actually an ICollection<T> (e.g., List<T>),
    // we can get the count quite cheaply
    ICollection<T> genericCollection = values as ICollection<T>;
    if (genericCollection != null) return genericCollection.Count;
    // same for ICollection (e.g., Queue<T>, Stack<T>)
    ICollection collection = values as ICollection;
    if (collection != null) return collection.Count;
    // otherwise, we've got to count values ourselves
    int count = 0;
    foreach (T value in values) count++;
    return count;
}
The idea here is that, if I've got an IList<T>, that makes my job easiest; on the other hand, I can still do the job with an ICollection<T> or even an IEnumerable<T>; the implementation for those interfaces just isn't as efficient.
I wasn't sure if this would even work (if the runtime would be able to choose an overload based on the parameter passed), but I've tested it and it seems to.
My question is: is there a problem with this approach that I haven't thought of? Alternately, is this in fact a good approach, but there's a better way of accomplishing it (maybe by attempting to cast the values argument up to an IList<T> first and running the more efficient overload if the cast works)? I'm just interested to know others' thoughts.
If you have a look at how LINQ extension methods are implemented using Reflector, you can see that a few extension methods on IEnumerable<T>, such as Count(), attempt to cast the sequence to an ICollection<T> or an IList<T> to optimize the operation (for example, using the ICollection<T>.Count property instead of iterating through an IEnumerable<T> and counting the elements). So your best bet is most likely to accept an IEnumerable<T> and then apply this kind of optimization if ICollection<T> or IList<T> is available.
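One way to observe that shortcut without a decompiler is sketched below. CountOnly is a hypothetical collection written for this illustration: it reports a size but throws if anyone enumerates it, so Count() can only succeed by reading the ICollection<T>.Count property.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: it knows its size, but enumerating it throws,
// so counting by iteration would fail loudly.
public sealed class CountOnly : ICollection<int>
{
    private readonly int _count;
    public CountOnly(int count) { _count = count; }

    public int Count { get { return _count; } }
    public bool IsReadOnly { get { return true; } }

    public IEnumerator<int> GetEnumerator() { throw new InvalidOperationException("enumerated"); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // Members this demo never uses.
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Contains(int item) { throw new NotSupportedException(); }
    public void CopyTo(int[] array, int arrayIndex) { throw new NotSupportedException(); }
    public bool Remove(int item) { throw new NotSupportedException(); }
}

public static class CountDemo
{
    public static void Main()
    {
        // The static type is only IEnumerable<int>; Count() still finds the property.
        IEnumerable<int> seq = new CountOnly(42);
        Console.WriteLine(seq.Count());   // 42
    }
}
```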
I think one version accepting IEnumerable<T> would be the way to go, and check inside the method if the parameter is one of the more derived collection types. With three versions as you propose, you lose the efficiency benefit if someone passes you a (runtime) IList<T> that the compiler statically considers an IEnumerable<T>:
IList<string> stringList = new List<string> { "A", "B", "C" };
IEnumerable<string> seq = stringList;
Extensions.GetMiddle(stringList); // calls IList version
Extensions.GetMiddle(seq); // calls IEnumerable version
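A self-contained version of the same experiment, with hypothetical Which overloads standing in for GetMiddle, makes the compile-time choice visible. The overload is picked from the static type of the argument, not from the runtime type of the object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; Which stands in for the overloaded GetMiddle.
public static class OverloadDemo
{
    public static string Which<T>(IEnumerable<T> values) { return "IEnumerable<T>"; }
    public static string Which<T>(IList<T> values) { return "IList<T>"; }

    public static void Main()
    {
        IList<string> stringList = new List<string> { "A", "B", "C" };
        IEnumerable<string> seq = stringList;   // same object, weaker static type

        Console.WriteLine(Which(stringList));   // IList<T>
        Console.WriteLine(Which(seq));          // IEnumerable<T>
    }
}
```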
I'd say it's uncommon, and potentially confusing, so would be unlikely to be a good choice for a public API.
You could accept an IEnumerable<T> parameter, and internally check if it is in fact an ICollection<T> or IList<T>, and optimize accordingly.
This might be analogous to some of the optimizations in some of the IEnumerable<T> extension methods in the .NET 3.5 Framework.
I am really indifferent. If I saw it your way I would not think anything of it. But Joe's idea has merit. It might look like the following.
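public static T GetMiddle<T>(IEnumerable<T> values)
{
    // Dispatch on the runtime type, most specific interface first.
    if (values is IList<T>) return GetMiddle((IList<T>)values);
    if (values is ICollection<T>) return GetMiddle((ICollection<T>)values);
    // Default implementation (one possible choice): buffer the sequence once, then index into it.
    return GetMiddle((IList<T>)new List<T>(values));
}
private static T GetMiddle<T>(ICollection<T> values)
{
    // Count is cheap, but there is no indexer, so walk to the middle element.
    int remaining = values.Count / 2;
    foreach (T value in values)
        if (remaining-- == 0) return value;
    throw new InvalidOperationException("Sequence contains no elements.");
}
private static T GetMiddle<T>(IList<T> values)
{
    // Count and the indexer are both cheap.
    return values[values.Count / 2];
}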
While it is legal to overload a method to accept either a base type or a derived type, with all other parameters being otherwise identical, it is only advantageous to do so if the compiler will often be able to identify the latter form as being a better match. Because it would be very common for objects which implement ICollection<T> to be passed around by code which only needs an IEnumerable<T>, it would be very common for implementations of ICollection<T> to be passed into the IEnumerable<T> overload. Consequently, the IEnumerable<T> overload should probably check whether a passed-in object implements ICollection<T> and handle it specially if so.
If the most natural way of implementing the logic for an ICollection<T> would be to write a special method for it, there would be nothing particularly wrong with having a public overload which accepts an ICollection<T>, and having the IEnumerable<T> overload call the ICollection<T> one if given an object that implements ICollection<T>. Having such an overload be public wouldn't add much value, but it likely wouldn't hurt anything either. On the other hand, in situations where an object implements both IEnumerable<T> and ICollection, but not ICollection<T> (for example, a List<Cat> implements IEnumerable<Animal> and ICollection, but not ICollection<Animal>), one might want to use both interfaces, but that could not be done without either typecasting in the method that uses them, or passing the method which uses them both an ICollection reference and an IEnumerable<T> reference. The latter would be very ugly in a public method, and the former approach would lose the benefits of overloading.
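The List<Cat> case can be checked directly. The sketch below uses hypothetical Animal and Cat classes and a made-up VarianceDemo helper, and assumes a runtime with generic variance (.NET 4 or later). CountOf shows the "typecasting in the method" option mentioned above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical types, for illustration only.
public class Animal { }
public class Cat : Animal { }

public static class VarianceDemo
{
    // Reports which of the three interfaces the object implements at runtime.
    public static string Describe(object source)
    {
        return string.Format(
            "IEnumerable<Animal>: {0}, ICollection<Animal>: {1}, ICollection: {2}",
            source is IEnumerable<Animal>,
            source is ICollection<Animal>,
            source is ICollection);
    }

    // Accepts the covariant view, then casts to non-generic ICollection for a cheap count.
    public static int CountOf(IEnumerable<Animal> animals)
    {
        ICollection nonGeneric = animals as ICollection;
        if (nonGeneric != null) return nonGeneric.Count;
        int count = 0;
        foreach (Animal animal in animals) count++;
        return count;
    }

    public static void Main()
    {
        List<Cat> cats = new List<Cat> { new Cat(), new Cat() };
        Console.WriteLine(Describe(cats));
        // IEnumerable<Animal>: True, ICollection<Animal>: False, ICollection: True
        Console.WriteLine(CountOf(cats));   // 2
    }
}
```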
Usually when designing interfaces you want to accept a 'lowest common denominator' type for the arguments. For return types it is a matter of some debate. I generally think creating the above overloads is overkill. Its biggest problem is the introduction of unneeded code paths that now must be tested. Better to have one method that performs the operation one way and works 100% of the time. With the given overloads above you might have an inconsistency in behavior and not even realize it, or worse yet, you may accidentally introduce a change in one copy and not in the others.
If you can do it with IEnumerable<T>, then use that; if not, use the least specific interface that does the job.
No. It's certainly uncommon.
Anyway.
Since IList<T> inherits from ICollection<T> and IEnumerable<T>, and ICollection<T> inherits from IEnumerable<T>, your only concern would be performance in IEnumerable<T> types.
I just see no reason to overload the function in that way, providing different signatures to achieve exactly the same result and accepting exactly the same types as parameter (no matter if you have an IEnumerable<T> or IList<T>, you would be able to pass it to any of the three overloads); that would just cause confusion.
You overload a function only to provide a way to pass a type of parameter that you could not pass if that signature did not exist.
Don't optimize unless it's really necessary.
If you want to optimize, do it undercover.
You wouldn't expect someone using your class to be aware of that "optimization" in order to decide which method signature to use, right?