FOR-EACH over an IEnumerable vs a List - c#

Is there any benefit or difference when my foreach loop iterates over a method argument, depending on whether I pass that argument as an IEnumerable or as a List?

If your IEnumerable is implemented by a List then no; there is no difference. There is a big conceptual difference though: IEnumerable says "I can be enumerated", which also means that the number of items is not known and the enumeration cannot be reversed or randomly accessed. List says "I am a fully formed list, already populated; I can be reversed and randomly accessed".
So you should generally build your function interface to accept the least functionality compatible with your operation; if you are only going to enumerate forwards, iteratively, then accept IEnumerable. This allows your function to be used in more scenarios.
If you made your function accept only List<T>, then any caller with an array or IEnumerable must convert their input into a List<T> before calling your function, which may well perform worse than simply passing their array or IEnumerable through directly. In this sense, accepting an IEnumerable invites better-performing code.
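As a minimal sketch of the point above, the hypothetical Total method below only enumerates forwards, so it accepts IEnumerable<int>. The method and variable names are illustrative assumptions, not taken from the question.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: it only enumerates forwards, so IEnumerable<int> is enough.
static int Total(IEnumerable<int> values)
{
    var sum = 0;
    foreach (var value in values)
    {
        sum += value;
    }
    return sum;
}

int[] array = { 1, 2, 3 };
var list = new List<int> { 1, 2, 3 };
IEnumerable<int> lazy = Enumerable.Range(1, 3); // produced on demand

// All three callers pass their data straight through, with no ToList() copy.
Console.WriteLine(Total(array)); // 6
Console.WriteLine(Total(list));  // 6
Console.WriteLine(Total(lazy));  // 6
```

Had Total been declared with a List<int> parameter, the array and the lazy sequence would each need a ToList() call first.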

In the general case, there can be a difference if the collection has an explicit interface implementation of IEnumerable.
List has an explicit implementation, but it does not change the behavior. There is no difference in your case.
See: https://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs looking at GetEnumerator and similar

No, there isn't. In both cases the foreach is translated to something like this:
var enumerator = input.GetEnumerator();
while (enumerator.MoveNext())
{
    // loop body.
    // The current value is accessed through: enumerator.Current
}
Additionally, if the enumerator is disposable, it will be disposed after the loop.
Jon Skeet gives a detailed description here.

If you pass the same object, it doesn't matter whether your method accepts IEnumerable or List.
However, if all you're going to do inside the method is enumerate the object, it's best to expect an IEnumerable in the method argument; you don't want to limit the caller of the method by expecting a List.

No, there is no benefit or difference as to how the foreach loop would go through the collection.
As Olivier Jacot-Descombes has pointed out, the foreach loop will simply go through the elements one by one using the enumerator.
However, it can make a difference if your logic goes through the same collection at least twice. In this case if IEnumerable<> is used, you might end up regenerating the elements each time you go over the iterator.
ReSharper even has a special warning for this type of code: PossibleMultipleEnumeration
I am not saying that you should not use IEnumerable<>. Everything has its time and place and it's not always a good idea to use the most generic interface. Be careful with your choice.
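The multiple-enumeration problem can be sketched as follows. The Numbers iterator and the generated counter are hypothetical names used only to make the repeated work visible.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;

var generated = 0;

// Hypothetical lazy source: each enumeration re-runs this iterator body.
IEnumerable<int> Numbers()
{
    for (var i = 1; i <= 3; i++)
    {
        generated++;
        yield return i;
    }
}

IEnumerable<int> lazy = Numbers();
var count = lazy.Count(); // first pass over the iterator
var sum = lazy.Sum();     // second pass regenerates every element
Console.WriteLine($"{count} items, sum {sum}, generated {generated}"); // generated 6

generated = 0;
List<int> materialized = Numbers().ToList(); // single pass, results stored
var count2 = materialized.Count;
var sum2 = materialized.Sum();
Console.WriteLine($"{count2} items, sum {sum2}, generated {generated}"); // generated 3
```

With a cheap in-memory source the second pass hardly matters; with a database query or a file read behind the iterator, it repeats that work.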

Related

IEnumerable to IReadOnlyCollection

I have IEnumerable<Object> and need to pass to a method as a parameter but this method takes IReadOnlyCollection<Object>
Is it possible to convert IEnumerable<Object> to IReadOnlyCollection<Object> ?
One way would be to construct a list, and call AsReadOnly() on it:
IReadOnlyCollection<Object> rdOnly = orig.ToList().AsReadOnly();
This produces ReadOnlyCollection<object>, which implements IReadOnlyCollection<Object>.
Note: Since List<T> implements IReadOnlyCollection<T> as well, the call to AsReadOnly() is optional. Although it is possible to call your method with the result of ToList(), I prefer using AsReadOnly(), so that the readers of my code would see that the method that I am calling has no intention to modify my list. Of course they could find out the same thing by looking at the signature of the method that I am calling, but it is nice to be explicit about it.
Since the other answers seem to steer in the direction of wrapping the collections in a truly read-only type, let me add this.
I have rarely, if ever, seen a situation where the caller is so scared that an IEnumerable<T>-taking method might maliciously try to cast that IEnumerable<T> back to a List or other mutable type, and start mutating it. Cue organ music and evil laughter!
No. If the code you are working with is even remotely reasonable, then if it asks for a type that only has read functionality (IEnumerable<T>, IReadOnlyCollection<T>...), it will only read.
Use ToList() and be done with it.
As a side note, if you are creating the method in question, it is generally best to ask for no more than an IEnumerable<T>, indicating that you "just want a bunch of items to read". Whether or not you need its Count or need to enumerate it multiple times is an implementation detail, and is certainly prone to change. If you need multiple enumeration, simply do this:
items = items as IReadOnlyCollection<T> ?? items.ToList(); // Avoid multiple enumeration
This keeps the responsibility where it belongs (as locally as possible) and the method signature clean.
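Here is one way that line might sit inside a method. This is a sketch: Average is a hypothetical method, and a separate snapshot variable is used instead of reassigning the parameter.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical method: accepts any sequence, but needs two passes internally.
static double Average(IEnumerable<int> items)
{
    // Reuse the caller's collection if it is already materialized; otherwise copy once.
    IReadOnlyCollection<int> snapshot = items as IReadOnlyCollection<int> ?? items.ToList();

    if (snapshot.Count == 0)
    {
        return 0;
    }
    return (double)snapshot.Sum() / snapshot.Count;
}

// An array is already an IReadOnlyCollection<int>, so no copy is made.
Console.WriteLine(Average(new[] { 1, 2, 3 })); // 2

// A lazy query is not, so it is copied into a list exactly once.
Console.WriteLine(Average(Enumerable.Range(1, 4).Where(x => x % 2 == 0))); // 3
```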
When returning a bunch of items, on the other hand, I prefer to return an IReadOnlyCollection<T>. Why? The goal is to give the caller something that fulfills reasonable expectations - no more, no less. Those expectations are usually that the collection is materialized and that the Count is known - precisely what IReadOnlyCollection<T> provides (and a simple IEnumerable<T> does not). By being no more specific than this, our contract matches expectations, and the method is still free to change the underlying collection. (In contrast, if a method returns a List<T>, it makes me wonder what context there is that I should want to index into the list and mutate it... and the answer is usually "none".)
As an alternative to dasblinkenlight's answer, to prevent the caller casting to List<T>, instead of doing orig.ToList().AsReadOnly(), the following might be better:
ReadOnlyCollection<object> rdOnly = Array.AsReadOnly(orig.ToArray());
It's the same number of method calls, but one takes the other as a parameter instead of being called on the return value.
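To illustrate the casting concern, the sketch below compares passing the result of ToList() on its own with passing a read-only wrapper. The variable names are hypothetical, and the cast stands in for what a misbehaving callee could do.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

IEnumerable<object> orig = new object[] { "a", "b" }.Select(x => x);

// ToList() alone: the receiver can cast back to List<object> and mutate it.
IReadOnlyCollection<object> viaList = orig.ToList();
var castBack = viaList as List<object>;
castBack?.Add("c");
Console.WriteLine(viaList.Count); // 3

// Array.AsReadOnly wraps the array, so the same cast does not succeed.
ReadOnlyCollection<object> wrapped = Array.AsReadOnly(orig.ToArray());
IReadOnlyCollection<object> viaWrapper = wrapped;
Console.WriteLine(viaWrapper is List<object>); // False
Console.WriteLine(viaWrapper.Count);           // 2
```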

IEnumerable to List

Why doesn't IEnumerable.ToList() work in code like this:
var _listReleases = new List<string>();
_listReleases.Add("C#");
_listReleases.Add("Javascript");
_listReleases.Add("Python");
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
_listReleases.Clear();
_listReleases.AddRange(sortedItems); // won't work
_listReleases.AddRange(sortedItems.ToList()); // won't work
Note: _listReleases will be null
It doesn't work because of this line:
_listReleases.Clear();
First of all, _listReleases is not null at this point. It's merely empty, which is a completely different thing.
But to explain why this doesn't work as you expect: the IEnumerable interface type does not actually allocate or reserve storage for anything. It represents an object that you can use with a foreach loop, and nothing more. It does not actually need to store the items in the collection itself.
Sometimes, an IEnumerable reference does have those items in the same object, but it doesn't have to. That's what's going on here. The OrderBy() extension method only creates an object that knows how to look at the original list and return the items in a specific order. But this object does not have storage for those items. It still depends on its original data source.
The best solution for this situation is to stop using the _listReleases variable at this point, and instead just use the sortedItems variable. As long as the former is not garbage collected, the latter will do what you need. But if you really want the _listReleases variable, you can do it like this:
_listReleases = sortedItems.ToList();
Now back to IEnumerables. There are some nice benefits to this property of not requiring immediate storage of the items themselves, and merely abstracting the ability to iterate over a collection:
Lazy Evaluation - The work required to produce those items is not done until called for (and often, that means it won't need to be done at all, greatly improving performance).
Composition - An IEnumerable object can be modified during a program to incorporate new sets of rules or operations into the final result. This reduces program complexity and improves maintainability by allowing you to break apart a complex set of sorting or filtering requirements into its component parts. This also makes it much easier to build a program where these rules can be easily determined by the user at run time, instead of in advance by the programmer at compile time.
Memory Efficiency - An IEnumerable makes it possible to iterate collections of data from sources such as a database in ways that only need to keep the current record loaded into memory at any given time. This feature can also be used to create unbounded collections: sets of items that may stretch on to infinity. You can build an IEnumerable with the BigInteger type to calculate the next prime on to infinity, if asked for. Moreover, you could use that collection in a useful way without crashing or hanging the program by combining this with the composition feature, so the program will know when to stop.
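The unbounded prime sequence mentioned above could look something like the following sketch. Primes is a hypothetical iterator using naive trial division, chosen for brevity and not for speed.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;

// Hypothetical unbounded sequence: yields primes forever, one at a time.
static IEnumerable<BigInteger> Primes()
{
    for (BigInteger candidate = 2; ; candidate++)
    {
        var isPrime = true;
        for (BigInteger divisor = 2; divisor * divisor <= candidate; divisor++)
        {
            if (candidate % divisor == 0)
            {
                isPrime = false;
                break;
            }
        }
        if (isPrime)
        {
            yield return candidate;
        }
    }
}

// Composition: Where and Take are stacked on the lazy source, and Take(5)
// is what tells the enumeration when to stop.
var firstFiveAboveTen = Primes().Where(p => p > 10).Take(5).ToList();
Console.WriteLine(string.Join(", ", firstFiveAboveTen)); // 11, 13, 17, 19, 23
```

Calling Primes().ToList() without a Take would never finish, which is the flip side of an unbounded sequence.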
LINQ is lazily evaluated. When you run this line:
IEnumerable sortedItems = _listReleases.OrderBy(x => x);
You aren't actually ordering the items right then and there. Instead you're building an enumerable that will, when enumerated, return the objects that are currently in _listReleases in order. So when you Clear() the list, it no longer has any items to order.
You need to force it to evaluate before you clear _listReleases. An easy way to do this is to add a ToList() call. Also, the non-generic IEnumerable type isn't compatible with AddRange, which won't accept it. You can just use var to implicitly type the variable as List<string>, which will work because List<T> : IEnumerable<T> (it implements the interface).
var sortedItems = _listReleases.OrderBy(x => x).ToList();
_listReleases.Clear();
_listReleases.AddRange(sortedItems);
You should also note that methods like ToList() are extension methods for IEnumerable<T>, not IEnumerable, so ((IEnumerable)something).ToList() won't work. Unlike, say, Java, Something<T> and Something are completely distinct types in C#.
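The deferred behavior described in these answers can be observed directly. This is a small sketch with hypothetical variable names; the Cast<string>() call is one way to get from the non-generic IEnumerable back to the generic interface, where ToList() is available.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

var releases = new List<string> { "C#", "Javascript", "Python" };

// Deferred: OrderBy only remembers the source list; nothing is sorted yet.
IEnumerable<string> deferred = releases.OrderBy(x => x);

// Materialized: ToList() runs the sort now and copies the results.
List<string> snapshot = releases.OrderBy(x => x).ToList();

releases.Clear();

Console.WriteLine(deferred.Count()); // 0, the query reads the now-empty list
Console.WriteLine(snapshot.Count);   // 3, the copy is unaffected

// The non-generic IEnumerable has no ToList(); Cast<T>() bridges to IEnumerable<T>.
IEnumerable untyped = snapshot;
List<string> typedAgain = untyped.Cast<string>().ToList();
Console.WriteLine(typedAgain.Count); // 3
```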

Is the loss of efficiency worth returning IEnumerable instead of List?

Every time I've had to return a collection, I've returned a List. I've just read that I should return IEnumerable or a similar interface (IQueryable for instance).
The problem I see is that often I want to work with a List. To do that, I'd have to do a .ToList() on the returned result.
Example
//...
List<Guid> listOfGuids = MyMethod().ToList();
//...
public IEnumerable<Guid> MyMethod()
{
    using (var context = AccesDataRépart.GetNewContextRépart())
    {
        return context.MyTable.ToList();
    }
}
Is executing .ToList() twice the right practice?
If the caller actually needs a list, return a list (if that's what you have). Returning an IEnumerable when you already have a list, and when you know the caller is going to need a list, is just being wasteful, and for no real benefit.
If you feel that there is a chance that you'll be changing the underlying type of the object you are returning in future versions of the method, returning an interface can make life a bit easier for the library implementer. But it's easier on the caller of the method when a more derived type is returned (they can do more with it than if they are just given an interface).
It is the reverse with input parameters. When accepting parameters, the more derived the type, the more "power" the library implementer has to work with, especially in future revisions; but accepting a less restrictive type makes life easier for the caller of your library, as they don't need to convert what they have into what your method accepts.
This makes these decisions something to think about a fair bit when writing a library's public API. You need to consider how much "power" you need right now, as well as how much you think you might need in the future. Once you know how restrictive/general the types need to be for you to do your job, you can then work to make your methods more convenient to use for callers. There is no one answer that will apply in every case. Saying that you should always return IEnumerable instead of List isn't proper, just as saying that you should always return List is also improper. You need to make a judgement call based on the specific situation you are in.
I would recommend just returning a List<T>, or perhaps an IList<T>. The reason that someone might recommend against returning List, is that it locks you in to that implementation. Depending on the usage of the API, that might not be a concern.
My general rule of thumb is to be more permissive in what you accept and more specific in what you return. So, IEnumerable<T> for method parameters, and IList<T>, List<T> or possibly even T[] for method return values.
You don't have to call ToList on the returned value; it is already a List. The reason you can't return a lazily evaluated IEnumerable is that the using statement around your DataContext disposes the context before the caller enumerates it. So change your method's return type to List<T>, and don't call ToList on the returned value.
//...
List<Guid> listOfGuids = MyMethod(); //No ToList here
//...
public List<Guid> MyMethod()
{
    using (var context = AccesDataRépart.GetNewContextRépart())
    {
        return context.MyTable.ToList();
    }
}
"I've just read that I should return IEnumerable or similar interface (IQueryable for instance)."
Don't worry about that - return IList<> or List<> if you actually need a list object at the point the collection is consumed. The problem with returning IEnumerable can be that no-one knows what the cost of enumerating it is going to be - which is a down-side to the whole Linq concept that doesn't always get fair mention from the people who are encouraging everyone to return IEnumerable everywhere.
It really depends. Do you want to enumerate the collection before or after returning it?
Enumerate before: Every time you call ToList, ToArray, etc., you are enumerating the IEnumerable. If you are doing this many times after it is returned, this can be redundant and wasteful. Either returning it in an already enumerated form (e.g., IList, Array) or enumerating it once after it is returned and using that result for future processing is probably preferable.
Enumerate after: Returning an IEnumerable allows you to defer the enumeration of the collection until later (e.g., save processing up front). If it turns out that you never end up enumerating the collection, or you only enumerate a subset of it, then the IEnumerable approach can be very advantageous.
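The trade-off between the two options can be sketched with a counter standing in for the cost of producing each item. ExpensiveItems and produced are hypothetical names used only for this illustration.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;

var produced = 0;

// Hypothetical source where producing each item is assumed to be costly.
IEnumerable<int> ExpensiveItems()
{
    for (var i = 1; i <= 1000; i++)
    {
        produced++; // stands in for the per-item cost
        yield return i;
    }
}

// Enumerate after: the caller only needs the first three items,
// so only three are ever produced.
var firstThree = ExpensiveItems().Take(3).ToList();
Console.WriteLine(produced); // 3

produced = 0;

// Enumerate before: all 1000 items are produced up front...
List<int> all = ExpensiveItems().ToList();
// ...but repeated passes over the list cost no further production.
var total = all.Sum();
var max = all.Max();
Console.WriteLine(produced); // 1000
```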

c# string[] vs IEnumerable<string>

What should I prefer if I know the number of elements before runtime?
ReSharper suggests IEnumerable<string> instead of string[]. Why?
ReSharper suggests IEnumerable<string> if you are only using methods defined for IEnumerable. It does so with the idea that, since you clearly do not need the value to be typed as array, you might want to hide the exact type from the consumers of (i.e., the code that uses) the value because you might want to change the type in the future.
In most cases, going with the suggestion is the right thing to do. The difference will not be something that you can observe while your program is running; rather, it's in how easily you will find it to make changes to your program in the future.
From the above you can also infer that the whole suggestion/question is meaningless unless the value we are talking about is passed across method boundaries (I don't remember if R# also offers it for a local variable).
If ReSharper suggests you use IEnumerable<string> it means you are only using features of that interface and no array specific features. Go with the suggestion of ReSharper and change it.
If you are trying to provide this method as an interface to other methods, I would prefer to have the output of your method more generic, hence would go for IEnumerable<string>.
Inside a method, if you are instantiating the value and it is not being passed around to other methods, I would go for string[], unless I need deferred execution. That said, it doesn't matter much which one you use in this case.
The actual type should be string[] but depending on the user you may want to expose it as something else. e.g. IEnumerable<string> sequence = new string[5]... In particular if it's something like static readonly, then you should make it a ReadOnlyCollection so the entries can't be modified.
With string[] you can do more: you can access items by index. With IEnumerable you have to loop to reach a specific index.
It's probably suggesting this because it's looking for a better Liskov Substitution at this point in your code. Keep in mind the difference between the declared type and the implementing type. IEnumerable<> isn't an implementation, it's an interface. You can declare the variable as an IEnumerable<string> and build it with a string[] since the string array implements IEnumerable<string>.
What this does for you is allow you to pass around that string array as a more generic, more abstracted type. Anything which expects or returns an IEnumerable<string> (regardless of implementation, be it List<string> or string[] or anything else) can then use your string array, without having to worry about the specific implementation you pass it. As long as it satisfies the interface, it's polymorphic of the correct type.
Keep in mind that this isn't always the way to go. Sometimes you, as the developer, are very concerned with the implementation (perhaps for really fine-grained performance tuning, for example) and don't want to move up to an abstraction. The decision is up to you. ReSharper is merely making a suggestion to use an abstraction rather than an implementation in a variable/method declaration.
ReSharper is likely flagging it for you because you are not returning the least constrained type. If you aren't going to be using access on it by index in the future, I'd go with IEnumerable to have less constraint on the method which returns it.
It depends on your usage later on. If you need to enumerate through these elements, or sort or compare them later on, then I would recommend IEnumerable; otherwise go with an array.
I wrote this response for a similar question regarding array or IEnumerable for return values, which was then closed as duplicate before I could post it. I thought the answer might be interesting to some so I post it here.
The main advantage of IEnumerable over T[] is that IEnumerable (for return values) can be made lazy, i.e. it only computes the next element when needed.
Consider the difference between Directory.GetFiles and Directory.EnumerateFiles. GetFiles returns an array; EnumerateFiles returns an IEnumerable. This means that for a directory with two million files the array will contain two million strings. EnumerateFiles only instantiates the strings as needed, saving memory and improving response time.
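A small sketch of the two calls side by side follows. The scratch directory and file names are hypothetical and are created only for the demonstration.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.IO;
using System.Linq;

// Hypothetical scratch directory with a few files, created just for this demo.
var dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(dir);
for (var i = 0; i < 3; i++)
{
    File.WriteAllText(Path.Combine(dir, $"file{i}.txt"), "demo");
}

// GetFiles builds the complete string[] before returning.
string[] allFiles = Directory.GetFiles(dir);
Console.WriteLine(allFiles.Length); // 3

// EnumerateFiles hands back paths as they are requested, so a caller that
// stops early never asks for the remaining entries.
var anyFile = Directory.EnumerateFiles(dir).First();
Console.WriteLine(Path.GetExtension(anyFile)); // .txt

Directory.Delete(dir, recursive: true);
```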
However, it's not all benefits.
foreach is significantly less efficient on non-arrays (you can see this by disassembling the IL code).
An array promises more, i.e. that its length will not change.
Lazy evaluation is not always better; consider the Directory class. The GetFiles implementation opens a find-file handle, iterates over all files, closes the handle, and then returns the results. EnumerateFiles does nothing until the first file is requested; then the find-file handle is opened and the files are iterated, and the handle is closed when the enumerator is disposed. This means that the lifetime of the find-file handle is controlled by the caller, not the callee. This can be seen as weaker encapsulation and can cause runtime errors from locked file handles.
In my humble opinion, R# is overzealous in suggesting IEnumerable over arrays, especially for return values (input parameters have fewer potential drawbacks). What I tend to do when I see a function that returns IEnumerable is call .ToArray() in order to avoid potential issues with lazy evaluation, but if the collection is already an array this is inefficient.
I like the principle: promise a lot, require little. That is, don't require that the input parameters be arrays (use IEnumerable), but return an array rather than IEnumerable, as an array is a bigger promise.

IEnumerable<T> vs T[]

I just realized that maybe I was mistaken all this time in exposing T[] to my views, instead of IEnumerable<T>.
Usually, for this kind of code:
foreach (var item in items) {}
Should items be T[] or IEnumerable<T>?
Then, if I need to get the count of the items, would the array's Length be faster than IEnumerable<T>.Count()?
IEnumerable<T> is generally a better choice here, for the reasons listed elsewhere. However, I want to bring up one point about Count(). Quintin is incorrect when he says that the type itself implements Count(). It's actually implemented in Enumerable.Count() as an extension method, which means other types don't get to override it to provide more efficient implementations.
By default, Count() has to iterate over the whole sequence to count the items. However, it does know about ICollection<T> and ICollection, and is optimised for those cases. (In .NET 3.5 IIRC it's only optimised for ICollection<T>.) Now the array does implement that, so Enumerable.Count() defers to ICollection<T>.Count and avoids iterating over the whole sequence. It's still going to be slightly slower than calling Length directly, because Count() has to discover that it implements ICollection<T> to start with - but at least it's still O(1).
The same kind of thing is true for performance in general: the JITted code may well be somewhat tighter when iterating over an array rather than a general sequence. You'd basically be giving the JIT more information to play with, and even the C# compiler itself treats arrays differently for iteration (using the indexer directly).
However, these performance differences are going to be inconsequential for most applications - I'd definitely go with the more general interface until I had good reason not to.
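The Count() behavior described above can be sketched like this. The Lazy iterator and the steps counter are hypothetical, added only to show when a full walk of the sequence happens.

```csharp
// Uses C# 9 top-level statements; wrap in a Main method on older compilers.
using System;
using System.Collections.Generic;
using System.Linq;

var steps = 0;

// Hypothetical lazy sequence that records how many items were produced.
IEnumerable<int> Lazy()
{
    for (var i = 0; i < 5; i++)
    {
        steps++;
        yield return i;
    }
}

int[] array = { 0, 1, 2, 3, 4 };
IEnumerable<int> asSequence = array;

Console.WriteLine(array.Length);                   // 5, read directly
Console.WriteLine(asSequence is ICollection<int>); // True, so Count() can take its shortcut
Console.WriteLine(asSequence.Count());             // 5

Console.WriteLine(Lazy().Count()); // 5, but only by walking the whole sequence
Console.WriteLine(steps);          // 5
```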
It's partially inconsequential, but standard theory would dictate "Program against an interface, not an implementation". With the interface model you can change the actual datatype being passed without affecting the caller, as long as it conforms to the same interface.
The contrast to that is that you might have a reason for exposing an array specifically and in which case would want to express that.
For your example I think IEnumerable<T> would be desirable. It's also worth noting that, for testing purposes, using an interface can reduce the headache you would incur if you had particular classes you would have to re-create all the time. Collections generally aren't as bad, but having an interface contract you can mock easily is very nice.
Added for edit:
This is more inconsequential because the underlying datatype is what will implement the Count() method, for an array it should access the known length, I would not worry about any perceived overhead of the method.
See Jon Skeet's answer for an explanation of the Count() implementation.
T[] (one-dimensional, zero-based) also implements ICollection<T> and IList<T> along with IEnumerable<T>.
Therefore, if you want looser coupling in your application, IEnumerable<T> is preferable, unless you want indexed access inside the foreach.
Since Array class implements the System.Collections.Generic.IList<T>, System.Collections.Generic.ICollection<T>, and System.Collections.Generic.IEnumerable<T> generic interfaces, I would use IEnumerable, unless you need to use these interfaces.
http://msdn.microsoft.com/en-us/library/system.array.aspx
Your gut feeling is correct, if all the view cares about, or should care about, is having an enumerable, that's all it should demand in its interfaces.
What is it logically (conceptually) from the outside?
If it's an array, then return the array. If the only point is to enumerate, then return IEnumerable. Otherwise IList or ICollection may be the way to go.
If you want to offer lots of functionality but not allow it to be modified, then perhaps use a List internally and return the ReadOnlyCollection<T> returned from its AsReadOnly() method.
Given that changing the code from an array to IEnumerable at a later date is easy, but changing it the other way is not, I would go with an IEnumerable until you know you need the small speed benefit of returning an array.
