Does IEnumerable always imply a collection? - c#

Just a quick question regarding IEnumerable:
Does IEnumerable always imply a collection? Or is it legitimate/viable/okay/whatever to use on a single object?

The IEnumerable and IEnumerable<T> interfaces suggest a sequence of some kind, but that sequence doesn't need to be a concrete collection.
For example, where's the underlying concrete collection in this case?
foreach (int i in new EndlessRandomSequence().Take(5))
{
Console.WriteLine(i);
}
// ...
public class EndlessRandomSequence : IEnumerable<int>
{
public IEnumerator<int> GetEnumerator()
{
var rng = new Random();
while (true) yield return rng.Next();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}

IEnumerable is always implemented by a single object: that object is the holder or producer of zero or more other objects, which need not have any relation to IEnumerable themselves.
It's usual, but not mandatory, that IEnumerable represents a collection.
Enumerables can be collections, as well as generators, queries, and even computations.
Generator:
IEnumerable<int> Generate(
int initial,
Func<int, bool> condition,
Func<int, int> iterator)
{
var i = initial;
while (true)
{
yield return i;
i = iterator(i);
if (!condition(i))
{
yield break;
}
}
}
Query:
IEnumerable<Process> GetProcessesWhereNameContains(string text)
{
// Could be web-service or database call too
var processes = System.Diagnostics.Process.GetProcesses();
foreach (var process in processes)
{
if (process.ProcessName.Contains(text))
{
yield return process;
}
}
}
Computation:
IEnumerable<double> Average(IEnumerable<double> values)
{
var sum = 0.0;
var count = 0;
foreach (var value in values)
{
sum += value;
yield return sum/++count;
}
}
LINQ is itself a series of operators that produce objects that implement IEnumerable<T> that don't have any underlying collections.
Good question, BTW!
NB: Any reference to IEnumerable also applies to IEnumerable<T> as the latter inherits the former.

Yes, IEnumerable implies a collection, or possible collection, of items.
The name is derived from enumerate, which means to:
Mention (a number of things) one by one.
Establish the number of.

According to the docs, it exposes the enumerator over a collection.

You can certainly use it on a single object, but this object will then just be exposed as an enumeration containing a single object, i.e. you could have an IEnumerable<int> with a single integer:
IEnumerable<int> items = new[] { 42 };

IEnumerable represents a collection that can be enumerated, not a single item. Look at MSDN; the interface exposes GetEnumerator(), which
...[r]eturns an enumerator that iterates through a collection.

Yes, IEnumerable always implies a collection, that is what enumerate means.
What is your use case for a single object?
I don't see a problem with using it on a single object, but why do you want to do this?

I'm not sure whether you mean a "collection" or a .NET "ICollection" but since other people have only mentioned the former I will mention the latter.
http://msdn.microsoft.com/en-us/library/92t2ye13.aspx
By that definition, all ICollections are IEnumerable, but not the other way around.
But most data structures (even Array) just implement both interfaces.
Going on this train of thought: you could have a car depot (a single object) that does not expose an internal data structure, and put IEnumerable on it. I suppose.

Related

Force IEnumerable<T> to evaluate without calling .ToArray() or .ToList()

If I query EF using something like this...
IEnumerable<FooBar> fooBars = db.FooBars.Where(o => o.SomeValue == something);
IIRC, this creates a lazy-evaluated, iterable state machine in the background that does not yet contain any results; rather, it contains an expression of "how" to obtain the results when required.
If I want to force the collection to contain results I have to call .ToArray() or .ToList()
Is there a way to force an IEnumerable<T> collection to contain results without calling .ToArray() or .ToList(); ?
Rationale
I don't know if the CLR is capable of doing this, but essentially I want to forcibly create an evaluated collection that implements the IEnumerable<T> interface, but is implemented under the hood by the CLR, thus NOT a List<T> or T[].
Presumably this is not possible, since I'm not aware of any CLR capability to create in-memory, evaluated collections that implement IEnumerable<T>
Proposal
Say for example, I could write something like this:
IEnumerable<FooBar> x = db.FooBars
.Where(o => o.SomeValue == something)
.Evaluate(); // Does NOT return a "concrete" impl such as List<T> or T[]
Console.WriteLine(x.GetType().Name);
// eg. <EvaluatedEnumerable>e__123
Is there a way to force an IEnumerable<T> collection to contain results without calling .ToArray() or .ToList(); ?
Yes, but it is perhaps not what you want:
IEnumerable<T> source = …;
IEnumerable<T> cached = new List<T>(source);
The thing is, IEnumerable<T> is not a concrete type. It is an interface (contract) representing an item sequence. There can be any concrete type "hiding behind" this interface; some might only represent a query, others actually hold the queried items in memory.
If you want to force-evaluate your sequence so that the result is actually stored in physical memory, you need to make sure that the concrete type behind IEnumerable<T> is an in-memory collection that holds the results of the evaluation. The above code example does just that.
You can use a foreach loop:
foreach (var item in fooBars) { }
Note that this evaluates all items in fooBars, but throws away the result immediately. Next time you run the same foreach loop or .ToArray(), .ToList(), the enumerable will be evaluated once again.
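To make the re-evaluation point concrete, here is a minimal sketch. The names DeferredDemo, Query and Evaluations are made up for illustration, and the iterator method only stands in for a real lazy query such as the EF one in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the lazy sequence has been walked from the start.
    public static int Evaluations;

    // Hypothetical stand-in for a lazy query; the body runs once per enumeration.
    public static IEnumerable<int> Query()
    {
        Evaluations++;
        for (int i = 1; i <= 3; i++)
            yield return i;
    }

    public static (int afterLoops, int afterCache) Run()
    {
        Evaluations = 0;
        IEnumerable<int> lazy = Query();

        foreach (var item in lazy) { }    // first full evaluation, results discarded
        foreach (var item in lazy) { }    // evaluated again from scratch
        int afterLoops = Evaluations;

        List<int> cached = lazy.ToList(); // one more evaluation, results kept in memory
        foreach (var item in cached) { }  // reads the list, no re-evaluation
        foreach (var item in cached) { }
        return (afterLoops, Evaluations);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.afterLoops);  // 2
        Console.WriteLine(result.afterCache);  // 3
    }
}
```

Against a real database each of those evaluations would presumably be another round trip, which is why the empty foreach is rarely what you want.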
A concrete use case I've run into revolves around needing to ensure that an IEnumerable that wraps a DB Query has begun returning results (indicating that the query did not time out) before returning control to the calling method. But the results are too large to evaluate fully, hence the IEnumerable to support streaming.
internal class EagerEvaluator<T>
{
private readonly T _first;
private readonly IEnumerator<T> _enumerator;
private readonly bool _hasFirst;
public EagerEvaluator(IEnumerable<T> enumerable)
{
_enumerator = enumerable.GetEnumerator();
if (_enumerator.MoveNext())
{
_hasFirst = true;
_first = _enumerator.Current;
}
}
public IEnumerable<T> ToEnumerable()
{
if (_hasFirst)
{
yield return _first;
while (_enumerator.MoveNext())
{
yield return _enumerator.Current;
}
}
}
}
The usage is pretty straightforward:
IEnumerable<FooBar> eagerFooBars = new EagerEvaluator<FooBar>(fooBars).ToEnumerable();
Another option is:
<linq expression>.All( x => true);
I use Aggregate<T>() to evaluate an IEnumerable<T> with side effects:
private static IEnumerable<T> Evaluate<T>(IEnumerable<T> source)
=> source.Aggregate(Enumerable.Empty<T>(), (evaluated, s) => evaluated.Append(s));
See it in action: https://dotnetfiddle.net/iya2l0

Calculating Count for IEnumerable (Non Generic)

Can anyone help me with a Count extension method for IEnumerable (the non-generic interface)?
I know it is not supported in LINQ but how to write it manually?
yourEnumerable.Cast<object>().Count()
To the comment about performance:
I think this is a good example of premature optimization but here you go:
static class EnumerableExtensions
{
public static int Count(this IEnumerable source)
{
int res = 0;
foreach (var item in source)
res++;
return res;
}
}
The simplest form would be:
public static int Count(this IEnumerable source)
{
int c = 0;
using (var e = source.GetEnumerator())
{
while (e.MoveNext())
c++;
}
return c;
}
You can then improve on this by querying for ICollection:
public static int Count(this IEnumerable source)
{
var col = source as ICollection;
if (col != null)
return col.Count;
int c = 0;
using (var e = source.GetEnumerator())
{
while (e.MoveNext())
c++;
}
return c;
}
Update
As Gerard points out in the comments, the non-generic IEnumerator does not inherit IDisposable, so the normal using statement won't work. It is probably still important to attempt to dispose of such enumerators if possible: an iterator method implements IEnumerable and so may be passed indirectly to this Count method. Internally, that iterator method depends on a call to Dispose to trigger its own try/finally and using statements.
To make this easy in other circumstances too, you can make your own version of the using statement that is less fussy at compile time:
public static void DynamicUsing(object resource, Action action)
{
try
{
action();
}
finally
{
IDisposable d = resource as IDisposable;
if (d != null)
d.Dispose();
}
}
And the updated Count method would then be:
public static int Count(this IEnumerable source)
{
var col = source as ICollection;
if (col != null)
return col.Count;
int c = 0;
var e = source.GetEnumerator();
DynamicUsing(e, () =>
{
while (e.MoveNext())
c++;
});
return c;
}
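For what it's worth, a plain foreach over a non-generic IEnumerable already does this conditional dispose: the compiler emits a finally block that disposes the enumerator if it happens to implement IDisposable. The sketch below tries to show that; TrackingEnumerator, TrackingSequence and CountAll are hypothetical names invented for the example.

```csharp
using System;
using System.Collections;

// Hypothetical enumerator that yields 1, 2, 3 and records whether Dispose was called.
public sealed class TrackingEnumerator : IEnumerator, IDisposable
{
    private int _position;
    public bool Disposed { get; private set; }
    public object Current => _position;
    public bool MoveNext() => ++_position <= 3;
    public void Reset() => _position = 0;
    public void Dispose() => Disposed = true;
}

// Non-generic sequence that remembers the last enumerator it handed out.
public sealed class TrackingSequence : IEnumerable
{
    public TrackingEnumerator Last { get; private set; }
    public IEnumerator GetEnumerator() => Last = new TrackingEnumerator();
}

public static class NonGenericCount
{
    // Counts by enumeration; foreach takes care of disposing the enumerator.
    public static int CountAll(IEnumerable source)
    {
        int count = 0;
        foreach (var item in source)
            count++;
        return count;
    }

    public static void Main()
    {
        var sequence = new TrackingSequence();
        Console.WriteLine(CountAll(sequence));      // 3
        Console.WriteLine(sequence.Last.Disposed);  // True
    }
}
```

So the simple foreach-based Count shown earlier should behave the same as the DynamicUsing version as far as cleanup goes; the ICollection shortcut is the part that actually saves work.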
Different types of IEnumerable have different optimal methods for determining count; unfortunately, there's no general-purpose means of knowing which method will be best for any given IEnumerable, nor is there even any standard means by which an IEnumerable can indicate which of the following techniques is best:
Simply ask the object directly. Some types of objects that support IEnumerable, such as Array, List and Collection, have properties which can directly report the number of elements in them.
Enumerate all items, discarding them, and count the number of items enumerated.
Enumerate all items into a list, and then use the list if it's necessary to use the enumeration again.
Each of the above will be optimal in different cases.
I think the type chosen to represent your sequence of elements should have been ICollection instead of IEnumerable, in the first place.
Both ICollection and ICollection<T> provide a Count property, and every ICollection implements IEnumerable as well.

How can I add an item to a IEnumerable<T> collection?

My question as title above. For example
IEnumerable<T> items = new T[]{new T("msg")};
items.ToList().Add(new T("msg2"));
but after all it only has 1 item inside. Can we have a method like items.Add(item) like the List<T>?
You cannot, because IEnumerable<T> does not necessarily represent a collection to which items can be added. In fact, it does not necessarily represent a collection at all! For example:
IEnumerable<string> ReadLines()
{
string s;
do
{
s = Console.ReadLine();
yield return s;
} while (!string.IsNullOrEmpty(s));
}
IEnumerable<string> lines = ReadLines();
lines.Add("foo") // so what is this supposed to do??
What you can do, however, is create a new IEnumerable object (of unspecified type), which, when enumerated, will provide all items of the old one, plus some of your own. You use Enumerable.Concat for that:
items = items.Concat(new[] { "foo" });
This will not change the array object (you cannot insert items into arrays, anyway). But it will create a new object that will list all items in the array, and then "foo". Furthermore, that new object will keep track of changes in the array (i.e. whenever you enumerate it, you'll see the current values of items).
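A small sketch of that behaviour, with made-up names (ConcatDemo, Run) and a two-element string array standing in for the original items:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatDemo
{
    public static string Run()
    {
        string[] array = { "a", "b" };

        // Concat builds a lazy wrapper; the array itself is not touched.
        IEnumerable<string> items = array.Concat(new[] { "foo" });
        string before = string.Join(",", items);

        // Changing the array afterwards is visible on the next enumeration.
        array[0] = "changed";
        string after = string.Join(",", items);

        return before + " | " + after + " | " + array.Length;
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // a,b,foo | changed,b,foo | 2
    }
}
```

The array still has two elements at the end; only the wrapper sees three.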
The type IEnumerable<T> does not support such operations. The purpose of the IEnumerable<T> interface is to allow a consumer to view the contents of a collection. Not to modify the values.
When you do operations like .ToList().Add() you are creating a new List<T> and adding a value to that list. It has no connection to the original list.
What you can do is use the Add extension method to create a new IEnumerable<T> with the added value.
items = items.Add("msg2");
Even in this case it won't modify the original IEnumerable<T> object. This can be verified by holding a reference to it. For example
var items = new string[]{"foo"};
var temp = items;
items = items.Add("bar");
After this set of operations the variable temp will still only reference an enumerable with a single element "foo" in the set of values while items will reference a different enumerable with values "foo" and "bar".
EDIT
I constantly forget that Add is not a typical extension method on IEnumerable<T> because it's one of the first ones that I end up defining. Here it is:
public static IEnumerable<T> Add<T>(this IEnumerable<T> e, T value) {
foreach ( var cur in e) {
yield return cur;
}
yield return value;
}
Have you considered using the ICollection<T> or IList<T> interfaces instead? They exist for the very reason that you want to have an Add method on an IEnumerable<T>.
IEnumerable<T> is used to 'mark' a type as being...well, enumerable or just a sequence of items without necessarily making any guarantees of whether the real underlying object supports adding/removing of items. Also remember that these interfaces inherit from IEnumerable<T>, so you get all the extension methods that you get with IEnumerable<T> as well.
In .NET Core, there is a method Enumerable.Append that does exactly that.
The source code of the method is available on GitHub. The implementation (more sophisticated than the suggestions in other answers) is worth a look :).
A couple short, sweet extension methods on IEnumerable and IEnumerable<T> do it for me:
public static IEnumerable Append(this IEnumerable first, params object[] second)
{
return first.OfType<object>().Concat(second);
}
public static IEnumerable<T> Append<T>(this IEnumerable<T> first, params T[] second)
{
return first.Concat(second);
}
public static IEnumerable Prepend(this IEnumerable first, params object[] second)
{
return second.Concat(first.OfType<object>());
}
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> first, params T[] second)
{
return second.Concat(first);
}
Elegant (well, except for the non-generic versions). Too bad these methods are not in the BCL.
No, the IEnumerable doesn't support adding items to it. The alternative solution is
var myList = new List<T>(items);
myList.Add(otherItem);
To add second message you need to -
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Concat(new[] {new T("msg2")});
I just came here to say that, aside from the Enumerable.Concat extension method, there seems to be another method named Enumerable.Append in .NET Core 1.1.1. The latter allows you to concatenate a single item to an existing sequence. So Aamol's answer can also be written as
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Append(new T("msg2"));
Still, please note that this function will not change the input sequence; it just returns a wrapper that puts the given sequence and the appended item together.
Not only can you not add items like you state, but if you add an item to a List<T> (or pretty much any other non-read only collection) that you have an existing enumerator for, the enumerator is invalidated (throws InvalidOperationException from then on).
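A minimal sketch of the invalidation, assuming a plain List<int> (the class and method names are made up for the example):

```csharp
using System;
using System.Collections.Generic;

public static class InvalidationDemo
{
    // Returns true if adding to the list during enumeration makes the enumerator throw.
    public static bool AddWhileEnumeratingThrows()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int value in list)
            {
                if (value == 1)
                    list.Add(4);  // modifies the list while an enumerator is live
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // The next MoveNext notices the list has changed and throws.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddWhileEnumeratingThrows());  // True
    }
}
```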
If you are aggregating results from some type of data query, you can use the Concat extension method:
Edit: I originally used the Union extension in the example, which is not really correct. My application uses it extensively to make sure overlapping queries don't duplicate results.
IEnumerable<T> itemsA = ...;
IEnumerable<T> itemsB = ...;
IEnumerable<T> itemsC = ...;
return itemsA.Concat(itemsB).Concat(itemsC);
Others have already given great explanations regarding why you can not (and should not!) be able to add items to an IEnumerable. I will only add that if you are looking to continue coding to an interface that represents a collection and want an add method, you should code to ICollection or IList. As an added bonus, these interfaces inherit from IEnumerable.
You can do this:
//Create IEnumerable
IEnumerable<T> items = new T[]{new T("msg")};
//Convert to list.
List<T> list = items.ToList();
//Add new item to list.
list.Add(new T("msg2"));
//Assign the list back; List<T> converts to IEnumerable<T> implicitly, so no cast is needed
items = list;
The easiest way to do that is simply
IEnumerable<T> items = new T[]{new T("msg")};
List<string> itemsList = new List<string>();
itemsList.AddRange(items.Select(y => y.ToString()));
itemsList.Add("msg2");
Then you can return list as IEnumerable also because it implements IEnumerable interface
Instances implementing IEnumerable and IEnumerator (returned from IEnumerable) don't have any APIs that allow altering the collection; the interfaces provide read-only APIs.
The 2 ways to actually alter the collection:
If the instance happens to be some collection with write API (e.g. List) you can try casting to this type:
IList<string> list = enumerableInstance as IList<string>;
Create a list from the IEnumerable (e.g. via the LINQ extension method ToList()):
var list = enumerableInstance.ToList();
IEnumerable<T> items = Enumerable.Empty<T>();
List<T> someValues = new List<T>();
// ToList() creates a new list each time, so keep a reference to it before adding
List<T> list = items.ToList();
list.AddRange(someValues);
items = list;
Sorry for reviving really old question but as it is listed among first google search results I assume that some people keep landing here.
Among the many answers, some of them really valuable and well explained, I would like to add a different point of view because, to me, the problem has not been well identified.
You are declaring a variable that stores data, and you need to be able to change it by adding items to it? Then you shouldn't declare it as IEnumerable.
As proposed by @NightOwl888:
For this example, just declare IList<T> instead of IEnumerable<T>, backed by a List<T> (an array is fixed-size, so its Add throws NotSupportedException): IList<T> items = new List<T> { new T("msg") }; items.Add(new T("msg2"));
Trying to bypass the declared interface limitations only shows that you made the wrong choice.
Beyond this, any approach that re-implements things that already exist elsewhere should be avoided.
Classes and interfaces that let you add items already exist. Why keep recreating things that are already done elsewhere?
Making this kind of choice is exactly the point of abstracting a variable's capabilities behind interfaces.
TL;DR : IMO these are cleanest ways to do what you need :
// 1st choice : Changing declaration
IList<T> variable = new List<T>(); // not new T[] { }: an array is fixed-size and its Add throws NotSupportedException
variable.Add(new T());
// 2nd choice : Changing instantiation, letting the framework taking care of declaration
var variable = new List<T> { };
variable.Add(new T());
When you need to use variable as an IEnumerable, you can. When you need it as an array, you can call 'ToArray()'; it really should always be that simple. No extension method needed, casts only when really needed, the ability to use LINQ on your variable, etc.
Stop doing weird and/or complex things because you only made a mistake when declaring/instantiating.
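One caveat when declaring the variable as IList<T>: the object behind it has to be resizable. An array satisfies the interface but is fixed-size, so Add fails at runtime. A small sketch (IListDemo and its methods are hypothetical names, and string stands in for T):

```csharp
using System;
using System.Collections.Generic;

public static class IListDemo
{
    // An array can be assigned to IList<T>, but it rejects Add.
    public static bool ArrayAddThrows()
    {
        IList<string> fixedSize = new string[] { "msg" };
        try
        {
            fixedSize.Add("msg2");
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    // A List<T> behind the same interface grows as expected.
    public static int ListAddCount()
    {
        IList<string> growable = new List<string> { "msg" };
        growable.Add("msg2");
        return growable.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayAddThrows());  // True
        Console.WriteLine(ListAddCount());    // 2
    }
}
```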
Maybe I'm too late but I hope it helps anyone in the future.
You can use the Insert method of List<T> to add an item at a specific index.
list.Insert(0, item);
Sure, you can (I am leaving your T-business aside):
public IEnumerable<string> tryAdd(IEnumerable<string> items)
{
List<string> list = items.ToList();
string obj = "";
list.Add(obj);
return list.Select(i => i);
}

ICollection - Get single value

What is the best way to get a value from a ICollection?
We know the Collection is empty apart from that.
You can use LINQ for this:
var foo = myICollection.OfType<YourType>().FirstOrDefault();
// or use a query
var bar = (from x in myICollection.OfType<YourType>() where x.SomeProperty == someValue select x)
.FirstOrDefault();
The simplest way to do this is:
foreach(object o in collection) {
return o;
}
But this isn't particularly efficient if it's actually a generic collection, because IEnumerator<T> implements IDisposable, so the compiler has to put in a try/finally, with a Dispose() call in the finally block.
If it's a non-generic collection, or you know the generic collection implements nothing in its Dispose() method, then the following can be used:
IEnumerator en = collection.GetEnumerator();
en.MoveNext();
return en.Current;
If you know it may implement IList, you can do this:
IList iList = collection as IList;
if (iList != null) {
// Implements IList, so can use indexer
return iList[0];
}
// Use the slower way
foreach (object o in collection) {
return o;
}
Likewise, if it's likely it'll be of a certain type of your own definition that has some kind of indexed access, you can use the same technique.
collection.ToArray()[i]
This way is slow, but very simple to use.
Without generics, and because ICollection implements IEnumerable, you can do as in example 1. With generics you simply need to do as in example 2:
List<string> l = new List<string>();
l.Add("astring");
ICollection col1 = (ICollection)l;
ICollection<string> col2 = (ICollection<string>)l;
//example 1
IEnumerator e1 = col1.GetEnumerator();
if (e1.MoveNext())
Console.WriteLine(e1.Current);
//example 2
if (col2.Count != 0)
Console.WriteLine(col2.Single());
If you know your collection has only one item, should only ever have one item, you can use the Linq extension method Single().
This returns the single T element of an ICollection<T>. If the collection contains no elements, or more than one, it throws an InvalidOperationException.
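A quick sketch of the three cases for Single(), using made-up names (SingleDemo, Describe) and List<string> as the collection:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleDemo
{
    // Reports what Single() does for a given collection.
    public static string Describe(ICollection<string> items)
    {
        try
        {
            return "ok: " + items.Single();
        }
        catch (InvalidOperationException)
        {
            return "throws";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new List<string> { "astring" }));  // ok: astring
        Console.WriteLine(Describe(new List<string>()));              // throws
        Console.WriteLine(Describe(new List<string> { "a", "b" }));   // throws
    }
}
```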

ReadOnlyCollection or IEnumerable for exposing member collections?

Is there any reason to expose an internal collection as a ReadOnlyCollection rather than an IEnumerable if the calling code only iterates over the collection?
class Bar
{
private ICollection<Foo> foos;
// Which one is to be preferred?
public IEnumerable<Foo> Foos { ... }
public ReadOnlyCollection<Foo> Foos { ... }
}
// Calling code:
foreach (var f in bar.Foos)
DoSomething(f);
As I see it, IEnumerable is a subset of the interface of ReadOnlyCollection and it does not allow the user to modify the collection. So if the IEnumerable interface is enough then that is the one to use. Is that a proper way of reasoning about it or am I missing something?
Thanks /Erik
More modern solution
Unless you need the internal collection to be mutable, you could use the System.Collections.Immutable package, change your field type to be an immutable collection, and then expose that directly - assuming Foo itself is immutable, of course.
Updated answer to address the question more directly
Is there any reason to expose an internal collection as a ReadOnlyCollection rather than an IEnumerable if the calling code only iterates over the collection?
It depends on how much you trust the calling code. If you're in complete control over everything that will ever call this member and you guarantee that no code will ever use:
ICollection<Foo> evil = (ICollection<Foo>) bar.Foos;
evil.Add(...);
then sure, no harm will be done if you just return the collection directly. I generally try to be a bit more paranoid than that though.
Likewise, as you say: if you only need IEnumerable<T>, then why tie yourself to anything stronger?
Original answer
If you're using .NET 3.5, you can avoid making a copy and avoid the simple cast by using a simple call to Skip:
public IEnumerable<Foo> Foos {
get { return foos.Skip(0); }
}
(There are plenty of other options for wrapping trivially - the nice thing about Skip over Select/Where is that there's no delegate to execute pointlessly for each iteration.)
If you're not using .NET 3.5 you can write a very simple wrapper to do the same thing:
public static IEnumerable<T> Wrapper<T>(IEnumerable<T> source)
{
foreach (T element in source)
{
yield return element;
}
}
If you only need to iterate through the collection:
foreach (Foo f in bar.Foos)
then returning IEnumerable is enough.
If you need random access to items:
Foo f = bar.Foos[17];
then wrap it in ReadOnlyCollection.
If you do this then there's nothing stopping your callers casting the IEnumerable back to ICollection and then modifying it. ReadOnlyCollection removes this possibility, although it's still possible to access the underlying writable collection via reflection. If the collection is small then a safe and easy way to get around this problem is to return a copy instead.
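A rough sketch of that difference. The Bar class here is a cut-down stand-in for the one in the question, with string in place of Foo, and the property names Direct and Wrapped are invented for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Bar
{
    private readonly List<string> foos = new List<string> { "a", "b" };

    // Hands out the list itself, merely typed as IEnumerable<T>.
    public IEnumerable<string> Direct => foos;

    // Hands out a read-only wrapper around the same list.
    public ReadOnlyCollection<string> Wrapped => foos.AsReadOnly();

    public int Count => foos.Count;
}

public static class ExposureDemo
{
    public static string Run()
    {
        var bar = new Bar();

        // The caller casts back to ICollection<T> and mutates the internal list.
        ((ICollection<string>)bar.Direct).Add("evil");
        int afterCast = bar.Count;

        // The same trick against the wrapper is rejected at runtime.
        bool wrapperRejects;
        try
        {
            ((ICollection<string>)bar.Wrapped).Add("evil");
            wrapperRejects = false;
        }
        catch (NotSupportedException)
        {
            wrapperRejects = true;
        }

        return afterCast + " " + wrapperRejects + " " + bar.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Run());  // 3 True 3
    }
}
```

Note that the wrapper is a live view rather than a copy, so later changes made by Bar itself would still show through it.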
I avoid using ReadOnlyCollection as much as possible; it is actually considerably slower than just using a normal List.
See this example:
List<int> intList = new List<int>();
//Use a ReadOnlyCollection around the List
System.Collections.ObjectModel.ReadOnlyCollection<int> mValue = new System.Collections.ObjectModel.ReadOnlyCollection<int>(intList);
for (int i = 0; i < 100000000; i++)
{
intList.Add(i);
}
long result = 0;
//Use normal foreach on the ReadOnlyCollection
TimeSpan lStart = new TimeSpan(System.DateTime.Now.Ticks);
foreach (int i in mValue)
result += i;
TimeSpan lEnd = new TimeSpan(System.DateTime.Now.Ticks);
MessageBox.Show("Speed(ms): " + (lEnd.TotalMilliseconds - lStart.TotalMilliseconds).ToString());
MessageBox.Show("Result: " + result.ToString());
//use <list>.ForEach
lStart = new TimeSpan(System.DateTime.Now.Ticks);
result = 0;
intList.ForEach(delegate(int i) { result += i; });
lEnd = new TimeSpan(System.DateTime.Now.Ticks);
MessageBox.Show("Speed(ms): " + (lEnd.TotalMilliseconds - lStart.TotalMilliseconds).ToString());
MessageBox.Show("Result: " + result.ToString());
Sometimes you may want to use an interface, perhaps because you want to mock the collection during unit testing. Please see my blog entry for adding your own interface to ReadOnlyCollection by using an adapter.
