Iterate over your own objects (the easy way...) - c#

I seem to remember an interface that basically has one method, which has to return an IEnumerable object, and implementing this interface allows you to use foreach over your object. Can someone tell me what this interface is, or correct me if I'm misremembering?
Edit: Sorry guys, I just realized I'm mixing two things up in my head. I don't think what I just asked for exists, but what I was (and am still) trying to think of is an interface you can implement instead of either IList or IEnumerable (I forget which) that has a method which lets you just return an object of that type rather than actually implementing the IList (or IEnumerable?) interface.
So... slightly different question but still just as relevant to me.
EDIT: IListSource is what I was trying to think of. Sorry everyone for the poorly thought out question. Ah well, they can't all be good :)
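For reference, a minimal sketch of what implementing IListSource (from System.ComponentModel) can look like. The interface declares GetList() and ContainsListCollection; the class name PersonStore and its contents are hypothetical, made up for illustration. Note that foreach does not look at IListSource on its own (it is consumed mainly by data-binding code), so a caller would iterate over the result of GetList().

```csharp
using System.Collections;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical class: hands out an inner list instead of implementing IList itself.
public class PersonStore : IListSource
{
    private readonly List<string> _names = new List<string> { "Ann", "Bob" };

    // False: GetList() returns a list of items, not a list of lists.
    public bool ContainsListCollection => false;

    // List<T> also implements the non-generic IList, so it can be returned directly.
    public IList GetList() => _names;
}
```

A caller could then write foreach (var name in new PersonStore().GetList()) { ... }.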

IEnumerable<T> has one method to return an IEnumerator<T>
IEnumerator<T> has methods like MoveNext and Current

It's IEnumerable (or the generic equivalent, IEnumerable<T>) which has a GetEnumerator method, returning either IEnumerator or IEnumerator<T>.
When you implement that method, you can either call GetEnumerator on another collection (e.g. a list within your own class) or you can use iterator blocks to perform more custom iteration.
If you run into any problems doing what you need to, post more details and we can help more.

The IEnumerable<T> interface (and the non-generic IEnumerable interface) is used to make classes enumerable.
You can implement IEnumerable<T> like this:
public class MyClass : IEnumerable<Item> {
    public IEnumerator<Item> GetEnumerator() {
        // here you return your items, either by returning another collection's enumerator:
        //     return ((IEnumerable<Item>)someArray).GetEnumerator();
        // or by using yield return (the two styles cannot be mixed in one method):
        for (int i = 0; i < 10; i++) {
            yield return new Item(i);
        }
    }
    // this is needed, because IEnumerable<T> inherits IEnumerable
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
        // uses the generic implementation
        return GetEnumerator();
    }
}
Edit:
The foreach statement uses duck typing, and not the IEnumerable<T> interface, so you don't really need to implement the interface to use a class in a foreach statement. You just implement the GetEnumerator method.

Objects that implement IEnumerable (or the generic IEnumerable<T>) need to implement GetEnumerator. For IEnumerable the method returns a non-generic IEnumerator; for IEnumerable<T> it returns a strongly typed IEnumerator<T>.
IEnumerator provides members like
MoveNext, which advances the enumerator to the next item in the collection and returns false once it has passed the end
Current, which returns the item at the current position
Reset, which resets the enumerator to its initial position, which is before the first element in the collection.
Using these you can iterate over the collection.
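To make the roles of MoveNext and Current concrete, here is a rough sketch of the loop that foreach performs over an IEnumerable<int>. The class and method names (ManualEnumeration, Sum) are made up for illustration, and the real compiler expansion has a few more details.

```csharp
using System.Collections.Generic;

public static class ManualEnumeration
{
    // Roughly what foreach does: get an enumerator, call MoveNext until it
    // returns false, and read Current after each successful MoveNext.
    public static int Sum(IEnumerable<int> source)
    {
        int total = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                total += e.Current;
            }
        }
        return total;
    }
}
```

Reset is never called in this loop, which matches how foreach behaves.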

Iterator blocks allow you to "yield return" objects of type T rather than creating a type that implements IEnumerable<T>. For example:
IEnumerable<int> PrimesLessThanTen()
{
yield return 2;
yield return 3;
yield return 5;
yield return 7;
}
or
IEnumerable<byte> OddBytes()
{
byte b = 1;
do
{
yield return b;
b += 2;
} while (b != 1);
}
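A small sketch of how such an iterator behaves when consumed, assuming LINQ is available. The wrapper class IteratorDemo and its helper methods are hypothetical; the iterator body follows OddBytes above, with the wrap-around written out explicitly via unchecked so it does not depend on compiler settings.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IteratorDemo
{
    public static IEnumerable<byte> OddBytes()
    {
        byte b = 1;
        do
        {
            yield return b;
            // 255 + 2 wraps around to 1, which ends the loop.
            b = unchecked((byte)(b + 2));
        } while (b != 1);
    }

    // Runs the iterator to completion: 1, 3, ..., 255 is 128 values.
    public static int CountAll() => OddBytes().Count();

    // Only pulls three values; the rest of the body never executes.
    public static byte[] FirstThree() => OddBytes().Take(3).ToArray();
}
```

Nothing in the iterator body runs until a caller starts enumerating, which is why Take(3) can stop early.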

Related

Why is indexing used in the ICollection<T> Interface implementation example in Microsoft's documentation, if you cannot use it?

I'm trying to understand how to implement generic collections and the IEnumerator interface; I'm using the Documentation provided to do so.
In the given example the enumerator's method MoveNext() is implemented as follows:
public bool MoveNext()
{
//Avoids going beyond the end of the collection.
if (++curIndex >= _collection.Count)
{
return false;
}
else
{
// Set current box to next item in collection.
curBox = _collection[curIndex];
}
return true;
}
curIndex is used as the index into BoxCollection, which implements ICollection. If I try to do the same I get "Cannot apply indexing with [] to an expression of type 'System.Collections.Generic.ICollection...".
Is the documentation wrong, or is it me not doing something correctly?
BoxCollection itself implements the indexer:
public Box this[int index]
{
get { return (Box)innerCol[index]; }
set { innerCol[index] = value; }
}
(lines 129-133 of the sample you linked to)
You're right that you can't use the indexer on a class that implements ICollection<T> - unless that class also implements an indexer.
In the sample code in the documentation, _collection is a BoxCollection. A BoxCollection is also an ICollection<T>, but here the variable is typed as BoxCollection and can therefore be indexed, because BoxCollection implements a this[int] indexer property.
Had the sample code declared _collection as some ICollection<T>, their code would get the same error yours does. In other words, the indexability comes from their variable being an indexable type, which is unrelated to it also implementing ICollection<T> (ICollection<T> does not mandate provision of an indexer).
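The difference can be sketched with a plain List<int> standing in for BoxCollection. The class and method names here are made up for illustration, and ElementAt is just one possible workaround when only the interface type is available.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexerDemo
{
    public static int SecondViaConcreteType(List<int> items)
    {
        // Compiles: List<T> declares a this[int] indexer.
        return items[1];
    }

    public static int SecondViaInterface(ICollection<int> items)
    {
        // items[1] would not compile here: ICollection<T> declares no indexer.
        // ElementAt (LINQ) works on any IEnumerable<T> instead.
        return items.ElementAt(1);
    }
}
```

Both methods can receive the very same list object; only the static type of the parameter decides whether [] is allowed.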

What is the default concrete type of IEnumerable

(Sorry for the vague title; couldn't think of anything better. Feel free to rephrase.)
So let's say my function or property returns an IEnumerable<T>:
public IEnumerable<Person> Adults
{
get
{
return _Members.Where(i => i.Age >= 18);
}
}
If I run a foreach on this property without actually materializing the returned enumerable:
foreach(var Adult in Adults)
{
//...
}
Is there a rule that governs whether IEnumerable<Person> will be materialized to array or list or something else?
Also is it safe to cast Adults to List<Person> or Array without calling ToList() or ToArray()?
Edit
Many people have spent a lot of effort into answering this question. Thanks to all of them. However, the gist of this question still remains unanswered. Let me put in some more details:
I understand that foreach doesn't require the target object to be an array or list. It doesn't even need to be a collection of any kind. All it needs the target object to do is to implement enumeration. However if I inspect the value of the target object, it reveals that the actual underlying object is List<T> (just like it shows object (string) when you inspect a boxed string object). This is where the confusion starts. Who performed this materialization? I inspected the underlying layers (Where() function's source) and it doesn't look like those functions are doing this.
So my problem lies at two levels.
The first one is purely theoretical. Unlike many other disciplines like physics and biology, in computer science we always know precisely how something works (answering @zzxyz's last comment); so I was trying to find out which agent created the List<T>, how it decided it should choose a List and not an Array, and whether there is a way of influencing that decision from our code.
My second reason was practical. Can I rely on the type of actual underlying object and cast it to List<T>? I need to use some List<T> functionality and I was wondering if for example ((List<Person>)Adults).BinarySearch() is as safe as Adults.ToList().BinarySearch()?
I also understand that it isn't going to create any performance penalty even if I do call ToList() explicitly. I was just trying to understand how it is working. Anyway, thanks again for the time; I guess I have spent just too much time on it.
In general terms all you need for a foreach to work is to have an object with an accessible GetEnumerator() method that returns an object that has the following members:
bool MoveNext()
T Current { get; } // where `T` is some type.
A void Reset() method is conventional, since IEnumerator declares it, but foreach itself never calls it.
You don't even need an IEnumerable or IEnumerable<T>.
This code works as the compiler figures out everything it needs:
void Main()
{
foreach (var adult in new Adults())
{
Console.WriteLine(adult.ToString());
}
}
public class Adult
{
public override string ToString() => "Adult!";
}
public class Adults
{
public class Enumerator
{
public Adult Current { get; private set; }
public bool MoveNext()
{
if (this.Current == null)
{
this.Current = new Adult();
return true;
}
this.Current = null;
return false;
}
public void Reset() { this.Current = null; }
}
public Enumerator GetEnumerator() { return new Enumerator(); }
}
Having a proper enumerable makes the process work more easily and more robustly. The more idiomatic version of the above code is:
public class Adults
{
private class Enumerator : IEnumerator<Adult>
{
public Adult Current { get; private set; }
object IEnumerator.Current => this.Current;
public void Dispose() { }
public bool MoveNext()
{
if (this.Current == null)
{
this.Current = new Adult();
return true;
}
this.Current = null;
return false;
}
public void Reset()
{
this.Current = null;
}
}
public IEnumerator<Adult> GetEnumerator()
{
return new Enumerator();
}
}
This enables the Enumerator to be a private class, i.e. private class Enumerator. The interface then does all of the hard work - it's not even possible to get a reference to the Enumerator class outside of Adults.
The point is that you do not know at compile-time what the concrete type of the class is - and if you did you may not even be able to cast to it.
The interface is all you need, and even that isn't strictly true if you consider my first example.
If you want a List<Adult> or an Adult[] you must call .ToList() or .ToArray() respectively.
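A short sketch of the practical consequence, using int ages in place of Person to keep it self-contained. The class and method names (CastDemo and so on) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    public static bool WhereResultIsList()
    {
        var members = new List<int> { 15, 22, 40 };
        IEnumerable<int> adults = members.Where(age => age >= 18);
        // The object returned by Where is a LINQ iterator, not a List<int>,
        // even though the source it reads from is one.
        return adults is List<int>;
    }

    public static int SearchAfterToList()
    {
        var members = new List<int> { 15, 22, 40 };
        // ToList() builds a real List<int> containing 22 and 40.
        List<int> adults = members.Where(age => age >= 18).ToList();
        return adults.BinarySearch(40);
    }
}
```

So ((List<Person>)Adults) would throw an InvalidCastException for a Where result, while Adults.ToList() is safe.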
There is no such thing as a default concrete type for any interface.
The entire point of an interface is to guarantee properties, methods, events or indexers, without the user needing any knowledge of the concrete type that implements it.
When using an interface, all you can know is the properties, methods, events and indexers this interface declares, and that's all you actually need to know. That's just another aspect of encapsulation - same as when you are using a method of a class you don't need to know the internal implementation of that method.
To answer your question in the comments:
who decides that concrete type in case we don't, just as I did above?
That's the code that created the instance that's implementing the interface.
Since you can't do var Adults = new IEnumerable<Person> - it has to be a concrete type of some sort.
As far as I can see in the source code for LINQ's Enumerable extensions, Where returns either an instance of Iterator<TSource> or an instance of WhereEnumerableIterator<TSource>. I didn't bother checking further what exactly those types are, but I can pretty much guarantee they both implement IEnumerable, or the guys at Microsoft are using a different C# compiler than the rest of us... :-)
The following code hopefully highlights why neither you nor the compiler can assume an underlying collection:
public class OneThroughTen : IEnumerable<int>
{
private static int bar = 0;
public IEnumerator<int> GetEnumerator()
{
while (true)
{
yield return ++bar;
if (bar == 10)
{ yield break; }
}
}
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
class Program
{
static void Main(string[] args)
{
IEnumerable<int> x = new OneThroughTen();
foreach (int i in x)
{ Console.Write("{0} ", i); }
}
}
Output being, of course:
1 2 3 4 5 6 7 8 9 10
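Note, the code above behaves extremely poorly in the debugger. A likely reason is that bar is static and never reset, so a second enumeration (such as the debugger evaluating the sequence) starts at 11 and never hits the bar == 10 check. This code behaves just fine: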
public IEnumerator<int> GetEnumerator()
{
while (bar < 10)
{
yield return ++bar;
}
bar = 0;
}
(I used static for bar to highlight that not only does the OneThroughTen not have a specific collection, it doesn't have any collection, and in fact has no instance data whatsoever. We could just as easily return 10 random numbers, which would've been a better example, now that I think on it :))
From your edited question and comments it sounds like you understand the general concept of using IEnumerable, and that you cannot assume that "a list object backs all IEnumerable objects". Your real question is about something that has confused you in the debugger, but we've not really been able to understand exactly what it is you are seeing. Perhaps a screenshot would help?
Here I have 5 IEnumerable<int> variables which I assign in various ways, along with how the "Watch" window describes them. Does this show the confusion you are having? If not, can you construct a similarly short program and screenshot that does?
Coming a bit late into the party here :)
Actually Linq's "Where" decides what's going to be the underlying implementation of IEnumerable's GetEnumerator.
Look at the source code:
https://github.com/dotnet/runtime/blob/918e6a9a278bc66fb191c43d4db4a71e63ffad31/src/libraries/System.Linq/src/System/Linq/Where.cs#L59
You'll see that based on the "source" type, the method returns a "WhereArrayIterator", a "WhereListIterator" or a more generic "WhereEnumerableIterator".
Each of these objects implements GetEnumerator over an Array, a List, or a general enumerable, so I'm pretty sure that's why you see the underlying object type being one of these in the VS inspector.
Hope this helps clarify.
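One way to check this on your own runtime is to print the runtime type of each Where result. This is only a sketch: the class name WhereTypes is made up, and the internal iterator type names differ between .NET versions, so no specific names are assumed here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WhereTypes
{
    // Returns the runtime type names of Where results for three kinds of source.
    public static string[] Names()
    {
        int[] array = { 1, 2, 3 };
        var list = new List<int> { 1, 2, 3 };
        IEnumerable<int> range = Enumerable.Range(1, 3);

        return new[]
        {
            array.Where(x => x > 1).GetType().Name,
            list.Where(x => x > 1).GetType().Name,
            range.Where(x => x > 1).GetType().Name,
        };
    }
}
```

Writing the three names to the console shows internal LINQ iterator types rather than List or array types; the debugger may still display the source collection as a field inside the iterator.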
I have been digging into this myself. I believe the 'underlying type' is an iterator method, not an actual data structure type.
An iterator method defines how to generate the objects in a sequence
when requested.
https://learn.microsoft.com/en-us/dotnet/csharp/iterators#enumeration-sources-with-iterator-methods
In my usecase/testing, the iterator is System.Linq.Enumerable.SelectManySingleSelectorIterator. I don't think this is a collection data type. It is a method that can enumerate IEnumerables.
Here is a snippet:
public IEnumerable<Item> ItemsToBuy { get; set; }
...
ItemsToBuy = Enumerable.Range(1, rng.Next(1, 20))
.Select(RandomItem(rng, market))
.SelectMany(e => e);
The property is IEnumerable and .SelectMany returns IEnumerable. So what is the actual collection data structure? I don't think there is one in how I am interpreting 'collection data structure'.
Also is it safe to cast Adults to List<Person> or Array without calling ToList() or ToArray()?
Not for me. When attempting to cast ItemsToBuy collection in a foreach loop I get the following runtime exception:
{"Unable to cast object of type
'SelectManySingleSelectorIterator`2[System.Collections.Generic.IEnumerable`1[CashMart.Models.Item],CashMart.Models.Item]'
to type 'CashMart.Models.Item[]'."}
So I could not cast, but I could .ToArray(). I do suspect there is a performance hit as I would think that the IEnumerable would have to 'do things' to make it an array, including memory allocation for the array even if the entities are already in memory.
However if I inspect the value of target object, it reveals that the actual underlying object is List<T>
This was not my experience and I think it may depend on the IEnumerable source as well as the LINQ provider. If I add a where, the returned iterator is:
System.Linq.Enumerable.WhereEnumerableIterator
I am unsure what your _Members source is, but using LINQ-to-Objects, I get an iterator. LINQ-to-Entities must call the database and store the result set in memory somehow and then enumerate on that result. I would doubt that it internally makes it a List, but I don't know much. I suspect instead that _Members may be a List somewhere else in your code, so that even after the .Where the debugger shows a List inside the iterator.

GetIterator() and the iterator pattern

I'm trying to implement the Iterator pattern.
Basically, from what I understand, it makes a class "foreachble" and makes the code more secure by not revealing the exact collection type to the user.
I have been experimenting a bit and I found out that if I implement
IEnumerator GetEnumerator() in my class, I get the desired result ... seemingly sparing the headache of messing around with realizing interfaces.
Here is a glimpse to what I mean:
public class ListUserLoggedIn
{
/*
stuff
*/
public List<UserLoggedIn> UserList { get; set; }
public IEnumerator<UserLoggedIn> GetEnumerator()
{
foreach (UserLoggedIn user in this.UserList)
{
yield return user;
}
}
public void traverse()
{
foreach (var item in ListUserLoggedIn.Instance)
{
Console.Write(item.Id);
}
}
}
I guess my question is, is this a valid example of Iterator?
If yes, why is this working, and what can I do to make the iterator return only a part or an anonymous object via "var".
If not, what is the correct way ...
First a smaller and simplified self-contained version:
class Program
{
public IEnumerator<int> GetEnumerator() // IEnumerable<int> works too.
{
for (int i = 0; i < 5; i++)
yield return i;
}
static void Main(string[] args)
{
var p = new Program();
foreach (int x in p)
{
Console.WriteLine(x);
}
}
}
And the 'strange' thing here is that class Program does not implement IEnumerable.
The specs from Ecma-334 say:
§ 8.18 Iterators
The foreach statement is used to iterate over the elements of an enumerable collection. In order to be enumerable, a collection shall have a parameterless GetEnumerator method that returns an enumerator.
So that's why foreach() works on your class. No mention of IEnumerable. But how does the GetEnumerator() produce something that implements Current and MoveNext ? From the same section:
An iterator is a statement block that yields an ordered sequence of values. An iterator is distinguished from a normal statement block by the presence of one or more yield statements
It is important to understand that an iterator is not a kind of member, but is a means of implementing a function member
So the body of your method is an iterator-block, the compiler checks a number of constraints (the method must return an IEnumerable or IEnumerator) and then implements the IEnumerator members for you.
And to the deeper "why", I just learned something too. Based on an annotation by Eric Lippert in "The C# Programming Language 3rd", page 369:
This is called the "pattern-based approach" and it dates from before generics. An interface based approach in C#1 would have been totally based on passing object around and value types would always have had to be boxed. The pattern approach allows
foreach (int x in myIntCollection)
without generics and without boxing. Neat.
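A sketch of that pattern-based approach: a hypothetical Countdown class that implements no interface at all, with a struct enumerator whose Current is a plain int. All names here are made up for illustration.

```csharp
public class Countdown
{
    private readonly int _start;

    public Countdown(int start) { _start = start; }

    // No IEnumerable here: foreach binds to this method by its name and shape.
    public Enumerator GetEnumerator() => new Enumerator(_start);

    // Struct enumerator with a strongly typed Current: the int values are
    // never converted to object, so there is no boxing.
    public struct Enumerator
    {
        private int _current;

        public Enumerator(int start) { _current = start + 1; }

        public int Current => _current;

        public bool MoveNext()
        {
            if (_current <= 1) return false;
            _current--;
            return true;
        }
    }

    // Iterates start, start - 1, ..., 1 and adds the values up.
    public static int Sum(int start)
    {
        int total = 0;
        foreach (int x in new Countdown(start))
        {
            total += x;
        }
        return total;
    }
}
```

The foreach in Sum compiles even though neither Countdown nor Enumerator implements IEnumerable or IEnumerator.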

how List<T> does not implement Add(object value)?

I believe it's pretty stupid, and I am a bit embarrassed to ask this kind of question, but I still could not find the answer:
I am looking at the class List<T>, which implements IList.
public class List<T> : IList
One of the methods included in IList is
int Add(object value)
I understand that List<T> should not expose that method (type safety...), and it really does not. But how can that be? Mustn't a class implement the entire interface?
I believe that this (interface) method is implemented explicitly:
public class List<T> : IList
{
int IList.Add( object value ) {this.Add((T)value);}
}
By doing so, the Add(object) method will be hidden. You'll only be able to call it if you cast the List<T> instance to an IList.
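The hiding effect can be sketched with a much smaller interface than IList. Both IUntypedSink and IntSink are hypothetical types invented for this illustration; they are not how List<T> is actually written.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for IList, reduced to the one member of interest.
public interface IUntypedSink
{
    int Add(object value);
}

public class IntSink : IUntypedSink
{
    private readonly List<int> _items = new List<int>();

    public int Count => _items.Count;

    // The only Add visible on an IntSink reference.
    public void Add(int value) => _items.Add(value);

    // Explicit implementation: reachable only through an IUntypedSink reference.
    int IUntypedSink.Add(object value)
    {
        Add((int)value); // throws InvalidCastException if value is not an int
        return _items.Count - 1;
    }
}
```

With this in place, new IntSink().Add("text") does not compile, while ((IUntypedSink)sink).Add(2) does.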
A quick trip to reflector shows that IList.Add is implemented like this:
int IList.Add(object item)
{
ThrowHelper.IfNullAndNullsAreIllegalThenThrow<T>(item, ExceptionArgument.item);
try
{
this.Add((T) item);
}
catch (InvalidCastException)
{
ThrowHelper.ThrowWrongValueTypeArgumentException(item, typeof(T));
}
return (this.Count - 1);
}
In other words, the implementation casts it to T to make it work, and fails if you pass in a type that is not compatible with T.
List<T> explicitly implements IList.Add(object value) which is why it's not typically visible. You can test by doing the following:
IList list = new List<string>();
list.Add(new object()); // valid at compile time, will fail at runtime
It implements it explicitly, so you have to cast to IList first to use it.
List<int> l = new List<int>();
IList il = (IList)l;
il.Add(something);
You can call it by casting your list instance to the interface first:
List<int> lst = new List<int>();
((IList)lst).Add("banana");
And you'll get a nice runtime ArgumentException.
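The behaviour described in these answers can be checked with a few lines. The class and method names below are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class IListAddDemo
{
    // Adding a string to a List<int> through IList compiles but throws at runtime.
    public static bool AddWrongTypeThrows()
    {
        var lst = new List<int>();
        try
        {
            ((IList)lst).Add("banana");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Adding a compatible value through IList works and returns the new index.
    public static int AddRightType()
    {
        var lst = new List<int> { 7 };
        return ((IList)lst).Add(8);
    }
}
```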
Frederik is right that List<T>'s implementation of IList is explicit for certain members, particularly those that pose a threat to type safety.
The implementation he suggests in his answer can't be right, of course, since it wouldn't compile.
In cases like this, the typical approach is to make a valiant effort to try to get the interface member to work, but to give up if it's impossible.
Note that the IList.Add method is defined to return:
The position into which the new element was inserted, or -1 to indicate that the item was not inserted into the collection.
So in fact, a full implementation is possible:
int IList.Add(object value)
{
if (value is T)
{
Add((T)value);
return Count - 1;
}
return -1;
}
This is just a guess, of course. (If you really want to know for sure, you can always use Reflector.) It may be slightly different; for example it could throw a NotSupportedException, which is often done for incomplete interface implementations such as ReadOnlyCollection<T>'s implementation of IList<T>. But since the above meets the documented requirements of IList.Add, I suspect it's close to the real thing.

How to allow iteration over a private collection but not modification?

If I have the following class member:
private List<object> obs;
and I want to allow traversal of this list as part of the class' interface, how would I do it?
Making it public won't work because I don't want to allow the list to be modified directly.
You would expose it as an IEnumerable<T>, but not just returning it directly:
public IEnumerable<object> Objects { get { return obs.Select(o => o); } }
Since you indicated you only wanted traversal of the list, this is all you need.
One might be tempted to return the List<object> directly as an IEnumerable<T>, but that would be incorrect, because one could easily inspect the IEnumerable<T> at runtime, determine it is a List<T> and cast it to such and mutate the contents.
However, by using return obs.Select(o => o); you end up returning an iterator over the List<object>, not a direct reference to the List<object> itself.
Some might think that this qualifies as a "degenerate expression" according to section 7.15.2.5 of the C# Language Specification. However, Eric Lippert goes into detail as to why this projection isn't optimized away.
Also, people are suggesting that one use the AsEnumerable extension method. This is incorrect, as the reference identity of the original list is maintained. From the Remarks section of the documentation:
The AsEnumerable<TSource>(IEnumerable<TSource>) method has no effect other than to change the compile-time type of source from a type that implements IEnumerable<T> to IEnumerable<T> itself.
In other words, all it does is cast the source parameter to IEnumerable<T>, which doesn't help protect referential integrity: the original reference is returned and can be cast back to List<T> and used to mutate the list.
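The three options discussed in this answer can be compared side by side. The class Holder and its property names are hypothetical, chosen only for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Holder
{
    private readonly List<object> obs = new List<object> { "a", "b" };

    public int Count => obs.Count;

    // Returns the list itself, typed as IEnumerable<object>.
    public IEnumerable<object> Direct => obs;

    // Also returns the list itself: AsEnumerable only changes the static type.
    public IEnumerable<object> ViaAsEnumerable => obs.AsEnumerable();

    // Returns a LINQ iterator that reads from the list.
    public IEnumerable<object> ViaSelect => obs.Select(o => o);
}
```

A caller can cast Direct or ViaAsEnumerable back to List<object> and call Add on it, whereas the same cast on ViaSelect fails.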
You can use a ReadOnlyCollection or make a copy of the List and return it instead (considering the performance penalty of the copy operation). You can also use List<T>.AsReadOnly.
This has already been said, but I don't see any of the answers as being superclear.
The easiest way is to simply return a ReadOnlyCollection
private List<object> objs;
public ReadOnlyCollection<object> Objs {
get {
return objs.AsReadOnly();
}
}
The drawback with this is that if you want to change your implementation later on, some callers may already depend on the fact that the collection provides random access. So a safer definition would be to just expose an IEnumerable
public IEnumerable<object> Objs {
get {
return objs.AsReadOnly();
}
}
Note that you don't have to call AsReadOnly() to compile this code. But if you don't, the caller may just cast the return value back to a List and modify your list.
// Bad caller code
var objs = YourClass.Objs;
var list = objs as List<object>;
list.Add(new object()); // They have just modified your list.
The same potential problem also exists with this solution
public IEnumerable<object> Objs {
get {
return objs.AsEnumerable();
}
}
So I would definitely recommend that you call AsReadOnly() on your list, and return that value.
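A sketch of what the caller can and cannot do with the AsReadOnly() version. The names Registry, Register and RegistryDemo are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Registry
{
    private readonly List<object> objs = new List<object>();

    // Exposes a read-only wrapper around the private list.
    public IEnumerable<object> Objs => objs.AsReadOnly();

    public void Register(object o) => objs.Add(o);
}

public static class RegistryDemo
{
    // The wrapper is not a List<object>, so the "bad caller" cast yields null.
    public static bool CastToListFails(Registry r) => (r.Objs as List<object>) == null;

    // The wrapper does implement ICollection<object>, but its Add throws.
    public static bool AddThroughWrapperThrows(Registry r)
    {
        var collection = (ICollection<object>)r.Objs;
        try
        {
            collection.Add(new object());
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

The wrapper is a live view rather than a copy, so items added later through Register still show up when Objs is enumerated.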
To your interface add the following method signature:
IEnumerable<object> TraverseTheList();
Implemented like so:
public IEnumerable<object> TraverseTheList()
{
foreach (object item in obs)
{
yield return item;
}
}
that will allow you to do the following:
foreach(object item in Something.TraverseTheList())
{
// do something to the item
}
The yield return tells the compiler to build an enumerator for you.
You can do this in two ways:
Either by converting the list into a read-only collection:
new System.Collections.ObjectModel.ReadOnlyCollection<object>(this.obs)
Or by returning an IEnumerable of the items:
this.obs.AsEnumerable()
Expose a ReadOnlyCollection<T>
Interesting post and dialog on this very issue: http://davybrion.com/blog/2009/10/stop-exposing-collections-already/.
Have you considered deriving a class from System.Collections.ReadOnlyCollectionBase?
Just return an IReadOnlyCollection.
private List<object> obs;
IReadOnlyCollection<object> GetObjects()
{
return obs;
}
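One caveat with this last suggestion, in line with the earlier answers: returning the list itself under a narrower static type does not stop a caller from casting it back. A sketch comparing it with a wrapped version follows; Inventory and its method names are hypothetical.

```csharp
using System.Collections.Generic;

public class Inventory
{
    private readonly List<object> obs = new List<object>();

    public int Count => obs.Count;

    // Same instance as the private list, only typed more narrowly.
    public IReadOnlyCollection<object> GetObjects() => obs;

    // A ReadOnlyCollection<object> wrapper, which also implements IReadOnlyCollection<object>.
    public IReadOnlyCollection<object> GetObjectsWrapped() => obs.AsReadOnly();
}
```

Here (inventory.GetObjects() as List<object>) is non-null and can be used to add items, while the same cast on GetObjectsWrapped() gives null.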
