In practice this seems simple but I'm getting really confused about it. Java's enumeration hasMoreElements() and nextElement() methods are related but work differently from C#'s IEnumerator MoveNext() and Current() properties of course. But how would I translate something like this?:
//class declaration, fields constructors, unrelated code etc.
private Vector atomlist = new Vector();
public int getNumberBasis() {
Enumeration basis = this.getBasisEnumeration();
int numberBasis = 0;
while (basis.hasMoreElements()) {
Object temp = basis.nextElement();
numberBasis++;
}
return numberBasis;
}
public Enumeration getBasisEnumeration() {
return new BasisEnumeration(this);
}
private class BasisEnumeration implements Enumeration {
Enumeration atoms;
Enumeration basis;
public BasisEnumeration(Molecule molecule) {
atoms = molecule.getAtomEnumeration();
basis = ((Atom) atoms.nextElement()).getBasisEnumeration();
}
public boolean hasMoreElements() {
return (atoms.hasMoreElements() || basis.hasMoreElements());
}
public Object nextElement() {
if (basis.hasMoreElements())
return basis.nextElement();
else {
basis = ((Atom) atoms.nextElement()).getBasisEnumeration();
return basis.nextElement();
}
}
}
As you can see, the enumration class's methods are overloaded and I don't think replacing hasMoreElements and nextElement with MoveNext and Current everywhere would work... because the basis.nextElement() calls hasMoreElements() again in an if-else statement. If I was to replace hasMoreElements with MoveNext(), the code would advance twice instead of one.
You can indeed implement IEnumerable yourself, but it generally needed only for exercises in internals of C#. You'd probably use either iterator method:
IEnumerable<Atom> GetAtoms()
{
foreach(Atom item in basis)
{
yield return item;
}
foreach(Atom item in atoms)
{
yield return item;
}
}
Or Enumerable.Concat
IEnumerable<Atom> GetAtoms()
{
return basis.Concat(atoms);
}
Related
How exactly is the right way to call IEnumerator.Reset?
The documentation says:
The Reset method is provided for COM interoperability. It does not necessarily need to be implemented; instead, the implementer can simply throw a NotSupportedException.
Okay, so does that mean I'm not supposed to ever call it?
It's so tempting to use exceptions for flow control:
using (enumerator = GetSomeExpensiveEnumerator())
{
while (enumerator.MoveNext()) { ... }
try { enumerator.Reset(); } //Try an inexpensive method
catch (NotSupportedException)
{ enumerator = GetSomeExpensiveEnumerator(); } //Fine, get another one
while (enumerator.MoveNext()) { ... }
}
Is that how we're supposed to use it? Or are we not meant to use it from managed code at all?
never; ultimately this was a mistake. The correct way to iterate a sequence more than once is to call .GetEnumerator() again - i.e. use foreach again. If your data is non-repeatable (or expensive to repeat), buffer it via .ToList() or similar.
It is a formal requirement in the language spec that iterator blocks throw exceptions for this method. As such, you cannot rely on it working. Ever.
I recommend not using it. A lot of modern IEnumerable implementations will just throw an exception.
Getting enumerators is hardly ever "expensive". It is enumerating them all (fully) that can be expensive.
public class PeopleEnum : IEnumerator
{
public Person[] _people;
// Enumerators are positioned before the first element
// until the first MoveNext() call.
int position = -1;
public PeopleEnum(Person[] list)
{
_people = list;
}
public bool MoveNext()
{
position++;
return (position < _people.Length);
}
public void Reset()
{
position = -1;
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public Person Current
{
get
{
try
{
return _people[position];
}
catch (IndexOutOfRangeException)
{
throw new InvalidOperationException();
}
}
}
}
I have an application that implements several objects in C# and then expects those objects to be usable from IronRuby by adding them to a Microsoft.Scripting.Hosting.ScriptScope. This works for the most part: I've learned how to implement the various members of DynamicObject to make things act correctly. But how do I make an object compatible with Ruby's for loops?
My C# object looks like this:
public class TrialObject : DynamicObject, IEnumerable<int> {
public override bool TryGetIndex(GetIndexBinder binder, object[] indexes, out object result) {
if (indexes.Length != 1) return base.TryGetIndex(binder, indexes, out result);
try {
var index = (int)indexes[0];
result = index;
return true;
} catch { return base.TryGetIndex(binder, indexes, out result); }
}
public int Length { get { return 3; } }
public IEnumerator<int> GetEnumerator() {
yield return 0;
yield return 1;
yield return 2;
}
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
Then I try to use the object from ruby, like so.
trial.Length # returns 3
trial[0] # returns 0
trial[2] # returns 2
for t in trial do <stuff>; end #error: no block given
How do I adjust the trial object to allow it to work in a for loop? I know I could use "i in 0..trial.Length-1" as a hack around, but I'd rather fix the problem than introduce a hack.
After some trial and error, I learned that I could get the desired effect by adding an "each" method to my class.
public class TrialObject : DynamicObject, IEnumerable<int> {
...
public IEnumerable<BuildableObject> each() { return this; }
...
}
It seems as if ironruby uses the "each" method to implement its for. So by adding an each method, I can get this dynamic object to act like the IEnumerable it's trying to be.
How exactly is the right way to call IEnumerator.Reset?
The documentation says:
The Reset method is provided for COM interoperability. It does not necessarily need to be implemented; instead, the implementer can simply throw a NotSupportedException.
Okay, so does that mean I'm not supposed to ever call it?
It's so tempting to use exceptions for flow control:
using (enumerator = GetSomeExpensiveEnumerator())
{
while (enumerator.MoveNext()) { ... }
try { enumerator.Reset(); } //Try an inexpensive method
catch (NotSupportedException)
{ enumerator = GetSomeExpensiveEnumerator(); } //Fine, get another one
while (enumerator.MoveNext()) { ... }
}
Is that how we're supposed to use it? Or are we not meant to use it from managed code at all?
never; ultimately this was a mistake. The correct way to iterate a sequence more than once is to call .GetEnumerator() again - i.e. use foreach again. If your data is non-repeatable (or expensive to repeat), buffer it via .ToList() or similar.
It is a formal requirement in the language spec that iterator blocks throw exceptions for this method. As such, you cannot rely on it working. Ever.
I recommend not using it. A lot of modern IEnumerable implementations will just throw an exception.
Getting enumerators is hardly ever "expensive". It is enumerating them all (fully) that can be expensive.
public class PeopleEnum : IEnumerator
{
public Person[] _people;
// Enumerators are positioned before the first element
// until the first MoveNext() call.
int position = -1;
public PeopleEnum(Person[] list)
{
_people = list;
}
public bool MoveNext()
{
position++;
return (position < _people.Length);
}
public void Reset()
{
position = -1;
}
object IEnumerator.Current
{
get
{
return Current;
}
}
public Person Current
{
get
{
try
{
return _people[position];
}
catch (IndexOutOfRangeException)
{
throw new InvalidOperationException();
}
}
}
}
Ok, as I was poking around with building a custom enumerator, I had noticed this behavior that concerns the yield
Say you have something like this:
public class EnumeratorExample
{
public static IEnumerable<int> GetSource(int startPoint)
{
int[] values = new int[]{1,2,3,4,5,6,7};
Contract.Invariant(startPoint < values.Length);
bool keepSearching = true;
int index = startPoint;
while(keepSearching)
{
yield return values[index];
//The mind reels here
index ++
keepSearching = index < values.Length;
}
}
}
What makes it possible underneath the compiler's hood to execute the index ++ and the rest of the code in the while loop after you technically do a return from the function?
The compiler rewrites the code into a state machine. The single method you wrote is split up into different parts. Each time you call MoveNext (either implicity or explicitly) the state is advanced and the correct block of code is executed.
Suggested reading if you want to know more details:
The implementation of iterators in C# and its consequences - Raymond Chen
Part 1
Part 2
Part 3
Iterator block implementation details: auto-generated state machines - Jon Skeet
Eric Lippert's blog
The compiler generates a state-machine on your behalf.
From the language specification:
10.14 Iterators
10.14.4 Enumerator objects
When a function member returning an
enumerator interface type is
implemented using an iterator block,
invoking the function member does not
immediately execute the code in the
iterator block. Instead, an enumerator
object is created and returned. This
object encapsulates the code specified
in the iterator block, and execution
of the code in the iterator block
occurs when the enumerator object’s
MoveNext method is invoked. An
enumerator object has the following
characteristics:
• It implements
IEnumerator and IEnumerator, where
T is the yield type of the iterator.
• It implements System.IDisposable.
• It is initialized with a copy of the
argument values (if any) and instance
value passed to the function member.
• It has four potential states,
before, running, suspended, and after,
and is initially in the before state.
An enumerator object is typically an
instance of a compiler-generated
enumerator class that encapsulates the
code in the iterator block and
implements the enumerator interfaces,
but other methods of implementation
are possible. If an enumerator class
is generated by the compiler, that
class will be nested, directly or
indirectly, in the class containing
the function member, it will have
private accessibility, and it will
have a name reserved for compiler use
(§2.4.2).
To get an idea of this, here's how Reflector decompiles your class:
public class EnumeratorExample
{
// Methods
public static IEnumerable<int> GetSource(int startPoint)
{
return new <GetSource>d__0(-2) { <>3__startPoint = startPoint };
}
// Nested Types
[CompilerGenerated]
private sealed class <GetSource>d__0 : IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
{
// Fields
private int <>1__state;
private int <>2__current;
public int <>3__startPoint;
private int <>l__initialThreadId;
public int <index>5__3;
public bool <keepSearching>5__2;
public int[] <values>5__1;
public int startPoint;
// Methods
[DebuggerHidden]
public <GetSource>d__0(int <>1__state)
{
this.<>1__state = <>1__state;
this.<>l__initialThreadId = Thread.CurrentThread.ManagedThreadId;
}
private bool MoveNext()
{
switch (this.<>1__state)
{
case 0:
this.<>1__state = -1;
this.<values>5__1 = new int[] { 1, 2, 3, 4, 5, 6, 7 };
this.<keepSearching>5__2 = true;
this.<index>5__3 = this.startPoint;
while (this.<keepSearching>5__2)
{
this.<>2__current = this.<values>5__1[this.<index>5__3];
this.<>1__state = 1;
return true;
Label_0073:
this.<>1__state = -1;
this.<index>5__3++;
this.<keepSearching>5__2 = this.<index>5__3 < this.<values>5__1.Length;
}
break;
case 1:
goto Label_0073;
}
return false;
}
[DebuggerHidden]
IEnumerator<int> IEnumerable<int>.GetEnumerator()
{
EnumeratorExample.<GetSource>d__0 d__;
if ((Thread.CurrentThread.ManagedThreadId == this.<>l__initialThreadId) && (this.<>1__state == -2))
{
this.<>1__state = 0;
d__ = this;
}
else
{
d__ = new EnumeratorExample.<GetSource>d__0(0);
}
d__.startPoint = this.<>3__startPoint;
return d__;
}
[DebuggerHidden]
IEnumerator IEnumerable.GetEnumerator()
{
return this.System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator();
}
[DebuggerHidden]
void IEnumerator.Reset()
{
throw new NotSupportedException();
}
void IDisposable.Dispose()
{
}
// Properties
int IEnumerator<int>.Current
{
[DebuggerHidden]
get
{
return this.<>2__current;
}
}
object IEnumerator.Current
{
[DebuggerHidden]
get
{
return this.<>2__current;
}
}
}
}
Yield is magic.
Well, not really. The compiler generates a full class to generate the enumeration that you're doing. It's basically sugar to make your life simpler.
Read this for an intro.
EDIT: Wrong this. Link changed, check again if you have once.
That's one of the most complex parts of the C# compiler. Best read the free sample chapter of Jon Skeet's C# in Depth (or better, get the book and read it :-)
Implementing iterators the easy way
For further explanations see Marc Gravell's answer here:
Can someone demystify the yield keyword?
Here is an excellent blog series (from Microsoft veteran Raymond Chen) that details how yield works:
https://devblogs.microsoft.com/oldnewthing/20080812-00/?p=21273
https://devblogs.microsoft.com/oldnewthing/20080813-00/?p=21253
https://devblogs.microsoft.com/oldnewthing/20080814-00/?p=21243
https://devblogs.microsoft.com/oldnewthing/20080815-00/?p=21223
What is the concrete type for this IEnumerable<string>?
private IEnumerable<string> GetIEnumerable()
{
yield return "a";
yield return "a";
yield return "a";
}
It's a compiler-generated type. The compiler generates an IEnumerator<string> implementation that returns three "a" values and an IEnumerable<string> skeleton class that provides one of these in its GetEnumerator method.
The generated code looks something like this*:
// No idea what the naming convention for the generated class is --
// this is really just a shot in the dark.
class GetIEnumerable_Enumerator : IEnumerator<string>
{
int _state;
string _current;
public bool MoveNext()
{
switch (_state++)
{
case 0:
_current = "a";
break;
case 1:
_current = "a";
break;
case 2:
_current = "a";
break;
default:
return false;
}
return true;
}
public string Current
{
get { return _current; }
}
object IEnumerator.Current
{
get { return Current; }
}
void IEnumerator.Reset()
{
// not sure about this one -- never really tried it...
// I'll just guess
_state = 0;
_current = null;
}
}
class GetIEnumerable_Enumerable : IEnumerable<string>
{
public IEnumerator<string> GetEnumerator()
{
return new GetIEnumerable_Enumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Or maybe, as SLaks says in his answer, the two implementations end up in the same class. I wrote this based on my choppy memory of generated code I'd looked at before; really, one class would suffice, as there's no reason the above functionality requires two.
In fact, come to think of it, the two implementations really should fall within a single class, as I just remembered the functions that use yield statements must have a return type of either IEnumerable<T> or IEnumerator<T>.
Anyway, I'll let you perform the code corrections to what I posted mentally.
*This is purely for illustration purposes; I make no claim as to its real accuracy. It's only to demonstrate in a general way how the compiler does what it does, based on the evidence I've seen in my own investigations.
The compiler will automatically generate a class that implements both IEnumerable<T> and IEnumerator<T> (in the same class).
Jon Skeet has a detailed explanation.
The concrete implementation of IEnumerable<string> returned by the method is an anonymous type generated by the compiler
Console.WriteLine(GetIEnumerable().GetType());
Prints :
YourClass+<GetIEnumerable>d__0