Algorithm for implementing C# yield statement - c#

I'd love to figure it out myself but I was wondering roughly what's the algorithm for converting a function with yield statements into a state machine for an enumerator? For example how does C# turn this:
IEnumerator<string> strings(IEnumerable<string> args)
{
    IEnumerator<string> enumerator2 = getAnotherEnumerator();
    foreach (var arg in args)
    {
        enumerator2.MoveNext();
        yield return arg + enumerator2.Current;
    }
}
into this:
bool MoveNext()
{
    switch (this.state)
    {
        case 0:
            this.state = -1;
            this.enumerator2 = getAnotherEnumerator();
            this.argsEnumerator = this.args.GetEnumerator();
            this.state = 1;
            while (this.argsEnumerator.MoveNext())
            {
                this.arg = this.argsEnumerator.Current;
                this.current = this.arg + this.enumerator2.Current;
                this.state = 2;
                return true;
            state1:
                this.state = 1;
            }
            this.state = -1;
            if (this.argsEnumerator != null) this.argsEnumerator.Dispose();
            break;
        case 2:
            goto state1;
    }
    return false;
}
Of course the result can be completely different depending on the original code.

The particular code sample you are looking at involves a series of transformations.
Please note that this is an approximate description of the algorithm. The actual names used by the compiler and the exact code it generates may be different. The idea is the same, however.
The first transformation is the "foreach" transformation, which transforms this code:
foreach (var x in y) { /* body */ }
into this code:
var enumerator = y.GetEnumerator();
try {
    while (enumerator.MoveNext()) {
        var x = enumerator.Current;
        /* body */
    }
} finally {
    if (enumerator != null) enumerator.Dispose();
}
The second transformation finds all the yield return statements in the function body, assigns a number to each (a state value), and creates a "goto label" right after the yield.
The third transformation lifts all the local variables and function arguments in the method body into an object called a closure.
Given the code in your example, that would look similar to this:
class ClosureEnumerable : IEnumerable<string>
{
    private IEnumerable<string> args;
    private ClassType originalThis;

    public ClosureEnumerable(ClassType origThis, IEnumerable<string> args)
    {
        this.args = args;
        this.originalThis = origThis;
    }

    public IEnumerator<string> GetEnumerator()
    {
        return new Closure(originalThis, args);
    }
}

class Closure : IEnumerator<string>
{
    public Closure(ClassType originalThis, IEnumerable<string> args)
    {
        state = 0;
        this.args = args;
        this.originalThis = originalThis;
    }

    private IEnumerable<string> args;
    private IEnumerator<string> enumerator2;
    private IEnumerator<string> argEnumerator;

    // Here ClassType is the type of the object that contained the method.
    // This may be optimized away if the method does not access any
    // class members.
    private ClassType originalThis;

    // This holds the state value.
    private int state;
    // The current value to return.
    private string currentValue;

    public string Current
    {
        get { return currentValue; }
    }
}
The method body is then moved from the original method to a method inside "Closure" called MoveNext, which returns a bool and implements IEnumerator.MoveNext.
Any access to a local is routed through "this", and any access to a class member is routed through this.originalThis.
Any "yield return expr" is translated into:
currentValue = expr;
state = //the state number of the yield statement;
return true;
Any yield break statement is translated into:
state = -1;
return false;
There is an "implicit" yield break statement at the end of the function.
A switch statement is then introduced at the beginning of the procedure that looks at the state number and jumps to the associated label.
The original method is then translated into something like this:
IEnumerator<string> strings(IEnumerable<string> args)
{
    return new ClosureEnumerable(this, args);
}
The fact that the state of the method is all pushed into an object and that the MoveNext method uses a switch statement / state variable is what allows the iterator to behave as if control is being passed back to the point immediately after the last "yield return" statement the next time "MoveNext" is called.
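To make the description above concrete, here is a minimal hand-written sketch of the same idea in legal C#. It is not what the compiler emits: the names StateMachineSketch, HandWritten, ByHand and Drain are invented for illustration, and the iterator is simplified to a single yield return inside a foreach. It only shows locals lifted into fields, a state field, and a switch at the top of MoveNext.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class StateMachineSketch
{
    // The iterator as you would normally write it.
    public static IEnumerator<string> WithYield(IEnumerable<string> args)
    {
        foreach (var arg in args)
        {
            yield return arg + "!";
        }
    }

    // A hand-written equivalent (illustrative only, not compiler output).
    public static IEnumerator<string> ByHand(IEnumerable<string> args)
    {
        return new HandWritten(args);
    }

    private sealed class HandWritten : IEnumerator<string>
    {
        private readonly IEnumerable<string> args;   // lifted parameter
        private IEnumerator<string> argsEnumerator;  // lifted foreach enumerator
        private int state;                           // 0 = not started, 1 = suspended at the yield, -1 = finished
        private string current;

        public HandWritten(IEnumerable<string> args) { this.args = args; }

        public string Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }

        public bool MoveNext()
        {
            switch (state)
            {
                case 0:
                    argsEnumerator = args.GetEnumerator();
                    state = 1;
                    goto case 1;
                case 1:
                    if (argsEnumerator.MoveNext())
                    {
                        current = argsEnumerator.Current + "!";
                        return true;   // "yield return": remain suspended in state 1
                    }
                    Dispose();         // implicit "yield break" at the end of the body
                    return false;
                default:
                    return false;      // already finished or disposed
            }
        }

        public void Dispose()
        {
            state = -1;
            if (argsEnumerator != null) { argsEnumerator.Dispose(); argsEnumerator = null; }
        }

        public void Reset() { throw new NotSupportedException(); }
    }

    // Helper that drains an enumerator into a list so both versions can be compared.
    public static List<string> Drain(IEnumerator<string> e)
    {
        var result = new List<string>();
        using (e) { while (e.MoveNext()) result.Add(e.Current); }
        return result;
    }
}
```

Draining ByHand(args) and WithYield(args) should give the same sequence for the same input, which is the whole point of the transformation.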
It is important to point out, however, that the transformation used by the C# compiler is not the best way to do this. It suffers from poor performance when trying to use "yield" with recursive algorithms. There is a good paper that outlines a better way to do this here:
It's worth a read if you haven't read it yet.

Just spotted this question - I wrote an article on it recently. I'll have to add the other links mentioned here to the article though...

Raymond Chen answers this here.


Is dispose called down to the bottom via yield return?

I give silly examples for simplicity.
static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
    foreach (var x in source) yield return x;
}
I know that this will be compiled into a state machine, but it's also similar to
static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
    using (var sillier = source.GetEnumerator())
        while (sillier.MoveNext()) yield return sillier.Current;
}
Now consider this usage
Here you can see that the Silly enumerable may not be fully consumed, but Take(2) itself will be fully consumed.
Question: when Dispose is called on the Take enumerator, will it also call Dispose on the Silly enumerator and, more specifically, on the sillier enumerator?
My guess is that the compiler can handle this simple use case because of foreach, but what about not-so-simple use cases?
static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
    using (var sillier = source.GetEnumerator())
    {
        // MoveNext can be called at different stages.
    }
}
Will this ever be a problem? Most enumerators don't use unmanaged resources, but if one does, this can cause leaks.
If Dispose is not called, how do I make a disposable enumerable?
An idea: there can be an if (disposed) yield break; after every yield return. The Dispose method of the Silly enumerator would then just have to set disposed = true and move the enumerator once to dispose of everything required.
The C# compiler takes care of a lot for you when it turns your iterator into the real code. For instance, here's the MoveNext that contains the implementation of your second example [1]:
private bool MoveNext()
{
    switch (this.<>1__state)
    {
        case 0:
            this.<>1__state = -1;
            this.<sillier>5__1 = this.source.GetEnumerator();
            this.<>1__state = -3;
            while (this.<sillier>5__1.MoveNext())
            {
                this.<>2__current = this.<sillier>5__1.Current;
                this.<>1__state = 1;
                return true;
            Label_005A:
                this.<>1__state = -3;
            }
            this.<>m__Finally1();
            this.<sillier>5__1 = null;
            return false;
        case 1:
            goto Label_005A;
    }
    return false;
}
So, you'll notice that the finally clause from your using isn't there at all, and it's a state machine [2] that relies on being in certain good (>= 0) states in order to make further progress forwards. (It's also illegal C#, but hey ho.)
Now let's look at its Dispose:
void IDisposable.Dispose()
{
    switch (this.<>1__state) {
        case -3:
        case 1:
            this.<>m__Finally1();
            break;
    }
}
So we can see that <>m__Finally1 is called here (as well as when the while loop in MoveNext exits).
And <>m__Finally1:
private void <>m__Finally1()
{
    this.<>1__state = -1;
    if (this.<sillier>5__1 != null)
        this.<sillier>5__1.Dispose();
}
So, we can see that sillier was disposed and we moved into a negative state which means that MoveNext doesn't have to do any special work to handle the "we've already been disposed state".
"An idea: there can be an if (disposed) yield break; after every yield return. The Dispose method of the Silly enumerator would then just have to set disposed = true and move the enumerator once to dispose of everything required."
That is completely unnecessary. Trust the compiler to transform the code so that it does all of the logical things it should: it runs its finally clause exactly once, either when the iterator logic is exhausted or when the enumerator is explicitly disposed.
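The disposal chain can also be observed directly rather than read from decompiled code. The following is a small sketch, assuming a hypothetical Resource class and Numbers iterator (neither is from the question); the Silly extension method mirrors the one being asked about. The log is only there to make the cleanup visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DisposeDemo
{
    // Records events so that the cleanup can be observed afterwards.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for an enumerator that owns something disposable.
    private sealed class Resource : IDisposable
    {
        public void Dispose() { Log.Add("resource disposed"); }
    }

    // Innermost iterator: holds the resource for as long as it is being enumerated.
    public static IEnumerable<int> Numbers()
    {
        using (new Resource())
        {
            for (int i = 1; i <= 5; i++)
            {
                Log.Add("yield " + i);
                yield return i;
            }
        }
    }

    // Same shape as the Silly method in the question.
    public static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
    {
        foreach (var x in source) yield return x;
    }

    // Consumes only two items, then lets Take dispose the chain underneath it.
    public static List<int> Run()
    {
        Log.Clear();
        return Numbers().Silly().Take(2).ToList();
    }
}
```

After Run(), the log should end with "resource disposed" even though Numbers was never enumerated to the end: disposing the Take enumerator disposes the Silly enumerator, whose generated Dispose runs the pending finally of its foreach, which in turn disposes the Numbers enumerator.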
[1] All code samples were produced by .NET Reflector. But it's too good at decompiling these constructs these days, so if you go and look at the Silly method itself:
[IteratorStateMachine(typeof(<Silly>d__1)), Extension]
private static IEnumerable<T> Silly<T>(this IEnumerable<T> source)
{
    IEnumerator<T> <sillier>5__1;
    using (<sillier>5__1 = source.GetEnumerator())
    {
        while (<sillier>5__1.MoveNext())
        {
            yield return <sillier>5__1.Current;
        }
    }
    <sillier>5__1 = null;
}
It's managed to hide most details about that state machine away again. You need to chase the type referenced by the IteratorStateMachine attribute to see all of the gritty bits shown above.
[2] Please also note that the compiler is under no obligation to produce a state machine to make iterators work. It's an implementation detail of the current C# compilers. The C# Specification places no restriction on how the compiler transforms the iterator, just on what the effects should be.

Why you cannot use an unsafe keyword in an iterator context?

I was looking at this question, which Jon did a fine job of answering: 'How to read a text file reversly with iterator'. There was also a similar question, '.net is there a way to read a text file from bottom to top', which I answered using pointer hocus pocus before it got closed.
So I set out to try to solve this using pointers. OK, it looks hackish and rough around the edges:
public class ReadChar : IEnumerable<char>
{
    private Stream _strm = null;
    private string _str = string.Empty;

    public ReadChar(string s) { this._str = s; }
    public ReadChar(Stream strm) { this._strm = strm; }

    public IEnumerator<char> GetEnumerator()
    {
        if (this._strm != null && this._strm.CanRead && this._strm.CanSeek)
            return ReverseReadStream();
        if (this._str.Length > 0)
            return ReverseRead();
        return null;
    }

    private IEnumerator<char> ReverseReadStream()
    {
        long lIndex = this._strm.Length;
        while (lIndex != 0 && this._strm.Position != 0)
        {
            this._strm.Seek(lIndex--, SeekOrigin.End);
            int nByte = this._strm.ReadByte();
            yield return (char)nByte;
        }
    }

    private IEnumerator<char> ReverseRead()
    {
        unsafe
        {
            fixed (char* beg = this._str)
            {
                char* p = beg + this._str.Length;
                while (p-- != beg)
                {
                    yield return *p;
                }
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
but I was devastated when the C# compiler refused to compile this implementation with error CS1629: 'Unsafe code may not appear in iterators'.
Why is that so?
Eric Lippert has an excellent blog post on this topic here: Iterator Blocks, Part Six: Why no unsafe code?
What I want to know is why you would use pointers for this at all. Why not simply say:
private IEnumerator<char> ReverseRead()
{
    int len = _str.Length;
    for (int i = 0; i < len; ++i)
    {
        yield return _str[len - i - 1];
    }
}
What's the compelling benefit of messing around with pointers?
It's just part of the C# spec:
26.1 Iterator blocks ... It is a compile-time error for an iterator block to contain an unsafe context (§27.1). An iterator block always defines a safe context, even when its declaration is nested in an unsafe context.
Presumably, the code that gets generated by the compiler needs to be verifiable in order to not have to be labelled 'unsafe'. If you want to use pointers, you'll have to implement IEnumerator yourself.
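As a rough idea of what "implement IEnumerator yourself" involves, here is a sketch of a hand-written reverse enumerator over a string. The names ReverseCharEnumerator and ReverseChars are invented for this example, and it deliberately uses an index instead of a pointer so that it compiles without the unsafe switch; a pointer-based version would also have to keep the string pinned for the enumerator's whole lifetime, which this sketch does not attempt.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;

public sealed class ReverseCharEnumerator : IEnumerator<char>
{
    private readonly string text;
    private int index;   // position of Current; text.Length means "before the first element"

    public ReverseCharEnumerator(string text)
    {
        this.text = text ?? throw new ArgumentNullException(nameof(text));
        index = text.Length;
    }

    public char Current { get { return text[index]; } }
    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext()
    {
        if (index <= 0) return false;   // nothing left in front of the current position
        index--;
        return true;
    }

    public void Reset() { index = text.Length; }

    public void Dispose() { /* nothing to release in the index-based version */ }
}

public static class ReverseChars
{
    // Convenience helper: drains the enumerator into a new string.
    public static string ReverseOf(string s)
    {
        var sb = new StringBuilder();
        using (var e = new ReverseCharEnumerator(s))
        {
            while (e.MoveNext()) sb.Append(e.Current);
        }
        return sb.ToString();
    }
}
```

The state that an iterator block would keep for you (here just index) becomes an explicit field, and MoveNext advances it by hand.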

How should I refactor a long chain of try-and-catch-wrapped speculative casting operations

I have some C# code that walks XML schemata using the Xml.Schema classes from the .NET framework. The various simple type restrictions are abstracted in the framework as a whole bunch of classes derived from Xml.Schema.XmlSchemaFacet. Unless there is something I've missed, the only way to know which of the derived facet types a given facet is, is to speculatively cast it to one of them, catching the resultant InvalidCastException in case of failure. Doing this leaves me with a really ugly function like this:
private void NavigateFacet(XmlSchemaFacet facet)
I assume there must be more elegant ways to do this; either using some property I've missed from the .NET framework, or with some clever piece of OO trickery. Can anyone enlighten me?
Because I prefer debugging data to debugging code, I'd do it like this, especially if the code had to handle all of the XmlSchemaFacet subclasses:
Dictionary<Type, Action<XmlSchemaFacet>> HandlerMap =
    new Dictionary<Type, Action<XmlSchemaFacet>>
    {
        {typeof(XmlSchemaLengthFacet), handler.Length},
        {typeof(XmlSchemaMinLengthFacet), handler.MinLength},
        {typeof(XmlSchemaMaxLengthFacet), handler.MaxLength}
    };

HandlerMap[facet.GetType()](facet);
This'll throw a KeyNotFoundException if facet isn't of a known type. Note that all of the handler methods will have to cast their argument from XmlSchemaFacet, so you're probably not saving on total lines of code, but you're definitely saving on the number of paths through your code.
There also comes a point where (assuming that the map is pre-built) mapping types to methods with a dictionary will be faster than traversing a linear list of types, which is essentially what using a bunch of if blocks gets you.
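A self-contained variation of the dictionary approach might look like the sketch below. FacetDispatch and Describe are hypothetical names, and the handlers return a string (rather than being Action delegates) purely so the result is easy to inspect. TryGetValue is used instead of the indexer to avoid the KeyNotFoundException for unknown types.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Schema;

public static class FacetDispatch
{
    // Maps the exact run-time type of a facet to the code that handles it.
    private static readonly Dictionary<Type, Func<XmlSchemaFacet, string>> Handlers =
        new Dictionary<Type, Func<XmlSchemaFacet, string>>
        {
            { typeof(XmlSchemaLengthFacet),    f => "length = " + f.Value },
            { typeof(XmlSchemaMinLengthFacet), f => "minLength = " + f.Value },
            { typeof(XmlSchemaMaxLengthFacet), f => "maxLength = " + f.Value },
        };

    public static string Describe(XmlSchemaFacet facet)
    {
        Func<XmlSchemaFacet, string> handler;
        if (Handlers.TryGetValue(facet.GetType(), out handler))
            return handler(facet);

        // Unknown facet types fall through here instead of throwing.
        return "unhandled facet: " + facet.GetType().Name;
    }
}
```

Note that the lookup is by exact type: a class derived from XmlSchemaLengthFacet would not match the XmlSchemaLengthFacet entry, unlike an is check.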
You can try using the as keyword. Some others have recommended using the is keyword instead. I found this to be a great explanation of why as is better.
Some sample code:
private void NavigateFacet(XmlSchemaFacet facet)
{
    XmlSchemaLengthFacet lengthFacet = facet as XmlSchemaLengthFacet;
    if (lengthFacet != null)
    {
        // ...
    }
    // Re-try with XmlSchemaMinLengthFacet, etc.
}

private void NavigateFacet(XmlSchemaFacet facet)
{
    if (facet is XmlSchemaLengthFacet) { /* ... */ }
    else if (facet is XmlSchemaMinLengthFacet) { /* ... */ }
    else if (facet is XmlSchemaMaxLengthFacet) { /* ... */ }
    // etc.
}
Update: I decided to benchmark the different methods discussed here (is vs. as). Here is the code that I used:
object c1 = new Class1();
int trials = 10000000;
Class1 tester;

// "is" followed by a cast
Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < trials; i++)
{
    if (c1 is Class1) { tester = (Class1)c1; }
}
MessageBox.Show(watch.ElapsedMilliseconds.ToString()); // ~104 ms

// "as" followed by a null check
watch = Stopwatch.StartNew();
for (int i = 0; i < trials; i++)
{
    tester = c1 as Class1;
    if (tester != null) { }
}
MessageBox.Show(watch.ElapsedMilliseconds.ToString()); // ~86 ms

// "is" without a cast
watch = Stopwatch.StartNew();
for (int i = 0; i < trials; i++)
{
    if (c1 is Class1) { }
}
MessageBox.Show(watch.ElapsedMilliseconds.ToString()); // ~74 ms

// empty loop (baseline)
watch = Stopwatch.StartNew();
for (int i = 0; i < trials; i++) { }
MessageBox.Show(watch.ElapsedMilliseconds.ToString()); // ~50 ms
As expected, using the as keyword and then checking for null is faster than using the is keyword and casting (36 ms vs. 54 ms, subtracting the cost of the loop itself).
However, using the is keyword and then not casting is faster still (24ms), which means that finding the correct type with a series of is checks and then only casting once when the correct type is identified could actually be faster (depending upon the number of different type checks that have to be done in this method before the correct type is identified).
The deeper point, however, is that the number of trials in this test is 10 million, which means it really doesn't make much difference which method you use. Using is and casting takes about 0.0000054 milliseconds per iteration, while using as and checking for null takes about 0.0000036 milliseconds per iteration (on my ancient notebook).
You could try using the as keyword - if the cast fails, you get a null instead of an exception.
private void NavigateFacet(XmlSchemaFacet facet)
{
    var length = facet as XmlSchemaLengthFacet;
    if (length != null) { handler.Length(length); return; }

    var minlength = facet as XmlSchemaMinLengthFacet;
    if (minlength != null) { handler.MinLength(minlength); return; }

    var maxlength = facet as XmlSchemaMaxLengthFacet;
    if (maxlength != null) { handler.MaxLength(maxlength); return; }
}
If you had control over the classes, I'd suggest using a variant of the Visitor pattern (aka double dispatch) to recover the type information more cleanly, but since you don't, this is one relatively simple approach.
Update: Using the variable to store the result of the as cast avoids the need to go through the type checking logic twice.
Update 2: When C# 4 becomes available, you'll be able to use dynamic to do the dispatching for you:
public class HandlerDemo
{
    public void Handle(XmlSchemaLengthFacet facet) { ... }
    public void Handle(XmlSchemaMinLengthFacet facet) { ... }
    public void Handle(XmlSchemaMaxLengthFacet facet) { ... }
}
private void NavigateFacet(XmlSchemaFacet facet)
{
    dynamic handler = new HandlerDemo();
    // The argument must be dynamic too, so that its run-time type picks the overload.
    handler.Handle((dynamic)facet);
}
This will work because method dispatch on dynamic objects uses the same rules as normal overload resolution, but evaluated at run-time instead of compile-time.
Under the hood, the Dynamic Language Runtime (DLR) will be doing much the same kind of trick as the code shown in this (and other answers), but with the addition of caching for performance.
This is a proper way to do this without any try-catches.
if (facet is XmlSchemaLengthFacet) { /* ... */ }
else if (facet is XmlSchemaMinLengthFacet) { /* ... */ }
else if (facet is XmlSchemaMaxLengthFacet) { /* ... */ }
else { /* Handle error */ }
Use "is" to determine whether an object is of a given type.
Use "as" for type conversion: it won't throw an exception on failure and returns null instead, and it avoids the second type check that "is" followed by a cast performs.
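For what it's worth, compilers that support C# 7 or later let you combine the type test and the conversion in one step with type patterns. The sketch below assumes such a compiler; FacetPatterns and Describe are names made up for the example, and returning a string stands in for whatever the real handlers would do.

```csharp
using System.Xml.Schema;

public static class FacetPatterns
{
    public static string Describe(XmlSchemaFacet facet)
    {
        // Each case tests the run-time type and, on success, binds a typed variable.
        switch (facet)
        {
            case XmlSchemaLengthFacet length:
                return "length = " + length.Value;
            case XmlSchemaMinLengthFacet min:
                return "minLength = " + min.Value;
            case XmlSchemaMaxLengthFacet max:
                return "maxLength = " + max.Value;
            case null:
                return "no facet";
            default:
                return "unhandled facet: " + facet.GetType().Name;
        }
    }
}
```

This keeps the readability of the is chain without a second cast, and no exception is involved for unknown types.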
You can do it like this:
private void NavigateFacet(XmlSchemaFacet facet)
{
    if (facet is XmlSchemaLengthFacet)
    {
        handler.Length(facet as XmlSchemaLengthFacet);
    }
    else if (facet is XmlSchemaMinLengthFacet)
    {
        handler.MinLength(facet as XmlSchemaMinLengthFacet);
    }
    else if (facet is XmlSchemaMaxLengthFacet)
    {
        handler.MaxLength(facet as XmlSchemaMaxLengthFacet);
    }
}

yield statement implementation

I want to know everything about the yield statement, in an easy-to-understand form.
I have read about the yield statement and how it eases implementing the iterator pattern. However, most of it is very dry. I would like to get under the covers and see how Microsoft handles yield return.
Also, when do you use yield break?
yield works by building a state machine internally. It stores the current state of the routine when it exits and resumes from that state next time.
You can use Reflector to see how it's implemented by the compiler.
yield break is used when you want to stop returning results. If you don't have a yield break, the compiler would assume one at the end of the function (just like a return; statement in a normal function)
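A tiny illustration of yield break, using a made-up method name (UntilNegative): the sequence stops at the first negative number, and anything after it is never produced.

```csharp
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Yields values from the source until the first negative one is seen.
    public static IEnumerable<int> UntilNegative(IEnumerable<int> source)
    {
        foreach (int value in source)
        {
            if (value < 0)
                yield break;   // end the sequence early, like "return;" in a void method

            yield return value;
        }
        // Falling off the end here is the implicit yield break.
    }
}
```

For the input 1, 2, -1, 3 this produces only 1 and 2.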
As Mehrdad says, it builds a state machine.
As well as using Reflector (another excellent suggestion) you might find my article on iterator block implementation useful. It would be relatively simple if it weren't for finally blocks - but they introduce a whole extra dimension of complexity!
Let's rewind a little bit: as many others said, the yield keyword is translated into a state machine.
This is not a built-in implementation used behind the scenes. Rather, the compiler rewrites the yield-related code into a state machine that implements one of the relevant interfaces (the return type of the method containing the yield keywords).
A (finite) state machine is just a piece of code that, depending on where you are in the code (that is, on the previous state and the input), moves to another state, and this is pretty much what happens when you use yield in a method whose return type is IEnumerator<T> or IEnumerator. Each yield creates another transition from the previous state to the next one, and this state management ends up in the MoveNext() implementation.
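The suspend-and-resume behaviour is easy to observe from the outside. In the sketch below (ResumeDemo, Steps and Run are illustrative names), nothing in the iterator body runs when the method is called; each MoveNext runs the body up to the next yield return and then hands control back to the caller.

```csharp
using System.Collections.Generic;

public static class ResumeDemo
{
    // Shared log so the interleaving of caller and iterator can be inspected.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerator<int> Steps()
    {
        Log.Add("started");
        yield return 1;
        Log.Add("resumed after 1");
        yield return 2;
        Log.Add("finished");
    }

    public static List<string> Run()
    {
        Log.Clear();
        IEnumerator<int> e = Steps();   // only creates the state machine object
        Log.Add("created");
        while (e.MoveNext())            // each call advances to the next state
            Log.Add("got " + e.Current);
        return new List<string>(Log);
    }
}
```

Run() records "created" before "started", and the "got ..." entries are interleaved with the iterator's own entries, which is the state machine moving from one state to the next.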
This is exactly what the C# compiler (Roslyn) does: it checks for the presence of a yield keyword and the return type of the containing method (IEnumerator<T>, IEnumerable<T>, IEnumerator or IEnumerable), and then creates a private class reflecting that method, with the necessary variables and states.
If you are interested in the details of the state machine and how iterators are rewritten by the compiler, you can check these links on GitHub:
IteratorRewriter source code
StateMachineRewriter: the parent class of above source code
Trivia 1: the AsyncRewriter (used when you write async/await code) also inherits from StateMachineRewriter, since it also relies on a state machine behind the scenes.
As mentioned, the state machine is most visible in the generated bool MoveNext() implementation, which contains a switch (and sometimes some old-fashioned goto) on a state field that represents the different paths of execution to the different states in your method.
The code that the compiler generates from the user code does not look that "good", mostly because the compiler adds some weird prefixes and suffixes here and there.
For example, the code:
public class TestClass
{
    private int _iAmAHere = 0;

    public IEnumerator<int> DoSomething()
    {
        var start = 1;
        var stop = 42;
        var breakCondition = 34;
        var exceptionCondition = 41;
        var multiplier = 2;
        // Rest of the code... with some yield keywords somewhere below...
    }
}
The variables and types related to that piece of code above will after compilation look like:
public class TestClass
{
    private sealed class <DoSomething>d__1 : IEnumerator<int>, IDisposable, IEnumerator
    {
        // Always present
        private int <>1__state;
        private int <>2__current;

        // Containing class
        public TestClass <>4__this;

        private int <start>5__1;
        private int <stop>5__2;
        private int <breakCondition>5__3;
        private int <exceptionCondition>5__4;
        private int <multiplier>5__5;
    }
}
Regarding the state machine itself, let's take a look at a very simple example with a dummy branching for yielding some even / odd stuff.
public class Example
{
    public IEnumerator<string> DoSomething()
    {
        const int start = 1;
        const int stop = 42;
        for (var index = start; index < stop; index++)
        {
            yield return index % 2 == 0 ? "even" : "odd";
        }
    }
}
Will be translated in the MoveNext as:
private bool MoveNext()
{
    switch (<>1__state)
    {
        default:
            return false;
        case 0:
            <>1__state = -1;
            <start>5__1 = 1;
            <stop>5__2 = 42;
            <index>5__3 = <start>5__1;
            break;
        case 1:
            <>1__state = -1;
            goto IL_0094;
        case 2:
            <>1__state = -1;
            goto IL_0094;
        IL_0094:
            <index>5__3++;
            break;
    }
    if (<index>5__3 < <stop>5__2)
    {
        if (<index>5__3 % 2 == 0)
        {
            <>2__current = "even";
            <>1__state = 1;
            return true;
        }
        <>2__current = "odd";
        <>1__state = 2;
        return true;
    }
    return false;
}
As you can see this implementation is far from being straightforward but it does the job!
Trivia 2: What happens with the IEnumerable / IEnumerable<T> method return type?
Well, instead of just generating a class implementing IEnumerator<T>, it will generate a class that implements both IEnumerable<T> and IEnumerator<T>, so that the implementation of IEnumerator<T> GetEnumerator() can reuse the same generated class.
A warm reminder of the few interfaces that are implemented automatically when you use the yield keyword:
public interface IEnumerable<out T> : IEnumerable
{
    new IEnumerator<T> GetEnumerator();
}

public interface IEnumerator<out T> : IDisposable, IEnumerator
{
    T Current { get; }
}

public interface IEnumerator
{
    bool MoveNext();
    object Current { get; }
    void Reset();
}
You can also check out this example with different paths / branching and the full implementation by the compiler rewriting.
This has been created with SharpLab, you can play with that tool to try different yield related execution paths and see how the compiler will rewrite them as a state machine in the MoveNext implementation.
About the second part of the question, ie, yield break, it has been answered here
It specifies that an iterator has come to an end. You can think of yield break as a return statement which does not return a value.

How do i exit a List<string>.ForEach loop when using an anonymous delegate?

In a normal loop you can break out of a loop using break. Can the same be done using an anonymous delegate?
inputString and result are both declared outside the delegate.
blackList.ForEach(new Action<string>(
    delegate(string item)
    {
        if (inputString.Contains(item)) {
            result = true;
            // I want to break here
        }
    }));
Thanks for the replies, I'm actually reading your book at the minute John :) Just for the record i hit this issue and switched back to a normal foreach loop but I posted this question to see if i missed something.
As others have posted, you can't exit the loop in ForEach.
Are you able to use LINQ? If so, you could easily combine TakeWhile and a custom ForEach extension method (which just about every project seems to have these days).
In your example, however, List<T>.FindIndex would be the best alternative - but if you're not actually doing that, please post an example of what you really want to do.
There is no loop that one has access to, from which to break. And each call to the (anonymous) delegate is a new function call so local variables will not help. But since C# gives you a closure, you can set a flag and then do nothing in further calls:
bool stop = false;
myList.ForEach((a) => {
    if (stop) {
        return;
    } else if (a.SomeCondition()) {
        stop = true;
    }
});
(This needs to be tested to check if correct reference semantics for closure is generated.)
A more advanced approach would be to create your own extension method that allowed the delegate to return false to stop the loop:
static class MyExtensions {
    public static void ForEachStoppable<T>(this IEnumerable<T> input, Func<T, bool> action) {
        foreach (T t in input) {
            if (!action(t)) break;
        }
    }
}
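Here is a complete sketch of that extension method together with a possible call site. StoppableExtensions, StoppableDemo and ContainsBlacklisted are invented names, and the visited counter exists only to show that the loop really stops early.

```csharp
using System;
using System.Collections.Generic;

public static class StoppableExtensions
{
    // Runs the delegate for each item until it returns false.
    public static void ForEachStoppable<T>(this IEnumerable<T> input, Func<T, bool> action)
    {
        foreach (T item in input)
        {
            if (!action(item))
                break;
        }
    }
}

public static class StoppableDemo
{
    public static bool ContainsBlacklisted(string inputString, List<string> blackList, out int visited)
    {
        bool result = false;
        int count = 0;   // captured by the lambda; out parameters cannot be captured directly

        blackList.ForEachStoppable(item =>
        {
            count++;
            if (inputString.Contains(item))
            {
                result = true;
                return false;   // the equivalent of "break"
            }
            return true;        // the equivalent of "continue"
        });

        visited = count;
        return result;
    }
}
```

With a black list of "jaime", "febres", "velez" and an input containing "febres", only the first two items are visited.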
Do you have LINQ available to you? Your logic seems similar to Any:
bool any = blackList.Any(s=>inputString.Contains(s));
which is the same as:
bool any = blackList.Any(inputString.Contains);
If you don't have LINQ, then this is still the same as:
bool any = blackList.Find(inputString.Contains) != null;
If you want to run additional logic, there are things you can do (with LINQ) with TakeWhile etc
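Any already behaves like a loop with a break: it stops enumerating as soon as the predicate returns true. The sketch below (AnyDemo and IsBlacklisted are hypothetical names) counts how many items the predicate actually sees.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyDemo
{
    public static bool IsBlacklisted(string inputString, IEnumerable<string> blackList, out int examined)
    {
        int count = 0;   // incremented inside the predicate to observe short-circuiting

        bool found = blackList.Any(item =>
        {
            count++;
            return inputString.Contains(item);
        });

        examined = count;
        return found;
    }
}
```

For the input "febres" and the black list "jaime", "febres", "velez", the predicate runs twice and "velez" is never examined.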
I don't think there's an elegant way to do it when using the ForEach method. A hacky solution is to throw an exception.
What's preventing you from doing an old fashioned foreach?
foreach (string item in blackList)
{
    if (!inputString.Contains(item)) continue;
    result = true;
    break;
}
If you want a loop, use a loop.
Action allows for no return value, so there's no way the ForEach function could possibly know that you want to break, short of throwing an exception. Using an exception here is overkill.
The only way to "exit" the loop is to throw an exception. There is no "break" style way of exiting the .ForEach method like you would a normal foreach loop.
The ForEach method is not meant to do this. If you want to know whether a collection contains an item, you should use the Contains method. And if you want to check a condition against the items in a collection, you should try the Any extension method.
bool @break = false;
blackList.ForEach(item =>
{
    if (!@break && inputString.Contains(item))
    {
        @break = true;
        result = true;
    }
    if (@break) return;
    /* ... */
});
Note that the above will still iterate through each item but return immediately. Of course, this way is probably not as good as a normal foreach.
class Program
{
    static void Main(string[] args)
    {
        List<string> blackList = new List<string>(new[] { "jaime", "jhon", "febres", "velez" });
        string inputString = "febres";
        bool result = false;
        blackList.ForEach((item) =>
            {
                if (inputString.Contains(item))
                    result = true;
            },
            () => result);
    }
}
public static class MyExtensions
{
    public static void ForEach<T>(this IEnumerable<T> enumerable, Action<T> action, Func<bool> breakOn)
    {
        foreach (var item in enumerable)
        {
            action(item);
            if (breakOn())
                break;
        }
    }
}
Would this work for you:
bool result = null != blackList.Find(item => inputString.Contains(item));
blackList.ForEach(new Action<string>(
delegate(string item)
result = true;
// I want to break here
If you really want to exit a ForEach loop on a list, you could use an exception, like this:
public class ExitMyForEachListException : Exception
{
    public ExitMyForEachListException(string message)
        : base(message) { }
}
class Program
{
    static void Main(string[] args)
    {
        List<string> str = new List<string>() { "Name1", "name2", "name3", "name4", "name5", "name6", "name7" };
        try
        {
            str.ForEach(z =>
            {
                if (z.EndsWith("6"))
                    throw new ExitMyForEachListException("I get Out because I found name number 6!");
            });
        }
        catch (ExitMyForEachListException ex)
        {
            // The loop was exited early; ex.Message says why.
        }
    }
}
I hope this helps by giving another point of view.
