Enumerating over lambdas does not bind the scope correctly?

Enumerating over lambdas does not bind the scope correctly? - c#

consider the following C# program:
using System;
using System.Linq;
using System.Collections.Generic;
public class Test
{
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
int capture = i;
yield return () => Console.WriteLine(capture.ToString());
}
}
public static void Main(string[] args)
{
foreach (var a in Get()) a();
foreach (var a in Get().ToList()) a();
}
}
When executed under Mono compiler (e.g. Mono 2.10.2.0 - paste into here), it writes the following output:
0
1
1
1
This seems totally unlogical to me. When directly iterating the yield function, the scope of the for-loop is "correctly" (to my understanding) used. But when I store the result in a list first, the scope is always the last action?!
Can I assume that this is a bug in the Mono compiler, or did I hit a mysterious corner case of C#'s lambda and yield-stuff?
BTW: When using Visual Studio compiler (and either MS.NET or mono to execute), the result is the expected 0 1 0 1

I'll give you the reason why it was 0 1 1 1:
foreach (var a in Get()) a();
Here you go into Get and it starts iterating:
i = 0 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 0 to the screen, then returns to the Get() method and continues.
i = 1 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 1 to the screen, then returns to the Get() method and continues (only to find that it has to stop).
But now, you're not iterating over each item when it happens, you're building a list and then iterating over that list.
foreach (var a in Get().ToList()) a();
What you are doing isn't like above, Get().ToList() returns a List or Array (not sure wich one). So now this happens:
i = 0 => return Console.WriteLine(i);
And in you Main() function, you get the following in memory:
var i = 0;
var list = new List
{
Console.WriteLine(i)
}
You go back into the Get() function:
i = 1 => return Console.WriteLine(i);
Which returns to your Main()
var i = 1;
var list = new List
{
Console.WriteLine(i),
Console.WriteLine(i)
}
And then does
foreach (var a in list) a();
Which will print out 1 1
It seems like it was ignoring that you made sure you encapsulated the value before returning the function.

#Armaron - The .ToList() extension returns List of type T as ToArray() returns T[] as the naming convention implies, but I think you are on the right track with your response.
This sounds like an issuse with the compiler. I agree with Servy that it is probably a bug, however, have you tried the following?
public class Test
{
private static int capture = 0;
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
capture++;
yield return () => Console.WriteLine(capture.ToString());
}
}
}
Additionally you may want to try the static approach, perhaps this will perform a more accurate conversion as your function is static.
List<T> list = Enumerable.ToList(Get());
When calling ToList() it seems as though it is not performing a single iteration for each value but rather:
return new List<T>(Get());
The second for each in your code does not make sense to me in implementation as to why it would ever be necessary or beneficial unless you require additional actions to be added/removed to the List object. The first makes perfect sense since all you are doing is iterating through the object and performing the associated action. My understanding is that an integer within the scope of the static IEnumerbale object is being calculated during conversion by performing the entire iteration and the action is preserving the int as a static int due to scope. Also, keep in mind that IEnumerable is merely an interface that is implemented by List which implements IList, and may contain logic for the conversion built in.
That being said I am interested to see/hear your findings as this is an interesting post. I will definitely upvote the question. Please ask questions if anything I said needs clarification or if something is false say so, although I am confident in my usage of the yield keyword of IEnumerable but this is a unique issue.

Related

lifetime of local variable inside an Action in c# [duplicate]

What is a closure? Do we have them in .NET?
If they do exist in .NET, could you please provide a code snippet (preferably in C#) explaining it?

I have an article on this very topic. (It has lots of examples.)
In essence, a closure is a block of code which can be executed at a later time, but which maintains the environment in which it was first created - i.e. it can still use the local variables etc of the method which created it, even after that method has finished executing.
The general feature of closures is implemented in C# by anonymous methods and lambda expressions.
Here's an example using an anonymous method:
using System;
class Test
{
static void Main()
{
Action action = CreateAction();
action();
action();
}
static Action CreateAction()
{
int counter = 0;
return delegate
{
// Yes, it could be done in one statement;
// but it is clearer like this.
counter++;
Console.WriteLine("counter={0}", counter);
};
}
}
Output:
counter=1
counter=2
Here we can see that the action returned by CreateAction still has access to the counter variable, and can indeed increment it, even though CreateAction itself has finished.

If you are interested in seeing how C# implements Closure read "I know the answer (its 42) blog"
The compiler generates a class in the background to encapsulate the anoymous method and the variable j
[CompilerGenerated]
private sealed class <>c__DisplayClass2
{
public <>c__DisplayClass2();
public void <fillFunc>b__0()
{
Console.Write("{0} ", this.j);
}
public int j;
}
for the function:
static void fillFunc(int count) {
for (int i = 0; i < count; i++)
{
int j = i;
funcArr[i] = delegate()
{
Console.Write("{0} ", j);
};
}
}
Turning it into:
private static void fillFunc(int count)
{
for (int i = 0; i < count; i++)
{
Program.<>c__DisplayClass1 class1 = new Program.<>c__DisplayClass1();
class1.j = i;
Program.funcArr[i] = new Func(class1.<fillFunc>b__0);
}
}

Closures are functional values that hold onto variable values from their original scope. C# can use them in the form of anonymous delegates.
For a very simple example, take this C# code:
delegate int testDel();
static void Main(string[] args)
{
int foo = 4;
testDel myClosure = delegate()
{
return foo;
};
int bar = myClosure();
}
At the end of it, bar will be set to 4, and the myClosure delegate can be passed around to be used elsewhere in the program.
Closures can be used for a lot of useful things, like delayed execution or to simplify interfaces - LINQ is mainly built using closures. The most immediate way it comes in handy for most developers is adding event handlers to dynamically created controls - you can use closures to add behavior when the control is instantiated, rather than storing data elsewhere.

Func<int, int> GetMultiplier(int a)
{
return delegate(int b) { return a * b; } ;
}
//...
var fn2 = GetMultiplier(2);
var fn3 = GetMultiplier(3);
Console.WriteLine(fn2(2)); //outputs 4
Console.WriteLine(fn2(3)); //outputs 6
Console.WriteLine(fn3(2)); //outputs 6
Console.WriteLine(fn3(3)); //outputs 9
A closure is an anonymous function passed outside of the function in which it is created.
It maintains any variables from the function in which it is created that it uses.

A closure is when a function is defined inside another function (or method) and it uses the variables from the parent method. This use of variables which are located in a method and wrapped in a function defined within it, is called a closure.
Mark Seemann has some interesting examples of closures in his blog post where he does a parallel between oop and functional programming.
And to make it more detailed
var workingDirectory = new DirectoryInfo(Environment.CurrentDirectory);//when this variable
Func<int, string> read = id =>
{
var path = Path.Combine(workingDirectory.FullName, id + ".txt");//is used inside this function
return File.ReadAllText(path);
};//the entire process is called a closure.

Here is a contrived example for C# which I created from similar code in JavaScript:
public delegate T Iterator<T>() where T : class;
public Iterator<T> CreateIterator<T>(IList<T> x) where T : class
{
var i = 0;
return delegate { return (i < x.Count) ? x[i++] : null; };
}
So, here is some code that shows how to use the above code...
var iterator = CreateIterator(new string[3] { "Foo", "Bar", "Baz"});
// So, although CreateIterator() has been called and returned, the variable
// "i" within CreateIterator() will live on because of a closure created
// within that method, so that every time the anonymous delegate returned
// from it is called (by calling iterator()) it's value will increment.
string currentString;
currentString = iterator(); // currentString is now "Foo"
currentString = iterator(); // currentString is now "Bar"
currentString = iterator(); // currentString is now "Baz"
currentString = iterator(); // currentString is now null
Hope that is somewhat helpful.

Closures are chunks of code that reference a variable outside themselves, (from below them on the stack), that might be called or executed later, (like when an event or delegate is defined, and could get called at some indefinite future point in time)... Because the outside variable that the chunk of code references may gone out of scope (and would otherwise have been lost), the fact that it is referenced by the chunk of code (called a closure) tells the runtime to "hold" that variable in scope until it is no longer needed by the closure chunk of code...

Basically closure is a block of code that you can pass as an argument to a function. C# supports closures in form of anonymous delegates.
Here is a simple example:
List.Find method can accept and execute piece of code (closure) to find list's item.
// Passing a block of code as a function argument
List<int> ints = new List<int> {1, 2, 3};
ints.Find(delegate(int value) { return value == 1; });
Using C#3.0 syntax we can write this as:
ints.Find(value => value == 1);

If you write an inline anonymous method (C#2) or (preferably) a Lambda expression (C#3+), an actual method is still being created. If that code is using an outer-scope local variable - you still need to pass that variable to the method somehow.
e.g. take this Linq Where clause (which is a simple extension method which passes a lambda expression):
var i = 0;
var items = new List<string>
{
"Hello","World"
};
var filtered = items.Where(x =>
// this is a predicate, i.e. a Func<T, bool> written as a lambda expression
// which is still a method actually being created for you in compile time
{
i++;
return true;
});
if you want to use i in that lambda expression, you have to pass it to that created method.
So the first question that arises is: should it be passed by value or reference?
Pass by reference is (I guess) more preferable as you get read/write access to that variable (and this is what C# does; I guess the team in Microsoft weighed the pros and cons and went with by-reference; According to Jon Skeet's article, Java went with by-value).
But then another question arises: Where to allocate that i?
Should it actually/naturally be allocated on the stack?
Well, if you allocate it on the stack and pass it by reference, there can be situations where it outlives it's own stack frame. Take this example:
static void Main(string[] args)
{
Outlive();
var list = whereItems.ToList();
Console.ReadLine();
}
static IEnumerable<string> whereItems;
static void Outlive()
{
var i = 0;
var items = new List<string>
{
"Hello","World"
};
whereItems = items.Where(x =>
{
i++;
Console.WriteLine(i);
return true;
});
}
The lambda expression (in the Where clause) again creates a method which refers to an i. If i is allocated on the stack of Outlive, then by the time you enumerate the whereItems, the i used in the generated method will point to the i of Outlive, i.e. to a place in the stack that is no longer accessible.
Ok, so we need it on the heap then.
So what the C# compiler does to support this inline anonymous/lambda, is use what is called "Closures": It creates a class on the Heap called (rather poorly) DisplayClass which has a field containing the i, and the Function that actually uses it.
Something that would be equivalent to this (you can see the IL generated using ILSpy or ILDASM):
class <>c_DisplayClass1
{
public int i;
public bool <GetFunc>b__0()
{
this.i++;
Console.WriteLine(i);
return true;
}
}
It instantiates that class in your local scope, and replaces any code relating to i or the lambda expression with that closure instance. So - anytime you are using the i in your "local scope" code where i was defined, you are actually using that DisplayClass instance field.
So if I would change the "local" i in the main method, it will actually change _DisplayClass.i ;
i.e.
var i = 0;
var items = new List<string>
{
"Hello","World"
};
var filtered = items.Where(x =>
{
i++;
return true;
});
filtered.ToList(); // will enumerate filtered, i = 2
i = 10; // i will be overwriten with 10
filtered.ToList(); // will enumerate filtered again, i = 12
Console.WriteLine(i); // should print out 12
it will print out 12, as "i = 10" goes to that dispalyclass field and changes it just before the 2nd enumeration.
A good source on the topic is this Bart De Smet Pluralsight module (requires registration) (also ignore his erroneous use of the term "Hoisting" - what (I think) he means is that the local variable (i.e. i) is changed to refer to the the new DisplayClass field).
In other news, there seems to be some misconception that "Closures" are related to loops - as I understand "Closures" are NOT a concept related to loops, but rather to anonymous methods / lambda expressions use of local scoped variables - although some trick questions use loops to demonstrate it.

A closure aims to simplify functional thinking, and it allows the runtime to manage
state, releasing extra complexity for the developer. A closure is a first-class function
with free variables that are bound in the lexical environment. Behind these buzzwords
hides a simple concept: closures are a more convenient way to give functions access
to local state and to pass data into background operations. They are special functions
that carry an implicit binding to all the nonlocal variables (also called free variables or
up-values) referenced. Moreover, a closure allows a function to access one or more nonlocal variables even when invoked outside its immediate lexical scope, and the body
of this special function can transport these free variables as a single entity, defined in
its enclosing scope. More importantly, a closure encapsulates behavior and passes it
around like any other object, granting access to the context in which the closure was
created, reading, and updating these values.

Just out of the blue,a simple and more understanding answer from the book C# 7.0 nutshell.
Pre-requisit you should know :A lambda expression can reference the local variables and parameters of the method
in which it’s defined (outer variables).
static void Main()
{
int factor = 2;
//Here factor is the variable that takes part in lambda expression.
Func<int, int> multiplier = n => n * factor;
Console.WriteLine (multiplier (3)); // 6
}
Real part:Outer variables referenced by a lambda expression are called captured variables. A lambda expression that captures variables is called a closure.
Last Point to be noted:Captured variables are evaluated when the delegate is actually invoked, not when the variables were captured:
int factor = 2;
Func<int, int> multiplier = n => n * factor;
factor = 10;
Console.WriteLine (multiplier (3)); // 30

A closure is a function, defined within a function, that can access the local variables of it as well as its parent.
public string GetByName(string name)
{
List<things> theThings = new List<things>();
return theThings.Find<things>(t => t.Name == name)[0];
}
so the function inside the find method.
t => t.Name == name
can access the variables inside its scope, t, and the variable name which is in its parents scope. Even though it is executed by the find method as a delegate, from another scope all together.

Why prefer Yield over a simple return?

I am trying to understand use of Yield to enumerate the collection. I have written this basic code:
static void Main(string[] args)
{
Iterate iterate = new Iterate();
foreach (int i in iterate.EnumerateList())
{
Console.Write("{0}", i);
}
Console.ReadLine();
}
class Iterate
{
public IEnumerable<int> EnumerateList()
{
List<int> lstNumbers = new List<int>();
lstNumbers.Add(1);
lstNumbers.Add(2);
lstNumbers.Add(3);
lstNumbers.Add(4);
lstNumbers.Add(5);
foreach (int i in lstNumbers)
{
yield return i;
}
}
}
(1) What if I use simply return i instead of yield return i?
(2) What are the advantages of using Yield and when to prefer using it?
Edited **
In the above code, I think it is an overhead to use foreach two times. First in the main function and the second in the EnumerateList method.

Using yield return i makes this method an iterator. It will create an IEnumerable<int> sequence of values from your entire loop.
If you used return i, it would just return a single int value. In your case, this would cause a compiler error, as the return type of your method is IEnumerable<int>, not int.
In this specific example, I would personally just return lstNumbers instead of using the iterator. You could rewrite this without the list, though, as:
public IEnumerable<int> EnumerateList()
{
yield return 1;
yield return 2;
yield return 3;
yield return 4;
yield return 5;
}
Or even:
public IEnumerable<int> EnumerateList()
{
for (int i=1;i<=5;++i)
yield return i;
}
This is very handy when you're making a class which you want to act like a collection. Implementing IEnumerable<T> by hand for a custom type often requires making a custom class, etc. Prior to iterators, this required a lot of code, as shown in this old sample for implementing a custom collection in C#.

As Reed says using yield allows you to implement an iterator. The major advantage of an iterator is that it allows lazy evaluation. I.e. it doesn't have to materialize the entire result unless needed.
Consider Directory.GetFiles from the BCL. It returns string[]. I.e. it has to get all the files and put the names in an array before it returns. In contrast Directory.EnumerateFiles returns IEnumerable<string>. I.e. the caller is responsible for handling the result set. That means that the caller can opt out of enumerating the collection at any point.

The yield keyword will make the compiler turn the function into an enumerator object.
The yield return statement doesn't simply exit the function and return one value, it leaves the code in a state so that it can be resumed at that point when the next value is requested from the enumerator.

You can't return just i as i is an int and not an IEnumerable. It won't even compile. You could have returned lstNumbers as that implements the interface of IEnumerable.
I prefer yield return since the compiler will handle building the enumerable and not having to build the list in the first place. So for me if I have something that is already implementing the interface then I return that if I have to build it and not use it else where then I yield return.

Linq FirstOrDefault evaluates predicate each iteration?

If I had a statement such as:
var item = Core.Collections.Items.FirstOrDefault(itm => itm.UserID == bytereader.readInt());
Does this code read an integer from my stream each iteration, or does it read the integer once, store it, then use its value throughout the lookup?

Consider this code:
static void Main(string[] args)
{
new[] { 1, 2, 3, 4 }.FirstOrDefault(j => j == Get());
Console.ReadLine();
}
static int i = 5;
static int Get()
{
Console.WriteLine("GET:" + i);
return i--;
}
It shows, that it will call the method the number of times it needs to meet the first element matching the condition. The output will be:
GET:5
GET:4
GET:3

I don't know without checking but would expect it to read it each time.
But this is very easily remedied with the following version of your code.
byte val = bytereader.readInt();
var item = Core.Collections.Items.FirstOrDefault(itm => itm.UserID == val);
Myself, I would automatically take this approach anyway just to remove any doubt. Might be a good habit to form as there is no reason to read it for each item.

It's actually quite obvious that the call is performed for each item - FirstOrDefault() takes an delegate as argument. This fact is a bit obscured by using a lambda method but in the end the method only sees a delegate that it can call for each item to check the predicate. In order to evaluate the right hand side only once some magic mechanism would have to understand and rewrite the method and (sometimes sadly) there is no real magic inside compilers and runtimes.

Need help understanding C# yield in IEnumerable

i am reading C# 2010 Accelerated. i dont get what is yield
When GetEnumerator is called, the code
in the method that contains the yield
statement is not actually executed at
that point in time. Instead, the
compiler generates an enumerator
class, and that class contains the
yield block code
public IEnumerator<T> GetEnumerator() {
foreach( T item in items ) {
yield return item;
}
}
i also read from Some help understanding “yield”
yield is a lazy producer of data, only
producing another item after the first
has been retrieved, whereas returning
a list will return everything in one
go.
does this mean that each call to GetEnumerator will get 1 item from the collection? so 1st call i get 1st item, 2nd, i get the 2nd and so on ... ?

Best way to think of it is when you first request an item from an IEnumerator (for example in a foreach), it starts running trough the method, and when it hits a yield return it pauses execution and returns that item for you to use in your foreach. Then you request the next item, it resumes the code where it left and repeats the cycle until it encounters either yield break or the end of the method.
public IEnumerator<string> enumerateSomeStrings()
{
yield return "one";
yield return "two";
var array = new[] { "three", "four" }
foreach (var item in array)
yield return item;
yield return "five";
}

Take a look at the IEnumerator<T> interface; that may well to clarify what's happening. The compiler takes your code and turns it into a class that implements both IEnumerable<T> and IEnumerator<T>. The call to GetEnumerator() simply returns the class itself.
The implementation is basically a state machine, which, for each call to MoveNext(), executes the code up until the next yield return and then sets Current to the return value. The foreach loop uses this enumerator to walk through the enumerated items, calling MoveNext() before each iteration of the loop. The compiler is really doing some very cool things here, making yield return one of the most powerful constructs in the language. From the programmer's perspective, it's just an easy way to lazily return items upon request.

Yes thats right, heres the example from MSDN that illustrates how to use it
public class List
{
//using System.Collections;
public static IEnumerable Power(int number, int exponent)
{
int counter = 0;
int result = 1;
while (counter++ < exponent)
{
result = result * number;
yield return result;
}
}
static void Main()
{
// Display powers of 2 up to the exponent 8:
foreach (int i in Power(2, 8))
{
Console.Write("{0} ", i);
}
}
}
/*
Output:
2 4 8 16 32 64 128 256
*/

If I understand your question correct then your understanding is incorrect I'm affraid. The yield statements (yield return and yield break) is a very clever compiler trick. The code in you method is actually compiled into a class that implements IEnumerable. An instance of this class is what the method will return. Let's Call the instance 'ins' when calling ins.GetEnumerator() you get an IEnumerator that for each Call to MoveNext() produced the next element in the collection (the yield return is responsible for this part) when the sequence has no more elements (e.g. a yield break is encountered) MoveNext() returns false and further calls results in an exception. So it is not the Call to GetEnumerator that produced the (next) element but the Call to MoveNext

It looks like you understand it.
yield is used in your class's GetEnumerator as you describe so that you can write code like this:
foreach (MyObject myObject in myObjectCollection)
{
// Do something with myObject
}
By returning the first item from the 1st call the second from the 2nd and so on you can loop over all elements in the collection.
yield is defined in MyObjectCollection.

The Simple way to understand yield keyword is we do not need extra class to hold the result of iteration when return using
yield return keyword. Generally when we iterate through the collection and want to return the result, we use collection object
to hold the result. Let's look at example.
public static List Multiplication(int number, int times)
{
List<int> resultList = new List<int>();
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
resultList.Add(result);
}
return resultList;
}
static void Main(string[] args)
{
foreach(int i in Multiplication(2,10))
{
Console.WriteLine(i);
}
Console.ReadKey();
}
In the above example, I want to return the result of multiplication of 2 ten times. So I Create a method Multiplication
which returns me the multiplication of 2 ten times and i store the result in the list and when my main method calls the
multiplication method, the control iterates through the loop ten times and store result result in the list. This is without
using yield return. Suppose if i want to do this using yield return it looks like
public static IEnumerable Multiplication(int number, int times)
{
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
yield return result;
}
}
static void Main(string[] args)
{
foreach(int i in Multiplication(2,10))
{
Console.WriteLine(i);
}
Console.ReadKey();
}
Now there is slight changes in Multiplication method, return type is IEnumerable and there is no other list to hold the
result because to work with Yield return type must be IEnumerable or IEnumerator and since Yield provides stateful iteration
we do not need extra class to hold the result. So in the above example, when Multiplication method is called from Main
method, it calculates the result in for 1st iteration and return the result to main method and come backs to the loop and
calculate the result for 2nd iteration and returns the result to main method.In this way Yield returns result to calling
method one by one in each iteration.There is other Keyword break used in combination with Yield that causes the iteration
to stop. For example in the above example if i want to calculate multiplication for only half number of times(10/2=5) then
the method looks like this:
public static IEnumerable Multiplication(int number, int times)
{
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
yield return result;
if (i == times / 2)
yield break;
}
}
This method now will result multiplication of 2, 5 times.Hope this will help you understand the concept of Yield. For more
information please visit http://msdn.microsoft.com/en-us/library/9k7k7cf0.aspx

Some help understanding "yield"

In my everlasting quest to suck less I'm trying to understand the "yield" statement, but I keep encountering the same error.
The body of [someMethod] cannot be an iterator block because
'System.Collections.Generic.List< AClass>' is not an iterator interface type.
This is the code where I got stuck:
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
What am I doing wrong? Can't I use yield in an iterator? Then what's the point?
In this example it said that List<ProductMixHeader> is not an iterator interface type.
ProductMixHeader is a custom class, but I imagine List is an iterator interface type, no?
--Edit--
Thanks for all the quick answers.
I know this question isn't all that new and the same resources keep popping up.
It turned out I was thinking I could return List<AClass> as a return type, but since List<T> isn't lazy, it cannot. Changing my return type to IEnumerable<T> solved the problem :D
A somewhat related question (not worth opening a new thread): is it worth giving IEnumerable<T> as a return type if I'm sure that 99% of the cases I'm going to go .ToList() anyway? What will the performance implications be?

A method using yield return must be declared as returning one of the following two interfaces:
IEnumerable<SomethingAppropriate>
IEnumerator<SomethingApropriate>
(thanks Jon and Marc for pointing out IEnumerator)
Example:
public IEnumerable<AClass> YourMethod()
{
foreach (XElement header in headersXml.Root.Elements())
{
yield return (ParseHeader(header));
}
}
yield is a lazy producer of data, only producing another item after the first has been retrieved, whereas returning a list will return everything in one go.
So there is a difference, and you need to declare the method correctly.
For more information, read Jon's answer here, which contains some very useful links.

It's a tricky topic. In a nutshell, it's an easy way of implementing IEnumerable and its friends. The compiler builds you a state machine, transforming parameters and local variables into instance variables in a new class. Complicated stuff.
I have a few resources on this:
Chapter 6 of C# in Depth (free download from that page)
Iterators, iterator blocks and data pipelines (article)
Iterator block implementation details (article)

"yield" creates an iterator block - a compiler generated class that can implement either IEnumerable[<T>] or IEnumerator[<T>]. Jon Skeet has a very good (and free) discussion of this in chapter 6 of C# in Depth.
But basically - to use "yield" your method must return an IEnumerable[<T>] or IEnumerator[<T>]. In this case:
public IEnumerable<AClass> SomeMethod() {
// ...
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
}

List implements Ienumerable.
Here's an example that might shed some light on what you are trying to learn. I wrote this about 6 months
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (0 == integer)
return false;
if (3 > integer)
return true;
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimes()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
foreach (int i in primes.FindPrimes())
{
Console.WriteLine(i);
Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}

I highly recommend using Reflector to have a look at what yield actually does for you. You'll be able to see the full code of the class that the compiler generates for you when using yield, and I've found that people understand the concept much more quickly when they can see the low-level result (well, mid-level I guess).

To understand yield, you need to understand when to use IEnumerator and IEnumerable (because you have to use either of them). The following examples help you to understand the difference.
First, take a look at the following class, it implements two methods - one returning IEnumerator<int>, one returning IEnumerable<int>. I'll show you that there is a big difference in usage, although the code of the 2 methods is looking similar:
// 2 iterators, one as IEnumerator, one as IEnumerable
public class Iterator
{
public static IEnumerator<int> IterateOne(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
public static IEnumerable<int> IterateAll(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
}
Now, if you're using IterateOne you can do the following:
// 1. Using IEnumerator allows to get item by item
var i=Iterator.IterateOne(x => true); // iterate endless
// 1.a) get item by item
i.MoveNext(); Console.WriteLine(i.Current);
i.MoveNext(); Console.WriteLine(i.Current);
// 1.b) loop until 100
int j; while (i.MoveNext() && (j=i.Current)<=100) { Console.WriteLine(j); }
1.a) prints:
1
2
1.b) prints:
3
4
...
100
because it continues counting right after the 1.a) statements have been executed.
You can see that you can advance item by item using MoveNext().
In contrast, IterateAll allows you to use foreach and also LINQ statements for bigger comfort:
// 2. Using IEnumerable makes looping and LINQ easier
var k=Iterator.IterateAll(x => x<100); // limit iterator to 100
// 2.a) Use a foreach loop
foreach(var x in k){ Console.WriteLine(x); } // loop
// 2.b) LINQ: take 101..200 of endless iteration
var lst=Iterator.IterateAll(x=>true).Skip(100).Take(100).ToList(); // LINQ: take items
foreach(var x in lst){ Console.WriteLine(x); } // output list
2.a) prints:
1
2
...
99
2.b) prints:
101
102
...
200
Note: Since IEnumerator<T> and IEnumerable<T> are Generics, they can be used with any type. However, for simplicity I have used int in my examples for type T.
This means, you can use one of the return types IEnumerator<ProductMixHeader> or IEnumerable<ProductMixHeader> (the custom class you have mentioned in your question).
The type List<ProductMixHeader> does not implement any of these interfaces, which is the reason why you can't use it that way. But Example 2.b) is showing how you can create a list from it.
If you're creating a list by appending .ToList() then the implication is, that it will create a list of all elements in memory, while an IEnumerable allows lazy creation of its elements - in terms of performance, it means that elements are enumerated just in time - as late as possible, but as soon as you're using .ToList(), then all elements are created in memory. LINQ tries to optimize performance this way behind the scenes.
DotNetFiddle of all examples

#Ian P´s answer helped me a lot to understand yield and why it is used. One (major) use case for yield is in "foreach" loops after the "in" keyword not to return a fully completed list. Instead of returning a complete list at once, in each "foreach" loop only one item (the next item) is returned. So you will gain performance with yield in such cases.
I have rewritten #Ian P´s code for my better understanding to the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (0 == integer)
return false;
if (3 > integer)
return true;
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimesWithYield()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
public IEnumerable<int> FindPrimesWithoutYield()
{
var primes = new List<int>();
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
primes.Add(i);
}
}
return primes;
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
Console.WriteLine("Finding primes until 7 with yield...very fast...");
foreach (int i in primes.FindPrimesWithYield()) // FindPrimesWithYield DOES NOT iterate over all integers at once, it returns item by item
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.WriteLine("Finding primes until 7 without yield...be patient it will take lonkg time...");
foreach (int i in primes.FindPrimesWithoutYield()) // FindPrimesWithoutYield DOES iterate over all integers at once, it returns the complete list of primes at once
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}

What does the method you're using this in look like? I don't think this can be used in just a loop by itself.
For example...
public IEnumerable<string> GetValues() {
foreach(string value in someArray) {
if (value.StartsWith("A")) { yield return value; }
}
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Enumerating over lambdas does not bind the scope correctly? - c#

Related

lifetime of local variable inside an Action in c# [duplicate]

Why prefer Yield over a simple return?

Linq FirstOrDefault evaluates predicate each iteration?

Need help understanding C# yield in IEnumerable

Some help understanding "yield"

Categories

Resources