Why does Lazy<T> using ExecutionAndPublication behave differently with yielded IEnumerables?

Why does Lazy<T> using ExecutionAndPublication behave differently with yielded IEnumerables? - c#

I had a problem in my code where a Lazy initializer was called more frequently than what I expected. From the documentation, I expected that using LazyThreadSafetyMode.ExecutionAndPublication would ensure my initializer function was only ever called once, for example if accessing numbers.Value after defining:
numbers = new Lazy<IEnumerable<int>>(
() => GetNumbers(),
LazyThreadSafetyMode.ExecutionAndPublication
);
However, what I have found is that if the initialization function yields its results, the initialization function gets called more than once. I presume this has to with yields delayed execution but I only have a fuzzy sense why.
Question:
In the code below, why do the respective initialization functions get executed a different number of times?
void Main()
{
var foo = new foo();
var tasks = new List<Task>();
for (int i = 0; i < 10; ++i) tasks.Add(Task.Run(() => {foreach (var number in foo.Numbers) Debug.WriteLine(number);}));
Task.WaitAll(tasks.ToArray());
tasks.Clear();
for (int i = 0; i < 10; ++i) tasks.Add(Task.Run(() => {foreach (var letter in foo.Letters) Debug.WriteLine(letter);}));
Task.WaitAll(tasks.ToArray());
}
public class foo
{
public IEnumerable<int> Numbers => numbers.Value;
public IEnumerable<char> Letters => letters.Value;
readonly Lazy<IEnumerable<int>> numbers;
readonly Lazy<IEnumerable<char>> letters;
public foo()
{
numbers = new Lazy<IEnumerable<int>>(
() => GetNumbers(),
LazyThreadSafetyMode.ExecutionAndPublication
);
letters = new Lazy<IEnumerable<char>>(
() => GetLetters().ToList(), //ToList enumerates all yielded letters, creating the expected call once behavior
LazyThreadSafetyMode.ExecutionAndPublication
);
}
protected IEnumerable<char> GetLetters()
{
Debug.WriteLine($"{nameof(GetLetters)} Called");
yield return 'a';
yield return 'b';
yield return 'c';
yield break;
}
protected IEnumerable<int> GetNumbers()
{
Debug.WriteLine($"{nameof(GetNumbers)} Called");
yield return 1;
yield return 2;
yield return 3;
yield break;
}
}

I had a problem in my code where a Lazy initializer was called more frequently than what I expected.
No, the initializer of the lazy is called once. The initializer of the lazy is
() => GetNumbers()
and that is called exactly once.
GetNumbers returns an IEnumerable<int> -- a sequence of integers.
When you foreach that sequence, it calls GetEnumerator to get an enumerator, and then calls MoveNext on the enumerator object until MoveNext returns false.
You've said that you want the sequence to be enumerated as:
the first time MoveNext is called, do a writeline and produce 1
the second time MoveNext is called, produce 2
the third time MoveNext is called, produce 3
the fourth time MoveNext is called, produce 4
Every subsequent time MoveNext is called, return false
So every time you enumerate the sequence, that's what happens.
Can you explain what you expected to happen? I am interested to learn why people have false beliefs about computer programs.
Also it is not clear to me why you are using Lazy at all. You would typically use Lazy to avoid expensive work until it is needed, but sequences already defer work until they are enumerated.

Related

Enumerating over lambdas does not bind the scope correctly?

consider the following C# program:
using System;
using System.Linq;
using System.Collections.Generic;
public class Test
{
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
int capture = i;
yield return () => Console.WriteLine(capture.ToString());
}
}
public static void Main(string[] args)
{
foreach (var a in Get()) a();
foreach (var a in Get().ToList()) a();
}
}
When executed under Mono compiler (e.g. Mono 2.10.2.0 - paste into here), it writes the following output:
0
1
1
1
This seems totally unlogical to me. When directly iterating the yield function, the scope of the for-loop is "correctly" (to my understanding) used. But when I store the result in a list first, the scope is always the last action?!
Can I assume that this is a bug in the Mono compiler, or did I hit a mysterious corner case of C#'s lambda and yield-stuff?
BTW: When using Visual Studio compiler (and either MS.NET or mono to execute), the result is the expected 0 1 0 1

I'll give you the reason why it was 0 1 1 1:
foreach (var a in Get()) a();
Here you go into Get and it starts iterating:
i = 0 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 0 to the screen, then returns to the Get() method and continues.
i = 1 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 1 to the screen, then returns to the Get() method and continues (only to find that it has to stop).
But now, you're not iterating over each item when it happens, you're building a list and then iterating over that list.
foreach (var a in Get().ToList()) a();
What you are doing isn't like above, Get().ToList() returns a List or Array (not sure wich one). So now this happens:
i = 0 => return Console.WriteLine(i);
And in you Main() function, you get the following in memory:
var i = 0;
var list = new List
{
Console.WriteLine(i)
}
You go back into the Get() function:
i = 1 => return Console.WriteLine(i);
Which returns to your Main()
var i = 1;
var list = new List
{
Console.WriteLine(i),
Console.WriteLine(i)
}
And then does
foreach (var a in list) a();
Which will print out 1 1
It seems like it was ignoring that you made sure you encapsulated the value before returning the function.

#Armaron - The .ToList() extension returns List of type T as ToArray() returns T[] as the naming convention implies, but I think you are on the right track with your response.
This sounds like an issuse with the compiler. I agree with Servy that it is probably a bug, however, have you tried the following?
public class Test
{
private static int capture = 0;
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
capture++;
yield return () => Console.WriteLine(capture.ToString());
}
}
}
Additionally you may want to try the static approach, perhaps this will perform a more accurate conversion as your function is static.
List<T> list = Enumerable.ToList(Get());
When calling ToList() it seems as though it is not performing a single iteration for each value but rather:
return new List<T>(Get());
The second for each in your code does not make sense to me in implementation as to why it would ever be necessary or beneficial unless you require additional actions to be added/removed to the List object. The first makes perfect sense since all you are doing is iterating through the object and performing the associated action. My understanding is that an integer within the scope of the static IEnumerbale object is being calculated during conversion by performing the entire iteration and the action is preserving the int as a static int due to scope. Also, keep in mind that IEnumerable is merely an interface that is implemented by List which implements IList, and may contain logic for the conversion built in.
That being said I am interested to see/hear your findings as this is an interesting post. I will definitely upvote the question. Please ask questions if anything I said needs clarification or if something is false say so, although I am confident in my usage of the yield keyword of IEnumerable but this is a unique issue.

Buffering a LINQ query

FINAL EDIT:
I've chosen Timothy's answer but if you want a cuter implementation that leverages the C# yield statement check Eamon's answer: https://stackoverflow.com/a/19825659/145757
By default LINQ queries are lazily streamed.
ToArray/ToList give full buffering but first they're eager and secondly it may take quite some time to complete with an infinite sequence.
Is there any way to have a combination of both behaviors : streaming and buffering values on the fly as they are generated, so that the next querying won't trigger the generation of the elements that have already been queried.
Here is a basic use-case:
static IEnumerable<int> Numbers
{
get
{
int i = -1;
while (true)
{
Console.WriteLine("Generating {0}.", i + 1);
yield return ++i;
}
}
}
static void Main(string[] args)
{
IEnumerable<int> evenNumbers = Numbers.Where(i => i % 2 == 0);
foreach (int n in evenNumbers)
{
Console.WriteLine("Reading {0}.", n);
if (n == 10) break;
}
Console.WriteLine("==========");
foreach (int n in evenNumbers)
{
Console.WriteLine("Reading {0}.", n);
if (n == 10) break;
}
}
Here is the output:
Generating 0.
Reading 0.
Generating 1.
Generating 2.
Reading 2.
Generating 3.
Generating 4.
Reading 4.
Generating 5.
Generating 6.
Reading 6.
Generating 7.
Generating 8.
Reading 8.
Generating 9.
Generating 10.
Reading 10.
==========
Generating 0.
Reading 0.
Generating 1.
Generating 2.
Reading 2.
Generating 3.
Generating 4.
Reading 4.
Generating 5.
Generating 6.
Reading 6.
Generating 7.
Generating 8.
Reading 8.
Generating 9.
Generating 10.
Reading 10.
The generation code is triggered 22 times.
I'd like it to be triggered 11 times, the first time the enumerable is iterated.
Then the second iteration would benefit from the already generated values.
It would be something like:
IEnumerable<int> evenNumbers = Numbers.Where(i => i % 2 == 0).Buffer();
For those familiar with Rx it's a behavior similar to a ReplaySubject.

IEnumerable<T>.Buffer() extension method
public static EnumerableExtensions
{
public static BufferEnumerable<T> Buffer(this IEnumerable<T> source)
{
return new BufferEnumerable<T>(source);
}
}
public class BufferEnumerable<T> : IEnumerable<T>, IDisposable
{
IEnumerator<T> source;
List<T> buffer;
public BufferEnumerable(IEnumerable<T> source)
{
this.source = source.GetEnumerator();
this.buffer = new List<T>();
}
public IEnumerator<T> GetEnumerator()
{
return new BufferEnumerator<T>(source, buffer);
}
public void Dispose()
{
source.Dispose()
}
}
public class BufferEnumerator<T> : IEnumerator<T>
{
IEnumerator<T> source;
List<T> buffer;
int i = -1;
public BufferEnumerator(IEnumerator<T> source, List<T> buffer)
{
this.source = source;
this.buffer = buffer;
}
public T Current
{
get { return buffer[i]; }
}
public bool MoveNext()
{
i++;
if (i < buffer.Count)
return true;
if (!source.MoveNext())
return false;
buffer.Add(source.Current);
return true;
}
public void Reset()
{
i = -1;
}
public void Dispose()
{
}
}
Usage
using (var evenNumbers = Numbers.Where(i => i % 2 == 0).Buffer())
{
...
}
Comments
The key point here is that the IEnumerable<T> source given as input to the Buffer method only has GetEnumerator called once, regardless of how many times the result of Buffer is enumerated. All enumerators for the result of Buffer share the same source enumerator and internal list.

You can use the Microsoft.FSharp.Collections.LazyList<> type from the F# power pack (yep, from C# without F# installed - no problem!) for this. It's in Nuget package FSPowerPack.Core.Community.
In particular, you want to call LazyListModule.ofSeq(...) which returns a LazyList<T> that implements IEnumerable<T> and is lazy and cached.
In your case, usage is just a matter of...
var evenNumbers = LazyListModule.ofSeq(Numbers.Where(i => i % 2 == 0));
var cachedEvenNumbers = LazyListModule.ofSeq(evenNumbers);
Though I personally prefer var in all such cases, note that this does mean the compile-time type will be more specific than just IEnumerable<> - not that this is likely to ever be a downside. Another advantage of the F# non-interface types is that they expose some efficient operations you can't do efficienly with plain IEnumerables, such as LazyListModule.skip.
I'm not sure whether LazyList is thread-safe, but I suspect it is.
Another alternative pointed out in the comments below (if you have F# installed) is SeqModule.Cache (namespace Microsoft.FSharp.Collections, it'll be in GACed assembly FSharp.Core.dll) which has the same effective behavior. Like other .NET enumerables, Seq.cache doesn't have a tail (or skip) operator you can efficiently chain.
Thread-safe: unlike other solutions to this question Seq.cache is thread-safe in the sense that you can have multiple enumerators running in parallel (each enumerator is not thread safe).
Performance I did a quick benchmark, and the LazyList enumerable has at least 4 times more overhead than the SeqModule.Cache variant, which has at least three times more overhead than the custom implementation answers. So, while the F# variants work, they're not quite as fast. Note that 3-12 times slower still isn't very slow compared to an enumerable that does (say) I/O or any non-trivial computation, so this probably won't matter most of the time, but it's good to keep in mind.
TL;DR If you need an efficient, thread-safe cached enumerable, just use SeqModule.Cache.

Building upon Eamon's answer above, here's another functional solution (no new types) that works also with simultaneous evaluation. This demonstrates that a general pattern (iteration with shared state) underlies this problem.
First we define a very general helper method, meant to allow us to simulate the missing feature of anonymous iterators in C#:
public static IEnumerable<T> Generate<T>(Func<Func<Tuple<T>>> generator)
{
var tryGetNext = generator();
while (true)
{
var result = tryGetNext();
if (null == result)
{
yield break;
}
yield return result.Item1;
}
}
Generate is like an aggregator with state. It accepts a function that returns initial state, and a generator function that would have been an anonymous with yield return in it, if it were allowed in C#. The state returned by initialize is meant to be per-enumeration, while a more global state (shared between all enumerations) can be maintained by the caller to Generate e.g. in closure variables as we'll show below.
Now we can use this for the "buffered Enumerable" problem:
public static IEnumerable<T> Cached<T>(IEnumerable<T> enumerable)
{
var cache = new List<T>();
var enumerator = enumerable.GetEnumerator();
return Generate<T>(() =>
{
int pos = -1;
return () => {
pos += 1;
if (pos < cache.Count())
{
return new Tuple<T>(cache[pos]);
}
if (enumerator.MoveNext())
{
cache.Add(enumerator.Current);
return new Tuple<T>(enumerator.Current);
}
return null;
};
});
}

I hope this answer combines the brevity and clarity of sinelaw's answer and the support for multiple enumerations of Timothy's answer:
public static IEnumerable<T> Cached<T>(this IEnumerable<T> enumerable) {
return CachedImpl(enumerable.GetEnumerator(), new List<T>());
}
static IEnumerable<T> CachedImpl<T>(IEnumerator<T> source, List<T> buffer) {
int pos=0;
while(true) {
if(pos == buffer.Count)
if (source.MoveNext())
buffer.Add(source.Current);
else
yield break;
yield return buffer[pos++];
}
}
Key ideas are to use the yield return syntax to make for a short enumerable implementation, but you still need a state-machine to decide whether you can get the next element from the buffer, or whether you need to check the underlying enumerator.
Limitations: This makes no attempt to be thread-safe, nor does it dispose the underlying enumerator (which, in general, is quite tricky to do as the underlying uncached enumerator must remain undisposed as long as any cached enumerabl might still be used).

As far as I know there is no built-in way to do this, which - now that you mention it - is slightly surprising (my guess is, given the frequency with which one would want to use this option, it was probably not worth the effort needed to analyse the code to make sure that the generator gives the exact same sequence every time).
You can however implement it yourself. The easy way would be on the call-site, as
var evenNumbers = Numbers.Where(i => i % 2 == 0).
var startOfList = evenNumbers.Take(10).ToList();
// use startOfList instead of evenNumbers in the loop.
More generally and accurately, you could do it in the generator: create a List<int> cache and every time you generate a new number add it to the cache before you yield return it. Then when you loop through again, first serve up all the cached numbers. E.g.
List<int> cachedEvenNumbers = new List<int>();
IEnumerable<int> EvenNumbers
{
get
{
int i = -1;
foreach(int cached in cachedEvenNumbers)
{
i = cached;
yield return cached;
}
// Note: this while loop now starts from the last cached value
while (true)
{
Console.WriteLine("Generating {0}.", i + 1);
yield return ++i;
}
}
}
I guess if you think about this long enough you could come up with a general implementation of a IEnumerable<T>.Buffered() extension method - again, the requirement is that the enumeration doesn't change between calls and the question is if it is worth it.

Here's an incomplete yet compact 'functional' implementation (no new types defined).
The bug is that it does not allow simultaneous enumeration.
Original description:
The first function should have been an anonymous lambda inside the second, but C# does not allow yield in anonymous lambdas:
// put these in some extensions class
private static IEnumerable<T> EnumerateAndCache<T>(IEnumerator<T> enumerator, List<T> cache)
{
while (enumerator.MoveNext())
{
var current = enumerator.Current;
cache.Add(current);
yield return current;
}
}
public static IEnumerable<T> ToCachedEnumerable<T>(this IEnumerable<T> enumerable)
{
var enumerator = enumerable.GetEnumerator();
var cache = new List<T>();
return cache.Concat(EnumerateAndCache(enumerator, cache));
}
Usage:
var enumerable = Numbers.ToCachedEnumerable();

Full credit to Eamon Nerbonne and sinelaw for their answers, just a couple of tweaks! First, to release the enumerator when it is completed. Secondly to protect the underlying enumerator with a lock so the enumerable can be safely used on multiple threads.
// This is just the same as #sinelaw's Generator but I didn't like the name
public static IEnumerable<T> AnonymousIterator<T>(Func<Func<Tuple<T>>> generator)
{
var tryGetNext = generator();
while (true)
{
var result = tryGetNext();
if (null == result)
{
yield break;
}
yield return result.Item1;
}
}
// Cached/Buffered/Replay behaviour
public static IEnumerable<T> Buffer<T>(this IEnumerable<T> self)
{
// Rows are stored here when they've been fetched once
var cache = new List<T>();
// This counter is thread-safe in that it is incremented after the item has been added to the list,
// hence it will never give a false positive. It may give a false negative, but that falls through
// to the code which takes the lock so it's ok.
var count = 0;
// The enumerator is retained until it completes, then it is discarded.
var enumerator = self.GetEnumerator();
// This lock protects the enumerator only. The enumerable could be used on multiple threads
// and the enumerator would then be shared among them, but enumerators are inherently not
// thread-safe so a) we must protect that with a lock and b) we don't need to try and be
// thread-safe in our own enumerator
var lockObject = new object();
return AnonymousIterator<T>(() =>
{
int pos = -1;
return () =>
{
pos += 1;
if (pos < count)
{
return new Tuple<T>(cache[pos]);
}
// Only take the lock when we need to
lock (lockObject)
{
// The counter could have been updated between the check above and this one,
// so now we have the lock we must check again
if (pos < count)
{
return new Tuple<T>(cache[pos]);
}
// Enumerator is set to null when it has completed
if (enumerator != null)
{
if (enumerator.MoveNext())
{
cache.Add(enumerator.Current);
count += 1;
return new Tuple<T>(enumerator.Current);
}
else
{
enumerator = null;
}
}
}
}
return null;
};
});
}

I use the following extension method.
This way, the input is read at maximum speed, and the consumer processes at maximum speed.
public static IEnumerable<T> Buffer<T>(this IEnumerable<T> input)
{
var blockingCollection = new BlockingCollection<T>();
//read from the input
Task.Factory.StartNew(() =>
{
foreach (var item in input)
{
blockingCollection.Add(item);
}
blockingCollection.CompleteAdding();
});
foreach (var item in blockingCollection.GetConsumingEnumerable())
{
yield return item;
}
}
Example Usage
This example has a fast producer (find files), and a slow consumer (upload files).
long uploaded = 0;
long total = 0;
Directory
.EnumerateFiles(inputFolder, "*.jpg", SearchOption.AllDirectories)
.Select(filename =>
{
total++;
return filename;
})
.Buffer()
.ForEach(filename =>
{
//pretend to do something slow, like upload the file.
Thread.Sleep(1000);
uploaded++;
Console.WriteLine($"Uploaded {uploaded:N0}/{total:N0}");
});

Why prefer Yield over a simple return?

I am trying to understand use of Yield to enumerate the collection. I have written this basic code:
static void Main(string[] args)
{
Iterate iterate = new Iterate();
foreach (int i in iterate.EnumerateList())
{
Console.Write("{0}", i);
}
Console.ReadLine();
}
class Iterate
{
public IEnumerable<int> EnumerateList()
{
List<int> lstNumbers = new List<int>();
lstNumbers.Add(1);
lstNumbers.Add(2);
lstNumbers.Add(3);
lstNumbers.Add(4);
lstNumbers.Add(5);
foreach (int i in lstNumbers)
{
yield return i;
}
}
}
(1) What if I use simply return i instead of yield return i?
(2) What are the advantages of using Yield and when to prefer using it?
Edited **
In the above code, I think it is an overhead to use foreach two times. First in the main function and the second in the EnumerateList method.

Using yield return i makes this method an iterator. It will create an IEnumerable<int> sequence of values from your entire loop.
If you used return i, it would just return a single int value. In your case, this would cause a compiler error, as the return type of your method is IEnumerable<int>, not int.
In this specific example, I would personally just return lstNumbers instead of using the iterator. You could rewrite this without the list, though, as:
public IEnumerable<int> EnumerateList()
{
yield return 1;
yield return 2;
yield return 3;
yield return 4;
yield return 5;
}
Or even:
public IEnumerable<int> EnumerateList()
{
for (int i=1;i<=5;++i)
yield return i;
}
This is very handy when you're making a class which you want to act like a collection. Implementing IEnumerable<T> by hand for a custom type often requires making a custom class, etc. Prior to iterators, this required a lot of code, as shown in this old sample for implementing a custom collection in C#.

As Reed says using yield allows you to implement an iterator. The major advantage of an iterator is that it allows lazy evaluation. I.e. it doesn't have to materialize the entire result unless needed.
Consider Directory.GetFiles from the BCL. It returns string[]. I.e. it has to get all the files and put the names in an array before it returns. In contrast Directory.EnumerateFiles returns IEnumerable<string>. I.e. the caller is responsible for handling the result set. That means that the caller can opt out of enumerating the collection at any point.

The yield keyword will make the compiler turn the function into an enumerator object.
The yield return statement doesn't simply exit the function and return one value, it leaves the code in a state so that it can be resumed at that point when the next value is requested from the enumerator.

You can't return just i as i is an int and not an IEnumerable. It won't even compile. You could have returned lstNumbers as that implements the interface of IEnumerable.
I prefer yield return since the compiler will handle building the enumerable and not having to build the list in the first place. So for me if I have something that is already implementing the interface then I return that if I have to build it and not use it else where then I yield return.

Need help understanding C# yield in IEnumerable

i am reading C# 2010 Accelerated. i dont get what is yield
When GetEnumerator is called, the code
in the method that contains the yield
statement is not actually executed at
that point in time. Instead, the
compiler generates an enumerator
class, and that class contains the
yield block code
public IEnumerator<T> GetEnumerator() {
foreach( T item in items ) {
yield return item;
}
}
i also read from Some help understanding “yield”
yield is a lazy producer of data, only
producing another item after the first
has been retrieved, whereas returning
a list will return everything in one
go.
does this mean that each call to GetEnumerator will get 1 item from the collection? so 1st call i get 1st item, 2nd, i get the 2nd and so on ... ?

Best way to think of it is when you first request an item from an IEnumerator (for example in a foreach), it starts running trough the method, and when it hits a yield return it pauses execution and returns that item for you to use in your foreach. Then you request the next item, it resumes the code where it left and repeats the cycle until it encounters either yield break or the end of the method.
public IEnumerator<string> enumerateSomeStrings()
{
yield return "one";
yield return "two";
var array = new[] { "three", "four" }
foreach (var item in array)
yield return item;
yield return "five";
}

Take a look at the IEnumerator<T> interface; that may well to clarify what's happening. The compiler takes your code and turns it into a class that implements both IEnumerable<T> and IEnumerator<T>. The call to GetEnumerator() simply returns the class itself.
The implementation is basically a state machine, which, for each call to MoveNext(), executes the code up until the next yield return and then sets Current to the return value. The foreach loop uses this enumerator to walk through the enumerated items, calling MoveNext() before each iteration of the loop. The compiler is really doing some very cool things here, making yield return one of the most powerful constructs in the language. From the programmer's perspective, it's just an easy way to lazily return items upon request.

Yes thats right, heres the example from MSDN that illustrates how to use it
public class List
{
//using System.Collections;
public static IEnumerable Power(int number, int exponent)
{
int counter = 0;
int result = 1;
while (counter++ < exponent)
{
result = result * number;
yield return result;
}
}
static void Main()
{
// Display powers of 2 up to the exponent 8:
foreach (int i in Power(2, 8))
{
Console.Write("{0} ", i);
}
}
}
/*
Output:
2 4 8 16 32 64 128 256
*/

If I understand your question correct then your understanding is incorrect I'm affraid. The yield statements (yield return and yield break) is a very clever compiler trick. The code in you method is actually compiled into a class that implements IEnumerable. An instance of this class is what the method will return. Let's Call the instance 'ins' when calling ins.GetEnumerator() you get an IEnumerator that for each Call to MoveNext() produced the next element in the collection (the yield return is responsible for this part) when the sequence has no more elements (e.g. a yield break is encountered) MoveNext() returns false and further calls results in an exception. So it is not the Call to GetEnumerator that produced the (next) element but the Call to MoveNext

It looks like you understand it.
yield is used in your class's GetEnumerator as you describe so that you can write code like this:
foreach (MyObject myObject in myObjectCollection)
{
// Do something with myObject
}
By returning the first item from the 1st call the second from the 2nd and so on you can loop over all elements in the collection.
yield is defined in MyObjectCollection.

The Simple way to understand yield keyword is we do not need extra class to hold the result of iteration when return using
yield return keyword. Generally when we iterate through the collection and want to return the result, we use collection object
to hold the result. Let's look at example.
public static List Multiplication(int number, int times)
{
List<int> resultList = new List<int>();
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
resultList.Add(result);
}
return resultList;
}
static void Main(string[] args)
{
foreach(int i in Multiplication(2,10))
{
Console.WriteLine(i);
}
Console.ReadKey();
}
In the above example, I want to return the result of multiplication of 2 ten times. So I Create a method Multiplication
which returns me the multiplication of 2 ten times and i store the result in the list and when my main method calls the
multiplication method, the control iterates through the loop ten times and store result result in the list. This is without
using yield return. Suppose if i want to do this using yield return it looks like
public static IEnumerable Multiplication(int number, int times)
{
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
yield return result;
}
}
static void Main(string[] args)
{
foreach(int i in Multiplication(2,10))
{
Console.WriteLine(i);
}
Console.ReadKey();
}
Now there is slight changes in Multiplication method, return type is IEnumerable and there is no other list to hold the
result because to work with Yield return type must be IEnumerable or IEnumerator and since Yield provides stateful iteration
we do not need extra class to hold the result. So in the above example, when Multiplication method is called from Main
method, it calculates the result in for 1st iteration and return the result to main method and come backs to the loop and
calculate the result for 2nd iteration and returns the result to main method.In this way Yield returns result to calling
method one by one in each iteration.There is other Keyword break used in combination with Yield that causes the iteration
to stop. For example in the above example if i want to calculate multiplication for only half number of times(10/2=5) then
the method looks like this:
public static IEnumerable Multiplication(int number, int times)
{
int result = number;
for(int i=1;i<=times;i++)
{
result=number*i;
yield return result;
if (i == times / 2)
yield break;
}
}
This method now will result multiplication of 2, 5 times.Hope this will help you understand the concept of Yield. For more
information please visit http://msdn.microsoft.com/en-us/library/9k7k7cf0.aspx

Need help in terminating deferred execution

The following snippet prints 1 through 10 on the console, but does not terminate until variable 'i' reaches int.MaxValue. TIA for pointing out what I am missing.
class Program
{
public static IEnumerable<int> GetList()
{
int i = 0;
while (i < int.MaxValue)
{
i++;
yield return i;
}
}
static void Main(string[] args)
{
var q = from i in GetList() // keeps calling until i reaches int.MaxValue
where i <= 10
select i;
foreach (int i in q)
Console.WriteLine(i);
}
}

You could try:
var q = GetList ().TakeWhile ((i)=> i <=10);

The query that you defined in Main doesn't know anything about the ordering of your GetList method, and it must check every value of that list with the predicate i <= 10. If you want to stop processing sooner, you will you can use the Take extension method or use the TakeWhile extension method:
foreach (int i in GetList().Take(10))
Console.WriteLine(i);
foreach (int i in GetList().TakeWhile(x => x <= 10))
Console.WriteLine(i);

Your iterators limits are 0 through Int32.MaxValue, so it will process that whole range. Iterators are only smart enough to not pre-iterate the results of the range of data you design it to iterate. However they are not smart enough to know when the code that uses them no longer needs more unless you tell it so (i.e. you break out of a foreach loop.) The only way to allow the iterator to limit itself is to pass in the upper bound to the GetList function:
public static IEnumerable<int> GetList(int upperBound)
{
int i = 0;
while (i < upperBound)
{
i++;
yield return i;
}
}
You could also explicitly tell the iterator that you only wish to iterate the first 10 results:
var numbers = GetList().Take(10);

Consider using the LINQ extension method .Take() with your argument instead of having it in your where clause. More on Take.
var q = from i in GetList().Take(10)
select i;

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Why does Lazy<T> using ExecutionAndPublication behave differently with yielded IEnumerables? - c#

Related

Enumerating over lambdas does not bind the scope correctly?

Buffering a LINQ query

Why prefer Yield over a simple return?

Need help understanding C# yield in IEnumerable

Need help in terminating deferred execution

Categories

Resources