C# Having a method call inside Select statement causes multiple method calls

When running the code below:
private static IEnumerable<Tester2> Foo()
{
Console.WriteLine("Hello world!");
for (var i = 0; i < 10; i++)
yield return new Tester2(i);
}
private class Tester2
{
public int Hey { get; set; }
public Tester2(int hey)
{
Hey = hey;
}
}
private class Tester3
{
public int Hey3 { get; set; }
public Tester3(int hey)
{
Hey3 = hey + 5;
}
}
private static IEnumerable<Tester3> Bar(IEnumerable<Tester2> lister)
{
Console.WriteLine("Hello world2!");
return lister.Select(Convert);
}
private static Tester3 Convert(Tester2 tester)
{
Console.WriteLine("Hello world3!");
var hey = "test";
var result= hey.Map(x => new Tester3(tester.Hey)).First();
return result;
}
public static void Main(string[] args)
{
var x = Foo().ToList();
var bar = Bar(x);
foreach (var b in bar)
{
Console.WriteLine("Hmm2");
}
}
I get my main thread calling the Convert method multiple times, as shown by the output:
Hello world!
Hello world2!
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Hello world3!
Hmm2
Why is this? I've only ever seen behaviour like this when using yield and deferring execution, but never on simple IEnumerable methods. I would expect it to calculate the IEnumerable once, when we go over it in the foreach loop. Obviously when I introduce .ToList() the issue goes away, but I'd like to understand the behaviour first. Thanks in advance!

Why is this?
Select means "apply this function to each element", so it should be no surprise that Convert is called once for each element. Since Convert takes a single element, this is the only possible way for it to work.
Select, like most LINQ methods, uses deferred execution (lazy evaluation). That is one of the main points of LINQ. Lazy evaluation is in no way restricted to iterator blocks (i.e. methods that use yield return). Consider for example:
var one = new []{1, 2, 3}.Select(i => i.ToString()).First();
.ToString() will only be called once. And that can be quite useful. The length of the original list is irrelevant, since we only need the first item. It might even be infinitely long!
However, if you are using LINQ on regular lists, and not a database, some methods will need to evaluate the entire list once at least one value is required. For example OrderBy, since it needs to know all the elements in order to sort them. But the evaluation would still happen inside the foreach loop.
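To make the deferred execution visible, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and AddFive is a hypothetical stand-in for Convert that only counts how often it runs. It suggests that building the query runs nothing, that every enumeration runs the function once per element, and that ToList() stores the results so later loops do not call it again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var calls = 0;

// Hypothetical stand-in for Convert: counts how often it is invoked.
int AddFive(int x)
{
    calls++;
    return x + 5;
}

var source = new List<int> { 1, 2, 3 };

// Building the query does not call AddFive yet.
IEnumerable<int> query = source.Select(AddFive);
Console.WriteLine(calls); // 0

// Each enumeration of the query calls AddFive once per element.
foreach (var item in query) { }
Console.WriteLine(calls); // 3
foreach (var item in query) { }
Console.WriteLine(calls); // 6

// ToList() enumerates once more and stores the results.
List<int> cached = query.ToList();
Console.WriteLine(calls); // 9

// Looping over the stored list does not call AddFive again.
foreach (var item in cached) { }
Console.WriteLine(calls); // 9
```

So if the conversion is expensive and the result is iterated more than once, materializing it with ToList() is usually the simpler choice.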

Related

Enumerating over lambdas does not bind the scope correctly?

Consider the following C# program:
using System;
using System.Linq;
using System.Collections.Generic;
public class Test
{
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
int capture = i;
yield return () => Console.WriteLine(capture.ToString());
}
}
public static void Main(string[] args)
{
foreach (var a in Get()) a();
foreach (var a in Get().ToList()) a();
}
}
When compiled with the Mono compiler (e.g. Mono 2.10.2.0), it writes the following output:
0
1
1
1
This seems totally illogical to me. When directly iterating the yield function, the scope of the for-loop is "correctly" (to my understanding) used. But when I store the result in a list first, the scope is always the last action?!
Can I assume that this is a bug in the Mono compiler, or did I hit a mysterious corner case of C#'s lambda and yield-stuff?
BTW: When using Visual Studio compiler (and either MS.NET or mono to execute), the result is the expected 0 1 0 1
I'll give you the reason why it was 0 1 1 1:
foreach (var a in Get()) a();
Here you go into Get and it starts iterating:
i = 0 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 0 to the screen, then returns to the Get() method and continues.
i = 1 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 1 to the screen, then returns to the Get() method and continues (only to find that it has to stop).
But now, you're not iterating over each item when it happens, you're building a list and then iterating over that list.
foreach (var a in Get().ToList()) a();
What you are doing here isn't like the above: Get().ToList() returns a List or an Array (not sure which one). So now this happens:
i = 0 => return Console.WriteLine(i);
And in your Main() function, you get the following in memory:
var i = 0;
var list = new List
{
Console.WriteLine(i)
}
You go back into the Get() function:
i = 1 => return Console.WriteLine(i);
Which returns to your Main()
var i = 1;
var list = new List
{
Console.WriteLine(i),
Console.WriteLine(i)
}
And then does
foreach (var a in list) a();
Which will print out 1 1
It seems like it was ignoring that you made sure you encapsulated the value before returning the function.
@Armaron - The .ToList() extension returns a List<T>, just as ToArray() returns a T[], as the naming convention implies, but I think you are on the right track with your response.
This sounds like an issue with the compiler. I agree with Servy that it is probably a bug; however, have you tried the following?
public class Test
{
private static int capture = 0;
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
capture++;
yield return () => Console.WriteLine(capture.ToString());
}
}
}
Additionally you may want to try the static approach, perhaps this will perform a more accurate conversion as your function is static.
List<T> list = Enumerable.ToList(Get());
When calling ToList() it seems as though it is not performing a single iteration for each value but rather:
return new List<T>(Get());
The second foreach in your code does not make sense to me: I don't see why it would ever be necessary or beneficial unless you need to add actions to, or remove actions from, the List object. The first makes perfect sense, since all you are doing is iterating through the object and performing the associated action. My understanding is that the integer within the scope of the static IEnumerable method is being calculated during conversion by performing the entire iteration, and that the action is preserving the int as a static int due to scope. Also, keep in mind that IEnumerable is merely an interface, implemented by List (which also implements IList), and List may contain built-in logic for the conversion.
That being said I am interested to see/hear your findings as this is an interesting post. I will definitely upvote the question. Please ask questions if anything I said needs clarification or if something is false say so, although I am confident in my usage of the yield keyword of IEnumerable but this is a unique issue.
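For reference, the behaviour the C# rules lead you to expect can be sketched as follows. This is only an illustration under assumptions: it uses C# 9 top-level statements, the lambdas return the captured value instead of printing it, and WithLocalCopy and SharedLoopVariable are hypothetical names. A variable declared inside the loop body is a fresh variable per iteration, so ToList() should not change the result; only lambdas that capture the for-loop variable itself share one variable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Each iteration declares a fresh 'capture', so every lambda closes over its own copy.
IEnumerable<Func<int>> WithLocalCopy()
{
    for (int i = 0; i < 2; i++)
    {
        int capture = i;
        yield return () => capture;
    }
}

// Every lambda closes over the single for-loop variable 'i'.
List<Func<int>> SharedLoopVariable()
{
    var result = new List<Func<int>>();
    for (int i = 0; i < 2; i++)
    {
        result.Add(() => i);
    }
    return result;
}

// Invoke each lambda as it is produced, or only after collecting them all.
var oneByOne = WithLocalCopy().Select(f => f()).ToList();
var afterToList = WithLocalCopy().ToList().Select(f => f()).ToList();
var shared = SharedLoopVariable().Select(f => f()).ToList();

Console.WriteLine(string.Join(" ", oneByOne));    // 0 1
Console.WriteLine(string.Join(" ", afterToList)); // 0 1
Console.WriteLine(string.Join(" ", shared));      // 2 2
```

The 1 1 reported for Mono in the question matches the shape of the third case, which is consistent with the compiler treating capture as one shared variable.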

how to execute a method based on a string which contains its name

I have an object which contains several methods, and outside of it I have a list of strings where each string's value is the name of a method. I would like to execute the method based on its name. From experience, in Python it is dead simple. In C# I assume it should be done with delegates. Or with method invoking?
I want to avoid reflection on this one.
In Python you can store functions in a list, because a function is an object.
def a():
return 1
def b():
return 2
def c():
return 3
l= [a,b,c]
for i in l:
print i()
The output would be:
>>> 1
>>> 2
>>> 3
If you want to ignore reflection, you can create a delegate for each method call and store in a Dictionary.
Here's how you do it:
var methods = new Dictionary<string, Action>() {
{"Foo", () => Foo()},
{"Moo", () => Moo()},
{"Boo", () => Boo()}
};
methods["Foo"].Invoke();
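A slightly more defensive variant of the same idea is sketched below. It assumes C# 9 top-level statements; Foo, Moo and Run are hypothetical names, and Func<int> is used instead of Action only so that the result can be shown. TryGetValue avoids the KeyNotFoundException that the indexer throws for an unknown name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical methods standing in for the object's real ones.
int Foo() => 1;
int Moo() => 2;

// Map each name to a delegate once, up front.
var methods = new Dictionary<string, Func<int>>
{
    ["Foo"] = Foo,
    ["Moo"] = Moo,
};

// Returns null when no method is registered under that name.
int? Run(string name) =>
    methods.TryGetValue(name, out var method) ? method() : (int?)null;

foreach (var name in new[] { "Moo", "Foo", "Boo" })
{
    Console.WriteLine($"{name}: {Run(name)?.ToString() ?? "not registered"}");
}
// Moo: 2
// Foo: 1
// Boo: not registered
```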
Note that in your Python example, you are not "[executing] a method based on a string which contains its name" but rather adding the method to a collection.
You can do basically the same thing as you are doing in Python in C#. Take a look at the Func delegate.
class FuncExample
{
static void Main(string[] args)
{
var funcs = new List<Func<int>> { a, b, c };
foreach (var f in funcs)
{
Console.WriteLine(f());
}
}
private static int a()
{
return 1;
}
private static int b()
{
return 2;
}
private static int c()
{
return 3;
}
}
and the output is
1
2
3
If you need to execute a function based on its name as a string, Uri-Abramson's answer to this very question is a good place to start, though you may want to reconsider not using reflection.

Run multiple instances of the same method simultaneously in c# without data loss?

I really don't understand Tasks and Threads well.
I have a method inside three levels of nested for that I want to run multiple times in different threads/tasks, but the variables I pass to the method go crazy, let me explain with some code:
List<int> numbers=new List<int>();
for(int a=0;a<=70;a++)
{
for(int b=0;b<=6;b++)
{
for(int c=0;b<=10;c++)
{
Task.Factory.StartNew(()=>MyMethod(numbers,a,b,c));
}
}
}
private static bool MyMethod(List<int> nums,int a,int b,int c)
{
//Really a lot of stuff here
}
This is the nest. MyMethod really does a lot of things, like calculating the factorial of some numbers, writing into different documents, matching responses with a list of combinations and calling other little methods. It also has a return value (a boolean), but I don't care about that at the moment.
The problem is that no task reaches its end; it's as if every time the nest calls the method it refreshes itself, removing the previous instances.
It also gives a divide-by-zero error, with values OVER the limits set by the for loops, for example a=71, b=7, c=11, and all variables empty (that's why it divides by zero). I really don't know how to solve it.
The problem is that you are using variables that have been or will be modified outside your closure/lambda. You should get a warning saying "Access to modified closure".
You can fix it by copying your loop variables into locals first and using those:
namespace ConsoleApplication9
{
using System.Collections.Generic;
using System.Threading.Tasks;
class Program
{
static void Main()
{
var numbers = new List<int>();
for(int a=0;a<=70;a++)
{
for(int b=0;b<=6;b++)
{
for(int c=0;c<=10;c++)
{
var unmodifiedA = a;
var unmodifiedB = b;
var unmodifiedC = c;
Task.Factory.StartNew(() => MyMethod(numbers, unmodifiedA, unmodifiedB, unmodifiedC));
}
}
}
}
private static void MyMethod(List<int> nums, int a, int b, int c)
{
// Really a lot of stuff here
}
}
}
Check your for statements. The innermost loop tests b<=10 instead of c<=10, so its condition never becomes false and c just keeps increasing.
You also have a closure over the loop variables, which is likely to be the cause of the other problems.
Captured variable in a loop in C#
Why is it bad to use an iteration variable in a lambda expression
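Putting both fixes together, a minimal sketch could look like the one below. It is written under assumptions: C# 9 top-level statements, smaller loop bounds to keep it quick, and a hypothetical MyMethod that only records its arguments in a ConcurrentBag. It also keeps the tasks and waits for them, since a console program that returns from Main does not wait for tasks that are still running; whether that was the cause of "no task reaches its end" is not confirmed by the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for MyMethod: records the arguments it received.
// ConcurrentBag is safe to add to from several tasks at once.
var seen = new ConcurrentBag<(int A, int B, int C)>();
void MyMethod(int a, int b, int c) => seen.Add((a, b, c));

var tasks = new List<Task>();
for (int a = 0; a <= 2; a++)
{
    for (int b = 0; b <= 1; b++)
    {
        for (int c = 0; c <= 1; c++) // condition tests c, not b
        {
            // Copy the loop variables so each lambda captures its own values.
            int localA = a, localB = b, localC = c;
            tasks.Add(Task.Run(() => MyMethod(localA, localB, localC)));
        }
    }
}

// Block until every task has finished before the program exits.
Task.WaitAll(tasks.ToArray());

Console.WriteLine(seen.Count);            // 12
Console.WriteLine(seen.Distinct().Count()); // 12: every combination seen exactly once
```

If the tasks also write to a shared List<int>, as numbers in the question might be, that list needs a lock or a concurrent collection, because List<T> is not safe for concurrent writes.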

Cannot print to console using yield return

In the tests below, I cannot get Console.WriteLine to really print when using yield return.
I'm experimenting with yield return and I understand I have something missing in my understanding of it, but cannot find out what it is. Why aren't the strings printed inside PrintAllYield?
Code:
class Misc1 {
public IEnumerable<string> PrintAllYield(IEnumerable<string> list) {
foreach(string s in list) {
Console.WriteLine(s); // doesn't print
yield return s;
}
}
public void PrintAll(IEnumerable<string> list) {
foreach(string s in list) {
Console.WriteLine(s); // surely prints OK
}
}
}
Test:
[TestFixture]
class MiscTests {
[Test]
public void YieldTest() {
string[] list = new[] { "foo", "bar" };
Misc1 test = new Misc1();
Console.WriteLine("Start PrintAllYield");
test.PrintAllYield(list);
Console.WriteLine("End PrintAllYield");
Console.WriteLine();
Console.WriteLine("Start PrintAll");
test.PrintAll(list);
Console.WriteLine("End PrintAll");
}
}
Output:
Start PrintAllYield
End PrintAllYield
Start PrintAll
foo
bar
End PrintAll
1 passed, 0 failed, 0 skipped, took 0,39 seconds (NUnit 2.5.5).
You have to actually enumerate the returned IEnumerable to see the output:
Console.WriteLine("Start PrintAllYield");
foreach (var blah in test.PrintAllYield(list))
; // Do nothing
Console.WriteLine("End PrintAllYield");
When you use the yield return keyword, the compiler will construct a state machine for you. Its code will only be run when you actually use it to iterate the returned enumerable.
Is there a particular reason you're trying to use yield return to print out a sequence of strings? That's not really the purpose of the feature, which is to simplify the creation of sequence generators, rather than the enumeration of an already generated sequence. foreach is the preferred method for the latter.
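The order of execution can be made visible with a small sketch. It assumes C# 9 top-level statements; Echo is a hypothetical iterator, and a list called log stands in for the console so the interleaving is easy to inspect. Calling the iterator method runs none of its body, and each step of the foreach resumes it up to the next yield return.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Iterator: its body runs only while something enumerates the result.
IEnumerable<string> Echo(IEnumerable<string> items)
{
    foreach (var item in items)
    {
        log.Add("yielding " + item);
        yield return item;
    }
}

// Calling the method only creates the enumerable; the body has not started.
var sequence = Echo(new[] { "foo", "bar" });
Console.WriteLine(log.Count); // 0

// Enumerating drives the iterator one element at a time.
foreach (var s in sequence)
{
    log.Add("consumed " + s);
}

Console.WriteLine(string.Join(", ", log));
// yielding foo, consumed foo, yielding bar, consumed bar
```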

Some help understanding "yield"

In my everlasting quest to suck less I'm trying to understand the "yield" statement, but I keep encountering the same error.
The body of [someMethod] cannot be an iterator block because
'System.Collections.Generic.List< AClass>' is not an iterator interface type.
This is the code where I got stuck:
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
What am I doing wrong? Can't I use yield in an iterator? Then what's the point?
In this example it said that List<ProductMixHeader> is not an iterator interface type.
ProductMixHeader is a custom class, but I imagine List is an iterator interface type, no?
--Edit--
Thanks for all the quick answers.
I know this question isn't all that new and the same resources keep popping up.
It turned out I was thinking I could return List<AClass> as a return type, but since List<T> isn't lazy, it cannot. Changing my return type to IEnumerable<T> solved the problem :D
A somewhat related question (not worth opening a new thread): is it worth giving IEnumerable<T> as a return type if I'm sure that 99% of the cases I'm going to go .ToList() anyway? What will the performance implications be?
A method using yield return must be declared as returning one of the following two generic interfaces (or their non-generic counterparts, IEnumerable and IEnumerator):
IEnumerable<SomethingAppropriate>
IEnumerator<SomethingAppropriate>
(thanks Jon and Marc for pointing out IEnumerator)
Example:
public IEnumerable<AClass> YourMethod()
{
foreach (XElement header in headersXml.Root.Elements())
{
yield return (ParseHeader(header));
}
}
yield is a lazy producer of data, only producing the next item when the caller asks for it, whereas returning a list will return everything in one go.
So there is a difference, and you need to declare the method correctly.
For more information, read Jon's answer here, which contains some very useful links.
It's a tricky topic. In a nutshell, it's an easy way of implementing IEnumerable and its friends. The compiler builds you a state machine, transforming parameters and local variables into instance variables in a new class. Complicated stuff.
I have a few resources on this:
Chapter 6 of C# in Depth (free download from that page)
Iterators, iterator blocks and data pipelines (article)
Iterator block implementation details (article)
"yield" creates an iterator block - a compiler generated class that can implement either IEnumerable[<T>] or IEnumerator[<T>]. Jon Skeet has a very good (and free) discussion of this in chapter 6 of C# in Depth.
But basically - to use "yield" your method must return an IEnumerable[<T>] or IEnumerator[<T>]. In this case:
public IEnumerable<AClass> SomeMethod() {
// ...
foreach (XElement header in headersXml.Root.Elements()){
yield return (ParseHeader(header));
}
}
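If callers really do want a List<T>, one option is to keep the iterator as it is and add a thin wrapper that is not an iterator block itself. The sketch below assumes C# 9 top-level statements; ParseLazily and ParseAll are hypothetical names, and ParseHeader is replaced by a trivial string function because the real one is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical parser standing in for ParseHeader.
string ParseHeader(string raw) => raw.Trim().ToUpperInvariant();

// The iterator block itself must return IEnumerable<T> (or IEnumerator<T>).
IEnumerable<string> ParseLazily(IEnumerable<string> rawHeaders)
{
    foreach (var raw in rawHeaders)
    {
        yield return ParseHeader(raw);
    }
}

// A wrapper without yield is free to return List<T>.
List<string> ParseAll(IEnumerable<string> rawHeaders) =>
    ParseLazily(rawHeaders).ToList();

List<string> headers = ParseAll(new[] { " a ", "b" });
Console.WriteLine(string.Join(",", headers)); // A,B
```

Callers that only need to loop once can use the lazy version directly, and those that need indexing or a count can call the wrapper.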
List<T> implements IEnumerable<T>.
Here's an example that might shed some light on what you are trying to learn. I wrote this about 6 months ago:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (2 > integer)
return false; // 0 and 1 (and negative numbers) are not prime
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimes()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
foreach (int i in primes.FindPrimes())
{
Console.WriteLine(i);
Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}
I highly recommend using Reflector to have a look at what yield actually does for you. You'll be able to see the full code of the class that the compiler generates for you when using yield, and I've found that people understand the concept much more quickly when they can see the low-level result (well, mid-level I guess).
To understand yield, you need to understand when to use IEnumerator and IEnumerable (because you have to use either of them). The following examples help you to understand the difference.
First, take a look at the following class. It implements two methods: one returning IEnumerator<int>, one returning IEnumerable<int>. I'll show you that there is a big difference in usage, although the code of the two methods looks similar:
// 2 iterators, one as IEnumerator, one as IEnumerable
public class Iterator
{
public static IEnumerator<int> IterateOne(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
public static IEnumerable<int> IterateAll(Func<int, bool> condition)
{
for(var i=1; condition(i); i++) { yield return i; }
}
}
Now, if you're using IterateOne you can do the following:
// 1. Using IEnumerator allows to get item by item
var i=Iterator.IterateOne(x => true); // iterate endlessly
// 1.a) get item by item
i.MoveNext(); Console.WriteLine(i.Current);
i.MoveNext(); Console.WriteLine(i.Current);
// 1.b) loop until 100
int j; while (i.MoveNext() && (j=i.Current)<=100) { Console.WriteLine(j); }
1.a) prints:
1
2
1.b) prints:
3
4
...
100
because it continues counting right after the 1.a) statements have been executed.
You can see that you can advance item by item using MoveNext().
In contrast, IterateAll allows you to use foreach and also LINQ statements, which is more convenient:
// 2. Using IEnumerable makes looping and LINQ easier
var k=Iterator.IterateAll(x => x<100); // limit iterator to 100
// 2.a) Use a foreach loop
foreach(var x in k){ Console.WriteLine(x); } // loop
// 2.b) LINQ: take 101..200 of endless iteration
var lst=Iterator.IterateAll(x=>true).Skip(100).Take(100).ToList(); // LINQ: take items
foreach(var x in lst){ Console.WriteLine(x); } // output list
2.a) prints:
1
2
...
99
2.b) prints:
101
102
...
200
Note: Since IEnumerator<T> and IEnumerable<T> are Generics, they can be used with any type. However, for simplicity I have used int in my examples for type T.
This means, you can use one of the return types IEnumerator<ProductMixHeader> or IEnumerable<ProductMixHeader> (the custom class you have mentioned in your question).
The type List<ProductMixHeader> is not one of these iterator interface types (even though it implements IEnumerable<ProductMixHeader>), which is the reason why you can't use it as the return type of a method containing yield. But example 2.b) shows how you can create a list from an iterator.
If you create a list by appending .ToList(), all elements are produced and held in memory at once, whereas an IEnumerable allows lazy creation of its elements. In terms of performance, this means that elements are enumerated just in time, as late as possible, and only as many as the caller actually asks for; as soon as you call .ToList(), all of them are created.
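How many elements actually get produced can be checked with a counter. This is a sketch under assumptions: C# 9 top-level statements, and a hypothetical endless iterator named Naturals. Because Take(3) stops asking after three elements, ToList() terminates even though the source never ends.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var produced = 0;

// Endless generator: counts how many elements it was asked to produce.
IEnumerable<int> Naturals()
{
    for (var i = 1; ; i++)
    {
        produced++;
        yield return i;
    }
}

// Take(3) stops pulling after three elements, so ToList() terminates.
List<int> firstThree = Naturals().Take(3).ToList();

Console.WriteLine(string.Join(" ", firstThree)); // 1 2 3
Console.WriteLine(produced);                     // 3
```

Calling Naturals().ToList() without the Take would never finish, which is the practical difference between the two return styles.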
DotNetFiddle of all examples
@Ian P's answer helped me a lot to understand yield and why it is used. One (major) use case for yield is in a foreach loop, after the in keyword, when you do not want to return a fully completed list. Instead of returning a complete list at once, each iteration of the foreach loop gets only one item (the next item). So you will gain performance with yield in such cases.
I have rewritten @Ian P's code for my better understanding as follows:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace YieldReturnTest
{
public class PrimeFinder
{
private Boolean isPrime(int integer)
{
if (2 > integer)
return false; // 0 and 1 (and negative numbers) are not prime
for (int i = 2; i < integer; i++)
{
if (0 == integer % i)
return false;
}
return true;
}
public IEnumerable<int> FindPrimesWithYield()
{
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
yield return i;
}
}
}
public IEnumerable<int> FindPrimesWithoutYield()
{
var primes = new List<int>();
int i;
for (i = 1; i < 2147483647; i++)
{
if (isPrime(i))
{
primes.Add(i);
}
}
return primes;
}
}
class Program
{
static void Main(string[] args)
{
PrimeFinder primes = new PrimeFinder();
Console.WriteLine("Finding primes until 7 with yield...very fast...");
foreach (int i in primes.FindPrimesWithYield()) // FindPrimesWithYield DOES NOT iterate over all integers at once, it returns item by item
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.WriteLine("Finding primes until 7 without yield...be patient it will take a long time...");
foreach (int i in primes.FindPrimesWithoutYield()) // FindPrimesWithoutYield DOES iterate over all integers at once, it returns the complete list of primes at once
{
if (i > 7)
{
break;
}
Console.WriteLine(i);
//Console.ReadLine();
}
Console.ReadLine();
Console.ReadLine();
}
}
}
What does the method you're using this in look like? I don't think this can be used in just a loop by itself.
For example...
public IEnumerable<string> GetValues() {
foreach(string value in someArray) {
if (value.StartsWith("A")) { yield return value; }
}
}
