I don't quite understand the difference between two blocks of code. Consider this program:
class Program
{
static void Main(string[] args)
{
List<Number> numbers = new List<Number>
{
new Number(1),
new Number(2),
new Number(3)
};
List<Action> actions = new List<Action>();
foreach (Number numb in numbers)
{
actions.Add(() => WriteNumber(numb));
}
Number number = null;
IEnumerator<Number> enumerator = numbers.GetEnumerator();
while (enumerator.MoveNext())
{
number = enumerator.Current;
actions.Add(() => WriteNumber(number));
}
foreach (Action action in actions)
{
action();
}
Console.ReadKey();
}
public static void WriteNumber(Number num)
{
Console.WriteLine(num.Value);
}
public class Number
{
public int Value;
public Number(int i)
{
this.Value = i;
}
}
}
The output is
1
2
3
3
3
3
These two blocks of code should work identically, but you can see that the closure is not working for the first loop. What am I missing?
Thanks in advance.
You declare the number variable outside of your while loop. On each iteration you store the current element in that same variable, overwriting the previous value, and every lambda captures that one variable.
You should move the declaration inside the while loop, so that there is a new variable for each of your numbers.
IEnumerator<Number> enumerator = numbers.GetEnumerator();
while (enumerator.MoveNext())
{
Number number = enumerator.Current;
actions.Add(() => WriteNumber(number));
}
These two blocks of code should work identically.
No they shouldn't - at least in C# 5. In C# 3 and 4 they would, in fact.
But in the foreach loop, in C# 5, you have one variable per iteration of the loop. Your lambda expression captures that variable. Subsequent iterations of the loop create different variables which don't affect the previously-captured variable.
In the while loop, you have one variable which all the iterations capture. Changes to that variable will be seen in all of the delegates that captured it. You can see this by adding this line after your while loop:
number = new Number(999);
Then your output would be
1
2
3
999
999
999
Now in C# 3 and 4, the foreach specification was basically broken by design - it would capture a single variable across all iterations. This was then fixed in C# 5 to use a separate variable per iteration, which is basically what you always want with that sort of code.
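A minimal sketch of the two capture behaviours described above. It declares the captured variable explicitly in both cases, so the result should not depend on the compiler version. The names CaptureDemo, SharedVariable and PerIterationVariable are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // One variable declared outside the loop: every lambda captures the same variable.
    public static List<int> SharedVariable(int[] source)
    {
        var funcs = new List<Func<int>>();
        int current = 0;
        foreach (int item in source)
        {
            current = item;
            funcs.Add(() => current);
        }
        // All delegates run after the loop, so they all see the last assigned value.
        return funcs.Select(f => f()).ToList();
    }

    // A fresh variable declared inside the loop body: each lambda captures its own variable.
    public static List<int> PerIterationVariable(int[] source)
    {
        var funcs = new List<Func<int>>();
        foreach (int item in source)
        {
            int copy = item;
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToList();
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };
        Console.WriteLine(string.Join(" ", SharedVariable(data)));       // 3 3 3
        Console.WriteLine(string.Join(" ", PerIterationVariable(data))); // 1 2 3
    }
}
```

Copying into a local inside the loop body is the pattern that behaves the same in C# 3, 4 and 5.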
In your loop:
Number number = null;
IEnumerator<Number> enumerator = numbers.GetEnumerator();
while (enumerator.MoveNext())
{
number = enumerator.Current;
actions.Add(() => WriteNumber(number));
}
number is declared outside of the loop scope. So when it is set to the enumerator's next current element, all your actions' references to number see the latest value too. When you run the actions, they will all use the last number.
Thanks for all your answers. But I think I was misunderstood. I WANT the closure to work. That's why I declared the loop variable outside the loop's scope. The question is: why does it not work in the first case? I forgot to mention that I use C# 3.5 (not C# 5.0). So the loop variable should be defined outside the loop's scope and the two code blocks should work identically.
Related
Given an array of int numbers like:
int[] arr = new int[] { 0, 1, 2, 3, 4, 5 };
If we want to increment every number by 1, the best choice would be:
for(int i = 0; i < arr.Length; i++)
{
arr[i]++;
}
If we try to do it using foreach
foreach(int n in arr)
{
n++;
}
as expected, we meet the error:
Cannot assign to 'n' because it is a 'foreach iteration variable'
Why if we use this approach:
Array.ForEach(arr, (n) => {
n++;
});
which is equivalent to the foreach above, do Visual Studio and the compiler not complain at all? The code compiles, produces no result at runtime, and does not throw an exception either.
foreach(int n in arr)
{
n++;
}
This is a language construct; the compiler knows exactly what a foreach loop is supposed to do and what n is. It can therefore prevent you from changing the iteration variable n.
Array.ForEach(arr, (n) => {
n++;
});
This is a regular function call passing in a lambda. It is perfectly valid to modify local variables in a function (or lambda), so changing n is okay. While the compiler could warn you that the increment has no effect as it's never been used afterwards, it's valid code, and just because the function is called ForEach and actually does something similar to the foreach-loop doesn't change the fact that this is a regular function and a regular lambda.
As pointed out by @tkausl, n with ForEach is a local variable. Therefore:
static void Main()
{
int[] arr = new int[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(string.Join(" ",arr));
Array.ForEach(arr, (n) => {
n++;
});
Console.WriteLine(string.Join(" ",arr));
}
will output:
0 1 2 3 4 5
0 1 2 3 4 5
Meaning you don't change the values of arr.
Array.ForEach is not identical to a foreach loop. It's a static method on System.Array (not an extension method) which iterates the array and performs an action on each of its elements.
Array.ForEach(arr, (n) => {
n++;
});
however won't modify the actual collection; it just assigns a new value to n, which has no relation to the underlying value in the array, because it's a value type which is *copied* into the anonymous method. So whatever you do with the parameter in your anonymous method isn't reflected back to the ForEach method and thus has no effect on your array. This is why you can do this.
The same holds even for an array of reference types: the code compiles, but assigning a new instance to the provided parameter again has no effect on the underlying array.
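A small sketch of the three cases, under the assumption of a hypothetical Box class (not from the question). It shows that reassigning the lambda parameter never reaches the array, while changing a member of the object the parameter refers to does:

```csharp
using System;
using System.Linq;

// Hypothetical reference type used only for this illustration.
public class Box
{
    public int Value;
    public Box(int value) { Value = value; }
}

public static class ForEachDemo
{
    public static int[] ReassignValueParameter(int[] values)
    {
        // n is a copy of each element; incrementing it leaves the array untouched.
        Array.ForEach(values, n => { n++; });
        return values;
    }

    public static Box[] ReassignReferenceParameter(Box[] boxes)
    {
        // b is a copy of the reference; pointing it at a new object does not change the array slot.
        Array.ForEach(boxes, b => { b = new Box(b.Value + 1); });
        return boxes;
    }

    public static Box[] MutateThroughReference(Box[] boxes)
    {
        // The copied reference still points at the same object, so this change is visible.
        Array.ForEach(boxes, b => { b.Value++; });
        return boxes;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", ReassignValueParameter(new[] { 0, 1, 2 })));
        // 0 1 2
        var reassigned = ReassignReferenceParameter(new[] { new Box(0), new Box(1), new Box(2) });
        Console.WriteLine(string.Join(" ", reassigned.Select(b => b.Value)));
        // 0 1 2
        var mutated = MutateThroughReference(new[] { new Box(0), new Box(1), new Box(2) });
        Console.WriteLine(string.Join(" ", mutated.Select(b => b.Value)));
        // 1 2 3
    }
}
```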
Take a look at this simplified example:
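class MyClass
{
    private readonly List<Item> myList = new List<Item>();

    void ForEach(Action<Item> a)
    {
        // Each element is handed to the delegate by value.
        foreach (var e in myList)
            a(e);
    }
}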
In your case the action looks like this:
x => x++
which simply assigns a new value to x. As x however is passed by value, this won't have any effect on the calling method and thus on myList.
Both are two different things.
First we need to be clear about what we need. If the requirement is to mutate the existing values, you can use a for loop; the values should not be modified while the collection is being enumerated, which is why you get the error for the first foreach loop.
So one approach could be if mutating is the intention:
for(int i=0; i< arr.Length; i++)
{
arr[i] = arr[i] +1;
}
Secondly, if the intention is to get a new collection with the updated values, consider using the LINQ Select method, which returns a new sequence of int (call ToArray() to materialize it as an array).
var incrementedArray = arr.Select(x => x + 1).ToArray();
EDIT:
The key difference is that in the first example we are modifying the values of the collection while enumerating it, whereas Array.ForEach with a lambda uses a delegate that receives each element as a local variable.
The foreach statement executes a statement or a block of statements for each element in an instance of a type that implements the System.Collections.IEnumerable or System.Collections.Generic.IEnumerable<T> interface. You cannot modify the iterated value because the iteration variable is read-only: on each iteration it receives a copy of the enumerator's Current value.
If you want to modify the values in place, C# 7.3 and later allow a ref iteration variable when the enumerator returns its elements by reference, for example over a Span<int>:
foreach(ref int n in arr.AsSpan())
{
n++;
}
Updated
Array.ForEach is a method that performs the specified action on each element of the specified array. It executes immediately and can be applied only to data held in memory. Array.ForEach takes an array and uses a for loop to iterate through it.
foreach and Array.ForEach look the same but work differently.
I have this class:
public class SimHasher {
int count = 0;
//take each string and make an int[] out of it
//should call Hash method lines.Count() times
public IEnumerable<int[]> HashAll(IEnumerable<string> lines) {
//return lines.Select(il => Hash(il));
var linesCount = lines.Count();
var hashes = new int[linesCount][];
for (var i = 0; i < linesCount; ++i) {
hashes[i] = Hash(lines.ElementAt(i));
}
return hashes;
}
public int[] Hash(string line) {
Debug.WriteLine(++count);
//stuff
}
}
When I run a program that calls HashAll and passes it an IEnumerable<string> with 1000 elements, it acts as expected: loops 1000 times, writing numbers from 1 to 1000 in the debug console with the program finishing in under 1 second. However if I replace the code of the HashAll method with the LINQ statement, like so:
public IEnumerable<int[]> HashAll(IEnumerable<string> lines) {
return lines.Select(il => Hash(il));
}
the behavior seems to depend on where HashAll gets called from.
If I call it from this test method
[Fact]
public void SprutSequentialIntegrationTest() {
var inputContainer = new InputContainer(new string[] {
@"D:\Solutions\SimHash\SimHashTests\R.in"
});
var simHasher = new SimHasher();
var documentSimHashes = simHasher.HashAll(inputContainer.InputLines); //right here
var queryRunner = new QueryRunner(documentSimHashes);
var queryResults = queryRunner.RunAllQueries
(inputContainer.Queries);
var expectedQueryResults = System.IO.File.ReadAllLines(
@"D:\Solutions\SimHash\SimHashTests\R.out")
.Select(eqr => int.Parse(eqr));
Assert.Equal(expectedQueryResults, queryResults);
}
the counter in the debug console reaches around 13,000, even though there are only 1000 input lines. It also takes around 6 seconds to finish, but still manages to produce the same results as the loop version.
If I run it from the Main method like so
static void Main(string[] args) {
var inputContainer = new InputContainer(args);
var simHasher = new SimHasher();
var documentSimHashes = simHasher.HashAll(inputContainer.InputLines);
var queryRunner = new QueryRunner(documentSimHashes);
var queryResults = queryRunner.RunAllQueries
(inputContainer.Queries);
foreach (var queryResult in queryResults) {
Console.WriteLine(queryResult);
}
}
it starts writing to the output console right away, although very slowly, while the counter in the debug console goes into the tens of thousands. When I try to debug it line by line, it goes straight to the foreach loop and writes out the results one by one. After some Googling, I found out that this is due to LINQ queries being lazily evaluated. However, each time it lazily evaluates a result, the counter in the debug console increases by more than 1000, which is even more than the number of input lines.
What is causing so many calls to the Hash method? Can it be deduced from these snippets?
The reason why you get more iterations than you would expect is that there are LINQ calls that iterate the IEnumerable<T> multiple times.
When you call Count() on an IEnumerable<T>, LINQ tries to see if there is a Count or Length to avoid iterating, but when there is no shortcut, it iterates IEnumerable<T> all the way to the end.
Similarly, when you call ElementAt(i), LINQ tries to see if there is an indexer, but generally it iterates the collection up to position i. This makes your loop O(n²).
You can easily fix your problem by storing your IEnumerable<T> in a list or an array by calling ToList() or ToArray(). This would iterate through IEnumerable<T> once, and then use Count and indexes to avoid further iterations.
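The effect of enumerating a lazy query more than once can be sketched as follows. This is an illustration under assumptions, not the asker's code: Project stands in for Hash, and LazyEvaluationDemo and its method names are hypothetical. Sum() is used only as a simple way to force a full enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyEvaluationDemo
{
    // Counts how often the projection runs (plays the role of the Hash counter).
    public static int Calls;

    static int Project(string line)
    {
        Calls++;
        return line.Length;
    }

    // Enumerates the lazy query twice; the projection runs again on every pass.
    public static int CallsWhenLazy(IEnumerable<string> lines)
    {
        Calls = 0;
        IEnumerable<int> query = lines.Select(l => Project(l));
        query.Sum();
        query.Sum();
        return Calls;
    }

    // Materializes once with ToList(); later passes read the stored results.
    public static int CallsWhenMaterialized(IEnumerable<string> lines)
    {
        Calls = 0;
        List<int> results = lines.Select(l => Project(l)).ToList();
        results.Sum();
        results.Sum();
        return Calls;
    }

    public static void Main()
    {
        var lines = new[] { "a", "bb", "ccc" };
        Console.WriteLine(CallsWhenLazy(lines));         // 6
        Console.WriteLine(CallsWhenMaterialized(lines)); // 3
    }
}
```

If the consumer of HashAll walks the returned sequence many times, each walk would repeat every Hash call unless the result is materialized first.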
IEnumerable<T> does not allow random access.
The ElementAt() method will actually loop through the entire sequence until it reaches the N'th element.
consider the following C# program:
using System;
using System.Linq;
using System.Collections.Generic;
public class Test
{
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
int capture = i;
yield return () => Console.WriteLine(capture.ToString());
}
}
public static void Main(string[] args)
{
foreach (var a in Get()) a();
foreach (var a in Get().ToList()) a();
}
}
When compiled with the Mono compiler (e.g. Mono 2.10.2.0), it writes the following output:
0
1
1
1
This seems totally illogical to me. When directly iterating the yield function, the scope of the for loop is "correctly" (to my understanding) used. But when I store the result in a list first, every action uses the last value?!
Can I assume that this is a bug in the Mono compiler, or did I hit a mysterious corner case of C#'s lambda and yield-stuff?
BTW: When using Visual Studio compiler (and either MS.NET or mono to execute), the result is the expected 0 1 0 1
I'll give you the reason why it was 0 1 1 1:
foreach (var a in Get()) a();
Here you go into Get and it starts iterating:
i = 0 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 0 to the screen, then returns to the Get() method and continues.
i = 1 => return Console.WriteLine(i);
The yield returns with the function and executes the function, printing 1 to the screen, then returns to the Get() method and continues (only to find that it has to stop).
But now, you're not iterating over each item when it happens, you're building a list and then iterating over that list.
foreach (var a in Get().ToList()) a();
What you are doing isn't like the above: Get().ToList() returns a List or an Array (not sure which one). So now this happens:
i = 0 => return Console.WriteLine(i);
And in your Main() function, you get the following in memory:
var i = 0;
var list = new List
{
Console.WriteLine(i)
}
You go back into the Get() function:
i = 1 => return Console.WriteLine(i);
Which returns to your Main()
var i = 1;
var list = new List
{
Console.WriteLine(i),
Console.WriteLine(i)
}
And then does
foreach (var a in list) a();
Which will print out 1 1
It seems like it was ignoring that you made sure you encapsulated the value before returning the function.
@Armaron - The ToList() extension returns a List<T>, just as ToArray() returns T[], as the naming convention implies, but I think you are on the right track with your response.
This sounds like an issue with the compiler. I agree with Servy that it is probably a bug; however, have you tried the following?
public class Test
{
private static int capture = 0;
static IEnumerable<Action> Get()
{
for (int i = 0; i < 2; i++)
{
capture++;
yield return () => Console.WriteLine(capture.ToString());
}
}
}
Additionally you may want to try the static approach, perhaps this will perform a more accurate conversion as your function is static.
List<T> list = Enumerable.ToList(Get());
When calling ToList() it seems as though it is not performing a single iteration for each value but rather:
return new List<T>(Get());
The second foreach in your code does not make sense to me: I don't see why it would ever be necessary or beneficial unless you need to add or remove actions on the List object. The first makes perfect sense, since all you are doing is iterating through the object and performing the associated action. My understanding is that an integer within the scope of the static IEnumerable method is being calculated during conversion by performing the entire iteration, and the action is preserving the int as a static int due to scope. Also, keep in mind that IEnumerable is merely an interface that is implemented by List, which also implements IList, and List may contain built-in logic for the conversion.
That being said I am interested to see/hear your findings as this is an interesting post. I will definitely upvote the question. Please ask questions if anything I said needs clarification or if something is false say so, although I am confident in my usage of the yield keyword of IEnumerable but this is a unique issue.
I really don't understand Tasks and Threads well.
I have a method inside three levels of nested for that I want to run multiple times in different threads/tasks, but the variables I pass to the method go crazy, let me explain with some code:
List<int> numbers=new List<int>();
for(int a=0;a<=70;a++)
{
for(int b=0;b<=6;b++)
{
for(int c=0;b<=10;c++)
{
Task.Factory.StartNew(()=>MyMethod(numbers,a,b,c));
}
}
}
private static bool MyMethod(List<int> nums,int a,int b,int c)
{
//Really a lot of stuff here
}
This is the nesting. MyMethod really does a lot of things, like calculating the factorial of some numbers, writing to different documents, matching responses against a list of combinations and calling other small methods. It also has a return value (a boolean), but I don't care about that at the moment.
The problem is that no task reaches its end; it's as if every time the nested loops call the method it restarts, discarding the previous instances.
It also gives a divide-by-zero error, with values OVER the limits set by the for loops, for example a=71, b=7, c=11, and all variables empty (that's why it divides by zero). I really don't know how to solve it.
The problem is that you are using a variable that has been or will be modified outside your closure/lambda. You should get a warning saying "Access to modified closure".
You can fix it by putting your loop variables into locals first and use those:
namespace ConsoleApplication9
{
using System.Collections.Generic;
using System.Threading.Tasks;
class Program
{
static void Main()
{
var numbers = new List<int>();
for(int a=0;a<=70;a++)
{
for(int b=0;b<=6;b++)
{
for(int c=0;c<=10;c++)
{
var unmodifiedA = a;
var unmodifiedB = b;
var unmodifiedC = c;
Task.Factory.StartNew(() => MyMethod(numbers, unmodifiedA, unmodifiedB, unmodifiedC));
}
}
}
}
private static void MyMethod(List<int> nums, int a, int b, int c)
{
//Really a lot of stuffs here
}
}
}
Check your for statements. The innermost loop increments c but tests b<=10, so its condition never becomes false.
You then have a closure over the loop variables which is likely to be the cause of other problems.
Captured variable in a loop in C#
Why is it bad to use an iteration variable in a lambda expression
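A reduced sketch of the copy-to-local fix with tasks, using two loop levels instead of three and a trivial computation in place of MyMethod. The names TaskCaptureDemo, RunAll, localA and localB are hypothetical. Keeping the tasks in a list and waiting for them with Task.WaitAll is an addition for the sake of a complete example; whether a missing wait plays any part in the original problem is only a guess.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskCaptureDemo
{
    public static List<int> RunAll(int maxA, int maxB)
    {
        var tasks = new List<Task<int>>();
        for (int a = 0; a <= maxA; a++)
        {
            for (int b = 0; b <= maxB; b++)
            {
                // Copy the loop variables so each task captures its own values.
                int localA = a;
                int localB = b;
                tasks.Add(Task.Factory.StartNew(() => localA * 10 + localB));
            }
        }

        // Block until every task has finished before reading the results.
        Task.WaitAll(tasks.ToArray());
        return tasks.Select(t => t.Result).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", RunAll(1, 2))); // 0 1 2 10 11 12
    }
}
```

The results come back in the order the tasks were created, because they are read from the list rather than in completion order.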
Apologies if this question has been asked already, but suppose we have this code (I've run it with Mono 2.10.2 and compiled with gmcs 2.10.2.0):
using System;
public class App {
public static void Main(string[] args) {
Func<string> f = null;
var strs = new string[]{
"foo",
"bar",
"zar"
};
foreach (var str in strs) {
if ("foo".Equals(str))
f = () => str;
}
Console.WriteLine(f()); // [1]: Prints 'zar'
foreach (var str in strs) {
var localStr = str;
if ("foo".Equals(str))
f = () => localStr;
}
Console.WriteLine(f()); // [2]: Prints 'foo'
{ int i = 0;
for (string str; i < strs.Length; ++i) {
str = strs[i];
if ("foo".Equals(str))
f = () => str;
}}
Console.WriteLine(f()); // [3]: Prints 'zar'
}
}
It seems logical that [1] print the same as [3]. But to be honest, I somehow expected it to print the same as [2]. I somehow believed the implementation of [1] would be closer to [2].
Question: Could anyone please provide a reference to the specification where it tells exactly how the str variable (or perhaps even the iterator) is captured by the lambda in [1].
I guess what I am looking for is the exact implementation of the foreach loop.
You asked for a reference to the specification; the relevant location is section 8.8.4, which states that a "foreach" loop is equivalent to:
V v;
while (e.MoveNext()) {
v = (V)(T)e.Current;
embedded-statement
}
Note that the value v is declared outside the while loop, and therefore there is a single loop variable. That is then closed over by the lambda.
UPDATE
Because so many people run into this problem the C# design and compiler team changed C# 5 to have these semantics:
while (e.MoveNext()) {
V v = (V)(T)e.Current;
embedded-statement
}
Which then has the expected behaviour -- you close over a different variable every time. Technically that is a breaking change, but the number of people who depend on the weird behaviour you are experiencing is hopefully very small.
Be aware that C# 2, 3, and 4 are now incompatible with C# 5 in this regard. Also note that the change only applies to foreach, not to for loops.
See http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/ for details.
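Since the change applies only to foreach, a for loop still has a single loop variable shared by all iterations. A minimal sketch (ForLoopCaptureDemo and its method names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForLoopCaptureDemo
{
    // Every lambda captures the one loop variable i, which equals count once the loop ends.
    public static List<int> CaptureLoopVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.Select(f => f()).ToList();
    }

    // Copying i into a variable declared in the loop body gives each lambda its own variable.
    public static List<int> CaptureCopy(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", CaptureLoopVariable(3))); // 3 3 3
        Console.WriteLine(string.Join(" ", CaptureCopy(3)));         // 0 1 2
    }
}
```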
Commenter abergmeier states:
C# is the only language that has this strange behavior.
This statement is categorically false. Consider the following JavaScript:
var funcs = [];
var results = [];
for(prop in { a : 10, b : 20 })
{
funcs.push(function() { return prop; });
results.push(funcs[0]());
}
abergmeier, would you care to take a guess as to what are the contents of results?
The core difference between 1 / 3 and 2 is the lifetime of the variable which is being captured. In 1 and 3 the lambda is capturing the iteration variable str. In both for and foreach loops there is one iteration variable for the lifetime of the loop. When the lambda is executed at the end of the loop it executes with the final value: zar
In 2 you are capturing a local variable whose lifetime is a single iteration of the loop. Hence you capture the value at that time, which is "foo".
The best reference I can point you to is Eric's blog post on the subject:
http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/
The following happens in loop 1 and 3:
The current value is assigned to the variable str. It is always the same variable, just with a different value in each iteration. This variable is captured by the lambda. As the lambda is executed after the loop finishes, it has the value of the last element in your array.
The following happens in loop 2:
The current value is assigned to a new variable localStr. It is always a new variable that gets the value assigned. This new variable is captured by the lambda. Because the next iteration of the loop creates a new variable, the value of the captured variable is not changed and because of that it outputs "foo".
For the people from google
I've fixed lambda bug using this approach:
I have changed this
for(int i=0;i<9;i++)
btn.OnTap += () => { ChangeCurField(i * 2); };
to this
for(int i=0;i<9;i++)
{
int numb = i * 2;
btn.OnTap += () => { ChangeCurField(numb); };
}
This gives each lambda its own "numb" variable, and its value is computed at that moment, inside the loop iteration, rather than when the lambda is called.