How do closures in C# work when using lambda expressions? - c#

In the following tutorial: http://www.albahari.com/threading/
they say that the following code:
for (int i = 0; i < 10; i++)
new Thread (() => Console.Write (i)).Start();
is non-deterministic and can produce the following output:
0223557799
I thought that when one uses lambda expressions, the compiler creates some kind of anonymous class that captures the variables in use by creating matching members in the capturing class.
But i is a value type, so I thought it would be copied by value.
Where is my mistake?
It would be very helpful if the answer explained how closures work, how a closure holds a "pointer" to a specific int, and what code is generated in this specific case.

The key point here is that closures close over variables, not over values. As such, the value of a given variable at the time you close over it is irrelevant. What matters is the value of that variable at the time the anonymous method is invoked.
How this happens is easy enough to see when you see what the compiler transforms the closure into. It'll create something morally similar to this:
public class ClosureClass1
{
    // The captured local variable becomes a field of the closure class.
    public int i;
    public void AnonymousMethod1()
    {
        Console.WriteLine(i);
    }
}
static void Main(string[] args)
{
    ClosureClass1 closure1 = new ClosureClass1();
    // The loop now reads and increments the field, not a local.
    for (closure1.i = 0; closure1.i < 10; closure1.i++)
        new Thread(closure1.AnonymousMethod1).Start();
}
So here we can see a bit more clearly what's going on. There is one copy of the variable, and that variable has been promoted to a field of a new class instead of being a local variable. Anything that would have modified the local variable now modifies the field of this instance. We can now see why your code prints what it does: after a new thread is started, but before it actually executes, the for loop in the main thread goes back and increments the variable in the closure, a variable that the thread has not yet read.
To produce the desired result, make sure that each iteration of the loop closes over its own variable, instead of every iteration closing over a single shared one:
for (int i = 0; i < 10; i++)
{
    int copy = i;
    new Thread(() => Console.WriteLine(copy)).Start();
}
Now the copy variable is never changed after it is closed over, and our program will print out 0-9 (although in an arbitrary order, because threads can be scheduled however the OS wants).
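The difference between the two loops can be reproduced without threads, which makes the result deterministic. The following is only a minimal sketch: the class and method names (LoopCaptureDemo, SharedVariable, CopyPerIteration) are made up for illustration, and the lambdas are stored in a list and invoked after the loop instead of being handed to threads.

```csharp
using System;
using System.Collections.Generic;

static class LoopCaptureDemo
{
    // Every lambda captures the one loop variable i.
    public static List<int> SharedVariable()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            actions.Add(() => i);

        var results = new List<int>();
        foreach (var f in actions)
            results.Add(f());   // the loop has finished, so i is already 3
        return results;
    }

    // Each iteration declares its own variable, so each lambda captures a different one.
    public static List<int> CopyPerIteration()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            actions.Add(() => copy);
        }

        var results = new List<int>();
        foreach (var f in actions)
            results.Add(f());
        return results;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", SharedVariable()));    // 3,3,3
        Console.WriteLine(string.Join(",", CopyPerIteration()));  // 0,1,2
    }
}
```

With threads the first variant is worse than "always the final value": each thread reads i at whatever moment it happens to run, which is why the output in the question contains repeated and skipped numbers.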

As Albahari states, although the arguments being passed are value types, each thread captures the memory location of the variable, which leads to unexpected results.
This happens because, before the thread has had time to start, the loop has already changed the value stored in i.
To avoid that, use a temporary variable as Albahari suggests, or capture a variable only when you know it is not going to change.

The i in Console.Write(i) is evaluated when that statement is about to be executed, which happens only once the thread has been fully created, has started running, and has reached that code. By that time the loop has moved forward a few times, so i can hold any value. Closures, unlike regular functions, have visibility into the local variables of the function in which they are defined (which is what makes them useful, and also a way to shoot oneself in the foot).

Related

SSIS Script Component Increment variable

I am trying to do what seems simple to me, but can't manage to implement it. I want to increment a simple variable in a Data flow Task...
The variable is set in the ReadWriteVariables, there is no output nor input columns.
This is the end-goal (I'll avoid sharing the monstrosity my current code is) :
public class ScriptMain : UserComponent
{
public override void PostExecute()
{
base.PostExecute();
Variables.intDatasourceUpdated++;
}
}
I suppose I'm missing something (very junior with C# and .Net), so any help would be appreciated.
Edit:
I want to increment my "updated" or my "inserted" variables depending on the lookup : lookup printscreen. Here, it is always "updated".
My error is : error printscreen. Note that it says "at Variables.get_intDatasourceInserted()" but I never go to that branch here. So I commented the increment line in the "insert script" and it worked.
But then, when I'll have the "insert" case, as I currently have it deactivated, it won't increment.
"The collection of variables locked for read and write access is not available outside of PostExecute".
You are posting your "Update" snippet, but the error says you are trying to update intDatasourceInserted, which is likely accessed in your "Insert" script's PreExecute(). That isn't allowed.
In either case you'll still have an issue: each task there executes simultaneously, waiting for pipeline data, and variables don't work well between tasks inside one data flow. You'll probably need to carry the value in the data itself as it flows, or access the altered variable outside the data flow, in the control flow.
You need to declare the variable outside of row processing.
public int counter = 0;
public void main()
{
counter++;
}
And at the end set the variable to counter.
post execute...
Variables.Counter = counter;
This can usually be done by defining a new C# variable: copy the value of the SSIS variable into it, increment it, and then write it back.
// Declare user-defined variable and increase the value by 1
int variableValue = Convert.ToInt32(Dts.Variables["myVariable"].Value);
variableValue++;
// Write the new value back into the variable
Dts.Variables["myVariable"].Value = variableValue;

C# local copied variable value keep changing

I am facing this strange problem with strings.
I assigned a string like this:
string temp = DateTime.UtcNow.ToString("s");
_snapShotTime = string.Copy(temp);
//here threads started....
//while thread progressing I am passing _snapShotTime to create a directory.
//same in second threads.
But the value of the local private variable _snapShotTime keeps changing, and I don't know why. I used a local variable and copied the value into it.
Thanks
I suspect your thread uses a lambda expression (or anonymous function) which captures _snapShotTime. That would indeed allow it to be changed. It's hard to say for sure without any code though.
If this is the problem, it's typically that you're referring to a captured variable which is declared outside the loop, but changed on every iteration of a loop. You can fix this by declaring a new variable which takes a copy of the original variable inside the loop, and only using that copy variable in the lambda expression. You'll get a "new" variable inside the loop on each iteration, so you won't have problems.
Strings are immutable, they do not change unless a variable is reassigned to a new string.
We need to see more code in order to help pinpoint the problem.
Why don't you just do
_snapShotTime = DateTime.UtcNow.ToString("s");
Also, place a breakpoint on that line and see when it is being called.
When it does break, see the stack and it will clarify things.
I suspect that your threads change the value of _snapShotTime
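Since the question shows no threading code, the following is only a sketch of the suspected cause, under the assumption that _snapShotTime is a field and that the thread bodies are lambdas. The class SnapshotDemo and its methods are hypothetical. A lambda that mentions a field captures this, so it reads the field's current value when it runs; a lambda that mentions a local copy keeps seeing the copied string.

```csharp
using System;

class SnapshotDemo
{
    private string _snapShotTime = "first";

    // The lambda uses the field, so it captures 'this' and reads the field when invoked.
    public Func<string> CaptureField()
    {
        return () => _snapShotTime;
    }

    // Copy the field into a local first; the lambda captures that local, which is never reassigned.
    public Func<string> CaptureLocalCopy()
    {
        string copy = _snapShotTime;
        return () => copy;
    }

    public void Update(string value)
    {
        _snapShotTime = value;
    }

    static void Main()
    {
        var demo = new SnapshotDemo();
        Func<string> viaField = demo.CaptureField();
        Func<string> viaCopy = demo.CaptureLocalCopy();

        demo.Update("second");          // something reassigns the field later

        Console.WriteLine(viaField());  // second
        Console.WriteLine(viaCopy());   // first
    }
}
```

Note that string.Copy does not help here: the string itself never changes, it is the variable that gets pointed at a different string.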

C# lambda, local variable value not taken when you think?

Suppose we have the following code:
void AFunction()
{
foreach(AClass i in AClassCollection)
{
listOfLambdaFunctions.AddLast( () => { PrintLine(i.name); } );
}
}
void Main()
{
AFunction();
foreach( var i in listOfLambdaFunctions)
i();
}
One might think that the above code would output the same as the following:
void Main()
{
foreach(AClass i in AClassCollection)
PrintLine(i.name);
}
However, it doesn't. Instead, it prints the name of the last item in AClassCollection every time.
It appears as if the same item was being used in each lambda function. I suspect there might be some delay from when the lambda was created to when the lambda took a snapshot of the external variables used in it.
Essentially, the lambda is holding a reference to the local variable i, instead of taking a "snapshot" of i's value when the lambda was created.
To test this theory, I tried this code:
string astr = "a string";
AFunc fnc = () => { System.Diagnostics.Debug.WriteLine(astr); };
astr = "changed";
fnc();
and, surprise, it outputs changed!
I am using XNA 3.1, and whichever version of C# that comes with it.
My questions are:
What is going on?
Does the lambda function somehow store a 'reference' to the variable or something?
Is there any way around this problem?
This is a modified closure
See: similar questions like Access to Modified Closure
To work around the issue you have to store a copy of the variable inside the scope of the foreach loop:
foreach(AClass i in AClassCollection)
{
AClass anotherI= i;
listOfLambdaFunctions.AddLast( () => { PrintLine(anotherI.name); } );
}
does the lambda function somehow store a 'reference' to the variable or something?
Close. The lambda function captures the variable itself. There is no need to store a reference to a variable, and in fact, in .NET it is impossible to permanently store a reference to a variable. You just capture the entire variable. You never capture the value of the variable.
Remember, a variable is a storage location. The name "i" refers to a particular storage location, and in your case, it always refers to the same storage location.
Is there anyway around this problem?
Yes. Create a new variable every time through the loop. The closure then captures a different variable every time.
This is one of the most frequently reported problems with C#. We're considering changing the semantics of the loop variable declaration so that a new variable is created every time through the loop.
For more details on this issue see my articles on the subject:
http://ericlippert.com/2009/11/12/closing-over-the-loop-variable-considered-harmful-part-one/
what is going on? does the lambda function somehow store a 'reference' to the variable or something?
Yes, exactly that; in C#, a captured variable is the variable itself, not the value of the variable. You can usually get around this by introducing a temp variable and binding to that:
string astr = "a string";
var tmp = astr;
AFunc fnc = () => { System.Diagnostics.Debug.WriteLine(tmp); };
especially in foreach where this is notorious.
Yes, the lambda stores a reference to the variable (conceptually speaking, anyway).
A very simple workaround is this:
foreach(AClass i in AClassCollection)
{
AClass j = i;
listOfLambdaFunctions.AddLast( () => { PrintLine(j.name); } );
}
In every iteration of the foreach loop, a new j gets created, which the lambda captures.
i on the other hand, is the same variable throughout, but gets updated with every iteration (so all the lambdas end up seeing the last value)
And I agree that this is a bit surprising. :)
I've been caught by this one as well, as said by Calgary Coder, it is a modified closure. I really had trouble spotting them until I got resharper. Since it is one of the warnings that resharper watches for, I am much better at identifying them as I code.
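The change to the loop variable that is described above as being under consideration did ship: from C# 5 onward, foreach creates a fresh loop variable on every iteration, so the original code prints each name. A variable declared outside the loop (and the variable of a for loop) is still shared. The sketch below assumes a C# 5 or later compiler; the names ForeachCaptureDemo, CaptureInForeach and CaptureOuterVariable are invented for the example. On the C# 3 compiler that comes with XNA 3.1, the first method would return the last name three times as well.

```csharp
using System;
using System.Collections.Generic;

static class ForeachCaptureDemo
{
    // C# 5+: 'name' is a new variable on each iteration of foreach.
    public static List<string> CaptureInForeach(string[] names)
    {
        var getters = new List<Func<string>>();
        foreach (string name in names)
            getters.Add(() => name);

        var results = new List<string>();
        foreach (var g in getters)
            results.Add(g());
        return results;
    }

    // 'current' is declared once, outside the loop, so all lambdas share it.
    public static List<string> CaptureOuterVariable(string[] names)
    {
        var getters = new List<Func<string>>();
        string current = "";
        for (int k = 0; k < names.Length; k++)
        {
            current = names[k];
            getters.Add(() => current);
        }

        var results = new List<string>();
        foreach (var g in getters)
            results.Add(g());
        return results;
    }

    static void Main()
    {
        string[] names = { "a", "b", "c" };
        Console.WriteLine(string.Join(",", CaptureInForeach(names)));      // a,b,c
        Console.WriteLine(string.Join(",", CaptureOuterVariable(names)));  // c,c,c
    }
}
```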

Access to modified closure... but why?

Saw several similar questions here, but none of them seemed to quite be my issue...
I understand (or thought I understood) the concept of closure, and understand what would cause Resharper to complain about access to a modified closure, but in the below code I don't understand how I'm breaching closure.
Because primaryApps is declared within the context of the for loop, primary isn't going to change while I'm processing primaryApps. If I had declared primaryApps outside the for loop, then absolutely, I have closure issues. But why in the code below?
var primaries = (from row in openRequestsDataSet.AppPrimaries
select row.User).Distinct();
foreach (string primary in primaries) {
// Complains because 'primary' is accessing a modified closure
var primaryApps = openRequestsDataSet.AppPrimaries.Select(x => x.User == primary);
Is Resharper just not smart enough to figure out it's not an issue, or is there a reason closure is an issue here that I'm not seeing?
The problem is in the following statement
Because primaryApps is declared within the context of the for loop, primary isn't going to change while I'm processing primaryApps.
There is simply no way for Resharper to 100% verify this. The lambda which references the closure here is passed to a function outside the context of this loop: the AppPrimaries.Select method. This function could itself store the resulting delegate expression, execute it later, and run straight into the captured-iteration-variable issue.
Properly detecting whether or not this is possible is quite an undertaking and frankly not worth the effort. Instead ReSharper is taking the safe route and warning about the potentially dangerous capture of the iteration variable.
Because primaryApps is declared within the context of the for loop, primary isn't going to change while I'm processing primaryApps. If I had declared primaryApps outside the for loop, then absolutely, I have closure issues. But why in the code below?
Jared is right; to demonstrate why your conclusion does not follow logically from your premise, let's make a program that declares primaryApps within the context of the for loop, and still suffers from a captured loop variable problem. Easy enough to do that.
static class Extensions
{
    // A Select that stashes the selector away before delegating to LINQ.
    public static IEnumerable<bool> Select(this IEnumerable<int> items, Func<int, bool> selector)
    {
        C.list.Add(selector);
        return System.Linq.Enumerable.Select(items, selector);
    }
}
class C
{
    public static List<Func<int, bool>> list = new List<Func<int, bool>>();
    public static void M()
    {
        int[] primaries = { 10, 20, 30};
        int[] secondaries = { 11, 21, 30};
        foreach (int primary in primaries)
        {
            var primaryApps = secondaries.Select(x => x == primary);
            // do something with primaryApps
        }
        C.N();
    }
    public static void N()
    {
        // The stashed lambda is invoked after the loop has finished.
        Console.WriteLine(C.list[0](10)); // true or false?
    }
}
Where "primaryApps" is declared is completely irrelevant. The only thing that is relevant is that the closure might survive the loop, and therefore someone might invoke it later, incorrectly expecting that the variable captured in the closure was captured by value.
Resharper has no way to know that a particular implementation of Select does not stash away the selector for later; in fact, that is exactly what all of them do. How is Resharper supposed to know that they happen to stash it away in a place that won't be accessible later?
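The standard LINQ operators show the same "stash it for later" behaviour through deferred execution, with no custom extension method involved. A minimal sketch, with the class name DeferredQueryDemo and the values chosen only for illustration: Where keeps the lambda and runs it only when the result is enumerated, so a change to the captured variable made in between is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredQueryDemo
{
    public static List<int> Run()
    {
        int[] secondaries = { 11, 21, 30 };
        int primary = 10;

        // Where stores the lambda; nothing is evaluated yet.
        IEnumerable<int> larger = secondaries.Where(x => x > primary);

        primary = 25;   // modified after the query was built

        // The lambda runs now, during enumeration, and sees primary == 25.
        return larger.ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Run()));  // 30
    }
}
```

Calling ToList() (or otherwise enumerating) before the variable changes is what makes the code in the question safe in practice, but a tool cannot easily prove that this always happens.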
As far as I know Resharper generates the warning every time you access the foreach variable even if it does not really cause closure.
Yes, it's just a warning.
See:
http://devnet.jetbrains.net/thread/273042

How to accomplish scoped variable reset in C#?

One common pattern I see and use frequently in C++ is to temporarily set a variable to a new value, and then reset it when I exit that scope. In C++, this is easily accomplished with references and templated scope classes, and allows for increased safety and prevention of errors where the variable is set to a new value, then reset to an incorrect assumed initial value.
Here is a simplified example of what I mean (in C++):
void DoSomething()
{
// The following line captures GBL.counter by reference, stores its current
// value, and sets it to 1
ScopedReset<int> resetter(GBL.counter, 1);
// In this function and all below, GBL.counter will be 1
CallSomethingThatNeedsCounterOf1();
// When I hit the close brace, ~ScopedReset will be called, and it will
// reset GBL.counter to its previous value
}
Is there any way to do this in C#? I've found the hard way that I can't capture a ref parameter inside an IEnumerator or a lambda, which were my first two thoughts. I don't want to use the unsafe keyword if possible.
The first challenge to doing this in C# is dealing with non-deterministic destruction. Since C# doesn't have deterministic destructors, you need a mechanism to control scope in order to execute the reset. IDisposable helps there, and the using statement will mimic C++ deterministic destruction semantics.
The second is getting at the value you want to reset without using pointers. Lambdas and delegates can do that.
class Program
{
class ScopedReset<T> : IDisposable
{
T originalValue = default(T);
Action<T> _setter;
public ScopedReset(Func<T> getter, Action<T> setter, T v)
{
originalValue = getter();
setter(v);
_setter = setter;
}
public void Dispose()
{
_setter(originalValue);
}
}
static int counter = 0;
static void Main(string[] args)
{
counter++;
counter++;
Console.WriteLine(counter);
using (new ScopedReset<int>(() => counter, i => counter = i, 1))
Console.WriteLine(counter);
Console.WriteLine(counter);
}
}
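One property worth checking is that the reset also happens when the scoped code throws, as a C++ destructor would during stack unwinding. The sketch below repeats a ScopedReset<T> along the lines of the one above so that it compiles on its own; ScopedResetDemo, RunWithFailure and the starting value 5 are made up for the example. Because a using statement expands to try/finally, Dispose runs before the exception reaches the catch block.

```csharp
using System;

sealed class ScopedReset<T> : IDisposable
{
    private readonly T _original;
    private readonly Action<T> _setter;

    public ScopedReset(Func<T> getter, Action<T> setter, T temporary)
    {
        _original = getter();   // remember the current value
        _setter = setter;
        setter(temporary);      // install the temporary value
    }

    public void Dispose()
    {
        _setter(_original);     // restore on scope exit
    }
}

static class ScopedResetDemo
{
    static int counter = 5;

    public static int RunWithFailure()
    {
        try
        {
            using (new ScopedReset<int>(() => counter, v => counter = v, 1))
            {
                // counter is 1 here; simulate a failure inside the scope.
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose has already run by the time control gets here.
        }
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(RunWithFailure());  // 5
    }
}
```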
Can you not simply copy the reference value to a new local variable, and use this new variable throughout your method, i.e. copy value by value?
Indeed, changing it from a ref to regular value parameter will accomplish this!
I don't think you can capture a ref parameter to a local variable, and have it stay a ref - a local copy will be created.
GBL.counter is effectively an implicit, hidden parameter to CallSomethingThatNeedsCounterOf1. If you could convert it to a regular, declared parameter your problem would go away. Also, if that would result in too many parameters, a solution would be a pair of methods which set up and reset the environment so that CallSomethingThatNeedsCounterOf1() can run.
You can create a class that calls the SetUp method in its constructor and the Reset method in Dispose(). You can use this class with the using statement, to approximate the C++ behaviour. You would, however, have to create one of these classes for each scenario.
