I have the following code that creates 10 threads which in turn write out messages to the console:
for (int i = 0; i < 10; i++)
{
    Thread thread = new Thread((threadNumber) =>
    {
        for (int j = 0; j < 10; j++)
        {
            Thread.Sleep(200);
            Console.WriteLine(string.Format("Thread: {0}, Line: {1}", threadNumber, j));
        }
    });
    thread.Start(i);
}
My understanding is that ParameterizedThreadStart takes an object, and that a copy of the reference is sent to the thread. If that is the case, then since I have not made a local copy of i within each loop iteration, all new threads would point to the same memory location, meaning certain thread numbers could be 'missed out'. Having run this, though (even with a larger number of threads and longer sleep times), each value of i gets its own thread. Can anyone explain why?
You haven't applied anything deferred or "captured" in the sense of creating an anonymous function that would wrap i.
The lambda function here does not reference i anywhere and its state is completely internalized/contained so no issues here:
(threadNumber) =>
{
    for (int j = 0; j < 10; j++)
    {
        Thread.Sleep(200);
        Console.WriteLine(string.Format("Thread: {0}, Line: {1}", threadNumber, j));
    }
}
The Start call here:
thread.Start(i);
Passes i by value (i.e. copies its value, boxing it into the object parameter) because int is a "value type" and i is not captured in any kind of anonymous function. In this sense, it is passed as any normal struct would be to any normal method (because this is exactly what is happening).
If instead you had written your lambda like this, using i instead of your threadNumber:
Thread thread = new Thread(() =>
{
    for (int j = 0; j < 10; j++)
    {
        Thread.Sleep(200);
        Console.WriteLine(string.Format("Thread: {0}, Line: {1}", i, j));
    }
});
Then you would be in trouble. In this case i refers to the original variable location and is evaluated whenever the thread executes. This means a thread could see the value i had when the thread was created (unlikely, just due to processing times), a value set later on in the for loop, or the final value 10. Numbers could quite possibly be skipped or shared between threads.
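To make the safe case concrete, here is a minimal sketch of passing the counter through Thread.Start(object). The class name StartArgumentDemo, the method Run and the ConcurrentBag used to collect results are assumptions made for this illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical demo class, not from the original question.
public static class StartArgumentDemo
{
    // Starts `count` threads and hands each one the loop counter via Thread.Start(object).
    // The int is boxed at the call site, so each thread gets its own copy of the value.
    public static int[] Run(int count)
    {
        var seen = new ConcurrentBag<int>();
        var threads = new List<Thread>();
        for (int i = 0; i < count; i++)
        {
            // The lambda never mentions i; it only sees its own parameter.
            var thread = new Thread(threadNumber => seen.Add((int)threadNumber));
            thread.Start(i);
            threads.Add(thread);
        }
        foreach (var t in threads)
        {
            t.Join(); // wait for every thread instead of sleeping
        }
        return seen.OrderBy(n => n).ToArray();
    }

    public static void Main()
    {
        // Every number from 0 to 9 shows up exactly once after sorting.
        Console.WriteLine(string.Join(",", Run(10)));
    }
}
```

Sorting is only there to make the result easy to read; the threads themselves may finish in any order.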
I have this simple function, running as a task, that only prints the values of a dataset. I pass the dataset and the index from the main function. The problem is that I have populated only 2 dataset indexes, but the function always jumps one ahead, i.e. in the last iteration it tries to read index 2, which is uninitialized, hence the exception.
for (int i = 0; i < 2; i++)
{
tasks.Add(Task.Factory.StartNew(() => {
int a = i;
showNodeID(dataSet,a);
}));
}
and the function is
private static void showNodeID(DataSet[] ds, int a)
{
Console.WriteLine(a.ToString());
Console.WriteLine(ds[a].GetXml());
} //END
In the last iteration, if I print i in the loop it is 1, but if I print a in the function it is 2.
I assume you are aware of the dangers of captured counter variables in lambda closures, since you attempt to avoid the issue by assigning the counter to a locally-scoped variable. However, your assignment is happening too late – by the time the task starts and copies the value, the counter might already have been incremented by the next iteration. To properly avoid the issue, you need to copy the value before the task, not within it:
for (int i = 0; i < 2; i++)
{
int a = i;
tasks.Add(Task.Factory.StartNew(() =>
{
showNodeID(dataSet, a);
}));
}
If you just need to perform a parallel loop, you could alternatively use:
Parallel.For(0, 2, i => showNodeID(dataSet, i));
for (int i = 0; i < 10; i++)
new Thread (() => Console.Write (i)).Start();
As expected, the output of the above code is non-deterministic, because the variable i refers to the same memory location throughout the loop’s lifetime. Therefore, each thread calls Console.Write on a variable whose value may change while it is running.
However,
for (int i = 0; i < 10; i++)
{
int temp = i;
new Thread (() => Console.Write (temp)).Start();
}
Is also giving non-deterministic output! I thought the variable temp was local to each loop iteration. Therefore, each thread would capture a different memory location and there should have been no problem.
Your program should have 10 lambdas, each writing one of the digits from 0 to 9 to the console. However, there's no guarantee that the threads will execute in order.
Is also giving non-deterministic output!
No, it's not. I ran your first snippet ten times (it had repeating numbers) and the second (it had none).
So it all works fine. Just as it should.
The second code snippet should be deterministic in the sense that each thread eventually writes its temp, and all their temps will differ.
However, it does not guarantee that threads will be scheduled for execution in the order of their creation. You'll see all possible temps, but not necessarily in ascending order.
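The distinction between "every value appears once" and "values appear in order" can be checked with a small sketch. The class name PerIterationCopyDemo, the method Run and the ConcurrentQueue that records arrival order are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Hypothetical demo class, not from the original question.
public static class PerIterationCopyDemo
{
    // Returns the captured values in the order the threads happened to record them.
    public static int[] Run(int count)
    {
        var arrivals = new ConcurrentQueue<int>();
        var threads = new Thread[count];
        for (int i = 0; i < count; i++)
        {
            int temp = i; // a fresh variable per iteration, so each lambda captures its own
            threads[i] = new Thread(() => arrivals.Enqueue(temp));
            threads[i].Start();
        }
        foreach (var t in threads)
        {
            t.Join();
        }
        return arrivals.ToArray();
    }

    public static void Main()
    {
        int[] arrivals = Run(10);
        // The arrival order can differ from run to run...
        Console.WriteLine("arrival order: " + string.Join(" ", arrivals));
        // ...but the sorted values are always 0 through 9, each exactly once.
        Console.WriteLine("sorted:        " + string.Join(" ", arrivals.OrderBy(n => n)));
    }
}
```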
Here is the proof that OP is right and both pieces of his code are incorrect.
There is also a solution, with proof as well.
Note, however, that 'non-deterministic' here means that the threads receive the wrong parameter. The order is never guaranteed.
The code below examines the second piece of the OP's code and demonstrates that it works as expected.
I store each pair (thread identity, parameter) and print it afterwards to compare with the thread output, proving that the pairs don't change. I also added a random sleep of up to a few hundred milliseconds, so the for index will obviously have changed by the time each thread prints.
Dictionary<int, int> hash = new Dictionary<int, int>();
Random r = new Random(DateTime.Now.Millisecond);
for (int i = 0; i < 10; i++)
{
int temp = i;
var th = new Thread(() =>
{
Thread.Sleep(r.Next(9) * 100);
Console.WriteLine("{0} {1}",
Thread.CurrentThread.GetHashCode(), temp);
});
hash.Add(th.GetHashCode(), temp);
th.Start();
}
Thread.Sleep(1000);
Console.WriteLine();
foreach (var kvp in hash)
Console.WriteLine("{0} {1}", kvp.Key, kvp.Value);
I wrote this experiment to demonstrate to someone that accessing shared data concurrently with multiple threads was a big no-no. To my surprise, regardless of how many threads I created, I was not able to create a concurrency issue and the value always resulted in a balanced value of 0. I know that the increment operator is not thread-safe, which is why there are methods like Interlocked.Increment() and Interlocked.Decrement() (also noted here Is the ++ operator thread safe?).
If the increment/decrement operator is not thread safe, then why does the code below execute without any issues and result in the expected value?
The snippet below creates 2,000 threads: 1,000 constantly incrementing and 1,000 constantly decrementing, to ensure that the data is being accessed by multiple threads at the same time. What makes it worse is that in a normal program you would not have nearly as many threads. Yet despite the exaggerated numbers, chosen in an effort to create a concurrency issue, the value always ends up balanced at 0.
static void Main(string[] args)
{
Random random = new Random();
int value = 0;
for (int x=0; x<1000; x++)
{
Thread incThread = new Thread(() =>
{
for (int y=0; y<100; y++)
{
Console.WriteLine("Incrementing");
value++;
}
});
Thread decThread = new Thread(() =>
{
for (int z=0; z<100; z++)
{
Console.WriteLine("Decrementing");
value--;
}
});
incThread.Start();
decThread.Start();
}
Thread.Sleep(TimeSpan.FromSeconds(15));
Console.WriteLine(value);
Console.ReadLine();
}
I'm hoping someone can provide me with an explanation so that I know that all my effort into writing thread-safe software is not in vain, or perhaps this experiment is flawed in some way. I have also tried with all threads incrementing and using the ++i instead of i++. The value always results in the expected value.
You'll usually only see issues if you have two threads which are incrementing and decrementing at very close times. (There are also memory model issues, but they're separate.) That means you want them spending most of the time incrementing and decrementing, in order to give you the best chance of the operations colliding.
Currently, your threads will be spending the vast majority of the time sleeping or writing to the console. That's massively reducing the chances of collision.
Additionally, I'd note that absence of evidence is not evidence of absence - concurrency issues can indeed be hard to provoke, particularly if you happen to be running on a CPU with a strong memory model and internally-atomic increment/decrement instructions that the JIT can use. It could be that you'll never provoke the problem on your particular machine - but that the same program could fail on another machine.
IMO these loops are too short. I bet that by the time the second thread starts the first thread has already finished executing its loop and exited. Try to drastically increase the number of iterations that each thread executes. At this point you could even spawn just two threads (remove the outer loop) and it should be enough to see wrong values.
For example, with the following code I'm getting totally wrong results on my system:
static void Main(string[] args)
{
Random random = new Random();
int value = 0;
Thread incThread = new Thread(() =>
{
for (int y = 0; y < 2000000; y++)
{
value++;
}
});
Thread decThread = new Thread(() =>
{
for (int z = 0; z < 2000000; z++)
{
value--;
}
});
incThread.Start();
decThread.Start();
incThread.Join();
decThread.Join();
Console.WriteLine(value);
}
In addition to Jon Skeet's answer:
A simple test that, at least on my little dual core, shows the problem easily:
Sub Main()
Dim i As Long = 1
Dim j As Long = 1
Dim f = Sub()
While Interlocked.Read(j) < 10 * 1000 * 1000
i += 1
Interlocked.Increment(j)
End While
End Sub
Dim l As New List(Of Task)
For n = 1 To 4
l.Add(Task.Run(f))
Next
Task.WaitAll(l.ToArray)
Console.WriteLine("i={0} j={1}", i, j)
Console.ReadLine()
End Sub
i and j should both have the same final value. But they don't!
EDIT
And in case you think that C# is more clever than VB:
static void Main(string[] args)
{
long i = 1;
long j = 1;
Task[] t = new Task[4];
for (int k = 0; k < 4; k++)
{
t[k] = Task.Run(() => {
while (Interlocked.Read(ref j) < (long)(10*1000*1000))
{
i++;
Interlocked.Increment(ref j);
}});
}
Task.WaitAll(t);
Console.WriteLine("i = {0} j = {1}", i, j);
Console.ReadLine();
}
it isn't ;)
The result: on my machine, i is around 15% (percent!) lower than j. An eight-thread machine would probably make the effect even more pronounced, because the error is more likely to happen if several tasks run truly in parallel and are not just pre-empted.
The above code is flawed of course :(
If a task is pre-empted just after i++, all other tasks continue to increment i and j, so i is expected to differ from j even if "++" were atomic. There is a simple solution though:
static void Main(string[] args)
{
long i = 0;
int runs = 10*1000*1000;
Task[] t = new Task[Environment.ProcessorCount];
Stopwatch stp = Stopwatch.StartNew();
for (int k = 0; k < t.Length; k++)
{
t[k] = Task.Run(() =>
{
for (int j = 0; j < runs; j++ )
{
i++;
}
});
}
Task.WaitAll(t);
stp.Stop();
Console.WriteLine("i = {0} should be = {1} ms={2}", i, runs * t.Length, stp.ElapsedMilliseconds);
Console.ReadLine();
}
Now a task could be pre-empted somewhere in the loop statements, but that wouldn't affect i. So the only way to see an effect on i is if a task is pre-empted right at the i++ statement. And that's what was to be shown: it CAN happen, and it's more likely to happen when you have fewer but longer-running tasks.
If you write Interlocked.Increment(ref i); instead of i++, the code runs much longer (because every increment is now a synchronized, atomic operation), but i is exactly what it should be!
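A minimal sketch of that Interlocked variant follows, assuming a fixed number of tasks rather than Environment.ProcessorCount so the expected total is easy to state. The class name InterlockedDemo and the method CountWithInterlocked are hypothetical names chosen for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not from the original answer.
public static class InterlockedDemo
{
    // Runs `taskCount` tasks that each increment a shared counter `runsPerTask` times.
    public static long CountWithInterlocked(int taskCount, int runsPerTask)
    {
        long total = 0;
        var tasks = new Task[taskCount];
        for (int k = 0; k < tasks.Length; k++)
        {
            tasks[k] = Task.Run(() =>
            {
                for (int j = 0; j < runsPerTask; j++)
                {
                    // Atomic read-modify-write: no increment can be lost.
                    Interlocked.Increment(ref total);
                }
            });
        }
        Task.WaitAll(tasks);
        return total;
    }

    public static void Main()
    {
        // 4 tasks * 1,000,000 increments each
        Console.WriteLine(CountWithInterlocked(4, 1000000)); // prints 4000000
    }
}
```

Replacing Interlocked.Increment(ref total) with total++ in this sketch may produce a smaller number, though, as noted earlier, a given machine is not guaranteed to show the loss on every run.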
I have a very strange problem with my code. It will fully run the first for loop, then complete the foreach, but then it will skip back to the "ThreadStart IMAPDelegate" (line 1 of the for loop) and then crash because of an ArgumentOutOfRangeException. Can someone explain why the program is doing this? I debugged it line by line and it literally just skips back up into a line in the for loop. If it had run the for loop again normally, it would have set x back to 0 and it would not have crashed. Any suggestions?
for (int x = 0; x < UserInfo.Count; x++)
{
ThreadStart IMAPDelegate = delegate{SendParams(UserInfo[x], IMAPServers[x]); };
MyThreads.Add(new Thread(IMAPDelegate));
}
foreach (Thread thread in MyThreads)
{
thread.Start();
}
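This is by design when you use an anonymous method like that: it captures the variable x itself, not the value x had when the delegate was created. As soon as the thread starts running, it executes the SendParams() method call, which then bombs because the "x" variable has already been incremented to UserInfo.Count, one past the last valid index. Fix: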
for (int x = 0; x < UserInfo.Count; x++)
{
int user = x;
ThreadStart IMAPDelegate = delegate{SendParams(UserInfo[user], IMAPServers[user]); };
MyThreads.Add(new Thread(IMAPDelegate));
}
Does anyone know why this code throws an out-of-range exception?
For example, if the leastAbstractions List instance has Count == 10, the loop will execute 11 times, finishing with i = 10 and throwing this exception.
for (int i = 0; i < leastAbstractions.Count; i++)
{
Task.Factory.StartNew((object state) =>
{
this.Authenticate(new HighFragment(leastAbstractions[i])).Reactivate();
}, TaskCreationOptions.PreferFairness);
}
Your loop isn't actually executing 11 times - it's only executing 10 times, but i == 10 by the time some of those tasks execute.
It's the normal problem - you're capturing a loop variable in a lambda expression. Just take a copy of the counter, and capture that instead:
for (int i = 0; i < leastAbstractions.Count; i++)
{
int copy = i;
Task.Factory.StartNew((object state) =>
{
this.Authenticate(new HighFragment(leastAbstractions[copy]))
.Reactivate();
}, TaskCreationOptions.PreferFairness);
}
That way, when your task executes, you'll see the current value of the "instance" of copy that you captured - and that value never changes, unlike the value of i.
See Eric Lippert's blog posts on this: part 1; part 2.
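The capture behaviour can be seen without any threads at all. The following is a minimal single-threaded sketch; the class name ClosureDemo and both method names are made up for this illustration. It stores delegates during a for loop and only invokes them after the loop has finished.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not from the original answer.
public static class ClosureDemo
{
    // Every lambda captures the single loop variable i.
    public static int[] CaptureLoopVariable(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            readers.Add(() => i);
        }
        // The loop has ended, so i == count for every delegate.
        return readers.Select(read => read()).ToArray();
    }

    // Every lambda captures its own per-iteration copy.
    public static int[] CaptureCopy(int count)
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            readers.Add(() => copy);
        }
        return readers.Select(read => read()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureLoopVariable(3))); // prints 3,3,3
        Console.WriteLine(string.Join(",", CaptureCopy(3)));         // prints 0,1,2
    }
}
```

The first method mirrors the failing task code: by the time the delegates run, the shared counter already equals the collection size, which is exactly the out-of-range index.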