.NET 4.5 parallel processing and for loop


I am trying to create a list of tasks whose size depends on the number of processors available. I have a for loop which seems to be behaving strangely. I am aware of the concept of closures in JavaScript, and it seems like something similar could be happening here:
var tasks = new Task[Environment.ProcessorCount];
for(int x = 0; x < Environment.ProcessorCount; x ++)
{
tasks[x] = Task.Run(() => new Segment(SizeOfSegment, x * SizeOfSegment, listOfNumbers).generateNewList());
}
What I am finding is when I break on the line in the for loop, the variable x seems to be correct, so it starts at 0 and ends at 3 (number of processors is 4). But when I put the break point within the constructor for Segment, I am finding that x was actually 4 when stepping back in the Call Stack.
Any help would be greatly appreciated.

You're capturing x within your lambda expression - but you've got a single x variable which changes values across the course of the loop, so by the time your task actually runs, it may well have a different value. You need to create a copy of the variable inside the loop, creating a new "variable instance" on each iteration. Then you can capture that variable safely:
for(int x = 0; x < Environment.ProcessorCount; x ++)
{
int copy = x;
tasks[x] = Task.Run(() => new Segment(SizeOfSegment,
copy * SizeOfSegment,
listOfNumbers).generateNewList());
}
(I'd also advise you to rename generateNewList to GenerateNewList to comply with .NET naming conventions.)
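The effect can be reproduced deterministically, without any tasks or threads, by creating the delegates inside the loop and invoking them only after the loop has finished. This is a minimal sketch; the names CaptureDemo, SharedVariable and PerIterationCopy are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Every lambda captures the same loop variable x.
    public static int[] SharedVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int x = 0; x < count; x++)
        {
            funcs.Add(() => x);
        }
        // The loop has finished, so every delegate now reads x == count.
        return funcs.Select(f => f()).ToArray();
    }

    // Each lambda captures its own per-iteration variable.
    public static int[] PerIterationCopy(int count)
    {
        var funcs = new List<Func<int>>();
        for (int x = 0; x < count; x++)
        {
            int copy = x; // a new variable on every iteration
            funcs.Add(() => copy);
        }
        return funcs.Select(f => f()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", SharedVariable(4)));   // 4 4 4 4
        Console.WriteLine(string.Join(" ", PerIterationCopy(4))); // 0 1 2 3
    }
}
```

With Task.Run the delegates may start before the loop ends, so the values seen through the shared variable are less predictable than here, but the cause is the same.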

Related

Why does it take longer to access to a previously created variable than a variable just declared?

I recently ran a benchmark to see whether access times are lower for a variable declared at the end of a block of variable declarations than for one declared at the start.
Benchmark code (selected variable declared at end of block),
// Benchmark 1
for (long i = 0; i < 6000000000; i++)
{
var a1 = 0;
var b1 = 0;
var c1 = 0;
// 53 variables later...
var x2 = 0;
var y2 = 0;
var z2 = 0;
z2 = 1; // Write
var _ = z2; // Read
}
Benchmark code (selected variable declared at start of block),
// Benchmark 2
for (long i = 0; i < 6000000000; i++)
{
var a1 = 0;
var b1 = 0;
var c1 = 0;
// 53 variables later...
var x2 = 0;
var y2 = 0;
var z2 = 0;
a1 = 1; // Write
var _ = a1; // Read
}
To my surprise the results (averaged over 3 runs, excluding first build and without optimizations) are as follows,
Benchmark 1: 9,419.7 milliseconds.
Benchmark 2: 12,262 milliseconds.
As you can see accessing the "newer" variable in the above benchmark is 23.18% (2842.3 ms) faster, but why?

Normally, unused locals are deleted by optimizations in basically any optimizing compiler in the world. You are only writing to most variables. This is an easy case for deletion of their physical storage.
The relation between logical locals and their physical storage is highly complex. They might be deleted, enregistered or spilled.
So don't think that var _ = a1; actually results in a read from a1 and a write to _. It does nothing.
The JIT switches off a few optimizations in functions with many (I believe 64) local variables because some algorithms have quadratic running time in the number of locals. Maybe that's why those locals impact performance.
Try it with fewer variables and you will not be able to distinguish variations of this function from one another.
Or, try it with VC++, GCC or Clang. They all should delete the entire loop. I'd be very disappointed if they didn't.
I don't think you are measuring something relevant here. Whatever the result of your benchmark, it does not help you with real-world code. If this were an interesting case I'd look at the disassembly, but as I said I think this is irrelevant. Whatever I found would not be an interesting find.
If you want to learn what code a compiler typically generates you should probably write some simple functions and look at the generated machine code. This can be very instructional.

Trying to think in assembler/closer to hardware, it might be something like this:
In the faster version, you still have the address of the previously accessed variable z2 stored in the current register, which then directly can be used again without needing to change its contents (=recalculate the correct memory address) to do the write and read.
It could be an automatic optimization done by the interpreter/compiler.
Have you tried other variables instead of z2 for your W/R test at the end of the loop?
What happens if you use x2 or y2 or even any of the other variables in the middle?
Are the access times for all the variables other than z2 equal or do they differ as well?

ArgumentOutOfRangeException. But it should not be there

So I have a method like this.
var someCollection = _someService.GetSomeCollection(someParam);
var taskCollection = new Task<double>[someCollection.Count];
for (int i = 0; i < taskCollection.Length; i++)
{
// do some stuff on the i-th element of someCollection and taskCollection
// and start the i-th task
}
Task.WaitAll(taskCollection);
double total = 0;
for (int i = 0; i < taskCollection.Length; i++)
{
// get the result of each task and sum it in total variable
}
return total;
The problem is in the first for loop. Suppose both collections contain one element: an ArgumentOutOfRangeException is thrown, and then an AggregateException is thrown from Task.WaitAll(), because i becomes 1 (I don't know why, but it does) and the code tries to access the i-th (second) element of a collection that contains just one element.
But there is more to this. If I set a breakpoint before the first loop and step through, this does not happen: when i becomes 1 the loop ends and everything is okay.
The method above is called by an ASP.NET MVC controller action, which is itself called asynchronously (by an AJAX call), say 3 times. Out of these three calls only one executes correctly; the other two fail as described above (when no breakpoint is set). I think the problem is most probably caused by the AJAX calls, because when I hit a breakpoint it stops the other calls from executing. Can anyone suggest anything?

I suspect you're using i within the first loop, capturing it with a lambda expression or anonymous method, like this:
for (int i = 0; i < taskCollection.Length; i++)
{
taskCollection[i] = Task.Run(() => Console.WriteLine(i));
}
If that's the case, it's the variable i which is being captured - not the value of the variable for that iteration of the loop. So by the time the task actually executes, it's likely that the value of i has changed. The solution is to take a copy of the iteration variable within the loop, in a separate "new" variable, and capture that in the anonymous function instead:
for (int i = 0; i < taskCollection.Length; i++)
{
int copy = i;
taskCollection[i] = Task.Run(() => Console.WriteLine(copy));
}
That way, each task captures a separate variable, whose value never changes.
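Applied to the shape of the method in the question, the fix might look like the sketch below. The values array and the squaring are placeholders standing in for someCollection and the real per-element work, which the question does not show. If the lambda read values[i] instead of values[copy], a task that started after the loop ended would see i equal to values.Length and index past the end, which matches the symptom described.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskSumDemo
{
    // values is a hypothetical stand-in for the asker's someCollection.
    public static double SumOfSquares(double[] values)
    {
        var tasks = new Task<double>[values.Length];
        for (int i = 0; i < tasks.Length; i++)
        {
            int copy = i; // fresh variable per iteration, never modified afterwards
            tasks[i] = Task.Run(() => values[copy] * values[copy]);
        }

        Task.WaitAll(tasks);

        // Sum the results in task order.
        return tasks.Sum(t => t.Result);
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(new[] { 1.0, 2.0, 3.0 })); // 14
    }
}
```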

Confusing thread behaviour [duplicate]

This question already has answers here:
creating new threads in a loop
(2 answers)
Closed 9 years ago.
In C#, if I execute
for (int i = 0;i < 10;i++)
new Thread(() => Console.Write(i)).Start();
I will possibly get 0223557799. That's strange: since i is an int, I think it should be copied before the thread starts.

Closures are your problem here.
Basically, the lambda captures the variable i itself rather than the value i had when the lambda was created (in the loop), so it reads i only when it needs it. By the time the thread actually runs, the loop has usually already moved on and changed i. The loop does not necessarily finish before a given thread runs, but it often gets through a few more iterations.
Here's a fix:
for (int i = 0; i < 10; i++)
{
var n = i;
new Thread(() => Console.Write(n)).Start();
}

Because Start() returns immediately, i++ happens before the thread gets a chance to print i to the console. I believe that a workaround is to create a local copy of the int, then print that:
for (int i = 0;i < 10;i++) {
int j = i;
new Thread(() => Console.Write(j)).Start();
}

What basically is happening is this:
You want to start a thread that prints the value of i.
The thread starts.
The code running in the thread gets the value of i. Note that the value of i may have changed by now.
The value of i gets printed, but there is no guarantee of a logical output.
Copy the value of i into another variable first and then print that value. The other answers provide enough sample code.

Your lambda will be translated into a method on a compiler-generated context class, which holds the variable i as a field that both the loop and the lambda refer to.
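Written out by hand, that translation looks roughly like the following. This is only an approximation for illustration: LoopContext and Read are made-up names, and the class the compiler really generates is named and laid out differently. The point is that there is a single context object, so every delegate reads the same field.

```csharp
using System;

public static class ClosureLoweringDemo
{
    // Hand-written stand-in for the compiler-generated context class.
    private sealed class LoopContext
    {
        public int i;                   // the captured variable lives here as a field

        public int Read() { return i; } // plays the role of the lambda () => i
    }

    public static int[] Run(int count)
    {
        var context = new LoopContext(); // one instance for the whole loop
        var readers = new Func<int>[count];

        // The loop itself now reads and writes context.i.
        for (context.i = 0; context.i < count; context.i++)
        {
            readers[context.i] = context.Read; // every delegate shares the same context
        }

        var results = new int[count];
        for (int k = 0; k < count; k++)
        {
            results[k] = readers[k](); // all of them see the final value of context.i
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Run(3))); // 3 3 3
    }
}
```

Declaring a copy inside the loop body corresponds to creating a new context object on every iteration, which is why each delegate then keeps its own value.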

I would use the built-in .NET parallelism support (Task Parallelism).
You won't have to worry about managing the threads; it's done for you.
For example, here is your code converted to use the parallelism libraries:
Parallel.For(0, 10, (i, state) =>
{
Console.WriteLine(i);
});

Why c# doesn't preserve the context for an anonymous delegate calls?

I have the following method:
static Random rr = new Random();
static void DoAction(Action a)
{
ThreadPool.QueueUserWorkItem(par =>
{
Thread.Sleep(rr.Next(200));
a.Invoke();
});
}
now I call this in a for loop like this:
for (int i = 0; i < 10; i++)
{
var x = i;
DoAction(() =>
{
Console.WriteLine(i); // scenario 1
//Console.WriteLine(x); // scenario 2
});
}
in scenario 1 the output is: 10 10 10 10 ... 10
in scenario 2 the output is: 2 6 5 8 4 ... 0 (random permutation of 0 to 9)
How do you explain this? Is c# not supposed to preserve variables (here i) for the anonymous delegate call?

The problem here is that there is one i variable and ten instances / copies of x. Each lambda gets a reference to the single variable i and to one of the instances of x. Every x is written to only once, so each lambda sees the one value that was written to the x it references.
The variable i is written to until it reaches 10. None of the lambdas run until the loop completes, so they all see the final value of i, which is 10.
I find this example is a bit clearer if you rewrite it as follows
int i = 0; // Single i for every iteration of the loop
while (i < 10) {
int x = i; // New x for every iteration of the loop
DoAction(() => {
Console.WriteLine(i);
Console.WriteLine(x);
});
i++;
};

DoAction spawns the thread, and returns right away. By the time the thread awakens from its random sleep, the loop will be finished, and the value of i will have advanced all the way to 10. The value of x, on the other hand, is captured and frozen before the call, so you will get all values from 0 to 9 in a random order, depending on how long each thread gets to sleep based on your random number generator.

I think you'll get the same result with java or any Object oriented Language (not sure but here it seems logical).
The scope of i is for the whole loop and the scope of x is for each occurrence.
ReSharper helps you to spot this kind of problem.

Speed: making a new variable or setting the variable to 0?

I'm making a small game, and this has a lot of loops, which all use a certain variable adjacentSquares. After every loop however, this should be set to 0. What would be faster, creating this variable again every time or just setting it to 0? Is there maybe a certain 'exotic' approach, that will perform even better?
The associated (unfinished) code:
void Update ()
{
int adjacentSquares = 0;
for (int x = 0; x <= gridX; x++)
{
for (int y = 0; y <= gridY; y++)
{
if (grid[x - 1,y - 1] == true)
adjacentSquares += 1;
//and some more logic
}
}
}

Why not experiment and measure the time elapsed using the System.Diagnostics.Stopwatch class? http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx
Set up a Stopwatch object before that loop and then measure elapsed time after it. Then, report back with your findings :D
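A minimal sketch of such a measurement is shown below. The two loop bodies are simplified placeholders (the real game logic is not shown in the question), and the method names are made up for illustration. Timings depend on the machine, the build configuration and the JIT, so no particular result should be expected.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchDemo
{
    // Variant 1: declare the variable once and reset it to 0 on every iteration.
    public static long ResetEachIteration(int iterations)
    {
        long total = 0;
        int adjacentSquares = 0;
        for (int i = 0; i < iterations; i++)
        {
            adjacentSquares = 0;
            adjacentSquares += i & 1; // placeholder for the real logic
            total += adjacentSquares;
        }
        return total;
    }

    // Variant 2: declare a new variable inside the loop body.
    public static long DeclareEachIteration(int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            int adjacentSquares = 0;
            adjacentSquares += i & 1; // placeholder for the real logic
            total += adjacentSquares;
        }
        return total;
    }

    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 100000000;

        // Warm up so JIT compilation is not included in the timings.
        ResetEachIteration(1000);
        DeclareEachIteration(1000);

        TimeSpan reset = Measure(() => ResetEachIteration(iterations));
        TimeSpan fresh = Measure(() => DeclareEachIteration(iterations));

        Console.WriteLine("reset: {0} ms, new variable: {1} ms",
            reset.TotalMilliseconds, fresh.TotalMilliseconds);
    }
}
```

Run it as a release build and repeat the measurement several times before drawing conclusions.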

The real answer here is: try it out and see!
But I would not expect there to be a difference in speed. If anything, your stack will use 4 bytes more memory (per variable), but even that is not guaranteed. There's a good chance that (if there is a performance benefit here) either the C# compiler or the JIT compiler will recognize that the first variable is no longer used, so it will simply reuse that same memory for the subsequent variables. But I'll echo what I said before: run some tests; that's the only true answer to your question.

If you really want to improve performance here you could look at doing a parallel solution here, depending on if each individual calculation relies on all the previous ones you have done.
You can probably even do this with LINQ depending on the "some more logic" you are doing.
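As a rough idea of what a LINQ version could look like, the sketch below only counts the cells that are set. It assumes grid is a bool[,] and ignores the "some more logic" part, which is not shown in the question; the name GridCountDemo is made up for illustration.

```csharp
using System;
using System.Linq;

public static class GridCountDemo
{
    // Counts how many cells of the grid are true.
    public static int CountSet(bool[,] grid)
    {
        // A multidimensional array only implements the non-generic IEnumerable,
        // so Cast<bool>() is needed before the LINQ operators can be used.
        return grid.Cast<bool>().Count(cell => cell);
    }

    public static void Main()
    {
        var grid = new bool[2, 3];
        grid[0, 1] = true;
        grid[1, 2] = true;

        Console.WriteLine(CountSet(grid)); // 2
    }
}
```

Whether this is faster than the nested loops is something to measure rather than assume; LINQ is usually chosen for readability, not raw speed.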

To improve performance a little bit more:
void Update ()
{
int adjacentSquares = 0;
for (int x = -1; x < gridX; x++)
{
for (int y = -1; y < gridY; y++)
{
if (grid[x, y])
adjacentSquares++;
//and some more logic
}
}
}
I don't know exactly why you need to start from -1 (0 - 1), but if you have to, then put the offset in the for statement instead of recomputing it on every iteration.
