I've been learning how to use the thread pool, but I'm not sure that each of the threads in the pool is being executed properly, and I suspect some are being executed more than once. I've cut the code down to the bare minimum and have been using Debug.WriteLine to try to work out what is going on, but this produces some odd results.
My code is as follows (based on code from "WaitAll for multiple handles on a STA thread is not supported"):
public void ThreadCheck()
{
string[] files;
classImport Import;
CountdownEvent done = new CountdownEvent(1);
ManualResetEvent[] doneEvents = new ManualResetEvent[10];
try
{
files = Directory.GetFiles(importDirectory, "*.ZIP");
for (int j = 0; j < doneEvents.Length; j++)
{
done.AddCount();
Import = new classImport(j, files[j], workingDirectory + @"\" + j.ToString(), doneEvents[j]);
ThreadPool.QueueUserWorkItem(
(state) =>
{
try
{
Import.ThreadPoolCallBack(state);
Debug.WriteLine("Thread " + j.ToString() + " started");
}
finally
{
done.Signal();
}
}, j);
}
done.Signal();
done.Wait();
}
catch (Exception ex)
{
Debug.WriteLine("Error in ThreadCheck():\n" + ex.ToString());
}
}
The classImport.ThreadPoolCallBack doesn't actually do anything at the minute.
If I step through the code manually I get:
Thread 1 started
Thread 2 started
.... all the way to ....
Thread 10 started
However, if I run it normally (without stepping through), the Output window is filled with "Thread 10 started".
My question is: is there something wrong with the way my code uses the thread pool, or are the Debug.WriteLine results being confused by the multiple threads?
The problem is that you're using the loop variable (j) within a lambda expression.
The details of why this is a problem are quite longwinded - see Eric Lippert's blog post for details (also read part 2).
Fortunately the fix is simple: just create a new local variable inside the loop and use that within the lambda expression:
for (int j = 0; j < doneEvents.Length; j++)
{
int localCopyOfJ = j;
... use localCopyOfJ within the lambda ...
}
For the rest of the loop body it's fine to use just j - it's only when it's captured by a lambda expression or anonymous method that it becomes a problem.
This is a common issue which trips up a lot of people - the C# team have considered changes to the behaviour for the foreach loop (where it really looks like you're already declaring a separate variable on each iteration), but it would cause interesting compatibility issues. (You could write C# 5 code which works fine, and with C# 4 it might compile fine but really be broken, for example.)
Essentially the local variable j you've got there is captured by the lambda expression, resulting in the old modified closure problem. You'll have to read that post to get a broad understanding of the issue, but I can speak about some specifics in this context.
It might appear as though each thread-pool task is seeing its own "version" of j, but it isn't. In other words, subsequent mutations to j after a task has been created are visible to the task.
When you step through your code slowly, the thread pool executes each task before the variable has an opportunity to change, which is why you get the expected result (one value for the variable is effectively "associated" with one task). In production, this isn't the case. It appears that for your specific test run, the loop completed before any of the tasks had an opportunity to run. This is why all of the tasks happened to see the same "last" value for j. (Given the time it takes to schedule a job on the thread pool, I would imagine this output to be typical.) But this isn't guaranteed by any means; you could see pretty much any output, depending on the particular timing characteristics of the environment you're running this code on.
Fortunately, the fix is simple:
for (int j = 0; j < doneEvents.Length; j++)
{
int jCopy = j;
// work with jCopy instead of j
}
Now, each task will "own" a particular value of the loop-variable.
The problem is that j is a captured variable, so the same captured variable (not a copy of its value) is being used by every lambda expression.
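For anyone who wants to try the local-copy fix in isolation, here is a minimal console sketch. The class and method names (ClosureCaptureDemo, RunWithLocalCopy) are made up for illustration and are not part of the question's code; the work item just records the index it saw instead of calling classImport.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public static class ClosureCaptureDemo
{
    // Queues `count` work items and returns the indices they observed, sorted.
    public static int[] RunWithLocalCopy(int count)
    {
        var seen = new ConcurrentBag<int>();
        using (var done = new CountdownEvent(count))
        {
            for (int j = 0; j < count; j++)
            {
                // New variable per iteration, so each lambda captures its own copy.
                int localJ = j;
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { seen.Add(localJ); }
                    finally { done.Signal(); }
                });
            }
            done.Wait(); // block until every work item has signalled
        }
        return seen.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        // Each index appears exactly once: 0,1,2,3,4,5,6,7,8,9
        Console.WriteLine(string.Join(",", RunWithLocalCopy(10)));
    }
}
```

If you replace localJ with j inside the lambda, you should see repeated values (often all equal to count), depending on timing.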
I tried to create a multithreaded dice rolling simulation, just for curiosity, the joy of multithreaded programming and to show others the effects of "random results" (many people can't understand that if you roll a fair (Laplace) die six times in a row and you have already had 1, 2, 3, 4, 5, the next roll is NOT bound to be a 6). To show them the distribution of n rolls with m dice I created this code.
Well, the result is fine, BUT even though I create a new task for each die, the program runs single-threaded.
Multithreading would be reasonable for simulating "millions" of rerolls with 6 or more dice, as the time to finish grows rapidly.
I read several examples from MSDN that all indicate that there should be several tasks running simultaneously.
Can someone please give me a hint as to why this code does not utilize many threads / cores? (Not even when I try to run it for 400 dice at once and 10 million rerolls.)
First I initialize the jagged array that stores the results. First dimension: one entry per die; the second dimension is the distribution of eyes rolled with each die.
Next I create an array of tasks that each return an array of results (the second dimension, as described above).
Each of these arrays has 6 entries that represent each side of a six-sided Laplace die (W6). If the roll results in 1 eye, the first entry [0] is increased by 1. So you can visualize how often each value has been rolled.
Then I use a plain for-loop to start all threads. Nothing there tells it to wait for a thread before all of them are started.
At the end I wait for all of them to finish and sum up the results. It does not make any difference if I change
Task.WaitAll(tasks); to
Task.WhenAll(tasks);
Again, my question: why doesn't that code utilize more than one core of my CPU? What do I have to change?
Thanks in advance!
Here's the code:
private void buttonStart_Click(object sender, RoutedEventArgs e)
{
int tries = 1000000;
int numberofdice = 20 ;
int numberofsides = 6; // W6 = 6
var rnd = new Random();
int[][] StoreResult = new int[numberofdice][];
for (int i = 0; i < numberofdice; i++)
{
StoreResult[i] = new int[numberofsides];
}
Task<int[]>[] tasks = new Task<int[]>[numberofdice];
for (int ctr = 0; ctr < numberofdice; ctr++)
{
tasks[ctr] = Task.Run(() =>
{
int newValue = 0;
int[] StoreTemp = new int[numberofsides]; // Array that represents how often each value appeared
for (int i = 1; i <= tries; i++) // how often to roll the dice
{
newValue = rnd.Next(1, numberofsides + 1); // Roll the dice; UpperLimit for random int EXCLUDED from range
StoreTemp[newValue-1] = StoreTemp[newValue-1] + 1; //increases value corresponding to eyes on dice by +1
}
return StoreTemp;
});
StoreResult[ctr] = tasks[ctr].Result; // Summing up the individual results for each dice in an array
}
Task.WaitAll(tasks);
// do something to visualize the results - not important for the question
}
}
The issue here is tasks[ctr].Result. The .Result portion itself waits for the function to complete before storing the resulting int array into StoreResult. Instead, make a new loop after Task.WaitAll to get your results.
You may consider using a Parallel.ForEach loop instead of manually creating separate tasks for this.
As others have indicated, when you try to aggregate this you just end up waiting for each individual task to finish, so this isn't actually multi-threaded.
Very important note: The C# random number generator is not thread safe (see also this MSDN blog post for discussion on the topic). Don't share the same instance between multiple threads. From the documentation:
...Random objects are not thread safe. If your app calls Random
methods from multiple threads, you must use a synchronization object
to ensure that only one thread can access the random number generator
at a time. If you don't ensure that the Random object is accessed in a
thread-safe way, calls to methods that return random numbers return 0.
Also, just to be nit-picky, using a Task is not really the same thing as doing multithreading; while you are, in fact, doing multithreading here, it's also possible to do purely asynchronous, non-multithreaded code with async/await. This is used mostly for I/O-bound operations where it's largely pointless to create a separate thread just to wait for a result (but it's desirable to allow the calling thread to do other work while it's waiting for the result).
I don't think you should have to worry about thread safety while assigning to the main array (assuming that each thread is assigning only to a specific index in the array and that no one else is assigning to the same memory location); you only have to worry about locking when multiple threads are accessing/modifying shared mutable state at the same time. If I'm reading this correctly, this is mutable state (but it's not shared mutable state).
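Putting the points above together, a rough sketch might look like the following: start every task first, wait once, and only then read .Result, with each task owning its own Random. The names DiceSimulation, Roll and baseSeed are my own, and seeding each task with baseSeed + ctr is just one simple way to avoid sharing a Random; it is not a statistically rigorous seeding scheme.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class DiceSimulation
{
    // Returns one histogram (index 0 = one eye) per die.
    public static int[][] Roll(int numberOfDice, int numberOfSides, int tries, int baseSeed)
    {
        var tasks = new Task<int[]>[numberOfDice];
        for (int ctr = 0; ctr < numberOfDice; ctr++)
        {
            int seed = baseSeed + ctr; // local copy, captured per iteration
            tasks[ctr] = Task.Run(() =>
            {
                var rnd = new Random(seed);        // one Random per task, never shared
                var counts = new int[numberOfSides];
                for (int i = 0; i < tries; i++)
                {
                    counts[rnd.Next(numberOfSides)]++; // Next(n) yields 0..n-1
                }
                return counts;
            });
            // No .Result here: reading it inside this loop would block
            // until the task finishes and serialize the whole run.
        }

        Task.WaitAll(tasks);                           // wait once, after all tasks are started
        return tasks.Select(t => t.Result).ToArray();  // results are ready, no blocking
    }

    public static void Main()
    {
        int[][] result = Roll(20, 6, 1000000, 12345);
        Console.WriteLine(string.Join(" ", result[0]));
    }
}
```

In a button click handler you would probably want to await Task.WhenAll(tasks) from an async method rather than block the UI thread with Task.WaitAll.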
I've read a number of other questions about Access to Modified closure so I understand the basic principle. Still, I couldn't tell - does Parallel.ForEach have the same issues?
Take the following snippet where I recompute the usage stats for users for the last week as an example:
var startTime = DateTime.Now;
var endTime = DateTime.Now.AddHours(6);
for (var i = 0; i < 7; i++)
{
// this next line gives me "Access To Modified Closure"
Parallel.ForEach(allUsers, user => UpdateUsageStats(user, startTime, endTime));
// move back a day and continue the process
startTime = startTime.AddDays(-1);
endTime = endTime.AddDays(-1);
}
From what I know of this code the foreach should run my UpdateUsageStats routine right away and start/end time variables won't be updated till the next time around the loop. Is that correct or should I use local variables to make sure there aren't issues?
You are accessing a modified closure, so it does apply. But, you are not changing its value while you are using it, so assuming you are not changing the values inside UpdateUsageStats you don't have a problem here.
Parallel.ForEach waits for the execution to end, and only then are you changing the values in startTime and endTime.
"Access to modified closure" only leads to problems if the capture scope leaves the loop in which the capture takes place and is used elsewhere. For example,
var list = new List<Action>();
for (var i = 0; i < 7; i++)
{
list.Add(() => Console.WriteLine(i));
}
list.ForEach(a => a()); // prints "7" 7 times, because `i` was captured inside the loop
In your case the lambda doing the capture doesn't leave the loop (the Parallel.ForEach call is executed completely within the loop, each time around).
You still get the warning because the compiler doesn't know whether or not Parallel.ForEach is causing the lambda to be stored for later invocation. Since we know more than the compiler, we can safely ignore the warning.
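To convince yourself of this, here is a small sketch with a stand-in for UpdateUsageStats. Everything in it (ModifiedClosureDemo, Accumulate, the weight variable) is hypothetical; weight plays the role of startTime/endTime, in that it is captured by the lambda and modified between iterations.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ModifiedClosureDemo
{
    // For each day i, adds user * weight into totals[i] in parallel.
    public static long[] Accumulate(int[] users, int days)
    {
        var totals = new long[days];
        int weight = 1; // captured by the lambda and modified after each pass
        for (int i = 0; i < days; i++)
        {
            // Parallel.ForEach does not return until every user is processed,
            // so i and weight cannot change while the lambda is running.
            Parallel.ForEach(users, user =>
                Interlocked.Add(ref totals[i], (long)user * weight));
            weight++;
        }
        return totals;
    }

    public static void Main()
    {
        // prints 6,12,18
        Console.WriteLine(string.Join(",", Accumulate(new[] { 1, 2, 3 }, 3)));
    }
}
```

If the tool warning bothers you, copying the values into locals declared inside the loop body silences it without changing the behaviour.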
I am using Visual C# 2013 to create a WinForms application. I have a loop which takes up to a minute to complete because a large calculation has to be carried out on many rows of data within a table, so while the user waits I display a 'loading...' form.
On this form I wish to display a count down of the number of rows, so the user can see how many rows of the data there are left to be calculated, however the label with this number on will not update as everything 'freezes' until the loop has finished.
System.Windows.Forms.Form f = System.Windows.Forms.Application.OpenForms["LoadingForm"];
int DataRowsRemaining = TotalRowCount;
for (int i=0; i<=TotalRowCount; i++)
{
//CALCULATION CODE
((LoadingForm)f).label1.Text = Convert.ToString(DataRowsRemaining--);
}
This code does not allow the label to update during the loop. Calling Application.DoEvents(); after setting the label does allow it to be updated, but this also refreshes every other label on the form, which significantly slows down the calculation. I therefore think I need to have this one line of code carried out on a separate thread.
Due to my knowledge being limited on the subject, could anyone advise me whether the multi-threading technique would be the best way to solve this issue, and if so any advice on how I could code this as I have been struggling to understand online examples of multi-threading.
Thanks for your time, Aaron.
Simplest way would be
await Task.Run(() => { /* CALCULATION CODE */ });
private async void Calculation()
{
for (int i = 0; i <= TotalRowCount; i++)
{
await Task.Run(() => { /* CALCULATION CODE */ });
((LoadingForm)f).label1.Text = Convert.ToString(DataRowsRemaining--);
}
}
There is an example here in MSDN on how to use .Invoke(). However, L.B's answer is probably much simpler.
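Another option, sketched here as a console program so it can run without a form, is to run the whole loop in one Task.Run and report the remaining row count through IProgress<int>. ProgressDemo, ListProgress and ProcessRowsAsync are names I made up; ListProgress only exists so the sketch has something to report to outside of WinForms.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProgressDemo
{
    // Hypothetical IProgress<int> that simply records every report.
    public sealed class ListProgress : IProgress<int>
    {
        public readonly List<int> Reports = new List<int>();
        public void Report(int value)
        {
            lock (Reports) { Reports.Add(value); }
        }
    }

    // Runs the row loop on a thread-pool thread and reports rows remaining.
    public static Task ProcessRowsAsync(int totalRows, IProgress<int> progress)
    {
        return Task.Run(() =>
        {
            for (int i = 0; i < totalRows; i++)
            {
                // CALCULATION CODE for row i would go here
                progress.Report(totalRows - i - 1);
            }
        });
    }

    public static void Main()
    {
        var progress = new ListProgress();
        ProcessRowsAsync(5, progress).Wait();
        // prints 4,3,2,1,0
        Console.WriteLine(string.Join(",", progress.Reports));
    }
}
```

In the WinForms case you would instead create a Progress<int> on the UI thread, for example new Progress<int>(n => label1.Text = n.ToString()), and await ProcessRowsAsync from an async event handler. Progress<T> captures the synchronization context it was created on, so the callback runs on the UI thread and only that one label is touched.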
I have a program with two methods. The first method takes two arrays as parameters, and performs an operation in which values from one array are conditionally written into the other, like so:
void Blend(int[] dest, int[] src, int offset)
{
for (int i = 0; i < src.Length; i++)
{
int rdr = dest[i + offset];
dest[i + offset] = src[i] > rdr? src[i] : rdr;
}
}
The second method creates two separate sets of int arrays and iterates through them such that each array of one set is Blended with each array from the other set, like so:
void CrossBlend()
{
int[][] set1 = new int[150][75000]; // we'll pretend this actually compiles
int[][] set2 = new int[25][10000]; // we'll pretend this actually compiles
for (int i1 = 0; i1 < set1.Length; i1++)
{
for (int i2 = 0; i2 < set2.Length; i2++)
{
Blend(set1[i1], set2[i2], 0); // or any offset, doesn't matter
}
}
}
First question: Since this approach is an obvious candidate for parallelization, is it intrinsically thread-safe? It seems like no, since I can conceive of a scenario (unlikely, I think) where one thread's changes are lost because of a different thread's near-simultaneous operation.
If no, would this:
void Blend(int[] dest, int[] src, int offset)
{
lock (dest)
{
for (int i = 0; i < src.Length; i++)
{
int rdr = dest[i + offset];
dest[i + offset] = src[i] > rdr? src[i] : rdr;
}
}
}
be an effective fix?
Second question: If so, what would be the likely performance cost of using locks like this? I assume that with something like this, if a thread attempts to lock a destination array that is currently locked by another thread, the first thread would block until the lock was released instead of continuing to process something.
Also, how much time does it actually take to acquire a lock? Nanosecond scale, or worse than that? Would this be a major issue in something like this?
Third question: How would I best approach this problem in a multi-threaded way that would take advantage of multi-core processors (and this is based on the potentially wrong assumption that a multi-threaded solution would not speed up this operation on a single core processor)? I'm guessing that I would want to have one thread running per core, but I don't know if that's true.
The potential contention with CrossBlend is set1, the destination of the blend. Rather than using a lock, which is going to be expensive compared to the amount of work you are doing, arrange for each thread to work on its own destination. That is, a given destination (the array at some index in set1) is owned by a given task. This is possible since the outcome is independent of the order in which CrossBlend processes the arrays.
Each task should then run just the inner loop in CrossBlend, and the task is parameterized with the index of the dest array (set1) to use (or range of indices.)
You can also parallelize the Blend method, since each index is computed independently of the others, so there is no contention there. But on today's machines, with fewer than 40 cores, you will get sufficient parallelism just by threading the CrossBlend method.
To run effectively on multi-core you can do either of the following:
For N cores, divide the problem into N parts. Given that set1 is reasonably large compared to the number of cores, you could just divide set1 into N ranges, and pass each range of indices to one of N threads running the inner CrossBlend loop. That will give you fairly good parallelism, but it's not optimal. (Some threads will finish sooner and end up with no work to do.)
A more involved scheme is to make each iteration of the CrossBlend inner loop a separate task. Have N queues (for N cores), and distribute the tasks amongst the queues. Start N threads, with each thread reading its tasks from a queue. If a thread's queue becomes empty, it takes a task from some other thread's queue.
The second approach is best suited to irregularly sized tasks, or to cases where the system is being used for other work, so that some cores may be time-slicing between other processes and you cannot expect equal amounts of work to complete in roughly the same time on different cores.
The first approach is much simpler to code, and will give you a good level of parallelism.
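As a sketch of the "each task owns one destination" idea, the outer loop can be handed to Parallel.For, which does the range partitioning for you. This is one possible way to code it rather than the only one; the class name ParallelCrossBlend and the small arrays in Main are made up for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelCrossBlend
{
    // Same Blend as in the question: keep the larger of dest and src per element.
    public static void Blend(int[] dest, int[] src, int offset)
    {
        for (int i = 0; i < src.Length; i++)
        {
            int rdr = dest[i + offset];
            dest[i + offset] = src[i] > rdr ? src[i] : rdr;
        }
    }

    public static void CrossBlend(int[][] set1, int[][] set2)
    {
        // Each parallel iteration owns exactly one destination array (set1[i1]),
        // so no two threads ever write to the same array and no lock is needed.
        // set2 is only read, which is safe to do concurrently.
        Parallel.For(0, set1.Length, i1 =>
        {
            for (int i2 = 0; i2 < set2.Length; i2++)
            {
                Blend(set1[i1], set2[i2], 0);
            }
        });
    }

    public static void Main()
    {
        int[][] set1 = { new[] { 1, 5, 3 }, new[] { 0, 0, 0 } };
        int[][] set2 = { new[] { 2, 2 }, new[] { 4, 1 } };
        CrossBlend(set1, set2);
        Console.WriteLine(string.Join(",", set1[0])); // prints 4,5,3
        Console.WriteLine(string.Join(",", set1[1])); // prints 4,2,0
    }
}
```

Because taking the maximum is independent of the order in which the source arrays are applied, the result is the same as the sequential version.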
I am creating a project in C#/.NET. My execution is very slow, and I have found the reason: in one method I copy the values from one list to another, and that list contains more than 3000 values for every row. How can I speed up this process? Can anybody help me?
for (int i = 0; i < rectTristrip.NofStrips; i++)
{
VertexList verList = new VertexList();
verList = rectTristrip.Strip[i];
GraphicsPath rectPath4 = verList.TristripToGraphicsPath();
for (int j = 0; j < rectPath4.PointCount; j++)
{
pointList.Add(rectPath4.PathPoints[j]);
}
}
This is the code that slows down my process. rectTristrip consists of a lot of strips, each of which has more than 3000 values.
A profiler will tell you exactly how much time is spent on which lines and which are most important to optimize. Red-gate makes a very good one.
http://www.red-gate.com/products/ants_performance_profiler/index.htm
As musicfreak already mentioned, you should profile your code to get a reliable picture of what's going on. But some processes simply take time.
You can't get rid of them; they have to be done. The question is just: when are they necessary? So maybe you can move them into some initialization phase, or into another thread which computes the results for you while your GUI stays accessible to your users.
In one of my applications I run a big query against a SQL Server. This task takes a while (building the connection, sending the query, waiting for the result, putting the result into a data table, making some calculations of my own, presenting the results to the user). All of these steps are necessary and can't be made any faster. But they are done in another thread while the user sees a 'Please wait' message with a progress bar in the result window. In the meantime the user can already change other settings in the UI (if he likes). So the UI is responsive and the user has no big problem waiting a few seconds.
So this is not a real answer, but maybe it gives you some ideas on how to solve your problem.
You can split the load into a couple of worker threads, say 3 threads each dealing with 1000 elements.
You can synchronize it with AutoResetEvent
Some suggestions, even though I think the bulk of the work is in TristripToGraphicsPath():
// Use rectTristrip.Strip.Length instead of NofStrips
// to let the JIT eliminate bounds checking
// .Count if it is a list instead of array
for (int i = 0; i < rectTristrip.Strip.Length; i++)
{
VertexList verList = rectTristrip.Strip[i]; // Removed 'new'
GraphicsPath rectPath4 = verList.TristripToGraphicsPath();
// Assuming pointList is in fact a list, do this:
pointList.AddRange(rectPath4.PathPoints);
// Else do this instead (not in addition to AddRange):
// Use PathPoints.Length instead of PointCount
// to let the JIT eliminate bounds checking
for (int j = 0; j < rectPath4.PathPoints.Length; j++)
{
pointList.Add(rectPath4.PathPoints[j]);
}
}
And maybe declare VertexList verList once above the loop and just assign verList = rectTristrip.Strip[i]; inside it, so the variable isn't redeclared on every iteration.
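One more thing that may be worth checking with the profiler: my understanding is that GraphicsPath.PathPoints builds a fresh array every time the property is read, so reading it inside the inner loop would copy all the points on every iteration. Treat that as an assumption to verify rather than a given. The sketch below uses a made-up FakePath class (not System.Drawing) whose PathPoints getter copies on each access, just to show the difference between reading the property per element and reading it once.

```csharp
using System;
using System.Collections.Generic;

public static class PathCopyDemo
{
    // Hypothetical stand-in for a path type whose PathPoints getter
    // returns a new copy of the point array on every access.
    public sealed class FakePath
    {
        private readonly int[] points;
        public int CopiesMade;
        public FakePath(int[] pts) { points = pts; }
        public int PointCount { get { return points.Length; } }
        public int[] PathPoints
        {
            get { CopiesMade++; return (int[])points.Clone(); }
        }
    }

    // Mirrors the loop in the question: one array copy per element.
    public static List<int> CopyPerElement(FakePath path)
    {
        var list = new List<int>();
        for (int j = 0; j < path.PointCount; j++)
        {
            list.Add(path.PathPoints[j]);
        }
        return list;
    }

    // Reads the property once, then appends the whole array.
    public static List<int> CopyOnce(FakePath path)
    {
        var list = new List<int>();
        int[] pts = path.PathPoints; // single copy
        list.AddRange(pts);
        return list;
    }

    public static void Main()
    {
        var slow = new FakePath(new[] { 1, 2, 3, 4 });
        var fast = new FakePath(new[] { 1, 2, 3, 4 });
        CopyPerElement(slow);
        CopyOnce(fast);
        Console.WriteLine(slow.CopiesMade + " vs " + fast.CopiesMade); // prints 4 vs 1
    }
}
```

With more than 3000 points per strip, the per-element version does quadratic work per strip, which would fit the slowdown described in the question if the assumption about PathPoints holds.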