C# threading: starting new threads when one finishes, without waiting on Join - c#

I've searched all morning and I can't seem to find the answer to this question.
I have an array of Threads, each doing work; I then loop through their IDs, joining each one before starting new threads. What's the best way to detect when a thread has finished, so I can fire off a new thread without waiting for each thread in turn to finish?
EDIT added code snippet maybe this will help
if (threadCount > maxItems)
{
    threadCount = maxItems;
}
threads = new Thread[threadCount];
for (int i = 0; i < threadCount; i++)
{
    threads[i] = new Thread(delegate() { this.StartThread(); });
    threads[i].Start();
}
while (loopCounter < threadCount)
{
    if (loopCounter == (threadCount - 1))
    {
        loopCounter = 0;
    }
    if (threads[loopCounter].ThreadState == ThreadState.Stopped)
    {
        threads[loopCounter] = new Thread(delegate() { this.StartThread(); });
        threads[loopCounter].Start();
    }
}

Rather than creating a new thread each time, why not just have each thread call a function that returns the next ID (or null if there's no more data to process) when it's finished with the current one? That function will obviously have to be thread-safe, but it should reduce your overhead compared to watching for finished threads and starting new ones.
so,
void RunWorkerThreads(int threadCount) {
    for (int i = 0; i < threadCount; ++i) {
        new Thread(() => {
            while (true) {
                var nextItem = GetNextItem();
                if (nextItem == null) break;
                /* do work */
            }
        }).Start();
    }
}
T GetNextItem() {
    lock (_lockObject) {
        // return the next item
    }
}
I'd probably pull GetNextItem and "do work" out and pass them as parameters to RunWorkerThreads to make it more generic -- so it would be RunWorkerThreads<T>(int count, Func<T> getNextItem, Action<T> workDoer) -- but that's up to you.
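A rough sketch of that generic version might look like this (just an illustration of the idea, with a null item still used as the "no more work" signal):
void RunWorkerThreads<T>(int count, Func<T> getNextItem, Action<T> workDoer) where T : class
{
    var workers = new Thread[count];
    for (int i = 0; i < count; ++i)
    {
        workers[i] = new Thread(() =>
        {
            while (true)
            {
                var item = getNextItem(); // must be thread-safe (or lock internally)
                if (item == null) break;  // no more work
                workDoer(item);
            }
        });
        workers[i].Start();
    }
    foreach (var worker in workers)
        worker.Join(); // optional: wait for all workers to drain the work
}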
Note that Parallel.ForEach() does essentially this, plus gives you ways of monitoring and aborting and such, so there's probably no need to reinvent the wheel here.
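For example, something along these lines (a sketch; workItems, DoWork and threadCount are placeholders for your own collection, work method and desired concurrency):
// Parallel.ForEach (System.Threading.Tasks) partitions the collection and
// manages the worker threads for you; MaxDegreeOfParallelism caps concurrency.
Parallel.ForEach(
    workItems,
    new ParallelOptions { MaxDegreeOfParallelism = threadCount },
    item => DoWork(item));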

You can check the thread's ThreadState property and when it's Stopped you can kick off a new thread.
http://msdn.microsoft.com/en-us/library/system.threading.thread.threadstate.aspx
http://msdn.microsoft.com/en-us/library/system.threading.threadstate.aspx

Get each thread, as the last thing it does, to signal that it is done. That way there is no need to wait at all.
Even better, move to a higher level of abstraction, e.g. a thread pool, and let someone else worry about such details.
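For example, a minimal sketch of that signalling idea using the thread pool (assuming .NET 4's CountdownEvent; DoWork and workItemCount are placeholders):
var done = new CountdownEvent(workItemCount);
for (int i = 0; i < workItemCount; i++)
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            DoWork(); // the actual work
        }
        finally
        {
            done.Signal(); // "I'm finished" - no polling, no Join
        }
    });
}
done.Wait(); // the main thread blocks once, until every work item has signalled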

Related

await Task.Run taking longer than expected

The method below is supposed to run for the duration (in milliseconds) passed in for case 0:, but what I'm seeing is that the method may take up to 2 seconds to run for a 400 ms duration. Is it possible that Task.Run is taking a long time to start? If so, is there a better way?
private static async void PulseWait(int duration, int axis)
{
    await Task.Run(() =>
    {
        try
        {
            var logaction = true;
            switch (axis)
            {
                case 0:
                    var sw1 = Stopwatch.StartNew();
                    if (duration > 0) duration += 20; // allowance for the call to the mount
                    while (sw1.Elapsed.TotalMilliseconds <= duration) { } // wait out the duration
                    _isPulseGuidingRa = false;
                    logaction = false;
                    break;
                case 1:
                    var axis2Stopped = false;
                    var loopcount = 0;
                    switch (SkySettings.Mount)
                    {
                        case MountType.Simulator:
                            while (!axis2Stopped && loopcount < 30)
                            {
                                loopcount++;
                                var statusy = new CmdAxisStatus(MountQueue.NewId, Axis.Axis2);
                                var axis2Status = (AxisStatus)MountQueue.GetCommandResult(statusy).Result;
                                axis2Stopped = axis2Status.Stopped;
                                if (!axis2Stopped) Thread.Sleep(10);
                            }
                            break;
                        case MountType.SkyWatcher:
                            while (!axis2Stopped && loopcount < 30)
                            {
                                loopcount++;
                                var statusy = new SkyIsAxisFullStop(SkyQueue.NewId, AxisId.Axis2);
                                axis2Stopped = Convert.ToBoolean(SkyQueue.GetCommandResult(statusy).Result);
                                if (!axis2Stopped) Thread.Sleep(10);
                            }
                            break;
                        default:
                            throw new ArgumentOutOfRangeException();
                    }
                    _isPulseGuidingDec = false;
                    logaction = false;
                    break;
            }
            var monitorItem = new MonitorEntry
            {
                Datetime = HiResDateTime.UtcNow,
                Device = MonitorDevice.Telescope,
                Category = MonitorCategory.Mount,
                Type = MonitorType.Data,
                Method = MethodBase.GetCurrentMethod().Name,
                Thread = Thread.CurrentThread.ManagedThreadId,
                Message = $"PulseGuide={logaction}"
            };
            MonitorLog.LogToMonitor(monitorItem);
        }
        catch (Exception)
        {
            _isPulseGuidingDec = false;
            _isPulseGuidingRa = false;
        }
    });
}
Log showing the time taken...
33652,2019:07:12:01:15:35.590,13,AxisPulse,Axis1,0.00208903710815278,400,0,True <<--line just before PulseWait is called with 400ms duration
33653,2019:07:12:01:15:35.591,13,SendRequest,:I1250100
33654,2019:07:12:01:15:35.610,13,ReceiveResponse,:I1250100,=
33655,2019:07:12:01:15:36.026,13,SendRequest,:I1B70100
33656,2019:07:12:01:15:36.067,13,ReceiveResponse,:I1B70100,=
33657,2019:07:12:01:15:36.067,13,SendRequest,:j1
33658,2019:07:12:01:15:36.120,13,ReceiveResponse,:j1,=DDCDBD
33659,2019:07:12:01:15:36.120,13,SendRequest,:j2
33660,2019:07:12:01:15:36.165,13,ReceiveResponse,:j2,=67CF8A
33661,2019:07:12:01:15:36.467,13,SendRequest,:j1
33662,2019:07:12:01:15:36.484,13,ReceiveResponse,:j1,=10CEBD
33663,2019:07:12:01:15:36.484,13,SendRequest,:j2
33664,2019:07:12:01:15:36.501,13,ReceiveResponse,:j2,=67CF8A
33665,2019:07:12:01:15:36.808,13,SendRequest,:j1
33666,2019:07:12:01:15:36.842,13,ReceiveResponse,:j1,=3CCEBD
33667,2019:07:12:01:15:36.842,13,SendRequest,:j2
33668,2019:07:12:01:15:36.868,13,ReceiveResponse,:j2,=67CF8A
33669,2019:07:12:01:15:37.170,13,SendRequest,:j1
33670,2019:07:12:01:15:37.188,13,ReceiveResponse,:j1,=6BCEBD
33671,2019:07:12:01:15:37.188,13,SendRequest,:j2
33672,2019:07:12:01:15:37.204,13,ReceiveResponse,:j2,=67CF8A
33673,2019:07:12:01:15:37.221,5,b__0,PulseGuide=False <<--PulseWait finished 1.631 s after start
The purpose of async and await is to make things easy. But just like everything that makes things easy, it comes with a cost of having full control over what's going on. Here, it's really a cost of asynchronous programming in general. The point of asynchronous programming is to free up the current thread so that the current thread can go off and do something else. But if something else is done on the current thread, then the continuation of what you were doing must wait until that is done. (i.e. What comes after the await may not happen instantaneously after the task completes)
So while asynchronous programming helps overall performance (like increasing the overall throughput of a web app), it can actually hurt the performance of any one specific task. If every millisecond counts for you, you might be able to do the low-level work yourself, like creating a Thread (if this really needs to run on a separate thread).
Here is a simple example that demonstrates this:
var s = new Stopwatch();
// Test the time it takes to run an empty method on a
// different thread with Task.Run and await it.
s.Start();
await Task.Run(() => { });
s.Stop();
Console.WriteLine($"Time of Task.Run: {s.ElapsedMilliseconds}ms");
// Test the time it takes to create a new thread directly
// and wait for it.
s.Restart();
var t = new Thread(() => { });
t.Start();
t.Join();
s.Stop();
Console.WriteLine($"Time of new Thread: {s.ElapsedMilliseconds}ms");
The output will vary, but it looks something like this:
Time of Task.Run: 8ms
Time of new Thread: 0ms
In an application with lots of other things going on, that 8ms could be much more if some other operation uses the thread during the await.
That's not to say that you should use Thread: t.Join() is not an asynchronous operation. It will block the calling thread. So if PulseWait runs on the UI thread (if this is a UI app), it will block the UI thread, which is a bad user experience. In that case, you may not be able to get around the cost of using asynchronous code.
If this is not an application with a UI, then I don't see why you need to do all that on a different thread at all. Maybe you can just.... not do that.

c# - create thread in for loop (ArgumentOutOfRangeException)

I don't know how to describe this problem precisely. Let's look at my code.
for (int i = 0; i < myMT.Keys[key_indexer].Count; i++)
{
    threads.Add(new Thread(
        () =>
        {
            sounds[myMT.Keys[key_indexer][i]].PlayLooping();
        }
    ));
    threads[threads.Count - 1].Start();
}
Note: sounds is a list of SoundPlayers
The initialization of threads and myMT:
List<Thread> threads = null;
MusicTransfer myMT=null;
and in the constructor:
threads = new List<Thread>();
myMT = new MusicTransfer(bubblePanel);
The variable Keys in myMT is of type List<List<int>>. It is initialized in the same way as myMT and threads. Imagine a matrix: the outer list is a list of rows and the inner one holds each cell.
When I run the program, I set myMT.Keys[key_indexer].Count to 1. So, normally, the for loop should stop when i reaches 1.
However, it throws an ArgumentOutOfRangeException at the line sounds[myMT.Keys[key_indexer][i]].PlayLooping(). So, I used the debugger to check the value of each variable.
What I found are:
If I use "step over" check step by step, which means time is consumed quite much after the new thread runs, for loop will stop when i reaches 1, which is the way it should be.
If I click "continue" after the breakpoint triggered, the for loop is still processing after i equals 1.
the break point should always be set at the line of threads.Add(new Thread(. If it is set at the line of sounds[myMT.Keys[key_indexer][i]].PlayLooping();, the exception will be triggered even after "step over"
I guess the problem is about thread, but have no idea how to solve it.
Thanks for any help!
There are so many things wrong with your post, however maybe this will help you out a bit.
Note: make your code readable; trust me, it does wonders.
// List of threads
var threads = new List<Thread>();
// Lets stop indexing everything and make it easy for ourselves
var someList = myMT.Keys[key_indexer];
for (var i = 0; i < someList.Count; i++)
{
    // we need to create a local copy of the indexed value
    // from someList, otherwise there is no guarantee
    // the thread will have the right index when it needs it
    // (thank me later)
    var soundIndex = someList[i];
    // create a thread and your callback
    var thread = new Thread(() => sounds[soundIndex].PlayLooping());
    // add thread to the list
    threads.Add(thread);
}
// now lets start the threads in a nice orderly fashion
foreach (var thread in threads)
{
    thread.Start();
}
Another way to do this with Tasks
var tasks = new List<Task>();
var someList = myMT.Keys[key_indexer];
for (var i = 0; i < someList.Count; i++)
{
    var soundIndex = someList[i];
    var task = new Task(() => sounds[soundIndex].PlayLooping());
    tasks.Add(task);
    task.Start();
}
Task.WaitAll(tasks.ToArray());
Disclaimer: I take no responsibility for your other logic problems; this was for purely morbid academic purposes.

Why simple multi task doesn't work when multi thread does?

var finalList = new List<string>();
var list = new List<int> {1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ................. 999999};
var init = 0;
var limitPerThread = 5;
var countDownEvent = new CountdownEvent(list.Count);
for (var i = 0; i < list.Count; i++)
{
    var listToFilter = list.Skip(init).Take(limitPerThread).ToList();
    new Thread(delegate()
    {
        Foo(listToFilter);
        countDownEvent.Signal();
    }).Start();
    init += limitPerThread;
}
//wait all to finish
countDownEvent.Wait();

private static void Foo(List<int> listToFilter)
{
    var listDone = Boo(listToFilter);
    lock (Object)
    {
        finalList.AddRange(listDone);
    }
}
This doesn't:
var taskList = new List<Task>();
for (var i = 0; i < list.Count; i++)
{
    var listToFilter = list.Skip(init).Take(limitPerThread).ToList();
    var task = Task.Factory.StartNew(() => Foo(listToFilter));
    taskList.Add(task);
    init += limitPerThread;
}
//wait all to finish
Task.WaitAll(taskList.ToArray());
This process must create at least 700 threads in the end. When I run it using Thread, it works and creates all of them. But with Task it doesn't. It seems like it's not starting multiple Tasks asynchronously.
I really wanna know why.... any ideas?
EDIT
Another version with PLINQ (as suggested).
var taskList = new List<Task>(list.Count);
Parallel.ForEach(taskList, t =>
{
    var listToFilter = list.Skip(init).Take(limitPerThread).ToList();
    Foo(listToFilter);
    init += limitPerThread;
    t.Start();
});
Task.WaitAll(taskList.ToArray());
EDIT2:
public static List<Communication> Foo(List<Dispositive> listToPing)
{
    var listResult = new List<Communication>();
    foreach (var item in listToPing)
    {
        var listIps = item.listIps;
        var communication = new Communication
        {
            IdDispositive = item.Id
        };
        try
        {
            for (var i = 0; i < listIps.Count(); i++)
            {
                var oPing = new Ping().Send(listIps.ElementAt(i).IpAddress, 10000);
                if (oPing != null)
                {
                    if (oPing.Status.Equals(IPStatus.TimedOut) && listIps.Count() > i + 1)
                        continue;
                    if (oPing.Status.Equals(IPStatus.TimedOut))
                    {
                        communication.Result = "NOK";
                        break;
                    }
                    communication.Result = oPing.Status.Equals(IPStatus.Success) ? "OK" : "NOK";
                    break;
                }
                if (listIps.Count() > i + 1)
                    continue;
                communication.Result = "NOK";
                break;
            }
        }
        catch
        {
            communication.Result = "NOK";
        }
        finally
        {
            listResult.Add(communication);
        }
    }
    return listResult;
}
Tasks are NOT multithreading. They can be used for that, but mostly they're actually used for the opposite - multiplexing on a single thread.
To use tasks for multithreading, I suggest using Parallel LINQ. It has many optimizations in it already, such as intelligent partitioning of your lists and only spawning as many threads as there are CPU cores, etc.
To understand Task and async, think of it this way - a typical workload often includes IO that needs to be waited upon. Maybe you read a file, or query a webservice, or access a database, or whatever. The point is - your thread gets to wait a loooong time (in CPU cycles at least) until you get a response from some faraway destination.
In the Olden Days™ that meant that your thread was getting locked down (suspended) until that response came. If you wanted to do something else in the meantime, you needed to spawn a new thread. That's doable, but not too efficient. Each OS thread carries a significant overhead (memory, kernel resources) with it. And you could end up with several threads actively burning the CPU, which means that the OS needs to switch between them so that each gets a bit of CPU time and these "context switches" are pretty expensive.
async changes that workflow. Now you can have multiple workloads executing on the same thread. While one piece of work is awaiting the result from a faraway source, another can step in and use that thread to do something else useful. When that second workload gets to its own await, the first can awaken and continue.
After all, it doesn't make sense to spawn more threads than there are CPU cores. You're not going to get more work done that way. Just the opposite - more time will be spent on switching the threads and less time will be available for useful work.
That is what Task/async/await was originally designed for. However, Parallel LINQ has also taken advantage of it and reused it for multithreading. In this case you can look at it this way: the other threads are the "faraway destination" that your main thread is waiting on.
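As a rough PLINQ sketch of the batching in the question (assuming list, limitPerThread and Boo as defined there; PLINQ picks the degree of parallelism and the partitioning for you):
var finalList = Enumerable
    .Range(0, (list.Count + limitPerThread - 1) / limitPerThread) // one chunk index per batch
    .AsParallel()
    .SelectMany(chunk => Boo(list.Skip(chunk * limitPerThread).Take(limitPerThread).ToList()))
    .ToList();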
Tasks are executed on the Thread Pool. This means that a handful of threads will serve a large number of tasks. You have multi-threading, but not a thread for every task spawned.
You should use tasks. You should aim to use as many threads as your CPU has cores. Generally, the thread pool does this for you.
How did you measure the performance? Do you think that 700 threads will work faster than 700 tasks executed by 4 threads? No, they would not.
It seems like its not starting multiples Tasks async
How did you come up with this? As others suggested in comments and in other answers, you probably need to remove the thread creation, as after creating 700 threads you'll degrade your system's performance: your threads would fight each other for processor time, without any work being done faster.
So, you need to add async/await for your IO operations in the Foo method, with the SendPingAsync version. Also, your method can be simplified, as many of the listIps.Count() > i + 1 checks are useless - you already do that in the for condition:
public static async Task<List<Communication>> Foo(List<Dispositive> listToPing)
{
    var listResult = new List<Communication>();
    foreach (var item in listToPing)
    {
        var listIps = item.listIps;
        var communication = new Communication
        {
            IdDispositive = item.Id
        };
        try
        {
            var ping = new Ping();
            communication.Result = "NOK";
            for (var i = 0; i < listIps.Count(); i++)
            {
                var oPing = await ping.SendPingAsync(listIps.ElementAt(i).IpAddress, 10000);
                if (oPing != null)
                {
                    if (oPing.Status.Equals(IPStatus.Success))
                    {
                        communication.Result = "OK";
                        break;
                    }
                }
            }
        }
        catch
        {
            communication.Result = "NOK";
        }
        finally
        {
            listResult.Add(communication);
        }
    }
    return listResult;
}
Another problem with your code is that the PLINQ version isn't thread-safe:
init += limitPerThread;
This can fail while executing in parallel. You may introduce some helper method, like in this answer:
private static async Task<List<PingReply>> PingAsync(IEnumerable<string> theListOfIPs)
{
    var pingSender = new Ping();
    var tasks = theListOfIPs.Select(ip => pingSender.SendPingAsync(ip, 10000));
    var results = await Task.WhenAll(tasks);
    return results.ToList();
}
And do this kind of check (try/catch logic removed for simplicity):
public static async Task<List<Communication>> Foo(List<Dispositive> listToPing)
{
    var listResult = new List<Communication>();
    foreach (var item in listToPing)
    {
        var listIps = item.listIps;
        var communication = new Communication
        {
            IdDispositive = item.Id
        };
        var check = await PingAsync(listIps.Select(ip => ip.IpAddress));
        communication.Result = check.Any(p => p.Status.Equals(IPStatus.Success)) ? "OK" : "NOK";
        listResult.Add(communication);
    }
    return listResult;
}
And you should probably use Task.Run instead of Task.Factory.StartNew to be sure that you aren't blocking the UI thread.
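For example, a sketch of the loop using Task.Run, with one task per chunk and the offset computed from the loop variable instead of mutating init across iterations:
var tasks = new List<Task>();
for (var offset = 0; offset < list.Count; offset += limitPerThread)
{
    var listToFilter = list.Skip(offset).Take(limitPerThread).ToList();
    tasks.Add(Task.Run(() => Foo(listToFilter))); // queued on the thread pool
}
await Task.WhenAll(tasks); // or Task.WaitAll(tasks.ToArray()) outside an async method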

How to know that your application is not responding?

I have this particular piece of code:
for (int i = 0; i < SingleR_mustBeWorkedUp._number_of_Requestes; i++)
{
    Random myRnd = new Random(SingleR_mustBeWorkedUp._num_path);
    while (true)
    {
        int k = myRnd.Next(start, end);
        if (CanRequestBePutted(timeLineR, k, SingleR_mustBeWorkedUp._time_service, start + end) == true)
        {
            SingleR_mustBeWorkedUp.placement[i] = k;
            break;
        }
    }
}
I use an infinite loop here which ends only when CanRequestBePutted returns true. So how can I know that the app isn't responding?
One solution is to control how long each loop iteration takes, but that doesn't seem very good, and I can't forecast what is going to happen in every case.
Any solutions?
If you're concerned that this operation could potentially take long enough for the application's user to notice, you should be running it in a non-UI thread. Then you can be sure it will not make your application unresponsive. You should only run it on the UI thread if you're sure it will always complete very quickly. When in doubt, go to a non-UI thread.
Don't try to figure out dynamically whether the operation will take a long time or not. If it taking a while is a possibility, do the work in another thread.
Why not use a task or the thread pool so you're not blocking, and put a timer on it?
The task could look something like this:
//put a class level variable
static object _padlock = new object();

var tasks = new List<Task>();
for (int i = 0; i < SingleR_mustBeWorkedUp._number_of_Requestes; i++)
{
    int index = i; // copy the loop variable so each task captures its own value
    var task = new Task(() =>
    {
        Random myRnd = new Random(SingleR_mustBeWorkedUp._num_path);
        while (true)
        {
            int k = myRnd.Next(start, end);
            if (CanRequestBePutted(timeLineR, k, SingleR_mustBeWorkedUp._time_service, start + end) == true)
            {
                lock (_padlock)
                    SingleR_mustBeWorkedUp.placement[index] = k;
                break;
            }
        }
    });
    task.Start();
    tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
However I would also try to figure out a way to take out your while(true), which is a bit dangerous. Also, Task requires .NET 4.0 or above, and I'm not sure what framework you're targeting.
If you need something older you can use ThreadPool.
Also you might want to put locks around shared resources like SingleR_mustBeWorkedUp.placement or anywhere else you might be changing a shared variable. I put one around SingleR_mustBeWorkedUp.placement as an example.
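On an older framework, a rough ThreadPool version of the same idea could look like this (a sketch only; the Interlocked counter and ManualResetEvent stand in for Task.WaitAll):
object padlock = new object();
int pending = SingleR_mustBeWorkedUp._number_of_Requestes;
var allDone = new ManualResetEvent(false);

for (int i = 0; i < SingleR_mustBeWorkedUp._number_of_Requestes; i++)
{
    int index = i; // copy the loop variable for the closure
    ThreadPool.QueueUserWorkItem(_ =>
    {
        Random myRnd = new Random(SingleR_mustBeWorkedUp._num_path);
        while (true)
        {
            int k = myRnd.Next(start, end);
            if (CanRequestBePutted(timeLineR, k, SingleR_mustBeWorkedUp._time_service, start + end))
            {
                lock (padlock)
                    SingleR_mustBeWorkedUp.placement[index] = k;
                break;
            }
        }
        if (Interlocked.Decrement(ref pending) == 0)
            allDone.Set(); // last work item signals completion
    });
}
allDone.WaitOne(); // wait for all queued work items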

Multithreaded code executes by threadnumber-times slower using System.Threading and Visual Studio C# Express Hosting Process

I have a very simple program counting the characters in a string. An integer threadnum sets the number of threads and divides the data by threadnum accordingly into chunks for each thread to process.
Each thread increments the values contained in a shared dictionary, building a character histogram.
private Dictionary<UInt32, int> dict = new Dictionary<UInt32, int>();
In order to wait for all threads to finish and continue with the main process, I invoke Thread.Join
Initially I had a local dictionary for each thread which got merged afterwards, but a shared dictionary worked fine, without locking.
No references are locked in the method BuildDictionary, though locking the dictionary did not significantly impact thread-execution time.
Each thread is timed, and the resulting dictionary compared.
The dictionary content is the same regardless of a single or multiple threads - as it should be.
Each thread takes a fraction determined by threadnum to complete - as it should be.
Problem:
The total time is roughly a multiple of threadnum; that is to say, the execution time increases with the number of threads?
(Unfortunately I cannot run a C# profiler at the moment. Additionally I would prefer C# 3 code compatibility.)
Others are likely struggling with this as well. Could it be that the VS 2010 Express edition vshost process queues the threads and schedules them to run sequentially?
Another MT-performance issue was recently posted here as "Visual Studio C# 2010 Express Debug running Faster than Release":
Code:
public int threadnum = 8;
Thread[] threads = new Thread[threadnum];
Stopwatch stpwtch = new Stopwatch();
stpwtch.Start();
for (var threadidx = 0; threadidx < threadnum; threadidx++)
{
    threads[threadidx] = new Thread(BuildDictionary);
    threads[threadidx].Start(threadidx);
    threads[threadidx].Join(); //Blocks the calling thread, till thread completion
}
WriteLine("Total - time: {0} msec", stpwtch.ElapsedMilliseconds);
Can you help please?
Update:
It appears that the strange behavior of an almost linear slowdown with increasing thread number is an artifact of the numerous hooks of the IDE's debugger.
Running the process outside the development environment, I actually do get a 30% speed increase on a machine with 2 logical/physical cores. During debugging I am already at the high end of CPU utilization, and hence I suspect it is wise to keep some leeway during development through additional idle cores.
As initially planned, I let each thread compute on its own local data chunk, which is locked and written back to a shared list and aggregated after all threads have finished.
Conclusion:
Be heedful of the environment the process is running in.
We can put the dictionary synchronization issues Tony the Lion mentions in his answer aside for the moment, because in your current implementation you are in fact not running anything in parallel!
Let's take a look at what you are currently doing in your loop:
Start a thread.
Wait for the thread to complete.
Start the next thread.
In other words, you should not be calling Join inside the loop.
Instead, you should start all threads as you are doing, but use a signaling construct such as an AutoResetEvent to determine when all threads have completed.
See example program:
class Program
{
    static EventWaitHandle _waitHandle = new AutoResetEvent(false);

    static void Main(string[] args)
    {
        int numThreads = 5;
        for (int i = 0; i < numThreads; i++)
        {
            new Thread(DoWork).Start(i);
        }
        for (int i = 0; i < numThreads; i++)
        {
            _waitHandle.WaitOne();
        }
        Console.WriteLine("All threads finished");
    }

    static void DoWork(object id)
    {
        Thread.Sleep(1000);
        Console.WriteLine(String.Format("Thread {0} completed", (int)id));
        _waitHandle.Set();
    }
}
Alternatively you could just as well be calling Join in the second loop if you have references to the threads available.
After you have done this you can and should worry about the dictionary synchronization problems.
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. From MSDN
You say:
but a shared dictionary worked fine, without locking.
Each thread increments the values contained in a shared dictionary
Your program is by definition broken: if you alter the data in the dictionary without proper locking, you will end up with bugs. Nothing more needs to be said.
I wouldn't use a shared static Dictionary. If each thread worked on a local copy, you could amalgamate your results once all threads had signalled completion.
WaitHandle.WaitAll avoids any deadlocking on an AutoResetEvent.
class Program
{
    static void Main()
    {
        char[] text = "Some String".ToCharArray();
        int numThreads = 5;
        // I leave the implementation of the next line to the OP.
        Partition[] partitions = PartitionWork(text, numThreads);
        var completions = new WaitHandle[numThreads];
        var results = new IDictionary<char, int>[numThreads];
        for (int i = 0; i < numThreads; i++)
        {
            results[i] = new Dictionary<char, int>();
            var completion = new ManualResetEvent(false);
            completions[i] = completion;
            // Copy the loop variable so each thread closes over its own values.
            int index = i;
            new Thread(() => BuildDictionary(
                text,
                partitions[index].Start,
                partitions[index].End,
                results[index],
                completion)).Start();
        }
        if (WaitHandle.WaitAll(completions, new TimeSpan(366, 0, 0, 0)))
        {
            Console.WriteLine("All threads finished");
        }
        else
        {
            Console.WriteLine("Timed out after a year and a day");
        }
        // Merge the results
        IDictionary<char, int> result = results[0];
        for (int i = 1; i < numThreads; i++)
        {
            foreach (KeyValuePair<char, int> item in results[i])
            {
                if (result.ContainsKey(item.Key))
                {
                    result[item.Key] += item.Value;
                }
                else
                {
                    result.Add(item.Key, item.Value);
                }
            }
        }
    }

    static void BuildDictionary(
        char[] text,
        int start,
        int finish,
        IDictionary<char, int> result,
        EventWaitHandle completed)
    {
        for (int i = start; i <= finish; i++)
        {
            if (result.ContainsKey(text[i]))
            {
                result[text[i]]++;
            }
            else
            {
                result.Add(text[i], 1);
            }
        }
        completed.Set();
    }
}
With this implementation the only variable that is ever shared is the char[] of the text and that is always read only.
You do have the burden of merging the dictionaries at the end, but that is a small price for avoiding any concurrency issues. In a later version of the framework I would have used TPL and ConcurrentDictionary, and possibly Partitioner<TSource>.
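On .NET 4 or later, that might look roughly like this (a sketch, not the code above; text is the same char[], and the merge step disappears because ConcurrentDictionary handles the concurrent increments):
var histogram = new ConcurrentDictionary<char, int>();
Parallel.ForEach(
    Partitioner.Create(0, text.Length), // splits the index range into chunks
    range =>
    {
        for (int i = range.Item1; i < range.Item2; i++)
            histogram.AddOrUpdate(text[i], 1, (key, count) => count + 1); // thread-safe increment
    });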
I totally agree with TonyTheLion and others, and once you fix the actual problem of Join'ing in the wrong place, there will still be a problem with the (lack of) locks while updating the shared dictionary. I wanted to drop you a quick workaround: just wrap your integer value in an object:
instead of:
Dictionary<uint, int> dict = new Dictionary<uint, int>();
use:
class Entry { public int value; }
Dictionary<uint, Entry> dict = new Dictionary<uint, Entry>();
and now increment Entry.value instead. That way, the Dictionary will not see any changes and it will be safe without locking the dictionary.
Note: this will only work if you are guaranteed that each thread uses only its own Entry. I've just noticed this is not true, as you said 'histogram of characters'. You will have to lock each Entry during the increment, or some increments may be lost. Still, locking at the Entry level will be significantly faster than locking the whole dictionary.
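A self-contained sketch of that per-Entry approach (hypothetical sample text and thread count; Interlocked.Increment is used for the per-entry synchronization, which for a plain counter is equivalent to taking a lock on the Entry):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

class Entry { public int value; }

class EntryHistogram
{
    static void Main()
    {
        char[] text = "the quick brown fox jumps over the lazy dog".ToCharArray();

        // Pre-populate one Entry per distinct character so the worker threads never
        // add or remove dictionary entries - the Dictionary itself is then only read.
        var dict = new Dictionary<uint, Entry>();
        foreach (char c in text.Distinct())
            dict[c] = new Entry();

        int numThreads = 4;
        int chunk = (text.Length + numThreads - 1) / numThreads;
        var threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++)
        {
            int start = t * chunk;                              // per-thread bounds, copied for the closure
            int end = Math.Min(start + chunk, text.Length);
            threads[t] = new Thread(() =>
            {
                for (int i = start; i < end; i++)
                    Interlocked.Increment(ref dict[text[i]].value); // per-entry atomic update
            });
            threads[t].Start();
        }
        foreach (var thread in threads)
            thread.Join();

        foreach (var pair in dict)
            Console.WriteLine("'{0}': {1}", (char)pair.Key, pair.Value.value);
    }
}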
Rotem saw it.
Your main thread should Join the X other threads after having started all of them.
Otherwise it waits for the 1st thread to finish before starting and waiting for the 2nd one.
for (var threadidx = 0; threadidx < threadnum; threadidx++)
{
    threads[threadidx] = new Thread(BuildDictionary);
    threads[threadidx].Start(threadidx);
}
for (var threadidx = 0; threadidx < threadnum; threadidx++)
{
    threads[threadidx].Join(); //Blocks the calling thread, till thread completion
}
As Rotem points out, by joining in the loop you are waiting for each thread to complete before continuing.
The hint for why this is can be found on the Thread.Join documentation on MSDN
Blocks the calling thread until a thread terminates
So your loop will not continue until that one thread has completed its work. To start all the threads and then wait for them to complete, join them outside the loop:
public int threadnum = 8;
Thread[] threads = new Thread[threadnum];
Stopwatch stpwtch = new Stopwatch();
stpwtch.Start();
// Start all the threads doing their work
for (var threadidx = 0; threadidx < threadnum; threadidx++)
{
    threads[threadidx] = new Thread(BuildDictionary);
    threads[threadidx].Start(threadidx);
}
// Join to all the threads to wait for them to complete
for (var threadidx = 0; threadidx < threadnum; threadidx++)
{
    threads[threadidx].Join();
}
System.Diagnostics.Debug.WriteLine("Total - time: {0} msec", stpwtch.ElapsedMilliseconds);
You will really need to post your BuildDictionary function. It is very likely that the operation will be no faster with multiple threads and the threading overhead will actually increase execution time.
