Timing a mutex, unsynchronized threads, and a semaphore in C#

I have an assignment to write a C# program that tests five different functions. The timings reported for these tests seem wrong (they always come back as 0 ms). Also, my unsynchronized threads always end with the full value of 5000, which should not happen every time. What am I doing wrong in test #2, and am I stopping the timer in the wrong place? Threading is messing with my mind at this point and I would like some clarification. The spec and my code are below.
Test #1: Will be an increment (basic) to the value 5000.
Test #2: The main thread will create 10 threads that will increment the shared integer 500 times each. You will not use any synchronization to protect the updating of the shared integer. It is likely that the unsynchronized access to the shared integer will cause it to have an incorrect final value (<5000).
Test #3: The main thread will create 10 threads that will increment the shared integer 500 times each. You will protect the updating of the shared integer using a mutex.
Test #4: The main thread will create 10 threads that will increment the shared integer 500 times each. You will update the shared integer using the Increment method of the Interlocked class.
Test #5: The main thread will create 10 threads that will increment the shared integer 500 times each. You will protect the updating of the shared integer using a semaphore.
Each test will be timed using the Stopwatch class and the total time the test took will be written to the console.
Use Thread.Sleep() to simulate a random amount of processing time and assume the shared integer is being used at that time.
Random time should be a random value between 0 and 10ms.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
namespace Lab3
{
class MainClass
{
public static int sharedVal = 0;
public static int y = 1;
static Stopwatch timer = new Stopwatch();
private static Mutex mutex = new Mutex();
private static Semaphore semaphore = new Semaphore(1, 1);
public static void Main(string[] args)
{
Console.WriteLine("***************** Lab 3 Thread/Synchronization Testing *****************" + "\n");
Console.WriteLine("Starting Test 1 (no Threads)...");
test1();
Console.WriteLine("***************** Lab 3 Thread/Synchronization Testing *****************" + "\n");
Console.WriteLine("Starting Test 2 (Threads without any synchronization)...");
test2(10);
Console.WriteLine("***************** Lab 3 Thread/Synchronization Testing *****************" + "\n");
Console.WriteLine("Starting Test 3 (Threads with a mutex)...");
test3(10);
Console.WriteLine("***************** Lab 3 Thread/Synchronization Testing *****************" + "\n");
Console.WriteLine("Starting Test 4 (Interlocked Methods)...");
test4(10);
Console.WriteLine("***************** Lab 3 Thread/Synchronization Testing *****************" + "\n");
Console.WriteLine("Starting Test 5 (Threads with a semaphore)...");
test5(10);
}
/*****************************************************************************/
public static void test1()
{
timer.Reset();
timer.Start();
intChanger(ref sharedVal, 5000,0);
Console.WriteLine("Test Complete");
timer.Stop();
Console.WriteLine("Shared Value: {0}, Total time: {1}ms ", sharedVal, timer.ElapsedMilliseconds);
clearSharedVal(ref sharedVal);
}
/*****************************************************************************/
public static void test2(int numOfThreads)
{
timer.Reset();
timer.Start();
threadCreate(numOfThreads);
Console.WriteLine("Test Complete");
Console.WriteLine("Shared Value: {0}, Total time: {1}ms ", sharedVal, timer.ElapsedMilliseconds);
clearSharedVal(ref sharedVal);
y++;
}
/*****************************************************************************/
public static void test3(int numOfThreads)
{
timer.Reset();
timer.Start();
threadCreate(numOfThreads);
Console.WriteLine("Test Complete");
Console.WriteLine("Shared Value: {0}, Total time: {1}ms ", sharedVal, timer.ElapsedMilliseconds);
clearSharedVal(ref sharedVal);
y++;
}
/*****************************************************************************/
public static void test4(int numOfThreads)
{
timer.Reset();
timer.Start();
threadCreate(numOfThreads);
Console.WriteLine("Test Complete");
Console.WriteLine("Shared Value: {0}, Total time: {1}ms ", sharedVal, timer.ElapsedMilliseconds);
clearSharedVal(ref sharedVal);
y++;
}
/*****************************************************************************/
public static void test5(int numOfThreads)
{
timer.Reset();
timer.Start();
threadCreate(numOfThreads);
Console.WriteLine("Test Complete");
Console.WriteLine("Shared Value: {0}, Total time: {1}ms ",sharedVal, timer.ElapsedMilliseconds);
clearSharedVal(ref sharedVal);
}
/*****************************************************************************/
public static void threadCreate(int n)
{
Thread[] threadArr = new Thread[n];
for (int i = 0; i < n; i++)
{
threadArr[i] = new Thread(new ThreadStart(WorkThreadFunction));
threadArr[i].Start();
}
//join for loop
for (int i = 0; i < n; i++)
{
threadArr[i].Join();
}
Console.WriteLine("All Threads have been joined");
timer.Stop();
}
/******************************************************************************/
public static void WorkThreadFunction()
{
switch (y) {
case 1:
intChanger(ref sharedVal, 500, 1);
break;
case 2:
intChanger(ref sharedVal, 500, 2);
break;
case 3:
intChanger(ref sharedVal, 500, 3);
break;
case 4:
intChanger(ref sharedVal, 500, 4);
break;
}
}
/*****************************************************************************/
public static void intChanger(ref int s, int n , int flag)
{
/*******************************************************************************
* Flag Explanation *
********************************************************************************
* int flag = 0 --> unsynchronized single thread adding to shared value *
* int flag = 1 --> unsynchronized threads adding to shared value *
* int flag = 2 --> synchronized using a mutex *
* int flag = 3 --> synchronized using Increment method of Interlocked class *
* int flag = 4 --> synchronized using semaphore *
********************************************************************************/
Random rand = new Random();
switch (flag)
{
case 0:
for (int i = 0; i < n; i++)
{
s++;
}
break;
case 1:
for (int i = 0; i < n; i++)
{
s++;
}
timer.Stop();
break;
case 2:
for (int i = 0; i < n; i++)
{
mutex.WaitOne();
Thread.Sleep(rand.Next(0, 10));
s++;
mutex.ReleaseMutex();
}
break;
case 3:
for (int i = 0; i < n; i++)
{
//Thread.Sleep(rand.Next(0, 10) * 1000);
Interlocked.Increment(ref s);
}
timer.Stop();
break;
case 4:
for (int i = 0; i < n; i++)
{
semaphore.WaitOne();
Thread.Sleep(rand.Next(0, 10));
s++;
semaphore.Release();
}
break;
}
}
/*****************************************************************************/
public static void clearSharedVal(ref int n)
{
n = 0;
}
}
}

5000 increments is a trivial amount of work for a modern PC, so I would expect 0 ms. Try increasing the number of operations, or compare ElapsedTicks instead of ElapsedMilliseconds.
About #2: creating a new Thread is expensive, while the job itself is cheap. So by the time the next thread is ready to run, the previous one has probably already finished; the threads effectively run serially, not in parallel. Again, you could increase the number of operations, and use thread pool methods to run the job.
Edit:
Note that Thread.Sleep does not behave as you might expect for very short delays (below 15-20 ms): "If dwMilliseconds is less than the resolution of the system clock, the thread may sleep for less than the specified length of time."
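One more thing worth checking in the question's code: the worker path for flags 1 and 3 calls timer.Stop() itself, so the shared Stopwatch is stopped by whichever worker finishes first rather than after the last Join. Below is a minimal sketch of a harness that avoids this, not the assignment's required structure. The names TimingSketch, Run, UnsafeIncrement and InterlockedIncrement are made up for illustration, and the iteration count is raised well above 500 so that lost updates have a realistic chance of showing up without Thread.Sleep.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class TimingSketch
{
    private static int shared; // the shared integer under test

    // Racy read-modify-write: concurrent callers can lose updates.
    public static void UnsafeIncrement() { shared++; }

    // Atomic increment: no updates are lost.
    public static void InterlockedIncrement() { Interlocked.Increment(ref shared); }

    // Runs `increment` `iterations` times on each of `threadCount` threads.
    // Returns the final shared value; the wall-clock time is returned in `elapsed`.
    public static int Run(int threadCount, int iterations, Action increment, out TimeSpan elapsed)
    {
        shared = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int k = 0; k < iterations; k++) increment();
            });
        }

        var sw = Stopwatch.StartNew();
        foreach (var t in threads) t.Start();
        foreach (var t in threads) t.Join();
        sw.Stop(); // stopped exactly once, by the main thread, after every Join
        elapsed = sw.Elapsed;
        return shared;
    }

    public static void Main()
    {
        TimeSpan elapsed;
        int value = Run(10, 1000000, UnsafeIncrement, out elapsed);
        Console.WriteLine("Unsynchronized: {0} in {1:F3} ms", value, elapsed.TotalMilliseconds);

        value = Run(10, 1000000, InterlockedIncrement, out elapsed);
        Console.WriteLine("Interlocked:    {0} in {1:F3} ms", value, elapsed.TotalMilliseconds);
    }
}
```

The unsynchronized run is not guaranteed to lose updates on every execution; it only becomes likely once the threads genuinely overlap. Reporting Elapsed.TotalMilliseconds (a double) instead of ElapsedMilliseconds (a long) also keeps sub-millisecond runs from printing as 0.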

Related

Multithreads - passing arguments and receiving results

I am trying various options on working with threads. I wrote the code below, but it does not work as expected. How can I fix the code, so that the main function will correctly display the product?
using System;
using System.Threading;
namespace MultiThreads
{
class Program
{
static int prod;
public static void Main(string[] args)
{
Thread thread = new Thread(() => Multiply(2, 3));
thread.Start();
for(int i = 0; i < 10; i++) { // do some other work until thread completes
Console.Write(i + " ");
Thread.Sleep(100);
}
Console.WriteLine();
Console.WriteLine("Prod = " + prod); // I expect 6 and it shows 0
Console.ReadKey(true);
}
public static void Multiply(int a, int b)
{
Thread.Sleep(2000);
prod = a * b;
}
}
}
Ignoring the fact that you should be using non-blocking tasks, volatile properties and other coroutine principles, the immediate reason your program does not work as intended is that you never join the child thread back into the parent. See Thread.Join.
Without the join, Console.WriteLine("Prod = " + prod); runs before the assignment prod = a * b;
static int prod;
static void Main(string[] args)
{
Thread thread = new Thread(() => Multiply(2, 3));
thread.Start();
for (int i = 0; i < 10; i++)
{ // do some other work until thread completes
Console.Write(i + " ");
Thread.Sleep(100);
}
thread.Join(); // Halt current thread until the other one finishes.
Console.WriteLine();
Console.WriteLine("Prod = " + prod); // Now prints 6, because the worker has finished.
Console.ReadKey(true);
}
public static void Multiply(int a, int b)
{
Thread.Sleep(2000);
prod = a * b;
}
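As a sketch of the task-based alternative mentioned above, the result can be returned from the worker instead of being written to a shared static field. This is one possible shape, not the only one; TaskResultSketch and MultiplyOnWorker are illustrative names, and the 200 ms sleep merely stands in for real work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskResultSketch
{
    // The worker returns its result instead of writing to a shared field.
    public static int Multiply(int a, int b)
    {
        Thread.Sleep(200); // simulated work
        return a * b;
    }

    public static int MultiplyOnWorker(int a, int b)
    {
        Task<int> task = Task.Run(() => Multiply(a, b));

        // The caller is free to do other work here while the task runs.

        // Result blocks until the task has completed, so no separate Join is needed.
        return task.Result;
    }

    public static void Main()
    {
        Console.WriteLine("Prod = " + MultiplyOnWorker(2, 3));
    }
}
```

Because the value travels through Task<int>, there is no window in which the caller can observe a half-finished result.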

Task.Delay delays too long

I've created a multi-task program. It has around 20 main tasks, and each of them calls some sub-tasks that perform file I/O. I wanted each main task to repeat every 500 ms, so I added Task.Delay(500).
The problem is that Task.Delay sometimes delays for much longer than 500 ms. In some cases it delays for more than 3 seconds.
How can I fix it?
The original program is so big that I created the sample program below.
(1) If Task.Delay is enabled, the over-delay happens.
(2) If Thread.Sleep is enabled, the over-delay doesn't happen.
ThreadPool.SetMinThreads() doesn't seem to resolve it.
Thanks.
class Program
{
const int DELAY_TIME = 500;
const int TASKS = 100;
const int WAITS = 100;
const int WARNING_THRESHOLD = 100;
static void Main(string[] args)
{
//ThreadPool.SetMinThreads(workerThreads: 200, completionPortThreads: 200);
Console.WriteLine("*** Start...");
Test();
Console.WriteLine("*** Done!");
Console.ReadKey();
}
private static void Test()
{
List<Task> tasks = new List<Task>();
for (int taskId = 0; taskId < TASKS; taskId++)
{
tasks.Add(DelaysAsync(taskId));
}
Task.WaitAll(tasks.ToArray());
}
static async Task DelaysAsync(int taskId)
{
await Task.Yield();
Stopwatch sw = new Stopwatch();
for (int i = 0; i < WAITS; i++)
{
sw.Reset();
sw.Start();
await Task.Delay(DELAY_TIME).ConfigureAwait(false); // (1)
//Thread.Sleep(DELAY_TIME); // (2)
sw.Stop();
Console.Write($"Task({taskId})_iter({i}) Elapsed={sw.ElapsedMilliseconds}");
if (sw.ElapsedMilliseconds > DELAY_TIME + WARNING_THRESHOLD)
{
Console.WriteLine(" *********** Too late!! ************");
}
else
{
Console.WriteLine();
}
}
}
}
I’ve run your test, with .NET 4.6.1 and VS 2017. Here on Xeon E3-1230 v3 CPU it never printed “Too late”, the Elapsed value was within 498-527 ms.
The Thread.Sleep version performed very similarly, 500-528ms per sleep, however the total execution time was much longer because the runtime refused to create 100 OS threads, that’s way too many, so less than 100 DelaysAsync functions ran in parallel. The debugger showed me there were 27 worker threads in Thread.Sleep version and only 9 worker threads in Task.Delay version.
I think you have other apps on your PC creating too many threads and consuming too much CPU. Windows tries to load balance threads evenly so when the whole system is CPU bound, more native threads = more CPU time and therefore less jitter.
If that’s your case and you want to prioritize your app in the scheduler, instead of using Thread.Sleep and more threads, raise the priority of your process.
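I think I have found the answer. I changed the previous sample program as shown below. The main difference is whether Stopwatch or DateTime is used to measure the durations.
In the Stopwatch version, many delays are reported.
In the DateTime version, no delays, or at least very few, are reported.
My guess is that the cause is contention on a timer used by both Stopwatch and Task.Delay, so I concluded that I should not use Stopwatch and Task.Delay together. One caveat: the comparison only holds if the DateTime version measures the full duration with TimeSpan.TotalMilliseconds; TimeSpan.Milliseconds is only the 0-999 ms component, which can never exceed the 1000 ms limit (DELAY_TIME + WARNING_THRESHOLD) used here.
Thank you.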
class Program
{
const int DELAY_TIME = 500;
const int TASKS = 100;
const int WAITS = 100;
const int WARNING_THRESHOLD = 500;
static void Main(string[] args)
{
using (Process p = Process.GetCurrentProcess())
{
p.PriorityClass = ProcessPriorityClass.RealTime;
//ThreadPool.SetMinThreads(workerThreads: 200, completionPortThreads: 200);
int workerThreads;
int completionPortThreads;
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
Console.WriteLine($"{workerThreads}, {completionPortThreads}");
Console.WriteLine("*** Start...");
Test();
Console.WriteLine("*** Done!");
Console.ReadKey();
}
}
private static void Test()
{
int totalCount = 0;
List<Task<int>> tasks = new List<Task<int>>();
for (int taskId = 0; taskId < TASKS; taskId++)
{
//tasks.Add(DelaysWithStopWatchAsync(taskId)); // many delays
tasks.Add(DelaysWithDateTimeAsync(taskId)); // no delays
}
Task.WaitAll(tasks.ToArray());
foreach (var task in tasks)
{
totalCount += task.Result;
}
Console.WriteLine($"Total counts of delay = {totalCount}");
}
static async Task<int> DelaysWithStopWatchAsync(int taskId)
{
await Task.Yield();
int count = 0;
Stopwatch sw = new Stopwatch();
for (int i = 0; i < WAITS; i++)
{
sw.Reset();
sw.Start();
await Task.Delay(DELAY_TIME).ConfigureAwait(false); // (1)
//Thread.Sleep(DELAY_TIME); // (2)
sw.Stop();
Console.Write($"task({taskId})_iter({i}) elapsed={sw.ElapsedMilliseconds}");
if (sw.ElapsedMilliseconds > DELAY_TIME + WARNING_THRESHOLD)
{
Console.WriteLine($" *********** Too late!! ************");
count++;
}
else
{
Console.WriteLine();
}
}
return count;
}
static async Task<int> DelaysWithDateTimeAsync(int taskId)
{
await Task.Yield();
int count = 0;
for (int i = 0; i < WAITS; i++)
{
DateTime start = DateTime.Now;
await Task.Delay(DELAY_TIME).ConfigureAwait(false); // (1)
//Thread.Sleep(DELAY_TIME); // (2)
DateTime end = DateTime.Now;
int duration = (int)(end - start).TotalMilliseconds; // TotalMilliseconds is the whole duration; Milliseconds is only the 0-999 component
Console.Write($"Task({taskId})_iter({i}) Elapsed={duration}");
if (duration > DELAY_TIME + WARNING_THRESHOLD)
{
Console.WriteLine($" *********** Too late!! ************");
count++;
}
else
{
Console.WriteLine();
}
}
return count;
}
}
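When comparing the two measurement styles, it helps to keep the difference between TimeSpan.Milliseconds and TimeSpan.TotalMilliseconds in mind. The short sketch below illustrates it; TimeSpanSketch and IsLate are hypothetical names, and the 3250 ms duration is an invented value, not a measurement from the program above.

```csharp
using System;

public static class TimeSpanSketch
{
    // True when the measured duration exceeds the expected delay plus a tolerance.
    public static bool IsLate(TimeSpan elapsed, int expectedMs, int toleranceMs)
    {
        return elapsed.TotalMilliseconds > expectedMs + toleranceMs;
    }

    public static void Main()
    {
        TimeSpan elapsed = TimeSpan.FromMilliseconds(3250); // 3 s + 250 ms

        Console.WriteLine(elapsed.Milliseconds);      // component only: 250
        Console.WriteLine(elapsed.TotalMilliseconds); // whole duration: 3250
        Console.WriteLine(IsLate(elapsed, 500, 500)); // True
    }
}
```

A check based on the Milliseconds component would treat this 3.25 second delay as a 250 ms one and never flag it.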

Best way to let many worker-threads wait for mainthread and vice versa

I'm looking for a fast way to let many worker threads wait for an event to continue and block the main thread until all worker threads are finished. I first used TPL or AutoResetEvent but since my calculation isn't that expensive the overhead was way too much.
I found a pretty interesting article concerning this problem and got great results (using only one worker thread) with the last synchronization solution (Interlocked.CompareExchange). But I don't know how to use it in a scenario where many threads repeatedly wait for one main thread.
Here is an example using single thread, CompareExchange, and Barrier:
static void Main(string[] args)
{
int cnt = 1000000;
var stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i < cnt; i++) { }
Console.WriteLine($"Single thread: {stopwatch.Elapsed.TotalSeconds}s");
var run = true;
Task task;
stopwatch.Restart();
int interlock = 0;
task = Task.Run(() =>
{
while (run)
{
while (Interlocked.CompareExchange(ref interlock, 0, 1) != 1) { Thread.Sleep(0); }
interlock = 2;
}
Console.WriteLine($"CompareExchange synced: {stopwatch.Elapsed.TotalSeconds}s");
});
for (int i = 0; i < cnt; i++)
{
interlock = 1;
while (Interlocked.CompareExchange(ref interlock, 0, 2) != 2) { Thread.Sleep(0); }
}
run = false;
interlock = 1;
task.Wait();
run = true;
var barrier = new Barrier(2);
stopwatch.Restart();
task = Task.Run(() =>
{
while (run) { barrier.SignalAndWait(); }
Console.WriteLine($"Barrier synced: {stopwatch.Elapsed.TotalSeconds}s");
});
for (int i = 0; i < cnt; i++) { barrier.SignalAndWait(); }
Thread.Sleep(0);
run = false;
if (barrier.ParticipantsRemaining == 1) { barrier.SignalAndWait(); }
task.Wait();
Console.ReadKey();
}
Average results (in seconds) are:
Single thread: 0,002
CompareExchange: 0,4
Barrier: 1,7
As you can see, the Barrier's overhead seems to be around 4 times higher! If someone can rework the CompareExchange scenario to work with multiple worker threads, that would surely help too!
Sure, 1 second of overhead for a million calculations is not much. It just interests me.
Edit:
System.Threading.Barrier seems to be the fastest solution for this scenario. To avoid blocking twice (once when all workers are ready for work, once when all workers have finished), I used the following code for the best results:
while(work)
{
while (barrier.ParticipantsRemaining > 1) { Thread.Sleep(0); }
//Set work package
barrier.SignalAndWait();
}
It seems like you might want to use a Barrier to synchronise a number of workers with a main thread.
Here's a compilable example. Have a play with it, paying attention to when the output tells you that you can "Press <Return> to signal the workers to start".
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
namespace Demo
{
static class Program
{
static void Main()
{
print("Main thread is starting the workers.");
int numWorkers = 10;
var barrier = new Barrier(numWorkers + 1); // Workers + main (controlling) thread.
for (int i = 0; i < numWorkers; ++i)
{
int n = i; // Prevent modified closure.
Task.Run(() => worker(barrier, n));
}
while (true)
{
print("***************** Press <RETURN> to signal the workers to start");
Console.ReadLine();
print("Main thread is signalling all the workers to start.");
// This will wait for all the workers to issue their call to
// barrier.SignalAndWait() before it returns:
barrier.SignalAndWait();
// At this point, all workers AND the main thread are at the same point.
}
}
static void worker(Barrier barrier, int workerNumber)
{
int iter = 0;
while (true)
{
print($"Worker {workerNumber} on iteration {iter} is waiting for barrier.");
// This will wait for all the other workers AND the main thread
// to issue their call to barrier.SignalAndWait() before it returns:
barrier.SignalAndWait();
// At this point, all workers AND the main thread are at the same point.
int delay = randomDelayMilliseconds();
print($"Worker {workerNumber} got barrier, now sleeping for {delay}");
Thread.Sleep(delay);
print($"Worker {workerNumber} finished work for iteration {iter}.");
++iter; // Advance the counter so the log shows the real iteration number.
}
}
static void print(string message)
{
Console.WriteLine($"[{sw.ElapsedMilliseconds:00000}] {message}");
}
static int randomDelayMilliseconds()
{
lock (rng)
{
return rng.Next(10000) + 5000;
}
}
static Random rng = new Random();
static Stopwatch sw = Stopwatch.StartNew();
}
}
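For the "workers wait for main, then main waits for workers" pattern from the question, one possible arrangement is to use two barrier phases per round. The sketch below is a finite, self-checking variant under that assumption; BarrierRoundsSketch and RunRounds are illustrative names, and the Interlocked.Increment call is a placeholder for the real calculation.

```csharp
using System;
using System.Threading;

public static class BarrierRoundsSketch
{
    // Runs `rounds` rounds of work on `workerCount` threads and returns the
    // number of work units completed (expected: workerCount * rounds).
    public static long RunRounds(int workerCount, int rounds)
    {
        long unitsDone = 0;

        // One extra participant for the main (controlling) thread.
        using (var barrier = new Barrier(workerCount + 1))
        {
            var workers = new Thread[workerCount];
            for (int w = 0; w < workerCount; w++)
            {
                workers[w] = new Thread(() =>
                {
                    for (int r = 0; r < rounds; r++)
                    {
                        barrier.SignalAndWait();              // phase 1: wait for main to release the round
                        Interlocked.Increment(ref unitsDone); // placeholder for the real work
                        barrier.SignalAndWait();              // phase 2: report that this worker is done
                    }
                });
                workers[w].Start();
            }

            for (int r = 0; r < rounds; r++)
            {
                // The work package for this round would be prepared here.
                barrier.SignalAndWait(); // phase 1: release all workers
                barrier.SignalAndWait(); // phase 2: block until all workers have finished
            }

            foreach (var t in workers) t.Join();
        }
        return unitsDone;
    }

    public static void Main()
    {
        Console.WriteLine(RunRounds(4, 1000)); // 4 workers x 1000 rounds
    }
}
```

Dedicated threads are used here rather than thread pool tasks so that every participant is certain to be running; a blocked barrier participant that never gets scheduled would stall all the others.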

How to block new threads until all threads are created and started

I am building a small application simulating a horse race in order to gain some basic skill in working with threads.
My code contains this loop:
for (int i = 0; i < numberOfHorses; i++)
{
horsesThreads[i] = new Thread(horsesTypes[i].Race);
horsesThreads[i].Start(100);
}
In order to keep the race 'fair', I've been looking for a way to make all newly created threads wait until the rest of the new threads are set, and only then launch all of them to start running their methods (Please note that I understand that technically the threads can't be launched at the 'same time')
So basically, I am looking for something like this:
for (int i = 0; i < numberOfHorses; i++)
{
horsesThreads[i] = new Thread(horsesTypes[i].Race);
}
Monitor.LaunchThreads(horsesThreads);
Threading does not promise fairness or deterministic results, so it's not a good way to simulate a race.
Having said that, there are some sync objects that might do what you ask. I think the Barrier class (Fx 4+) is what you want.
The Barrier class is designed to support this.
Here's an example:
using System;
using System.Threading;
namespace Demo
{
class Program
{
private void run()
{
int numberOfHorses = 12;
// Use a barrier with a participant count that is one more than
// the number of threads. The extra one is for the main thread,
// which is used to signal the start of the race.
using (Barrier barrier = new Barrier(numberOfHorses + 1))
{
var horsesThreads = new Thread[numberOfHorses];
for (int i = 0; i < numberOfHorses; i++)
{
int horseNumber = i;
horsesThreads[i] = new Thread(() => runRace(horseNumber, barrier));
horsesThreads[i].Start();
}
Console.WriteLine("Press <RETURN> to start the race!");
Console.ReadLine();
// Signals the start of the race. None of the threads that called
// SignalAndWait() will return from the call until *all* the
// participants have signalled the barrier.
barrier.SignalAndWait();
Console.WriteLine("Race started!");
Console.ReadLine();
}
}
private static void runRace(int horseNumber, Barrier barrier)
{
Console.WriteLine("Horse " + horseNumber + " is waiting to start.");
barrier.SignalAndWait();
Console.WriteLine("Horse " + horseNumber + " has started.");
}
private static void Main()
{
new Program().run();
}
}
}
[EDIT] I just noticed that Henk already mentioned Barrier, but I'll leave this answer here because it has some sample code.
I'd be looking at a ManualResetEvent as a gate; inside the Thread, decrement a counter; if it is still non-zero, wait on the gate; otherwise, open the gate. Basically:
using System;
using System.Threading;
class Program
{
static void Main()
{
ManualResetEvent gate = new ManualResetEvent(false);
int numberOfThreads = 10, pending = numberOfThreads;
Thread[] threads = new Thread[numberOfThreads];
ParameterizedThreadStart work = name =>
{
Console.WriteLine("{0} approaches the tape", name);
if (Interlocked.Decrement(ref pending) == 0)
{
Console.WriteLine("And they're off!");
gate.Set();
}
else gate.WaitOne();
Race();
Console.WriteLine("{0} crosses the line", name);
};
for (int i = 0; i < numberOfThreads; i++)
{
threads[i] = new Thread(work);
threads[i].Start(i);
}
for (int i = 0; i < numberOfThreads; i++)
{
threads[i].Join();
}
Console.WriteLine("all done");
}
static readonly Random rand = new Random();
static void Race()
{
int time;
lock (rand)
{
time = rand.Next(500,1000);
}
Thread.Sleep(time);
}
}

Dual-core performance worse than single core?

The following NUnit test compares performance between running a single thread and running 2 threads on a dual-core machine. Specifically, this is a VMware dual-core virtual Windows 7 machine running on a quad-core Linux SLED host, which is a Dell Inspiron 503.
Each thread simply loops and increments 2 counters, addCounter and readCounter. This test was originally testing a Queue implementation, which was discovered to perform worse on a multi-core machine. After narrowing the problem down to this small reproducible code, there is no queue left, only incrementing variables, and to my shock and dismay it is far slower with 2 threads than with one.
When running the first test, the Task Manager shows 1 of the cores 100% busy with the other core almost idle. Here's the test output for the single thread test:
readCounter 360687000
readCounter2 0
total readCounter 360687000
addCounter 360687000
addCounter2 0
You see over 360 Million increments!
Next, the dual-thread test shows 100% busy on both cores for the whole 5-second duration of the test. However, its output shows only:
readCounter 88687000
readCounter2 134606500
totoal readCounter 223293500
addCounter 88687000
addCounter2 67303250
addFailure0
That's only 223 million read increments. What in god's creation are those 2 CPUs doing for those 5 seconds to get less work done?
Any possible clue? And can you run the tests on your machine to see if you get different results? One idea is that perhaps VMware dual-core performance isn't what you would hope.
using System;
using System.Threading;
using NUnit.Framework;
namespace TickZoom.Utilities.TickZoom.Utilities
{
[TestFixture]
public class ActiveMultiQueueTest
{
private volatile bool stopThread = false;
private Exception threadException;
private long addCounter;
private long readCounter;
private long addCounter2;
private long readCounter2;
private long addFailureCounter;
[SetUp]
public void Setup()
{
stopThread = false;
addCounter = 0;
readCounter = 0;
addCounter2 = 0;
readCounter2 = 0;
}
[Test]
public void TestSingleCoreSpeed()
{
var speedThread = new Thread(SpeedTestLoop);
speedThread.Name = "1st Core Speed Test";
speedThread.Start();
Thread.Sleep(5000);
stopThread = true;
speedThread.Join();
if (threadException != null)
{
throw new Exception("Thread failed: ", threadException);
}
Console.Out.WriteLine("readCounter " + readCounter);
Console.Out.WriteLine("readCounter2 " + readCounter2);
Console.Out.WriteLine("total readCounter " + (readCounter + readCounter2));
Console.Out.WriteLine("addCounter " + addCounter);
Console.Out.WriteLine("addCounter2 " + addCounter2);
}
[Test]
public void TestDualCoreSpeed()
{
var speedThread1 = new Thread(SpeedTestLoop);
speedThread1.Name = "Speed Test 1";
var speedThread2 = new Thread(SpeedTestLoop2);
speedThread2.Name = "Speed Test 2";
speedThread1.Start();
speedThread2.Start();
Thread.Sleep(5000);
stopThread = true;
speedThread1.Join();
speedThread2.Join();
if (threadException != null)
{
throw new Exception("Thread failed: ", threadException);
}
Console.Out.WriteLine("readCounter " + readCounter);
Console.Out.WriteLine("readCounter2 " + readCounter2);
Console.Out.WriteLine("totoal readCounter " + (readCounter + readCounter2));
Console.Out.WriteLine("addCounter " + addCounter);
Console.Out.WriteLine("addCounter2 " + addCounter2);
Console.Out.WriteLine("addFailure" + addFailureCounter);
}
private void SpeedTestLoop()
{
try
{
while (!stopThread)
{
for (var i = 0; i < 500; i++)
{
++addCounter;
}
for (var i = 0; i < 500; i++)
{
readCounter++;
}
}
}
catch (Exception ex)
{
threadException = ex;
}
}
private void SpeedTestLoop2()
{
try
{
while (!stopThread)
{
for (var i = 0; i < 500; i++)
{
++addCounter2;
i++;
}
for (var i = 0; i < 500; i++)
{
readCounter2++;
}
}
}
catch (Exception ex)
{
threadException = ex;
}
}
}
}
Edit: I tested the above on a quad-core laptop without VMware and got similarly degraded performance. So I wrote another test similar to the above, but with each thread method in a separate class. My purpose in doing that was to test 4 cores.
That test showed excellent results, which improved almost linearly with 1, 2, 3, or 4 cores.
After some experimentation on both machines, it appears that proper performance only happens if the threads' main methods are on different instances instead of the same instance.
In other words, if the main entry methods of multiple threads are on the same instance of a particular class, then performance on a multi-core machine gets worse with each thread you add, instead of better as you might assume.
It almost appears that the CLR is "synchronizing" so that only one thread at a time can run that method. However, my testing says that isn't the case, so it's still unclear what's happening.
But my own problem seems to be solved simply by using separate instances for the methods that the threads start in.
Sincerely,
Wayne
EDIT:
Here's an updated unit test that tests 1, 2, 3, and 4 threads, all on the same instance of a class. The array elements used in each thread's loop are at least 10 elements apart. Performance still degrades significantly for each thread added.
using System;
using System.Threading;
using NUnit.Framework;
namespace TickZoom.Utilities.TickZoom.Utilities
{
[TestFixture]
public class MultiCoreSameClassTest
{
private ThreadTester threadTester;
public class ThreadTester
{
private Thread[] speedThread = new Thread[400];
private long[] addCounter = new long[400];
private long[] readCounter = new long[400];
private bool[] stopThread = new bool[400];
internal Exception threadException;
private int count;
public ThreadTester(int count)
{
for( var i=0; i<speedThread.Length; i+=10)
{
speedThread[i] = new Thread(SpeedTestLoop);
}
this.count = count;
}
public void Run()
{
for (var i = 0; i < count*10; i+=10)
{
speedThread[i].Start(i);
}
}
public void Stop()
{
for (var i = 0; i < stopThread.Length; i+=10 )
{
stopThread[i] = true;
}
for (var i = 0; i < count * 10; i += 10)
{
speedThread[i].Join();
}
if (threadException != null)
{
throw new Exception("Thread failed: ", threadException);
}
}
public void Output()
{
var readSum = 0L;
var addSum = 0L;
for (var i = 0; i < count; i++)
{
readSum += readCounter[i];
addSum += addCounter[i];
}
Console.Out.WriteLine("Thread readCounter " + readSum + ", addCounter " + addSum);
}
private void SpeedTestLoop(object indexarg)
{
var index = (int) indexarg;
try
{
while (!stopThread[index*10])
{
for (var i = 0; i < 500; i++)
{
++addCounter[index*10];
}
for (var i = 0; i < 500; i++)
{
++readCounter[index*10];
}
}
}
catch (Exception ex)
{
threadException = ex;
}
}
}
[SetUp]
public void Setup()
{
}
[Test]
public void SingleCoreTest()
{
TestCores(1);
}
[Test]
public void DualCoreTest()
{
TestCores(2);
}
[Test]
public void TriCoreTest()
{
TestCores(3);
}
[Test]
public void QuadCoreTest()
{
TestCores(4);
}
public void TestCores(int numCores)
{
threadTester = new ThreadTester(numCores);
threadTester.Run();
Thread.Sleep(5000);
threadTester.Stop();
threadTester.Output();
}
}
}
That's only 223 million read increments. What in god's creation are those 2 CPUs doing for those 5 seconds to get less work done?
You're probably running into cache contention. When a single CPU is incrementing your integer, it can do so in its own L1 cache, but as soon as two CPUs start "fighting" over the same value (or over different values that happen to sit on the same cache line), that cache line has to be copied back and forth between their caches each time one of them accesses it. The extra time spent copying data between caches adds up fast, especially when the operation you're doing (incrementing an integer) is so trivial.
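One common way to reduce this kind of contention is to give each thread's counter its own cache line. The sketch below shows that idea under the assumption of 64-byte cache lines; PaddingSketch, PaddedCounter and CountInParallel are illustrative names, not part of the original test.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class PaddingSketch
{
    // Each slot is 128 bytes with the value in the middle, so two slots
    // never share a cache line (assuming 64-byte lines).
    [StructLayout(LayoutKind.Explicit, Size = 128)]
    public struct PaddedCounter
    {
        [FieldOffset(64)] public long Value;
    }

    // Each thread increments only its own padded slot `iterations` times.
    public static long[] CountInParallel(int threadCount, long iterations)
    {
        var slots = new PaddedCounter[threadCount];
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int index = t; // copy the loop variable for the closure
            threads[t] = new Thread(() =>
            {
                for (long i = 0; i < iterations; i++) slots[index].Value++;
            });
        }
        foreach (var th in threads) th.Start();
        foreach (var th in threads) th.Join();

        var result = new long[threadCount];
        for (int t = 0; t < threadCount; t++) result[t] = slots[t].Value;
        return result;
    }

    public static void Main()
    {
        foreach (long count in CountInParallel(2, 100000000))
            Console.WriteLine(count);
    }
}
```

An even simpler option, where the algorithm allows it, is to count in a local variable inside each thread and write the total to shared memory once at the end.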
A few things:
You should probably test each setup at least 10 times and take the average.
As far as I know, Thread.Sleep is not exact; it depends on how the OS switches your threads.
Thread.Join is not immediate. Again, it depends on how the OS switches your threads.
A better way to test would be to run a computationally intensive operation (say, summing from one to a million) in two configurations and time them. For example:
Time how long it takes to sum from one to a million on one thread.
Time how long it takes to sum 1 to 500000 on one thread and 500001 to 1000000 on another.
You were right to think that two threads would work faster than one. But yours are not the only threads running: the OS has threads, your browser has threads, and so on. Keep in mind that your timings will not be exact and may even fluctuate.
Lastly, there are other reasons (see slide 24) why threads can run slower.
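The suggested comparison could be sketched roughly as follows. SumSketch, SumRange and SumOnTwoThreads are hypothetical names; for a range as small as one million, thread start-up cost may well dominate, so treat the timings as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class SumSketch
{
    // Sums the integers from..to (inclusive) on the calling thread.
    public static long SumRange(long from, long to)
    {
        long sum = 0;
        for (long i = from; i <= to; i++) sum += i;
        return sum;
    }

    // Splits 1..n into two halves and sums each half on its own thread.
    public static long SumOnTwoThreads(long n)
    {
        long mid = n / 2;
        long low = 0, high = 0; // each thread writes only its own variable
        var t1 = new Thread(() => low = SumRange(1, mid));
        var t2 = new Thread(() => high = SumRange(mid + 1, n));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return low + high;
    }

    public static void Main()
    {
        const long n = 1000000;

        var sw = Stopwatch.StartNew();
        long single = SumRange(1, n);
        sw.Stop();
        Console.WriteLine("One thread:  {0} in {1:F3} ms", single, sw.Elapsed.TotalMilliseconds);

        sw.Restart();
        long split = SumOnTwoThreads(n);
        sw.Stop();
        Console.WriteLine("Two threads: {0} in {1:F3} ms", split, sw.Elapsed.TotalMilliseconds);
    }
}
```

Each thread accumulates into a local inside SumRange and publishes its result once, so the two threads do not contend for a shared counter while they work.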
