Using Task.WaitAll(threads) is not blocking appropriately - C#

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace Threads
{
    class Program
    {
        static void Main(string[] args)
        {
            Action<int> TestingDelegate = (x321) => { Console.WriteLine(x321); };
            int x123 = Environment.ProcessorCount;
            MyParallelFor(0, 8, TestingDelegate);
            Console.Read();
        }

        public static void MyParallelFor(int inclusiveLowerBound, int exclusiveUpperBound, Action<int> body)
        {
            int size = exclusiveUpperBound - inclusiveLowerBound;
            int numProcs = Environment.ProcessorCount;
            int range = size / numProcs;
            var threads = new List<Task>(numProcs);
            for (int p = 0; p < numProcs; p++)
            {
                int start = p * range + inclusiveLowerBound;
                int end = (p == numProcs - 1) ? exclusiveUpperBound : start + range;
                Task.Factory.StartNew(() =>
                {
                    for (int i = start; i < end; i++) body(i);
                });
            }
            Task.WaitAll(threads.ToArray());
            Console.WriteLine("Done!");
        }
    }
}
Hi all, I implemented this code from the Patterns of Parallel Programming book, where it is done with raw threads; I decided to rewrite it using the TPL. The output below is what I get (the ordering is random, of course), but I expect "Done!" to always be printed last. For some reason that is not happening. Why is WaitAll not blocking?
Done!
1
0
2
6
5
4
3
7

You did not assign any tasks to the threads list on which you are calling WaitAll; your tasks are started independently of it. You need to create the tasks and put them in the threads collection before you call WaitAll. You can find more on how to add tasks to the task list you created in the MSDN documentation for the Task.WaitAll(Task[]) method.
Your code would be something like:
threads.Add(Task.Factory.StartNew(() =>
{
    for (int i = 0; i < 10; i++) ;
}));

You are not adding the tasks to your threads collection, so the collection is empty and there are no tasks to wait for. Change the code like this:
threads.Add(Task.Factory.StartNew(() =>
{
    for (int i = start; i < end; i++) body(i);
}));

The reason is quite simple: you are never adding anything to the threads list. You declare it and allocate space for numProcs entries, but you never call threads.Add.
Therefore the list is still empty and hence the Task.WaitAll doesn't wait on anything.
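
Putting the fix together, a corrected MyParallelFor might look like this (the question's own code, with the task stored in the list so WaitAll actually has something to wait on):

public static void MyParallelFor(int inclusiveLowerBound, int exclusiveUpperBound, Action<int> body)
{
    int size = exclusiveUpperBound - inclusiveLowerBound;
    int numProcs = Environment.ProcessorCount;
    int range = size / numProcs;
    var threads = new List<Task>(numProcs);
    for (int p = 0; p < numProcs; p++)
    {
        int start = p * range + inclusiveLowerBound;
        int end = (p == numProcs - 1) ? exclusiveUpperBound : start + range;
        // Keep a reference to each task so WaitAll can block on it.
        threads.Add(Task.Factory.StartNew(() =>
        {
            for (int i = start; i < end; i++) body(i);
        }));
    }
    Task.WaitAll(threads.ToArray()); // now blocks until every partition finishes
    Console.WriteLine("Done!");
}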

Related

C# .NET Core 3: RunSynchronously vs Run - which way is better for using the Google BigQuery API InsertRowsAsync?

I am working on a C# .NET Core 3.1 application that needs to insert 300-500 million rows of Avro file data into a GBQ (Google BigQuery) table. My idea is to batch the inserts using .NET Tasks so the data is inserted asynchronously without blocking the main thread, and to log a success or failure message when all tasks have finished. I wrote some sample code: if I use Task.Run(), it breaks the batchId and loses some data. Using RunSynchronously works fine, but it blocks the main thread and takes some time, which is still acceptable. I'm just wondering what's wrong with my code and whether Task.Run() is a good idea for my case. Thanks a lot! Here is my code: https://dotnetfiddle.net/CPKsMv Just in case that doesn't work, it's pasted here again:
using System;
using System.Collections;
using System.Threading.Tasks;

public class Program
{
    public static void Main()
    {
        ArrayList forecasts = new ArrayList();
        for (var k = 0; k < 100; k++)
        {
            forecasts.Add(k);
        }
        int size = 6;
        var taskNum = (int)Math.Ceiling(forecasts.Count / (double)size);
        Console.WriteLine("task number:" + taskNum);
        Console.WriteLine("item number:" + forecasts.Count);
        Task[] tasks = new Task[taskNum];
        var i = 0;
        for (i = 0; i < taskNum; i++)
        {
            int start = i * size;
            if (forecasts.Count - start < size)
            {
                size = forecasts.Count - start;
            }
            // Method 1: This works well, but takes some time to finish
            //tasks[i] = new Task(() => {
            //    var batchedforecastRows = forecasts.GetRange(start, size);
            //    GbqTable.InsertRowsAsync(batchedforecastRows);
            //    Console.WriteLine("batchID:" + (i + 1) + " [" + string.Join(",", batchedforecastRows.ToArray()) + "]");
            //});
            //tasks[i].RunSynchronously();

            // Method 2: will lose data (94, 95) and the batchId is messed up.
            // Sample print below:
            //   batchID:18 Inserted:[90,91,92,93]
            //   batchID:18 Inserted:[96,97,98,99]
            tasks[i] = Task.Run(() =>
            {
                var batchedforecastRows = forecasts.GetRange(start, size);
                // GbqTable.InsertRowsAsync(batchedforecastRows);
                Console.WriteLine("batchID:" + (i + 1) + " Inserted:[" + string.Join(",", batchedforecastRows.ToArray()) + "]");
            });
        }
    }
}
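
A likely culprit, judging only from the code shown: the lambda passed to Task.Run captures the loop counter i and the shared size variable, both of which the loop keeps mutating while earlier tasks are still running (which would explain both the duplicated batchID:18 and the missing 94, 95), and Main never waits for the tasks before returning. A minimal sketch of one possible fix, copying the captured state into per-iteration locals and waiting at the end (GbqTable stays commented out as in the original):

for (i = 0; i < taskNum; i++)
{
    int start = i * size;
    int batchSize = Math.Min(size, forecasts.Count - start); // leave the shared `size` untouched
    int batchId = i + 1;                                     // per-iteration copy of the loop counter
    tasks[i] = Task.Run(() =>
    {
        var batchedforecastRows = forecasts.GetRange(start, batchSize);
        // GbqTable.InsertRowsAsync(batchedforecastRows);
        Console.WriteLine("batchID:" + batchId + " Inserted:[" + string.Join(",", batchedforecastRows.ToArray()) + "]");
    });
}
Task.WaitAll(tasks); // otherwise Main can return before any batch has run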

Explain the following results I got from a performance test of OrderBy

I've read other SO posts on the complexity of LINQ's OrderBy function such as this one and so I'm wondering why the following test I made
using System;
using System.Linq;
using System.Collections.Generic;
using System.Diagnostics;

public class Program
{
    public static void Main()
    {
        double[] avgs = new double[100];
        int tests_per_size = 1000;
        Random rnd = new Random();
        Stopwatch stpw = new Stopwatch();
        for (int i = 1; i <= avgs.Length; ++i)
        {
            double sum = 0;
            int[] arr = new int[i];
            for (int j = 0; j < tests_per_size; ++j)
            {
                for (int k = 0; k < arr.Length; ++k)
                    arr[k] = rnd.Next(Int32.MinValue, Int32.MaxValue);
                stpw.Start();
                var slist = arr.OrderBy(x => x).ToList();
                stpw.Stop();
                sum += stpw.ElapsedTicks;
            }
            avgs[i - 1] = sum / (double)tests_per_size;
        }
        foreach (var t in avgs)
            Console.WriteLine(t);
    }
}
gave me the following results
15076,327
17261,652
19528,579
21993,155
24674,83
26927,163
29332,665
32018,45
35143,727
38955,111
43188,589
47605,542
52243,952
57166,918
63454,059
70261,749
75997,727
82249,885
88953,873
96958,163
104520,145
112432,1
120746,806
129694,464
138588,981
148007,988
157616,249
167493,94
177748,543
188904,677
200761,557
212235,986
225877,753
239173,783
252288,474
265901,092
279629,762
294529,835
309429,827
326944,916
343254,802
361306,427
378797,508
395831,364
413546,694
431166,319
449165,652
467562,618
487180,928
505969,021
525013,641
544555,831
564859,752
585357,237
606849,766
628464,581
651009,432
673865,517
697340,663
720709,903
744837,668
769024,863
793921,415
819441,534
845185,441
873421,004
901587,713
928140,083
955403,824
983023,284
1011295,028
1040868,504
1070366,748
1100416,455
1131158,53
1162260,852
1193641,253
1225165,58
1257410,12
1289450,658
1322668,533
1358718,074
1400162,62
1440996,876
1483102,815
1531781,127
1581157,377
1627831,867
1673969,553
1713026,287
1750012,667
1787497,946
1825893,268
1864184,643
1902912,621
1942420,978
1982395,399
2023052,109
2063803,114
2106027,85
Notice how it approximately doubles every 10 numbers.
Well, for one thing, you're never restarting your stopwatch, so the timings you're seeing are cumulative. If you change your Start() call to Restart() you'll get some saner values.
Another important point to make is that you're only testing arrays up to a size of 100, which is not nearly enough to clearly see the algorithm's asymptotic behavior.
Finally, note that you're not just testing OrderBy(): you're also testing ToList(). The effect won't be huge, but a good test should isolate the parts that you're really interested in.
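
A sketch of the inner loop with that fix applied. Note that OrderBy is deferred, so some form of enumeration (ToList, a foreach, etc.) has to stay inside the timed region or the sort never actually runs:

for (int j = 0; j < tests_per_size; ++j)
{
    for (int k = 0; k < arr.Length; ++k)
        arr[k] = rnd.Next(Int32.MinValue, Int32.MaxValue);
    stpw.Restart();                            // reset instead of accumulating across runs
    var slist = arr.OrderBy(x => x).ToList();  // ToList forces the deferred sort to execute
    stpw.Stop();
    sum += stpw.ElapsedTicks;
}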

Why is thread creation so fast?

I have read in books that thread creation is expensive (not as expensive as process creation, but expensive nevertheless) and that we should avoid it. I wrote this test code and was shocked by how fast thread creation is.
using System;
using System.Diagnostics;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Threading;

namespace ConsoleApplication1
{
    class Program
    {
        static int testVal = 0;

        static void Main(string[] args)
        {
            const int ThreadsCount = 10000;
            var watch = Stopwatch.StartNew();
            for (int i = 0; i < ThreadsCount; i++)
            {
                var myThread = new Thread(MainVoid);
                myThread.Start();
            }
            watch.Stop();
            Console.WriteLine("Test value ={0}", testVal);
            Console.WriteLine("Ended in {0} milliseconds", watch.ElapsedMilliseconds);
            Console.WriteLine("{0} milliseconds per thread ", (double)watch.ElapsedMilliseconds / ThreadsCount);
        }

        static void MainVoid()
        {
            Interlocked.Increment(ref testVal);
        }
    }
}
Output:
Test value =10000
Ended in 702 milliseconds
0,0702 milliseconds per thread.
Is my code wrong, or is thread creation really this fast and the advice in the books wrong? (I see only some extra memory consumption per thread, but no creation time.)
Thread creation is pretty slow. Consider this bit of code that contrasts the speed of doing things inline, and doing things with multiple threads:
private void DoStuff()
{
    const int ThreadsCount = 10000;
    var sw = Stopwatch.StartNew();
    int testVal = 0;
    for (int i = 0; i < ThreadsCount; ++i)
    {
        Interlocked.Increment(ref testVal);
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedTicks);

    sw = Stopwatch.StartNew();
    testVal = 0;
    for (int i = 0; i < ThreadsCount; ++i)
    {
        var myThread = new Thread(() =>
        {
            Interlocked.Increment(ref testVal);
        });
        myThread.Start();
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedTicks);
}
On my system, doing it inline requires 200 ticks. With threads it's almost 2 million ticks. So using threads here takes approximately 10,000 times as long. I used ElapsedTicks here rather than ElapsedMilliseconds because with ElapsedMilliseconds the output for the inline code was 0. The threads version takes around 700 milliseconds. Context switches are expensive.
In addition, your test is fundamentally flawed because you don't explicitly wait for all of the threads to finish before harvesting the result. It's quite possible that you could output the value of testVal before the last thread finishes incrementing it.
When timing code, by the way, you should be sure to run it in release mode without the debugger attached. In Visual Studio, use Ctrl+F5 (start without debugging).
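
For what it's worth, here is a sketch of the original test with an explicit wait added, so testVal is only read after every thread has finished (same structure as the question, plus Join):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class JoinedTimingTest
{
    static int testVal = 0;

    static void Main()
    {
        const int ThreadsCount = 10000;
        var threads = new List<Thread>(ThreadsCount);
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < ThreadsCount; i++)
        {
            var t = new Thread(() => Interlocked.Increment(ref testVal));
            threads.Add(t);
            t.Start();
        }
        foreach (var t in threads)
            t.Join();          // block until every thread has finished
        watch.Stop();
        Console.WriteLine("Test value = {0}", testVal); // now reliably 10000
        Console.WriteLine("Ended in {0} ms", watch.ElapsedMilliseconds);
    }
}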

Parallel.For vs for

I have a Parallel.For and a regular for loop doing some simple arithmetic, just to benchmark Parallel.For
My conclusion is that the regular for is faster on my i5 notebook processor.
This is my code
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            int Iterations = int.MaxValue / 1000;
            DateTime StartTime = DateTime.MinValue;
            DateTime EndTime = DateTime.MinValue;

            StartTime = DateTime.Now;
            Parallel.For(0, Iterations, i =>
            {
                OperationDoWork(i);
            });
            EndTime = DateTime.Now;
            Console.WriteLine(EndTime.Subtract(StartTime).ToString());

            StartTime = DateTime.Now;
            for (int i = 0; i < Iterations; i++)
            {
                OperationDoWork(i);
            }
            EndTime = DateTime.Now;
            Console.WriteLine(EndTime.Subtract(StartTime).ToString());

            StartTime = DateTime.Now;
            Parallel.For(0, Iterations, i =>
            {
                OperationDoWork(i);
            });
            EndTime = DateTime.Now;
            Console.WriteLine(EndTime.Subtract(StartTime).ToString());

            StartTime = DateTime.Now;
            for (int i = 0; i < Iterations; i++)
            {
                OperationDoWork(i);
            }
            EndTime = DateTime.Now;
            Console.WriteLine(EndTime.Subtract(StartTime).ToString());
        }

        private static void OperationDoWork(int i)
        {
            int a = 0;
            a += i;
            i = a;
            a *= 2;
            a = a * a;
            a = i;
        }
    }
}
And these are my results, which do not change much on repetition:
00:00:03.9062234
00:00:01.7971028
00:00:03.2231844
00:00:01.7781017
So why ever use Parallel.For?
Parallel processing has organization overhead. Think of it in terms of having 100 tasks and 10 people to do them. It's not easy to have 10 people working for you, just organizing who does what costs time in addition to actually doing the 100 tasks.
So if you want to do something in parallel, make sure it's so much work that the workload of organizing the parallelism is so small compared to the actual workload that it makes sense to do it.
One of the most common mistakes made when first delving into multithreading is the belief that multithreading is a free lunch.
In truth, splitting your operation into multiple smaller operations that can then run in parallel takes some extra time. If badly synchronized, your tasks may well spend even more time waiting for other tasks to release their locks.
As a result, parallelizing is not worth the time and trouble when each task does very little work, which is the case with OperationDoWork.
Edit:
Consider trying this out:
private static void OperationDoWork(int i)
{
    double a = 101.1D * i;
    for (int k = 0; k < 100; k++)
        a = Math.Pow(a, a);
}
According to my benchmark, for will average to 5.7 seconds, while Parallel.For will take 3.05 seconds on my Core2Duo CPU (speedup == ~1.87).
On my Quadcore i7, I get an average of 5.1 seconds with for, and an average of 1.38 seconds with Parallel.For (speedup == ~3.7).
This modified code scales very well to the number of physical cores available. Q.E.D.

Multiple threads slowing down overall dictionary access?

I am profiling a C# application, and it looks like two threads each calling Dictionary<>.ContainsKey() 5000 times on two separate but identical dictionaries (with only two items each) is twice as slow as one thread calling Dictionary<>.ContainsKey() 10000 times on a single dictionary.
I am measuring the "thread time" using a tool called JetBrains dotTrace. I am explicitly using copies of the same data, so I am not using any synchronization primitives. Is it possible that .NET is doing some synchronization behind the scenes?
I have a dual core machine, and there are three threads running: one is blocked using Semaphore.WaitAll() while the work is done on two new threads whose priority is set to ThreadPriority.Highest.
Obvious culprits, such as not actually running the code in parallel or not using a release build, have been ruled out.
EDIT:
People want the code. Alright then:
private int ReduceArrayIteration(VM vm, HeronValue[] input, int begin, int cnt)
{
    if (cnt <= 1)
        return cnt;
    int cur = begin;
    for (int i = 0; i < cnt - 1; i += 2)
    {
        // The next two calls are effectively dominated by a call
        // to dictionary ContainsKey
        vm.SetVar(a, input[begin + i]);
        vm.SetVar(b, input[begin + i + 1]);
        input[cur++] = vm.Eval(expr);
    }
    if (cnt % 2 == 1)
    {
        input[cur++] = input[begin + cnt - 1];
    }
    int r = cur - begin;
    Debug.Assert(r >= 1);
    Debug.Assert(r < cnt);
    return r;
}

// From VM
public void SetVar(string s, HeronValue o)
{
    Debug.Assert(o != null);
    frames.Peek().SetVar(s, o);
}

// From Frame
public bool SetVar(string s, HeronValue o)
{
    for (int i = scopes.Count; i > 0; --i)
    {
        // Scope is a derived class of Dictionary
        Scope tbl = scopes[i - 1];
        if (tbl.HasName(s))
        {
            tbl[s] = o;
            return false;
        }
    }
    return false;
}
Now here is the thread-spawning code, which might be naive:
public static class WorkSplitter
{
    static WaitHandle[] signals;

    public static void ThreadStarter(Object o)
    {
        Task task = o as Task;
        task.Run();
    }

    public static void SplitWork(List<Task> tasks)
    {
        signals = new WaitHandle[tasks.Count];
        for (int i = 0; i < tasks.Count; ++i)
            signals[i] = tasks[i].done;
        for (int i = 0; i < tasks.Count; ++i)
        {
            Thread t = new Thread(ThreadStarter);
            t.Priority = ThreadPriority.Highest;
            t.Start(tasks[i]);
        }
        Semaphore.WaitAll(signals);
    }
}
Even if there were any locking in Dictionary (there isn't), it could not affect your measurements, since each thread is using a separate dictionary. Running this test 10,000 times is not enough to get reliable timing data; ContainsKey() only takes 20 nanoseconds or so. You'll need at least several million iterations to avoid scheduling artifacts.
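
As a rough illustration of that last point (a sketch, not the original profiling setup), a lookup micro-benchmark only produces stable numbers once the iteration count reaches the millions:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class ContainsKeyBenchmark
{
    static void Main()
    {
        // Two-item dictionary, as in the question.
        var dict = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } };
        const int Iterations = 10000000;
        bool found = false;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
            found |= dict.ContainsKey("a"); // keep the result live so the call isn't optimized away
        sw.Stop();
        Console.WriteLine("{0} lookups in {1} ms ({2:F1} ns each), found={3}",
            Iterations, sw.ElapsedMilliseconds,
            sw.Elapsed.TotalMilliseconds * 1000000.0 / Iterations, found);
    }
}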
