When I use the example from MSDN:
var queryA = from num in numberList.AsParallel()
select ExpensiveFunction(num); //good for PLINQ
var queryB = from num in numberList.AsParallel()
where num % 2 > 0
select num; //not as good for PLINQ
My example program:
static void Main(string[] args)
{
// ThreadPool.SetMinThreads(100, 100);
var numberList = new List<int>();
for (int i = 0; i <= 1000; i++)
{
numberList.Add(i);
}
Stopwatch sw = new Stopwatch();
sw.Start();
var queryA = from num in numberList
select ExpensiveFunction(num); //good for PLINQ
var c = queryA.ToList<int>();
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
var queryB = from num in numberList.AsParallel()
select ExpensiveFunction(num); //good for PLINQ
c = queryB.ToList<int>();
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Console.ReadKey();
}
static int ExpensiveFunction(int a)
{
a = a + 100 - 9 + 0 + 98;
// Console.WriteLine(a);
return a;
}
The result is:
7
41
Why is using AsParallel() slower than not using it?
Your ExpensiveFunction really isn't an expensive function for a computer.
Simple maths can be done extremely fast.
Perhaps try Thread.Sleep(500); instead. This pauses the current thread for half a second, which simulates the effect of an actual expensive function.
Edit — I should state that the reason it is slower is that the overhead of parallel processing (partitioning the input, scheduling work on threads, merging the results) is more work than the actual calculation. See this answer for a better explanation.
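For comparison, here is a minimal sketch (not from the original post) where the per-element work is heavy enough for AsParallel() to pay off; the Thread.Sleep(5) simply stands in for real work:
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
class PlinqDemo
{
    static int SlowFunction(int a)
    {
        Thread.Sleep(5); // simulate a genuinely expensive per-element operation
        return a + 100 - 9 + 0 + 98;
    }
    static void Main()
    {
        var numbers = Enumerable.Range(0, 200).ToList();
        var sw = Stopwatch.StartNew();
        var sequential = numbers.Select(SlowFunction).ToList();
        Console.WriteLine("Sequential: " + sw.ElapsedMilliseconds + " ms");
        sw.Restart();
        var parallel = numbers.AsParallel().Select(SlowFunction).ToList();
        Console.WriteLine("Parallel:   " + sw.ElapsedMilliseconds + " ms");
    }
}
On a multi-core machine the parallel query should finish several times faster here, because each element now costs far more than the partitioning overhead.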
I have the array shown below. I know the work needs to be done with threads, but I don't understand how. Do I need to split the array into parts, or can I parallelize it directly?
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
int[] a = new int[10000];
Random rand = new Random();
for (int i = 0; i < a.Length; i++)
{
a[i] = rand.Next(-100, 100);
}
foreach (var p in a)
Console.WriteLine(p);
TimeSpan ts = stopWatch.Elapsed;
stopWatch.Stop();
string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
ts.Hours, ts.Minutes, ts.Seconds,
ts.Milliseconds / 10);
Console.WriteLine("RunTime " + elapsedTime);
Another approach, compared to John Wu's, is to use a custom partitioner. I think that it is a little more readable.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
int[] a = new int[10000];
int batchSize = 1000;
Parallel.ForEach(Partitioner.Create(0, a.Length, batchSize), range =>
{
    // Random is not thread-safe, so each partition gets its own instance
    // (seeded differently so the partitions don't produce identical sequences).
    var rand = new Random(Guid.NewGuid().GetHashCode());
    for (int i = range.Item1; i < range.Item2; i++)
    {
        a[i] = rand.Next(-100, 100);
    }
});
In modern C#, you should almost never have to use Thread objects themselves-- they are fraught with peril, and there are other language features that will do the job just as well (see async and TPL). I'll show you a way to do it with TPL.
Note: Due to the problem of false sharing, you need to rig things so that the different threads are working on different memory areas. Otherwise you will see no gain in performance-- indeed, performance could get considerably worse. In this example I divide the array into blocks of 4,000 bytes (1,000 elements) each and work on each block in a separate thread.
using System;
using System.Linq;
using System.Threading.Tasks;
var array = new int[10000];
var offsets = Enumerable.Range(0, 10).Select( x => x * 1000 );
Parallel.ForEach( offsets, offset => {
    // one Random per block, since Random is not safe to share across threads
    var random = new Random(Guid.NewGuid().GetHashCode());
    for ( int i=0; i<1000; i++ )
    {
        array[offset + i] = random.Next( -100,100 );
    }
});
That all being said, I doubt you'll see much of a gain in performance in this example-- the array is much too small to be worth the additional overhead.
I try to optimize the performance of some calculation process.
A decent amount of time is wasted on calculations like the following:
var isBigger = Math.Abs((long) (a * 1e6) / 1e6D) > ((long) ((b + c) * 1e6)) / 1e6D;
where "a","b" and "c" are doubles, "b" and "c" are positive, "a" might be negative.
isBigger should be true only if absolute value of "a" is bigger than "b+c" disregarding anything after the 6th decimal digit.
So I look at this expression, I understand what it does, but it seems hugely inefficient to me, since it multiplies and divides compared numbers by million just to get rig of anything after 6 decimal places.
Below is the program I used to try and create a better solution. So far I failed.
Can someone help me?
class Program
{
static void Main(string[] args)
{
var arrLength = 1000000;
var arr1 = GetArrayOf_A(arrLength);
var arr2 = GetArrayOf_B(arrLength);
var arr3 = GetArrayOf_C(arrLength);
var result1 = new bool[arrLength];
var result2 = new bool[arrLength];
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < arrLength; i++)
{
result1[i] = Math.Abs((long) (arr1[i] * 1e6) / 1e6D)
>
(long) ((arr2[i] + arr3[i]) * 1e6) / 1e6D;
}
sw.Stop();
var t1 = sw.Elapsed.TotalMilliseconds;
sw.Restart();
for (var i = 0; i < arrLength; i++)
{
//result2[i] = Math.Round(Math.Abs(arr1[i]) - (arr2[i] + arr3[i]),6) > 0; // Incorrect, example by index = 0
//result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.000001; // Incorrect, example by index = 1
//result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.0000001; // Incorrect, example by index = 2
result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.00000001; // Incorrect, example by index = 3
}
sw.Stop();
var t2 = sw.Elapsed.TotalMilliseconds;
var areEquivalent = true;
for (var i = 0; i < arrLength; i++)
{
if (result1[i] == result2[i]) continue;
areEquivalent = false;
break;
}
Console.WriteLine($"Functions are equivalent : {areEquivalent}");
if (areEquivalent)
{
Console.WriteLine($"Current function total time: {t1}ms");
Console.WriteLine($"Equivalent function total time: {t2}ms");
}
Console.WriteLine("Press ANY key to quit . . .");
Console.ReadKey();
}
private static readonly Random _rand = new Random(DateTime.Now.Millisecond);
private const int NumberOfRepresentativeExamples = 4;
private static double[] GetArrayOf_A(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 2.4486382579120365;
arr[1] = -1.1716818990000011;
arr[2] = 5.996414627393257;
arr[3] = 6.0740085822069;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static double[] GetArrayOf_B(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 2.057823225;
arr[1] = 0;
arr[2] = 2.057823225;
arr[3] = 2.060649901;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static double[] GetArrayOf_C(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 0.3908145999796302;
arr[1] = 1.1716809269999997;
arr[2] = 3.9385910820740282;
arr[3] = 4.0133582670728858;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static void FillTheRestOfArray(double[] arr)
{
for (var i = NumberOfRepresentativeExamples; i < arr.Length; i++)
{
arr[i] = _rand.Next(0, 10) + _rand.NextDouble();
}
}
}
You don't need the division: if x/100 < y/100, then x < y, because dividing both sides by the same positive constant preserves the ordering.
for(var i = 0; i < arrLength; i++)
{
result2[i] = Math.Abs((long)(arr1[i] * 1e6))
> (long)((arr2[i] + arr3[i]) * 1e6);
}
with the results for me:
Arrays have 1000000 elements.
Functions are equivalent : True
Current function total time: 40.10ms 24.94 kflop
Equivalent function total time: 22.42ms 44.60 kflop
A speedup of 78.83 %
PS. Make sure you compare RELEASE builds of the binary, which include math optimizations.
PS2. The display code is
Console.WriteLine($"Arrays have {arrLength} elements.");
Console.WriteLine($"Functions are equivalent : {areEquivalent}");
Console.WriteLine($" Current function total time: {t1:F2}ms {arrLength/t1/1e3:F2} kflop");
Console.WriteLine($"Equivalent function total time: {t2:F2}ms {arrLength/t2/1e3:F2} kflop");
Console.WriteLine($"An speedup of {t1/t2-1:P2}");
Overall, your question goes into the area of real-time programming. Not necessarily with real-time constraints, but it goes into the same optimisation territory: the kind where every last nanosecond has to be shaved off.
.NET is not the ideal environment for this kind of operation. Usually this sort of thing is done in dedicated languages. The next best thing is doing it in assembler, C or native C++. .NET has additional features like the garbage collector and the just-in-time compiler that make even getting reliable benchmark results tricky, much less reliable runtime performance.
As for the data types, float should be about the fastest there is; for historical reasons, floating-point operations have been heavily optimized.
One of your comments mentions physics, and you do have arrays. I also see expressions like array[i] = array2[i] + array3[i], so maybe this should be a matrix operation you run on the GPU instead? This kind of huge parallelized array operation is exactly what the GPU is good at - it is what drawing to the screen is at its core.
Unless you tell us what you are actually doing here as an operation, that is about the best answer I can give.
Is this what you're looking for?
Math.Abs(a) - (b + c) > 0.000001
or if you want to know if the difference is bigger (difference either way):
Math.Abs(Math.Abs(a) - (b + c)) > 0.000001
(I'm assuming you're limiting yourself to this precision not because of speed but because of the inherently limited precision of floating point.)
In addition to asking this question on this site, I also asked a good friend of mine, and so far he provided the best answer. Here it is:
result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.000001 ||
Math.Abs((long)(arr1[i] * 1e6)) > (long)((arr2[i] + arr3[i])*1e6);
I am happy to have such friends :)
I have measured the execution time of two ways of raising a number to the power of 2:
1) Inline
result = b * b;
2) With a simple function call
result = Power(b);
When running in Debug mode, everything is as expected: calling a function is considerably more expensive than doing the calculation inline (385 ms inline vs. 570 ms with the function call).
In release mode, I'd expect the compiler to speed up the function call considerably, because it would internally inline the very small Power() function. But I would NOT expect the function call to be FASTER than the manually inlined calculation.
Most astonishingly this is the case: In the release build, the first run needs 109 ms and the second run with the call to Power() needs only 62 ms.
How can a function call be faster than manual inlining?
Here is the program for your reproduction:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Starting Test");
// 1. Calculating inline without function call
Stopwatch sw = Stopwatch.StartNew();
for (double d = 0; d < 100000000; d++)
{
double res = d * d;
}
sw.Stop();
Console.WriteLine("Checked: " + sw.ElapsedMilliseconds);
// 2. Calculating power with function call
Stopwatch sw2 = Stopwatch.StartNew();
for (int d = 0; d < 100000000; d++)
{
double res = Power(d);
}
sw2.Stop();
Console.WriteLine("Function: " + sw2.ElapsedMilliseconds);
Console.ReadKey();
}
static double Power(double d)
{
return d * d;
}
}
Your test is wrong. In the second part you use an int d instead of a double. Maybe that explains the time difference.
As Xavier correctly spotted, you are using double in one loop and int in the other. Changing both to the same type will make the results the same - I tested it.
Furthermore: What you are really measuring here is the duration of additions and comparisons. You are not measuring the duration of the squaring of d, because it simply is not happening: In a release build, the optimizer completely removes the body of the loop, because the result is not used. You can confirm this by commenting out the body of the loop. The duration will be the same.
Daniel Hilgarth is right: the calculation is not happening at all, as the result is not used (which is probably not the case in Debug mode). Try the following example and you'll get correct results:
static void Main(string[] args)
{
Console.WriteLine("Starting Test");
var list = new List<int>();
// 1. Calculating inline without function call
Stopwatch sw = Stopwatch.StartNew();
for (int d = 0; d < 100000000; d++)
{
int res = d * d;
list.Add(res);
}
sw.Stop();
Console.WriteLine("Checked: " + sw.ElapsedMilliseconds);
// 2. Calculating power with function call
list = new List<int>();
Stopwatch sw2 = Stopwatch.StartNew();
for (int d = 0; d < 100000000; d++)
{
int res = Power(d);
list.Add(res);
}
sw2.Stop();
Console.WriteLine("Function: " + sw2.ElapsedMilliseconds);
Console.ReadKey();
}
I have a requirement in my project (C#, VS2010, .NET 4.0) that a particular for loop must finish within 200 milliseconds. If it doesn't then it has to terminate after this duration without executing the remaining iterations. The loop generally goes for i = 0 to about 500,000 to 700,000 so the total loop time varies.
I have read following questions which are similar but they didn't help in my case:
What is the best way to exit out of a loop after an elapsed time of 30ms in C++
How to execute the loop for specific time
So far I have tried using a Stopwatch object to track the elapsed time but it's not working for me. Here are 2 different methods I have tried so far:
Method 1. Comparing the elapsed time within the for loop:
Stopwatch sw = new Stopwatch();
sw.Start();
for (i = 0; i < nEntries; i++) // nEntries is typically more than 500,000
{
// Do some stuff
...
...
...
if (sw.Elapsed > TimeSpan.FromMilliseconds(200))
break;
}
sw.Stop();
This doesn't work because if (sw.Elapsed > TimeSpan.FromMilliseconds(200)) takes more than 200 milliseconds to complete. Hence useless in my case. I am not sure whether TimeSpan.FromMilliseconds() generally takes this long or it's just in my case for some reason.
Method 2. Creating a separate thread to compare time:
Stopwatch sw = new Stopwatch();
sw.Start();
bool bDoExit = false;
int msLimit = 200;
System.Threading.ThreadPool.QueueUserWorkItem((x) =>
{
while (bDoExit == false)
{
if (sw.Elapsed.Milliseconds > msLimit)
{
bDoExit = true;
sw.Stop();
}
System.Threading.Thread.Sleep(10);
}
});
for (i = 0; i < nEntries; i++) // nEntries is typically more than 500,000
{
// Do some stuff
...
...
...
if (bDoExit == true)
break;
}
sw.Stop();
I have some other code in the for loop that prints some statistics. It tells me that in the case of Method 2, the for loop definitely breaks before completing all the iterations, but the loop timing is still 280-300 milliseconds.
Any suggestions to break a for loop strictly with-in 200 milliseconds or less?
Thanks.
For a faster comparison try comparing
if(sw.ElapsedMilliseconds > 200)
break;
You should do that check at the beginning of your loop and also during the processing (the "// Do some stuff" part of the code), because it is possible, for example, that an iteration starts at 190 ms (beginning of the loop), lasts 20 ms and ends at 210 ms.
You could also measure the average execution time of your processing (this is approximate because it relies on the average); this way the loop should last 200 milliseconds or less. Here is a demo that you can put in the Main method of a console application and easily adapt to your own code:
Stopwatch sw = new Stopwatch();
sw.Start();
string a = String.Empty;
int i;
decimal sum = 0, avg = 0, beginning = 0, end = 0;
for (i = 0; i < 700000; i++) // nEntries is typically more than 500,000
{
beginning = sw.ElapsedMilliseconds;
if (sw.ElapsedMilliseconds + avg > 200)
break;
// Some processing
a += "x";
int s = a.Length * 100;
Thread.Sleep(19);
/////////////
end = sw.ElapsedMilliseconds;
sum += end - beginning;
avg = sum / (i + 1);
}
sw.Stop();
Console.WriteLine(
"avg:{0}, count:{1}, milliseconds elapsed:{2}", avg, i + 1,
sw.ElapsedMilliseconds);
Console.ReadKey();
Another option would be to use CancellationTokenSource:
CancellationTokenSource source = new CancellationTokenSource(100);
while(!source.IsCancellationRequested)
{
// Do stuff
}
Use the first one - it is simple and has a better chance of being precise than the second one.
Both cases have the same kind of termination condition, so both should behave more or less the same. The second is much more complicated due to the use of threads and Sleep, so I'd use the first one. The second is also much less precise because of the sleeps.
There is absolutely no reason for TimeSpan.FromMilliseconds(200) to take any significant amount of time, even when it is called on every iteration.
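That said, you can hoist the conversion out of the loop and compare against a precomputed TimeSpan, or compare ElapsedMilliseconds directly. A minimal sketch, reusing the question's nEntries variable:
var limit = TimeSpan.FromMilliseconds(200); // computed once, not on every iteration
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < nEntries; i++)
{
    // Do some stuff
    if (sw.Elapsed > limit) // or: if (sw.ElapsedMilliseconds > 200)
        break;
}
sw.Stop();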
Using cancellation token:
var cancellationToken = new CancellationTokenSource(TimeSpan.FromSeconds(15)).Token;
while (!cancellationToken.IsCancellationRequested)
{
//Do stuff...
}
I don't know if this is exactly what you're after, but I think it's worth trying a System.Timers.Timer:
int msLimit = 200;
int nEntries = 500000;
bool cancel = false;
System.Timers.Timer t = new System.Timers.Timer();
t.Interval = msLimit;
t.Elapsed += (s, e) => cancel = true;
t.Start();
for (int i = 0; i < nEntries; i++)
{
// do sth
if (cancel) {
break;
}
}
t.Stop(); // stop the timer once the loop has finished
I'm using the Task Parallel Library (TPL) to calculate Fibonacci numbers.
The program is given below:
public static int Fib(int n)
{
if (n <= 1)
{
return n;
}
Task<int> task = Task.Factory.StartNew<int>(() => Fib(n - 1));
var p = Fib(n - 2);
return task.Result + p;
}
public static void Main(string[] args)
{
Stopwatch watch = new Stopwatch();
watch.Start();
Console.WriteLine("Answer: " + Fib(44));
watch.Stop();
Console.WriteLine("Time: " + watch.ElapsedMilliseconds);
}
}
Unfortunately this program takes a very long time to complete, while the serial version (as given below) takes less than 30 seconds to calculate the 44th Fibonacci number.
public class FibTester
{
public static int Fib(int n)
{
if (n <= 1)
{
return n;
}
var q = Fib(n - 1);
var p = Fib(n - 2);
return p + q;
}
public static void Main(string[] args)
{
Stopwatch watch = new Stopwatch();
watch.Start();
Console.WriteLine("Answer: " + Fib(44));
watch.Stop();
Console.WriteLine("Time: " + watch.ElapsedMilliseconds);
}
}
I think the issue in the parallel version is that it creates a thread for each Fib(n - 1) request. Is there any way to control the number of threads created in the TPL?
This is a perfect example of how not to multithread!
You are creating a new task for each iteration of a recursive function. So each task creates a new task, waits for that task to finish and then adds the numbers from the result.
Each thread has two jobs: 1) create a new thread, and 2) add two numbers.
The overhead cost for creating each thread is going to far outweigh the cost of adding two numbers together.
To answer your question about limiting the number of threads created, the TPL uses the ThreadPool. You can limit the number of threads using ThreadPool.SetMaxThreads.
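If you do want to experiment with that, here is a minimal sketch of those ThreadPool calls (the cap of 8 worker threads is an arbitrary illustration, not a recommendation):
using System;
using System.Threading;
// Read the current limits, then cap the number of worker threads.
ThreadPool.GetMaxThreads(out int workers, out int completionPorts);
ThreadPool.SetMaxThreads(Math.Min(workers, 8), completionPorts);
Bear in mind that shrinking the pool will not make the recursive-task design fast; the per-task overhead remains.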
I think it is pretty clear that Fibonacci cannot be parallelized unless you know some pairs of adjacent Fibonacci numbers ahead of time.
Just go for the iterative code.
Whatever you do, don't spawn a Task/Thread on each iteration/recursion! The overhead will kill the performance. That is a big, fat anti-pattern, even where parallelization applies.
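As a rough illustration of that point (this is not the poster's code), an iterative version runs in O(n) additions and needs no threads at all:
// Minimal iterative Fibonacci; note that long overflows past Fib(92).
static long FibIterative(int n)
{
    long a = 0, b = 1;
    for (int i = 0; i < n; i++)
    {
        long t = a + b;
        a = b;
        b = t;
    }
    return a;
}
With this, Fib(44) = 701408733 is computed in a few microseconds.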
Just for fun :)
using System;
using System.Linq;
using System.Threading.Tasks;
public class Program
{
static readonly double sqrt5 = Math.Sqrt(5);
static readonly double p1 = (1 + sqrt5) / 2;
static readonly double p2 = -1 * (p1 - 1);
static ulong Fib1(int n) // surprisingly slightly slower than Fib2
{
double n1 = Math.Pow(p1, n+1);
double n2 = Math.Pow(p2, n+1);
return (ulong)((n1-n2)/sqrt5);
}
static ulong Fib2(int n) // 40x faster than Fib3
{
double n1 = 1.0;
double n2 = 1.0;
for (int i=0; i<n+1; i++)
{
n1*=p1;
n2*=p2;
}
return (ulong)((n1-n2)/sqrt5);
}
static ulong Fib3(int n) // that's fast! Done in 1.32s
{
double n1 = 1.0;
double n2 = 1.0;
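// Note: n1 and n2 are shared across iterations and mutated without synchronization,
// so this Parallel.For races and its result is not reliable.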
Parallel.For(0,n+1,(x)=> {
n1 *= p1;
n2 *= p2;
});
return (ulong)((n1-n2)/sqrt5);
}
public static void Main(string[] args)
{
for (int j=0; j<100000; j++)
for (int i=0; i<90; i++)
Fib1(i);
for (int i=0; i<90; i++)
Console.WriteLine(Fib1(i));
}
}
Your program is very inefficient, because the same calculations are repeated over and over (Fib(n - 1) recalculates the Fibonacci numbers for everything below n - 2, which has already been computed).
You should try this:
class Program
{
static void Main(string[] args)
{
var sw = new Stopwatch();
sw.Start();
foreach (var nbr in Fibo().Take(5000))
{
Console.Write(nbr.ToString() + " ");
}
sw.Stop();
Console.WriteLine();
Console.WriteLine("Ellapsed : " + sw.Elapsed.ToString());
Console.ReadLine();
}
static IEnumerable<long> Fibo()
{
long a = 0;
long b = 1;
long t;
while (true)
{
t = a + b;
yield return t;
a = b;
b = t;
}
}
}
The 44th number is found in 5 ms.
The slowest part of the code is the Console.Write in the loop.
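To see the raw computation cost, a variant (a sketch, not part of the original answer) is to buffer the output and write it once; it reuses the Fibo() iterator above and takes only the first 90 numbers, since long overflows shortly after that:
var sw = new Stopwatch();
sw.Start();
var sb = new System.Text.StringBuilder();
foreach (var nbr in Fibo().Take(90)) // long overflows past the 92nd number
{
    sb.Append(nbr).Append(' ');
}
sw.Stop();
Console.WriteLine(sb.ToString());
Console.WriteLine("Elapsed : " + sw.Elapsed.ToString());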