Relative speed of jagged arrays, multidimensional arrays and object fields - C#

Some background: I'm an engineer, not a professional programmer, so apologies for any dumb questions. I have a program that is calculation-intensive and I want to speed it up. My understanding (from reading forums) was that C# is optimised for jagged arrays, but my test below seems to say otherwise. If I'm doing something wrong, please let me know. Also, the test which uses the Tray object (it could be any moderately complex object) seems too fast; is there a problem with the code?
The relative speeds I'm getting (in Debug) are below. (Note: I just realised the times include array initialisation, but that's OK; the arrays will be cleared or re-initialised many times in the program.)
Object/field access: 0 ms
Jagged array access: 87 ms
Normal array access: 12 ms
If I don't count initialising the arrays, I get 0, 7 and 7 ms respectively.
var watch = Stopwatch.StartNew();
double res = 0;
int count = 10000;
Tray tray = column[0][4];
for (int i = 0; i < count; i++)
{
    tray = column[0][4];
    tray.T = i;
    res = tray.T;
}
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
Debug.WriteLine("Column Solution Time1 ms: " + elapsedMs.ToString() + " " + res.ToString());
MessageBox.Show("Object Field " + elapsedMs.ToString());

watch = Stopwatch.StartNew();
double res2 = 0;
int count2 = count;
double[][] tray1 = new double[count2][];
for (int i = 0; i < count2; i++)
    tray1[i] = new double[count2];
for (int i = 0; i < count2; i++)
{
    tray1[i][i] = i;
    res2 = tray1[i][i];
}
watch.Stop();
elapsedMs = watch.ElapsedMilliseconds;
Debug.WriteLine("Column Solution Time2 ms: " + elapsedMs.ToString() + " " + res2.ToString());
MessageBox.Show("Jagged Matrix " + elapsedMs.ToString());

watch = Stopwatch.StartNew();
double res3 = 0;
int count3 = count;
double[,] tray3 = new double[count3, count3];
for (int i = 0; i < count3; i++)
{
    tray3[i, i] = i;
    res3 = tray3[i, i];
}
watch.Stop();
elapsedMs = watch.ElapsedMilliseconds;
Debug.WriteLine("Column Solution Time3 ms: " + elapsedMs.ToString() + " " + res3.ToString());
MessageBox.Show("Matrix " + elapsedMs.ToString());

Related

Adding integers from 2 arrays using Vector&lt;int&gt; takes longer than a traditional for loop

I am trying to use Vector&lt;int&gt; to add integer values from 2 arrays faster than a traditional for loop.
My Vector count is 4, which should mean that the addArrays_Vector function should run about 4 times faster than addArrays_Normally:
var vectSize = Vector&lt;int&gt;.Count;
This is true on my computer:
Vector.IsHardwareAccelerated
However, strangely enough, these are the benchmarks:
addArrays_Normally takes 475 milliseconds
addArrays_Vector takes 627 milliseconds
How is this possible? Shouldn't addArrays_Vector take only approximately 120 milliseconds? Am I doing this wrong?
void runVectorBenchmark()
{
    var v1 = new int[92564080];
    var v2 = new int[92564080];

    for (int i = 0; i < v1.Length; i++)
    {
        v1[i] = 2;
        v2[i] = 2;
    }

    //new Thread(() => addArrays_Normally(v1, v2)).Start();
    new Thread(() => addArrays_Vector(v1, v2, Vector<int>.Count)).Start();
}

void addArrays_Normally(int[] v1, int[] v2)
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();

    int sum = 0;
    int i = 0;
    for (i = 0; i < v1.Length; i++)
    {
        sum = v1[i] + v2[i];
    }

    stopWatch.Stop();
    MessageBox.Show("stopWatch: " + stopWatch.ElapsedMilliseconds.ToString() + " milliseconds\n\n");
}

void addArrays_Vector(int[] v1, int[] v2, int vectSize)
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();

    int[] retVal = new int[v1.Length];
    int i = 0;
    for (i = 0; i < v1.Length - vectSize; i += vectSize)
    {
        var va = new Vector<int>(v1, i);
        var vb = new Vector<int>(v2, i);
        var vc = va + vb;
        vc.CopyTo(retVal, i);
    }

    stopWatch.Stop();
    MessageBox.Show("stopWatch: " + stopWatch.ElapsedMilliseconds.ToString() + " milliseconds\n\n");
}
The two functions are different, and it looks like RAM bandwidth is the bottleneck here.
In the first example:
var v1 = new int[92564080];
var v2 = new int[92564080];
...
int sum = 0;
int i = 0;
for (i = 0; i < v1.Length; i++)
{
    sum = v1[i] + v2[i];
}
the code reads both arrays once, so the memory traffic is sizeof(int) * 92564080 * 2 == 4 * 92564080 * 2 == 706 MB.
In the second example:
var v1 = new int[92564080];
var v2 = new int[92564080];
...
int[] retVal = new int[v1.Length];
int i = 0;
for (i = 0; i < v1.Length - vectSize; i += vectSize)
{
    var va = new Vector<int>(v1, i);
    var vb = new Vector<int>(v2, i);
    var vc = va + vb;
    vc.CopyTo(retVal, i);
}
the code reads the two input arrays and writes into an output array, so the memory traffic is at least sizeof(int) * 92564080 * 3 == 1 059 MB.
Update:
RAM is much slower than the CPU and CPU caches. From the great article Memory Bandwidth Napkin Math, roughly:
L1 Bandwidth: 210 GB/s
...
RAM Bandwidth: 45 GB/s
So the extra memory traffic negates the vectorization speed-up.
The YouTube video mentioned does its comparison on different code; the non-vectorized code from the video is as follows, and it consumes the same amount of memory as the vectorized code:
int[] AddArrays_Simple(int[] v1, int[] v2)
{
    int[] retVal = new int[v1.Length];
    for (int i = 0; i < v1.Length; i++)
    {
        retVal[i] = v1[i] + v2[i];
    }
    return retVal;
}
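Conversely, a like-for-like comparison against the scalar sum loop would vectorize the accumulation without writing an output array, so the memory traffic matches. A rough sketch (the name AddArrays_VectorSum and the accumulation strategy are my own, not from the question):

// Sketch: vectorized accumulate with the same memory traffic as the scalar sum loop.
int AddArrays_VectorSum(int[] v1, int[] v2)
{
    var acc = Vector<int>.Zero;
    int vectSize = Vector<int>.Count;
    int i = 0;
    for (; i <= v1.Length - vectSize; i += vectSize)
    {
        acc += new Vector<int>(v1, i) + new Vector<int>(v2, i);
    }

    int sum = Vector.Dot(acc, Vector<int>.One); // horizontal sum of the accumulator lanes
    for (; i < v1.Length; i++)                  // scalar tail for the leftover elements
    {
        sum += v1[i] + v2[i];
    }
    return sum;
}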

Horse Racing Console App. Simulation - C#

I have a project which involves simulating a horse race and reporting who comes first, second and third. I have got as far as generating a random number for each measure of distance, and whichever horse has the highest total wins. I am having trouble working out first, second and third place correctly. I have the totals in an array and I'm not quite sure where to go from there.
Console.Write ("Type 'Begin' to start the race. ");
string startRace = Console.ReadLine ();
if (startRace == "Begin")
{
Console.Clear ();
Console.WriteLine ("You may now begin the race.");
Console.Clear ();
int[] tot = new int[numberOfHorses];
for (int i = 0; i < numberOfHorses; i++)
{
Console.Write (string.Format("{0, 10}: ", horseName[i]));
int total = 0;
for (int n = 1; n <= furlongs; n++)
{
int randomNum = rnd.Next (0, 10);
Console.Write (" " + randomNum + " ");
total = total + randomNum;
}
tot[i] = total;
Console.Write (" | " + total);
Console.WriteLine (" ");
} //This is where I start to get unsure of myself.
int firstPlace = Int32.MinValue
for (int place = 0; place < numberOfHorses; place++)
{
if (tot[place] > firstPlace)
{
firstPlace = tot[place];
}
}
}
'numberOfHorses' is how many horses the user has decided to race and 'horseName' is what the user has named each horse. Thanks. :)
You need a sorting function. Instead of:
int firstPlace = Int32.MinValue;
for (int place = 0; place < numberOfHorses; place++)
{
    if (tot[place] > firstPlace)
    {
        firstPlace = tot[place];
    }
}
try this:
int[] horseIndexes = new int[numberOfHorses];
for (int place = 0; place < numberOfHorses; place++)
{
    horseIndexes[place] = place;
}

// this is the sorting function here
// (a, b) => tot[b] - tot[a]
// it will sort in descending order
Array.Sort(horseIndexes, (a, b) => tot[b] - tot[a]);

for (int place = 0; place < horseIndexes.Length && place < 3; place++)
{
    Console.WriteLine("place: " + (place + 1));
    Console.WriteLine("horse: " + horseName[horseIndexes[place]]);
    Console.WriteLine("total: " + tot[horseIndexes[place]]);
}
There are better ways of doing this using LINQ expressions, but hopefully this example is the most understandable.
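For the record, a LINQ version of the same idea might look roughly like this (a sketch, assuming the same tot and horseName arrays and a using System.Linq directive):

// Sketch: indexes of the top three totals, in descending order
var podium = Enumerable.Range(0, numberOfHorses)
    .OrderByDescending(i => tot[i])
    .Take(3)
    .ToArray();

for (int place = 0; place < podium.Length; place++)
{
    Console.WriteLine("place: " + (place + 1));
    Console.WriteLine("horse: " + horseName[podium[place]]);
    Console.WriteLine("total: " + tot[podium[place]]);
}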
Emerald King should already know how to find the highest number in a list. Take the top number out (set it to 0) then repeat the process to find second, and again to find third.
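That "take the top number out and repeat" approach could be sketched like this (again assuming the tot and horseName arrays from the question; note that it overwrites the totals):

// Sketch: find the winner three times, zeroing out each winner's total
for (int place = 1; place <= 3; place++)
{
    int best = 0;
    for (int i = 1; i < numberOfHorses; i++)
    {
        if (tot[i] > tot[best])
        {
            best = i;
        }
    }
    Console.WriteLine("place " + place + ": " + horseName[best] + " (" + tot[best] + ")");
    tot[best] = 0; // take the top number out so the next pass finds the runner-up
}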

StopWatch gives random results

This is my first attempt at using Stopwatch to measure code performance, and I don't know what is wrong. I want to check whether the way I cast to double makes a difference when calculating the average of integers.
public static double Avarage(int a, int b)
{
    return (a + b + 0.0) / 2;
}

public static double AvarageDouble(int s, int d)
{
    return (double)(s + d) / 2;
}

public static double AvarageDouble2(int x, int v)
{
    return ((double)x + v) / 2;
}
Code to test these 3 methods, using StopWatch:
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 1000000; i++)
{
    var ret = Avarage(2, 3);
}
sw.Stop();
Console.Write("Using 0.0: " + sw.ElapsedTicks + "\n");

sw.Reset();
sw.Start();
for (int i = 0; i < 1000000; i++)
{
    var ret2 = AvarageDouble(2, 3);
}
sw.Stop();
Console.Write("Using Double(s+d): " + sw.ElapsedTicks + "\n");

sw.Reset();
sw.Start();
for (int i = 0; i < 1000000; i++)
{
    var ret3 = AvarageDouble2(2, 3);
}
sw.Stop();
Console.Write("Using double (x): " + sw.ElapsedTicks + "\n");
The results are random: sometimes Avarage is the fastest, other times AvarageDouble or AvarageDouble2. I use different variable names in each method, but that does not seem to matter.
What am I missing?
PS. What is the best method to calculate average with two ints as inputs?
I tested your code, and yes, the results were very random at times. Remember that Stopwatch only measures the time elapsed from sw.Start() to sw.Stop(). It does not take into account .NET's just-in-time compilation, operating system process scheduling, CPU load, etc.
This is more noticeable in methods with such small runtimes, where this noise can more than double the measured time.
A more elaborate explanation is given in the following SO question.
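As a rough way to cut down the noise without a full benchmarking framework, you could warm each method up once so it is JIT-compiled before timing, run far more iterations, and keep the result in a variable so the calls cannot be optimised away. A sketch for one of the methods (repeat the same pattern for the other two):

// Sketch: warm up, then time a large batch and consume the result
double sink = 0;
sink += Avarage(2, 3);        // warm-up call forces JIT compilation before timing
sink += AvarageDouble(2, 3);
sink += AvarageDouble2(2, 3);

var sw = Stopwatch.StartNew();
for (int i = 0; i < 100000000; i++)
{
    sink += Avarage(2, 3);
}
sw.Stop();
Console.WriteLine("Avarage: " + sw.ElapsedMilliseconds + " ms (sink = " + sink + ")");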

Trying to find the average number of attempts it takes for Tom to roll a six over 10 repetitions

It should take the total number of rolls and divide it by 10. For example, if it took 56 rolls, my average is 5.6.
Random numGen = new Random ();
int numOfAttempt = 0;
int Attempt = 0;
int avrBefore = 0;
int avrAfter = 0;
for (int i = 1; i <= 10; i++)
    do {
        Attempt = numGen.Next (1, 7);
        Console.WriteLine (Attempt);
        numOfAttempt++;
        avrBefore = numOfAttempt;
        avrAfter = avrBefore / 10;
    } while (Attempt != 6);
Console.WriteLine ("He tried " + numOfAttempt + " times to roll a six.");
Console.WriteLine ("The average number of times it took to get a six was " + avrAfter);
Console.ReadKey ();
Maybe you need to reset the variables inside the for loop, before the do/while:
for (int i = 1; i <= 10; i++) {
    int numOfAttempt = 0;
    int Attempt = 0;
    int avrBefore = 0;
    int avrAfter = 0;
    do { // your code
You are dividing by ten at the wrong time: you divide by ten even when you haven't yet done ten iterations.
The code below instead shows the correct average at each iteration.
void Main()
{
    Random numGen = new Random ();
    int totalAttempts = 0;
    for (int i = 1; i <= 10; i++)
    {
        int attempts = 0;
        int attempt = 0;
        do {
            attempt = numGen.Next (1, 7);
            attempts++;
        } while (attempt != 6);
        totalAttempts += attempts;
        Console.WriteLine ("He tried " + attempts + " times to roll a six.");
        Console.WriteLine ("The average number of times it took to get a six was " + (double)totalAttempts / i);
    }
}
So you want to know how many times it took to get a six. This is quite simple: you just have to record how many times you rolled the dice and how many times you got a six.
Random numGen = new Random ();
int attempt = 0;
int numOfAttempt = 0;
int numberOfSixes = 0;
int i;
for (i = 0; i < 10; i++) {
    do {
        attempt = numGen.Next (1, 7);
        Console.WriteLine (attempt);
        numOfAttempt++;
    } while (attempt != 6);
    numberOfSixes++;
}
Console.WriteLine ("He tried " + numOfAttempt + " times to roll a six.");
Console.WriteLine ("The average number of times it took to get a six was " + ((double)numOfAttempt / numberOfSixes));
Console.ReadKey ();
I hope you understand my approach.
I would appreciate it if you could elaborate on your thought process behind avrBefore, avrAfter and the division by 10. I didn't get that.
You can shorten the code down a lot and get it correct:
Random numGen = new Random();
int numOfAttempt = 0;
for (int i = 0; i < 10; i++)
{
    int attempt = 0;
    while (attempt != 6)
    {
        attempt = numGen.Next(1, 7);
        Console.WriteLine(attempt);
        numOfAttempt++;
    }
}
Console.WriteLine("He tried " + numOfAttempt + " times to roll a six.");
Console.WriteLine("The average number of times it took to get a six was " + numOfAttempt / 10.0);
Console.ReadKey();

Performance when Generating CPU Cache Misses

I am trying to learn about CPU cache performance in the world of .NET. Specifically I am working through Igor Ostovsky's article about Processor Cache Effects.
I have gone through the first three examples in his article and have recorded results that widely differ from his. I think I must be doing something wrong because the performance on my machine is showing almost the exact opposite results of what he shows in his article. I am not seeing the large effects from cache misses that I would expect.
What am I doing wrong? (bad code, compiler setting, etc.)
Here are the performance results on my machine:
If it helps, the processor on my machine is an Intel Core i7-2630QM. Here is info on my processor's cache:
I have compiled in x64 Release mode.
Below is my source code:
class Program
{
    static Stopwatch watch = new Stopwatch();
    static int[] arr = new int[64 * 1024 * 1024];

    static void Main(string[] args)
    {
        Example1();
        Example2();
        Example3();
        Console.ReadLine();
    }

    static void Example1()
    {
        Console.WriteLine("Example 1:");

        // Loop 1
        watch.Restart();
        for (int i = 0; i < arr.Length; i++) arr[i] *= 3;
        watch.Stop();
        Console.WriteLine(" Loop 1: " + watch.ElapsedMilliseconds.ToString() + " ms");

        // Loop 2
        watch.Restart();
        for (int i = 0; i < arr.Length; i += 32) arr[i] *= 3;
        watch.Stop();
        Console.WriteLine(" Loop 2: " + watch.ElapsedMilliseconds.ToString() + " ms");

        Console.WriteLine();
    }

    static void Example2()
    {
        Console.WriteLine("Example 2:");
        for (int k = 1; k <= 1024; k *= 2)
        {
            watch.Restart();
            for (int i = 0; i < arr.Length; i += k) arr[i] *= 3;
            watch.Stop();
            Console.WriteLine(" K = " + k + ": " + watch.ElapsedMilliseconds.ToString() + " ms");
        }
        Console.WriteLine();
    }

    static void Example3()
    {
        Console.WriteLine("Example 3:");
        for (int k = 1; k <= 1024 * 1024; k *= 2)
        {
            // 256 * 4 bytes per 32-bit int * k = k kilobytes
            arr = new int[256 * k];
            int steps = 64 * 1024 * 1024; // Arbitrary number of steps
            int lengthMod = arr.Length - 1;

            watch.Restart();
            for (int i = 0; i < steps; i++)
            {
                arr[(i * 16) & lengthMod]++; // (x & lengthMod) is equal to (x % arr.Length)
            }
            watch.Stop();

            Console.WriteLine(" Array size = " + arr.Length * 4 + " bytes: " + (int)(watch.Elapsed.TotalMilliseconds * 1000000.0 / arr.Length) + " nanoseconds per element");
        }
        Console.WriteLine();
    }
}
Why are you using i += 32 in the second loop? You are stepping over cache lines that way: 32 * 4 = 128 bytes, which is much bigger than the 64 bytes of a cache line.
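For reference, with 64-byte cache lines and 4-byte ints, a stride of 16 touches each cache line exactly once, which is presumably what the article's second loop is meant to show:

// 64-byte cache line / 4 bytes per int = 16 ints per line,
// so this touches every cache line once while doing 1/16 of the multiplications.
for (int i = 0; i < arr.Length; i += 16) arr[i] *= 3;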
