Fill an array with random numbers using threads in C#

So, as said in the title, I'm trying to fill an array of bytes with random numbers using 16 threads (in my case). It takes about six and a half seconds to fill an array of 500,000,000 bytes using one thread, so logic says that using 16 threads should be at least 10 times faster. But when I tried it, it took 15 seconds. What I did was give each thread one segment of the same array to fill.
here is the code:
static byte[] nums = new byte[500000000];
static byte[] nums2 = new byte[500000000];
static Random rnd = new Random(123);

static void fill()
{
    for (int i = 0; i < nums.Length; i++)
        nums[i] = (byte)rnd.Next(10);
}
static void fillPart(object ID)
{
    var part = nums2.Length / Environment.ProcessorCount;
    int baseN = (int)ID * part;
    for (int i = baseN; i < baseN + part; i++)
        nums2[i] = (byte)rnd.Next(10);
    Console.WriteLine("Done! " + ID);
}
static void Main(string[] args)
{
    Stopwatch watch = new Stopwatch();
    watch.Start();
    fill();
    watch.Stop();
    Console.WriteLine("it took " + watch.Elapsed);
    Console.WriteLine();

    watch.Reset();
    watch.Start();
    Thread[] threads = new Thread[Environment.ProcessorCount];
    for (int i = 0; i < Environment.ProcessorCount; i++)
    {
        threads[i] = new Thread(fillPart);
        threads[i].Start(i);
    }
    for (int i = 0; i < Environment.ProcessorCount; i++)
        threads[i].Join();
    watch.Stop();
    Console.WriteLine("it took " + watch.Elapsed);
}
I would like to understand why it took 15 seconds, or what I did wrong.

If it were me, I'd just:
byte[] nums = new byte[500000000];
RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
rng.GetBytes(nums);
and be done with it.
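As an aside (my note, assuming .NET Core 2.1 or later): the static RandomNumberGenerator.Fill helper does the same thing in one call, without managing an instance:

// Assumes .NET Core 2.1+ / .NET 5+, where RandomNumberGenerator.Fill(Span<byte>) exists.
byte[] nums = new byte[500000000];
System.Security.Cryptography.RandomNumberGenerator.Fill(nums);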
If you really want threads:
RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
var dop = 8;
var batchSize = 500000000 / dop;
var bigBytes = Enumerable.Range(0, dop).AsParallel().SelectMany(t =>
{
    var bytes = new byte[batchSize];
    rng.GetBytes(bytes); // This *IS* thread-safe
    return bytes;
}).ToArray();
but I suspect the time spent collating into a new array by SelectMany followed by ToArray might make this more expensive than the single-thread approach.
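If the collation cost is a concern, one variation (a sketch of mine, not benchmarked here) is to preallocate the destination and let each worker fill a disjoint slice, so no SelectMany/ToArray pass is needed:

var rng = new RNGCryptoServiceProvider();
var dop = 8;
var nums = new byte[500000000];
var batchSize = nums.Length / dop;

Parallel.For(0, dop, t =>
{
    // Fill a private buffer, then copy it into this worker's disjoint slice.
    var bytes = new byte[batchSize];
    rng.GetBytes(bytes); // thread-safe
    Buffer.BlockCopy(bytes, 0, nums, t * batchSize, batchSize);
});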

The problem is that you declare a single static Random rnd = new Random(123) shared by all threads. Random is not thread-safe: calling rnd.Next(10) from several threads at once contends on its shared internal state and can corrupt it. If you create the Random instance inside the fillPart method, the program finishes in about one second, so the problem is not the threads, it is the shared rnd.Next(10):
static void fillPart(object ID)
{
    Random rnd = new Random(123); // per-thread instance (note: same seed => identical segments)
    var part = nums2.Length / Environment.ProcessorCount;
    int baseN = (int)ID * part;
    for (int i = baseN; i < baseN + part; i++)
    {
        nums2[i] = (byte)rnd.Next(10);
    }
    //Console.WriteLine("Done! " + ID);
}
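One wrinkle with the fix above (my note): every thread is seeded with 123, so all segments contain the same sequence. A sketch of a ThreadLocal<Random> with a distinct seed per thread (on .NET 6+ the built-in thread-safe Random.Shared is another option):

static int seedSource = Environment.TickCount;

// Each thread lazily creates its own Random with a unique seed.
static ThreadLocal<Random> threadRnd = new ThreadLocal<Random>(
    () => new Random(Interlocked.Increment(ref seedSource)));

static void fillPart(object ID)
{
    Random rnd = threadRnd.Value;
    var part = nums2.Length / Environment.ProcessorCount;
    int baseN = (int)ID * part;
    for (int i = baseN; i < baseN + part; i++)
        nums2[i] = (byte)rnd.Next(10);
}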


Ryzen vs. i7 Multi-Threaded Performance

I made the following C# Console App:
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Threading.Tasks;
using System.Timers;

class Program
{
    static RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
    public static ConcurrentDictionary<int, int> StateCount { get; set; }
    static int length = 1000000000;

    static void Main(string[] args)
    {
        StateCount = new ConcurrentDictionary<int, int>();
        for (int i = 0; i < 3; i++)
        {
            StateCount.AddOrUpdate(i, 0, (k, v) => 0);
        }

        Console.WriteLine("Processors: " + Environment.ProcessorCount);
        Console.WriteLine("Starting...");
        Console.WriteLine();

        Timer t = new Timer(1000);
        t.Elapsed += T_Elapsed;
        t.Start();

        Stopwatch sw = new Stopwatch();
        sw.Start();
        Parallel.For(0, length, (i) =>
        {
            var rand = GetRandomNumber();
            int newState = 0;
            if (rand < 0.3)
            {
                newState = 0;
            }
            else if (rand < 0.6)
            {
                newState = 1;
            }
            else
            {
                newState = 2;
            }
            StateCount.AddOrUpdate(newState, 0, (k, v) => v + 1);
        });
        sw.Stop();
        t.Stop();

        Console.WriteLine();
        Console.WriteLine("Total time: " + sw.Elapsed.TotalSeconds);
        Console.ReadKey();
    }

    private static void T_Elapsed(object sender, ElapsedEventArgs e)
    {
        int total = 0;
        for (int i = 0; i < 3; i++)
        {
            if (StateCount.TryGetValue(i, out int value))
            {
                total += value;
            }
        }
        int percent = (int)Math.Round((total / (double)length) * 100);
        Console.Write("\r" + percent + "%");
    }

    public static double GetRandomNumber()
    {
        var bytes = new Byte[8];
        rng.GetBytes(bytes);
        var ul = BitConverter.ToUInt64(bytes, 0) / (1 << 11);
        Double randomDouble = ul / (Double)(1UL << 53);
        return randomDouble;
    }
}
Before running this, the Task Manager reported <2% CPU usage (across all runs and machines).
I ran it on a machine with a Ryzen 3800X. The output was:
Processors: 16
Total time: 209.22
The speed reported in the Task Manager while it ran was ~4.12 GHz.
I ran it on a machine with an i7-7820HK. The output was:
Processors: 8
Total time: 213.09
The speed reported in the Task Manager while it ran was ~3.45 GHz.
I modified Parallel.For to include the processor count (Parallel.For(0, length, new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount }, (i) => {code});). The outputs were:
3800X: 16 - 158.58 # ~4.13
7820HK: 8 - 210.49 # ~3.40
There's something to be said about Parallel.For not natively identifying the Ryzen processors vs. cores, but setting that aside, even here the Ryzen performance is still significantly poorer than would be expected (only ~25% faster with double the cores/processors, a higher clock speed, and larger L1-L3 caches). Can anyone explain why?
Edit: Following a couple of comments, I made some changes to my code. See below:
static int length = 1000;

static void Main(string[] args)
{
    StateCount = new ConcurrentDictionary<int, int>();
    for (int i = 0; i < 3; i++)
    {
        StateCount.AddOrUpdate(i, 0, (k, v) => 0);
    }

    var procCount = Environment.ProcessorCount;
    Console.WriteLine("Processors: " + procCount);
    Console.WriteLine("Starting...");
    Console.WriteLine();

    List<double> times = new List<double>();
    Stopwatch sw = new Stopwatch();
    for (int m = 0; m < 10; m++)
    {
        sw.Restart();
        Parallel.For(0, length, new ParallelOptions() { MaxDegreeOfParallelism = procCount }, (i) =>
        {
            for (int j = 0; j < 1000000; j++)
            {
                var rand = GetRandomNumber();
                int newState = 0;
                if (rand < 0.3)
                {
                    newState = 0;
                }
                else if (rand < 0.6)
                {
                    newState = 1;
                }
                else
                {
                    newState = 2;
                }
                StateCount.AddOrUpdate(newState, 0, (k, v) => v + 1);
            }
        });
        sw.Stop();
        Console.WriteLine("Total time: " + sw.Elapsed.TotalSeconds);
        times.Add(sw.Elapsed.TotalSeconds);
    }

    Console.WriteLine();
    var avg = times.Average();
    var variance = times.Select(x => (x - avg) * (x - avg)).Sum() / times.Count;
    var stdev = Math.Sqrt(variance);
    Console.WriteLine("Average time: " + avg + " +/- " + stdev);
    Console.ReadKey();
}
The outer loop is 1,000 instead of 1,000,000,000, so there are "only" 1,000 parallel "tasks." Within each parallel "task," however, there's now a loop of 1,000,000 actions, so the overhead of "getting the task" or whatever should have a much smaller effect on the total. I also loop the whole thing 10 times and take the average and standard deviation. Output:
Ryzen 3800X: 158.531 +/- 0.429 # ~4.13
i7-7820HK: 202.159 +/- 2.538 # ~3.48
Even here, the Ryzen's twice as many threads and 0.60 GHz higher clock only bring the total time down to about 78% of the i7's (roughly 1.3x faster).
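One observation worth separating from the hardware question (my aside, not from the original post): every iteration goes through the shared RNGCryptoServiceProvider and a ConcurrentDictionary.AddOrUpdate, so the benchmark partly measures synchronization rather than raw compute. A sketch using the Parallel.For overload with thread-local state, which aggregates counts per worker and merges them once at the end:

// Sketch: per-worker counters, merged once per worker instead of
// one ConcurrentDictionary update per iteration.
var totals = new int[3];
Parallel.For(0, length,
    () => new int[3],                  // localInit: private counters per worker
    (i, state, local) =>
    {
        var rand = GetRandomNumber();  // still hits the shared RNG
        int newState = rand < 0.3 ? 0 : rand < 0.6 ? 1 : 2;
        local[newState]++;
        return local;
    },
    local =>                           // localFinally: merge once per worker
    {
        for (int s = 0; s < 3; s++)
            Interlocked.Add(ref totals[s], local[s]);
    });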

Adding integers from 2 arrays using Vector<int> takes longer than a traditional for loop

I am trying to use Vector to add integer values from 2 arrays faster than a traditional for loop.
My Vector<int>.Count is 4, which should mean that the addArrays_Vector function should run about 4 times faster than addArrays_Normally:
var vectSize = Vector<int>.Count;
This is true on my computer:
Vector.IsHardwareAccelerated
However, strangely enough, these are the benchmarks:
addArrays_Normally takes 475 milliseconds
addArrays_Vector takes 627 milliseconds
How is this possible? Shouldn't addArrays_Vector take only approx. 120 milliseconds? I wonder, am I doing this wrong?
void runVectorBenchmark()
{
    var v1 = new int[92564080];
    var v2 = new int[92564080];
    for (int i = 0; i < v1.Length; i++)
    {
        v1[i] = 2;
        v2[i] = 2;
    }
    //new Thread(() => addArrays_Normally(v1, v2)).Start();
    new Thread(() => addArrays_Vector(v1, v2, Vector<int>.Count)).Start();
}
void addArrays_Normally(int[] v1, int[] v2)
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();
    int sum = 0;
    int i = 0;
    for (i = 0; i < v1.Length; i++)
    {
        sum = v1[i] + v2[i];
    }
    stopWatch.Stop();
    MessageBox.Show("stopWatch: " + stopWatch.ElapsedMilliseconds.ToString() + " milliseconds\n\n");
}
void addArrays_Vector(int[] v1, int[] v2, int vectSize)
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();
    int[] retVal = new int[v1.Length];
    int i = 0;
    for (i = 0; i < v1.Length - vectSize; i += vectSize)
    {
        var va = new Vector<int>(v1, i);
        var vb = new Vector<int>(v2, i);
        var vc = va + vb;
        vc.CopyTo(retVal, i);
    }
    stopWatch.Stop();
    MessageBox.Show("stopWatch: " + stopWatch.ElapsedMilliseconds.ToString() + " milliseconds\n\n");
}
The two functions are different, and it looks like RAM bandwidth is the bottleneck here:
in the first example
var v1 = new int[92564080];
var v2 = new int[92564080];
...
int sum = 0;
int i = 0;
for (i = 0; i < v1.Length; i++)
{
    sum = v1[i] + v2[i];
}
The code reads each array once, so the memory traffic is: sizeof(int) * 92564080 * 2 == 4 * 92564080 * 2 == 706 MB.
in the second example
var v1 = new int[92564080];
var v2 = new int[92564080];
...
int[] retVal = new int[v1.Length];
int i = 0;
for (i = 0; i < v1.Length - vectSize; i += vectSize)
{
    var va = new Vector<int>(v1, i);
    var vb = new Vector<int>(v2, i);
    var vc = va + vb;
    vc.CopyTo(retVal, i);
}
The code reads two input arrays and writes into an output array, so the memory traffic is at least sizeof(int) * 92564080 * 3 == 1059 MB.
Update:
RAM is much slower than the CPU and its caches. From the great article Memory Bandwidth Napkin Math, roughly:
L1 Bandwidth: 210 GB/s
...
RAM Bandwidth: 45 GB/s
So the extra memory traffic negates the vectorization speed-up.
The YouTube video mentioned is comparing different code; the non-vectorized code from the video is as follows, which consumes the same amount of memory as the vectorized code:
int[] AddArrays_Simple(int[] v1, int[] v2)
{
    int[] retVal = new int[v1.Length];
    for (int i = 0; i < v1.Length; i++)
    {
        retVal[i] = v1[i] + v2[i];
    }
    return retVal;
}
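For a like-for-like comparison (a sketch of mine, assuming using System.Numerics), the vectorized version of that same function would also write into retVal and cover the tail elements that the original loop condition (i < v1.Length - vectSize) skips:

int[] AddArrays_Vectorized(int[] v1, int[] v2)
{
    int[] retVal = new int[v1.Length];
    int vectSize = Vector<int>.Count;
    int i = 0;

    // SIMD loop over full vector-width chunks.
    for (; i <= v1.Length - vectSize; i += vectSize)
    {
        var va = new Vector<int>(v1, i);
        var vb = new Vector<int>(v2, i);
        (va + vb).CopyTo(retVal, i);
    }

    // Scalar tail for any remaining elements.
    for (; i < v1.Length; i++)
        retVal[i] = v1[i] + v2[i];

    return retVal;
}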

Fastest way to accomplish a lookup in C# and how to deal with imprecision of double values?

I have an array of doubles (from a calculation) with 327,680 values. I need to translate those values into a color image with 8 bits per color. I need a lookup table that maps each double value, used as an index, to a byte[3] of RGB values serving as the visual representation of that temperature value. I have at most 15 ms to do this, no more.
The best idea I have come up with is using a dictionary for the colors. Here is minimal, complete and verifiable code that I have used for testing:
//Create lookup table
Dictionary<int, byte[]> Lookup = new Dictionary<int, byte[]>();
for (int i = 0; i < 1200; i++)
{
    byte bValue = (byte)i;
    byte[] b = new byte[3] { bValue, bValue, bValue };
    Lookup.Add(i, b);
}

//Make proto temp readings
int[] temps = new int[640 * 512];
Random r = new Random();
for (int i = 0; i < 640 * 512; i++)
{
    temps[i] = r.Next(0, 255);
}

int size = 640 * 512 * 3;
byte[] imageValues = new byte[size];
for (int i = 0; i < 50; i++)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int index = 0;
    foreach (int item in temps)
    {
        byte[] pixel = new byte[3];
        if (Lookup.TryGetValue(item, out pixel))
        {
            imageValues[index] = pixel[0];
            imageValues[index + 1] = pixel[1];
            imageValues[index + 2] = pixel[2];
            index += 3;
        }
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
}
First question: when I run this I get times in the 10 ms - 14 ms range, depending on whether the lookup table has 1200 items or 256 items. Is there a way to speed this up?
Second question: My actual key values will be temperatures (double) that are the result of a calculation. For some reason doubles seem to have a little imprecision in the least significant digits. I have noticed that a result that should have turned out as 25 ends up being 25.00000000012 or something like that. If I am using doubles as the search value then I have the real risk of looking for 25 when the actual key value is 25.00000000012 or vice versa.
I can truncate or something when I am creating the doubles but I am worried about the time to do that.
What are some good strategies for dealing with the double imprecision issue when using the double as a key?
First question: Is there a way to speed this up?
You have an unnecessary memory allocation:
byte[] pixel = new byte[3];
You can either use a bare variable declaration
byte[] pixel;
or use inline variable declaration
if (Lookup.TryGetValue(item, out byte[] pixel))
This change improves performance in my tests.
As Ivan said, moving the memory allocation out will save you ~20%.
You can save another 50% if you create a lookup array with all possible temperature values (just use the resolution of your sensor).
//Create array lookup table
List<byte[]> array = new List<byte[]>(25500);
for (int i = 0; i < 25500; i++)
{
    byte bValue = (byte)i;
    byte[] b = new byte[3] { bValue, bValue, bValue };
    array.Add(b);
}
This will give you temperatures from 0 to 255.00, in steps of 0.01.
Then you can access the desired value like so
int size = 640 * 512 * 3;
byte[] imageValues = new byte[size];
var sw = new Stopwatch();
byte[] pixel = new byte[3];
for (int i = 0; i < 50; i++)
{
    sw.Start();
    int index = 0;
    foreach (var item in temps)
    {
        pixel = array[item * 100];
        imageValues[index] = pixel[0];
        imageValues[index + 1] = pixel[1];
        imageValues[index + 2] = pixel[2];
        index += 3;
    }
}
sw.Stop();
Console.WriteLine($"{sw.ElapsedMilliseconds}/{sw.ElapsedMilliseconds / 50.0}");
This will bring you below 5 ms for a full lookup pass.
You can solve both problems by replacing the Dictionary<T,byte[]> with a byte[][], and mapping each temperature double to an int index into the color array.
So take the range of temperatures, divide it into N equal partitions, where N is the number of elements in your color array. Take each measured temperature and map it to a partition number, which is also an array index into the colors.
The function to map a temperature to an array index would be something like:
temp => (int)(pixelValues * (temp - minTemp) / (maxTemp - minTemp));
E.g.:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApp21
{
    class Program
    {
        static void Main(string[] args)
        {
            double maxTemp = 255;
            double minTemp = -35;
            int pixelValues = 1200;

            byte[][] Lookup = new byte[pixelValues][];
            for (int i = 0; i < Lookup.Length; i++)
            {
                byte bValue = (byte)i;
                byte[] b = new byte[3] { bValue, bValue, bValue };
                Lookup[i] = b;
            }

            //Make proto temp readings
            double[] temps = new double[640 * 512];
            Random r = new Random();
            for (int i = 0; i < 640 * 512; i++)
            {
                temps[i] = r.NextDouble() * maxTemp;
            }

            int size = 640 * 512 * 3;
            byte[] imageValues = new byte[size];
            var timings = new List<long>(50);
            for (int i = 0; i < 50; i++)
            {
                Stopwatch sw = new Stopwatch();
                sw.Start();
                int index = 0;
                for (int j = 0; j < temps.Length; j++)
                {
                    var lookupVal = (int)(pixelValues * (temps[j] - minTemp) / (maxTemp - minTemp));
                    byte[] pixel = Lookup[lookupVal];
                    imageValues[index] = pixel[0];
                    imageValues[index + 1] = pixel[1];
                    imageValues[index + 2] = pixel[2];
                    index += 3;
                }
                sw.Stop();
                var ms = sw.ElapsedMilliseconds;
                timings.Add(ms);
                //Console.WriteLine(sw.ElapsedMilliseconds);
            }
            Console.WriteLine($"Max {timings.Max()} Avg {timings.Average()}");
            Console.ReadKey();
        }
    }
}
outputs
Max 7 Avg 3.2
This may be a bit unfair, as it optimizes your example and not your real problem, but perhaps can be applied. You know you are looking up ints, so just use a jagged array: byte[][]. This averages 0.66ms on my PC versus 5.4ms for your original.
Note: using a Dictionary<int, (byte, byte, byte)>, with a ValueTuple holding the 3 bytes, is about 4 ms.
var repeats = 50;
Console.WriteLine();
Console.WriteLine("byte[][3]");

//Create lookup table
var lookups = 1200;
var Lookup = new byte[lookups][];
for (int i = 0; i < lookups; i++) {
    byte bValue = (byte)i;
    var b = new byte[3] { bValue, bValue, bValue };
    Lookup[i] = b;
}

//Make proto temp readings
int[] temps = new int[640 * 512];
Random r = new Random();
for (int i = 0; i < 640 * 512; i++) {
    temps[i] = r.Next(0, 255);
}

int size = 640 * 512 * 3;
byte[] imageValues = new byte[size];
long totalMS = 0;
Stopwatch sw = new Stopwatch();
for (int i = 0; i < repeats; i++) {
    sw.Restart();
    int index = 0;
    foreach (int item in temps) {
        if (item < lookups) {
            var pixel = Lookup[item];
            imageValues[index] = pixel[0];
            imageValues[index + 1] = pixel[1];
            imageValues[index + 2] = pixel[2];
            index += 3;
        }
    }
    sw.Stop();
    totalMS += sw.ElapsedMilliseconds;
    //Console.WriteLine(sw.ElapsedMilliseconds);
}
Console.WriteLine($"Average: {totalMS / (double)repeats} ms");
You can utilize more cores of your machine, assuming that it has more than one. Probably it is a good idea to not utilize all of them, and leave one free for the OS and other apps. The code below uses Parallel.ForEach with a range partitioner, and speeds up the execution from 21 msec to 8 msec in my machine.
ParallelOptions options = new ParallelOptions()
{
    MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1)
};
Parallel.ForEach(Partitioner.Create(0, temps.Length), options, range =>
{
    // item is an index into temps; index is the int from the original code, shared across threads
    for (int item = range.Item1; item < range.Item2; item++)
    {
        byte[] pixel = new byte[3];
        if (Lookup.TryGetValue(temps[item], out pixel))
        {
            int updatedIndex = Interlocked.Add(ref index, 3);
            int localIndex = updatedIndex - 3;
            imageValues[localIndex] = pixel[0];
            imageValues[localIndex + 1] = pixel[1];
            imageValues[localIndex + 2] = pixel[2];
            //index += 3;
        }
    }
});
I made no other changes to your code. I didn't optimize the unnecessary array allocation for example.
Btw multithreading creates issues with thread safety. For this reason I edited my answer to increment index using Interlocked.Add instead of +=. Shared access to the imageValues array is probably safe.
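A caveat on ordering (my note, not in the original answer): Interlocked.Add hands out output slots in whatever order threads happen to reach it, so pixels would land in the image out of order. Since every reading maps to a fixed three bytes of output, the destination index can be derived from the source index instead, which needs no synchronization at all:

Parallel.ForEach(Partitioner.Create(0, temps.Length), options, range =>
{
    for (int i = range.Item1; i < range.Item2; i++)
    {
        if (Lookup.TryGetValue(temps[i], out byte[] pixel))
        {
            int index = i * 3; // each source index owns a fixed output slot
            imageValues[index] = pixel[0];
            imageValues[index + 1] = pixel[1];
            imageValues[index + 2] = pixel[2];
        }
    }
});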

Logging the time it takes to fill a LinkedList and Array

This console application is a bit strange but kinda funny, if it works. First, I'm clocking the time it takes to fill a LinkedList with 40,000,000 elements with random numbers. Then I'm searching for 100 random elements in that LinkedList, and in between I'm writing out the time it took to fill and to find the elements.
After that I'm trying to do the same thing again, but with an array: first filling it, then looking for 100 random elements. Then I sort the array, to see the difference between looking for 100 random elements in an unsorted vs. a sorted array, and print the time again.
The problem is: after I've filled the LinkedList and found the elements in it, I start filling the array with a loop, and I get an infinite loop. I really don't know what's wrong ATM.
If you want to help, I suggest you copy the code I'm pasting into this question, so you can see how all the parts of the program fit together.
Code:
public static bool sokning(int[] a, int b)   // "search"
{
    bool sant = false;                       // "true"
    Random rand = new Random();
    Stopwatch watchFindArray = new Stopwatch();
    Console.Write("Letar efter tal: ");      // "Looking for numbers: "
    watchFindArray.Start();
    int myint = 0;
    for (int iii = 0; iii < a.Length; iii++)
    {
        b = rand.Next();
        Console.Write("#");
        myint = Array.BinarySearch(a, b);
        if (myint < 0)
        {
            sant = false;
        }
        else
        {
            sant = true;
        }
    }
    watchFindArray.Stop();
    if (sant == true)
    {
        // "Found all elements after X seconds."
        Console.WriteLine("\nFann alla element efter " + watchFindArray.Elapsed.TotalSeconds + " sekunder.");
        return true;
    }
    else
    {
        return false;
    }
}
public static void körMetod()   // "runMethod"
{
    const int MAX = 40000000;
    int[] array = new int[MAX];
    int hittamig2 = 0;          // "findMe2"
    Random rand2 = new Random();
    Stopwatch watchArray = new Stopwatch();
    Console.WriteLine("\nStartar Array...");
    watchArray.Start();
    Console.Write("Position: ");
    for (int ii = 0; ii < MAX; ii++)
    {
        array[ii] = rand2.Next();
        if (array.Length % 1000000 == 0)
        {
            Console.Write("#");
        }
    }
    watchArray.Stop();
    // "It took X seconds to fill an array."
    Console.WriteLine("\nTid: " + watchArray.Elapsed.TotalSeconds + " sekunder att fylla en array.");
    Console.WriteLine("Letar efter tal: ");
    bool sant = sokning(array, hittamig2);
    Console.WriteLine("Sorterar arrayen.");   // "Sorting the array."
    Array.Sort(array);
    sant = sokning(array, hittamig2);
    if (sant == false)
    {
        // "Did not find all elements in the array."
        Console.WriteLine("\nHittade inte alla element i arrayen.");
        Console.ReadLine();
    }
    else
    {
        Console.WriteLine("Klar!");   // "Done!"
        Console.ReadLine();
    }
}
static void Main(string[] args)
{
    Random rnd = new Random();
    const int MAX = 40000000;
    LinkedList<int> lankadLista = new LinkedList<int>();   // "linkedList"
    Stopwatch watchLinkedList = new Stopwatch();
    Console.WriteLine("Startar LinkedList...");
    watchLinkedList.Start();
    Console.Write("Position: ");
    for (int i = 0; i < MAX; i++)
    {
        lankadLista.AddLast(rnd.Next());
        if (lankadLista.Count() % 1000000 == 0)
        {
            Console.Write("#");
        }
    }
    watchLinkedList.Stop();
    // "It took X seconds to fill a LinkedList."
    Console.WriteLine("\nTid: " + watchLinkedList.Elapsed.TotalSeconds + " sekunder att fylla en LinkedList.");

    Stopwatch watchFindLinkedList = new Stopwatch();
    int hittaMig;   // "findMe"
    Console.Write("Letar efter tal: ");
    watchFindLinkedList.Start();
    for (int j = 0; j < 100; j++)
    {
        hittaMig = rnd.Next();
        Console.Write("#");
        lankadLista.Find(hittaMig);
    }
    watchFindLinkedList.Stop();
    Console.WriteLine("\nFann alla element efter " +
        watchFindLinkedList.Elapsed.TotalSeconds + " sekunder.");
    Console.ReadLine();
    körMetod();
}
Best Regards.
You are not in an infinite loop; the problem is in the following code:
for (int ii = 0; ii < MAX; ii++)
{
    array[ii] = rand2.Next();
    if (array.Length % 1000000 == 0)
    {
        Console.Write("#");
    }
}
The inner condition is array.Length % 1000000 == 0 which is always true because the size of array is always 40000000 as you initialized it:
const int MAX = 40000000;
int[] array = new int[MAX];
When you do array[ii] = rand2.Next(); you are not changing the length of the array, you are just setting one of its cells to the value returned by rand2.Next().
This causes Console.Write("#"); to run on every iteration, which slows your loop dramatically.
To fix this, just change:
if (array.Length % 1000000 == 0)
to:
if (ii % 1000000 == 0)
You don't want to add a new item at the end of the array every time anyway, because resizing reallocates the whole array each time, which is super slow. If you ever need it, there is the Array.Resize method (but there is no reason for you to use it here).
I think you have a big problem in the routine that searches the Array. (sokning)
Where is the code that searches for only 100 elements?
It seems that you are searching for a randomly generated number 40 million times. Just fixing the Console.Write("#") to print correctly at every million mark is not enough. I think the big delay that makes you think you have an infinite loop is the code that searches for 40 million randomly generated numbers in an array of 40 million numbers.
Of course this is not very "responsive" (considering also that you call this method twice):
public static bool sokning(int[] a, int b)
{
    bool sant = false;
    Random rand = new Random();
    Stopwatch watchFindArray = new Stopwatch();
    Console.Write("Letar efter tal: ");
    watchFindArray.Start();
    int myint = 0;
    // Search only 100 numbers, like you do with the linked list
    for (int iii = 0; iii < 100; iii++)
    {
        b = rand.Next();
        Console.Write("#");
        myint = Array.BinarySearch(a, b);
        if (myint < 0)
        {
            sant = false;
        }
        else
        {
            sant = true;
        }
    }
    watchFindArray.Stop();
    if (sant == true)
    {
        Console.WriteLine("\nFann alla element efter " + watchFindArray.Elapsed.TotalSeconds + " sekunder.");
        return true;
    }
    else
    {
        return false;
    }
}
There are also two minor problems.
Why pass the variable b into the sokning method? The original value is never used: as soon as the search loop starts, b is overwritten with a randomly generated number. So I think you could remove it.
The second problem is the result of the sokning method. You set the sant variable to true or false on every loop iteration, so the last iteration wins. In other words, sokning returns true if the last search finds a match and false if not; whether any earlier search matched is lost to the callers.

Do C# collections care about cache friendliness?

I've been running a lot of tests comparing an array of structs with an array of classes and a list of classes. Here's the test I've been running:
struct AStruct {
    public int val;
}

class AClass {
    public int val;
}

static void TestCacheCoherence()
{
    int num = 10000;
    int iterations = 1000;
    int padding = 64;
    List<Object> paddingL = new List<Object>();

    AStruct[] structArray = new AStruct[num];
    AClass[] classArray = new AClass[num];
    List<AClass> classList = new List<AClass>();

    for (int i = 0; i < num; i++) {
        classArray[i] = new AClass();
        if (padding > 0) paddingL.Add(new byte[padding]);
    }
    for (int i = 0; i < num; i++)
    {
        classList.Add(new AClass());
        if (padding > 0) paddingL.Add(new byte[padding]);
    }
    Console.WriteLine("\n");

    stopwatch("StructArray", iterations, () =>
    {
        for (int i = 0; i < num; i++)
        {
            structArray[i].val *= 3;
        }
    });
    stopwatch("ClassArray ", iterations, () =>
    {
        for (int i = 0; i < num; i++)
        {
            classArray[i].val *= 3;
        }
    });
    stopwatch("ClassList  ", iterations, () =>
    {
        for (int i = 0; i < num; i++)
        {
            classList[i].val *= 3;
        }
    });
}

static Stopwatch watch = new Stopwatch();

public static long stopwatch(string msg, int iterations, Action c)
{
    watch.Restart();
    for (int i = 0; i < iterations; i++)
    {
        c();
    }
    watch.Stop();
    Console.WriteLine(msg + ": " + watch.ElapsedTicks);
    return watch.ElapsedTicks;
}
I'm running this in release mode with the following:
Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(2); // Use only the second core
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Thread.CurrentThread.Priority = ThreadPriority.Highest;
RESULTS:
With padding=0 I get:
StructArray: 21517
ClassArray: 42637
ClassList: 80679
With padding=64 I get:
StructArray: 21871
ClassArray: 82139
ClassList: 105309
With padding=128 I get:
StructArray: 21694
ClassArray: 76455
ClassList: 107330
I am a bit confused with these results, since I was expecting the difference to be bigger.
After all the structures are tiny and are laid one after the other in memory, while the classes are separated by up to 128 bytes of garbage.
Does this mean that I shouldn't even worry about cache friendliness? Or is my test flawed?
There are a number of things going on here. The first is that your tests don't take GCs into account: it is distinctly possible that the arrays are being GC'd during the loop over the list (because the arrays are no longer used while you are iterating the list, they are eligible for collection).
The second is that you need to keep in mind that List<T> is backed by an array anyway. The only reading overhead is the additional function calls to go through List.
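To illustrate the first point (a sketch of mine, not from the original answer): keeping the earlier test subjects reachable until every timing has finished prevents the GC from collecting them while the list test runs.

// After the last stopwatch(...) call in TestCacheCoherence:
// keep earlier test subjects reachable until all timings are done,
// so a GC during the list test cannot collect them mid-benchmark.
GC.KeepAlive(structArray);
GC.KeepAlive(classArray);
GC.KeepAlive(paddingL);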
