Multiple threads slowing down overall dictionary access? - c#

I am profiling a C# application, and it looks like two threads each calling Dictionary<>.ContainsKey() 5000 times on two separate but identical dictionaries (each with only two items) is twice as slow as one thread calling Dictionary<>.ContainsKey() 10000 times on a single dictionary.
I am measuring the "thread time" using JetBrains dotTrace. I am explicitly using copies of the same data, so I am not using any synchronization primitives. Is it possible that .NET is doing some synchronization behind the scenes?
I have a dual-core machine, and there are three threads running: one is blocked in Semaphore.WaitAll() while the work is done on two new threads whose priority is set to ThreadPriority.Highest.
Obvious culprits, like not actually running the code in parallel or not using a release build, have been ruled out.
EDIT:
People want the code. Alright then:
private int ReduceArrayIteration(VM vm, HeronValue[] input, int begin, int cnt)
{
    if (cnt <= 1)
        return cnt;
    int cur = begin;
    for (int i = 0; i < cnt - 1; i += 2)
    {
        // The next two calls are effectively dominated by a call
        // to Dictionary.ContainsKey
        vm.SetVar(a, input[begin + i]);
        vm.SetVar(b, input[begin + i + 1]);
        input[cur++] = vm.Eval(expr);
    }
    if (cnt % 2 == 1)
    {
        input[cur++] = input[begin + cnt - 1];
    }
    int r = cur - begin;
    Debug.Assert(r >= 1);
    Debug.Assert(r < cnt);
    return r;
}
// From VM
public void SetVar(string s, HeronValue o)
{
    Debug.Assert(o != null);
    frames.Peek().SetVar(s, o);
}

// From Frame
public bool SetVar(string s, HeronValue o)
{
    for (int i = scopes.Count; i > 0; --i)
    {
        // Scope is a derived class of Dictionary
        Scope tbl = scopes[i - 1];
        if (tbl.HasName(s))
        {
            tbl[s] = o;
            return false;
        }
    }
    return false;
}
Now here is the thread-spawning code, which may well be naive:
public static class WorkSplitter
{
    static WaitHandle[] signals;

    public static void ThreadStarter(Object o)
    {
        Task task = o as Task;
        task.Run();
    }

    public static void SplitWork(List<Task> tasks)
    {
        signals = new WaitHandle[tasks.Count];
        for (int i = 0; i < tasks.Count; ++i)
            signals[i] = tasks[i].done;
        for (int i = 0; i < tasks.Count; ++i)
        {
            Thread t = new Thread(ThreadStarter);
            t.Priority = ThreadPriority.Highest;
            t.Start(tasks[i]);
        }
        Semaphore.WaitAll(signals);
    }
}

Even if there were any locking inside Dictionary (there isn't), it could not affect your measurements, since each thread is using a separate one. Running this test 10,000 times is not enough to get reliable timing data; ContainsKey() only takes around 20 nanoseconds. You'll need at least several million iterations to avoid scheduling artifacts.
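To illustrate the scale involved, here is a minimal benchmark sketch along those lines; the dictionary contents and iteration count are made up for illustration, and a warm-up pass keeps JIT compilation out of the measurement:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        // Hypothetical two-item dictionary mirroring the question's setup.
        var dict = new Dictionary<string, int>();
        dict["a"] = 1;
        dict["b"] = 2;

        const int iterations = 10000000; // millions, not thousands
        bool found = false;

        // Warm up so JIT compilation is not included in the measurement.
        for (int i = 0; i < 1000; i++) found |= dict.ContainsKey("a");

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            found |= dict.ContainsKey("a");
        sw.Stop();

        // The accumulated flag keeps the compiler from eliminating the loop.
        Console.WriteLine("{0:F1} ms ({1})", sw.Elapsed.TotalMilliseconds, found);
    }
}
```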


c# performance changes rapidly by looping just 0,001% more often

I'm working on a UI project which has to work with huge datasets (35 new values every second) which are then displayed in a graph. The user shall be able to change the view from a 10-minute window up to a month view. To achieve this I wrote myself a helper function which truncates a lot of data down to a 600-element array, which is then displayed on a LiveView chart.
I found out that at the beginning the software works very well and fast, but the longer the software runs (e.g. for a month) and the more the memory usage rises (to ca. 600 MB), the slower the function gets (up to 8x).
So I ran some tests to find the source of this. Quite surprised, I found out that there is something like a magic number where the function gets 2x slower: just by changing the loop count from 71494 to 71495, the runtime jumps from 19 ms to 39 ms.
I'm really confused. Even when you comment out the second for loop (where the arrays are getting truncated), it is much slower.
Maybe this has something to do with the garbage collector? Or does C# compress memory automatically?
Using Visual Studio 2017 with the newest updates.
The Code
using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace TempoaryTest
{
    class ProductNameStream
    {
        public struct FileValue
        {
            public DateTime Time;
            public ushort[] Value;
            public ushort[] Avg1;
            public ushort[] Avg2;
            public ushort[] DAvg;
            public ushort AlarmDelta;
            public ushort AlarmAverage;
            public ushort AlarmSum;
        }
    }

    public static class Program
    {
        private const int MAX_MEASURE_MODEL = 600;
        private const int TEST = 71494;
        //private const int TEST = 71495; // this one doubles the consumed time!

        public static void Main(string[] bleg)
        {
            List<ProductNameStream.FileValue> fileValues = new List<ProductNameStream.FileValue>();
            ProductNameStream.FileValue fil = new ProductNameStream.FileValue();
            DateTime testTime = DateTime.Now;
            Console.WriteLine("TEST: {0} {1:X}", TEST, TEST);

            // Creating example list
            for (int n = 0; n < TEST; n++)
            {
                fil = new ProductNameStream.FileValue
                {
                    Time = testTime = testTime.AddSeconds(1),
                    Value = new ushort[8],
                    Avg1 = new ushort[8],
                    Avg2 = new ushort[8],
                    DAvg = new ushort[8]
                };
                for (int i = 0; i < 8; i++)
                {
                    fil.Value[i] = (ushort)(n + i);
                    fil.Avg1[i] = (ushort)(TEST - n - i);
                    fil.Avg2[i] = (ushort)(n / (i + 1));
                    fil.DAvg[i] = (ushort)(n * (i + 1));
                }
                fil.AlarmDelta = (ushort)DateTime.Now.Ticks;
                fil.AlarmAverage = (ushort)(fil.AlarmDelta / 2);
                fil.AlarmSum = (ushort)(n);
                fileValues.Add(fil);
            }

            var sw = Stopwatch.StartNew();

            /* May look the same as MAX_MEASURE_MODEL, but since we use int
             * as the counter we must be aware of integer division rounding down. */
            int cnt = (fileValues.Count / (fileValues.Count / MAX_MEASURE_MODEL)) + 1;
            ProductNameStream.FileValue[] newFileValues = new ProductNameStream.FileValue[cnt];
            ProductNameStream.FileValue[] fileValuesArray = fileValues.ToArray();

            // Truncate the big list to a 600-element array
            for (int n = 0; n < fileValues.Count; n++)
            {
                if ((n % (fileValues.Count / MAX_MEASURE_MODEL)) == 0)
                {
                    cnt = n / (fileValues.Count / MAX_MEASURE_MODEL);
                    newFileValues[cnt] = fileValuesArray[n];
                    newFileValues[cnt].Value = new ushort[8];
                    newFileValues[cnt].Avg1 = new ushort[8];
                    newFileValues[cnt].Avg2 = new ushort[8];
                    newFileValues[cnt].DAvg = new ushort[8];
                }
                else
                {
                    for (int i = 0; i < 8; i++)
                    {
                        if (newFileValues[cnt].Value[i] < fileValuesArray[n].Value[i])
                            newFileValues[cnt].Value[i] = fileValuesArray[n].Value[i];
                        if (newFileValues[cnt].Avg1[i] < fileValuesArray[n].Avg1[i])
                            newFileValues[cnt].Avg1[i] = fileValuesArray[n].Avg1[i];
                        if (newFileValues[cnt].Avg2[i] < fileValuesArray[n].Avg2[i])
                            newFileValues[cnt].Avg2[i] = fileValuesArray[n].Avg2[i];
                        if (newFileValues[cnt].DAvg[i] < fileValuesArray[n].DAvg[i])
                            newFileValues[cnt].DAvg[i] = fileValuesArray[n].DAvg[i];
                    }
                    if (newFileValues[cnt].AlarmSum < fileValuesArray[n].AlarmSum)
                        newFileValues[cnt].AlarmSum = fileValuesArray[n].AlarmSum;
                    if (newFileValues[cnt].AlarmDelta < fileValuesArray[n].AlarmDelta)
                        newFileValues[cnt].AlarmDelta = fileValuesArray[n].AlarmDelta;
                    if (newFileValues[cnt].AlarmAverage < fileValuesArray[n].AlarmAverage)
                        newFileValues[cnt].AlarmAverage = fileValuesArray[n].AlarmAverage;
                }
            }
            Console.WriteLine(sw.ElapsedMilliseconds);
        }
    }
}
This is most likely being caused by the garbage collector, as you suggested.
I can offer two pieces of evidence to indicate that this is so:
If you put GC.Collect() just before you start the stopwatch, the difference in times goes away.
If you instead change the initialisation of the list to new List<ProductNameStream.FileValue>(TEST);, the difference in time also goes away.
(Initialising the list's capacity to the final size in its constructor prevents multiple reallocations of its internal array while items are being added to it, which will reduce pressure on the garbage collector.)
Therefore, I assert based on this evidence that it is indeed the garbage collector that is impacting your timings.
Incidentally, the threshold value was slightly different for me, and for at least one other person too (which isn't surprising if the timing differences are being caused by the garbage collector).
Your data structure is inefficient and forces you to do a lot of allocations during computation. Have a look at this: fixed-size array inside a struct. Also, preallocate the list. Don't rely on the list to constantly adjust its size, which also creates garbage.
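As a rough sketch of the preallocation point, the capacity difference between a grown list and a preallocated one can be observed directly (the element type is simplified to int here for illustration):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        const int TEST = 71494;

        // Without a capacity hint the backing array is reallocated and
        // copied every time it fills up (4, 8, 16, ... elements), which
        // creates garbage and therefore GC pressure.
        var grown = new List<int>();
        for (int n = 0; n < TEST; n++) grown.Add(n);

        // With the final size passed to the constructor, the backing
        // array is allocated exactly once.
        var preallocated = new List<int>(TEST);
        for (int n = 0; n < TEST; n++) preallocated.Add(n);

        Console.WriteLine("{0} {1}", grown.Capacity, preallocated.Capacity);
    }
}
```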

ThreadPool behaves differently in debug mode and at runtime

I want to use ThreadPool to complete long-running jobs in less time. My method does more work, of course, but I prepared a simple example so you understand my situation. If I run this application, it throws an ArgumentOutOfRangeException on the commented line, and it shows that i is equal to 10. How can it enter the for loop if i is 10?
If I debug this code instead of running it, it does not throw an exception and works fine.
public void Test()
{
    List<int> list1 = new List<int>();
    List<int> list2 = new List<int>();
    for (int i = 0; i < 10; i++) list1.Add(i);
    for (int i = 0; i < 10; i++) list2.Add(i);
    int toProcess = list1.Count;
    using (ManualResetEvent resetEvent = new ManualResetEvent(false))
    {
        for (int i = 0; i < list1.Count; i++)
        {
            ThreadPool.QueueUserWorkItem(
                new WaitCallback(delegate(object state)
                {
                    // ArgumentOutOfRangeException with i=10
                    Sum(list1[i], list2[i]);
                    if (Interlocked.Decrement(ref toProcess) == 0)
                        resetEvent.Set();
                }), null);
        }
        resetEvent.WaitOne();
    }
    MessageBox.Show("Done");
}

private void Sum(int p, int p2)
{
    int sum = p + p2;
}
What is the problem here?
The problem is that i==10, but your lists have 10 items (i.e. a maximum index of 9).
This is because you have a race condition over a captured variable that is being changed before your delegate runs. Will the next iteration of the loop increment the value before the delegate runs, or will your delegate run before the loop increments the value? It's all down to the timing of that specific run.
Your instinct is that i will have a value of 0-9. However, when the loop reaches its termination, i will have a value of 10. Because the delegate captures i, the value of i may well be used after the loop has terminated.
Change your loop as follows:
for (int i = 0; i < list1.Count; i++)
{
    var idx = i;
    ThreadPool.QueueUserWorkItem(
        new WaitCallback(delegate(object state)
        {
            // Safe now: idx is a per-iteration copy, always in 0-9
            Sum(list1[idx], list2[idx]);
            if (Interlocked.Decrement(ref toProcess) == 0)
                resetEvent.Set();
        }), null);
}
Now your delegate is getting a "private", independent copy of i instead of referring to a single, changing value that is shared between all invocations of the delegate.
I wouldn't worry too much about the difference in behaviour between debug and non-debug modes. That's the nature of race conditions.
What is the problem here?
Closure. You're capturing the i variable which isn't doing what you expect it to do.
You'll need to create a copy inside your for loop:
var currentIndex = i;
Sum(list1[currentIndex], list2[currentIndex]);
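The capture semantics can also be demonstrated deterministically, without threads: this sketch (names are illustrative) stores delegates and invokes them only after the loop has finished, so the shared-variable and per-copy behaviors are directly visible.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var captured = new List<Func<int>>();
        var copied = new List<Func<int>>();

        for (int i = 0; i < 3; i++)
        {
            // All three of these delegates share the single loop variable i...
            captured.Add(() => i);

            // ...while each of these captures its own per-iteration copy.
            int idx = i;
            copied.Add(() => idx);
        }

        // After the loop terminates, i is 3, so every shared capture sees 3.
        foreach (var f in captured) Console.Write(f() + " "); // prints "3 3 3"
        Console.WriteLine();
        foreach (var f in copied) Console.Write(f() + " ");   // prints "0 1 2"
        Console.WriteLine();
    }
}
```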

Dynamically run more than one thread in c#

I have a program where I need to run a number of threads at the same time:
int defaultMaxworkerThreads = 0;
int defaultmaxIOThreads = 0;
ThreadPool.GetMaxThreads(out defaultMaxworkerThreads, out defaultmaxIOThreads);
ThreadPool.SetMaxThreads(defaultMaxworkerThreads, defaultmaxIOThreads);

List<Data1> Data1 = PasswordFileHandler.ReadPasswordFile("Data1.txt");
List<Data1> Data2 = PasswordFileHandler.ReadPasswordFile("Data2.txt");
while (Data1.Count >= 0)
{
    List<String> Data1Subset = (from sub in Data1 select sub).Take(NumberOfWordPrThead).ToList();
    Data1 = Data1.Except(Data1Subset).ToList();
    _NumberOfTheadsRunning++;
    ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadCompleted), new TaskInfo(Data1Subset, Data2));
    // Start threads based on how many we would like to start
}
How can I run more than one thread at a time? I would like to decide the number of threads at run time, based on the number of cores and a config setting, but my code always seems to run only one thread.
How should I change it to run on more than one thread?
As @TomTom pointed out, your code will work properly if you set both SetMinThreads and SetMaxThreads. In accordance with MSDN, you also have to watch out not to quit the main thread too early, before the ThreadPool work has executed:
// used to simulate different work times
static Random random = new Random();

// worker
static private void callback(Object data)
{
    Console.WriteLine(String.Format("Called from {0}", data));
    System.Threading.Thread.Sleep(random.Next(100, 1000));
}

//
int minWorker, minIOC;
ThreadPool.GetMinThreads(out minWorker, out minIOC);
ThreadPool.SetMaxThreads(5, minIOC);
ThreadPool.SetMinThreads(3, minIOC);
for (int i = 0; i < 3; i++)
{
    ThreadPool.QueueUserWorkItem(new WaitCallback(callback), i.ToString());
}
// give the ThreadPool a chance to run
Thread.Sleep(1000);
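As an aside, a deterministic alternative to the Thread.Sleep(1000) at the end is to wait on a CountdownEvent, which blocks exactly until every queued work item has signalled. A minimal sketch (the job count here is arbitrary):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        const int jobs = 3;
        using (var done = new CountdownEvent(jobs))
        {
            for (int i = 0; i < jobs; i++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Console.WriteLine("job {0}", state);
                    done.Signal(); // one signal per completed work item
                }, i);
            }
            done.Wait(); // deterministic, unlike Thread.Sleep(1000)
        }
        Console.WriteLine("all done");
    }
}
```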
A good alternative to the standard ThreadPool is the Task Parallel Library which introduces the concept of Tasks. Using the Task object you could for example easily start multiple tasks like this:
// global variable
Random random = new Random(); // used to simulate different work times

// unit of work
private void callback(int i)
{
    Console.WriteLine(String.Format("Nr. {0}", i));
    System.Threading.Thread.Sleep(random.Next(100, 1000));
}

const int max = 5;
var tasks = new System.Threading.Tasks.Task[max];
for (int i = 0; i < max; i++)
{
    var copy = i;
    // create the tasks and init the work units
    tasks[i] = new System.Threading.Tasks.Task(() => callback(copy));
}
// start the parallel execution
foreach (var task in tasks)
{
    task.Start();
}
// optionally wait for all tasks to finish
System.Threading.Tasks.Task.WaitAll(tasks);
You could also start the code execution immediately using Task.Factory like this:
const int max = 5;
var tasks = new System.Threading.Tasks.Task[max];
for (int i = 0; i < max; i++)
{
    var copy = i;
    // start execution immediately
    tasks[i] = System.Threading.Tasks.Task.Factory.StartNew(() => callback(copy));
}
System.Threading.Tasks.Task.WaitAll(tasks);
Have a look at this SO post to see the difference between ThreadPool.QueueUserWorkItem vs. Task.Factory.StartNew.
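If the goal is specifically to pick the degree of parallelism at run time from the core count, Parallel.For (also part of the TPL) accepts that directly via ParallelOptions. A sketch with placeholder work items:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Degree of parallelism chosen at run time, as the question asks;
        // a config setting could be combined with this value.
        int workers = Environment.ProcessorCount;

        var options = new ParallelOptions { MaxDegreeOfParallelism = workers };
        Parallel.For(0, 20, options, i =>
        {
            // placeholder for the real work item
            Console.WriteLine("item {0} on thread {1}",
                i, Thread.CurrentThread.ManagedThreadId);
        });
    }
}
```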

How to consume an array of BlockingCollection<T>, one element at a time, in C#?

I've seen lots of examples of how to consume a BlockingCollection<T> in the producer-consumer scenario, even how to consume one element at a time here. I'm quite new to parallel programming, though, so I'm facing the following problem:
The problem is how to write the method ConsumerProducerExample.ConsumerMethod in the example below such that it consumes the first 2 doubles of the BlockingCollection<Double> for every element in the array, then proceeds to consume the next 2 doubles for every element in the array, and so on.
I have written the method in the example below, but it is not the way I want it to work; it is just what I know how to do based on the example I linked above. It does exactly the opposite: as written, it will consume every double before jumping to the next BlockingCollection<Double> in the array. Instead, I want it to consume only two doubles, then jump to the next BlockingCollection<Double>, consume two doubles, and so on. After completing the array loop, it should proceed to consume the next two doubles for each BlockingCollection<Double> element again, and so on.
public class ConsumerProducerExample
{
    public static BlockingCollection<Double>[] data;
    public static Int32 nEl = 100;
    public static Int32 tTotal = 1000;
    public static Int32 peakCounter = 0;

    public static void InitData()
    {
        data = new BlockingCollection<Double>[nEl];
        for (int i = 0; i < nEl; i++) data[i] = new BlockingCollection<Double>();
    }

    public static void ProducerMethod()
    {
        Int32 t = 0, j;
        while (t < ConsumerProducerExample.tTotal)
        {
            j = 0;
            while (j < ConsumerProducerExample.nEl)
            {
                data[j].Add(Math.Sin((Double)t / 10.0D));
                j++;
            }
            t++;
        }
        j = 0;
        while (j < ConsumerProducerExample.nEl)
        {
            data[j].CompleteAdding();
            j++;
        }
    }

    public static void ConsumerMethod()
    {
        // THE PROBLEM IS WITH THIS METHOD
        Double x1, x2;
        Int32 j = 0;
        while (j < ConsumerProducerExample.nEl)
        {
            while (!data[j].IsCompleted)
            {
                try
                {
                    x1 = data[j].Take();
                    x2 = data[j].Take();
                }
                catch (InvalidOperationException)
                {
                    break;
                }
                if (x1 * x2 < 0.0)
                {
                    ConsumerProducerExample.peakCounter++;
                }
            }
            j++;
        }
    }
}
They should be used like this:
ConsumerProducerExample.InitData();
Task task1 = Task.Factory.StartNew(ConsumerProducerExample.ProducerMethod);
Task task2 = Task.Factory.StartNew(ConsumerProducerExample.ConsumerMethod);
Any suggestions?
In short, this is an attempt to count peaks in the solutions of nEl differential equations concurrently (the solutions are represented by sin(x) in this example).
Basically, you need to get rid of your inner loop and add another outer loop that iterates over all of the collections until they're complete. Something like:
while (!data.All(d => d.IsCompleted))
{
    foreach (var d in data)
    {
        // only take elements when the collection holds at least 2 of them
        if (d.Count >= 2)
        {
            double x1, x2;
            if (d.TryTake(out x1) && d.TryTake(out x2))
            {
                if (x1 * x2 < 0.0)
                    peakCounter++;
            }
            else
                throw new InvalidOperationException();
        }
    }
}

C# FindAll VS Where Speed

Does anyone know of any speed difference between Where and FindAll on a List? I know Where is part of IEnumerable and FindAll is part of List; I'm just curious which is faster.
The FindAll method of the List<T> class actually constructs a new list object and adds results to it. The Where extension method for IEnumerable<T> will simply iterate over an existing list and yield an enumeration of the matching results without creating or adding anything (other than the enumerator itself).
Given a small set, the two would likely perform comparably. However, given a larger set, Where should outperform FindAll, as the new List created to contain the results will have to dynamically grow to contain additional results. Memory usage of FindAll will also grow as the number of matching results increases, since its backing array is repeatedly reallocated, whereas Where has constant, minimal memory usage (in and of itself... excluding whatever you do with the results).
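The eager-versus-lazy distinction can be observed directly by mutating the source between the call and the enumeration; a small illustrative sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        var source = new List<int> { 1, 2, 3, 4, 5 };

        // FindAll runs the predicate immediately and builds a new List.
        List<int> eager = source.FindAll(x => x > 2);

        // Where only builds a query; nothing runs until enumeration.
        IEnumerable<int> lazy = source.Where(x => x > 2);

        // Mutating the source after both calls shows the difference:
        source.Add(6);

        Console.WriteLine(eager.Count);  // 3 - snapshot taken at call time
        Console.WriteLine(lazy.Count()); // 4 - evaluated now, sees the 6
    }
}
```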
FindAll is obviously slower than Where, because it needs to create a new list.
Anyway, I think you really should consider Jon Hanna's comment - you'll probably need to do some operations on your results, and a list would be more useful than an IEnumerable in many cases.
I wrote a small test; just paste it into a Console App project. It measures the time/ticks of the function execution and of operations on the results collection (to get the performance of 'real' usage, and to be sure the compiler won't optimize away unused data, etc. - I'm new to C# and don't know how it works yet, sorry).
Notice: every measured function except WhereIEnumerable() creates a new List of elements. I might be doing something wrong, but clearly iterating an IEnumerable takes much more time than iterating a list.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace Tests
{
    public class Dummy
    {
        public int Val;
        public Dummy(int val)
        {
            Val = val;
        }
    }

    public class WhereOrFindAll
    {
        const int ElCount = 20000000;
        const int FilterVal = 1000;
        const int MaxVal = 2000;
        const bool CheckSum = true; // Checks the sum of elements in the list of results

        static List<Dummy> list = new List<Dummy>();

        public delegate void FuncToTest();

        public static long TestTicks(FuncToTest function, string msg)
        {
            Stopwatch watch = new Stopwatch();
            watch.Start();
            function();
            watch.Stop();
            Console.Write("\r\n" + msg + "\t ticks: " + (watch.ElapsedTicks));
            return watch.ElapsedTicks;
        }

        static void Check(List<Dummy> list)
        {
            if (!CheckSum) return;
            Stopwatch watch = new Stopwatch();
            watch.Start();
            long res = 0;
            int count = list.Count;
            for (int i = 0; i < count; i++) res += list[i].Val;
            for (int i = 0; i < count; i++) res -= (long)(list[i].Val * 0.3);
            watch.Stop();
            Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks: " + watch.ElapsedTicks);
        }

        static void Check(IEnumerable<Dummy> ieNumerable)
        {
            if (!CheckSum) return;
            Stopwatch watch = new Stopwatch();
            watch.Start();
            IEnumerator<Dummy> ieNumerator = ieNumerable.GetEnumerator();
            long res = 0;
            while (ieNumerator.MoveNext()) res += ieNumerator.Current.Val;
            ieNumerator = ieNumerable.GetEnumerator();
            while (ieNumerator.MoveNext()) res -= (long)(ieNumerator.Current.Val * 0.3);
            watch.Stop();
            Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks: " + watch.ElapsedTicks);
        }

        static void Generate()
        {
            if (list.Count > 0)
                return;
            var rand = new Random();
            for (int i = 0; i < ElCount; i++)
                list.Add(new Dummy(rand.Next(MaxVal)));
        }

        static void For()
        {
            List<Dummy> resList = new List<Dummy>();
            int count = list.Count;
            for (int i = 0; i < count; i++)
            {
                if (list[i].Val < FilterVal)
                    resList.Add(list[i]);
            }
            Check(resList);
        }

        static void Foreach()
        {
            List<Dummy> resList = new List<Dummy>();
            foreach (Dummy dummy in list)
            {
                if (dummy.Val < FilterVal)
                    resList.Add(dummy);
            }
            Check(resList);
        }

        static void WhereToList()
        {
            List<Dummy> resList = list.Where(x => x.Val < FilterVal).ToList<Dummy>();
            Check(resList);
        }

        static void WhereIEnumerable()
        {
            IEnumerable<Dummy> iEnumerable = list.Where(x => x.Val < FilterVal);
            Check(iEnumerable);
        }

        static void FindAll()
        {
            List<Dummy> resList = list.FindAll(x => x.Val < FilterVal);
            Check(resList);
        }

        public static void Run()
        {
            Generate();
            long[] ticks = { 0, 0, 0, 0, 0 };
            for (int i = 0; i < 10; i++)
            {
                ticks[0] += TestTicks(For, "For \t\t");
                ticks[1] += TestTicks(Foreach, "Foreach \t");
                ticks[2] += TestTicks(WhereToList, "Where to list \t");
                ticks[3] += TestTicks(WhereIEnumerable, "Where Ienum \t");
                ticks[4] += TestTicks(FindAll, "FindAll \t");
                Console.Write("\r\n---------------");
            }
            for (int i = 0; i < 5; i++)
                Console.Write("\r\n" + ticks[i].ToString());
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            WhereOrFindAll.Run();
            Console.Read();
        }
    }
}
Results (ticks) - CheckSum enabled (some operations on results), mode: release without debugging (Ctrl+F5):
- 16,222,276 (for ->list)
- 17,151,121 (foreach -> list)
- 4,741,494 (where ->list)
- 27,122,285 (where ->ienum)
- 18,821,571 (findall ->list)
CheckSum disabled (not using returned list at all):
- 10,885,004 (for ->list)
- 11,221,888 (foreach ->list)
- 18,688,433 (where ->list)
- 1,075 (where ->ienum)
- 13,720,243 (findall ->list)
Your results may differ slightly; to get reliable numbers you need more iterations.
UPDATE (from comment): Looking through that code, I agree; .Where should have, at worst, equal performance, but almost always better.
Original answer:
.FindAll() should be faster. It takes advantage of already knowing the List's size and looping through the internal array with a simple for loop. .Where() has to fire up an enumerator (a sealed framework class called WhereIterator in this case) and do the same job in a less specific way.
Keep in mind, though, that .Where() returns an enumerable; it isn't actively creating a List in memory and filling it. It's more like a stream, so the memory use on something very large can show a significant difference. Also, you could start using the results in a parallel fashion much sooner with the .Where() approach in 4.0.
Where is much, much faster than FindAll. No matter how big the list is, Where takes exactly the same amount of time.
Of course Where just creates a query. It doesn't actually do anything, unlike FindAll which does create a list.
The answer from jrista makes sense. However, the new list adds the same objects, so it grows only with references to existing objects, which should not be that slow.
As long as the 3.5 / LINQ extensions are available, Where stays better anyway.
FindAll makes much more sense when you are limited to 2.0.
