Good day to everyone.
Today I made up a school project with one thing bothering me.
My problem is, that I am passing argument to Thread function and when I print it to console via Console.WriteLine, it shows bad numbers.
for (i = 0; i < 10; i++) autari[i] = new Thread(() => autar(i));
for (i = 0; i < 10; i++) motorkari[i] = new Thread(() => motorkar(i + 10));
When I start them in same cycles, their functions do this:
static void motorkar(int id)
{
Console.WriteLine("motorkar {0}", id);
...
It is not the order problem, but when I pass for example 0. Visual studio in Debug writes to console number 2 and without Debug it writes 1.
What can be the problem? I know that I can solve this by setting string name, but I am confused with this.
This is due to the compiler creating you a closure under the hood. If you change the code around to the below you should get your expected output
for (i = 0; i < 10; i++)
{
var local = i;
autari[i] = new Thread(() => autar(local))
}
for (i = 0; i < 10; i++)
{
var local = i + 10;
motorkari[i] = new Thread(() => motorkar(local))
}
For kicks, I wanted to see how the speed of a C# for-loop compares with that of a C++ for-loop. My test is to simply iterate over a for-loop 100000 times, 100000 times, and average the result.
Here is my C# implementation:
static void Main(string[] args) {
var numberOfMeasurements = 100000;
var numberOfLoops = 100000;
var measurements = new List < long > ();
var stopwatch = new Stopwatch();
for (var i = 0; i < numberOfMeasurements; i++) {
stopwatch.Start();
for (int j = 0; j < numberOfLoops; j++) {}
measurements.Add(stopwatch.ElapsedMilliseconds);
}
Console.WriteLine("Average runtime = " + measurements.Average() + " ms.");
Console.Read();
}
Result: Average runtime = 10301.92929 ms.
Here is my C++ implementation:
void TestA()
{
auto numberOfMeasurements = 100000;
auto numberOfLoops = 100000;
std::vector<long> measurements;
for (size_t i = 0; i < numberOfMeasurements; i++)
{
auto start = clock();
for (size_t j = 0; j < numberOfLoops; j++){}
auto duration = start - clock();
measurements.push_back(duration);
}
long avg = std::accumulate(measurements.begin(), measurements.end(), 0.0) / measurements.size();
std::cout << "TestB: Time taken in milliseconds: " << avg << std::endl;
}
int main()
{
TestA();
return 0;
}
Result: TestA: Time taken in milliseconds: 0
When I had a look at what was in measurements, I noticed that it was filled with zeros... So, what is it, what is the problem here? Is it clock? Is there a better/correct way to measure the for-loop?
There is no "problem". Being able to optimize away useless code is one of the key features of C++. As the inner loop does nothing, it should be removed by every sane compiler.
Tip of the day: Only profile meaningful code that does something.
If you want to learn something about micro-benchmarks, you might be interested in this.
As "Baum mit Augen" already said the compiler will remove code that doesn't do anything. That is a common mistake when "benchmarking" C++ code. The same thing will happen if you create some kind of benchmark function which just calculates some things that will never used (won't be returned or used otherwise in code) - the compiler will just remove it.
You can avoid this behavior by not using optimize flags like O2, Ofast and so on. Since nobody would do that with real code it won't display the real performance of C++.
TL;DR Just benchmark real production code.
I have implemented B-Tree and now I am trying to find the best size per node. I am using time benchmarking to measure the speed.
The problem is that it crashes on the second tested number in benchmarking method.
For the example down the output in console is
Benchmarking 10
Benchmarking 11
The crash is in insert method of Node class, but it does not matter because when I tested using any number as SIZE it works well. Maybe I do not understand how the class objects are created or something like that.
I also tried calling benchmarking method with variuos numbers from main, but same result, during second call it crashed.
Can somebody please look at it and explain me what am I doing wrong? Thanks in advance!
I put part of the code here and here is the whole thing http://pastebin.com/AcihW1Qk
public static void benchmark(String filename, int d, int h)
{
using (System.IO.StreamWriter file = new System.IO.StreamWriter(filename))
{
Stopwatch sw = new Stopwatch();
for(int i = d; i <= h; i++)
{
Console.WriteLine("Benchmarking SIZE = " + i.ToString());
file.WriteLine("SIZE = " + i.ToString());
sw.Start();
// code here
Node.setSize(i);
Tree tree = new Tree();
for (int k = 0; k < 10000000; k++)
tree.insert(k);
Random r = new Random(10);
for (int k = 0; k < 10000; k++)
{
int x = r.Next(10000000);
tree.contains(x);
}
file.WriteLine("Depth of tree is " + tree.depth.ToString());
// end of code
sw.Stop();
file.WriteLine("TIME = " + sw.ElapsedMilliseconds.ToString());
file.WriteLine();
sw.Reset();
}
}
static void Main(string[] args)
{
benchmark("benchmark 10-11.txt", 10,11);
}
So I noticed that a treeview took unusually long to sort, first I figured that most of the time was spent repainting the control after adding each sorted item. But eitherway I had a gut feeling that List<T>.Sort() was taking longer than reasonable so I used a custom sort method to benchmark it against. The results were interesting, List<T>.Sort() took ~20 times longer, that's the biggest disappointment in performance I've ever encountered in .NET for such a simple task.
My question is, what could be the reason for this? My guess is the overhead of invoking the comparison delegate, which further has to call String.Compare() (in case of string sorting). Increasing the size of the list appears to increase the performance gap. Any ideas? I'm trying to use .NET classes as much as possible but in cases like this I just can't.
Edit:
static List<string> Sort(List<string> list)
{
if (list.Count == 0)
{
return new List<string>();
}
List<string> _list = new List<string>(list.Count);
_list.Add(list[0]);
int length = list.Count;
for (int i = 1; i < length; i++)
{
string item = list[i];
int j;
for (j = _list.Count - 1; j >= 0; j--)
{
if (String.Compare(item, _list[j]) > 0)
{
_list.Insert(j + 1, item);
break;
}
}
if (j == -1)
{
_list.Insert(0, item);
}
}
return _list;
}
Answer: It's not.
I ran the following benchmark in a simple console app and your code was slower:
static void Main(string[] args)
{
long totalListSortTime = 0;
long totalCustomSortTime = 0;
for (int c = 0; c < 100; c++)
{
List<string> list1 = new List<string>();
List<string> list2 = new List<string>();
for (int i = 0; i < 5000; i++)
{
var rando = RandomString(15);
list1.Add(rando);
list2.Add(rando);
}
Stopwatch watch1 = new Stopwatch();
Stopwatch watch2 = new Stopwatch();
watch2.Start();
list2 = Sort(list2);
watch2.Stop();
totalCustomSortTime += watch2.ElapsedMilliseconds;
watch1.Start();
list1.Sort();
watch1.Stop();
totalListSortTime += watch1.ElapsedMilliseconds;
}
Console.WriteLine("totalListSortTime = " + totalListSortTime);
Console.WriteLine("totalCustomSortTime = " + totalCustomSortTime);
Console.ReadLine();
}
Result:
I haven't had the time to fully test it because I had a blackout (writing from phone now), but it would seem your code (from Pastebin) is sorting several times an already ordered list, so it would seem that your algorithm could be faster to...sort an already sorted list. In case the standard .NET implementation is a Quick Sort, this would be natural since QS has its worst case scenario on already sorted lists.
Anyone know any speed differences between Where and FindAll on List. I know Where is part of IEnumerable and FindAll is part of List, I'm just curious what's faster.
The FindAll method of the List<T> class actually constructs a new list object, and adds results to it. The Where extension method for IEnumerable<T> will simply iterate over an existing list and yield an enumeration of the matching results without creating or adding anything (other than the enumerator itself.)
Given a small set, the two would likely perform comparably. However, given a larger set, Where should outperform FindAll, as the new List created to contain the results will have to dynamically grow to contain additional results. Memory usage of FindAll will also start to grow exponentially as the number of matching results increases, where as Where should have constant minimal memory usage (in and of itself...excluding whatever you do with the results.)
FindAll is obviously slower than Where, because it needs to create a new list.
Anyway, I think you really should consider Jon Hanna comment - you'll probably need to do some operations on your results and list would be more useful than IEnumerable in many cases.
I wrote small test, just paste it in Console App project. It measures time/ticks of: function execution, operations on results collection(to get perf. of 'real' usage, and to be sure that compiler won't optimize unused data etc. - I'm new to C# and don't know how it works yet,sorry).
Notice: every measured function except WhereIENumerable() creates new List of elements. I might be doing something wrong, but clearly iterating IEnumerable takes much more time than iterating list.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
namespace Tests
{
public class Dummy
{
public int Val;
public Dummy(int val)
{
Val = val;
}
}
public class WhereOrFindAll
{
const int ElCount = 20000000;
const int FilterVal =1000;
const int MaxVal = 2000;
const bool CheckSum = true; // Checks sum of elements in list of resutls
static List<Dummy> list = new List<Dummy>();
public delegate void FuncToTest();
public static long TestTicks(FuncToTest function, string msg)
{
Stopwatch watch = new Stopwatch();
watch.Start();
function();
watch.Stop();
Console.Write("\r\n"+msg + "\t ticks: " + (watch.ElapsedTicks));
return watch.ElapsedTicks;
}
static void Check(List<Dummy> list)
{
if (!CheckSum) return;
Stopwatch watch = new Stopwatch();
watch.Start();
long res=0;
int count = list.Count;
for (int i = 0; i < count; i++) res += list[i].Val;
for (int i = 0; i < count; i++) res -= (long)(list[i].Val * 0.3);
watch.Stop();
Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks: " + watch.ElapsedTicks);
}
static void Check(IEnumerable<Dummy> ieNumerable)
{
if (!CheckSum) return;
Stopwatch watch = new Stopwatch();
watch.Start();
IEnumerator<Dummy> ieNumerator = ieNumerable.GetEnumerator();
long res = 0;
while (ieNumerator.MoveNext()) res += ieNumerator.Current.Val;
ieNumerator=ieNumerable.GetEnumerator();
while (ieNumerator.MoveNext()) res -= (long)(ieNumerator.Current.Val * 0.3);
watch.Stop();
Console.Write("\r\n\nCheck sum: " + res.ToString() + "\t iteration ticks :" + watch.ElapsedTicks);
}
static void Generate()
{
if (list.Count > 0)
return;
var rand = new Random();
for (int i = 0; i < ElCount; i++)
list.Add(new Dummy(rand.Next(MaxVal)));
}
static void For()
{
List<Dummy> resList = new List<Dummy>();
int count = list.Count;
for (int i = 0; i < count; i++)
{
if (list[i].Val < FilterVal)
resList.Add(list[i]);
}
Check(resList);
}
static void Foreach()
{
List<Dummy> resList = new List<Dummy>();
int count = list.Count;
foreach (Dummy dummy in list)
{
if (dummy.Val < FilterVal)
resList.Add(dummy);
}
Check(resList);
}
static void WhereToList()
{
List<Dummy> resList = list.Where(x => x.Val < FilterVal).ToList<Dummy>();
Check(resList);
}
static void WhereIEnumerable()
{
Stopwatch watch = new Stopwatch();
IEnumerable<Dummy> iEnumerable = list.Where(x => x.Val < FilterVal);
Check(iEnumerable);
}
static void FindAll()
{
List<Dummy> resList = list.FindAll(x => x.Val < FilterVal);
Check(resList);
}
public static void Run()
{
Generate();
long[] ticks = { 0, 0, 0, 0, 0 };
for (int i = 0; i < 10; i++)
{
ticks[0] += TestTicks(For, "For \t\t");
ticks[1] += TestTicks(Foreach, "Foreach \t");
ticks[2] += TestTicks(WhereToList, "Where to list \t");
ticks[3] += TestTicks(WhereIEnumerable, "Where Ienum \t");
ticks[4] += TestTicks(FindAll, "FindAll \t");
Console.Write("\r\n---------------");
}
for (int i = 0; i < 5; i++)
Console.Write("\r\n"+ticks[i].ToString());
}
}
class Program
{
static void Main(string[] args)
{
WhereOrFindAll.Run();
Console.Read();
}
}
}
Results(ticks) - CheckSum enabled(some operations on results), mode: release without debugging(CTRL+F5):
- 16,222,276 (for ->list)
- 17,151,121 (foreach -> list)
- 4,741,494 (where ->list)
- 27,122,285 (where ->ienum)
- 18,821,571 (findall ->list)
CheckSum disabled (not using returned list at all):
- 10,885,004 (for ->list)
- 11,221,888 (foreach ->list)
- 18,688,433 (where ->list)
- 1,075 (where ->ienum)
- 13,720,243 (findall ->list)
Your results can be slightly different, to get real results you need more iterations.
UPDATE(from comment): Looking through that code I agree, .Where should have, at worst, equal performance but almost always better.
Original answer:
.FindAll() should be faster, it takes advantage of already knowing the List's size and looping through the internal array with a simple for loop. .Where() has to fire up an enumerator (a sealed framework class called WhereIterator in this case) and do the same job in a less specific way.
Keep in mind though, that .Where() is enumerable, not actively creating a List in memory and filling it. It's more like a stream, so the memory use on something very large can have a significant difference. Also, you could start using the results in a parallel fashion much faster using there .Where() approach in 4.0.
Where is much, much faster than FindAll. No matter how big the list is, Where takes exactly the same amount of time.
Of course Where just creates a query. It doesn't actually do anything, unlike FindAll which does create a list.
The answer from jrista makes senses. However, the new list adds the same objects, thus just growing with reference to existing objects, which should not be that slow.
As long as 3.5 / Linq extension are possible, Where stays better anyway.
FindAll makes much more sense when limited with 2.0