How to improve performance of a method - C#

I have a pretty big method where I have some C# calculations and I am also calling 3 or 4 stored procedures, constructing 3 or 4 objects, and finally adding them to a list and returning the list.
My target is to improve the performance of this method so that it takes less time to execute.
My question is: is there any way I can check each part of the method and find out which part is taking the most time to execute? Maybe some logging or something?
I am using LINQ to EF.

Invest in a performance profiler, like ANTS from Red Gate. Some of the better editions of Visual Studio also come with one.
At the least, you could try using System.Diagnostics.Stopwatch.
From MSDN:
using System;
using System.Diagnostics;
using System.Threading;

static void Main(string[] args)
{
    Stopwatch stopWatch = new Stopwatch();
    stopWatch.Start();
    Thread.Sleep(10000);    // stand-in for the work being measured
    stopWatch.Stop();

    TimeSpan ts = stopWatch.Elapsed;
    string elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
        ts.Hours, ts.Minutes, ts.Seconds,
        ts.Milliseconds / 10);
    Console.WriteLine("RunTime " + elapsedTime);
}

If possible, you can try executing your stored procedures in parallel. I've seen this improve performance quite a bit, especially if your stored procedures just do reads and no writes.
It might look something like this:
ConcurrentBag<Result> results = new ConcurrentBag<Result>();

Parallel.Invoke(
    () => {
        // Each delegate gets its own context: DbContext is not thread safe.
        var db = new DatabaseEntities();
        Result result1 = db.StoredProcedure1();
        results.Add(result1);
    },
    () => {
        var db = new DatabaseEntities();
        Result result2 = db.StoredProcedure2();
        results.Add(result2);
    },
    () => {
        var db = new DatabaseEntities();
        Result result3 = db.StoredProcedure3();
        results.Add(result3);
    }
);

return results;
I'm using a ConcurrentBag here instead of a List because it is thread safe.
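If you're on .NET 4.5 or later, the same idea can also be expressed with tasks. This is only a sketch, reusing the hypothetical DatabaseEntities context and StoredProcedureN methods from above:

// Sketch: assumes .NET 4.5+ (async/await, Task.WhenAll) and the
// hypothetical DatabaseEntities/StoredProcedureN members from above.
async Task<Result[]> RunProceduresAsync()
{
    var t1 = Task.Run(() => { using (var db = new DatabaseEntities()) return db.StoredProcedure1(); });
    var t2 = Task.Run(() => { using (var db = new DatabaseEntities()) return db.StoredProcedure2(); });
    var t3 = Task.Run(() => { using (var db = new DatabaseEntities()) return db.StoredProcedure3(); });
    return await Task.WhenAll(t1, t2, t3);
}

Task.WhenAll returns the results in the order the tasks were listed, which also avoids the need for a ConcurrentBag.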

What you're looking for is a profiler - a profiler runs your program and tells you how much time each line of code took to execute, as well as how long it took as a percentage of the total execution time.
A great C# profiler is the ANTS .NET Profiler. It's rather expensive, but it has a 14-day free trial - I think this would be perfect for your needs.

You have several options. I find myself using stopwatches to test this kind of thing. However, before you do anything, are you sure the code isn't already performing well enough? "If it ain't broke, don't fix it" is often the best advice. If you're still interested, you can do this kind of thing:
Stopwatch sw = Stopwatch.StartNew();
// do some code stuff here
sw.Stop();
Console.WriteLine(sw.ElapsedTicks);
The sw variable also exposes the elapsed time in other units (Elapsed, ElapsedMilliseconds, and so on).

My advice would be to use JetBrains dotTrace. It has some very helpful functionality that points out hotspots and tells you which piece of code took how long.
PS: it has saved my neck a few times.

Database accesses are generally orders of magnitude slower than any calculations you might make (unless you are trying to predict tomorrow's weather). So the LINQ-to-EF part is most probably where the time gets lost.
You can use profilers to analyse a program. SQL Server has a profiler that allows you to monitor queries. If you want to analyse the code, google for .NET profilers and you will find quite a few with a free licence. Or buy one, if you find it useful. The EQATEC profiler was quite useful for me.
If you have a big method, your code is badly structured. Making a method big does not make it faster than splitting it into smaller logical parts. Smaller parts are easier to maintain, and code profilers will yield more useful information, since they often only report per-method totals and don't show times for single lines of code.
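As a minimal sketch of that advice, with hypothetical DoCalculations / CallStoredProcedureN methods standing in for the parts of the original big method, splitting it up makes each part trivial to time and log:

// Sketch: time each logical part of the big method separately.
var sw = new System.Diagnostics.Stopwatch();

sw.Restart();
DoCalculations();                       // hypothetical: the C# calculation part
Console.WriteLine("Calculations: {0} ms", sw.ElapsedMilliseconds);

sw.Restart();
var r1 = CallStoredProcedure1();        // hypothetical: one of the 3-4 proc calls
Console.WriteLine("Procedure 1: {0} ms", sw.ElapsedMilliseconds);

sw.Restart();
var r2 = CallStoredProcedure2();
Console.WriteLine("Procedure 2: {0} ms", sw.ElapsedMilliseconds);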

Related

How to reliably measure code efficiency/complexity/performance/expensiveness in C#?

My question consists of two parts:
1. Is there any good way in C# to measure computational effort other than using timers such as Stopwatch? Below is what I have been doing, but the granularity is not great, and the result varies every time. I am wondering if there is a more precise measure, such as a CPU operation count, so that the results can be consistent.
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
//do work
stopWatch.Stop();
TimeSpan ts = stopWatch.Elapsed;
Console.WriteLine(ts);
2. If the alternative approach in (1) is not possible, how can I make the performance test results less variable? What factors can make the results change? Would closing all other running applications help? (I did try it, but there seems to be no significant effect.) How about running the test in a VM, a sandbox, etc.?
(After typing the preceding text I realized that I have also tried the Performance Analysis feature that comes with Visual Studio. The results seem coarser because of the sampling method it uses, so I also want to rule out that option.)
You need to get a profiling tool. But you can use Stopwatch more reliably if you run your tests in a loop multiple times and only keep a result when the garbage collection count stays the same.
Like this:
var timespans = new List<TimeSpan>();

while (true)
{
    var count = GC.CollectionCount(0);
    var sw = Stopwatch.StartNew();

    /* run test here */

    sw.Stop();

    if (count == GC.CollectionCount(0))
    {
        timespans.Add(sw.Elapsed);
    }

    if (timespans.Count == 100)
    {
        break;
    }
}
That'll give you 100 tests where garbage collection didn't occur. The average is then pretty good to work from.
If you find that your tests never run without invoking a garbage collection, try working out the minimum number of collections that get triggered, and only collect your timespans when that number occurs.
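Summarizing the samples is then one line (a sketch, assuming using System.Linq):

// Average of the 100 GC-free samples collected above.
double averageMs = timespans.Average(ts => ts.TotalMilliseconds);
Console.WriteLine("Average: {0:F3} ms over {1} runs", averageMs, timespans.Count);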
You could query a system performance counter. The MSDN documentation for the System.Diagnostics.PerformanceCounter class has some examples. With this class you could query "\Process(your_process_name)\% Processor Time", for example. It's an alternative to Stopwatch, but to be honest I think just using Stopwatch and averaging many runs over time is a perfectly good way to go.
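For illustration, sampling the processor-time counter for the current process might look like this sketch (assuming the System.Diagnostics and System.Threading namespaces; note the first NextValue() call returns 0, so you sample, wait, and sample again):

var counter = new PerformanceCounter(
    "Process", "% Processor Time",
    Process.GetCurrentProcess().ProcessName);

counter.NextValue();             // the first sample always reads 0
Thread.Sleep(1000);              // let the counter accumulate
Console.WriteLine("CPU: {0:F1} %", counter.NextValue());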
If what you need is a higher resolution stopwatch because you are trying to measure a very small slice of cpu time, then you may be interested in the High-Performance Counter.

Is there a good way to find the bottleneck in your app?

I am trying to figure out the best way to find out which portions of my application take the longest time to run (largest run cost). The application is not overly complex, but I want to ensure that all of the pieces are properly tuned so that I could potentially handle a greater load.
Application: loads / shreds XML documents and dumps the contents into a DB. The application uses LINQ to XML to parse the XML, and SQL Server TVPs to pass the data down to the DB. Because I am using TVPs, I have one round trip to the DB even when there are collections of data, and the data is not big (XML files of at most 1 MB).
Any suggestions on how to isolate the bottlenecks would be greatly appreciated.
As always greatly appreciate the feedback.
You may want to check out the Stopwatch class. You can sprinkle it into your code like this:
// Load-XML method
var stopWatch = new Stopwatch();
stopWatch.Start();
// run XML parsing code
stopWatch.Stop();
var xmlTime = stopWatch.Elapsed;

// SQL-Server-dump method
stopWatch.Restart();    // reuse the same watch; Restart avoids carrying over the first timing
// dump to SQL Server
stopWatch.Stop();
var sqlTime = stopWatch.Elapsed;
This is a low-tech way to take general measurements. For a simple application this is probably more efficient than a profiler, since your application only has two real candidates for a bottleneck. That said, learning how to use a profiler may be worth your while.
Building on Nate's answer, you can make things easier with a small helper method for this purpose.
public static Int64 MeasureTime(Action myAction)
{
    var stopWatch = new Stopwatch();
    stopWatch.Start();
    myAction();
    stopWatch.Stop();
    return stopWatch.ElapsedMilliseconds;
}
Sample usage:
StringBuilder result;
Console.WriteLine("Parse-Xml: {0}", MeasureTime(() => result = MyAction("Test.xml")));
The common way to do this is to use a profiling tool. I've used RedGate ANTS for profiling C# applications, and it works well, but there are plenty of alternatives.

Why is PLINQ slower? [duplicate]

Amazingly, using PLINQ did not yield benefits on a small test case I created; in fact, it was even worse than usual LINQ.
Here's the test code:
int repeatedCount = 10000000;

private void button1_Click(object sender, EventArgs e)
{
    var currTime = DateTime.Now;
    var strList = Enumerable.Repeat(10, repeatedCount);
    var result = strList.AsParallel().Sum();
    var currTime2 = DateTime.Now;
    textBox1.Text = (currTime2.Ticks - currTime.Ticks).ToString();
}

private void button2_Click(object sender, EventArgs e)
{
    var currTime = DateTime.Now;
    var strList = Enumerable.Repeat(10, repeatedCount);
    var result = strList.Sum();
    var currTime2 = DateTime.Now;
    textBox2.Text = (currTime2.Ticks - currTime.Ticks).ToString();
}
The result?
textbox1: 3437500
textbox2: 781250
So, LINQ is taking less time than PLINQ to complete a similar operation!
What am I doing wrong? Or is there a twist that I don't know about?
Edit: I've updated my code to use Stopwatch, and yet the same behavior persisted. To discount the effect of JIT, I actually tried clicking both button1 and button2 several times in no particular order. Although the times I got might differ, the qualitative behavior remained: PLINQ was indeed slower in this case.
First: Stop using DateTime to measure run time. Use a Stopwatch instead. The test code would look like:
var watch = new Stopwatch();
var strList = Enumerable.Repeat(10, 10000000);

watch.Start();
var result = strList.Sum();
watch.Stop();
Console.WriteLine("Linear: {0}", watch.ElapsedMilliseconds);

watch.Reset();
watch.Start();
var parallelResult = strList.AsParallel().Sum();
watch.Stop();
Console.WriteLine("Parallel: {0}", watch.ElapsedMilliseconds);

Console.ReadKey();
Second: Running things in Parallel adds overhead. In this case, PLINQ has to figure out the best way to divide your collection so that it can Sum the elements safely in parallel. After that, you need to join the results from the various threads created and Sum those as well. This isn't a trivial task.
Using the code above I can see that using Sum() nets a ~95ms call. Calling .AsParallel().Sum() nets around ~185ms.
Doing a task in Parallel is only a good idea if you gain something by doing it. In this case, Sum is a simple enough task that you don't gain by using PLINQ.
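For contrast, here's a sketch of the kind of case where PLINQ usually does pay off: substantial work per element. ExpensiveWork is a hypothetical stand-in for a CPU-heavy computation:

// Hypothetical CPU-heavy per-element work.
static int ExpensiveWork(int x)
{
    double acc = x;
    for (int i = 0; i < 10000; i++)
    {
        acc = Math.Sqrt(acc + i);
    }
    return (int)acc;
}

// With heavy per-element work, the parallel version usually wins:
var data = Enumerable.Range(0, 100000);
int sequential = data.Select(ExpensiveWork).Sum();
int parallel   = data.AsParallel().Select(ExpensiveWork).Sum();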
This is a classic mistake -- thinking, "I'll run a simple test to compare the performance of this single-threaded code with this multi-threaded code."
A simple test is the worst kind of test you can run to measure multi-threaded performance.
Typically, parallelizing some operation yields a performance benefit when the steps you're parallelizing require substantial work. When the steps are simple -- as in, quick* -- the overhead of parallelizing your work ends up dwarfing the minuscule performance gain you would have otherwise gotten.
Consider this analogy.
You're constructing a building. If you have one worker, he has to lay bricks one by one until he's made one wall, then do the same for the next wall, and so on until all walls are built and connected. This is a slow and laborious task that could benefit from parallelization.
The right way to do this would be to parallelize the wall building -- hire, say, 3 more workers, and have each worker construct his own wall so that 4 walls can be built simultaneously. The time it takes to find the 3 extra workers and assign them their tasks is insignificant in comparison to the savings you get by getting 4 walls up in the amount of time it would have previously taken to build 1.
The wrong way to do it would be to parallelize the brick laying -- hire about a thousand more workers and have each worker responsible for laying a single brick at a time. You may think, "If one worker can lay 2 bricks per minute, then a thousand workers should be able to lay 2000 bricks per minute, so I'll finish this job in no time!" But the reality is that by parallelizing your workload at such a microscopic level, you're wasting a tremendous amount of energy gathering and coordinating all of your workers, assigning tasks to them ("lay this brick right there"), making sure no one's work is interfering with anyone else's, etc.
So the moral of this analogy is: in general, use parallelization to split up the substantial units of work (like walls), but leave the insubstantial units (like bricks) to be handled in the usual sequential manner.
*For this reason, you can actually make a pretty good approximation of the performance gain of parallelization in a more work-intensive context by taking any fast-executing code and adding Thread.Sleep(100) (or some other random number) to the end of it. Suddenly sequential executions of this code will be slowed down by 100 ms per iteration, while parallel executions will be slowed significantly less.
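A sketch of that experiment: slow each iteration with Thread.Sleep(100) and compare a plain foreach against Parallel.ForEach:

// Sketch: an artificial per-item delay makes the parallel gain obvious.
var items = Enumerable.Range(0, 32).ToList();

var sw = Stopwatch.StartNew();
foreach (var i in items) { Thread.Sleep(100); }     // roughly 3200 ms
Console.WriteLine("Sequential: {0} ms", sw.ElapsedMilliseconds);

sw.Restart();
Parallel.ForEach(items, i => Thread.Sleep(100));    // far less wall-clock time
Console.WriteLine("Parallel:   {0} ms", sw.ElapsedMilliseconds);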
Others have pointed out some flaws in your benchmarks. Here's a short console app to make it simpler:
using System;
using System.Diagnostics;
using System.Linq;

public class Test
{
    const int Iterations = 1000000000;

    static void Main()
    {
        // Make sure everything's JITted
        Time(Sequential, 1);
        Time(Parallel, 1);
        Time(Parallel2, 1);

        // Now run the real tests
        Time(Sequential, Iterations);
        Time(Parallel, Iterations);
        Time(Parallel2, Iterations);
    }

    static void Time(Func<int, int> action, int count)
    {
        GC.Collect();
        Stopwatch sw = Stopwatch.StartNew();
        int check = action(count);
        if (count != check)
        {
            Console.WriteLine("Check for {0} failed!", action.Method.Name);
        }
        sw.Stop();
        Console.WriteLine("Time for {0} with count={1}: {2}ms",
                          action.Method.Name, count,
                          (long) sw.ElapsedMilliseconds);
    }

    static int Sequential(int count)
    {
        var strList = Enumerable.Repeat(1, count);
        return strList.Sum();
    }

    static int Parallel(int count)
    {
        var strList = Enumerable.Repeat(1, count);
        return strList.AsParallel().Sum();
    }

    static int Parallel2(int count)
    {
        var strList = ParallelEnumerable.Repeat(1, count);
        return strList.Sum();
    }
}
Compilation:
csc /o+ /debug- Test.cs
Results on my quad-core i7 laptop, which runs fast on up to 2 cores, or more slowly on 4 cores. Basically ParallelEnumerable.Repeat wins, followed by the sequential version, followed by parallelising the normal Enumerable.Repeat.
Time for Sequential with count=1: 117ms
Time for Parallel with count=1: 181ms
Time for Parallel2 with count=1: 12ms
Time for Sequential with count=1000000000: 9152ms
Time for Parallel with count=1000000000: 44144ms
Time for Parallel2 with count=1000000000: 3154ms
Note that earlier versions of this answer were embarrassingly flawed by having the wrong number of elements - I'm much more confident in the results above.
Is it possible you are not taking JIT time into account? You should run your test twice and discard the first set of results.
Also, you shouldn't use DateTime for performance timing; use the Stopwatch class instead:
var swatch = Stopwatch.StartNew();  // StartNew is static: it creates and starts the watch
var strList = Enumerable.Repeat(10, repeatedCount);
var result = strList.AsParallel().Sum();
swatch.Stop();
textBox1.Text = swatch.Elapsed.ToString();
PLINQ does add some overhead to the processing of a sequence, but the magnitude of the difference in your case seems excessive. PLINQ makes sense when the overhead cost is outweighed by the benefit of running the logic on multiple cores/CPUs. If you don't have multiple cores, running the processing in parallel offers no real advantage - and PLINQ should detect such a case and perform the processing sequentially.
EDIT: When creating embedded performance tests of this kind, you should make sure that you are not running them under the debugger, or with IntelliTrace enabled, as those can significantly skew performance timings.
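A small guard, just as a sketch, can flag the debugger case automatically:

// Warn if timings are about to be taken under a debugger.
if (System.Diagnostics.Debugger.IsAttached)
{
    Console.WriteLine("Warning: debugger attached; timings will be skewed.");
}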
Something more important that I didn't see mentioned is that .AsParallel will perform differently depending on the collection used.
In my tests, PLINQ is faster than LINQ when NOT used on a plain IEnumerable (Enumerable.Repeat):
  29ms PLINQ ParallelQuery
  30ms LINQ  ParallelQuery
  30ms PLINQ Array
  38ms PLINQ List
 163ms LINQ  IEnumerable
 211ms LINQ  Array
 213ms LINQ  List
 273ms PLINQ IEnumerable
4 processors
The code is in VB, but it's provided to show that using .ToArray made the PLINQ version a few times faster:
Dim test = Function(LINQ As Action, PLINQ As Action, type As String)
               Dim sw1 = Stopwatch.StartNew : LINQ() : Dim ts1 = sw1.ElapsedMilliseconds
               Dim sw2 = Stopwatch.StartNew : PLINQ() : Dim ts2 = sw2.ElapsedMilliseconds
               Return {String.Format("{0,4}ms LINQ {1}", ts1, type), String.Format("{0,4}ms PLINQ {1}", ts2, type)}
           End Function

Dim results = New List(Of String) From {Environment.ProcessorCount & " processors"}
Dim count = 12345678, iList = Enumerable.Repeat(1, count)

With iList : results.AddRange(test(Sub() .Sum(), Sub() .AsParallel.Sum(), "IEnumerable")) : End With
With iList.ToArray : results.AddRange(test(Sub() .Sum(), Sub() .AsParallel.Sum(), "Array")) : End With
With iList.ToList : results.AddRange(test(Sub() .Sum(), Sub() .AsParallel.Sum(), "List")) : End With
With ParallelEnumerable.Repeat(1, count) : results.AddRange(test(Sub() .Sum(), Sub() .AsParallel.Sum(), "ParallelQuery")) : End With

MessageBox.Show(String.Join(Environment.NewLine, From l In results Order By l))
Running the tests in a different order gives somewhat different results, so having each test on one line makes moving them up and down easier for me.
That may indeed be the case, because you are increasing the number of context switches and you are not performing any action that would benefit from having threads wait on something like I/O completion. This is going to be even worse if you are running on a single-CPU box.
I'd recommend using the Stopwatch class for timing metrics. In your case it's a better measure of the interval.
Please read the Side Effects section of this article.
http://msdn.microsoft.com/en-us/magazine/cc163329.aspx
I think you can run into many conditions where PLINQ has additional data-processing patterns you must understand before assuming it will always have faster response times.
Justin's comment about overhead is exactly right.
Just something to consider when writing concurrent software in general, beyond the use of PLINQ:
You always need to be thinking about the "granularity" of your work items. Some problems are very well suited to parallelization because they can be "chunked" at a very high level, like raytracing entire frames concurrently (these sorts of problems are called embarrassingly parallel). When there are very large "chunks" of work, then the overhead of creating and managing multiple threads becomes negligible compared to the actual work that you want to get done.
PLINQ makes concurrent programming easier, but it doesn't mean that you can ignore thinking about the granularity of your work.

Testing your code for speed?

I'm a total newbie, but I was writing a little program that worked on strings in C# and I noticed that if I did a few things differently, the code executed significantly faster.
So it had me wondering: how do you go about clocking your code's execution speed? Are there any (free) utilities? Do you go about it the old-fashioned way with a System.Timer and do it yourself?
What you are describing is known as performance profiling. There are many programs you can get to do this, such as the JetBrains profiler or ANTS profiler, although most will slow down your application while measuring its performance.
To hand-roll your own performance profiling, you can use System.Diagnostics.Stopwatch and a simple Console.WriteLine, like you described.
Also keep in mind that the C# JIT compiler optimizes code depending on how and how often it is called, so play around with loops of differing sizes and methods such as recursive calls to get a feel for what works best.
ANTS Profiler from Red Gate is a really nice performance profiler. dotTrace Profiler from JetBrains is also great. These tools let you drill performance metrics down to each individual line.
Screenshot of ANTS Profiler:
ANTS http://www.red-gate.com/products/ants_profiler/images/app/timeline_calltree3.gif
If you want to ensure that a specific method stays within a specific performance threshold during unit testing, I would use the Stopwatch class to monitor the execution time of the method, one or many times in a loop, calculate the average, and then Assert against the result.
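A minimal sketch of that approach; the iteration count and threshold are arbitrary, and MethodUnderTest is a hypothetical stand-in (swap the plain check for your test framework's Assert):

// Sketch: fail a test if the average execution time exceeds a budget.
const int iterations = 100;
const long thresholdMs = 50;            // arbitrary budget for this method

var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < iterations; i++)
{
    MethodUnderTest();                  // hypothetical method being checked
}
sw.Stop();

long averageMs = sw.ElapsedMilliseconds / iterations;
if (averageMs > thresholdMs)
{
    throw new Exception(string.Format(
        "Too slow: {0} ms average, budget {1} ms", averageMs, thresholdMs));
}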
Just a reminder - make sure to compile in Release, not Debug! (I've seen this mistake made by seasoned developers - it's easy to forget.)
What you are describing is 'performance tuning'. When we talk about performance tuning there are two angles to it: (a) response time - how long it takes to execute a particular request/program; and (b) throughput - how many requests it can execute per second. When we 'optimize' and eliminate unnecessary processing, both response time and throughput improve. However, if you have wait events in your code (like Thread.Sleep() or I/O waits), your response time is affected but throughput is not. By adopting parallel processing (spawning multiple threads) we can improve response time, but throughput will not improve. Typically, for server-side applications both response time and throughput are important. For desktop applications (like an IDE), throughput is not important; only response time is.
You can measure response time by 'performance testing' - you just note down the response time for all key transactions. You can measure throughput by 'load testing' - you pump requests continuously from a sufficiently large number of threads/clients such that the CPU usage of the server machine reaches 80-90%. When we pump requests we need to maintain the ratio between different transactions (called the transaction mix) - for example, in a reservation system there will be 10 bookings for every 100 searches, and one cancellation for every 10 bookings.
After identifying the transactions that require response-time tuning (via performance testing), you can identify the hot spots with a profiler.
You can identify the hot spots for throughput by weighting each transaction's response time by its fraction of the mix. Assume in the search/booking/cancellation scenario the ratio is 89:10:1 and the response times are 0.1 s, 10 s, and 15 s:
load for search       = 0.1 * 0.89 = 0.089
load for booking      = 10  * 0.10 = 1
load for cancellation = 15  * 0.01 = 0.15
Here, tuning booking will yield the maximum impact on throughput.
You can also identify hot spots for throughput by repeatedly taking thread dumps (in the case of Java-based applications).
Use a profiler.
Ants (http://www.red-gate.com/Products/ants_profiler/index.htm)
dotTrace (http://www.jetbrains.com/profiler/)
If you need to time one specific method only, the Stopwatch class might be a good choice.
I do the following things:
1) I use ticks (e.g. in VB.NET, Now.Ticks) to capture the current time. I subtract the starting ticks from the finishing ticks and divide by TimeSpan.TicksPerSecond to get how many seconds it took.
2) I avoid UI operations (like Console.WriteLine).
3) I run the code in a substantial loop (like 100,000 iterations) to factor out usage/OS variables as best I can - see the sketch below.
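A sketch of that loop-based approach; the tick arithmetic mirrors point 1, and DoWork is a hypothetical stand-in for the code being measured:

// Sketch: average seconds per iteration over a large loop,
// using the tick arithmetic described above.
const int iterations = 100000;

long startTicks = DateTime.Now.Ticks;
for (int i = 0; i < iterations; i++)
{
    DoWork();   // hypothetical method under test
}
long elapsedTicks = DateTime.Now.Ticks - startTicks;

double totalSeconds = (double)elapsedTicks / TimeSpan.TicksPerSecond;
Console.WriteLine("{0:F6} s total, {1:E3} s per iteration",
    totalSeconds, totalSeconds / iterations);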
You can use the Stopwatch class to time methods. Remember that the first run is often slow because the code has to be JITted.
There is a native .NET option (Team Edition for Software Developers) that might address some performance analysis needs. From the 2005 .NET IDE menu, select Tools->Performance Tools->Performance Wizard...
[GSS is probably correct that you must have Team Edition]
This is a simple example for testing code speed. I hope it helps:
using System;
using System.Collections;
using System.Diagnostics;

class Program {
    static void Main(string[] args) {
        const int steps = 10000;
        Stopwatch sw = new Stopwatch();

        ArrayList list1 = new ArrayList();
        sw.Start();
        for (int i = 0; i < steps; i++) {
            list1.Add(i);
        }
        sw.Stop();
        Console.WriteLine("ArrayList:\tMilliseconds = {0},\tTicks = {1}",
            sw.ElapsedMilliseconds, sw.ElapsedTicks);

        MyList list2 = new MyList();    // MyList: your own list type being compared
        sw.Restart();                   // restart, or the first timing leaks into this one
        for (int i = 0; i < steps; i++) {
            list2.Add(i);
        }
        sw.Stop();
        Console.WriteLine("MyList: \tMilliseconds = {0},\tTicks = {1}",
            sw.ElapsedMilliseconds, sw.ElapsedTicks);
    }
}
