I am writing a Mesh Rendering manager and thought it would be a good idea to group all of the meshes which use the same shader and then render these while I'm in that shader pass.
I am currently using a foreach loop, but wondered if utilising LINQ might give me a performance increase?
Why should LINQ be faster? It also uses loops internally.
Most of the time, LINQ will be a bit slower because it introduces overhead. Don't use LINQ if you care a lot about performance. Use LINQ because you want shorter, more readable and maintainable code.
LINQ-to-Objects generally adds some marginal overhead (multiple iterators, etc.). It still has to do the loops, has delegate invocations, and will generally have to do some extra dereferencing to get at captured variables. In most code this will be virtually undetectable, and more than paid for by the simpler, easier-to-understand code.
With other LINQ providers, like LINQ-to-SQL, the query can filter at the server, so it should be much better than a flat foreach; but most likely you wouldn't have written a blanket "select * from foo" anyway, so that isn't necessarily a fair comparison.
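As a rough illustration (a sketch: db, Products and UnitPrice are hypothetical LINQ-to-SQL mappings, not anything from the question):
// The provider translates the predicate into SQL, roughly:
//   SELECT ... FROM Products WHERE UnitPrice > 10
// so only matching rows cross the wire, instead of fetching everything
// and filtering client-side with a foreach.
var expensive = db.Products
                  .Where(p => p.UnitPrice > 10)
                  .ToList();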
Re PLINQ: parallelism may reduce the elapsed time, but the total CPU time will usually increase a little due to the overhead of thread management, etc.
LINQ is slower now, but it might get faster at some point. The good thing about LINQ is that you don't have to care about how it works. If a new method is thought up that's incredibly fast, the people at Microsoft can implement it without even telling you, and your code would be a lot faster.
More importantly though, LINQ is just much easier to read. That should be enough reason.
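For the mesh-grouping scenario in the question, readability is exactly what LINQ buys you. A minimal sketch (the Mesh and Shader types and the renderer calls are assumptions, not the asker's actual API):
// Group meshes by the shader they use, then render one group per shader pass.
// GroupBy enumerates the list once; next to the actual draw calls, its
// overhead versus a hand-rolled Dictionary<Shader, List<Mesh>> is negligible.
foreach (var group in meshes.GroupBy(m => m.Shader))
{
    BeginShaderPass(group.Key);  // assumed renderer call
    foreach (var mesh in group)
    {
        Draw(mesh);              // assumed renderer call
    }
    EndShaderPass();
}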
It should probably be noted that a for loop is often faster than foreach over arrays and lists. So, for the original post, if you are worried about performance on a critical component like a renderer, use a for loop.
Reference:
In .NET, which loop runs faster, 'for' or 'foreach'?
You might get a performance boost if you use parallel LINQ for multi cores. See Parallel LINQ (PLINQ) (MSDN).
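For example, a sketch (items and ExpensiveCheck are placeholders; for cheap per-element work the partitioning overhead can outweigh the gain):
// AsParallel() spreads the work across cores; add AsOrdered() if the
// result order matters. Only worthwhile when the per-element work is costly.
var matches = items.AsParallel()
                   .Where(i => ExpensiveCheck(i))
                   .ToList();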
I was interested in this question, so I did a test just now, using .NET Framework 4.5.2 on an Intel(R) Core(TM) i3-2328M CPU @ 2.20 GHz (2 cores) with 8 GB RAM, running Microsoft Windows 7 Ultimate.
It looks like LINQ might be faster than a foreach loop. Here are the results I got:
Exists = True
Time = 174
Exists = True
Time = 149
It would be interesting if some of you could copy and paste this code into a console app and test it as well.
Before testing with an object (Employee) I tried the same test with integers. LINQ was faster there as well.
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class Program
{
    public class Employee
    {
        public int id;
        public string name;
        public string lastname;
        public DateTime dateOfBirth;

        public Employee(int id, string name, string lastname, DateTime dateOfBirth)
        {
            this.id = id;
            this.name = name;
            this.lastname = lastname;
            this.dateOfBirth = dateOfBirth;
        }
    }

    public static void Main() => StartObjTest();

    #region object test
    public static void StartObjTest()
    {
        List<Employee> items = new List<Employee>();
        for (int i = 0; i < 10000000; i++)
        {
            items.Add(new Employee(i, "name" + i, "lastname" + i, DateTime.Today));
        }
        Test3(items, items.Count - 100);
        Test4(items, items.Count - 100);
        Console.Read();
    }

    // Linear search with an explicit foreach loop.
    public static void Test3(List<Employee> items, int idToCheck)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        bool exists = false;
        foreach (var item in items)
        {
            if (item.id == idToCheck)
            {
                exists = true;
                break;
            }
        }
        Console.WriteLine("Exists=" + exists);
        Console.WriteLine("Time=" + s.ElapsedMilliseconds);
    }

    // Same search via a lambda. (Strictly speaking, List<T>.Exists is a List
    // method rather than a LINQ extension; the LINQ equivalent is Any().)
    public static void Test4(List<Employee> items, int idToCheck)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        bool exists = items.Exists(e => e.id == idToCheck);
        Console.WriteLine("Exists=" + exists);
        Console.WriteLine("Time=" + s.ElapsedMilliseconds);
    }
    #endregion

    #region int test
    public static void StartIntTest()
    {
        List<int> items = new List<int>();
        for (int i = 0; i < 10000000; i++)
        {
            items.Add(i);
        }
        Test1(items, -100);
        Test2(items, -100);
        Console.Read();
    }

    public static void Test1(List<int> items, int itemToCheck)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        bool exists = false;
        foreach (var item in items)
        {
            if (item == itemToCheck)
            {
                exists = true;
                break;
            }
        }
        Console.WriteLine("Exists=" + exists);
        Console.WriteLine("Time=" + s.ElapsedMilliseconds);
    }

    public static void Test2(List<int> items, int itemToCheck)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        bool exists = items.Contains(itemToCheck);
        Console.WriteLine("Exists=" + exists);
        Console.WriteLine("Time=" + s.ElapsedMilliseconds);
    }
    #endregion
}
This is actually quite a complex question. LINQ makes certain things very easy to do that you might stumble over if you implemented them yourself (e.g. LINQ's .Except()). This particularly applies to PLINQ, and especially to parallel aggregation as implemented by PLINQ.
In general, for identical code, LINQ will be slower, because of the overhead of delegate invocation.
If, however, you are processing a large array of data and applying relatively simple calculations to the elements, you will get a huge performance increase if:
You use an array to store the data.
You use a for loop to access each element (as opposed to foreach or LINQ).
Note: when benchmarking, please remember that if you use the same array/list for two consecutive tests, the CPU cache will make the second one faster.
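As a tiny sketch of the pattern those two points describe (assuming data is a double[]; the exact numbers will vary):
// Tight numeric work over an array: a plain for loop has no enumerator or
// delegate overhead, and because the condition tests data.Length directly,
// the JIT can remove the per-element bounds checks.
double sum = 0;
for (int i = 0; i < data.Length; i++)
{
    sum += data[i] * data[i];
}

// The LINQ equivalent is shorter, but pays a delegate invocation per element.
double sumLinq = data.Sum(x => x * x);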
Coming in .NET 7 are some significant updates to LINQ performance for .Min(), .Max(), .Average() and .Sum().
Reference: https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-7/#linq
Here is a benchmark from the post.
If you compare these to a foreach loop, it becomes apparent that in .NET 6 the foreach loop was faster, while in .NET 7 the LINQ methods are:
This is the code of the benchmark, using BenchmarkDotNet:
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

public class Program
{
    public static void Main()
    {
        BenchmarkRunner.Run<ForEachVsLinq>();
    }
}

[SimpleJob(RuntimeMoniker.Net60)]
[SimpleJob(RuntimeMoniker.Net70)]
[MemoryDiagnoser(false)]
public class ForEachVsLinq
{
    private int[] _intArray;

    [GlobalSetup]
    public void Setup()
    {
        var random = new Random();
        var randomItems = Enumerable.Range(0, 500).Select(_ => random.Next(999));
        this._intArray = randomItems.ToArray();
    }

    [Benchmark]
    public void ForEachMin()
    {
        // Note: Console.WriteLine inside a benchmark adds I/O noise to the
        // measurement; returning the value from the method is cleaner.
        var min = int.MaxValue;
        foreach (var i in this._intArray)
        {
            if (i < min)
                min = i;
        }
        Console.WriteLine(min);
    }

    [Benchmark]
    public void Min()
    {
        var min = this._intArray.Min();
        Console.WriteLine(min);
    }

    [Benchmark]
    public void ForEachMax()
    {
        var max = 0; // safe here: Setup only generates non-negative values
        foreach (var i in this._intArray)
        {
            if (i > max)
                max = i;
        }
        Console.WriteLine(max);
    }

    [Benchmark]
    public void Max()
    {
        var max = this._intArray.Max();
        Console.WriteLine(max);
    }

    [Benchmark]
    public void ForEachSum()
    {
        var sum = 0;
        foreach (var i in this._intArray)
        {
            sum += i;
        }
        Console.WriteLine(sum);
    }

    [Benchmark]
    public void Sum()
    {
        var sum = this._intArray.Sum();
        Console.WriteLine(sum);
    }
}
In .NET 6 and earlier versions, the mentioned methods are slower than writing your own foreach loop to find the min, max or average, or to sum the items in the array.
But in .NET 7, the performance improvements make these built-in LINQ methods actually a lot faster.
Nick Chapsas shows this in a benchmark video on YouTube.
So if you want to calculate the sum, min, max or average value, you should use the LINQ methods instead of a foreach loop from .NET 7 onwards (at least from a performance point of view).
Related
I have a LINQ query like this:
public static bool CheckIdExists(int searchId)
{
    return itemCollection.Any(item => item.Id.Equals(searchId.ConvertToString()));
}
item.Id is a string while searchId is an int. .ConvertToString() is an extension which converts an int to a string.
Code for ConvertToString:
public static string ConvertToString(this object input)
{
    return Convert.ToString(input, CultureInfo.InvariantCulture);
}
Now my question is: does searchId.ConvertToString() get executed for each item in itemCollection?
Does computing searchId.ConvertToString() beforehand, and calling the method like below, improve performance?
public static bool CheckIdExists(int searchId)
{
    string sId = searchId.ConvertToString();
    return itemCollection.Any(item => item.Id.Equals(sId));
}
How to debug these two scenarios and observe their performances?
I re-created the scenarios you talked about in your question. I tried the following code and got the output shown below.
This is how you can debug it.
// Requires: using System; using System.Collections.Generic;
//           using System.Diagnostics; using System.Globalization; using System.Linq;
static List<string> itemCollection = new List<string>();

static void Main(string[] args)
{
    for (int i = 0; i < 10000000; i++)
    {
        itemCollection.Add(i.ToString());
    }

    var watch = new Stopwatch();
    watch.Start();
    Console.WriteLine(CheckIdExists(580748));
    watch.Stop();
    Console.WriteLine($"Took {watch.ElapsedMilliseconds}");

    var watch1 = new Stopwatch();
    watch1.Start();
    Console.WriteLine(CheckIdExists1(580748));
    watch1.Stop();
    Console.WriteLine($"Took {watch1.ElapsedMilliseconds}");

    Console.ReadLine();
}

// Converts searchId inside the lambda: the conversion runs once per item examined.
public static bool CheckIdExists(int searchId)
{
    return itemCollection.Any(item => item.Equals(ConvertToString(searchId)));
}

// Converts searchId once, up front.
public static bool CheckIdExists1(int searchId)
{
    string sId = ConvertToString(searchId);
    return itemCollection.Any(item => item.Equals(sId));
}

public static string ConvertToString(int input)
{
    return Convert.ToString(input, CultureInfo.InvariantCulture);
}
OUTPUT:
True
Took 170
True
Took 11
How long it takes is the ultimate guide. You can create a Stopwatch to log the performance of any code; just use ElapsedMilliseconds to see how long it took. For very short operations, I suggest using very long loops to get a more accurate measurement.
var watch = new Stopwatch();
watch.Start();
// CODE HERE (IDEALLY IN A LONG LOOP)
Debug.WriteLine($"Took {watch.ElapsedMilliseconds}");
Yes, it should be faster to get the string once. But I guess the compiler optimizes that for you (I only suspect this and don't have anything to back it up; I just remember that compilers are very good at detecting things that do not change).
And no, it's not computed for every item, since the LINQ method Any does not necessarily check all items: it returns true at the first matching item. The only scenario in which it checks all items is when the lambda returns true for none of them.
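You can watch the short-circuiting happen by counting how many times the predicate body runs (a small sketch):
var numbers = new List<int> { 1, 2, 3, 4, 5 };
int calls = 0;

// Any() stops at the first match: the lambda runs twice here, not five times.
bool found = numbers.Any(n => { calls++; return n == 2; });

Console.WriteLine("found=" + found + ", calls=" + calls);  // found=True, calls=2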
If you want to test the speed difference, make sure you have more data; otherwise the difference may be too small.
Just do:
itemCollection = Enumerable.Range(0, 1000).SelectMany(x => itemCollection).ToList(); // or an array, or whatever type of collection you have
Then measure the times with Stopwatch, just like @RobSedgwick said.
I think you have two solutions:
1- Add logging, and write DateTime.Now to the log around the code you want to measure.
2- Use the Diagnostic Tools tab.
Hopefully this helps you.
I am currently implementing a runtime (i.e. a collection of functions) for a formulas language. Some formulas need a context to be passed to them and I created a class called EvaluationContext which contains all properties I need access to at runtime.
Using ThreadLocal<EvaluationContext> seems like a good option to make this context available to the runtime functions. The other option is to pass the context as a parameter to the functions that need it.
I prefer using ThreadLocal but I was wondering if there is any performance penalty as opposed to passing the evaluation context via method parameters.
I created the program below and it is faster to use parameters rather than the ThreadLocal field.
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace TestThreadLocal
{
    internal class Program
    {
        public class EvaluationContext
        {
            public int A { get; set; }
            public int B { get; set; }
        }

        public static class FormulasRunTime
        {
            public static ThreadLocal<EvaluationContext> Context = new ThreadLocal<EvaluationContext>();

            public static int SomeFunction()
            {
                EvaluationContext ctx = Context.Value;
                return ctx.A + ctx.B;
            }

            public static int SomeFunction(EvaluationContext context)
            {
                return context.A + context.B;
            }
        }

        private static void Main(string[] args)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            int N = 10000;
            Task<int>[] tasks = new Task<int>[N];
            int sum = 0;
            for (int i = 0; i < N; i++)
            {
                int x = i;
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    //Console.WriteLine("Starting {0}, thread {1}", x, Thread.CurrentThread.ManagedThreadId);
                    FormulasRunTime.Context.Value = new EvaluationContext { A = 0, B = x };
                    return FormulasRunTime.SomeFunction();
                });
                sum += i;
            }
            Task.WaitAll(tasks);
            Console.WriteLine("Using ThreadLocal: It took {0} millisecs and the sum is {1}", stopwatch.ElapsedMilliseconds, tasks.Sum(t => t.Result));
            Console.WriteLine(sum);

            stopwatch = Stopwatch.StartNew();
            for (int i = 0; i < N; i++)
            {
                int x = i;
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    return FormulasRunTime.SomeFunction(new EvaluationContext { A = 0, B = x });
                });
            }
            Task.WaitAll(tasks);
            Console.WriteLine("Using parameter: It took {0} millisecs and the sum is {1}", stopwatch.ElapsedMilliseconds, tasks.Sum(t => t.Result));

            Console.ReadKey();
        }
    }
}
Going on from costa's answer: if you try N as 10000000,
int N = 10000000;
you will see there is not much of a difference (around 107.4 vs 103.4 seconds).
If the value gets bigger, the difference becomes smaller.
So, if you do not mind a few seconds of slowness, I think it comes down to usability and taste.
PS: In the code, the int return types must be changed to long.
I consider the ThreadLocal design to be dirty, yet creative. It is definitely going to be faster to use parameters but performance should not be your only concern. Parameters will be much clearer to understand. I recommend you go with parameters.
There will not be any noticeable performance impact, but with ThreadLocal you will not be able to do any parallel computations within one evaluation (which can be quite useful, especially in the formulas domain).
If you definitely don't need that, you can go for ThreadLocal.
Otherwise, I would suggest you look at the "state monad" pattern, which lets you seamlessly pass your state (context) through your computations (formulas) without any explicit parameters.
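A minimal sketch of that idea in C# (closer to a Reader pattern than a full state monad; the Then combinator and the demo are illustrative assumptions, not an established API):
using System;

public class EvaluationContext
{
    public int A { get; set; }
    public int B { get; set; }
}

public static class Formulas
{
    // Chain two context-dependent computations. The context is threaded
    // through implicitly, so intermediate steps take no context parameter.
    public static Func<EvaluationContext, TOut> Then<TIn, TOut>(
        this Func<EvaluationContext, TIn> source,
        Func<TIn, Func<EvaluationContext, TOut>> next)
    {
        return ctx => next(source(ctx))(ctx);
    }
}

public static class Demo
{
    public static void Main()
    {
        Func<EvaluationContext, int> a = ctx => ctx.A;
        Func<EvaluationContext, int> sum = a.Then<int, int>(value => ctx => value + ctx.B);
        Console.WriteLine(sum(new EvaluationContext { A = 2, B = 3 })); // prints 5
    }
}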
I think you'll find that in a head-to-head comparison, accessing a ThreadLocal<> takes substantially longer than accessing a parameter, but in the end it might not be a significant difference - it all depends what else you're doing.
We experienced some slowness in our code when opening a form, possibly due to a for loop with a break that was taking a long time to execute. I switched this to IEnumerable.Any() and saw the form open very quickly. I am now trying to figure out whether making this change alone increased performance, or whether it accesses the ProductIDs property more efficiently. Should this implementation be faster, and if so, why?
Original Implementation:
public bool ContainsProduct(int productID) {
    bool containsProduct = false;
    for (int i = 0; i < this.ProductIDs.Length; i++) {
        if (productID == this.ProductIDs[i]) {
            containsProduct = true;
            break;
        }
    }
    return containsProduct;
}
New Implementation:
public bool ContainsProduct(int productID) {
    return this.ProductIDs.Any(t => productID == t);
}
Call this an educated guess:
this.ProductIDs.Length
This probably is where the slowness lies. If the list of ProductIDs gets retrieved from a database (for example) on every iteration in order to get the Length, it would indeed be very slow. You can confirm this by profiling your application.
If this is not the case (say, ProductIDs is in memory and Length is cached), then both should have an almost identical running time.
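A quick way to test that hypothesis (a sketch): evaluate the property a single time and loop over the snapshot; if this version is fast, the repeated ProductIDs access was the culprit.
public bool ContainsProduct(int productID) {
    var ids = this.ProductIDs;  // evaluate the property exactly once
    for (int i = 0; i < ids.Length; i++) {
        if (productID == ids[i]) {
            return true;
        }
    }
    return false;
}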
The first implementation is slightly faster (enumeration is slightly slower than a for loop). The second one is a lot more readable.
UPDATE
Oded's answer is possibly correct, and well done for spotting it. If so, the first one is slower here, since it involves a database roundtrip on every iteration; otherwise, it is slightly faster, as I said.
UPDATE 2 - Proof
Here is simple code showing why the first one is faster:
public static void Main()
{
    int[] values = Enumerable.Range(0, 1000000).ToArray();
    int dummy = 0;

    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    for (int i = 0; i < values.Length; i++)
    {
        dummy *= i;
    }
    stopwatch.Stop();
    Console.WriteLine("Loop took {0}", stopwatch.ElapsedTicks);

    dummy = 0;
    stopwatch.Reset();
    stopwatch.Start();
    foreach (var value in values)
    {
        dummy *= value;
    }
    stopwatch.Stop();
    Console.WriteLine("Iteration took {0}", stopwatch.ElapsedTicks);

    Console.Read();
}
Here is output:
Loop took 12198
Iteration took 20922
So the loop is twice as fast as iteration/enumeration.
I think they would be more or less identical. I usually refer to Jon Skeet's Reimplementing LINQ to Objects blog series to get an idea of how the extension methods work. Here's the post for Any() and All()
Here's the core part of Any() implementation from that post
public static bool Any<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
    ...
    foreach (TSource item in source)
    {
        if (predicate(item))
        {
            return true;
        }
    }
    return false;
}
This post assumes that ProductIDs is a List<T> or an array. So I'm talking about Linq-to-objects.
LINQ is usually slower than conventional loop-based code, but shorter and more readable; a factor of 2-3, depending on what you're doing, is typical.
Can you refactor your code to make this.ProductIDs a HashSet<T>? Or at least sort the array so you can use a binary search. Your problem is that you're performing a linear search, which is slow if there are many products.
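For example (a sketch; it assumes the product IDs can be materialized once and don't change, otherwise the set must be rebuilt):
private HashSet<int> _productIdSet;

public bool ContainsProduct(int productID) {
    // Build the set once; lookups are then O(1) on average
    // instead of an O(n) scan per call.
    if (_productIdSet == null)
        _productIdSet = new HashSet<int>(this.ProductIDs);
    return _productIdSet.Contains(productID);
}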
I think the implementation below would be a little faster than the corresponding LINQ implementation, though only marginally:
public bool ContainsProduct(int productID) {
    var length = this.ProductIDs.Length;
    for (int i = 0; i < length; i++) {
        if (productID == this.ProductIDs[i]) {
            return true;
        }
    }
    return false;
}
The difference will generally be in memory usage rather than speed.
But generally, you should use a for loop when you know you will visit every element of the array; in other cases, try a while or do-while loop.
I think this solution uses minimal resources:
int i = this.ProductIDs.Length - 1;
while (i >= 0) {
    if (this.ProductIDs[i--] == productID) {
        return true;
    }
}
return false;
This is more of an academic question about performance than a realistic "what should I use" question, but I'm curious: I don't dabble much in IL, so I haven't seen what gets constructed, and I don't have a large dataset on hand to profile against.
So which is faster:
List<MyObject> objs = SomeHowGetList();
List<string> strings = new List<string>();
foreach (MyObject o in objs)
{
    if (o.Field == "something")
        strings.Add(o.Field);
}
or:
List<MyObject> objs = SomeHowGetList();
List<string> strings = new List<string>();
string s;
foreach (MyObject o in objs)
{
    s = o.Field;
    if (s == "something")
        strings.Add(s);
}
Keep in mind that I don't really want to know the performance impact of the strings.Add(s) call (as whatever operation needs to be done can't really be changed), just the performance difference between setting s on each iteration (let's say that s can be any primitive type or string) versus calling the getter on the object on each iteration.
Your second option is noticeably faster in my tests. I'm such a flip-flopper! Seriously though, some comments were made about the code in my original test. Here's the updated code that shows option 2 being faster.
// Requires: using System; using System.Collections.Generic; using System.Diagnostics;
class Foo
{
    public string Bar { get; set; }

    public static List<Foo> FooMeUp()
    {
        var foos = new List<Foo>();
        for (int i = 0; i < 10000000; i++)
        {
            foos.Add(new Foo() { Bar = (i % 2 == 0) ? "something" : i.ToString() });
        }
        return foos;
    }
}

static void Main(string[] args)
{
    var foos = Foo.FooMeUp();
    var strings = new List<string>();

    Stopwatch sw = Stopwatch.StartNew();
    foreach (Foo o in foos)
    {
        if (o.Bar == "something")
        {
            strings.Add(o.Bar);
        }
    }
    sw.Stop();
    Console.WriteLine("It took {0}", sw.ElapsedMilliseconds);

    strings.Clear();
    sw = Stopwatch.StartNew();
    foreach (Foo o in foos)
    {
        var s = o.Bar;
        if (s == "something")
        {
            strings.Add(s);
        }
    }
    sw.Stop();
    Console.WriteLine("It took {0}", sw.ElapsedMilliseconds);

    Console.ReadLine();
}
Most of the time, your second code snippet should be at least as fast as the first snippet.
These two code snippets are not functionally equivalent. Properties are not guaranteed to return the same result across individual accesses. As a consequence, the JIT optimizer is not able to cache the result (except for trivial cases), and caching the result of a long-running property yourself will be faster. Look at this example: why foreach is faster than a for loop while reading RichTextBox lines.
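A contrived sketch (the Counter type here is hypothetical) of why the two snippets are not interchangeable in general:
var c = new Counter();
bool sameTwice = c.Next == c.Next;   // false: two gets return two different values
int cached = c.Next;
bool sameCached = cached == cached;  // true: one get, the value is reused

class Counter
{
    private int _n;
    // Every get returns a different value, so no optimizer may legally
    // collapse two accesses into one; caching changes the program's meaning.
    public int Next => _n++;
}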
However, for some specific cases like:
for (int i = 0; i < myArray.Length; ++i)
where myArray is an array object, the compiler is able to detect the pattern, optimize the code and omit the bounds checks. It might actually be slower if you cache the result of the Length property, like:
int len = myArray.Length;
for (int i = 0; i < len; ++i)
It really depends on the implementation. In most cases, it is assumed (as a matter of common practice/courtesy) that a property is inexpensive. However, it could be that each "get" does a non-cached search over some remote resource. For standard, simple properties, you'll never notice a real difference between the two. For the worst case, fetch-once, store and re-use will be much faster.
I'd be tempted to call the getter twice until I know there is a problem... "premature optimisation", etc. But if I were using it in a tight loop, then I might store it in a variable. Except for Length on an array, which has special JIT treatment ;-p
Generally the second one is faster, as the first one recalculates the property on each iteration.
Here is an example of something that could take a significant amount of time:
var d = new DriveInfo("C:");
var label = d.VolumeLabel; // fetches the drive label on each access
Storing the value in a local variable is the faster option.
Although a method call doesn't impose a huge overhead, it far outweighs storing the value once to a local variable on the stack and then retrieving it.
I for one do it consistently.