I want to know the most efficient way of replacing empty strings in an array with null values.
I have the following array:
string[] _array = new string [10];
_array[0] = "A";
_array[1] = "B";
_array[2] = "";
_array[3] = "D";
_array[4] = "E";
_array[5] = "F";
_array[6] = "G";
_array[7] = "";
_array[8] = "";
_array[9] = "J";
and I am currently replacing empty strings by the following:
for (int i = 0; i < _array.Length; i++)
{
if (_array[i].Trim() == "")
{
_array[i] = null;
}
}
which works fine on small arrays but I'm chasing some code that is the most efficient at doing the task because the arrays I am working with could be much larger and I would be repeating this process over and over again.
Is there a linq query or something that is more efficient?
You might consider switching _array[i].Trim() == "" with string.IsNullOrWhitespace(_array[i]) to avoid new string allocation. But that's pretty much all you can do to make it faster and still keep sequential. LINQ will not be faster than a for loop.
You could try making your processing parallel, but that seems like a bigger change, so you should evaluate if that's ok in your scenario.
Parallel.For(0, _array.Length, i => {
if (string.IsNullOrWhitespace(_array[i]))
{
_array[i] = null;
}
});
As far as efficiency it is fine but it also depends on how large the array is and the frequency that you would be iterating over such arrays. The main problem I see is that you could get a NullReferenceException with your trim method. A better approach is to use string.IsNullOrEmpty or string.IsNullOrWhiteSpace, the later is more along the lines of what you want but is not available in all versions of .net.
for (int i = 0; i < _array.Length; i++)
{
if (string.IsNullOrWhiteSpace(_array[i]))
{
_array[i] = null;
}
}
LINQ is mainly used for querying not for assignment. To do certain action on Collection, you could try to use List. If you use List instead of Array, you could do it with one line instead:
_list.ForEach(x => string.IsNullOrWhiteSpace(x) ? x = null; x = x);
A linq query will do essentially the same thing behind the scenes so you aren't going to gain any real efficiency simply by using linq.
When determining something more efficient, look at a few things:
How big will your array grow?
How often will the data in your array change?
Does the order of your array matter?
You've already answered that your array might grow to large sizes and performance is a concern.
So looking at options 2 and 3 together, if the order of your data doesn't matter then you could keep your array sorted and break the loop after you detect non-empty strings.
Ideally, you would be able to check the data on the way in so you don't have to constantly loop over your entire array. Is that not a possibility?
Hope this at least gets some thoughts going.
It's ugly, but you can eliminate the CALL instruction to the RTL, as I mentioned earlier, with this code:
if (_array[i] != null) {
Boolean blank = true;
for(int j = 0; j < value.Length; j++) {
if(!Char.IsWhiteSpace(_array[i][j])) {
blank = false;
break;
}
}
if (blank) {
_array[i] = null;
}
}
But it does add an extra assignment and includes an extra condition and it is just too ugly for me. But if you want to shave off nanoseconds off a massive list then perhaps this could be used. I like the idea of parallel processing and you could wrap this with Parallel.
Use the below code
_array = _array.Select(str => { if (str.Length == 0) str = null; return str; }).ToArray();
I'm writing a chess engine in C#, and I'm trying to debug move generation. I've been using breakpoints so far to check variables, but I need a better way to debug. Despite the fact that a breakpoint shows that this list has a length of 86, the loop only runs 5 times. The only thing I can think of is that the arrays in the list have a length of 5, but I have no idea what would be causing this.
foreach (int[] debugBoard in futures)
{
for (int i = 0; i < 5; i++)
{
Debug.Write(debugBoard[i].ToString().PadLeft(3, ' '));
}
Debug.Write("\n");
int[,] debugOutBoard = new int[8, 8];
Array.Copy(chessBoard.Pieces(), debugOutBoard, 64);
if (debugBoard[0] < 0 || debugBoard[1] < 0 || debugBoard[2] < 0 || debugBoard[3] < 0 || debugBoard[0] > 7 || debugBoard[1] > 7 || debugBoard[2] > 7 || debugBoard[3] > 7)
break;
debugOutBoard[debugBoard[2], debugBoard[3]] = debugOutBoard[debugBoard[0], debugBoard[1]];
debugOutBoard[debugBoard[0], debugBoard[1]] = 0;
int rowLength = debugOutBoard.GetLength(0);
int colLength = debugOutBoard.GetLength(1);
for (int i = 0; i < rowLength; i++)
{
for (int j = 0; j < colLength; j++)
{
Debug.Write(debugOutBoard[i, j].ToString().PadLeft(3, ' '));
}
Debug.Write(Environment.NewLine + Environment.NewLine);
}
}
Also, I tried using a concurrentbag to store moves in (I was going to multithread the move processing later on) but as soon as the foreach loop touched the concurrent bag, all the memory values of the bag changed to one value. I've been stuck on this roadblock for days, and I really need help.
Your if statement breaks out of the for loop when its condition is met, I presume this happens for the first time on the 5th/6th iteration.
What I meant to do is iterate to the next element. What command would I use to do that? –
You need to use continue instead of break
If 'futures' contains 86 items the only way it could stop iterating is if an exception occurs. Visual studio (In default settings) should break when this occurs unless you handle the exception somewhere.
Wrap the whole thing in a try{} catch{} and set a breakpoint in catch and see if it hits.
If I have a for loop which is nested within another, how can I efficiently come out of both loops (inner and outer) in the quickest possible way?
I don't want to have to use a boolean and then have to say go to another method, but rather just to execute the first line of code after the outer loop.
What is a quick and nice way of going about this?
I was thinking that exceptions aren't cheap/should only be thrown in a truly exceptional condition etc. Hence I don't think this solution would be good from a performance perspective.
I don't feel it it is right to take advantage of the newer features in .NET (anon methods) to do something which is pretty fundamental.
Well, goto, but that is ugly, and not always possible. You can also place the loops into a method (or an anon-method) and use return to exit back to the main code.
// goto
for (int i = 0; i < 100; i++)
{
for (int j = 0; j < 100; j++)
{
goto Foo; // yeuck!
}
}
Foo:
Console.WriteLine("Hi");
vs:
// anon-method
Action work = delegate
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits anon-method
}
}
};
work(); // execute anon-method
Console.WriteLine("Hi");
Note that in C# 7 we should get "local functions", which (syntax tbd etc) means it should work something like:
// local function (declared **inside** another method)
void Work()
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits local function
}
}
};
Work(); // execute local function
Console.WriteLine("Hi");
C# adaptation of approach often used in C - set value of outer loop's variable outside of loop conditions (i.e. for loop using int variable INT_MAX -1 is often good choice):
for (int i = 0; i < 100; i++)
{
for (int j = 0; j < 100; j++)
{
if (exit_condition)
{
// cause the outer loop to break:
// use i = INT_MAX - 1; otherwise i++ == INT_MIN < 100 and loop will continue
i = int.MaxValue - 1;
Console.WriteLine("Hi");
// break the inner loop
break;
}
}
// if you have code in outer loop it will execute after break from inner loop
}
As note in code says break will not magically jump to next iteration of the outer loop - so if you have code outside of inner loop this approach requires more checks. Consider other solutions in such case.
This approach works with for and while loops but does not work for foreach. In case of foreach you won't have code access to the hidden enumerator so you can't change it (and even if you could IEnumerator doesn't have some "MoveToEnd" method).
Acknowledgments to inlined comments' authors:
i = INT_MAX - 1 suggestion by Meta
for/foreach comment by ygoe.
Proper IntMax by jmbpiano
remark about code after inner loop by blizpasta
This solution does not apply to C#
For people who found this question via other languages, Javascript, Java, and D allows labeled breaks and continues:
outer: while(fn1())
{
while(fn2())
{
if(fn3()) continue outer;
if(fn4()) break outer;
}
}
Use a suitable guard in the outer loop. Set the guard in the inner loop before you break.
bool exitedInner = false;
for (int i = 0; i < N && !exitedInner; ++i) {
.... some outer loop stuff
for (int j = 0; j < M; ++j) {
if (sometest) {
exitedInner = true;
break;
}
}
if (!exitedInner) {
... more outer loop stuff
}
}
Or better yet, abstract the inner loop into a method and exit the outer loop when it returns false.
for (int i = 0; i < N; ++i) {
.... some outer loop stuff
if (!doInner(i, N, M)) {
break;
}
... more outer loop stuff
}
Don't quote me on this, but you could use goto as suggested in the MSDN. There are other solutions, as including a flag that is checked in each iteration of both loops. Finally you could use an exception as a really heavyweight solution to your problem.
GOTO:
for ( int i = 0; i < 10; ++i ) {
for ( int j = 0; j < 10; ++j ) {
// code
if ( break_condition ) goto End;
// more code
}
}
End: ;
Condition:
bool exit = false;
for ( int i = 0; i < 10 && !exit; ++i ) {
for ( int j = 0; j < 10 && !exit; ++j ) {
// code
if ( break_condition ) {
exit = true;
break; // or continue
}
// more code
}
}
Exception:
try {
for ( int i = 0; i < 10 && !exit; ++i ) {
for ( int j = 0; j < 10 && !exit; ++j ) {
// code
if ( break_condition ) {
throw new Exception()
}
// more code
}
}
catch ( Exception e ) {}
Is it possible to refactor the nested for loop into a private method? That way you could simply 'return' out of the method to exit the loop.
It seems to me like people dislike a goto statement a lot, so I felt the need to straighten this out a bit.
I believe the 'emotions' people have about goto eventually boil down to understanding of code and (misconceptions) about possible performance implications. Before answering the question, I will therefore first go into some of the details on how it's compiled.
As we all know, C# is compiled to IL, which is then compiled to assembler using an SSA compiler. I'll give a bit of insights into how this all works, and then try to answer the question itself.
From C# to IL
First we need a piece of C# code. Let's start simple:
foreach (var item in array)
{
// ...
break;
// ...
}
I'll do this step by step to give you a good idea of what happens under the hood.
First translation: from foreach to the equivalent for loop (Note: I'm using an array here, because I don't want to get into details of IDisposable -- in which case I'd also have to use an IEnumerable):
for (int i=0; i<array.Length; ++i)
{
var item = array[i];
// ...
break;
// ...
}
Second translation: the for and break is translated into an easier equivalent:
int i=0;
while (i < array.Length)
{
var item = array[i];
// ...
break;
// ...
++i;
}
And third translation (this is the equivalent of the IL code): we change break and while into a branch:
int i=0; // for initialization
startLoop:
if (i >= array.Length) // for condition
{
goto exitLoop;
}
var item = array[i];
// ...
goto exitLoop; // break
// ...
++i; // for post-expression
goto startLoop;
While the compiler does these things in a single step, it gives you insight into the process. The IL code that evolves from the C# program is the literal translation of the last C# code. You can see for yourself here: https://dotnetfiddle.net/QaiLRz (click 'view IL')
Now, one thing you have observed here is that during the process, the code becomes more complex. The easiest way to observe this is by the fact that we needed more and more code to ackomplish the same thing. You might also argue that foreach, for, while and break are actually short-hands for goto, which is partly true.
From IL to Assembler
The .NET JIT compiler is an SSA compiler. I won't go into all the details of SSA form here and how to create an optimizing compiler, it's just too much, but can give a basic understanding about what will happen. For a deeper understanding, it's best to start reading up on optimizing compilers (I do like this book for a brief introduction: http://ssabook.gforge.inria.fr/latest/book.pdf ) and LLVM (llvm.org).
Every optimizing compiler relies on the fact that code is easy and follows predictable patterns. In the case of FOR loops, we use graph theory to analyze branches, and then optimize things like cycli in our branches (e.g. branches backwards).
However, we now have forward branches to implement our loops. As you might have guessed, this is actually one of the first steps the JIT is going to fix, like this:
int i=0; // for initialization
if (i >= array.Length) // for condition
{
goto endOfLoop;
}
startLoop:
var item = array[i];
// ...
goto endOfLoop; // break
// ...
++i; // for post-expression
if (i >= array.Length) // for condition
{
goto startLoop;
}
endOfLoop:
// ...
As you can see, we now have a backward branch, which is our little loop. The only thing that's still nasty here is the branch that we ended up with due to our break statement. In some cases, we can move this in the same way, but in others it's there to stay.
So why does the compiler do this? Well, if we can unroll the loop, we might be able to vectorize it. We might even be able to proof that there's just constants being added, which means our whole loop could vanish into thin air. To summarize: by making the patterns predictable (by making the branches predictable), we can proof that certain conditions hold in our loop, which means we can do magic during the JIT optimization.
However, branches tend to break those nice predictable patterns, which is something optimizers therefore kind-a dislike. Break, continue, goto - they all intend to break these predictable patterns- and are therefore not really 'nice'.
You should also realize at this point that a simple foreach is more predictable then a bunch of goto statements that go all over the place. In terms of (1) readability and (2) from an optimizer perspective, it's both the better solution.
Another thing worth mentioning is that it's very relevant for optimizing compilers to assign registers to variables (a process called register allocation). As you might know, there's only a finite number of registers in your CPU and they are by far the fastest pieces of memory in your hardware. Variables used in code that's in the inner-most loop, are more likely to get a register assigned, while variables outside of your loop are less important (because this code is probably hit less).
Help, too much complexity... what should I do?
The bottom line is that you should always use the language constructs you have at your disposal, which will usually (implictly) build predictable patterns for your compiler. Try to avoid strange branches if possible (specifically: break, continue, goto or a return in the middle of nothing).
The good news here is that these predictable patterns are both easy to read (for humans) and easy to spot (for compilers).
One of those patterns is called SESE, which stands for Single Entry Single Exit.
And now we get to the real question.
Imagine that you have something like this:
// a is a variable.
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a)
{
// break everything
}
}
}
The easiest way to make this a predictable pattern is to simply eliminate the if completely:
int i, j;
for (i=0; i<100 && i*j <= a; ++i)
{
for (j=0; j<100 && i*j <= a; ++j)
{
// ...
}
}
In other cases you can also split the method into 2 methods:
// Outer loop in method 1:
for (i=0; i<100 && processInner(i); ++i)
{
}
private bool processInner(int i)
{
int j;
for (j=0; j<100 && i*j <= a; ++j)
{
// ...
}
return i*j<=a;
}
Temporary variables? Good, bad or ugly?
You might even decide to return a boolean from within the loop (but I personally prefer the SESE form because that's how the compiler will see it and I think it's cleaner to read).
Some people think it's cleaner to use a temporary variable, and propose a solution like this:
bool more = true;
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a) { more = false; break; } // yuck.
// ...
}
if (!more) { break; } // yuck.
// ...
}
// ...
I personally am opposed to this approach. Look again on how the code is compiled. Now think about what this will do with these nice, predictable patterns. Get the picture?
Right, let me spell it out. What will happen is that:
The compiler will write out everything as branches.
As an optimization step, the compiler will do data flow analysis in an attempt to remove the strange more variable that only happens to be used in control flow.
If succesful, the variable more will be eliminated from the program, and only branches remain. These branches will be optimized, so you will get only a single branch out of the inner loop.
If unsuccesful, the variable more is definitely used in the inner-most loop, so if the compiler won't optimize it away, it has a high chance to be allocated to a register (which eats up valuable register memory).
So, to summarize: the optimizer in your compiler will go into a hell of a lot of trouble to figure out that more is only used for the control flow, and in the best case scenario will translate it to a single branch outside of the outer for loop.
In other words, the best case scenario is that it will end up with the equivalent of this:
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a) { goto exitLoop; } // perhaps add a comment
// ...
}
// ...
}
exitLoop:
// ...
My personal opinion on this is quite simple: if this is what we intended all along, let's make the world easier for both the compiler and readability, and write that right away.
tl;dr:
Bottom line:
Use a simple condition in your for loop if possible. Stick to the high-level language constructs you have at your disposal as much as possible.
If everything fails and you're left with either goto or bool more, prefer the former.
You asked for a combination of quick, nice, no use of a boolean, no use of goto, and C#. You've ruled out all possible ways of doing what you want.
The most quick and least ugly way is to use a goto.
factor into a function/method and use early return, or rearrange your loops into a while-clause. goto/exceptions/whatever are certainly not appropriate here.
def do_until_equal():
foreach a:
foreach b:
if a==b: return
The cleanest, shortest, and most reusable way is a self invoked anonymous function:
no goto
no label
no temporary variable
no named function
One line shorter than the top answer with anonymous method.
new Action(() =>
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits self invoked lambda expression
}
}
})();
Console.WriteLine("Hi");
Sometimes nice to abstract the code into it's own function and than use an early return - early returns are evil though : )
public void GetIndexOf(Transform transform, out int outX, out int outY)
{
outX = -1;
outY = -1;
for (int x = 0; x < Columns.Length; x++)
{
var column = Columns[x];
for (int y = 0; y < column.Transforms.Length; y++)
{
if(column.Transforms[y] == transform)
{
outX = x;
outY = y;
return;
}
}
}
}
Since I first saw break in C a couple of decades back, this problem has vexed me. I was hoping some language enhancement would have an extension to break which would work thus:
break; // our trusty friend, breaks out of current looping construct.
break 2; // breaks out of the current and it's parent looping construct.
break 3; // breaks out of 3 looping constructs.
break all; // totally decimates any looping constructs in force.
I've seen a lot of examples that use "break" but none that use "continue".
It still would require a flag of some sort in the inner loop:
while( some_condition )
{
// outer loop stuff
...
bool get_out = false;
for(...)
{
// inner loop stuff
...
get_out = true;
break;
}
if( get_out )
{
some_condition=false;
continue;
}
// more out loop stuff
...
}
The easiest way to end a double loop would be directly ending the first loop
string TestStr = "The frog jumped over the hill";
char[] KillChar = {'w', 'l'};
for(int i = 0; i < TestStr.Length; i++)
{
for(int E = 0; E < KillChar.Length; E++)
{
if(KillChar[E] == TestStr[i])
{
i = TestStr.Length; //Ends First Loop
break; //Ends Second Loop
}
}
}
Loops can be broken using custom conditions in the loop, allowing as to have clean code.
static void Main(string[] args)
{
bool isBreak = false;
for (int i = 0; ConditionLoop(isBreak, i, 500); i++)
{
Console.WriteLine($"External loop iteration {i}");
for (int j = 0; ConditionLoop(isBreak, j, 500); j++)
{
Console.WriteLine($"Inner loop iteration {j}");
// This code is only to produce the break.
if (j > 3)
{
isBreak = true;
}
}
Console.WriteLine("The code after the inner loop will be executed when breaks");
}
Console.ReadKey();
}
private static bool ConditionLoop(bool isBreak, int i, int maxIterations) => i < maxIterations && !isBreak;
With this code we ontain the following output:
External loop iteration 0
Inner loop iteration 0
Inner loop iteration 1
Inner loop iteration 2
Inner loop iteration 3
Inner loop iteration 4
The code after the inner loop will be executed when breaks
I remember from my student days that it was said it's mathematically provable that you can do anything in code without a goto (i.e. there is no situation where goto is the only answer). So, I never use goto's (just my personal preference, not suggesting that i'm right or wrong)
Anyways, to break out of nested loops I do something like this:
var isDone = false;
for (var x in collectionX) {
for (var y in collectionY) {
for (var z in collectionZ) {
if (conditionMet) {
// some code
isDone = true;
}
if (isDone)
break;
}
if (isDone)
break;
}
if (isDone)
break;
}
... i hope that helps for those who like me are anti-goto "fanboys" :)
That's how I did it. Still a workaround.
foreach (var substring in substrings) {
//To be used to break from 1st loop.
int breaker=1;
foreach (char c in substring) {
if (char.IsLetter(c)) {
Console.WriteLine(line.IndexOf(c));
\\setting condition to break from 1st loop.
breaker=9;
break;
}
}
if (breaker==9) {
break;
}
}
Another option which is not mentioned here which is both clean and does not rely on newer .NET features is to consolidate the double loop into a single loop over the product. Then inside the loop the values of the counters can be calculated using simple math:
int n; //set to max of first loop
int m; //set to max of second loop
for (int k = 0; k < n * m; k++)
{
//calculate the values of i and j as if there was a double loop
int i = k / m;
int j = k % m;
if(exitCondition)
{
break;
}
}
People often forget that the 2nd statement of the for loops themselves are the break conditions, so there is no need to have additional ifs within the code.
Something like this works:
bool run = true;
int finalx = 0;
int finaly = 0;
for (int x = 0; x < 100 && run; x++)
{
finalx = x;
for (int y = 0; y < 100 && run; y++)
{
finaly = y;
if (x == 10 && y == 50) { run = false; }
}
}
Console.WriteLine("x: " + finalx + " y: " + finaly); // outputs 'x: 10 y: 50'
just use return inside the inner loop and the two loops will be exited...
I would just set a flag.
var breakOuterLoop = false;
for (int i = 0; i < 30; i++)
{
for (int j = 0; j < 30; j++)
{
if (condition)
{
breakOuterLoop = true;
break;
}
}
if (breakOuterLoop){
break;
}
}
Throw a custom exception which goes out outter loop.
It works for for,foreach or while or any kind of loop and any language that uses try catch exception block
try
{
foreach (object o in list)
{
foreach (object another in otherList)
{
// ... some stuff here
if (condition)
{
throw new CustomExcpetion();
}
}
}
}
catch (CustomException)
{
// log
}
bool breakInnerLoop=false
for(int i=0;i<=10;i++)
{
for(int J=0;i<=10;i++)
{
if(i<=j)
{
breakInnerLoop=true;
break;
}
}
if(breakInnerLoop)
{
continue
}
}
As i see you accepted the answer in which the person refers you goto statement, where in modern programming and in expert opinion goto is a killer, we called it a killer in programming which have some certain reasons, which i will not discuss it over here at this point, but the solution of your question is very simple, you can use a Boolean flag in this kind of scenario like i will demonstrate it in my example:
for (; j < 10; j++)
{
//solution
bool breakme = false;
for (int k = 1; k < 10; k++)
{
//place the condition where you want to stop it
if ()
{
breakme = true;
break;
}
}
if(breakme)
break;
}
simple and plain. :)
Did you even look at the break keyword? O.o
This is just pseudo-code, but you should be able to see what I mean:
<?php
for(...) {
while(...) {
foreach(...) {
break 3;
}
}
}
If you think about break being a function like break(), then it's parameter would be the number of loops to break out of. As we are in the third loop in the code here, we can break out of all three.
Manual: http://php.net/break
I think unless you want to do the "boolean thing" the only solution is actually to throw. Which you obviously shouldn't do..!
I was adapting a simple prime-number generation one-liner from Scala to C# (mentioned in a comment on this blog by its author). I came up with the following:
int NextPrime(int from)
{
while(true)
{
n++;
if (!Enumerable.Range(2, (int)Math.Sqrt(n) - 1).Any((i) => n % i == 0))
return n;
}
}
It works, returning the same results I'd get from running the code referenced in the blog. In fact, it works fairly quickly. In LinqPad, it generated the 100,000th prime in about 1 second. Out of curiosity, I rewrote it without Enumerable.Range() and Any():
int NextPrimeB(int from)
{
while(true)
{
n++;
bool hasFactor = false;
for (int i = 2; i <= (int)Math.Sqrt(n); i++)
{
if (n % i == 0) hasFactor = true;
}
if (!hasFactor) return n;
}
}
Intuitively, I'd expect them to either run at the same speed, or even for the latter to run a little faster. In actuality, computing the same value (100,000th prime) with the second method, takes 12 seconds - It's a staggering difference.
So what's going on here? There must be fundamentally something extra happening in the second approach that's eating up CPU cycles, or some optimization going on the background of the Linq examples. Anybody know why?
For every iteration of the for loop, you are finding the square root of n. Cache it instead.
int root = (int)Math.Sqrt(n);
for (int i = 2; i <= root; i++)
And as other have mentioned, break the for loop as soon as you find a factor.
The LINQ version short circuits, your loop does not. By this I mean that when you have determined that a particular integer is in fact a factor the LINQ code stops, returns it, and then moves on. Your code keeps looping until it's done.
If you change the for to include that short circuit, you should see similar performance:
int NextPrimeB(int from)
{
while(true)
{
n++;
for (int i = 2; i <= (int)Math.Sqrt(n); i++)
{
if (n % i == 0) return n;;
}
}
}
It looks like this is the culprit:
for (int i = 2; i <= (int)Math.Sqrt(n); i++)
{
if (n % i == 0) hasFactor = true;
}
You should exit the loop once you find a factor:
if (n % i == 0){
hasFactor = true;
break;
}
And as other have pointed out, move the Math.Sqrt call outside the loop to avoid calling it each cycle.
Enumerable.Any takes an early out if the condition is successful while your loop does not.
The enumeration of source is stopped as soon as the result can be determined.
This is an example of a bad benchmark. Try modifying your loop and see the difference:
if (n % i == 0) { hasFactor = true; break; }
}
throw new InvalidOperationException("Cannot satisfy criteria.");
In the name of optimization, you can be a little more clever about this by avoiding even numbers after 2:
if (n % 2 != 0)
{
int quux = (int)Math.Sqrt(n);
for (int i = 3; i <= quux; i += 2)
{
if (n % i == 0) return n;
}
}
There are some other ways to optimize prime searches, but this is one of the easier to do and has a large payoff.
Edit: you may want to consider using (int)Math.Sqrt(n) + 1. FP functions + round-down could potentially cause you to miss a square of a large prime number.
At least part of the problem is the number of times Math.Sqrt is executed. In the LINQ query this is executed once but in the loop example it's executed N times. Try pulling that out into a local and reprofiling the application. That will give you a more representative break down
int limit = (int)Math.Sqrt(n);
for (int i = 2; i <= limit; i++)