So I had an interview question: write a function that takes a number and returns all the numbers up to it that are either less than 7 or divisible by 7.
private List<int> GetLessThanOrDivisbleBySeven(int num)
{
List<int> ReturnList = new List<int>();
for(int i = 0; i <= num; i++)
{
if(i <7 || i % 7 == 0)
{
ReturnList.Add(i);
}
}
return ReturnList;
}
So far so good. The follow-up question was: let's say that call was being made tens of thousands of times an hour. How could you speed it up?
I said that if you knew what your queue was, you could break up your queue and thread it. That got me some points, I feel. However, he wanted to know if there was anything in the function itself I could do.
I came up with the idea to test whether num was greater than 7 and, if so, initialize the list with 1 - 7 and start the loop at int i = 8, which I think was OK, but is there another way I am missing?
If you want to speed it up without caching, you can just increment i by 7 to get all the numbers divisible by 7. It would be something like this:
static private List<int> GetLessThanOrDivisbleBySeven(int num) {
List<int> ReturnList;
int i;
if (num <= 7) {
ReturnList = new List<int>();
for (i = 0; i <= num; i++) {
ReturnList.Add(i);
}
return ReturnList;
}
ReturnList = new List<int> { 0, 1, 2, 3, 4, 5, 6 };
i = 7;
while (i <= num) {
ReturnList.Add(i);
i += 7;
}
return ReturnList;
}
You can cache the results. Each time your function is called, check which numbers are already in the cache and calculate only the rest.
If the current number is smaller, return the appropriate cached results.
Use the previous results when calculating the new list:
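int oldMax = -1; // nothing cached yet, so even 0 still has to be examined
List<int> ReturnList = new List<int>();
private List<int> GetLessThanOrDivisbleBySeven(int num)
{
    if (num > oldMax)
    {
        // extend the cache with the numbers that were not examined before
        for (int i = oldMax + 1; i <= num; i++)
        {
            if (i < 7 || i % 7 == 0)
            {
                ReturnList.Add(i);
            }
        }
        oldMax = num; // update only after the loop, or the loop would start at num
        return new List<int>(ReturnList); // copy, so callers cannot alter the cache
    }
    else
    {
        // copy only the cached numbers that are not bigger than num
        return ReturnList.FindAll(n => n <= num);
    }
}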
Interview questions are usually more about how you approach problems in general than about the technical implementation. In your case you could do a lot of small things, like caching the list outside the function, or caching different versions of the list in a dictionary if space is not a problem. Maybe somebody can come up with some smarter math to save on calculations, but usually it's more about asking the right questions and considering the right options. For example, you might ask, "Does this program run on a web server? Maybe I can store all the data in a table and use it as a quick lookup instead of recalculating every time." There might not even be a correct or best answer; they probably just want to hear that you can think of special situations.
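The dictionary idea could be sketched roughly as follows. The class name SevenCache and its members are hypothetical (they are not from the question), and the sketch assumes single-threaded callers and that memory use is not a concern.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: memoizes one result list per distinct input.
static class SevenCache
{
    private static readonly Dictionary<int, IReadOnlyList<int>> Cache =
        new Dictionary<int, IReadOnlyList<int>>();

    public static IReadOnlyList<int> Get(int num)
    {
        IReadOnlyList<int> cached;
        if (Cache.TryGetValue(num, out cached))
        {
            return cached; // repeated request: nothing is recalculated
        }

        var result = new List<int>();
        for (int i = 0; i <= num; i++)
        {
            if (i < 7 || i % 7 == 0)
            {
                result.Add(i);
            }
        }

        Cache[num] = result;
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Get(20)));         // 0, 1, 2, 3, 4, 5, 6, 7, 14
        Console.WriteLine(ReferenceEquals(Get(20), Get(20)));  // True: second call hits the cache
    }
}
```

Whether this beats recalculating depends on how often the same input repeats, which is exactly the kind of question worth asking the interviewer.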
You can find all the numbers that are divisible by 7 and not greater than num by calculating res = num/7, then looping from 1 to res and multiplying each number by 7.
private List<int> GetLessThanOrDivisbleBySeven(int num)
{
    List<int> ReturnList = new List<int>();
    // Add all the numbers that are less than 7 (and not greater than num) first
    int i = 0;
    for (i = 0; i < 7 && i <= num; i++)
        ReturnList.Add(i);
    int res = num / 7; // num = res*7+rem
    for (i = 1; i <= res; i++)
    {
        ReturnList.Add(i * 7);
    }
    return ReturnList;
}
Think about memory management and how the List class works.
Unless you tell it the capacity it will need, it allocates a new array whenever it runs out of space; however, it is easy to work out the size it will need to be.
Returning an array would save one object allocation compared to using a List, so discuss the tradeoff between the two.
What about using "yield return" to avoid allocating memory, or does it have other costs to consider?
Is the same number requested often? If so, consider caching.
Would LINQ, maybe using Enumerable.Range, help?
An experienced C# programmer would be expected to know at least a little about all of the above, and that memory management is often a hidden issue.
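Two of these points, presizing the list and using yield return, can be sketched as below. The class and method names are made up for illustration, and this is only one way to do it, not a measured recommendation.

```csharp
using System;
using System.Collections.Generic;

static class SevenSketch
{
    // Eager version: the exact size is known up front, so the list never has to grow.
    public static List<int> ToList(int num)
    {
        if (num < 0) return new List<int>();

        int small = Math.Min(num, 6) + 1; // 0..6, or 0..num when num < 6
        int multiples = num / 7;          // 7, 14, ... up to num
        var result = new List<int>(small + multiples);

        for (int i = 0; i < small; i++) result.Add(i);
        for (int k = 1; k <= multiples; k++) result.Add(k * 7);
        return result;
    }

    // Lazy version: no list is stored; values are produced as the caller iterates.
    public static IEnumerable<int> Lazy(int num)
    {
        for (int i = 0; i < 7 && i <= num; i++) yield return i;
        for (int k = 1; k <= num / 7; k++) yield return k * 7;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ToList(20))); // 0, 1, 2, 3, 4, 5, 6, 7, 14
        Console.WriteLine(string.Join(", ", Lazy(20)));   // same values, generated on demand
    }
}
```

The lazy version still allocates an iterator object and recomputes the sequence on every enumeration, so it mainly pays off when callers consume the values once or stop early.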
In a course a problem was to list the first n primes. Apparently we should implement trial division while saving primes in an array to reduce the number of divisions required. Initially I misunderstood, but got a working if slower solution using a separate function to test for primality but I would like to implement it the way I should have done.
Below is my attempt, with irrelevant code removed, such as the input test.
using System;
namespace PrimeNumbers
{
class MainClass
{
public static void Main (string[] args)
{
Console.Write("How many primes?\n");
string s = Console.ReadLine();
uint N;
UInt32.TryParse(s, out N);
uint[] PrimeTable = new uint[N];
PrimeTable[0] = 2;
for (uint i=1; i < N; i++)//loop n spaces in array, [0] set already so i starts from 1
{
uint j = PrimeTable[i -1] + 1;//sets j bigger than biggest prime so far
bool isPrime = false;// Just a condition to allow the loop to break???(Is that right?)
while (!isPrime)//so loop continues until a break is hit
{
isPrime = true;//to ensure that the loop executes
for(uint k=0; k < i; k++)//want to divide by first i primes
{
if (PrimeTable[k] == 0) break;//try to avoid divide by zero - unnecessary
if (j % PrimeTable[k] == 0)//zero remainder means not prime so break and increment j
{
isPrime = false;
break;
}
}
j++;//j increment mentioned above
}
PrimeTable[i] = j; //not different if this is enclosed in brace above
}
for (uint i = 0; i < N; i++)
Console.Write(PrimeTable[i] + " ");
Console.ReadLine();
}
}
}
My comments are my attempt to describe what I think the code is doing. I have tried very many small changes; often they would lead to divide-by-zero errors when running, so I added a test, but I don't think it should be necessary. (I also got several out-of-range errors when trying to change the loop conditions.)
I have looked at several questions on stack exchange, in particular:
Program to find prime numbers
The first answer uses a different method, the second is close to what I want, but the exact thing is in this comment from Nick Larsson:
You could make this faster by keeping track of the primes and only
trying to divide by those.
C# is not shown on here: http://rosettacode.org/wiki/Sequence_of_primes_by_Trial_Division#Python
I have seen plenty of other methods and algorithms, such as Eratosthenes sieve and GNF, but really only want to implement it this way, as I think my problem is with the program logic and I don't understand why it doesn't work. Thanks
The following should solve your problem:
for (uint i = 1; i < numberOfPrimes; i++)//loop n spaces in array, [0] set already so i starts from 1
{
uint j = PrimeTable[i - 1] + 1;//sets j bigger than biggest prime so far
bool isPrime = false;// Just a condition to allow the loop to break???(Is that right?)
while (!isPrime)//so loop continues until a break is hit
{
isPrime = true;//to ensure that the loop executes
for (uint k = 0; k < i; k++)//want to divide by first i primes
{
if (PrimeTable[k] == 0) break;//try to avoid divide by zero - unnecessary
if (j % PrimeTable[k] == 0)//zero remainder means not prime so break and increment j
{
isPrime = false;
j++;
break;
}
}
}
PrimeTable[i] = j;
}
The major change I made was to move the increment of j inside the conditional prime check. When that check fires, the current value is not prime, so we must move to the next candidate before breaking out of the inner loop.
Your code was incrementing after the check was made, which means that when you found a prime, you would increment to the next candidate and assign that as your prime instead. For example, when j = 3, it would pass the check and isPrime would still be true, but then j++ would increment it to 4, and 4 would be added to PrimeTable.
Make sense?
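For comparison, here is a compact, self-contained sketch of the same idea (trial division against the primes stored so far). The names PrimeTableSketch, FirstPrimes and IsPrimeAgainst are invented for this example; moving the divisibility test into a helper is just one way to make it obvious that the candidate only advances when it failed.

```csharp
using System;

static class PrimeTableSketch
{
    public static uint[] FirstPrimes(int count)
    {
        var table = new uint[count];
        if (count == 0) return table;

        table[0] = 2;
        for (int found = 1; found < count; found++)
        {
            uint candidate = table[found - 1] + 1; // start just above the last prime
            while (!IsPrimeAgainst(table, found, candidate))
            {
                candidate++; // advance only when the candidate failed the test
            }
            table[found] = candidate;
        }
        return table;
    }

    // True when none of the first 'found' primes divides 'candidate'.
    private static bool IsPrimeAgainst(uint[] table, int found, uint candidate)
    {
        for (int k = 0; k < found; k++)
        {
            if (candidate % table[k] == 0) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", FirstPrimes(10)));
        // prints: 2 3 5 7 11 13 17 19 23 29
    }
}
```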
This might not be a very good answer to your question, but you might want to look at this implementation and see if you can spot where yours differs.
int primesCount = 10;
List<uint> primes = new List<uint>() { 2u };
for (uint n = 3u;; n += 2u)
{
if (primes.TakeWhile(u => u * u <= n).All(u => n % u != 0))
{
primes.Add(n);
}
if (primes.Count() >= primesCount)
{
break;
}
}
This correctly and efficiently computes the first primesCount primes.
If I have a for loop which is nested within another, how can I efficiently come out of both loops (inner and outer) in the quickest possible way?
I don't want to have to use a boolean and then have to say go to another method, but rather just to execute the first line of code after the outer loop.
What is a quick and nice way of going about this?
I was thinking that exceptions aren't cheap/should only be thrown in a truly exceptional condition etc. Hence I don't think this solution would be good from a performance perspective.
I don't feel it is right to take advantage of the newer features in .NET (anon methods) to do something which is pretty fundamental.
Well, goto, but that is ugly, and not always possible. You can also place the loops into a method (or an anon-method) and use return to exit back to the main code.
// goto
for (int i = 0; i < 100; i++)
{
for (int j = 0; j < 100; j++)
{
goto Foo; // yeuck!
}
}
Foo:
Console.WriteLine("Hi");
vs:
// anon-method
Action work = delegate
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits anon-method
}
}
};
work(); // execute anon-method
Console.WriteLine("Hi");
Note that C# 7 added "local functions", which means it works something like:
// local function (declared **inside** another method)
void Work()
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits local function
}
}
}
Work(); // execute local function
Console.WriteLine("Hi");
A C# adaptation of an approach often used in C: set the outer loop's variable to a value outside the loop condition (for a for loop over an int variable, int.MaxValue - 1 is often a good choice):
for (int i = 0; i < 100; i++)
{
for (int j = 0; j < 100; j++)
{
if (exit_condition)
{
// cause the outer loop to break:
// use i = INT_MAX - 1; otherwise i++ == INT_MIN < 100 and loop will continue
i = int.MaxValue - 1;
Console.WriteLine("Hi");
// break the inner loop
break;
}
}
// if you have code in outer loop it will execute after break from inner loop
}
As the note in the code says, break will not magically jump to the next iteration of the outer loop, so if you have code after the inner loop, this approach requires more checks. Consider other solutions in that case.
This approach works with for and while loops but does not work for foreach. In case of foreach you won't have code access to the hidden enumerator so you can't change it (and even if you could IEnumerator doesn't have some "MoveToEnd" method).
Acknowledgments to inlined comments' authors:
i = INT_MAX - 1 suggestion by Meta
for/foreach comment by ygoe.
Proper IntMax by jmbpiano
remark about code after inner loop by blizpasta
This solution does not apply to C#
For people who found this question via other languages: JavaScript, Java, and D allow labeled breaks and continues:
outer: while(fn1())
{
while(fn2())
{
if(fn3()) continue outer;
if(fn4()) break outer;
}
}
Use a suitable guard in the outer loop. Set the guard in the inner loop before you break.
bool exitedInner = false;
for (int i = 0; i < N && !exitedInner; ++i) {
    // ... some outer loop stuff
    for (int j = 0; j < M; ++j) {
        if (sometest) {
            exitedInner = true;
            break;
        }
    }
    if (!exitedInner) {
        // ... more outer loop stuff
    }
}
Or better yet, abstract the inner loop into a method and exit the outer loop when it returns false.
for (int i = 0; i < N; ++i) {
    // ... some outer loop stuff
    if (!doInner(i, N, M)) {
        break;
    }
    // ... more outer loop stuff
}
Don't quote me on this, but you could use goto as suggested in the MSDN. There are other solutions, such as including a flag that is checked in each iteration of both loops. Finally, you could use an exception as a really heavyweight solution to your problem.
GOTO:
for ( int i = 0; i < 10; ++i ) {
for ( int j = 0; j < 10; ++j ) {
// code
if ( break_condition ) goto End;
// more code
}
}
End: ;
Condition:
bool exit = false;
for ( int i = 0; i < 10 && !exit; ++i ) {
for ( int j = 0; j < 10 && !exit; ++j ) {
// code
if ( break_condition ) {
exit = true;
break; // or continue
}
// more code
}
}
Exception:
try {
    for ( int i = 0; i < 10; ++i ) {
        for ( int j = 0; j < 10; ++j ) {
            // code
            if ( break_condition ) {
                throw new Exception();
            }
            // more code
        }
    }
}
catch ( Exception ) {}
Is it possible to refactor the nested for loop into a private method? That way you could simply 'return' out of the method to exit the loop.
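As a rough illustration of that refactoring, the sketch below searches a two-dimensional array. GridSearch and TryFind are hypothetical names chosen for the example; the point is only that a single return leaves both loops.

```csharp
using System;

static class GridSearch
{
    // Returns true and the position of the first match.
    // The 'return' inside the inner loop exits both loops at once.
    public static bool TryFind(int[,] grid, int target, out int row, out int col)
    {
        for (row = 0; row < grid.GetLength(0); row++)
        {
            for (col = 0; col < grid.GetLength(1); col++)
            {
                if (grid[row, col] == target)
                {
                    return true;
                }
            }
        }

        row = -1; // not found
        col = -1;
        return false;
    }

    static void Main()
    {
        var grid = new int[,] { { 1, 2 }, { 3, 4 } };
        int r, c;
        if (TryFind(grid, 3, out r, out c))
        {
            Console.WriteLine("Found at " + r + "," + c); // Found at 1,0
        }
    }
}
```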
It seems to me like people dislike a goto statement a lot, so I felt the need to straighten this out a bit.
I believe the 'emotions' people have about goto eventually boil down to understanding of code and (misconceptions) about possible performance implications. Before answering the question, I will therefore first go into some of the details on how it's compiled.
As we all know, C# is compiled to IL, which is then compiled to assembler using an SSA compiler. I'll give a bit of insights into how this all works, and then try to answer the question itself.
From C# to IL
First we need a piece of C# code. Let's start simple:
foreach (var item in array)
{
// ...
break;
// ...
}
I'll do this step by step to give you a good idea of what happens under the hood.
First translation: from foreach to the equivalent for loop (Note: I'm using an array here, because I don't want to get into details of IDisposable -- in which case I'd also have to use an IEnumerable):
for (int i=0; i<array.Length; ++i)
{
var item = array[i];
// ...
break;
// ...
}
Second translation: the for and break is translated into an easier equivalent:
int i=0;
while (i < array.Length)
{
var item = array[i];
// ...
break;
// ...
++i;
}
And third translation (this is the equivalent of the IL code): we change break and while into a branch:
int i=0; // for initialization
startLoop:
if (i >= array.Length) // for condition
{
goto exitLoop;
}
var item = array[i];
// ...
goto exitLoop; // break
// ...
++i; // for post-expression
goto startLoop;
exitLoop:
// ...
While the compiler does these things in a single step, it gives you insight into the process. The IL code that evolves from the C# program is the literal translation of the last C# code. You can see for yourself here: https://dotnetfiddle.net/QaiLRz (click 'view IL')
Now, one thing you may have observed here is that during the process the code becomes more complex. The easiest way to see this is that we needed more and more code to accomplish the same thing. You might also argue that foreach, for, while and break are actually shorthands for goto, which is partly true.
From IL to Assembler
The .NET JIT compiler is an SSA compiler. I won't go into all the details of SSA form here and how to create an optimizing compiler, it's just too much, but can give a basic understanding about what will happen. For a deeper understanding, it's best to start reading up on optimizing compilers (I do like this book for a brief introduction: http://ssabook.gforge.inria.fr/latest/book.pdf ) and LLVM (llvm.org).
Every optimizing compiler relies on the fact that code is easy and follows predictable patterns. In the case of for loops, we use graph theory to analyze branches, and then optimize things like cycles in our branches (e.g. backward branches).
However, we now have forward branches to implement our loops. As you might have guessed, this is actually one of the first steps the JIT is going to fix, like this:
int i=0; // for initialization
if (i >= array.Length) // for condition
{
goto endOfLoop;
}
startLoop:
var item = array[i];
// ...
goto endOfLoop; // break
// ...
++i; // for post-expression
if (i < array.Length) // for condition, inverted: jump back while items remain
{
goto startLoop;
}
endOfLoop:
// ...
As you can see, we now have a backward branch, which is our little loop. The only thing that's still nasty here is the branch that we ended up with due to our break statement. In some cases, we can move this in the same way, but in others it's there to stay.
So why does the compiler do this? Well, if we can unroll the loop, we might be able to vectorize it. We might even be able to prove that there's just constants being added, which means our whole loop could vanish into thin air. To summarize: by making the patterns predictable (by making the branches predictable), we can prove that certain conditions hold in our loop, which means we can do magic during the JIT optimization.
However, branches tend to break those nice predictable patterns, which is something optimizers therefore kind of dislike. Break, continue, goto: they all tend to break these predictable patterns, and are therefore not really 'nice'.
You should also realize at this point that a simple foreach is more predictable than a bunch of goto statements that go all over the place. In terms of both (1) readability and (2) the optimizer's perspective, it's the better solution.
Another thing worth mentioning is that it's very relevant for optimizing compilers to assign registers to variables (a process called register allocation). As you might know, there's only a finite number of registers in your CPU and they are by far the fastest pieces of memory in your hardware. Variables used in code that's in the inner-most loop, are more likely to get a register assigned, while variables outside of your loop are less important (because this code is probably hit less).
Help, too much complexity... what should I do?
The bottom line is that you should always use the language constructs you have at your disposal, which will usually (implicitly) build predictable patterns for your compiler. Try to avoid strange branches if possible (specifically: break, continue, goto or a return in the middle of nothing).
The good news here is that these predictable patterns are both easy to read (for humans) and easy to spot (for compilers).
One of those patterns is called SESE, which stands for Single Entry Single Exit.
And now we get to the real question.
Imagine that you have something like this:
// a is a variable.
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a)
{
// break everything
}
}
}
The easiest way to make this a predictable pattern is to simply eliminate the if completely:
int i, j = 0; // j must be assigned before the outer condition reads it
for (i=0; i<100 && i*j <= a; ++i)
{
for (j=0; j<100 && i*j <= a; ++j)
{
// ...
}
}
In other cases you can also split the method into 2 methods:
// Outer loop in method 1:
for (i=0; i<100 && processInner(i); ++i)
{
}
private bool processInner(int i)
{
int j;
for (j=0; j<100 && i*j <= a; ++j)
{
// ...
}
return i*j<=a;
}
Temporary variables? Good, bad or ugly?
You might even decide to return a boolean from within the loop (but I personally prefer the SESE form because that's how the compiler will see it and I think it's cleaner to read).
Some people think it's cleaner to use a temporary variable, and propose a solution like this:
bool more = true;
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a) { more = false; break; } // yuck.
// ...
}
if (!more) { break; } // yuck.
// ...
}
// ...
I personally am opposed to this approach. Look again on how the code is compiled. Now think about what this will do with these nice, predictable patterns. Get the picture?
Right, let me spell it out. What will happen is that:
The compiler will write out everything as branches.
As an optimization step, the compiler will do data flow analysis in an attempt to remove the strange more variable that only happens to be used in control flow.
If successful, the variable more will be eliminated from the program, and only branches remain. These branches will be optimized, so you will get only a single branch out of the inner loop.
If unsuccessful, the variable more is definitely used in the inner-most loop, so if the compiler won't optimize it away, it has a high chance of being allocated to a register (which eats up valuable register memory).
So, to summarize: the optimizer in your compiler will go into a hell of a lot of trouble to figure out that more is only used for the control flow, and in the best case scenario will translate it to a single branch outside of the outer for loop.
In other words, the best case scenario is that it will end up with the equivalent of this:
for (int i=0; i<100; ++i)
{
for (int j=0; j<100; ++j)
{
// ...
if (i*j > a) { goto exitLoop; } // perhaps add a comment
// ...
}
// ...
}
exitLoop:
// ...
My personal opinion on this is quite simple: if this is what we intended all along, let's make the world easier for both the compiler and readability, and write that right away.
tl;dr:
Bottom line:
Use a simple condition in your for loop if possible. Stick to the high-level language constructs you have at your disposal as much as possible.
If everything fails and you're left with either goto or bool more, prefer the former.
You asked for a combination of quick, nice, no use of a boolean, no use of goto, and C#. You've ruled out all possible ways of doing what you want.
The most quick and least ugly way is to use a goto.
Factor into a function/method and use early return, or rearrange your loops into a while-clause. goto/exceptions/whatever are certainly not appropriate here.
def do_until_equal():
foreach a:
foreach b:
if a==b: return
The cleanest, shortest, and most reusable way is a self invoked anonymous function:
no goto
no label
no temporary variable
no named function
One line shorter than the top answer with anonymous method.
new Action(() =>
{
for (int x = 0; x < 100; x++)
{
for (int y = 0; y < 100; y++)
{
return; // exits self invoked lambda expression
}
}
})();
Console.WriteLine("Hi");
Sometimes it's nice to abstract the code into its own function and then use an early return - early returns are evil though : )
public void GetIndexOf(Transform transform, out int outX, out int outY)
{
outX = -1;
outY = -1;
for (int x = 0; x < Columns.Length; x++)
{
var column = Columns[x];
for (int y = 0; y < column.Transforms.Length; y++)
{
if(column.Transforms[y] == transform)
{
outX = x;
outY = y;
return;
}
}
}
}
Since I first saw break in C a couple of decades back, this problem has vexed me. I was hoping some language enhancement would have an extension to break which would work thus:
break; // our trusty friend, breaks out of current looping construct.
break 2; // breaks out of the current and its parent looping construct.
break 3; // breaks out of 3 looping constructs.
break all; // totally decimates any looping constructs in force.
I've seen a lot of examples that use "break" but none that use "continue".
It still would require a flag of some sort in the inner loop:
while( some_condition )
{
// outer loop stuff
...
bool get_out = false;
for(...)
{
// inner loop stuff
...
get_out = true;
break;
}
if( get_out )
{
some_condition=false;
continue;
}
// more out loop stuff
...
}
The easiest way to end a double loop would be directly ending the first loop
string TestStr = "The frog jumped over the hill";
char[] KillChar = {'w', 'l'};
for(int i = 0; i < TestStr.Length; i++)
{
for(int E = 0; E < KillChar.Length; E++)
{
if(KillChar[E] == TestStr[i])
{
i = TestStr.Length; //Ends First Loop
break; //Ends Second Loop
}
}
}
Loops can be broken using custom conditions in the loop, allowing us to have clean code.
static void Main(string[] args)
{
bool isBreak = false;
for (int i = 0; ConditionLoop(isBreak, i, 500); i++)
{
Console.WriteLine($"External loop iteration {i}");
for (int j = 0; ConditionLoop(isBreak, j, 500); j++)
{
Console.WriteLine($"Inner loop iteration {j}");
// This code is only to produce the break.
if (j > 3)
{
isBreak = true;
}
}
Console.WriteLine("The code after the inner loop will be executed when breaks");
}
Console.ReadKey();
}
private static bool ConditionLoop(bool isBreak, int i, int maxIterations) => i < maxIterations && !isBreak;
With this code we obtain the following output:
External loop iteration 0
Inner loop iteration 0
Inner loop iteration 1
Inner loop iteration 2
Inner loop iteration 3
Inner loop iteration 4
The code after the inner loop will be executed when breaks
I remember from my student days that it was said to be mathematically provable that you can do anything in code without a goto (i.e. there is no situation where goto is the only answer). So, I never use gotos (just my personal preference, not suggesting that I'm right or wrong).
Anyways, to break out of nested loops I do something like this:
var isDone = false;
foreach (var x in collectionX) {
foreach (var y in collectionY) {
foreach (var z in collectionZ) {
if (conditionMet) {
// some code
isDone = true;
}
if (isDone)
break;
}
if (isDone)
break;
}
if (isDone)
break;
}
... I hope that helps those who, like me, are anti-goto "fanboys" :)
That's how I did it. Still a workaround.
foreach (var substring in substrings) {
//To be used to break from 1st loop.
int breaker=1;
foreach (char c in substring) {
if (char.IsLetter(c)) {
Console.WriteLine(line.IndexOf(c));
// setting condition to break from 1st loop.
breaker=9;
break;
}
}
if (breaker==9) {
break;
}
}
Another option which is not mentioned here which is both clean and does not rely on newer .NET features is to consolidate the double loop into a single loop over the product. Then inside the loop the values of the counters can be calculated using simple math:
int n; //set to max of first loop
int m; //set to max of second loop
for (int k = 0; k < n * m; k++)
{
//calculate the values of i and j as if there was a double loop
int i = k / m;
int j = k % m;
if(exitCondition)
{
break;
}
}
People often forget that the second statement of a for loop is itself the break condition, so there is no need for additional ifs within the code.
Something like this works:
bool run = true;
int finalx = 0;
int finaly = 0;
for (int x = 0; x < 100 && run; x++)
{
finalx = x;
for (int y = 0; y < 100 && run; y++)
{
finaly = y;
if (x == 10 && y == 50) { run = false; }
}
}
Console.WriteLine("x: " + finalx + " y: " + finaly); // outputs 'x: 10 y: 50'
just use return inside the inner loop and the two loops will be exited...
I would just set a flag.
var breakOuterLoop = false;
for (int i = 0; i < 30; i++)
{
for (int j = 0; j < 30; j++)
{
if (condition)
{
breakOuterLoop = true;
break;
}
}
if (breakOuterLoop){
break;
}
}
Throw a custom exception that exits the outer loop.
This works for for, foreach, while, or any other kind of loop, in any language that has try/catch exception blocks.
try
{
foreach (object o in list)
{
foreach (object another in otherList)
{
// ... some stuff here
if (condition)
{
throw new CustomException();
}
}
}
}
catch (CustomException)
{
// log
}
bool breakInnerLoop = false;
for (int i = 0; i <= 10; i++)
{
    for (int j = 0; j <= 10; j++)
    {
        if (i <= j)
        {
            breakInnerLoop = true;
            break;
        }
    }
    if (breakInnerLoop)
    {
        continue;
    }
}
I see you accepted the answer that recommends the goto statement. In modern programming, and in expert opinion, goto is widely considered harmful, for reasons I will not discuss here. The solution to your question is very simple: you can use a Boolean flag in this kind of scenario, as I demonstrate in my example:
for (; j < 10; j++)
{
//solution
bool breakme = false;
for (int k = 1; k < 10; k++)
{
//place the condition where you want to stop it
if (condition) // 'condition' is a placeholder for your own exit test
{
breakme = true;
break;
}
}
if(breakme)
break;
}
simple and plain. :)
Did you even look at the break keyword? O.o
This is just pseudo-code, but you should be able to see what I mean:
<?php
for(...) {
while(...) {
foreach(...) {
break 3;
}
}
}
If you think about break being a function like break(), then its parameter would be the number of loops to break out of. As we are in the third loop in the code here, we can break out of all three.
Manual: http://php.net/break
I think unless you want to do the "boolean thing" the only solution is actually to throw. Which you obviously shouldn't do..!
I was adapting a simple prime-number generation one-liner from Scala to C# (mentioned in a comment on this blog by its author). I came up with the following:
int NextPrime(int from)
{
    int n = from; // start searching just above 'from'
    while(true)
    {
        n++;
        if (!Enumerable.Range(2, (int)Math.Sqrt(n) - 1).Any((i) => n % i == 0))
            return n;
    }
}
It works, returning the same results I'd get from running the code referenced in the blog. In fact, it works fairly quickly. In LinqPad, it generated the 100,000th prime in about 1 second. Out of curiosity, I rewrote it without Enumerable.Range() and Any():
int NextPrimeB(int from)
{
    int n = from; // start searching just above 'from'
    while(true)
    {
        n++;
        bool hasFactor = false;
        for (int i = 2; i <= (int)Math.Sqrt(n); i++)
        {
            if (n % i == 0) hasFactor = true;
        }
        if (!hasFactor) return n;
    }
}
Intuitively, I'd expect them to either run at the same speed, or even for the latter to run a little faster. In actuality, computing the same value (100,000th prime) with the second method, takes 12 seconds - It's a staggering difference.
So what's going on here? There must be fundamentally something extra happening in the second approach that's eating up CPU cycles, or some optimization going on the background of the Linq examples. Anybody know why?
For every iteration of the for loop, you are finding the square root of n. Cache it instead.
int root = (int)Math.Sqrt(n);
for (int i = 2; i <= root; i++)
And as others have mentioned, break out of the for loop as soon as you find a factor.
The LINQ version short circuits, your loop does not. By this I mean that when you have determined that a particular integer is in fact a factor the LINQ code stops, returns it, and then moves on. Your code keeps looping until it's done.
If you change the for to include that short circuit, you should see similar performance:
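int NextPrimeB(int from)
{
    int n = from;
    while(true)
    {
        n++;
        bool hasFactor = false;
        for (int i = 2; i <= (int)Math.Sqrt(n); i++)
        {
            // stop at the first factor, as Any() does; n is then known to be composite
            if (n % i == 0) { hasFactor = true; break; }
        }
        if (!hasFactor) return n;
    }
}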
It looks like this is the culprit:
for (int i = 2; i <= (int)Math.Sqrt(n); i++)
{
if (n % i == 0) hasFactor = true;
}
You should exit the loop once you find a factor:
if (n % i == 0){
hasFactor = true;
break;
}
And as others have pointed out, move the Math.Sqrt call outside the loop to avoid calling it each cycle.
Enumerable.Any takes an early out if the condition is successful while your loop does not.
The enumeration of source is stopped as soon as the result can be determined.
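To see the early exit in numbers, the sketch below simply counts how many candidate divisors each approach looks at. The class and method names are invented for the demonstration, and it assumes n is at least 1 so that the Range count is not negative.

```csharp
using System;
using System.Linq;

static class ShortCircuitDemo
{
    // Counts how many divisors Any() actually tests before it stops.
    public static int DivisorsTestedByAny(int n)
    {
        int tested = 0;
        Enumerable.Range(2, (int)Math.Sqrt(n) - 1).Any(i =>
        {
            tested++;
            return n % i == 0;
        });
        return tested;
    }

    // Counts how many divisors the loop without 'break' goes through.
    public static int DivisorsTestedByLoop(int n)
    {
        int tested = 0;
        for (int i = 2; i <= (int)Math.Sqrt(n); i++)
        {
            tested++;
        }
        return tested;
    }

    static void Main()
    {
        Console.WriteLine(DivisorsTestedByAny(1000000));  // 1   (2 already divides it)
        Console.WriteLine(DivisorsTestedByLoop(1000000)); // 999 (every value from 2 to 1000)
    }
}
```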
This is an example of a bad benchmark. Try modifying your loop and see the difference:
if (n % i == 0) { hasFactor = true; break; }
}
throw new InvalidOperationException("Cannot satisfy criteria.");
In the name of optimization, you can be a little more clever about this by avoiding even numbers after 2:
bool hasFactor = n > 2 && n % 2 == 0; // even numbers above 2 are never prime
if (!hasFactor)
{
    int quux = (int)Math.Sqrt(n);
    for (int i = 3; i <= quux; i += 2)
    {
        if (n % i == 0) { hasFactor = true; break; }
    }
}
There are some other ways to optimize prime searches, but this is one of the easier to do and has a large payoff.
Edit: you may want to consider using (int)Math.Sqrt(n) + 1. FP functions + round-down could potentially cause you to miss a square of a large prime number.
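Putting these suggestions together, a complete check might look like the sketch below. It is only an illustration: PrimeCheck and its methods are hypothetical names, and instead of Math.Sqrt it uses the integer test i * i <= n (a swapped-in technique, not the one from the question) so the rounding concern does not arise. NextPrime assumes from is less than int.MaxValue.

```csharp
using System;

static class PrimeCheck
{
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2; // 2 is the only even prime

        // i is a long so that i * i cannot overflow for large n.
        for (long i = 3; i * i <= n; i += 2)
        {
            if (n % i == 0) return false; // stop at the first factor
        }
        return true;
    }

    // Smallest prime strictly greater than 'from'.
    public static int NextPrime(int from)
    {
        int n = from;
        do { n++; } while (!IsPrime(n));
        return n;
    }

    static void Main()
    {
        Console.WriteLine(NextPrime(7));  // 11
        Console.WriteLine(IsPrime(49));   // False
    }
}
```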
At least part of the problem is the number of times Math.Sqrt is executed. In the LINQ query it is executed once, but in the loop example it is executed N times. Try pulling that out into a local and reprofiling the application. That will give you a more representative breakdown:
int limit = (int)Math.Sqrt(n);
for (int i = 2; i <= limit; i++)
Good morning, afternoon or night,
Up until today, I thought comparison was one of the basic processor instructions, and so one of the fastest operations one can do in a computer... On the other hand, I know multiplication is sometimes trickier and involves lots of bit operations. However, I was a little shocked to look at the results of the following code:
Stopwatch Test = new Stopwatch();
int a = 0;
int i = 0, j = 0, l = 0;
double c = 0, d = 0;
for (i = 0; i < 32; i++)
{
Test.Start();
for (j = Int32.MaxValue, l = 1; j != 0; j = -j + ((j < 0) ? -1 : 1), l = -l)
{
a = l * j;
}
Test.Stop();
Console.WriteLine("Product: {0}", Test.Elapsed.TotalMilliseconds);
c += Test.Elapsed.TotalMilliseconds;
Test.Reset();
Test.Start();
for (j = Int32.MaxValue, l = 1; j != 0; j = -j + ((j < 0) ? -1 : 1), l = -l)
{
a = (j < 0) ? -j : j;
}
Test.Stop();
Console.WriteLine("Comparison: {0}", Test.Elapsed.TotalMilliseconds);
d += Test.Elapsed.TotalMilliseconds;
Test.Reset();
}
Console.WriteLine("Product: {0}", c / 32);
Console.WriteLine("Comparison: {0}", d / 32);
Console.ReadKey();
}
Result:
Product: 8558.6
Comparison: 9799.7
Quick explanation: j is an ancillary alternate variable which goes like (...), 11, -10, 9, -8, 7, (...) until it reaches zero, l is a variable which stores j's sign, and a is the test variable, which I want always to be equal to the modulus of j. The goal of the test was to check whether it is faster to set a to this value using multiplication or the conditional operator.
Can anyone please comment on these results?
Thank you very much.
Your second test is not a mere comparison, but an if statement.
That is probably translated into a JUMP/BRANCH instruction on the CPU, involving branch prediction (with possible pipeline stalls), and so it is likely slower than a simple multiplication (even if not by much).
It can often be very difficult to make such assertions about an optimising compiler. They do lots of tricks that make simple cases different from real code. That said, you aren't just doing a comparison, you're doing a compare/assign in a very tight loop. The thread you're working on may have to pause many times at the branch; the multiplication can assign as many times as it likes, as long as the last assignment is still last, so many multiplies can be going on at once.
As is the general rule, make your code clear and ignore minor timing issues unless they become a problem.
If you do have a speed problem a good tracing/timing tool will guide you much better than knowing if one operation is faster than other in a specific case.
I guess one comment I would make is that you are doing a lot more in the second operation:
a = (j < 0) ? -j : j;
Not only are you doing a comparison, but also effectively an "if..else.." with the ? operator and a negation of j.
You should try to run this test 1000 or so times and use the average to compare; you never know what the CLR is doing in the background.
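A small harness along those lines is sketched below. Everything in it (MicroBench, AverageMilliseconds, the two workload methods) is made up for illustration, the workloads assume a positive count, and the numbers it prints will vary with the JIT, the build configuration and the hardware, so treat them as indicative only.

```csharp
using System;
using System.Diagnostics;

static class MicroBench
{
    // Runs 'action' once untimed as a warm-up (so JIT compilation is not measured),
    // then 'runs' more times and returns the mean elapsed milliseconds.
    public static double AverageMilliseconds(Action action, int runs)
    {
        action();
        var watch = new Stopwatch();
        for (int r = 0; r < runs; r++)
        {
            watch.Start();
            action();
            watch.Stop();
        }
        return watch.Elapsed.TotalMilliseconds / runs;
    }

    // Sums |j| over the alternating sequence count, -(count-1), ..., using a sign variable.
    public static long AbsByProduct(int count)
    {
        long a = 0;
        for (int j = count, l = 1; j != 0; j = -j + (j < 0 ? -1 : 1), l = -l)
        {
            a += l * j;
        }
        return a;
    }

    // Same sum, using the conditional operator instead of the multiplication.
    public static long AbsByBranch(int count)
    {
        long a = 0;
        for (int j = count; j != 0; j = -j + (j < 0 ? -1 : 1))
        {
            a += (j < 0) ? -j : j;
        }
        return a;
    }

    static long sink; // keeps the results observable so the work is not discarded

    static void Main()
    {
        const int count = 10000000;
        Console.WriteLine("Product:    {0} ms", AverageMilliseconds(() => sink = AbsByProduct(count), 20));
        Console.WriteLine("Comparison: {0} ms", AverageMilliseconds(() => sink = AbsByBranch(count), 20));
    }
}
```

Run it from a release build outside the debugger; otherwise the measurements mostly reflect unoptimized code.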