As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Is it a bad practice to use break statement inside a for loop?
Say, I am searching for an value in an array. Compare inside a for loop and when value is found, break; to exit the for loop.
Is this a bad practice? I have seen the alternative used: define a variable vFound and set it to true when the value is found and check vFound in the for statement condition. But is it necessary to create a new variable just for this purpose?
I am asking in the context of a normal C or C++ for loop.
P.S: The MISRA coding guidelines advise against using break.
No, break is the correct solution.
Adding a boolean variable makes the code harder to read and adds a potential source of errors.
Lots of answers here, but I haven't seen this mentioned yet:
Most of the "dangers" associated with using break or continue in a for loop are negated if you write tidy, easily-readable loops. If the body of your loop spans several screen lengths and has multiple nested sub-blocks, yes, you could easily forget that some code won't be executed after the break. If, however, the loop is short and to the point, the purpose of the break statement should be obvious.
If a loop is getting too big, use one or more well-named function calls within the loop instead. The only real reason to avoid doing so is for processing bottlenecks.
You can find all sorts of professional code with 'break' statements in them. It perfectly make sense to use this whenever necessary. In your case this option is better than creating a separate variable just for the purpose of coming out of the loop.
Using break as well as continue in a for loop is perfectly fine.
It simplifies the code and improves its readability.
Far from bad practice, Python (and other languages?) extended the for loop structure so part of it will only be executed if the loop doesn't break.
for n in range(5):
for m in range(3):
if m >= n:
print('stop!')
break
print(m, end=' ')
else:
print('finished.')
Output:
stop!
0 stop!
0 1 stop!
0 1 2 finished.
0 1 2 finished.
Equivalent code without break and that handy else:
for n in range(5):
aborted = False
for m in range(3):
if not aborted:
if m >= n:
print('stop!')
aborted = True
else:
print(m, end=' ')
if not aborted:
print('finished.')
General rule: If following a rule requires you to do something more awkward and difficult to read then breaking the rule, then break the rule.
In the case of looping until you find something, you run into the problem of distinguishing found versus not found when you get out. That is:
for (int x=0;x<fooCount;++x)
{
Foo foo=getFooSomehow(x);
if (foo.bar==42)
break;
}
// So when we get here, did we find one, or did we fall out the bottom?
So okay, you can set a flag, or initialize a "found" value to null. But
That's why in general I prefer to push my searches into functions:
Foo findFoo(int wantBar)
{
for (int x=0;x<fooCount;++x)
{
Foo foo=getFooSomehow(x);
if (foo.bar==wantBar)
return foo;
}
// Not found
return null;
}
This also helps to unclutter the code. In the main line, "find" becomes a single statement, and when the conditions are complex, they're only written once.
There is nothing inherently wrong with using a break statement but nested loops can get confusing. To improve readability many languages (at least Java does) support breaking to labels which will greatly improve readability.
int[] iArray = new int[]{0,1,2,3,4,5,6,7,8,9};
int[] jArray = new int[]{0,1,2,3,4,5,6,7,8,9};
// label for i loop
iLoop: for (int i = 0; i < iArray.length; i++) {
// label for j loop
jLoop: for (int j = 0; j < jArray.length; j++) {
if(iArray[i] < jArray[j]){
// break i and j loops
break iLoop;
} else if (iArray[i] > jArray[j]){
// breaks only j loop
break jLoop;
} else {
// unclear which loop is ending
// (breaks only the j loop)
break;
}
}
}
I will say that break (and return) statements often increase cyclomatic complexity which makes it harder to prove code is doing the correct thing in all cases.
If you're considering using a break while iterating over a sequence for some particular item, you might want to reconsider the data structure used to hold your data. Using something like a Set or Map may provide better results.
break is a completely acceptable statement to use (so is continue, btw). It's all about code readability -- as long as you don't have overcomplicated loops and such, it's fine.
It's not like they were the same league as goto. :)
It depends on the language. While you can possibly check a boolean variable here:
for (int i = 0; i < 100 && stayInLoop; i++) { ... }
it is not possible to do it when itering over an array:
for element in bigList: ...
Anyway, break would make both codes more readable.
I agree with others who recommend using break. The obvious consequential question is why would anyone recommend otherwise? Well... when you use break, you skip the rest of the code in the block, and the remaining iterations. Sometimes this causes bugs, for example:
a resource acquired at the top of the block may be released at the bottom (this is true even for blocks inside for loops), but that release step may be accidentally skipped when a "premature" exit is caused by a break statement (in "modern" C++, "RAII" is used to handle this in a reliable and exception-safe way: basically, object destructors free resources reliably no matter how a scope is exited)
someone may change the conditional test in the for statement without noticing that there are other delocalised exit conditions
ndim's answer observes that some people may avoid breaks to maintain a relatively consistent loop run-time, but you were comparing break against use of a boolean early-exit control variable where that doesn't hold
Every now and then people observing such bugs realise they can be prevented/mitigated by this "no breaks" rule... indeed, there's a whole related strategy for "safer" programming called "structured programming", where each function is supposed to have a single entry and exit point too (i.e. no goto, no early return). It may eliminate some bugs, but it doubtless introduces others. Why do they do it?
they have a development framework that encourages a particular style of programming / code, and they've statistical evidence that this produces a net benefit in that limited framework, or
they've been influenced by programming guidelines or experience within such a framework, or
they're just dictatorial idiots, or
any of the above + historical inertia (relevant in that the justifications are more applicable to C than modern C++).
In your example you do not know the number of iterations for the for loop. Why not use while loop instead, which allows the number of iterations to be indeterminate at the beginning?
It is hence not necessary to use break statemement in general, as the loop can be better stated as a while loop.
I did some analysis on the codebase I'm currently working on (40,000 lines of JavaScript).
I found only 22 break statements, of those:
19 were used inside switch statements (we only have 3 switch statements in total!).
2 were used inside for loops - a code that I immediately classified as to be refactored into separate functions and replaced with return statement.
As for the final break inside while loop... I ran git blame to see who wrote this crap!
So according to my statistics: If break is used outside of switch, it is a code smell.
I also searched for continue statements. Found none.
It's perfectly valid to use break - as others have pointed out, it's nowhere in the same league as goto.
Although you might want to use the vFound variable when you want to check outside the loop whether the value was found in the array. Also from a maintainability point of view, having a common flag signalling the exit criteria might be useful.
I don't see any reason why it would be a bad practice PROVIDED that you want to complete STOP processing at that point.
In the embedded world, there is a lot of code out there that uses the following construct:
while(1)
{
if (RCIF)
gx();
if (command_received == command_we_are_waiting_on)
break;
else if ((num_attempts > MAX_ATTEMPTS) || (TickGet() - BaseTick > MAX_TIMEOUT))
return ERROR;
num_attempts++;
}
if (call_some_bool_returning_function())
return TRUE;
else
return FALSE;
This is a very generic example, lots of things are happening behind the curtain, interrupts in particular. Don't use this as boilerplate code, I'm just trying to illustrate an example.
My personal opinion is that there is nothing wrong with writing a loop in this manner as long as appropriate care is taken to prevent remaining in the loop indefinitely.
Depends on your use case. There are applications where the runtime of a for loop needs to be constant (e.g. to satisfy some timing constraints, or to hide your data internals from timing based attacks).
In those cases it will even make sense to set a flag and only check the flag value AFTER all the for loop iterations have actually run. Of course, all the for loop iterations need to run code that still takes about the same time.
If you do not care about the run time... use break; and continue; to make the code easier to read.
On MISRA 98 rules, that is used on my company in C dev, break statement shall not be used...
Edit : Break is allowed in MISRA '04
Ofcourse, break; is the solution to stop the for loop or foreach loop. I used it in php in foreach and for loop and found working.
I think it can make sense to have your checks at the top of your for loop like so
for(int i = 0; i < myCollection.Length && myCollection[i].SomeValue != "Break Condition"; i++)
{
//loop body
}
or if you need to process the row first
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
//loop body
}
This way you can have a singular body function without breaks.
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
PerformLogic(myCollection[i]);
}
It can also be modified to move Break into its own function as well.
for(int i = 0; ShouldContinueLooping(i, myCollection); i++)
{
PerformLogic(myCollection[i]);
}
I have seen something like the following a couple times... and I hate it. Is this basically 'cheating' the language? Or.. would you consider this to be 'ok' because the IsNullOrEmpty is evaluated first, all the time?
(We could argue whether or not a string should be NULL when it comes out of a function, but that isn't really the question.)
string someString;
someString = MagicFunction();
if (!string.IsNullOrEmpty(someString) && someString.Length > 3)
{
// normal string, do whatever
}
else
{
// On a NULL string, it drops to here, because first evaluation of IsNullOrEmpty fails
// However, the Length function, if used by itself, would throw an exception.
}
EDIT:
Thanks again to everyone for reminding me of this language fundamental. While I knew "why" it worked, I can't believe I didn't know/remember the name of the concept.
(In case anyone wants any background.. I came upon this while troubleshooting exceptions generated by NULL strings and .Length > x exceptions... in different places of the code. So when I saw the above code, in addition to everything else, my frustration took over from there.)
You're taking advantage of a language feature known as short circuiting. This is not cheating the language but in fact using a feature exactly how it was designed to be used.
If you are asking if its ok to depend on the "short circuit" relational operators && and ||, then yes thats totally fine.
There is nothing wrong with this, as you just want to make certain you won't get a nullpointer exception.
I think it is reasonable to do.
With Extensions you can make it cleaner, but the basic concept would still be valid.
This code is totally valid, but I like to use the Null Coalesce Operator for avoid null type checks.
string someString = MagicFunction() ?? string.Empty;
if (someString.Length > 3)
{
// normal string, do whatever
}
else
{
// NULL strings will be converted to Length = 0 and will end up here.
}
Theres nothing wrong with this.
if(conditions are evaluated from left to right so it's perfectly fine to stack them like this.
This is valid code, in my opinion (although declaring a variable and assigning it on the next line is pretty annoying), but you should probably realize that you can enter the else-block also in the condition where the length of the string is < 3.
That looks to me like a perfectly reasonable use of logical short-circuitting--if anything, it's cheating with the language. I've only recently come from VB6 which didn't ever short-circuit, and that really annoyed me.
One problem to watch out for is that you might need to test for Null again in that else clause, since--as written--you're winding up there with both Null strings and length-less-than-three strings.
This is perfectly valid and there is nothing wrong with using it that way. If you are following documented behaviour for the language than all is well. In C# the syntax you are using are the conditional logic operators and thier docemented bahviour can be found on MSDN
For me it's the same as when you do not use parenthesis for when doing multiplication and addition in the same statement because the language documents that the multiplication operations will get carried out first.
Relying on short-circuiting is the "right thing" to do in most cases. It leads to terser code with fewer moving parts. Which generally means easier to maintain. This is especially true in C and C++.
I would seriously reconsider hiring someone who is not familiar with (and does not know how to use) short-circuiting operations.
I find it OK :) You're just making sure that you don't access a NULL variable.
Actually, I always do such checking before doing any operation on my variable (also, when indexing collections and so) - it's safer, a best practice, that's all ..
It makes sense because C# by default short circuits the conditions, so I think it's fine to use that to your advantage. In VB there may be some issues if the developer uses AND instead of ANDALSO.
I don't think it's any different than something like this:
INT* pNumber = GetAddressOfNumber();
if ((pNUmber != NULL) && (*pNumber > 0))
{
// valid number, do whatever
}
else
{
// On a null pointer, it drops to here, because (pNumber != NULL) fails
// However, (*pNumber > 0), if used by itself, would throw and exception when dereferencing NULL
}
It's just taking advantage of a feature in the language. This kind of idiom has been in common use, I think, since C started executing Boolean expressions in this manner (or whatever language did it first).)
If it were code in c that you compiled into assembly, not only is short-circuiting the right behavior, it's faster. In machine langauge the parts of the if statement are evaluated one after another. Not short-circuiting is slower.
Writing code cost a lot of $ to a company. But maintaining it cost more !
So, I'm OK with your point : chance are that this line of code will not be understood immediatly by the guy who will have to read it and correct it in 2 years.
Of course, he will be asked to correct a critical production bug. He will search here and there and may not notice this.
We should always code for the next guy and he may be less clever that we are. To me, this is the only thing to remember.
And this implies that we use evident language features and avoid the others.
All the best, Sylvain.
A bit off topic but if you rand the same example in vb.net like this
dim someString as string
someString = MagicFunction()
if not string.IsNullOrEmpty(someString) and someString.Length > 3 then
' normal string, do whatever
else
' do someting else
end if
this would go bang on a null (nothing) string but in VB.Net you code it as follows do do the same in C#
dim someString as string
someString = MagicFunction()
if not string.IsNullOrEmpty(someString) andalso someString.Length > 3 then
' normal string, do whatever
else
' do someting else
end if
adding the andalso make it behave the same way, also it reads better. as someone who does both vb and c' development the second vb one show that the login is slighty different and therefor easyer to explain to someone that there is a differeance etc.
Drux
I found this statement is some old code and it took me a second to figure out...
IsTestActive = (TestStateID == 1 ? true : false);
Please correct me if I'm wrong but isn't this the same as this one?:
IsTestActive = (TestStateID == 1);
If it is, why would you ever want to use the first? Which one is more readable? (I think the latter, but I'd like to see what others think.)
Yes, it is exactly the same.
Yes, the latter is more readable.
IsTestActive = (TestStateID == 1);
is definitely more readable.
You could make a case for defining a constant
ACTIVE = 1
then replacing the boolean variable IsTestActive with
(TestStateID == ACTIVE)
The way the code is now, the state of the boolean IsTestActive will be erroneous if the state of TestStateID changes without updating the boolean. Bypassing the boolean and testing the real source of the information you're after will remove the possibility of this error.
No, there's no practical reason for using the first version, world isn't perfect, and neither are programmers.
Readability depends on where you use this construct. I often find something like
(TestStateID == 1 ? true : false)
more readable.
Well, I don't know about other languages, but in PHP it's even more easy, using type-casting:
$IsTestActive = (boolean)$TestStateId;