A doubt related to Time Complexity ! - c#

Consider the following snippet:
for(i = n-1; i >= 0; i--)
{
if(str[i] == ' ')
{
i += 2; // I am just incrementing it by 2, so that i can retrieve i+1
continue;
}
// rest of the code with many similar increments of i
}
Say suppose the loop never turns up into infinity and If I traverse through the loop with many such increments and decrements, I am sure the complexity would not be of order N or N Square. But is there any generalised complexity for such kind of solutions ??
P.S: I know it`s the worst code, but still wanted to give it a try :-)

This is an infinite loop (infinite complexity) if you have a space in your string. Since you are using continue, it goes back to for and it starts from i+2.

Lets assume that str does not change over the course of this traversal.
You are traversing str backwards and when you hit a space you move the index forward by one i.e. it would hit the space again in the next decrement and then move it forward again i.e. your claim that the loop is not infinite, does not seem valid.

If no mutable state affects the path taken by i, then either you go into an infinite loop, or you exit the loop in n or less steps. In the latter case, the worst case performance will be O(N).
If the loop mutates some other state, and that state affects the path, then it is impossible to predict the complexity without understanding the state and the mutation process.
(As written, the code will go into an infinite loop ... unless the section at the end of the loop does something to prevent it.)

Related

Need help understanding the syntax in an array

I am having trouble understanding what the for loop is doing.
To me, I see it as:
int i = 0; //Declaring i to become 0. i is the value in myArray?
i < myArray.Length; //When i is less than any value in myArray keep looping?
i++; //Every time this loop goes through increase i by 1?
//Making an array called myArray that contains 20,5,7,2,55
int[] myArray = { 20, 5, 7, 2, 55 };
//Using the built in feature, Array.Sort(); to sort out myArray
Array.Sort(myArray);
for (int i = 0; i < myArray.Length; i++)
{
Console.WriteLine(myArray[i]);
}
I'm going to make some assumptions about your knowledge of programming, so forgive me if this explanation covers topics you're already familiar with, but they are all important for understanding what a for loop does, what it's use is and what the semantics are going to be when someone comes behind you and reads your code. Your question demonstrates that you're super close to understanding it, so hopefully it'll hit you like a ton of bricks once you have a good explanation.
Consider an array of strings of length 5. You would initialize it in C# like so:
string[] arr = new string[5];
What this means is that you have an array that has allocated 5 slots for strings. The names of these slots are the indexes of the array. Unfortunately for those who are new to programming, like yourself, indexes start at 0 (this is called zero-indexing) instead of 1. What that means is that the first slot in our new string[] has the name or index of 0, the second of 1, the third of 3 and so on. That means that they length of the array will always be a number equal to the index of the final slot plus one; to put it another way, because arrays are 0 indexed and the first (1st) slot's index is 0, we know what the index of any given slot is n - 1 where n is what folks who are not programmers (or budding programmers!) would typically consider to be the position of that slot in the array as a whole.
We can use the index to pick out the value from an array in the slot that corresponds to the index. Using your example:
int[] myArray = { 20, 5, 7, 2, 55 };
bool first = myArray[0] == 20: //=> true
bool second = myArray[1] == 5; //=> true
bool third = myArray[2] == 7; //=> true
// and so on...
So you see that the number we are passing into the indexer (MSDN) (the square brackets []) corresponds to the location in the array that we are trying to access.
for loops in C syntax languages (C# being one of them along with C, C++, Java, JavaScript, and several others) generally follow the same convention for the "parameters":
for (index_initializer; condition; index_incrementer)
To understand the intended use of these fields it's important to understand what indexes are. Indexes can be thought of as the names or locations for each of the slots in the array (or list or anything that is list-like).
So, to explain each of the parts of the for loop, lets go through them one by one:
Index Initializer
Because we're going to use the index to access the slots in the array, we need to initialize it to a starting value for our for loop. Everything before the first semicolon in the for loop statement is going to run exactly once before anything else in the for loop is run. We call the variable initialized here the index as it keeps track of the current index we're on in the scope of the for loop's life. It is typical (and therefore good practice) to name this variable i for index with nested loops using the subsequent letters of the Latin alphabet. Like I said, this initializing statement happens exactly once so we assign 0 to i to represent that we want to start looping on the first element of the array.
Condition
The next thing that happens when you declare a for loop is that the condition is checked. This check will be the first thing that is run each time the loop runs and the loop will immediately stop if the check returns false. This condition can be anything as long as it results in a bool. If you have a particularly complicated for loop, you might delegate the condition to a method call:
for (int i = 0; ShouldContinueLooping(i); i++)
In the case of your example, we're checking against the length of the array. What we are saying here from an idiomatic standpoint (and what most folks will expect when they see that as the condition) is that you're going to do something with each of the elements of the array. We only want to continue the loop so long as our i is within the "bounds" of the array, which is always defined as 0 through length - 1. Remember how the last index of an array is equal to its length minus 1? That's important here because the first time this condition is going to be false (that is, i will not be less than the length) is when it is equal to the length of the array and therefore 1 greater than the final slot's index. We need to stop looping because the next part of the for statement increases i by one and would cause us to try to access an index outside the bounds of our array.
Index incrementer
The final part of the for loop is executed once as the last thing that happens each time the loop runs. Your comment for this part is spot on.
To recap the order in which things happen:
Index initializer
Conditional check ("break out" or stop lopping if the check returns false)
Body of loop
Index incrementer
Repeat from step 2
To make this clearer, here's your example with a small addition to make things a little more explicit:
// Making an array called myArray that contains 20,5,7,2,55
int[] myArray = { 20, 5, 7, 2, 55 };
// Using the built in feature, Array.Sort(); to sort out myArray
Array.Sort(myArray);
// Array is now [2, 5, 7, 20, 55]
for (int i = 0; i < myArray.Length; i++)
{
int currentNumber = myArray[i];
Console.WriteLine($"Index {i}; Current number {currentNumber}");
}
The output of running this will be:
Index 0; Current number 2
Index 1; Current number 5
Index 2; Current number 7
Index 3; Current number 20
Index 4; Current number 55
I am having trouble understanding what the for loop is doing.
Then let's take a big step back.
When you see
for (int i = 0; i < myArray.Length; i++)
{
Console.WriteLine(myArray[i]);
}
what you should mentally think is:
int i = 0;
while (i < myArray.Length)
{
Console.WriteLine(myArray[i]);
i++;
}
Now we have rewritten the for in terms of while, which is simpler.
Of course, this requires that you understand "while". We can understand while by again, breaking it down into something simpler. When you see while, think:
int i = 0;
START:
if (i < myArray.Length)
goto BODY;
else
goto END;
BODY:
Console.WriteLine(myArray[i]);
i++;
goto START;
END:
// the rest of your program here.
Now we have broken down your loop into its fundamental parts and the control flow is laid bare to your understanding. Walk through it.
We start with i equal to 0. Suppose the length of the array is 3.
Is 0 less than 3? Yes. So we go to BODY next. We write the 0th element of the array and increment i to 1. Now we go back to START.
Is 1 less than 3? Yes. So we go to BODY next. We write the 1th element of the array and increment i to 2. Now we go back to START.
Is 2 less than 3? Yes. So we go to BODY next. We write the 2th element of the array and increment i to 3. Now we go back to START.
Is 3 less than 3? No. So we go to END, and the rest of your program executes.
Now, you probably have noticed that the "goto" form is incredibly ugly and hard to read and reason about. That's why we invented while and for loops, so that you don't have to write awful code that uses gotos. But you can always reason about simple control flow by going back to the goto form mentally.
i < myArray.Length;
This is not testing against the values inside myArray but against the length (how many items the array contains). Therefore it means: When i is less than the length of the array.
So the loop will keep going, adding 1 to i (as you correctly said) each time it loops, when i is equal to the length of the array, meaning it has gone through all the values, it will exit the loop.
As Nicolás Straub pointed out, i is the index of the array, meaning the location of an item in an array, you have initialised it with the value of 0, this is correct because the first value in an array would have an index of 0.
To directly answer your question about for loops:
A for loop is executing lines of code iteratively (multiple times), the amount depends on its control statement:
for (int i = 0; i < myArray.Length; i++)
For loops are generally pre-condition (the condition to loop is before the code) and have loop counters, being i (i is actually a counter but can be seen as the index because you are going through every element, if you wanted to skip some then i would only be a counter). For is great for when you know how many times you want to loop before you start looping.
You are correct in your thinking except as what the others have stated. Think of an array as a sequence of data. You can even use the Reverse() method to apply that to your array. I would research more about arrays so you will understand different things you can do with an array and most importantly if you need to read or write them on the console, in a listbox, or a gridview from the text or csv file.
I suggest you add:
Console.ReadLine();
When you do this the application then will read like this:
2
5
7
20
55

It is wise to create a variable to avoid using several times a count?

Many times I'm in front of a code like that:
var maybe = 'some random linq query'
int maybeCount = maybe.Count();
List<KeyValuePair<int, Customer>> lst2scan = maybe.Take(8).ToList();
for (int k = 0; k < 8; k++)
{
if (k + 1 <= maybeCount) custlst[k].Customer = lst2scan[k].Value;
else custlst[k].Customer = new Customer();
}
Each time I have a code like this. I ask me, must I create a variable to avoid the for-each calculate the Count() ?. Maybe for only 1 Count() in the loop it's useless.
Somebody have an advice with the "right way" to code that in a loop. Do you do in case per case or ?
In this case, is my maybeCount variable useless ?
Do you know if the Count() count each time or he just return the content of a count variable.
Thanks for any advice to improve my knowledge.
If the count is guarranteed to not change then yes this will stop the need to calculate the count multiple times, this can become more important when the Count is resolved from method that can take some time to execute.
If the count does change then this can result in erroneous results being generated.
In answer to your questions.
The way you have done it looks pretty reasonable
As mentioned, your cnt variable means that you won't have to repeat your linq query on each iteration.
The count will be determined each time you call it, it has no memory for what the count is
Note: It is important to note that I am talking about the Enumerable.Count method. The List<T>.Count is a property that can just return a value
Well, it depends ;)
I wouldn't use a loop. I would do something like this:
-
var maybe = 'some random linq query'
// if necessary clear list
custlst.Clear();
custlst.AddRange(maybe.Take(8).Select(p => p.Value));
custlst.AddRange(Enumerable.Range(0, 8 - custlst.Count).Select(i => new Customer()));
-
In this case, yes.
It depends if the underlying type is a List (a Collection, in fact) or a simple enumerable. The implementation of Count will look for a precomputed Count value and use it. So, if you're cycling a list, the variable is almost useless (still a little faster, but I don't think it's relevant by any means), if you're cycling a Linq query, a variable is recommended, because Count is going to cycle the entire enumeration.
Just my 2c.
Edit: for reference, you can find the source for .Count() at https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Enumerable.cs (around line 1489); as you see, it checks for ICollection, and uses the precomputed Count property.

Reasons to not use 'break' statement in loops [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
Is it a bad practice to use break statement inside a for loop?
Say, I am searching for an value in an array. Compare inside a for loop and when value is found, break; to exit the for loop.
Is this a bad practice? I have seen the alternative used: define a variable vFound and set it to true when the value is found and check vFound in the for statement condition. But is it necessary to create a new variable just for this purpose?
I am asking in the context of a normal C or C++ for loop.
P.S: The MISRA coding guidelines advise against using break.
No, break is the correct solution.
Adding a boolean variable makes the code harder to read and adds a potential source of errors.
Lots of answers here, but I haven't seen this mentioned yet:
Most of the "dangers" associated with using break or continue in a for loop are negated if you write tidy, easily-readable loops. If the body of your loop spans several screen lengths and has multiple nested sub-blocks, yes, you could easily forget that some code won't be executed after the break. If, however, the loop is short and to the point, the purpose of the break statement should be obvious.
If a loop is getting too big, use one or more well-named function calls within the loop instead. The only real reason to avoid doing so is for processing bottlenecks.
You can find all sorts of professional code with 'break' statements in them. It perfectly make sense to use this whenever necessary. In your case this option is better than creating a separate variable just for the purpose of coming out of the loop.
Using break as well as continue in a for loop is perfectly fine.
It simplifies the code and improves its readability.
Far from bad practice, Python (and other languages?) extended the for loop structure so part of it will only be executed if the loop doesn't break.
for n in range(5):
for m in range(3):
if m >= n:
print('stop!')
break
print(m, end=' ')
else:
print('finished.')
Output:
stop!
0 stop!
0 1 stop!
0 1 2 finished.
0 1 2 finished.
Equivalent code without break and that handy else:
for n in range(5):
aborted = False
for m in range(3):
if not aborted:
if m >= n:
print('stop!')
aborted = True
else:
print(m, end=' ')
if not aborted:
print('finished.')
General rule: If following a rule requires you to do something more awkward and difficult to read then breaking the rule, then break the rule.
In the case of looping until you find something, you run into the problem of distinguishing found versus not found when you get out. That is:
for (int x=0;x<fooCount;++x)
{
Foo foo=getFooSomehow(x);
if (foo.bar==42)
break;
}
// So when we get here, did we find one, or did we fall out the bottom?
So okay, you can set a flag, or initialize a "found" value to null. But
That's why in general I prefer to push my searches into functions:
Foo findFoo(int wantBar)
{
for (int x=0;x<fooCount;++x)
{
Foo foo=getFooSomehow(x);
if (foo.bar==wantBar)
return foo;
}
// Not found
return null;
}
This also helps to unclutter the code. In the main line, "find" becomes a single statement, and when the conditions are complex, they're only written once.
There is nothing inherently wrong with using a break statement but nested loops can get confusing. To improve readability many languages (at least Java does) support breaking to labels which will greatly improve readability.
int[] iArray = new int[]{0,1,2,3,4,5,6,7,8,9};
int[] jArray = new int[]{0,1,2,3,4,5,6,7,8,9};
// label for i loop
iLoop: for (int i = 0; i < iArray.length; i++) {
// label for j loop
jLoop: for (int j = 0; j < jArray.length; j++) {
if(iArray[i] < jArray[j]){
// break i and j loops
break iLoop;
} else if (iArray[i] > jArray[j]){
// breaks only j loop
break jLoop;
} else {
// unclear which loop is ending
// (breaks only the j loop)
break;
}
}
}
I will say that break (and return) statements often increase cyclomatic complexity which makes it harder to prove code is doing the correct thing in all cases.
If you're considering using a break while iterating over a sequence for some particular item, you might want to reconsider the data structure used to hold your data. Using something like a Set or Map may provide better results.
break is a completely acceptable statement to use (so is continue, btw). It's all about code readability -- as long as you don't have overcomplicated loops and such, it's fine.
It's not like they were the same league as goto. :)
It depends on the language. While you can possibly check a boolean variable here:
for (int i = 0; i < 100 && stayInLoop; i++) { ... }
it is not possible to do it when itering over an array:
for element in bigList: ...
Anyway, break would make both codes more readable.
I agree with others who recommend using break. The obvious consequential question is why would anyone recommend otherwise? Well... when you use break, you skip the rest of the code in the block, and the remaining iterations. Sometimes this causes bugs, for example:
a resource acquired at the top of the block may be released at the bottom (this is true even for blocks inside for loops), but that release step may be accidentally skipped when a "premature" exit is caused by a break statement (in "modern" C++, "RAII" is used to handle this in a reliable and exception-safe way: basically, object destructors free resources reliably no matter how a scope is exited)
someone may change the conditional test in the for statement without noticing that there are other delocalised exit conditions
ndim's answer observes that some people may avoid breaks to maintain a relatively consistent loop run-time, but you were comparing break against use of a boolean early-exit control variable where that doesn't hold
Every now and then people observing such bugs realise they can be prevented/mitigated by this "no breaks" rule... indeed, there's a whole related strategy for "safer" programming called "structured programming", where each function is supposed to have a single entry and exit point too (i.e. no goto, no early return). It may eliminate some bugs, but it doubtless introduces others. Why do they do it?
they have a development framework that encourages a particular style of programming / code, and they've statistical evidence that this produces a net benefit in that limited framework, or
they've been influenced by programming guidelines or experience within such a framework, or
they're just dictatorial idiots, or
any of the above + historical inertia (relevant in that the justifications are more applicable to C than modern C++).
In your example you do not know the number of iterations for the for loop. Why not use while loop instead, which allows the number of iterations to be indeterminate at the beginning?
It is hence not necessary to use break statemement in general, as the loop can be better stated as a while loop.
I did some analysis on the codebase I'm currently working on (40,000 lines of JavaScript).
I found only 22 break statements, of those:
19 were used inside switch statements (we only have 3 switch statements in total!).
2 were used inside for loops - a code that I immediately classified as to be refactored into separate functions and replaced with return statement.
As for the final break inside while loop... I ran git blame to see who wrote this crap!
So according to my statistics: If break is used outside of switch, it is a code smell.
I also searched for continue statements. Found none.
It's perfectly valid to use break - as others have pointed out, it's nowhere in the same league as goto.
Although you might want to use the vFound variable when you want to check outside the loop whether the value was found in the array. Also from a maintainability point of view, having a common flag signalling the exit criteria might be useful.
I don't see any reason why it would be a bad practice PROVIDED that you want to complete STOP processing at that point.
In the embedded world, there is a lot of code out there that uses the following construct:
while(1)
{
if (RCIF)
gx();
if (command_received == command_we_are_waiting_on)
break;
else if ((num_attempts > MAX_ATTEMPTS) || (TickGet() - BaseTick > MAX_TIMEOUT))
return ERROR;
num_attempts++;
}
if (call_some_bool_returning_function())
return TRUE;
else
return FALSE;
This is a very generic example, lots of things are happening behind the curtain, interrupts in particular. Don't use this as boilerplate code, I'm just trying to illustrate an example.
My personal opinion is that there is nothing wrong with writing a loop in this manner as long as appropriate care is taken to prevent remaining in the loop indefinitely.
Depends on your use case. There are applications where the runtime of a for loop needs to be constant (e.g. to satisfy some timing constraints, or to hide your data internals from timing based attacks).
In those cases it will even make sense to set a flag and only check the flag value AFTER all the for loop iterations have actually run. Of course, all the for loop iterations need to run code that still takes about the same time.
If you do not care about the run time... use break; and continue; to make the code easier to read.
On MISRA 98 rules, that is used on my company in C dev, break statement shall not be used...
Edit : Break is allowed in MISRA '04
Ofcourse, break; is the solution to stop the for loop or foreach loop. I used it in php in foreach and for loop and found working.
I think it can make sense to have your checks at the top of your for loop like so
for(int i = 0; i < myCollection.Length && myCollection[i].SomeValue != "Break Condition"; i++)
{
//loop body
}
or if you need to process the row first
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
//loop body
}
This way you can have a singular body function without breaks.
for(int i = 0; i < myCollection.Length && (i == 0 ? true : myCollection[i-1].SomeValue != "Break Condition"); i++)
{
PerformLogic(myCollection[i]);
}
It can also be modified to move Break into its own function as well.
for(int i = 0; ShouldContinueLooping(i, myCollection); i++)
{
PerformLogic(myCollection[i]);
}

While loop execution time

We were having a performance issue in a C# while loop. The loop was super slow doing only one simple math calc. Turns out that parmIn can be a huge number anywhere from 999999999 to MaxInt. We hadn't anticipated the giant value of parmIn. We have fixed our code using a different methodology.
The loop, coded for simplicity below, did one math calc. I am just curious as to what the actual execution time for a single iteration of a while loop containing one simple math calc is?
int v1=0;
while(v1 < parmIn) {
v1+=parmIn2;
}
There is something else going on here. The following will complete in ~100ms for me. You say that the parmIn can approach MaxInt. If this is true, and the ParmIn2 is > 1, you're not checking to see if your int + the new int will overflow. If ParmIn >= MaxInt - parmIn2, your loop might never complete as it will roll back over to MinInt and continue.
static void Main(string[] args)
{
int i = 0;
int x = int.MaxValue - 50;
int z = 42;
System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
st.Start();
while (i < x)
{
i += z;
}
st.Stop();
Console.WriteLine(st.Elapsed.Milliseconds.ToString());
Console.ReadLine();
}
Assuming an optimal compiler, it should be one operation to check the while condition, and one operation to do the addition.
The time, small as it is, to execute just one iteration of the loop shown in your question is ... surprise ... small.
However, it depends on the actual CPU speed and whatnot exactly how small it is.
It should be just a few machine instructions, so not many cycles to pass once through the iteration, but there could be a few cycles to loop back up, especially if branch prediction fails.
In any case, the code as shown either suffers from:
Premature optimization (in that you're asking about timing for it)
Incorrect assumptions. You can probably get a much faster code if parmIn is big by just calculating how many loop iterations you would have to perform, and do a multiplication. (note again that this might be an incorrect assumption, which is why there is only one sure way to find performance issues, measure measure measure)
What is your real question?
It depends on the processor you are using and the calculation it is performing. (For example, even on some modern architectures, an add may take only one clock cycle, but a divide may take many clock cycles. There is a comparison to determine if the loop should continue, which is likely to be around one clock cycle, and then a branch back to the start of the loop, which may take any number of cycles depending on pipeline size and branch prediction)
IMHO the best way to find out more is to put the code you are interested into a very large loop (millions of iterations), time the loop, and divide by the number of iterations - this will give you an idea of how long it takes per iteration of the loop. (on your PC). You can try different operations and learn a bit about how your PC works. I prefer this "hands on" approach (at least to start with) because you can learn so much more from physically trying it than just asking someone else to tell you the answer.
The while loop is couple of instructions and one instruction for the math operation. You're really looking at a minimal execution time for one iteration. it's the sheer number of iterations you're doing that is killing you.
Note that a tight loop like this has implications on other things as well, as it bogs down one CPU and it blocks the UI thread (if it's running on it). Thus, not only it is slow due to the number of operations, it also adds a perceived perf impact due to making the whole machine look unresponsive.
If you're interested in the actual execution time, why not time it for yourself and find out?
int parmIn = 10 * 1000 * 1000; // 10 million
int v1=0;
Stopwatch sw = Stopwatch.StartNew();
while(v1 < parmIn) {
v1+=parmIn2;
}
sw.Stop();
double opsPerSec = (double)parmIn / sw.Elapsed.TotalSeconds;
And, of course, the time for one iteration is 1/opsPerSec.
Whenever someone asks about how fast control structures in any language you know they are trying to optimize the wrong thing. If you find yourself changing all your i++ to ++i or changing all your switch to if...else for speed you are micro-optimizing. And micro optimizations almost never give you the speed you want. Instead, think a bit more about what you are really trying to do and devise a better way to do it.
I'm not sure if the code you posted is really what you intend to do or if it is simply the loop stripped down to what you think is causing the problem. If it is the former then what you are trying to do is find the largest value of a number that is smaller than another number. If this is really what you want then you don't really need a loop:
// assuming v1, parmIn and parmIn2 are integers,
// and you want the largest number (v1) that is
// smaller than parmIn but is a multiple of parmIn2.
// AGAIN, assuming INTEGER MATH:
v1 = (parmIn/parmIn2)*parmIn2;
EDIT: I just realized that the code as originally written gives the smallest number that is a multiple of parmIn2 that is larger than parmIn. So the correct code is:
v1 = ((parmIn/parmIn2)*parmIn2)+parmIn2;
If this is not what you really want then my advise remains the same: think a bit on what you are really trying to do (or ask on Stackoverflow) instead of trying to find out weather while or for is faster. Of course, you won't always find a mathematical solution to the problem. In which case there are other strategies to lower the number of loops taken. Here's one based on your current problem: keep doubling the incrementer until it is too large and then back off until it is just right:
int v1=0;
int incrementer=parmIn2;
// keep doubling the incrementer to
// speed up the loop:
while(v1 < parmIn) {
v1+=incrementer;
incrementer=incrementer*2;
}
// now v1 is too big, back off
// and resume normal loop:
v1-=incrementer;
while(v1 < parmIn) {
v1+=parmIn2;
}
Here's yet another alternative that speeds up the loop:
// First count at 100x speed
while(v1 < parmIn) {
v1+=parmIn2*100;
}
// back off and count at 50x speed
v1-=parmIn2*100;
while(v1 < parmIn) {
v1+=parmIn2*50;
}
// back off and count at 10x speed
v1-=parmIn2*50;
while(v1 < parmIn) {
v1+=parmIn2*10;
}
// back off and count at normal speed
v1-=parmIn2*10;
while(v1 < parmIn) {
v1+=parmIn2;
}
In my experience, especially with graphics programming where you have millions of pixels or polygons to process, speeding up code usually involve adding even more code which translates to more processor instructions instead of trying to find the fewest instructions possible for the task at hand. The trick is to avoid processing what you don't have to.

Why is it bad to "monkey with the loop index"?

One of Steve McConnell's checklist items is that you should not monkey with the loop index (Chapter 16, page 25, Loop Indexes, PDF format).
This makes intuitive sense and is a practice I've always followed except maybe as I learned how to program back in the day.
In a recent code review I found this awkward loop and immediately flagged it as suspect.
for ( int i=0 ; i < this.MyControl.TabPages.Count ; i++ )
{
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[i] );
i--;
}
It's almost amusing since it manages to work by keeping the index at zero until all TabPages are removed.
This loop could have been written as
while(MyControl.TabPages.Count > 0)
MyControl.TabPages.RemoveAt(0);
And since the control was in fact written at about the same time as the loop it could even have been written as
MyControl.TabPages.Clear();
I've since been challenged about the code-review issue and found that my articulation of why it is bad practice was not as strong as I'd have liked. I said it was harder to understand the flow of the loop and therefore harder to maintain and debug and ultimately more expensive over the lifetime of the code.
Is there a better articulation of why this is bad practice?
I think your articulation is great. Maybe it can be worded like so:
Since the logic can be expressed much
clearer, it should.
Well, this adds confusion for little purpose - you could just as easily write:
while(MyControl.TabPages.Count > 0)
{
MyControl.TabPages.Remove(MyControl.TabPages[0]);
}
or (simpler)
while(MyControl.TabPages.Count > 0)
{
MyControl.TabPages.RemoveAt(0);
}
or (simplest)
MyControl.TabPages.Clear();
In all of the above, I don't have to squint and think about any edge-cases; it is pretty clear what happens when. If you are modifying the loop index, you can quickly make it quite hard to understand at a glance.
It's all about expectation.
When one uses a loopcounter, you expect that it is incremented (decremented) each iteration of the loop with the same amount.
If you mess (or monkey if you like) with the loop counter, your loop does not behave like expected. This means it is harder to understand and it increases the chance that your code is misinterpreted, and this introduces bugs.
Or to (mis) quote a wise but fictional character:
complexity leads to misunderstanding
misunderstanding leads to bugs
bugs leads to the dark side.
I agree with your challenge. If they want to keep a for loop, the code:
for ( int i=0 ; i < this.MyControl.TabPages.Count ; i++ ) {
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[i] );
i--;
}
reduces as follows:
for ( int i=0 ; i < this.MyControl.TabPages.Count ; ) {
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[i] );
}
and then to:
for ( ; 0 < this.MyControl.TabPages.Count ; ) {
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[0] );
}
But a while loop or a Clear() method, if that exists, are clearly preferable.
I think you could build a stronger argument by invoking Knuth's concepts of literate programming, that programs should not be written for computers, but to communicate concepts to other programmers, thus the simpler loop:
while (this.MyControl.TabPages.Count>0)
{
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[0] );
}
more clearly illustrates the intent - remove the first tab page until there are none left. I think most people would grok that much quicker than the original example.
This might be clearer:
while (this.MyControl.TabPages.Count > 0)
{
this.MyControl.TabPages.Remove ( this.MyControl.TabPages[0] );
}
One argument that could be used is that it is much more difficult to debug such code, where the index is being changed twice.
The original code is highly redundant to bend the action of the for-loop to what is necessary. The increment is unnecessary, and balanced by the decrement. Those should be PRE-increments, not POST-increments as well, because conceptually the post-increment is wrong. The comparison with the tabpages count is semi-redundant since that's a hackish way of checking that the container is empty.
In short, it's unnecessary cleverness, it adds rather than removes redundancy. Since it can be both obviously simpler and obviously shorter, it's wrong.
The only reason to bother with an index at all would be if one were selectively erasing things. Even in that case, I would think it preferable to say: i=0;
while(i < MyControl.Tabpages.Count)
if (wantToDelete(MyControl.Tabpages(i))
MyControl.Tabpages.RemoveAt(i);
else
i++;rather than jinxing the loop index after each removal. Or, better yet, have the index count downward so that when an item is removed it won't affect the index of future items needing removal. If many items are deleted, this may also help minimize the amount of time spent moving items around after each deletion.
I think pointing out the fact that the loop iteratations is beeing controled not by the "i++" as anyone would expect but by the crazy "i--" setup should have been enough.
I also think that altering the the state of "i" by evaluating the count and then altering the count in the loop may also lead to potential problems. I would expect a for loop to generally have a "fixed" number of iterations and the only part of the for loop condition that changes to be the loop variable "i".

Categories