Expanding pre increment operator in programming languages [duplicate]

Expanding pre increment operator in programming languages [duplicate] - c#

This question already has answers here:
What is the difference between prefix and postfix operators?
(13 answers)
Closed 9 years ago.
For example, i++ can be written as i=i+1.
Similarly, how can ++i be written in the form as shown above for i++?
Is it the same way as of i++ or if not please specify the right way to write ++i in the form as shown above for i++?

You actually have it the wrong way around:
i=i+1 is closer to ++i than to i++
The reason for this is that it anyway only makes a difference with surrounding expressions, like:
j=++i gives j the same value as j=(i+=1) and j=(i=i+1)
There's a whole lot more to the story though:
1) To be on the same ground, let's look at the difference of pre-increment and post-increment:
j=i++ vs j=++i
The first one is post-increment (value is returned and "post"=afterwards incremented), the later one is pre-increment (value is "pre"=before incremented, and then returned)
So, (multi statement) equivalents would be:
j=i; i=i+1; and i=i+1; j=i; respectively
This makes the whole thing very interesting with (theoretical) expressions like j=++i++ (which are illegal in standard C though, as you may not change one variable in a single statement multiple times)
It's also interesting with regards to memory fences, as modern processors can do so called out of order execution, which means, you might code in a specific order, and it might be executed in a totally different order (though, there are certain rules to this of course). Compilers may also reorder your code, so the 2 statements actually end up being the same at the end.
-
2) Depending on your compiler, the expressions will most likely be really the same after compilation/optimization etc.
If you have a standalone expression of i++; and ++i; the compiler will most likely transform it to the intermediate 'ADD' asm command. You can try it yourself with most compilers, e.g. with gcc it's -s to get the intermediate asm output.
Also, for many years now, compilers tend to optimize the very common construct of
for (int i = 0; i < whatever; i++)
as directly translating i++ in this case would spoil an additional register (and possible lead to register spilling) and lead to unneeded instructions (we're talking about stuff here, which is so minor, that it really won't matter unless the loop runs trillion of times with a single asm instruction or the like)
-
3) Back to your original question, the whole thing is a precedence question, as the 2 statements can be expressed as (this time a bit more exact than the upper explanation):
something = i++ => temp = i; i = i + 1; something = temp
and
something = ++i => i = i + 1; something = i
( here you can also see why the first variant would theoretically lead to more register spilled and more instructions )
As you can see here, the first expression can not easily be altered in a way I think would satisfy your question, as it's not possible to express it using precedence symbols, e.g. parentheses, or, simpler to understand, a block). For the second one though, that's easy:
++i => (++i) => { i = i + 1; return i } (pseudo code)
for the first one that would be
i++ => { return i; i = i + 1 } (pseudo code again)
As you can see, this won't work.
Hope I helped you clear up your question, if anything may need clarification or I made an error, feel free to point it out.

Technically j = ++i is the same as
j = (i = i + 1);
and j = i++; is the same as
j = i;
i = i + 1;
You can avoid writing this as two lines with this trick, though ++ doesn't do this.
j = (i = i + 1) - 1;

++i and i++ both have identical results if written as a stand alone statement (as opposed to chaining it with other operations.
That result is also identical to i += 1 and to i = i + 1.
The difference only comes in if you start using the ++i and i++ inside a larger expression.

The real meaning of this pre and post increment, you 'll know only about where you are using that.?
Main difference is
pre condition will increment first before execute any statement.
Post condition will increment after the statement executed.
Here I've mention a simple example to you.
void main()
{
int i=0, value;
value=i++; // Here I've used post increment. Here value of i will be assigned first then It'll be incremented.
// After this statement, now Value will hold the value 0 and i will hold the value 1
// Now I'm going to use pre increment
value=++i; // Here i've used pre increment. So i will be incremented first then value will be assigned to value.
// After this statement, Now value will hold the value 2 and i will hold the value 2
}

This may be done using anonymous methods:
int i = 5;
int j = i++;
is equivalent to:
int i = 5;
int j = new Func<int>(() => { int temp = i; i = i + 1; return temp; })();
However, it would make more sense if it were expanded as a named method with a ref parameter (which mimics the underlying implementation of the i++ operator):
static void Main(string[] args)
{
int i = 5;
int j = PostIncrement(ref i);
}
static int PostIncrement(ref int x)
{
int temp = x;
x = x + 1;
return temp;
}

There is no 'equivalent' for i++.
In i++ the value for i is incremented after the surrounding expression has been evaluated.
In ++i and i+1 the value for i is incremented before the surrounding expression is evaluated.

++i will increment the value of 'i', and then return the incremented value.
Example:
int i=1;
System.out.print(++i);
//2
System.out.print(i);
//2
i++ will increment the value of 'i', but return the original value that 'i' held before being incremented.
int i=1;
System.out.print(i++);
//1
System.out.print(i);
//2

Related

Sorting array using BubbleSort fails

Following algorithm works pretty fine in C#
public int[] Sortieren(int[] array, int decide)
{
bool sorted;
int temp;
for (int i = 0; i < array.Length; i++)
{
do
{
sorted= true;
for (int j = 0; j < array.Length - 1; j++)
{
if (decide == 1)
{
if (array[j] < array[j + 1])
{
temp = array[j];
array[j] = array[j + 1];
array[j + 1] = temp;
sorted= false;
}
}else if (decide == 0)
{
if (array[j] > array[j + 1])
{
temp = array[j];
array[j] = array[j + 1];
array[j + 1] = temp;
sorted= false;
}
}
else
{
Console.WriteLine("Incorrect sorting parameter!");
break;
}
}
} while (!sorted);
}
return array;
}
Same thing in C fails. I only get the first two numbers of the array being sorted. The rest of the numbers are same. So, this code also seems to change the array instead of only sorting it. Any ideas, where are the bugs?
#include <stdio.h>
#include<stdbool.h>
#define MAX 10
void main(void)
{
int random_numbers[MAX],temp,Array_length;
bool sorted;
srand(time(NULL));
for(int i=0;i<=MAX;i++){
random_numbers[i]=rand()%1000;
}
Array_length=sizeof(random_numbers) / sizeof(int);
printf("List of (unsorted) numbers:\n");
for(int i=0;i<MAX;i++){
if(i==MAX-1)
printf("%i",random_numbers[i]);
else
printf("%i,",random_numbers[i]);
}
//Searching algorithm
for(int i=0;i<Array_length;i++){
do{
sorted=true;
for(int j=0;j<Array_length-1;j++){
if(random_numbers[j]>random_numbers[j+1]){
temp=random_numbers[j];
random_numbers[j]==random_numbers[j+1];
random_numbers[j+1]=temp;
sorted=false;
}
}
}while(!sorted);
}
printf("\n");
for(int i=0;i<Array_length;i++){
if(i==Array_length-1)
printf("%i",random_numbers[i]);
else
printf("%i,",random_numbers[i]);
}
}

You have an error in your swap algorithm:
if (zufallszahlen[j] > zufallszahlen[j+1]) {
temp = zufallszahlen[j];
zufallszahlen[j] == zufallszahlen[j+1]; // here
zufallszahlen[j+1] = temp;
sortiert = false;
}
In the line after you assign to temp, your double equal sign results in a check for equality rather than an assignment. This is still legal code (== is an operator and and expressions that use them evaluate to something), and the expression will evaluate to either 1 or 0 depending on the truth value of the statement. Note that this is legal even though you're not using the expression, where normally a boolean value would presumably be used for control flow.
Note that this is true for other operators as well. For example, the = operator assigns the value on the right to the variable on the left, so hypothetically a mistake like if (x = 0) will mean this branch will never be called, since the x = 0 will evaluate to false every time, when you may have meant to branch when x == 0.
Also, why are you using a boolean value to check if the array is sorted? Bubble sort is a simple algorithm, so it should be trivial to implement, and by the definition of an algorithm, it's guaranteed to both finish and be correct. If you were trying to optimize for performance purposes, for example choosing between merge sort and insertion sort based on whether the data was already sorted then I would understand, but you're checking whether the data is sorted as you're sorting it, which doesn't really make sense, since the algorithm will tell you when it's sorted because it will finish. Adding the boolean checking only adds overhead and nets you nothing.
Also, note how in your C# implementation, you repeated the sort process. This is a good sign your design is wrong. You take in an integer as well as the actual int[] array in your C# code, and you use that integer to branch. Then, from what I can gather, you sort using either < or >, depending on the value passed in. I'm pretty confused by this, since either would work. You gain nothing from adding this functionality, so I'm confused as to why you added it in.
Also, why do you repeat the printf statements? Even doing if/else if I might understand. But you're doing if/else. This is logically equivalent to P V ~P and will always evaluate to true, so you might as well get rid of the if and the else and just have one printf statement.
Below is implementation of your Bubble Sort program, and I want to point out a few things. First, it's generally frowned upon to declare main as void (What should main() return in C and C++?).
I quickly want to also point out that even though we are declaring the maximum length of the array as a macro, all of the array functions I defined explicitly take a size_t size argument for referrential transparency.
Last but not least, I would recommend not declaring all your variables at the start of your program/functions. This is a more contested topic among developers, especially because it used to be required, since compilers needed to know exactly what variables needed to be allocated. As compilers got better and better, they could accept variable declarations within code (and could even optimize some variables away altogether), so some developers recommend declaring your variables when you need them, so that their declaration makes sense (i.e... you know you need them), and also to reduce code noise.
That being said, some developers do prefer declaring all their variables at the beginning of the program/function. You'll especially see this:
int i, j, k;
or some variation of that, because the developer pre-declared all of their loop counters. Again, I think it's just code noise, and when you work with C++ some of the language syntax itself is code noise in my opinion, but just be aware of this.
So for example, rather than declaring everything like this:
int zufallszahlen[MAX], temp, Array_length;
You would declare the variables like this:
int zufallszahlen[MAX];
int Array_length = sizeof (zufallszahlen) / sizeof (int);
The temp variable is then put off for as long as possible so that it's obvious when and were it's useful. In my implementation, you'll notice I declared it in the swap function.
For pedagogical purposes I would also like to add that you don't have to use a swap variable when sorting integers because you can do the following:
a = a + b;
b = a - b;
a = a - b;
I will say, however, that I believe the temporary swap variable makes the swap much more instantly familiar, so I would say leave it in, but this is my own personal preference.
I do recommend using size_t for Array_length, however, because that's the type that the sizeof operator returns. It makes sense, too, because the size of an array will not be negative.
Here are the include statements and functions. Remember that I do not include <stdbool.h> because the bool checking you were doing was doing nothing for the algorithm.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX 10
void PrintArray(int arr[], size_t n) {
for (int i = 0; i < n; ++i) {
printf("%d ", arr[i]);
}
printf("\n");
}
void PopulateArray(int arr[], size_t n) {
for (int i = 0; i < n; ++i) {
arr[i] = rand() % 1000 + 1;
}
}
void BubbleSortArray(int arr[], size_t n) {
for (int i = 0; i < n; ++i) {
for (int j = 0; j < n - 1; ++j) {
if (arr[j] > arr[j+1]) {
int temp = arr[j+1];
arr[j+1] = arr[j];
arr[j] = temp;
}
}
}
}
To implement the bubble sort algorithm, the only thing you have to do now is initialize the random number generator like you did, create your array and populate it, and finally sort the array.
int main()
{
srand(time(NULL));
int arr[MAX];
size_t array_length = sizeof (arr) / sizeof (int);
PopulateArray(arr, array_length);
PrintArray(arr, array_length);
BubbleSortArray(arr, array_length);
PrintArray(arr, array_length);
}
I hope this helps, let me know if you have any questions.

Showing different output from my acceptation

Hay I didn't know even If this question has asked before but my problem is as following.
In my c# console application I had declared a variable i with assigning a value as
int i = 0 and now I want increment i by 2, obviously I can use following cede.
int i = o;
i += 2;
Console.WriteLine(i);
Console.ReadLine();
//OUTPUT WILL BE 2
but this one is my alternate solution. As my lazy behavior I refuse to use this code and I had used following code.
int i = 0;
i += i++;
Console.WriteLine(i);
Console.ReadLine();
In above code I had accepted FIRST i++ will increment by one and than after it will again increment by i+=i but this thing is not happen.!!!
I doesn't know why this thing is happening may be I had done something wrong or some compilation problem.?????
Can any one suggest me why this happening????
I just want to know why code no 2 is not working? what is happening in there?

The i++ returns the value of i (0) and then adds 1. i++ is called post-increment.
What you are after is ++i, which will first increase by one and then return the increased number.
(see http://msdn.microsoft.com/en-us/library/aa691363(v=vs.71).aspx for details about increment operators)

i needs to start off with 1 to make this work.
int i = 1;
EDIT::
int i = 0;
i += i++;
Your code above expresses the following:
i + 0 then add one
if you use i += ++i; then you'll get i + 1 as it processed the increment beforehand.

What your code is doing:
"i++" is evaluated. The value of "i++" is the value of i before the increment happens.
As part of the evaluation of "i++", i is incremented by one. Now i has the value of 1;
The assignment is executed. i is assigned the value of "i++", which is the value of i before the increment - that is, 0.
That is, "i = i++" roughly translates to
int oldValue = i;
i = i + 1
//is the same thing as
i = oldValue;
The post-increment operator increments the value of your integer "i" after the execution of your i++
For your information:
i++ will increment the value of i, but return the pre-incremented value.
++i will increment the value of i, and then return the incremented value.
so the best way of doing the 2 step increment is like that:
i +=2

How and where do i use the i++ operator in C#?

I was wondering if anyone could tell me about the i++ operator in C#.
I know it adds one to the int value, but I wouldn't have a clue where to use it, and if its only for loop statements, or can be used in general projects.
If I could get some practical examples of where you might use the i++ operator, thank you.

This is post-increment,which means that the increment will be done after the execution of the statement
int i=0;
Console.Write(i++); // still 0
Console.Write(i); // prints 1, it is incremented

In general, given this declaration:
var myNum = 0;
anywhere you would normally do this:
myNum += 1;
You could just do this:
myNum++;

What you are referring to is the C# post increment operator, common use case are :
Incrementing the counter variable in a standard for loop
Incrementing the counter variable in a while loop
Accessing an array in sequential order (3)
Example (3) :
int[] table = { 0, 1, 2, 3, 4, 5, 6 };
int i = 0;
int j = table[i++]; //Access the table array element 0
int k = table[i++]; //Access the table array element 1
int l = table[i++]; //Access the table array element 2
So, whats the post increment operator really does ?
It return the value of the variable and then increment it's value by 1 unit .

The expression i++ is just like x == y > 0 ? x : z which both are syntactic sugar.
x = i++ saves you the space of writing x=i; i=i+1;
Is it good or bad? There is really no exact answer to this.
I personally avoid these expressions as they make my code look complex, sometimes hard to debug. Always think code readability instead of write-ability, always think how your code looks readable instead of how easy is it to write it

Off By One errors and Mutation Testing

In the process of writing an "Off By One" mutation tester for my favourite mutation testing framework (NinjaTurtles), I wrote the following code to provide an opportunity to check the correctness of my implementation:
public int SumTo(int max)
{
int sum = 0;
for (var i = 1; i <= max; i++)
{
sum += i;
}
return sum;
}
now this seems simple enough, and it didn't strike me that there would be a problem trying to mutate all the literal integer constants in the IL. After all, there are only 3 (the 0, the 1, and the ++).
WRONG!
It became very obvious on the first run that it was never going to work in this particular instance. Why? Because changing the code to
public int SumTo(int max)
{
int sum = 0;
for (var i = 0; i <= max; i++)
{
sum += i;
}
return sum;
}
only adds 0 (zero) to the sum, and this obviously has no effect. Different story if it was the multiple set, but in this instance it was not.
Now there's a fairly easy algorithm for working out the sum of integers
sum = max * (max + 1) / 2;
which I could have fail the mutations easily, since adding or subtracting 1 from either of the constants there will result in an error. (given that max >= 0)
So, problem solved for this particular case. Although it did not do what I wanted for the test of the mutation, which was to check what would happen when I lost the ++ - effectively an infinite loop. But that's another problem.
So - My Question: Are there any trivial or non-trivial cases where a loop starting from 0 or 1 may result in a "mutation off by one" test failure that cannot be refactored (code under test or test) in a similar way? (examples please)
Note: Mutation tests fail when the test suite passes after a mutation has been applied.
Update: an example of something less trivial, but something that could still have the test refactored so that it failed would be the following
public int SumArray(int[] array)
{
int sum = 0;
for (var i = 0; i < array.Length; i++)
{
sum += array[i];
}
return sum;
}
Mutation testing against this code would fail when changing the var i=0 to var i=1 if the test input you gave it was new[] {0,1,2,3,4,5,6,7,8,9}. However change the test input to new[] {9,8,7,6,5,4,3,2,1,0}, and the mutation testing will fail. So a successful refactor proves the testing.

I think with this particular method, there are two choices. You either admit that it's not suitable for mutation testing because of this mathematical anomaly, or you try to write it in a way that makes it safe for mutation testing, either by refactoring to the form you give, or some other way (possibly recursive?).
Your question really boils down to this: is there a real life situation where we care about whether the element 0 is included in or excluded from the operation of a loop, and for which we cannot write a test around that specific aspect? My instinct is to say no.
Your trivial example may be an example of lack of what I referred to as test-drivenness in my blog, writing about NinjaTurtles. Meaning in the case that you have not refactored this method as far as you should.

One natural case of "mutation test failure" is an algorithm for matrix transposition. To make it more suitable for a single for-loop, add some constraints to this task: let the matrix be non-square and require transposition to be in-place. These constraints make one-dimensional array most suitable place to store the matrix and a for-loop (starting, usually, from index '1') may be used to process it. If you start it from index '0', nothing changes, because top-left element of the matrix always transposes to itself.
For an example of such code, see answer to other question (not in C#, sorry).
Here "mutation off by one" test fails, refactoring the test does not change it. I don't know if the code itself may be refactored to avoid this. In theory it may be possible, but should be too difficult.
The code snippet I referenced earlier is not a perfect example. It still may be refactored if the for loop is substituted by two nested loops (as if for rows and columns) and then these rows and columns are recalculated back to one-dimensional index. Still it gives an idea how to make some algorithm, which cannot be refactored (though not very meaningful).
Iterate through an array of positive integers in the order of increasing indexes, for each index compute its pair as i + i % a[i], and if it's not outside the bounds, swap these elements:
for (var i = 1; i < a.Length; i++)
{
var j = i + i % a[i];
if (j < a.Length)
Swap(a[i], a[j]);
}
Here again a[0] is "unmovable", refactoring the test does not change this, and refactoring the code itself is practically impossible.
One more "meaningful" example. Let's implement an implicit Binary Heap. It is usually placed to some array, starting from index '1' (this simplifies many Binary Heap computations, compared to starting from index '0'). Now implement a copy method for this heap. "Off-by-one" problem in this copy method is undetectable because index zero is unused and C# zero-initializes all arrays. This is similar to OP's array summation, but cannot be refactored.
Strictly speaking, you can refactor the whole class and start everything from '0'. But changing only 'copy' method or the test does not prevent "mutation off by one" test failure. Binary Heap class may be treated just as a motivation to copy an array with unused first element.
int[] dst = new int[src.Length];
for (var i = 1; i < src.Length; i++)
{
dst[i] = src[i];
}

Yes, there are many, assuming I have understood your question.
One similar to your case is:
public int MultiplyTo(int max)
{
int product = 1;
for (var i = 1; i <= max; i++)
{
product *= i;
}
return product;
}
Here, if it starts from 0, the result will be 0, but if it starts from 1 the result should be correct. (Although it won't tell the difference between 1 and 2!).

Not quite sure what you are looking for exactly, but it seems to me that if you change/mutate the initial value of sum from 0 to 1, you should fail the test:
public int SumTo(int max)
{
int sum = 1; // Now we are off-by-one from the beginning!
for (var i = 0; i <= max; i++)
{
sum += i;
}
return sum;
}
Update based on comments:
The loop will only not fail after mutation when the loop invariant is violated in the processing of index 0 (or in the absence of it). Most such special cases can be refactored out of the loop, but consider a summation of 1/x:
for (var i = 1; i <= max; i++) {
sum += 1/i;
}
This works fine, but if you mutate the initial bundary from 1 to 0, the test will fail as 1/0 is invalid operation.

which approach is good while using strings [duplicate]

I have always wondered if, in general, declaring a throw-away variable before a loop, as opposed to repeatedly inside the loop, makes any (performance) difference?
A (quite pointless) example in Java:
a) declaration before loop:
double intermediateResult;
for(int i=0; i < 1000; i++){
intermediateResult = i;
System.out.println(intermediateResult);
}
b) declaration (repeatedly) inside loop:
for(int i=0; i < 1000; i++){
double intermediateResult = i;
System.out.println(intermediateResult);
}
Which one is better, a or b?
I suspect that repeated variable declaration (example b) creates more overhead in theory, but that compilers are smart enough so that it doesn't matter. Example b has the advantage of being more compact and limiting the scope of the variable to where it is used. Still, I tend to code according example a.
Edit: I am especially interested in the Java case.

Which is better, a or b?
From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).
From a maintenance perspective, b is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.

Well I ran your A and B examples 20 times each, looping 100 million times.(JVM - 1.5.0)
A: average execution time: .074 sec
B: average execution time : .067 sec
To my surprise B was slightly faster.
As fast as computers are now its hard to say if you could accurately measure this.
I would code it the A way as well but I would say it doesn't really matter.

It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or lambda expression in C# 3) it can make a very signficant difference.
Example:
using System;
using System.Collections.Generic;
class Test
{
static void Main()
{
List<Action> actions = new List<Action>();
int outer;
for (int i=0; i < 10; i++)
{
outer = i;
int inner = i;
actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
}
foreach (Action action in actions)
{
action();
}
}
}
Output:
Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9
The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.

The following is what I wrote and compiled in .NET.
double r0;
for (int i = 0; i < 1000; i++) {
r0 = i*i;
Console.WriteLine(r0);
}
for (int j = 0; j < 1000; j++) {
double r1 = j*j;
Console.WriteLine(r1);
}
This is what I get from .NET Reflector when CIL is rendered back into code.
for (int i = 0; i < 0x3e8; i++)
{
double r0 = i * i;
Console.WriteLine(r0);
}
for (int j = 0; j < 0x3e8; j++)
{
double r1 = j * j;
Console.WriteLine(r1);
}
So both look exactly same after compilation. In managed languages code is converted into CL/byte code and at time of execution it's converted into machine language. So in machine language a double may not even be created on the stack. It may just be a register as code reflect that it is a temporary variable for WriteLine function. There are a whole set optimization rules just for loops. So the average guy shouldn't be worried about it, especially in managed languages. There are cases when you can optimize manage code, for example, if you have to concatenate a large number of strings using just string a; a+=anotherstring[i] vs using StringBuilder. There is very big difference in performance between both. There are a lot of such cases where the compiler cannot optimize your code, because it cannot figure out what is intended in a bigger scope. But it can pretty much optimize basic things for you.

This is a gotcha in VB.NET. The Visual Basic result won't reinitialize the variable in this example:
For i as Integer = 1 to 100
Dim j as Integer
Console.WriteLine(j)
j = i
Next
' Output: 0 1 2 3 4...
This will print 0 the first time (Visual Basic variables have default values when declared!) but i each time after that.
If you add a = 0, though, you get what you might expect:
For i as Integer = 1 to 100
Dim j as Integer = 0
Console.WriteLine(j)
j = i
Next
'Output: 0 0 0 0 0...

I made a simple test:
int b;
for (int i = 0; i < 10; i++) {
b = i;
}
vs
for (int i = 0; i < 10; i++) {
int b = i;
}
I compiled these codes with gcc - 5.2.0. And then I disassembled the main ()
of these two codes and that's the result:
1º:
0x00000000004004b6 <+0>: push rbp
0x00000000004004b7 <+1>: mov rbp,rsp
0x00000000004004ba <+4>: mov DWORD PTR [rbp-0x4],0x0
0x00000000004004c1 <+11>: jmp 0x4004cd <main+23>
0x00000000004004c3 <+13>: mov eax,DWORD PTR [rbp-0x4]
0x00000000004004c6 <+16>: mov DWORD PTR [rbp-0x8],eax
0x00000000004004c9 <+19>: add DWORD PTR [rbp-0x4],0x1
0x00000000004004cd <+23>: cmp DWORD PTR [rbp-0x4],0x9
0x00000000004004d1 <+27>: jle 0x4004c3 <main+13>
0x00000000004004d3 <+29>: mov eax,0x0
0x00000000004004d8 <+34>: pop rbp
0x00000000004004d9 <+35>: ret
vs
2º
0x00000000004004b6 <+0>: push rbp
0x00000000004004b7 <+1>: mov rbp,rsp
0x00000000004004ba <+4>: mov DWORD PTR [rbp-0x4],0x0
0x00000000004004c1 <+11>: jmp 0x4004cd <main+23>
0x00000000004004c3 <+13>: mov eax,DWORD PTR [rbp-0x4]
0x00000000004004c6 <+16>: mov DWORD PTR [rbp-0x8],eax
0x00000000004004c9 <+19>: add DWORD PTR [rbp-0x4],0x1
0x00000000004004cd <+23>: cmp DWORD PTR [rbp-0x4],0x9
0x00000000004004d1 <+27>: jle 0x4004c3 <main+13>
0x00000000004004d3 <+29>: mov eax,0x0
0x00000000004004d8 <+34>: pop rbp
0x00000000004004d9 <+35>: ret
Which are exaclty the same asm result. isn't a proof that the two codes produce the same thing?

It is language dependent - IIRC C# optimises this, so there isn't any difference, but JavaScript (for example) will do the whole memory allocation shebang each time.

I would always use A (rather than relying on the compiler) and might also rewrite to:
for(int i=0, double intermediateResult=0; i<1000; i++){
intermediateResult = i;
System.out.println(intermediateResult);
}
This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.

In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.
Edit:
This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you,

I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.

As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.

A co-worker prefers the first form, telling it is an optimization, preferring to re-use a declaration.
I prefer the second one (and try to persuade my co-worker! ;-)), having read that:
It reduces scope of variables to where they are needed, which is a good thing.
Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.
Anyway, it falls in the category of premature optimization that rely in quality of compiler and/or JVM.

There is a difference in C# if you are using the variable in a lambda, etc. But in general the compiler will basically do the same thing, assuming the variable is only used within the loop.
Given that they are basically the same: Note that version b makes it much more obvious to readers that the variable isn't, and can't, be used after the loop. Additionally, version b is much more easily refactored. It is more difficult to extract the loop body into its own method in version a. Moreover, version b assures you that there is no side effect to such a refactoring.
Hence, version a annoys me to no end, because there's no benefit to it and it makes it much more difficult to reason about the code...

Well, you could always make a scope for that:
{ //Or if(true) if the language doesn't support making scopes like this
double intermediateResult;
for (int i=0; i<1000; i++) {
intermediateResult = i;
System.out.println(intermediateResult);
}
}
This way you only declare the variable once, and it'll die when you leave the loop.

I think it depends on the compiler and is hard to give a general answer.

I've always thought that if you declare your variables inside of your loop then you're wasting memory. If you have something like this:
for(;;) {
Object o = new Object();
}
Then not only does the object need to be created for each iteration, but there needs to be a new reference allocated for each object. It seems that if the garbage collector is slow then you'll have a bunch of dangling references that need to be cleaned up.
However, if you have this:
Object o;
for(;;) {
o = new Object();
}
Then you're only creating a single reference and assigning a new object to it each time. Sure, it might take a bit longer for it to go out of scope, but then there's only one dangling reference to deal with.

My practice is following:
if type of variable is simple (int, double, ...) I prefer variant b (inside).
Reason: reducing scope of variable.
if type of variable is not simple (some kind of class or struct) I prefer variant a (outside).
Reason: reducing number of ctor-dtor calls.

I had this very same question for a long time. So I tested an even simpler piece of code.
Conclusion: For such cases there is NO performance difference.
Outside loop case
int intermediateResult;
for(int i=0; i < 1000; i++){
intermediateResult = i+2;
System.out.println(intermediateResult);
}
Inside loop case
for(int i=0; i < 1000; i++){
int intermediateResult = i+2;
System.out.println(intermediateResult);
}
I checked the compiled file on IntelliJ's decompiler and for both cases, I got the same Test.class
for(int i = 0; i < 1000; ++i) {
int intermediateResult = i + 2;
System.out.println(intermediateResult);
}
I also disassembled code for both the case using the method given in this answer. I'll show only the parts relevant to the answer
Outside loop case
Code:
stack=2, locals=3, args_size=1
0: iconst_0
1: istore_2
2: iload_2
3: sipush 1000
6: if_icmpge 26
9: iload_2
10: iconst_2
11: iadd
12: istore_1
13: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
16: iload_1
17: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
20: iinc 2, 1
23: goto 2
26: return
LocalVariableTable:
Start Length Slot Name Signature
13 13 1 intermediateResult I
2 24 2 i I
0 27 0 args [Ljava/lang/String;
Inside loop case
Code:
stack=2, locals=3, args_size=1
0: iconst_0
1: istore_1
2: iload_1
3: sipush 1000
6: if_icmpge 26
9: iload_1
10: iconst_2
11: iadd
12: istore_2
13: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
16: iload_2
17: invokevirtual #3 // Method java/io/PrintStream.println:(I)V
20: iinc 1, 1
23: goto 2
26: return
LocalVariableTable:
Start Length Slot Name Signature
13 7 2 intermediateResult I
2 24 1 i I
0 27 0 args [Ljava/lang/String;
If you pay close attention, only the Slot assigned to i and intermediateResult in LocalVariableTable is swapped as a product of their order of appearance. The same difference in slot is reflected in other lines of code.
No extra operation is being performed
intermediateResult is still a local variable in both cases, so there is no difference access time.
BONUS
Compilers do a ton of optimization, take a look at what happens in this case.
Zero work case
for(int i=0; i < 1000; i++){
int intermediateResult = i;
System.out.println(intermediateResult);
}
Zero work decompiled
for(int i = 0; i < 1000; ++i) {
System.out.println(i);
}

From a performance perspective, outside is (much) better.
public static void outside() {
double intermediateResult;
for(int i=0; i < Integer.MAX_VALUE; i++){
intermediateResult = i;
}
}
public static void inside() {
for(int i=0; i < Integer.MAX_VALUE; i++){
double intermediateResult = i;
}
}
I executed both functions 1 billion times each.
outside() took 65 milliseconds. inside() took 1.5 seconds.

I tested for JS with Node 4.0.0 if anyone is interested. Declaring outside the loop resulted in a ~.5 ms performance improvement on average over 1000 trials with 100 million loop iterations per trial. So I'm gonna say go ahead and write it in the most readable / maintainable way which is B, imo. I would put my code in a fiddle, but I used the performance-now Node module. Here's the code:
var now = require("../node_modules/performance-now")
// declare vars inside loop
function varInside(){
for(var i = 0; i < 100000000; i++){
var temp = i;
var temp2 = i + 1;
var temp3 = i + 2;
}
}
// declare vars outside loop
function varOutside(){
var temp;
var temp2;
var temp3;
for(var i = 0; i < 100000000; i++){
temp = i
temp2 = i + 1
temp3 = i + 2
}
}
// for computing average execution times
var insideAvg = 0;
var outsideAvg = 0;
// run varInside a million times and average execution times
for(var i = 0; i < 1000; i++){
var start = now()
varInside()
var end = now()
insideAvg = (insideAvg + (end-start)) / 2
}
// run varOutside a million times and average execution times
for(var i = 0; i < 1000; i++){
var start = now()
varOutside()
var end = now()
outsideAvg = (outsideAvg + (end-start)) / 2
}
console.log('declared inside loop', insideAvg)
console.log('declared outside loop', outsideAvg)

A) is a safe bet than B).........Imagine if you are initializing structure in loop rather than 'int' or 'float' then what?
like
typedef struct loop_example{
JXTZ hi; // where JXTZ could be another type...say closed source lib
// you include in Makefile
}loop_example_struct;
//then....
int j = 0; // declare here or face c99 error if in loop - depends on compiler setting
for ( ;j++; )
{
loop_example loop_object; // guess the result in memory heap?
}
You are certainly bound to face problems with memory leaks!. Hence I believe 'A' is safer bet while 'B' is vulnerable to memory accumulation esp working close source libraries.You can check usinng 'Valgrind' Tool on Linux specifically sub tool 'Helgrind'.

It's an interesting question. From my experience there is an ultimate question to consider when you debate this matter for a code:
Is there any reason why the variable would need to be global?
It makes sense to only declare the variable once, globally, as opposed to many times locally, because it is better for organizing the code and requires less lines of code. However, if it only needs to be declared locally within one method, I would initialize it in that method so it is clear that the variable is exclusively relevant to that method. Be careful not to call this variable outside the method in which it is initialized if you choose the latter option--your code won't know what you're talking about and will report an error.
Also, as a side note, don't duplicate local variable names between different methods even if their purposes are near-identical; it just gets confusing.

this is the better form
double intermediateResult;
int i = byte.MinValue;
for(; i < 1000; i++)
{
intermediateResult = i;
System.out.println(intermediateResult);
}
1) in this way declared once time both variable, and not each for cycle.
2) the assignment it's fatser thean all other option.
3) So the bestpractice rule is any declaration outside the iteration for.

Tried the same thing in Go, and compared the compiler output using go tool compile -S with go 1.9.4
Zero difference, as per the assembler output.

I use (A) when I want to see the contents of the variable after exiting the loop. It only matters for debugging. I use (B) when I want the code more compact, since it saves one line of code.

Even if I know my compiler is smart enough, I won't like to rely on it, and will use the a) variant.
The b) variant makes sense to me only if you desperately need to make the intermediateResult unavailable after the loop body. But I can't imagine such desperate situation, anyway....
EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Expanding pre increment operator in programming languages [duplicate] - c#

Technically j = ++i is the same as j = (i = i + 1); and j = i++; is the same as j = i; i = i + 1; You can avoid writing this as two lines with this trick, though ++ doesn't do this. j = (i = i + 1) - 1;

++i and i++ both have identical results if written as a stand alone statement (as opposed to chaining it with other operations. That result is also identical to i += 1 and to i = i + 1. The difference only comes in if you start using the ++i and i++ inside a larger expression.

There is no 'equivalent' for i++. In i++ the value for i is incremented after the surrounding expression has been evaluated. In ++i and i+1 the value for i is incremented before the surrounding expression is evaluated.

Related

Sorting array using BubbleSort fails

Showing different output from my acceptation

How and where do i use the i++ operator in C#?

Off By One errors and Mutation Testing

which approach is good while using strings [duplicate]

Categories

Resources