For i = 0, why is (i += i++) equal to 0? - c#

Take the following code (usable as a Console Application):
static void Main(string[] args)
{
int i = 0;
i += i++;
Console.WriteLine(i);
Console.ReadLine();
}
The result of i is 0. I expected 2 (as some of my colleagues did). Probably the compiler creates some sort of structure that results in i being zero.
The reason I expected 2 is that, in my line of thought, the right-hand expression would be evaluated first, incrementing i by 1. Then it is added to i. Since i is already 1, it is adding 1 to 1. So 1 + 1 = 2. Obviously this is not what's happening.
Can you explain what the compiler does or what happens at runtime? Why is the result zero?
Some-sort-of-disclaimer: I'm absolutely aware you won't (and probably shouldn't) use this code. I know I never will. Nevertheless, I find it is interesting to know why it acts in such a way and what is happening exactly.

This:
int i = 0;
i += i++
Can be seen as doing the following (a gross oversimplification):
int i = 0;
i = i + i; // i = 0, because ++ is a postfix operator and hasn't taken effect yet
i + 1; // note that the result of this calculation is discarded
What actually happens is more involved than that - take a look at MSDN, 7.5.9 Postfix increment and decrement operators:
The run-time processing of a postfix increment or decrement operation of the form x++ or x-- consists of the following steps:
If x is classified as a variable:
x is evaluated to produce the variable.
The value of x is saved.
The selected operator is invoked with the saved value of x as its argument.
The value returned by the operator is stored in the location given by the evaluation of x.
The saved value of x becomes the result of the operation.
Note that due to order of precedence, the postfix ++ occurs before +=, but the result ends up being unused (as the previous value of i is used).
A more thorough decomposition of i += i++ to the parts it is made of requires one to know that both += and ++ are not atomic (that is, neither one is a single operation), even if they look like they are. The way these are implemented involve temporary variables, copies of i before the operations take place - one for each operation. (I will use the names iAdd and iAssign for the temporary variables used for ++ and += respectively).
So, a closer approximation to what is happening would be:
int i = 0;
int iAdd = i; // Copy of the current value of i, for ++
int iAssign = i; // Copy of the current value of i, for +=
i = i + 1; // i++ - Happens before += due to order of precedence
i = iAdd + iAssign;

Disassembly of the running code:
int i = 0;
xor edx, edx
mov dword ptr i, edx // set i = 0
i += i++;
mov eax, dword ptr i // set eax = i (=0)
mov dword ptr tempVar1, eax // set tempVar1 = eax (=0)
mov eax, dword ptr i // set eax = 0 ( again... why??? =\ )
mov dword ptr tempVar2, eax // set tempVar2 = eax (=0)
inc dword ptr i // set i = i+1 (=1)
mov eax, dword ptr tempVar1 // set eax = tempVar1 (=0)
add eax, dword ptr tempVar2 // set eax = eax+tempVar2 (=0)
mov dword ptr i, eax // set i = eax (=0)
Equivalent code
It compiles to the same code as the following code:
int i, tempVar1, tempVar2;
i = 0;
tempVar1 = i; // created due to postfix ++ operator
tempVar2 = i; // created due to += operator
++i;
i = tempVar1 + tempVar2;
Disassembly of the second code (just to prove they are the same)
int i, tempVar1, tempVar2;
i = 0;
xor edx, edx
mov dword ptr i, edx
tempVar1 = i; // created due to postfix ++ operator
mov eax, dword ptr i
mov dword ptr tempVar1, eax
tempVar2 = i; // created due to += operator
mov eax, dword ptr i
mov dword ptr tempVar2, eax
++i;
inc dword ptr i
i = tempVar1 + tempVar2;
mov eax, dword ptr tempVar1
add eax, dword ptr tempVar2
mov dword ptr i, eax
Opening disassembly window
Most people don't know, or don't remember, that they can see the final in-memory assembly code using the Visual Studio Disassembly window. It shows the machine code that is actually executed, not the CIL.
Use this while debugging:
Debug (menu) -> Windows (submenu) -> Disassembly
So what is happening with postfix ++?
The postfix ++ says that we'd like to increment the value of the operand after it has been evaluated... that everybody knows... what confuses a bit is the meaning of "after the evaluation".
So what does "after the evaluation" mean:
other usages of the operand on the same line of code are affected:
a = i++ + i: the second i is affected by the increment
Func(i++, i): the second i is affected
other usages on the same line respect short-circuiting operators like || and &&:
(false && i++ != i) || i == 0: the third i is not affected by i++, because it is not evaluated
(all three cases are demonstrated in the small program below)
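These cases are easy to check in a tiny console program (a sketch; Show is just an illustrative helper that prints its arguments):
using System;

class PostfixDemo
{
    // Show is just a helper to print the argument values as they were passed in
    static void Show(int first, int second) => Console.WriteLine(first + ", " + second);

    static void Main()
    {
        int i = 0;
        Console.WriteLine(i++ + i);            // prints 1: the second i already sees the increment

        i = 0;
        Show(i++, i);                          // prints "0, 1": the second argument sees the increment

        i = 0;
        bool b = (false && i++ != i) || i == 0;
        Console.WriteLine(b + ", i = " + i);   // prints "True, i = 0": the i++ was never evaluated
    }
}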
So what is the meaning of: i += i++;?
It is the same as i = i + i++;
The order of evaluation is:
Store i + i (that is 0 + 0)
Increment i (i becomes 1)
Assign the value of step 1 to i (i becomes 0)
Note that the increment is being discarded.
What is the meaning of: i = i++ + i;?
This is not the same as the previous example. The 3rd i is affected by the increment.
The order of evaluation is:
Store i (that is 0)
Increment i (i becomes 1)
Store value of step 1 + i (that is 0 + 1)
Assign the value of step 3 to i (i becomes 1)
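Both results are easy to verify with a couple of WriteLine calls (a minimal sketch, inside any Main):
int i = 0;
i += i++;
Console.WriteLine(i);   // 0

i = 0;
i = i++ + i;
Console.WriteLine(i);   // 1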

int i = 0;
i += i++;
is evaluated as follows:
Stack<int> stack = new Stack<int>();
int i;
// int i = 0;
stack.Push(0); // push 0
i = stack.Pop(); // pop 0 --> i == 0
// i += i++;
stack.Push(i); // push 0
stack.Push(i); // push 0
stack.Push(i); // push 0
stack.Push(1); // push 1
i = stack.Pop() + stack.Pop(); // pop 0 and 1 --> i == 1
i = stack.Pop() + stack.Pop(); // pop 0 and 0 --> i == 0
i.e. i is changed twice: once by the i++ expression and once by the += statement.
But the operands of the += statement are
the value of i before the evaluation of i++ (left-hand side of +=), and
the value returned by i++, which is also the value of i before the increment (right-hand side of +=).

First, i++ returns 0. Then i is incremented by 1. Lastly, i is set to the initial value of i, which is 0, plus the value i++ returned, which is zero too. 0 + 0 = 0.

This is simply left to right, bottom-up evaluation of the abstract syntax tree. Conceptually, the expression's tree is walked from top down, but the evaluation unfolds as the recursion pops back up the tree from the bottom.
// source code
i += i++;
// abstract syntax tree
+=
/ \
i ++ (post)
\
i
Evaluation begins by considering the root node +=. That is the major constituent of the expression. The left operand of += must be evaluated to determine the storage location of the variable, and to obtain the prior value, which is zero. Next, the right side must be evaluated.
The right side is a post-incrementing ++ operator. It has one operand, i, which is evaluated both as a source of a value and as a place where a value is to be stored. The operator evaluates i, finding 0, and consequently stores a 1 into that location. It returns the prior value, 0, in accordance with its semantics of returning the prior value.
Now control is back to the += operator. It now has all the info to complete its operation. It knows the place where to store the result (the storage location of i) as well as the prior value, and it has the value to be added to the prior value, namely 0. So, i ends up with zero.
Like Java, C# has sanitized a very asinine aspect of the C language by fixing the order of evaluation. Left-to-right, bottom-up: the most obvious order that is likely to be expected by coders.
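That fixed left-to-right order can be made visible by wrapping the operands in a method with a side effect (a sketch; Trace is just an illustrative logging helper):
using System;

class EvalOrder
{
    // prints the value as it is consumed, so the evaluation order becomes visible
    static int Trace(string label, int value)
    {
        Console.WriteLine(label + " evaluated as " + value);
        return value;
    }

    static void Main()
    {
        int i = 0;
        int result = Trace("left", i) + Trace("right", i++);
        Console.WriteLine("result = " + result + ", i = " + i);
        // output:
        // left evaluated as 0
        // right evaluated as 0
        // result = 0, i = 1
    }
}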

Because i++ first returns the value, then increments it. But after i is set to 1, you set it back to 0.

The post-increment operation behaves roughly like this method (the name PostIncrement is just a stand-in, since you can't literally write ++ as a C# method):
static int PostIncrement(ref int i)
{
    int c = i;
    i = i + 1;
    return c;
}
So basically, when you call i++, i is incremented but the original value is returned; in your case that original value is 0.
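Calling that stand-in by hand gives the same result as the original expression (a quick sketch):
int i = 0;
i = i + PostIncrement(ref i);   // the spirit of i += i++
Console.WriteLine(i);           // 0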

Simple answer
int i = 0;
i += i++;
// Translates to:
i = i + 0; // because the post-increment returns the current value of i, which is 0
// The increment to 1 does happen, before the assignment above is stored,
// but the assignment then overwrites it with the returned original value:
i = 0;

i++ means: return the value of i THEN increment it.
i += i++ means:
Take the current value of i.
Add the result of i++.
Now, let's add in i = 0 as a starting condition.
i += i++ is now evaluated like this:
What's the current value of i? It is 0. Store it so we can add the result of i++ to it.
Evaluate i++ (evaluates to 0 because that's the current value of i)
Load the stored value and add the result of step 2 to it. (add 0 to 0)
Note: At the end of step 2, the value of i is actually 1. However, in step 3, you discard it by loading the value of i before it was incremented.
As opposed to i++, ++i returns the incremented value.
Therefore, i += ++i would give you 1.
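Both variants side by side (a quick sketch to confirm):
int i = 0;
i += i++;
Console.WriteLine(i);   // 0

i = 0;
i += ++i;
Console.WriteLine(i);   // 1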

The postfix increment operator, ++, gives the expression the variable's current value and only then performs the increment. The assignment then stores that returned zero (0) back into i, overwriting the incremented value (1), so you get zero. You can read more about the increment operator in ++ Operator (MSDN).

i += i++; will equal zero, because it does the ++ afterwards.
i += ++i; will do it before

The ++ postfix evaluates i before incrementing it, and += only evaluates i once.
Therefore, 0 + 0 = 0, as i is evaluated and used before it is incremented, as the postfix format of ++ is used. To get i incremented first, use the prefix form (++i).
(Also, just a note: you should only get 1, as 0 + (0 + 1) = 1)
References: http://msdn.microsoft.com/en-us/library/sa7629ew.aspx (+=)
http://msdn.microsoft.com/en-us/library/36x43w8w.aspx (++)

What C# is doing, and the "why" of the confusion
I also expected the value to be 1... but some exploration on that matter did clarify some points.
Consider the following methods:
static int SetSum(ref int a, int b) { return a += b; }
static int Inc(ref int a) { return a++; }
I expected i += i++ to be the same as SetSum(ref i, Inc(ref i)). The value of i after this statement is 1:
int i = 0;
SetSum(ref i, Inc(ref i));
Console.WriteLine(i); // i is 1
But then I came to another conclusion... i += i++ is actually the same as i = i + i++... so I have created another similar example, using these functions:
static int Sum(int a, int b) { return a + b; }
static int Set(ref int a, int b) { return a = b; }
After calling this Set(ref i, Sum(i, Inc(ref i))) the value of i is 0:
int i = 0;
Set(ref i, Sum(i, Inc(ref i)));
Console.WriteLine(i); // i is 0
This not only explains what C# is doing... but also why a lot of people got confused with it... including me.

A good mnemonic I always remember about this is the following:
If ++ stands after the expression, it returns the value the expression had before. So in the following code
int a = 1;
int b = a++;
b is 1, because a was 1 before it got increased by the ++ standing after a. People call this postfix notation. There is also a prefix notation, where things are exactly the opposite: if ++ stands before, the expression returns the value after the operation:
int a = 1;
int b = ++a;
b is 2 here.
So for your code, this means
int i = 0;
i += (i++);
i++ returns 0 (as described above), so 0 + 0 = 0.
i += (++i); // Here 'i' would become 1 (the left-hand 0 plus the incremented 1)
Scott Meyers describes the difference between those two notations in "Effective C++". Internally, i++ (postfix) remembers the value i had, calls the prefix notation (++i), and returns the old value of i. This is why you should always use ++i in for loops (although I think all modern compilers translate i++ to ++i in for loops anyway).
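In C# you write a single ++ operator for a type and the compiler derives both the prefix and the postfix behaviour from it, but the old-value/new-value distinction is still observable (a sketch with an illustrative Counter struct):
using System;

struct Counter
{
    public readonly int Value;
    public Counter(int value) { Value = value; }

    // one operator definition serves both prefix and postfix forms
    public static Counter operator ++(Counter c) => new Counter(c.Value + 1);
}

class Program
{
    static void Main()
    {
        var a = new Counter(1);
        var postfix = a++;                 // a is now 2, but postfix still holds the old value
        Console.WriteLine(postfix.Value);  // 1

        var b = new Counter(1);
        var prefix = ++b;                  // b is now 2, and prefix holds the new value
        Console.WriteLine(prefix.Value);   // 2
    }
}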

The only answer to your question which is correct is: because it is undefined.
That i += i++; results in 0 is undefined - a bug in the language's evaluation mechanism if you will, or even worse, a bug in its design.
Want proof? Of course you do!
int t=0; int i=0; t+=i++; //t=0; i=1
Now this is an intuitive result, because we first evaluated t, assigned it a value, and only after the evaluation and assignment did the post operation happen - rational, isn't it?
Is it rational that i=i++ and i=i yield the same result for i,
while t=i++ and t=i have different results for i?
The post operation is something that should happen after the statement evaluation.
Therefore:
int i=0;
i+=i++;
Should be the same as if we wrote:
int i=0;
i = i + i ++;
and therefore the same as:
int i=0;
i= i + i;
i ++;
and therefore the same as:
int i=0;
i = i + i;
i = i + 1;
Any result which is not 1 indicates a bug in the compiler or in the language design, if we go by rational thinking - however, MSDN and many other sources tell us "hey - this is undefined!"
Now, before I continue, even this set of examples I gave is not supported or acknowledged by anyone; however, this is what, by intuitive and rational thinking, the result should have been.
The coder should not need any knowledge of how the assembly is written or translated!
If it is written in a manner that does not respect the language definitions - it is a bug!
And to finish I copied this from Wikipedia, Increment and decrement operators :
Since the increment/decrement operator modifies its operand, use of such an operand more than once within the same expression can produce undefined results. For example, in expressions such as x − ++x, it is not clear in what sequence the subtraction and increment operators should be performed. Situations like this are made even worse when optimizations are applied by the compiler, which could result in the order of execution of the operations to be different than what the programmer intended.
And therefore.
The correct answer is that this SHOULD NOT BE USED! (as it is UNDEFINED!)
Yes - it has unpredictable results, even if the C# compiler tries to normalize it somehow.
I did not find any C# documentation describing the behavior all of you documented as normal or well-defined behavior of the language. What I did find is the exact opposite!
[copied from the MSDN documentation for Postfix Increment and Decrement Operators: ++ and --]
When a postfix operator is applied to a function argument, the value of the argument is not guaranteed to be incremented or decremented before it is passed to the function. See section 1.9.17 in the C++ standard for more information.
Notice those words: not guaranteed...

The ++ operator after the variable makes it a postfix increment. The incrementing happens after everything else in the statement, the adding and assignment. If instead, you put the ++ before the variable, it would happen before i's value was evaluated, and give you the expected answer.

The steps in calculation are:
int i=0 //Initialized to 0
i+=i++ //Equation
i=i+i++ //after the compiler expands the compound assignment
i=0+i++ //i value substitution
i=0+0 //i++ is 0 as explained below
i=0 //Final result i=0
Here, initially the value of i is 0.
We know that i++ is nothing but: first use the value of i, and then increment i by 1. So
it uses the i value, 0, while calculating i++, and only then increments i by 1.
So it results in a value of 0.

Be very careful: read the C FAQ. What you're trying to do (mixing assignment and ++ of the same variable) is not only unspecified, it is undefined (meaning the compiler may do anything when evaluating it, not only give "reasonable" results).
Please read section 3. The whole section is well worth a read! Especially 3.9, which explains the implications of "unspecified". Section 3.3 gives a quick summary of what you can, and cannot, do with "i++" and the like.
Depending on the compiler's internals, you may get 0, or 2, or 1, or even anything else! And as it is undefined, it's OK for them to do so.

There are two possibilities.
If the compiler were to read the statement as
i++;
i+=i;
then the result would be 2.
Otherwise, if it read it as
i+=0;
i++;
the result would be 1.

There's a lot of excellent reasoning in the above answers; I just did a small test and want to share it with you:
int i = 0;
i += i++;
Here the resulting i is 0.
Now consider the cases below:
Case 1:
i = i++ + i; //Answer 1
Earlier I thought the question's code resembled this one, so at first glance the answer seemed to be 1 - and for this statement the answer really is 1.
Case 2:
i = i + i++; //Answer 0 - this is what the question's code expands to.
Here the increment's effect never makes it into the result, unlike the previous case, where i++ executes before the second operand is read.
I hope this helps a bit. Thanks

Hoping to answer this from a C programming 101 type of perspective.
Looks to me like it's happening in this order:
i is evaluated as 0, resulting in i = 0 + 0 with the increment operation i++ "queued", but the assignment of 0 to i hasn't happened yet either.
The increment i++ occurs
The assignment i = 0 from above happens, effectively overwriting anything that #2 (the post-increment) would've done.
Now, #2 may never actually happen (probably doesn't?) because the compiler likely realizes it will serve no purpose, but this could be compiler dependent. Either way, other, more knowledgeable answers have shown that the result is correct and conforms to the C# standard, but it's not defined what happens here for C/C++.
How and why is beyond my expertise, but the fact that the previously evaluated right-hand-side assignment happens after the post-increment is probably what's confusing here.
Further, you would not expect the result to be 2 regardless unless you did ++i instead of i++ I believe.

Simply put,
i++ will add 1 to "i" after the "+=" operator has completed.
What you want is ++i, so that it will add 1 to "i" before the "+=" operator is executed.

i += i++ works out like this:
i = 0
i += i    // still 0 + 0
i = i + 1 // the increment
i = 0;    // the stored sum then overwrites it
So before the 1 is added to i, i had the value 0, and that 0 is what finally gets stored. Only if we add the 1 first (the prefix form) does the increment survive:
i += ++i
i = 1

The answer is that i will be 1.
Let's have a look at how:
Initially i = 0;.
Then, while calculating i += i++;, substituting the value of i we get something like 0 += 0++;, so according to operator precedence 0 += 0 is performed first and the result is 0.
Then the increment operator is applied, as 0++, i.e. 0 + 1, and the value of i becomes 1.

Related

strange behavior of reverse loop in c# and c++

I just programmed a simple reverse loop like this:
for (unsigned int i = 50; i >= 0; i--)
printf("i = %d\n", i);
but it doesn't stop at 0 as expected; instead it goes far down into the negative values. Why?
See this ideone sample: http://ideone.com/kkixx8
(I tested it in c# and c++)
You declared the int as unsigned. It will always be >= 0. The only reason you see negative values is that your printf call interprets it as signed (%d) instead of unsigned (%u).
Although you did not ask for a solution, here are two common ways of fixing the problem:
// 1. The goes-to operator
for (unsigned int i = 51; i --> 0; )
printf("i = %d\n", i);
// 2. Waiting for overflow
for (unsigned int i = 50; i <= 50; i--)
printf("i = %d\n", i);
An unsigned int can never become negative.
In C# this code
for (uint i = 50; i >= 0; i--)
Console.WriteLine(i);
Produces following output:
50
...
7
6
5
4
3
2
1
0
4294967295
4294967294
4294967293
...
You are using an unsigned int. It can never be < 0. It just wraps around. You are seeing negative values because of the way you are formatting your output (interpreting it as a signed int).
The loop breaks when i would be less than zero. But i is unsigned, and it can never be less than zero.
in your for loop
for (unsigned int i = 50; i >= 0; i--)
printf("i = %d\n", i);
the value of i is decreased by 1 each iteration, and when i == 0 the decrement tries to assign
i-- , meaning i = -1
The -1 produced by that decrement is a signed integer (probably 32 bits in size) and will have the hexadecimal value 0xFFFFFFFF. The compiler generates code to move this signed integer into your unsigned integer i, which is also a 32-bit entity. The compiler assumes you only have a positive value on the right-hand side, so it simply moves all 32 bits into i. i now has the value 0xFFFFFFFF, which is 4294967295 if interpreted as a positive number. But the printf format of %d says the 32 bits are to be interpreted as a signed integer, so you get -1. If you had used %u it would have printed 4294967295.
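The same "same bits, different interpretation" effect can be reproduced in C# with an unchecked cast (a small sketch):
int signed = -1;
uint reinterpreted = unchecked((uint)signed);    // identical 32 bits, different interpretation
Console.WriteLine(signed);                       // -1
Console.WriteLine(reinterpreted);                // 4294967295
Console.WriteLine(reinterpreted.ToString("X"));  // FFFFFFFF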

Typedef for indexes in C# with static type checking without runtime overhead

It's a pretty common case to use multidimensional arrays with complicated indexing. It's really confusing and error-prone when all the indexes are ints, because you can easily mix up columns and rows (or whatever you have), and there's no way for the compiler to identify the problem. In fact there should be two types of indexes, rows and columns, but that isn't expressed at the type level.
Here's a small illustration of what I want:
var table = new int[RowsCount,ColumnsCount];
Row row = 5;
Column col = 10;
int value = table[row, col];
public int CalcSum(int[,] table, Column col)
{
    int sum = 0;
    for (Row r = 0; r < table.GetLength(0); r++)
    {
        sum += table[r, col];
    }
    return sum;
}
CalcSum(table, col); // OK
CalcSum(table, row); // Compile time error
Summing up:
indexes should be statically checked for mixing up (kind of type check)
important! they should be efficient at run time, since it's not OK for performance to wrap ints in custom objects containing the index and then unwrap them back
they should be implicitly convertible to ints in order to serve as indexes in native multidimensional arrays
Is there any way to achieve this? The perfect solution would be something like a typedef, which serves as a compile-time check only and compiles down to plain ints.
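The obvious candidate is a thin wrapper struct with implicit conversions, roughly like the sketch below (Row shown, Column would be identical); whether the JIT really makes this cost-free is exactly what I'm unsure about:
public struct Row
{
    private readonly int _value;
    private Row(int value) { _value = value; }

    public static implicit operator Row(int value) { return new Row(value); }
    public static implicit operator int(Row row) { return row._value; }
}

// Usage reads like an int, but user-defined conversions don't chain,
// so a Row can't silently end up where a Column is expected:
// Row row = 5;
// int raw = row;   // implicit back to int, e.g. for array indexing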
You'll only get a 2x slowdown with the x64 jitter. It generates interesting optimized code. The loop that uses the struct looks like this:
00000040 mov ecx,1
00000045 nop word ptr [rax+rax+00000000h]
00000050 lea eax,[rcx-1]
s.Idx = j;
00000053 mov dword ptr [rsp+30h],eax
00000057 mov dword ptr [rsp+30h],ecx
0000005b add ecx,2
for (int j = 0; j < 100000000; j++) {
0000005e cmp ecx,5F5E101h
00000064 jl 0000000000000050
This requires some annotation since the code is unusual. First off, the weird NOP at offset 45 is there to align the instruction at the start of the loop. That makes the branch at offset 64 faster. The instruction at 53 looks completely unnecessary. What you see happen here is loop unrolling, note how the instruction at 5b increments the loop counter by 2. The optimizer is however not smart enough to then also see that the store is unnecessary.
And most of all, note that there's no ADD instruction to be seen. In other words, the code doesn't actually calculate the value of "sum". That is because you are not using it anywhere after the loop; the optimizer can see that the calculation is useless and removed it entirely.
It does a much better job at the second loop:
000000af xor eax,eax
000000b1 add eax,4
for (int j = 0; j < 100000000; j++) {
000000b4 cmp eax,5F5E100h
000000b9 jl 00000000000000B1
It now entirely removed the "sum" calculation and the "i" variable assignment. It could have also removed the entire for() loop, but that's never done by the jitter optimizer; it assumes that the delay is intentional.
Hopefully the message is clear by now: avoid making assumptions from artificial benchmarks and only ever profile real code. You can make it more real by actually displaying the value of "sum" so the optimizer doesn't throw away the calculation. Add this line of code after the loops:
Console.Write("Sum = {0} ", sum);
And you'll now see that there's no difference anymore.

Why is my string.indexof(char) faster?

Don't ask how I got there, but I was playing around with some masking, loop unrolling etc. In any case, out of interest I was thinking about how I would implement an indexof method, and long story short, all that masking etc aside, this naive implementation:
public static unsafe int IndexOf16(string s, int startIndex, char c) {
if (startIndex < 0 || startIndex >= s.Length) throw new ArgumentOutOfRangeException("startIndex");
fixed (char* cs = s) {
for (int i = startIndex; i < s.Length; i++) {
if ((cs[i]) == c) return i;
}
return -1;
}
}
is faster than string.IndexOf(char). I wrote some simple tests, and it seems to match output exactly.
Some sample output numbers from my machine (it varies to some degree of course, but the trend is clear):
short haystack 500k runs
1741 ms for IndexOf16
2737 ms for IndexOf32
2963 ms for IndexOf64
2337 ms for string.IndexOf <-- built-in
longer haystack:
2888 ms for IndexOf16
3028 ms for IndexOf32
2816 ms for IndexOf64
3353 ms for string.IndexOf <-- built-in
IndexOfChar is marked extern, so you can't inspect it with Reflector. However, I think this should be the (native) implementation:
http://www.koders.com/cpp/fidAB4768BA4DF45482A7A2AA6F39DE9C272B25B8FE.aspx?s=IndexOfChar#L1000
They seem to use the same naive implementation.
Questions come to my mind:
1) Am I missing something in my implementation that explains why it's faster? I can only think of extended character support, but their implementation suggests they don't do anything special for that either.
2) I assumed many of the low-level methods would ultimately be implemented in hand-written assembler; that seems not to be the case. If so, why implement it natively at all, instead of just in C# like my sample implementation?
(Complete test here (I think its too long to paste here): http://paste2.org/p/1606018 )
(No this is not premature optimization, it's not for a project I am just messing about) :-)
Update: Thanks to Oliver for the hint about the null check and the count parameter. I have added these to my IndexOf16 implementation like so:
public static unsafe int IndexOf16(string s, int startIndex, char c, int count = -1) {
if (s == null) throw new ArgumentNullException("s");
if (startIndex < 0 || startIndex >= s.Length) throw new ArgumentOutOfRangeException("startIndex");
if (count == -1) count = s.Length - startIndex;
if (count < 0 || count > s.Length - startIndex) throw new ArgumentOutOfRangeException("count");
int endIndex = startIndex + count;
fixed (char* cs = s) {
for (int i = startIndex; i < endIndex; i++) {
if ((cs[i]) == c) return i;
}
return -1;
}
}
The numbers changed slightly, however it is still quite significantly faster (32/64 results omitted):
short haystack 500k runs
1908 ms for IndexOf16
2361 ms for string.IndexOf
longer haystack:
3061 ms for IndexOf16
3391 ms for string.IndexOf
Update2: This version is faster yet (especially for the long haystack case):
public static unsafe int IndexOf16(string s, int startIndex, char c, int count = -1) {
if (s == null) throw new ArgumentNullException("s");
if (startIndex < 0 || startIndex >= s.Length) throw new ArgumentOutOfRangeException("startIndex");
if (count == -1) count = s.Length - startIndex;
if (count < 0 || count > s.Length - startIndex) throw new ArgumentOutOfRangeException("count");
int endIndex = startIndex + count;
fixed (char* cs = s) {
char* cp = cs + startIndex;
for (int i = startIndex; i < endIndex; i++, cp++) {
if (*cp == c) return i;
}
return -1;
}
}
Update 4:
Based on the discussion with LastCoder I believe this to be architecture dependent. My Xeon W3550 at work seems to prefer this version, while his i7 seems to like the built-in version. My home machine (Athlon II) appears to be in between. I am surprised by the large difference though.
Possibility 1)
This may not hold in C#, but when I did optimization work for x86-64 assembler I quickly found out while benchmarking that calling code from a DLL (marked external) was slower than implementing the same exact function within my executable. The most obvious reason is paging and memory: the DLL (external) method is loaded far away in memory from the rest of the running code, and if it wasn't accessed previously it'll need to be paged in. Your benchmarking code should do some warm-up loops of the functions you are benchmarking to make sure they are paged into memory before you time them.
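In C# terms that just means timing only after a warm-up pass, roughly like this (a sketch; IndexOf16, haystack and needle stand for the method and test data from the question):
static void Benchmark(string haystack, char needle)
{
    // warm up both implementations so JIT compilation and paging happen before timing
    for (int w = 0; w < 1000; w++)
    {
        IndexOf16(haystack, 0, needle);
        haystack.IndexOf(needle);
    }

    var sw = System.Diagnostics.Stopwatch.StartNew();
    for (int run = 0; run < 500000; run++)
        IndexOf16(haystack, 0, needle);
    sw.Stop();
    Console.WriteLine("{0} ms for IndexOf16", sw.ElapsedMilliseconds);
}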
Possibility 2)
Microsoft tends not to optimize string functions to the fullest, so out-optimizing a native string length, substring, indexof etc. isn't really unheard of. Anecdote: in x86-64 assembler I was able to create a version of WinXP64's RtlInitUnicodeString function that ran 2x faster in almost all practical use cases.
Possibility 3) Your benchmarking code shows that you're using the 2-parameter overload of IndexOf; this likely calls the 3-parameter overload IndexOf(Char, Int32, Int32), which adds extra overhead to each iteration.
This may be even faster, because you're removing the i variable increment per iteration:
char* cp = cs + startIndex;
char* cpEnd = cs + endIndex;
while (cp < cpEnd) {
    if (*cp == c) return (int)(cp - cs);
    cp++;
}
edit In reply regarding (2) for your curiosity, coded back in 2005 and used to patch the ntdll.dll of my WinXP64 machine. http://board.flatassembler.net/topic.php?t=4467
RtlInitUnicodeString_Opt: ;;rcx=buff rdx=ucharstr 77bytes
xor r9d,r9d
test rdx,rdx
mov dword[rcx],r9d
mov [rcx+8],rdx
jz .end
mov r8,rdx
.scan:
mov eax,dword[rdx]
test ax,ax
jz .one
add rdx,4
shr eax,16
test ax,ax
jz .two
jmp .scan
.two:
add rdx,2
.one:
mov eax,0fffch
sub rdx,r8
cmp rdx,0fffeh
cmovnb rdx,rax
mov [ecx],dx
add dx,2
mov [ecx+2],dx
ret
.end:
retn
edit 2 Running your example code (updated with your fastest version) the string.IndexOf runs faster on my Intel i7, 4GB RAM, Win7 64bit.
short haystack 500k runs
2590 ms for IndexOf16
2287 ms for string.IndexOf
longer haystack:
3549 ms for IndexOf16
2757 ms for string.IndexOf
Optimizations are sometimes very architecture reliant.
If you really do such a micro measurement, every single bit counts. Within the MS implementation (as seen in the link you provided) they also check whether s is null and throw an ArgumentNullException. Also, theirs is the implementation that includes the count parameter, so they additionally check whether count has a correct value and throw an ArgumentOutOfRangeException.
I think these little checks to make the code more robust are enough to make it a little bit slower if you call it so often in such a short time.
This might have something to do with the "fixed" statement, as "It pins the location of the src and dst objects in memory so that they will not be moved by garbage collection." - perhaps speeding up the methods?
Also "Unsafe code increases the performance by getting rid of array bounds checks." this could also be why.
Above comments taken from MSDN

In C#, Is it slower to reference an array variable?

I've got an array of integers, and I'm looping through them:
for (int i = 0; i < data.Length; i++)
{
// do a lot of stuff here using data[i]
}
If I do:
for (int i = 0; i < data.Length; i++)
{
int value = data[i];
// do a lot of stuff with value instead of data[i]
}
Is there any performance gain/loss?
From my understanding, C/C++ array elements are accessed directly, i.e. an n-element array of integers has a contiguous memory block of length n * sizeof(int), and the program access element i by doing something like *data[i] = *data[0] + (i * sizeof(int)). (Please excuse my abuse of notation, but you get what I mean.)
So this means C/C++ should have no performance gain/loss for referencing array variables.
What about C#?
C# has a bunch of extra overhead like data.Length, data.IsSynchronized, data.GetLowerBound(), data.GetEnumerator().
Clearly, a C# array is not the same as a C/C++ array.
So what's the verdict? Should I store int value = data[i] and work with value, or is there no performance impact?
You can have the cake and eat it too. There are many cases where the jitter optimizer can easily determine that an array indexing access is safe and doesn't need to be checked. Any for-loop like the one in your question is one such case: the jitter knows the range of the index variable and knows that checking it again is pointless.
The only way you can see that is from the generated machine code. I'll give an annotated example:
static void Main(string[] args) {
int[] array = new int[] { 0, 1, 2, 3 };
for (int ix = 0; ix < array.Length; ++ix) {
int value = array[ix];
Console.WriteLine(value);
}
}
Starting at the for loop, ebx has the pointer to the array:
for (int ix = 0; ix < array.Length; ++ix) {
00000037 xor esi,esi ; ix = 0
00000039 cmp dword ptr [ebx+4],0 ; array.Length < 0 ?
0000003d jle 0000005A ; skip everything
int value = array[ix];
0000003f mov edi,dword ptr [ebx+esi*4+8] ; NO BOUNDS CHECK !!!
Console.WriteLine(value);
00000043 call 6DD5BE38 ; Console.Out
00000048 mov ecx,eax ; arg = Out
0000004a mov edx,edi ; arg = value
0000004c mov eax,dword ptr [ecx] ; call WriteLine()
0000004e call dword ptr [eax+000000BCh]
for (int ix = 0; ix < array.Length; ++ix) {
00000054 inc esi ; ++ix
00000055 cmp dword ptr [ebx+4],esi ; array.Length > ix ?
00000058 jg 0000003F ; loop
The array indexing happens at address 00003f, ebx has the array pointer, esi is the index, 8 is the offset of the array elements in the object. Note how the esi value is not checked again against the array bounds. This runs just as fast as the code generated by a C compiler.
Yes, there is a performance loss due to the bounds check for every access to the array.
No, you most likely don't need to worry about it.
Yes, you should store the value and work with the value. No, this isn't because of the performance issue, but rather because it makes the code more readable (IMHO).
By the way, the JIT compiler might optimize out redundant checks, so it doesn't mean you'll actually get a check on every call. Either way, it's probably not worth your time to worry about it; just use it, and if it turns out to be a bottleneck you can always go back and use unsafe blocks.
You have written it both ways. Run it both ways, measure it. Then you'll know.
But I think you would prefer working with the copy rather than always working with the array element directly, simply because it's easier to write the code that way, particularly if you have lots of operations involving that particular value.
The compiler can only perform common subexpression elimination here if it can prove that the array isn't accessed by other threads or by any methods (including delegates) called inside the loop, so it might be better to create the local copy yourself.
But readability should be your main concern, unless this loop executes a huge number of times.
All of this is also true in C and C++ -- indexing into an array will be slower than accessing a local variable.
As a side note, your suggested variable name is no good: value is a contextual keyword; choose a different variable name.
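For example, the local-copy version with a non-conflicting name might look like this (a sketch; DoStuff stands in for whatever work the loop body does):
for (int i = 0; i < data.Length; i++)
{
    int current = data[i];   // read the element once
    DoStuff(current);        // if DoStuff might touch data, the JIT can't safely hoist this itself
    DoStuff(current * 2);
}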
Not really sure, but it probably wouldn't hurt to store the value if you are going to use it multiple times. You could also use a foreach statement :)

how to loop through the digits of a binary number?

I have a binary number 1011011, how can I loop through all these binary digits one after the other ?
I know how to do this for decimal integers by using modulo and division.
int n = 0x5b; // 1011011
Really you should just do this; hexadecimal is in general a much better representation:
printf("%x", n); // this prints "5b"
To get it in binary (with emphasis on easy understanding), try something like this:
printf("%s", "0b"); // common prefix to denote that binary follows
bool leading = true; // we're still skipping leading zeroes
// starting with the most significant bit down to the least
for (int i = sizeof(n) * CHAR_BIT - 1; i >= 0; --i) {
    int bit = (n >> i) & 1;
    if (bit) leading = false; // the first 1 bit ends the leading zeroes
    if (!leading)
        printf("%d", bit);
}
if (leading) // n was all zeroes, so just print a single 0
    printf("0");
// at this point, for n = 0x5b, we'll have printed 0b1011011
You can use modulo and division by 2 exactly like you would in base 10. You can also use binary operators, but if you already know how to do that in base 10, it would be easier if you just used division and modulo
Expanding on Frédéric and Gabi's answers, all you need to do is realise that the rules in base 2 are no different from those in base 10 - you just do your division and modulus with a divisor of 2 instead of 10.
The next step is simply to use number >> 1 instead of number / 2 and number & 0x1 instead of number % 2 to improve performance. Mind you, with modern optimising compilers there's probably no difference...
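As a concrete C# illustration of the divide/modulo version (a small sketch):
using System;
using System.Collections.Generic;

class BinaryDigits
{
    static void Main()
    {
        int n = 0x5b;                     // 1011011
        var bits = new List<int>();
        if (n == 0) bits.Add(0);
        while (n != 0)
        {
            bits.Add(n % 2);              // plays the role of n & 1
            n /= 2;                       // plays the role of n >> 1
        }
        bits.Reverse();                   // digits were produced least-significant first
        Console.WriteLine(string.Join("", bits));   // prints 1011011
    }
}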
Use an AND with increasing powers of two...
In C, at least, you can do something like:
while (val != 0)
{
printf("%d", val&0x1);
val = val>>1;
}
To expand on #Marco's answer with an example:
uint value = 0x82fa9281;
for (int i = 0; i < 32; i++)
{
bool set = (value & 0x1) != 0;
value >>= 1;
Console.WriteLine("Bit set: {0}", set);
}
What this does is test the last bit, and then shift everything one bit.
If you're already starting with a string, you could just iterate through each of the characters in the string:
var values = "1011011".Reverse().ToCharArray();
for(var index = 0; index < values.Length; index++) {
var isSet = (Boolean)Int32.Parse(values[index]); // Boolean.Parse only works on "true"/"false", not 0/1
// do whatever
}
byte input = Convert.ToByte("1011011", 2);
BitArray arr = new BitArray(new[] { input });
foreach (bool value in arr)
{
// ...
}
You can simply loop through every bit. The following C-like code checks each bit number in turn. (You might also want to google endianness.)
for (int bitnumber = 0; bitnumber < 8; bitnumber++)
{
    printf("%d", (val & (1 << bitnumber)) ? 1 : 0);
}
The code basically writes 1 if the bit is set or 0 if not. We shift the value 1 (which in binary is 1 ;) ) left by bitnumber positions and then AND it with val to see whether that bit is set. Simple as that!
So if bitnumber is 2, we simply do this:
00000100 (the value 1 shifted left by 2)
AND
10110110 (whatever your value is)
=
00000100 = true! Both values have bit 2 set!
