C# immutable int - c#

In Java strings are immutable. If we have a string and make changes to it, we get new string referenced by the same variable:
String str = "abc";
str += "def"; // now str refers to another piece in the heap containing "abcdef"
// while "abc" is still somewhere in the heap until taken by GC
It's been said that int and double are immutable in C#. Does it mean that when we have int and later change it, we would get new int "pointed" by the same variable? Same thing but with stack.
int i = 1;
i += 1; // same thing: in the stack there is value 2 to which variable
// i is attached, and somewhere in the stack there is value 1
Is that correct? If not, in what way is int immutable?

To follow up on Marc's (perfectly acceptable) answer: Integer values are immutable but integer variables may vary. That's why they're called "variables".
Integer values are immutable: if you have the value that is the number 12, there's no way to make it odd, no way to paint it blue, and so on. If you try to make it odd by, say, adding one, then you end up with a different value, 13. Maybe you store that value in the variable that used to contain 12, but that doesn't change any property of 12. 12 stays exactly the same as it was before.

You haven't changed (and cannot change) something about the int; you have assigned a new int value (and discarded the old value). Thus it is immutable.
Consider a more complex struct:
var x = new FooStruct(123);
x.Value = 456; // mutate
x.SomeMethodThatChangedInternalState(); // mutate
x = new FooStruct(456); // **not** a mutate; this is a *reassignment*
However, there is no "pointing" here. The struct is directly on the stack (in this case): no references involved.

Although this may sound obvious I'll add a couple of lines that helped me to understand in case someone has the same confusion.
There are various kinds of mutability but generally when people say "immutable" they mean that class has members that cannot be changed.
String is probably storing characters in an array, which is not accessible and cannot be changed via methods, that's why string is immutable. Operation + on strings always returns a new string.
int is probably storing it's value in member "Value" which anyway is inaccessible, and that's why cannot be changed. All operations on int return new int, and this value is copied to variable, because it's a value type.

Related

How to handle null when overloading operator + for a class value object?

I want to have to have a value object that represents length. I would prefer to use a struct given that it is a value type, but since zero length does not make sense I am forced to use a class. Adding two lengths together seems like a reasonable operation, so I want to overload the + operator. I am curious though, how should I handle adding null?
Adding null to an existing string returns a string with the same content as the existing string. Adding null to a int? that has a value returns null.
I can see a case where adding nullto an existing length simply returns a new length with the same value as the existing length. At the same time, I can see a case where adding null would be considered a bug. I have been trying to find some guidance but have not been able to find any. Is there a common guideline for this or is it different for each application?
I would highly recommend using struct for your length, and treating the default representation as zero length.
since zero length does not make sense I am forced to use a class
It is up to your code to treat the default representation of length struct as a representation of some specific length. In addition to treating it as zero length, you have at least two options:
You can treat default length as an unknown, in which case any operation with it would produce an unknown, or
You can treat it as a "trap representation" of length, in which case any operation with it would produce an exception.
It is probably a design mistake to not treat zeros in a uniform way with all other numbers. Specifically, zero length may become handy when you subtract length values, because subtracting two values of equal length would have nothing to produce.
As far as "unknown" length is concerned, using struct gives you a convenient standard representation of Nullable<length> immediately familiar to users of your length structure.
Simple Answer:
if your allowed to add nulls in your system then you should probably keep the existing value and treat it like a 0 like so:
public static NullNumber operator+ (NullNumber b, NullNumber c) {
return (b ?? 0) + (c ?? 0);
}
Advanced Answer:
You are probably correct about length not making sense at 0 and you are right about adding nulls seems like a bug
I can't see where the field is populated but I suspect either:
you don't have a constructor that requires you to pass in a length if it's required.
Or you have a faulty class that sometimes has a length and sometimes meaning it sounds closer to 2 classes
Strictly speaking, a null length doesn't exist in reality, everything has length. Getting a null return or a NullReferenceException when working with your struct would lead me to think I messed up the constructor or instantiation. In other words, the null reference would be employed in the scope of the application and not exposed to the client.
struct length = new MyStruct(); //no!
struct length = new MyStruct(double feet, double inches) //better...
struct length = 34.5; //ok...

Why does incrementation take place after setter in this example? [duplicate]

I've seen them both being used in numerous pieces of C# code, and I'd like to know when to use i++ and when to use ++i?
(i being a number variable like int, float, double, etc).
The typical answer to this question, unfortunately posted here already, is that one does the increment "before" remaining operations and the other does the increment "after" remaining operations. Though that intuitively gets the idea across, that statement is on the face of it completely wrong. The sequence of events in time is extremely well-defined in C#, and it is emphatically not the case that the prefix (++var) and postfix (var++) versions of ++ do things in a different order with respect to other operations.
It is unsurprising that you'll see a lot of wrong answers to this question. A great many "teach yourself C#" books also get it wrong. Also, the way C# does it is different than how C does it. Many people reason as though C# and C are the same language; they are not. The design of the increment and decrement operators in C# in my opinion avoids the design flaws of these operators in C.
There are two questions that must be answered to determine what exactly the operation of prefix and postfix ++ are in C#. The first question is what is the result? and the second question is when does the side effect of the increment take place?
It is not obvious what the answer to either question is, but it is actually quite simple once you see it. Let me spell out for you precisely what x++ and ++x do for a variable x.
For the prefix form (++x):
x is evaluated to produce the variable
The value of the variable is copied to a temporary location
The temporary value is incremented to produce a new value (not overwriting the temporary!)
The new value is stored in the variable
The result of the operation is the new value (i.e. the incremented value of the temporary)
For the postfix form (x++):
x is evaluated to produce the variable
The value of the variable is copied to a temporary location
The temporary value is incremented to produce a new value (not overwriting the temporary!)
The new value is stored in the variable
The result of the operation is the value of the temporary
Some things to notice:
First, the order of events in time is exactly the same in both cases. Again, it is absolutely not the case that the order of events in time changes between prefix and postfix. It is entirely false to say that the evaluation happens before other evaluations or after other evaluations. The evaluations happen in exactly the same order in both cases as you can see by steps 1 through 4 being identical. The only difference is the last step - whether the result is the value of the temporary, or the new, incremented value.
You can easily demonstrate this with a simple C# console app:
public class Application
{
public static int currentValue = 0;
public static void Main()
{
Console.WriteLine("Test 1: ++x");
(++currentValue).TestMethod();
Console.WriteLine("\nTest 2: x++");
(currentValue++).TestMethod();
Console.WriteLine("\nTest 3: ++x");
(++currentValue).TestMethod();
Console.ReadKey();
}
}
public static class ExtensionMethods
{
public static void TestMethod(this int passedInValue)
{
Console.WriteLine($"Current:{Application.currentValue} Passed-in:{passedInValue}");
}
}
Here are the results...
Test 1: ++x
Current:1 Passed-in:1
Test 2: x++
Current:2 Passed-in:1
Test 3: ++x
Current:3 Passed-in:3
In the first test, you can see that both currentValue and what was passed into the TestMethod() extension show the same value, as expected.
However, in the second case, people will try to tell you that the increment of currentValue happens after the call to TestMethod(), but as you can see from the results, it happens before the call as indicated by the 'Current:2' result.
In this case, first the value of currentValue is stored in a temporary. Next, an incremented version of that value is stored back in currentValue but without touching the temporary which still stores the original value. Finally that temporary is passed to TestMethod(). If the increment happened after the call to TestMethod() then it would write out the same, non-incremented value twice, but it does not.
It's important to note that the value returned from both the currentValue++ and ++currentValue operations are based on the temporary and not the actual value stored in the variable at the time either operation exits.
Recall in the order of operations above, the first two steps copy the then-current value of the variable into the temporary. That is what's used to calculate the return value; in the case of the prefix version, it's that temporary value incremented while in the case of the suffix version, it's that value directly/non-incremented. The variable itself is not read again after the initial storage into the temporary.
Put more simply, the postfix version returns the value that was read from the variable (i.e. the value of the temporary) while the prefix version returns the value that was written back to the variable (i.e. the incremented value of the temporary). Neither return the variable's value.
This is important to understand because the variable itself could be volatile and have changed on another thread which means the return value of those operations could differ from the current value stored in the variable.
It is surprisingly common for people to get very confused about precedence, associativity, and the order in which side effects are executed, I suspect mostly because it is so confusing in C. C# has been carefully designed to be less confusing in all these regards. For some additional analysis of these issues, including me further demonstrating the falsity of the idea that prefix and postfix operations "move stuff around in time" see:
https://ericlippert.com/2009/08/10/precedence-vs-order-redux/
which led to this SO question:
int[] arr={0}; int value = arr[arr[0]++]; Value = 1?
You might also be interested in my previous articles on the subject:
https://ericlippert.com/2008/05/23/precedence-vs-associativity-vs-order/
and
https://ericlippert.com/2007/08/14/c-and-the-pit-of-despair/
and an interesting case where C makes it hard to reason about correctness:
https://learn.microsoft.com/archive/blogs/ericlippert/bad-recursion-revisited
Also, we run into similar subtle issues when considering other operations that have side effects, such as chained simple assignments:
https://learn.microsoft.com/archive/blogs/ericlippert/chaining-simple-assignments-is-not-so-simple
And here's an interesting post on why the increment operators result in values in C# rather than in variables:
Why can't I do ++i++ in C-like languages?
Oddly it looks like the other two answers don't spell it out, and it's definitely worth saying:
i++ means 'tell me the value of i, then increment'
++i means 'increment i, then tell me the value'
They are Pre-increment, post-increment operators. In both cases the variable is incremented, but if you were to take the value of both expressions in exactly the same cases, the result will differ.
If you have:
int i = 10;
int x = ++i;
then x will be 11.
But if you have:
int i = 10;
int x = i++;
then x will be 10.
Note as Eric points out, the increment occurs at the same time in both cases, but it's what value is given as the result that differs (thanks Eric!).
Generally, I like to use ++i unless there's a good reason not to. For example, when writing a loop, I like to use:
for (int i = 0; i < 10; ++i) {
}
Or, if I just need to increment a variable, I like to use:
++x;
Normally, one way or the other doesn't have much significance and comes down to coding style, but if you are using the operators inside other assignments (like in my original examples), it's important to be aware of potential side effects.
int i = 0;
Console.WriteLine(i++); // Prints 0. Then value of "i" becomes 1.
Console.WriteLine(--i); // Value of "i" becomes 0. Then prints 0.
Does this answer your question ?
The way the operator works is that it gets incremented at the same time, but if it is before a variable, the expression will evaluate with the incremented/decremented variable:
int x = 0; //x is 0
int y = ++x; //x is 1 and y is 1
If it is after the variable the current statement will get executed with the original variable, as if it had not yet been incremented/decremented:
int x = 0; //x is 0
int y = x++; //'y = x' is evaluated with x=0, but x is still incremented. So, x is 1, but y is 0
I agree with dcp in using pre-increment/decrement (++x) unless necessary. Really the only time I use the post-increment/decrement is in while loops or loops of that sort. These loops are the same:
while (x < 5) //evaluates conditional statement
{
//some code
++x; //increments x
}
or
while (x++ < 5) //evaluates conditional statement with x value before increment, and x is incremented
{
//some code
}
You can also do this while indexing arrays and such:
int i = 0;
int[] MyArray = new int[2];
MyArray[i++] = 1234; //sets array at index 0 to '1234' and i is incremented
MyArray[i] = 5678; //sets array at index 1 to '5678'
int temp = MyArray[--i]; //temp is 1234 (becasue of pre-decrement);
Etc, etc...
Just for the record, in C++, if you can use either (i.e.) you don't care about the ordering of operations (you just want to increment or decrement and use it later) the prefix operator is more efficient since it doesn't have to create a temporary copy of the object. Unfortunately, most people use posfix (var++) instead of prefix (++var), just because that is what we learned initially. (I was asked about this in an interview). Not sure if this is true in C#, but I assume it would be.
I think I'll try answering the question using code. Imagine the following methods for, say, int:
// The following are equivalent:
// ++i;
// PlusPlusInt(ref i);
//
// The argument "value" is passed as a reference,
// meaning we're not incrementing a copy.
static int PlusPlusInt(ref int value)
{
// Increment the value.
value = value + 1;
// Return the incremented value.
return value;
}
// The following are equivalent:
// i++;
// IntPlusPlus(ref i);
//
// The argument "value" is passed as a reference,
// meaning we're not incrementing a copy.
static int IntPlusPlus(ref int value)
{
// Keep the original value around before incrementing it.
int temp = value;
// Increment the value.
value = value + 1;
// Return what the value WAS.
return temp;
}
Assuming you know how ref works, this should clear this up really nicely. Explaining it in english is a lot more clunky in my opinion.

Why is StringBuilder faster than string operations, but a List<T> is faster than a LinkedList<T>?

So we are told that StringBuilder should be used when you are doing more than a few operations on a string (I've heard as low as three). Therefore we should replace this:
string s = "";
foreach (var item in items) // where items is IEnumerable<string>
s += item;
With this:
string s = new StringBuilder(items).ToString();
I assume that internally StringBuilder holds references to each Appended string, combining then on request. Lets compare this to the HybridDictionary, that uses a LinkedList for the first 10 elements, then swaps to a HashTable when the list grows more then 10. As we can see the same kind of pattern is here, small number of references = linkedList, else make ever increasing blocks of arrays.
Lets look at how a List works. Start off with a list size (internal default is 4). Add elements to the internal array, if the array is full, make a new array of double the size of the current array, copy the current array's elements across, then add the new element and make the new array the current array.
Can you see my confusion as to the performance benefits? For all elements besides strings, we make new arrays, copy old values and add the new value. But for strings that's bad? because we know that "a" + "b" makes a new string reference from the two old references, "a" and "b".
Hope my question isn't too confusing. Why does there seem to be a double standard between string concatenation and array concatenation (I know strings are arrays of chars)?
String: Making new references is bad!
T : where T != String: Making new references is good!
Edit: Maybe what I'm really asking here, is when does making new, bigger arrays and copying the old values across, start being faster than have references to randomly places objects all over the heap?
Double edit: By faster I mean reading, writing and finding variables, not inserting or removing (i.e. LinkedList would kickass at inserting for example, but I don't care about that).
Final edit: I don't care about StringBuilder, I'm interested in the trade off in time taken to copy data from one part of the heap to another for cache alignments, vs just taking the cache misses from teh cpu and have references all over the heap. When does one become faster then the other?*
Therefore we should replace this:
No you shouldn't. The first case you showed string concatenation that can take place at compile time and have replaced it with string concatenation that takes place a runtime. The former is much more desirable, and will execute faster than the latter.
It's important to use a string builder when the number of strings being concatted is not known at compile time. Often (but not always) this means concatting strings in a loop.
Earlier versions of String Builder (before 4.0, if memory serves), did internally look more or less like a List<char>, and it's correct that post 4.0 it looks more like a LinkedList<char[]>. However, the key difference here between using a StringBuilder and using regular string concatenation in a loop is not the difference between a linked list style in which objects contain references to the next object in the "chain" and an array-based style in which an internal buffer overallocates space and is reallocated occasionally as needed, but rather the difference between a mutable object and an immutable object. The problem with traditional string concatenation is that, since strings are immutable, each concatenation must copy all of the memory from both strings into a new string. When using a StringBuilder the new string only needs to be copied onto the end of some type of data structure, leaving all of the existing memory as it is. What type of data structure that is isn't terribly important here; we can rely on Microsoft to use a structure/algorithm that has been proven to have the best performance characteristics for the most common situations.
It seems to me that you are conflating the resizing of a list with the evaluation of a string expression, and assuming that the two should behave the same way.
Consider your example: string s = "a" + "b" + "c" + "d"
Assuming no optimisations of the constant expression (which the compiler would handle automatically), what this will do is evaluate each operation in turn:
string s = (("a" + "b") + "c") + "d"
This results in the strings "ab" and "abc" being created as part of that single expression. This has to happen, because strings in.NET are immutable, which means their values cannot be changed once created. This is because, if strings were mutable, you'd have code like this:
string a = "hello";
string b = a; // would assign b the same reference as a
string b += "world"; // would update the string it references
// now a == "helloworld"
If this were a List, the code would make more sense, and doesn't even need explanation:
var a = new List<int> { 1, 2, 3 };
var b = a;
b.Add(4);
// now a == { 1, 2, 3, 4 }
So the reason that non-string "list" types allocate extra memory early is for reasons of efficiency, and to reduce allocations when the list is extended. The reason that a string does not do that is because a string's value is never updated once created.
Your assumption about the operation of the StringBuilder is irrelevant, but the purpose of a StringBuilder is essentially to create a non-immutable object that reduces the overhead of multiple string operations.
The backing store of a StringBuilder is a char[] that gets resized as needed. Nothing is turned into a string until you invoke StringBuilder.ToString() on it.
The backing store of List<T> is a T[] that gets resized as needed.
The problem with something like
string s = a + b + c + d ;
is that the compiler parses it as
+
/ \
a +
/ \
b +
/ \
c d
and, unless it can see opportunities for optimization, do something like
string t1 = c + d ;
string t2 = b + t1 ;
string s = a + t2 ;
thus creating two temporaries and the final string. With a StringBuilder, though, it's going to build out the character array it needs and at the end create one string.
This is a win because strings, once created, are immutable (can't be changed) and are generally interned in the string pool (meaning that there is only ever one instance of the string...no matter how many time your create the string "abc", every instance will always be a reference to the same object in the string pool.
This adds cost to string creation as well: having determined the candidate string, the runtime has to check the string pool to see if it already exists. If it does, that reference is used; if it does not the candidate string is added to the string pool.
Your example, though:
string s = "a" + "b" + "c" + "d" ;
is a non-sequitur: the compile sees the constant expression and does an optimization called constant folding, so it becomes (even in debug mode):
string s = "abcd" ;
Similar optimizations happen with arithmetic expressions:
int x = 12 / 3 ;
is going to be optimized away to
int x = 4 ;

Datastructures, C#: ~O(1) lookup with range keys?

I have a dataset. This dataset will serve a lookup table. Given a number, I should be able to lookup a corresponding value for that number.
The dataset (let's say its CSV) has a few caveats though. Instead of:
1,ABC
2,XYZ
3,LMN
The numbers are ranges (- being "through", not minus):
1-3,ABC // 1, 2, and 3 = ABC
4-8,XYZ // 4, 5, 6, 7, 8 = XYZ
11-11,LMN // 11 = LMN
All the numbers are signed ints. No ranges overlap with another ranges. There are some gaps; there are ranges that aren't defined in the dataset (like 9 and 10 in the last snippet above).
`
How might I model this dataset in C# so that I have the most-performant lookup while keeping my in-memory footprint low?
The only option I've come up with suffers from overconsumption of memory. Let's say my dataset is:
1-2,ABC
4-6,XYZ
Then I create a Dictionary<int,string>() whose key/values are:
1/ABC
2/ABC
4/XYZ
5/XYZ
6/XYZ
Now I have hash performance-lookup, but tons of wasted space in the hash table.
Any ideas? Maybe just use PLINQ instead and hope for good performance? ;)
If your dictionary is going to truly store a wide range of key values, an approach that expands all possible ranges into explicit keys will rapidly consume more memory than you likely have available.
You're best option is to use a data structure that supports some variation of binary search (or other O(log N) lookup technique). Here's a link to a generic RangeDictionary for .NET that uses an OrderedList internally, and has O(log N) performance.
Achieving constant-time O(1) lookup requires that you expand all ranges into explicit keys. This requires both a lot of memory, and can actually degrade performance when you need to split or insert a new range. This probably isn't what you want.
You can create a doubly-indirected lookup:
Dictionary<int, int> keys;
Dictionary<int, string> values;
Then store the data like this:
keys.Add(1, 1);
keys.Add(2, 1);
keys.Add(3, 1);
//...
keys.Add(11, 3);
values.Add(1, "ABC");
//...
values.Add(3, "LMN");
And then look the data up:
return values[keys[3]]; //returns "ABC"
I'm not sure how much memory footprint this will save with trivial strings, but once you get beyond "ABC" it should help.
EDIT
After Dan Tao's comment below, I went back and checked on what he was asking about. The following code:
var abc = "ABC";
var def = "ABC";
Console.WriteLine(ReferenceEquals(abc, def));
will write "True" to the console. Which means that the either the compiler or the runtime (clarification?) is maintaining the reference to "ABC", and assigns it as the value of both variables.
After reading up some more on Interned strings, if you're using string literals to populate the dictionary, or Interning computed strings, it will in fact take more space to implement my suggestion than the original dictionary would have taken. If you're not using Interned strings, then my solution should take less space.
FINAL EDIT
If you're treating your strings correctly, there should be no excess memory usage from the original Dictionary<int, string> because you can assign them to a variable and then assign that reference as the value (or, if you need to, because you can Intern them)
Just make sure your assignment code includes an intermediate variable assignment:
while (thereAreStringsLeftToAssign)
{
var theString = theStringToAssign;
foreach (var i in range)
{
strings.Add(i, theString);
}
}
As arootbeer has mentioned in his answer, the following code does not create multiple instances of the string "ABC"; rather, it interns a single instance and assigns a reference to that instance to each KeyValuePair<int, string> in dictionary:
var dictionary = new Dictionary<int, string>();
dictionary[0] = "ABC";
dictionary[1] = "ABC";
dictionary[2] = "ABC";
// etc.
OK, so in the case of string literals, you're only using one string instance per range of keys. Is there a scenario where this wouldn't be the case--that is, where you would be using a separate string instance for each key within the range (this is what I assume you're concerned about when you speak of "overconsumption of memory")?
Honestly, I don't think so. There are scenarios where multiple equivalent string instances may be created without the benefit of interning, yes. But I can't imagine these scenarios would affect what you're trying to do here.
My reasoning is this: you want to assign certain values to different ranges of keys, right? So any time you are defining a key-range-value pairing of this sort, you have a single value and several keys. The single part is what leads me to doubt that you'll ever have multiple instances of the same string, unless it is defined as the value for more than one range.
To illustrate: yes, the following code will instantiate two identical strings:
string x = "ABC";
Console.Write("Type 'ABC' and press Enter: ");
string y = Console.ReadLine();
Console.WriteLine(Equals(x, y));
Console.WriteLine(ReferenceEquals(x, y));
The above program, assuming the user follows instructions and types "ABC," outputs True, then False. So you might think, "Ah, so when a string is only provided at run-time, it isn't interned! So this could be where my values could be duplicated!"
But... again: I don't think so. It all comes back to the fact that you are going to be assigning a single value to a range of keys. So let's say your values come from user input; then your code would look something like this:
var dictionary = new Dictionary<int, string>();
int start, count;
GetRange(out start, out count);
string value = GetValue();
foreach (int key in Enumerable.Range(start, count))
{
// Look, you're using the same string instance to assign
// to each key... how could it be otherwise?
dictionary[key] = value;
}
Now, if you were actually thinking more along the lines of what LBushkin mentions in his answer--that you may potentially have huge ranges, making it impractical to define a KeyValuePair<int, string> for each key within that range (e.g., if you have a range of 1-1000000)--then I would agree that you're best off with some sort of data structure that bases its lookup on a binary search. If that's more your scenario, say so and I will be happy to offer more ideas on that front. (Or you could just take a look at the link LBushkin already posted.)
Use a balanced ordered tree (or something similar) mapping start-of-range to end-of-range and data. This will be easy to implement for non-overlapping ranges.
arootbeer has a good solution, but one you may find confusing to work with.
Another choice is to use a reference type instead of a string, so that you point to the same reference
class StringContainer {
public string Value { get; set; }
}
Dictionary<int, StringContainer> values;
var value1 = new StringContainer { Value = "ABC" };
values.Add(1, value1);
values.Add(2, value1);
They will both point to the same instance of StringContainer
EDIT: Thanks for the comments everyone. This method handles value types other than string, so it might be useful for more than the given example. Also, it is my understanding that strings don't always behave in the manner you would expect from reference values, but I could be wrong.

What is the difference between i++ and ++i?

I've seen them both being used in numerous pieces of C# code, and I'd like to know when to use i++ and when to use ++i?
(i being a number variable like int, float, double, etc).
The typical answer to this question, unfortunately posted here already, is that one does the increment "before" remaining operations and the other does the increment "after" remaining operations. Though that intuitively gets the idea across, that statement is on the face of it completely wrong. The sequence of events in time is extremely well-defined in C#, and it is emphatically not the case that the prefix (++var) and postfix (var++) versions of ++ do things in a different order with respect to other operations.
It is unsurprising that you'll see a lot of wrong answers to this question. A great many "teach yourself C#" books also get it wrong. Also, the way C# does it is different than how C does it. Many people reason as though C# and C are the same language; they are not. The design of the increment and decrement operators in C# in my opinion avoids the design flaws of these operators in C.
There are two questions that must be answered to determine what exactly the operation of prefix and postfix ++ are in C#. The first question is what is the result? and the second question is when does the side effect of the increment take place?
It is not obvious what the answer to either question is, but it is actually quite simple once you see it. Let me spell out for you precisely what x++ and ++x do for a variable x.
For the prefix form (++x):
x is evaluated to produce the variable
The value of the variable is copied to a temporary location
The temporary value is incremented to produce a new value (not overwriting the temporary!)
The new value is stored in the variable
The result of the operation is the new value (i.e. the incremented value of the temporary)
For the postfix form (x++):
x is evaluated to produce the variable
The value of the variable is copied to a temporary location
The temporary value is incremented to produce a new value (not overwriting the temporary!)
The new value is stored in the variable
The result of the operation is the value of the temporary
Some things to notice:
First, the order of events in time is exactly the same in both cases. Again, it is absolutely not the case that the order of events in time changes between prefix and postfix. It is entirely false to say that the evaluation happens before other evaluations or after other evaluations. The evaluations happen in exactly the same order in both cases as you can see by steps 1 through 4 being identical. The only difference is the last step - whether the result is the value of the temporary, or the new, incremented value.
You can easily demonstrate this with a simple C# console app:
public class Application
{
public static int currentValue = 0;
public static void Main()
{
Console.WriteLine("Test 1: ++x");
(++currentValue).TestMethod();
Console.WriteLine("\nTest 2: x++");
(currentValue++).TestMethod();
Console.WriteLine("\nTest 3: ++x");
(++currentValue).TestMethod();
Console.ReadKey();
}
}
public static class ExtensionMethods
{
public static void TestMethod(this int passedInValue)
{
Console.WriteLine($"Current:{Application.currentValue} Passed-in:{passedInValue}");
}
}
Here are the results...
Test 1: ++x
Current:1 Passed-in:1
Test 2: x++
Current:2 Passed-in:1
Test 3: ++x
Current:3 Passed-in:3
In the first test, you can see that both currentValue and what was passed into the TestMethod() extension show the same value, as expected.
However, in the second case, people will try to tell you that the increment of currentValue happens after the call to TestMethod(), but as you can see from the results, it happens before the call as indicated by the 'Current:2' result.
In this case, first the value of currentValue is stored in a temporary. Next, an incremented version of that value is stored back in currentValue but without touching the temporary which still stores the original value. Finally that temporary is passed to TestMethod(). If the increment happened after the call to TestMethod() then it would write out the same, non-incremented value twice, but it does not.
It's important to note that the value returned from both the currentValue++ and ++currentValue operations are based on the temporary and not the actual value stored in the variable at the time either operation exits.
Recall in the order of operations above, the first two steps copy the then-current value of the variable into the temporary. That is what's used to calculate the return value; in the case of the prefix version, it's that temporary value incremented while in the case of the suffix version, it's that value directly/non-incremented. The variable itself is not read again after the initial storage into the temporary.
Put more simply, the postfix version returns the value that was read from the variable (i.e. the value of the temporary) while the prefix version returns the value that was written back to the variable (i.e. the incremented value of the temporary). Neither return the variable's value.
This is important to understand because the variable itself could be volatile and have changed on another thread which means the return value of those operations could differ from the current value stored in the variable.
It is surprisingly common for people to get very confused about precedence, associativity, and the order in which side effects are executed, I suspect mostly because it is so confusing in C. C# has been carefully designed to be less confusing in all these regards. For some additional analysis of these issues, including me further demonstrating the falsity of the idea that prefix and postfix operations "move stuff around in time" see:
https://ericlippert.com/2009/08/10/precedence-vs-order-redux/
which led to this SO question:
int[] arr={0}; int value = arr[arr[0]++]; Value = 1?
You might also be interested in my previous articles on the subject:
https://ericlippert.com/2008/05/23/precedence-vs-associativity-vs-order/
and
https://ericlippert.com/2007/08/14/c-and-the-pit-of-despair/
and an interesting case where C makes it hard to reason about correctness:
https://learn.microsoft.com/archive/blogs/ericlippert/bad-recursion-revisited
Also, we run into similar subtle issues when considering other operations that have side effects, such as chained simple assignments:
https://learn.microsoft.com/archive/blogs/ericlippert/chaining-simple-assignments-is-not-so-simple
And here's an interesting post on why the increment operators result in values in C# rather than in variables:
Why can't I do ++i++ in C-like languages?
Oddly it looks like the other two answers don't spell it out, and it's definitely worth saying:
i++ means 'tell me the value of i, then increment'
++i means 'increment i, then tell me the value'
They are Pre-increment, post-increment operators. In both cases the variable is incremented, but if you were to take the value of both expressions in exactly the same cases, the result will differ.
If you have:
int i = 10;
int x = ++i;
then x will be 11.
But if you have:
int i = 10;
int x = i++;
then x will be 10.
Note as Eric points out, the increment occurs at the same time in both cases, but it's what value is given as the result that differs (thanks Eric!).
Generally, I like to use ++i unless there's a good reason not to. For example, when writing a loop, I like to use:
for (int i = 0; i < 10; ++i) {
}
Or, if I just need to increment a variable, I like to use:
++x;
Normally, one way or the other doesn't have much significance and comes down to coding style, but if you are using the operators inside other assignments (like in my original examples), it's important to be aware of potential side effects.
int i = 0;
Console.WriteLine(i++); // Prints 0. Then value of "i" becomes 1.
Console.WriteLine(--i); // Value of "i" becomes 0. Then prints 0.
Does this answer your question ?
The way the operator works is that it gets incremented at the same time, but if it is before a variable, the expression will evaluate with the incremented/decremented variable:
int x = 0; //x is 0
int y = ++x; //x is 1 and y is 1
If it is after the variable the current statement will get executed with the original variable, as if it had not yet been incremented/decremented:
int x = 0; //x is 0
int y = x++; //'y = x' is evaluated with x=0, but x is still incremented. So, x is 1, but y is 0
I agree with dcp in using pre-increment/decrement (++x) unless necessary. Really the only time I use the post-increment/decrement is in while loops or loops of that sort. These loops are the same:
while (x < 5) //evaluates conditional statement
{
//some code
++x; //increments x
}
or
while (x++ < 5) //evaluates conditional statement with x value before increment, and x is incremented
{
//some code
}
You can also do this while indexing arrays and such:
int i = 0;
int[] MyArray = new int[2];
MyArray[i++] = 1234; //sets array at index 0 to '1234' and i is incremented
MyArray[i] = 5678; //sets array at index 1 to '5678'
int temp = MyArray[--i]; //temp is 1234 (becasue of pre-decrement);
Etc, etc...
Just for the record, in C++, if you can use either (i.e.) you don't care about the ordering of operations (you just want to increment or decrement and use it later) the prefix operator is more efficient since it doesn't have to create a temporary copy of the object. Unfortunately, most people use posfix (var++) instead of prefix (++var), just because that is what we learned initially. (I was asked about this in an interview). Not sure if this is true in C#, but I assume it would be.
I think I'll try answering the question using code. Imagine the following methods for, say, int:
// The following are equivalent:
// ++i;
// PlusPlusInt(ref i);
//
// The argument "value" is passed as a reference,
// meaning we're not incrementing a copy.
static int PlusPlusInt(ref int value)
{
// Increment the value.
value = value + 1;
// Return the incremented value.
return value;
}
// The following are equivalent:
// i++;
// IntPlusPlus(ref i);
//
// The argument "value" is passed as a reference,
// meaning we're not incrementing a copy.
static int IntPlusPlus(ref int value)
{
// Keep the original value around before incrementing it.
int temp = value;
// Increment the value.
value = value + 1;
// Return what the value WAS.
return temp;
}
Assuming you know how ref works, this should clear this up really nicely. Explaining it in english is a lot more clunky in my opinion.

Categories