int number = 1;
The value of number is 1 because it is a value type
What is the actual value of the pointer that is assigned to reference type variables?
Is it an int or string? Or is it some bits? What would it look like if you write it out? Is it possible to assign a reference to a variable using that value?
Question harrysQuestion = new Question();
harrysQuestion is just a pointer or reference to the new Question. So what is the value of that pointer? The same value that is assigned to another Question variable if I do this:
Question harrysQuestionAgain = harrysQuestion;
Is it a number that points to some position in my computer's memory? Is it an actual C# value variable behind the scenes?
Is it a number that points to some position in my computer's memory?
Conceptually, references and pointers are separate but related. In reality they are virtually interchangeable, with the distinction that the GC knows how to walk and fix up references (during garbage collection etc.), but not pointers (and there are other details about how fixed works, via a hack in the value that allows a reference found on the stack to be interpreted as "pinned" cheaply). In practice, they are so close to each other in all implementations (for performance reasons) that you can think of them as more or less the same.
It is very rare that you'd actually want to get the "value" of a reference (rather than dereferencing it), and unless you pin the object first you need to be very careful about doing so as the address can change (and the pointer version will not be corrected). The need for this use-case actually increases slightly with the upcoming "pipelines" work, so the corefxlab / myget version of the Unsafe utility type actually provides some methods to facilitate the exchange of references / pointers (including interior pointers/references into objects), but: unless you're doing something low level you'll probably never need that.
Per request (comments): I mentioned "pinning" and "fixed" - the problem here is that .NET has a "compacting" garbage collector, which is allowed to move objects around at runtime, as long as it promises to fix all the references and make sure that you never notice this from managed code. What it doesn't promise is to fix pointers. So: if you're going to be looking at any object as a pointer, you need to tell the runtime (and in particular: the garbage collector) to not move that object at all, or at least until you tell it that you're done. This is what "pinning" is. There are two ways to "pin":
for long-term pins (typically of things like byte[] buffers that you're going to store as a field in an object and pass to unmanaged code as a pointer), you can take a GCHandle against an object, which gets logged in a global structure that the GC knows to look at
for short-term pins of references that are on a stack, the fixed keyword does some voodoo that lets the GC (which always looks at every stack) know that a reference - and thus the object referred to (the object at that address) - should be considered pinned, without needing to constantly add/remove to a global structure
As a perhaps interesting side note: "interior references" and references to value types are a concept that only exists on the stack, not as fields on a type that could end up on the heap (which means any class or struct except for the new ref struct concept). They work the same as regular references, but the target of such a reference is the contents themselves, not the start of the object header. That means that
var fieldReference = ref this._someField;
or
SomeOtherMethod(ref this._someField);
or
SomeOtherMethod(ref someArray[index]);
work inside a method as long as that interior reference is only on the stack (i.e. no async / yield / captured-variables / etc); the GC is happy to do the overhead of resolving interior pointers to objects but only for the stacks - to reduce the overall scale of the work involved.
I am working on a project that has nested Lists of classes. Hence my code looks like this when I want to get a variable.
MainClass.subclass1[element1].subClass2[element2].subClass3[element3].value;
I was wondering how I could get an alias for subClass3 so I can get all the variables in it without having to look in all the subclasses, like this.
subClass3Alias.value
In C++ this would be easy: simply have a pointer pointing to it. But C# does not really have pointers.
No need for pointers – types in C# are usually reference types anyway, meaning that you can just copy them and they will refer to the original object (like pointers in C++):
var subclassAlias = MainClass.subclass1[element1].subClass2[element2].subClass3[element3];
Now you can use subclassAlias.value.
A slightly different thing occurs if your type happens to be a value type: in that case, the above will still work – but subclassAlias will be a value copy of the original value, meaning that changes to subclassAlias will not be reflected in the original object.
That said, this looks like suspicious code anyway – normally such deep levels of nesting are a sign of bad design and violate the Law of Demeter.
(Incidentally, in C++ you wouldn’t use pointers either.)
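For comparison, the C++ version of this alias would typically be a reference rather than a pointer. This is only a minimal sketch: Level1, Level2 and Level3 are hypothetical stand-ins for the nested classes in the question, and the helper function name is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the nested classes in the question.
struct Level3 { int value = 0; };
struct Level2 { std::vector<Level3> subClass3; };
struct Level1 { std::vector<Level2> subClass2; };
struct MainClass { std::vector<Level1> subclass1; };

// Builds a 1x1x1 structure, writes to the innermost element through a
// reference alias, and returns the value read back through the full path.
int modify_through_alias() {
    MainClass m;
    m.subclass1.resize(1);
    m.subclass1[0].subClass2.resize(1);
    m.subclass1[0].subClass2[0].subClass3.resize(1);

    // The alias refers to the original element; no copy is made.
    Level3& alias = m.subclass1[0].subClass2[0].subClass3[0];
    assert(&alias == &m.subclass1[0].subClass2[0].subClass3[0]);

    alias.value = 42;
    return m.subclass1[0].subClass2[0].subClass3[0].value;
}
```

Declaring the alias as Level3 (without the &) would make a copy instead, which mirrors the value-type caveat above.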
I am trying to use request elements from an ASP application in a .NET class library that is used in the application. I came across a head scratcher that I can't wrap my head around:
//Context is an ASPTypeLibrary.ScriptingContext
dynamic req = System.EnterpriseServices.ContextUtil.GetNamedProperty("Request");
Context.Response.Write(req.Form("mykey")); //this writes the value I expected
Context.Response.Write(String.Format("{0}", req.Form("mykey"))); //this writes 'System.__ComObject'
Am I going about this all wrong? I was using info I gleaned from this question.
You should note that Request.Form("someKey") is not a string. The source of your confusion, however, originates not in Request.Form("someKey") but on the other side, in Response.Write(...).
There are some automatic conversion shenanigans going on.
Response.Write(...) doesn't take a string. It takes a Variant. The method will do its darnedest to output whatever you pass to it.
If the Variant holds a BSTR (a COM string), it will output that unchanged. It will also try calling VarChangeTypeEx(...) (kind-of; see note below) to try to see if it can get COM to convert it to a BSTR (that's what happens when you pass it a number). If the Variant contains an object with a default method on it ([propvalue]), and it has no better way to output it, it will call the default method and start over with the result of that. I think it has a few other tricks up its sleeve, which are not entirely clearly documented.
At a high level, it should now be clear what's happening. On the first line, req.Form("myKey") returns a COM object, which then gets passed down to Response.Write(...), which then converts that object to a BSTR string and outputs it. On the other hand, when you try to pass req.Form("myKey") to a C# method, the conversion doesn't occur and you get a generic COM object instead, with predictable consequences.
So what is the return value of Request.Form("someKey") then? It's an IRequestDictionary object. And why a dictionary? Because you can submit an http request that has multiple form elements with the same name. This can be the case, for example, when the input elements are checkboxes intended to be overlapping options.
What happens when the form has multiple entries? The conversion process returns a joined string, analogous to String.Join(", ", someArray) in C#.
It's not clear to me whether Response.Write has intimate knowledge of IRequestDictionary (unlikely), or whether it knows about COM Enumerator pattern (more likely) and it enumerates them to compose the string.
More interesting to me is who is responsible for the conversion process, because VBScript's CStr() will do the same conversion. I had always assumed that CStr() was a thin wrapper around VarChangeTypeEx(...), but I'm pretty sure that VarChangeTypeEx(...) does not concatenate enumerators like that. Obviously CStr() is a lot fancier than I had assumed. I believe that Response.Write simply calls internally whatever API fully implements CStr() and relies on that for the conversion.
For further exploration of the Classic ASP objects and interface, try http://msdn.microsoft.com/en-us/library/ms524856(v=vs.90).aspx instead of the usual VBScript-based descriptions.
I'm just learning C# and working with some examples of strings and StringBuilder. From my reading, I understand that if I do this:
string greeting = "Hello";
greeting += " my good friends";
that I get a new string called greeting with the concatenated value. I understand that the run-time (or compiler, or whatever) is actually getting rid of the reference to the original string greeting and replacing it with a new concatenated one of the same name.
I was just wondering what practical application/ramification this has. Why does it matter to me how C# shuffles strings around in the background when the effect to me is simply that my initial variable changed value.
I was wondering if someone could give me a scenario where a programmer would need to know the difference. A simple example would be nice, as I'm a relative beginner to this.
Thanks in advance..
Strings, again, are a good example. A very common error is:
string greeting = "Hello Foo!";
greeting.Replace("Foo", "World");
Instead of the proper:
string greeting = "Hello Foo!";
greeting = greeting.Replace("Foo", "World");
Unless you knew that string was an immutable class, you could suspect the first method would be appropriate.
Why does it matter to me how C# shuffles strings around in the background when the effect to me is simply that my initial variable changed value.
The other major place where this has huge advantages is when concurrency is introduced. Immutable types are much easier to deal with in a concurrent situation, as you don't have to worry about whether another thread is modifying the same value within the same reference. Using an immutable type often allows you to avoid the potentially significant cost of synchronization (ie: locking).
I understand that the run-time(or compiler, or whatever) is actually getting rid of the reference to the original string greeting and replacing it with a new concatenated one of the same name.
Pedantic intro: No. Objects do not have names -- variables do. It is storing a new object in the same variable. Thus, the name (variable) used to access the object is the same, even though it (the variable) now refers to another object. An object may also be stored in multiple variables and have multiple "names" at the same time or it might not be accessible directly by any variable.
The other parts of the question have already been succinctly answered for the case of strings -- however, the mutable/immutable ramifications are much larger. Here are some questions which may widen the scope of the issue in context.
What happens if you set a property of an object passed into a method? (There are these pesky "value-types" in C#, so it depends...)
What happens if a sequence of actions leaves an object in an inconsistent state? (E.g. property A was set and an error occurred before property B was set?)
What happens if multiple parts of code expect to be modifying the same object, but are not because the object was cloned/duplicated somewhere?
What happens if multiple parts of code do not expect the object to be modified elsewhere, but it is? (This applies in both threading and non-threading situations)
In general, the contract of an object (API and usage patterns/scope/limitations) must be known and correctly adhered to in order to ensure program validity. I generally find that immutable objects make life easier (as then only one of the above "issues" -- a meager 25% -- even applies).
Happy coding.
C# isn't doing any "shuffling", you are! Your statement assigns a new value to the variable, the referenced object itself did not change, you just dropped the reference.
The major reason immutability is useful is this:
String greeting = "Hello";
// who knows what foo does
foo(greeting);
// always prints "Hello" since String is immutable
System.Console.WriteLine(greeting);
You can share references to immutable objects without worrying about other code changing the object--it can't happen. Therefore immutable objects are easier to reason about.
Most of the time, very little effect. However, in the situation of concatenating many strings, the performance hit of garbage collecting all those strings becomes problematic. Do too many string manipulations with just a string, and the performance of your application can take a nosedive.
This is the reason why StringBuilder is more effective when you have a lot of string manipulation to do; leaving all those 'orphaned' strings out there makes a bigger problem for the Garbage Collector than simply modifying an in memory buffer.
I think the main benefit of immutable strings lies in making memory management easier.
As a simplified picture, think of memory as being allocated contiguously, one object after another. Say the string "Tom" takes up three bytes (a real .NET string takes more, since it stores UTF-16 characters plus object overhead). You may then allocate an integer, and that would be four bytes. If you then tried to change the string "Tom" to "Tomas" in place, it would require moving all the other memory to make room for the two new characters a and s.
To eliminate this pain, it's easier (and quicker) to just allocate five new bytes for the string "Tomas".
Does that help?
In performance terms, the advantage of immutable objects is that copying an object is cheap in terms of both CPU and memory, since it only involves making a copy of a pointer. The downside is that writing to the object becomes more expensive, since it must make a copy of the object in the process.
I have been developing a project that I absolutely must develop part-way in C++. I need to develop a wrapper and expose some C++ functionality to my C# app. I have been a C# engineer since the near-beginning of .NET, and have had very little experience in C++. It still looks very foreign to me when attempting to understand the syntax.
Is there anything that is going to knock me off my feet that would prevent me from just picking up C++ and going for it?
C++ has so many gotchas that I can't enumerate them all. Do a search for "C# vs C++". A few basic things to know:
In C++:
A struct and a class are basically the same thing (default visibility for a struct is public; it's private for a class).
Both struct and class can be created either on the heap or the stack.
You have to manage the heap yourself. If you create something with "new", you have to delete it manually at some point.
If performance isn't an issue and you have very little data to move around, you can avoid the memory management issue by having everything on the stack and using references (& operator).
Learn to deal with .h and .cpp files. An "unresolved external" linker error can be your worst nightmare.
You shouldn't call a virtual method from a constructor: the call resolves to the version belonging to the class currently being constructed, never to a derived override. The compiler will never tell you, so I will.
Switch cases don't enforce "break" and fall through by default.
There is no such thing as an interface. Instead, you have classes with pure virtual methods.
C++ aficionados are dangerous people living in caves and surviving on the fresh blood of C#/Java programmers. Talk with them about their favorite language carefully.
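A few of the points in this list can be sketched in code. The following is only an illustration: all type and function names (PointS, PointC, Shape, Circle, Parent, Child and the helpers) are made up for the example.

```cpp
#include <cassert>
#include <string>

// Members of a struct are public by default; members of a class are private.
struct PointS { int x = 1; };
class PointC {
    int x = 1;                       // private without saying so
public:
    int get() const { return x; }
};

void check_defaults() {
    PointS s;
    assert(s.x == 1);                // fine: x is public
    PointC c;
    assert(c.get() == 1);            // c.x would not compile: x is private
}

// An "interface" is a class whose methods are pure virtual.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};
class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};
std::string describe(const Shape& s) { return s.name(); }

// Without 'break', execution falls through into the next case.
int fall_through(int n) {
    int hits = 0;
    switch (n) {
        case 1: ++hits;              // no break: continues into case 2
        case 2: ++hits;
                break;
        default: hits = -1;
    }
    return hits;
}

// A virtual call made from a constructor does not reach the derived override.
struct Parent {
    std::string seen;
    Parent() { seen = who(); }       // calls Parent::who, even when building a Child
    virtual ~Parent() = default;
    virtual std::string who() const { return "Parent"; }
};
struct Child : Parent {
    std::string who() const override { return "Child"; }
};
```

Constructing a Child and reading its seen member gives "Parent", while calling who() on the finished object gives "Child".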
Garbage collection!
Remember that every time you new an object, you are responsible for calling delete.
There are a lot of differences, but the biggest one I can think of that programmers coming from Java/C# always get wrong, and which they never realize they've got wrong, is C++'s value semantics.
In C#, you're used to using new any time you wish to create an object. And whenever we talk about a class instance, we really mean "a reference to the class instance". Foo x = y doesn't copy the object y, it simply creates another reference to whatever object y references.
In C++, there's a clear distinction between local objects, allocated without new (Foo f or Foo f(x, y)), and dynamically allocated ones (Foo* f = new Foo() or Foo* f = new Foo(x, y)). And in C# terms, everything is a value type. Foo x = y actually creates a copy of the Foo object itself.
If you want reference semantics, you can use pointers or references: Foo& x = y creates a reference to the object y. Foo* x = &y creates a pointer to the address at which y is located. And copying a pointer does just that: it creates another pointer, which points to whatever the original pointer pointed to. So this is similar to C#'s reference semantics.
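To make the difference concrete, here is a small sketch. Foo, Result and copy_vs_reference are hypothetical names; the function records what the original object y looks like after writing through a copy, a reference and a pointer.

```cpp
#include <cassert>

struct Foo { int n = 0; };

// The value of y.n observed after each kind of write.
struct Result { int after_copy; int after_ref; int after_ptr; };

Result copy_vs_reference() {
    Foo y;
    Result r{};

    Foo x = y;            // value semantics: x is an independent copy
    x.n = 1;
    r.after_copy = y.n;   // y is untouched

    Foo& ref = y;         // reference: another name for y itself
    ref.n = 2;
    r.after_ref = y.n;

    Foo* p = &y;          // pointer to y
    Foo* q = p;           // copying a pointer copies the address, not the object
    assert(p == q);
    q->n = 3;
    r.after_ptr = y.n;

    return r;
}
```

Only the last two writes reach y, which is the behaviour a C# programmer would expect from every class-typed variable.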
Local objects have automatic storage duration -- that is, a local object is automatically destroyed when it goes out of scope. If it is a class member, then it is destroyed when the owning object is destroyed. If it is a local variable inside a function, it is destroyed when execution leaves the scope in which it was declared.
Dynamically allocated objects are not destroyed until you call delete.
So far, you're probably with me. Newcomers to C++ are taught this pretty soon.
The tricky part is in what this means, how it affects your programming style:
In C++, the default should be to create local objects. Don't allocate with new unless you absolutely have to.
If you do need dynamically allocated data, make it the responsibility of a class. A (very) simplified example:
class IntArrayWrapper {
public: // class members are private by default, so the interface must be made public explicitly
    explicit IntArrayWrapper(int size) : arr(new int[size]) {} // allocate memory in the constructor, and set arr to point to it
    ~IntArrayWrapper() { delete[] arr; } // deallocate memory in the destructor
    int* arr; // hold the pointer to the dynamically allocated array
};
this class can now be created as a local variable, and it will internally do the necessary dynamic allocations. And when it goes out of scope, it'll automatically delete the allocated array again.
So say we needed an array of x integers, instead of doing this:
void foo(int x){
int* arr = new int[x];
... use the array ...
delete[] arr; // if the middle of the function throws an exception, delete will never be called, so technically, we should add a try/catch as well, and also call delete there. Messy and error-prone.
}
you can do this:
void foo(int x){
IntArrayWrapper arr(x);
... use the array ...
// no delete necessary
}
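In everyday code you rarely need to write such a wrapper yourself, because std::vector from the standard library already plays this role. A minimal sketch (sum_first is a hypothetical function, and x is assumed to be non-negative):

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sums 0 + 1 + ... + (x - 1) using a std::vector; no new or delete in sight.
int sum_first(int x) {
    std::vector<int> arr(x);                 // allocates x ints, zero-initialized
    assert(arr.size() == static_cast<std::size_t>(x));
    std::iota(arr.begin(), arr.end(), 0);    // fill with 0, 1, ..., x - 1
    return std::accumulate(arr.begin(), arr.end(), 0);
}   // arr's destructor releases the memory here, also if an exception is thrown
```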
Of course, this use of local variables instead of pointers or references means that objects are copied around quite a bit:
Bar Foo(){
Bar bar;
... do something with bar ...
return bar;
}
In the above, what we return is a copy of the bar object. We could return a pointer or a reference, but as the instance created inside the function goes out of scope and is destroyed the moment the function returns, we couldn't point to that. We could use new to allocate an instance that outlives the function, and return a pointer to that -- and then we get all the memory management headaches of figuring out whose responsibility it is to delete the object, and when that should happen. That's not a good idea.
Instead, the Bar class should simply be designed so that copying it does what we need. Perhaps it should internally call new to allocate an object that can live as long as we need it to. We could then make copying or assignment "steal" that pointer. Or we could implement some kind of reference-counting scheme where copying the object simply increments a reference counter and copies the pointer -- which should then be deleted not when the individual object is destroyed, but when the last object is destroyed and the reference counter reaches 0.
But often, we can just perform a deep copy, and clone the object in its entirety. If the object includes dynamically allocated memory, we allocate more memory for the copy.
It may sound expensive, but the C++ compiler is good at eliminating unnecessary copies (and is in fact in most cases allowed to eliminate copy operations even if they have side effects).
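The reference-counting scheme described above does not have to be hand-written: since C++11 the standard library offers it as std::shared_ptr. A small sketch, where Payload and shared_count are hypothetical names:

```cpp
#include <cassert>
#include <memory>

struct Payload { int value = 7; };

// Returns the reference count observed while two handles share one Payload.
long shared_count() {
    std::shared_ptr<Payload> a = std::make_shared<Payload>();
    long count_inner = 0;
    {
        std::shared_ptr<Payload> b = a;   // copies the pointer, increments the counter
        assert(a.get() == b.get());       // both handles refer to the same object
        count_inner = a.use_count();
    }   // b is destroyed: the counter drops, the Payload stays alive
    assert(a.use_count() == 1);
    return count_inner;
}   // a is destroyed: the counter reaches 0 and the Payload is deleted
```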
If you want to avoid copying even more, you can enable "move semantics" in your classes as well as (or instead of) "copy semantics". It's worth getting into this habit because (a) some objects can't easily be copied, but they can be moved (e.g. a Socket class), (b) it's a pattern established in the standard library and (c) it has had direct language support since C++11 (rvalue references and move constructors).
With move semantics, you can use objects as a kind of "transferable" container. It's the contents that move. Before C++11, this was done by calling swap, which swaps the contents of two objects of the same type. When an object goes out of scope, it is destructed, but if you swap its contents into a reference parameter first, the contents escape being destroyed when the scope ends. Therefore, you don't necessarily need to go all the way and use reference counted smart pointers just to allow complex objects to be returned from functions. The clunkiness of the swap approach came from the fact that you couldn't really return such objects - you had to swap them into a reference parameter (somewhat similar to a ref parameter in C#). With C++11, a movable object can simply be returned by value, and its contents are moved rather than copied.
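In C++11 and later, the "transferable container" idea can be sketched with a move constructor. Buffer, make_buffer and move_leaves_source_empty are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical buffer that owns a heap array and can be moved but not copied.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Move constructor: steal the pointer and leave the source empty.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};

// Returned by value: the contents are moved (or the copy is elided), never copied.
Buffer make_buffer(std::size_t n) {
    Buffer b(n);
    return b;
}

bool move_leaves_source_empty() {
    Buffer a = make_buffer(4);
    const int* original = a.data();
    assert(original != nullptr);
    Buffer b = std::move(a);          // b takes over the array that a owned
    return b.data() == original && b.size() == 4
        && a.data() == nullptr && a.size() == 0;
}
```

Because copying is deleted, the compiler rejects any accidental copy of a Buffer, while returning one from a function still works.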
So the biggest C# to C++ gotcha I can think of: don't make pointers the default. Use value semantics, and instead tailor your classes to behave the way you want when they're copied, created and destroyed.
A few months ago, I attempted to write a series of blog posts for people in your situation:
Part 1
Part 2
Part 3
I'm not 100% happy with how they turned out, but you may still find them useful.
And when you feel that you're never going to get a grip on pointers, this post may help.
No run-time checks
One C++ pitfall is the behaviour when you try to do something that might be invalid, but which can only be checked at runtime - for example, dereferencing a pointer that could be null, or accessing an array with an index that might be out of range.
The C# philosophy emphasises correctness; all behaviour should be well-defined and, in cases like this, it performs a run-time check of the preconditions and throws well-defined exceptions if they fail.
The C++ philosophy emphasises efficiency, and the idea that you shouldn't pay for anything you might not need. In cases like this, nothing will be checked for you, so you must either check the preconditions yourself or design your logic so that they must be true. Otherwise, the code will have undefined behaviour, which means it might (more or less) do what you want, it might crash, or it might corrupt completely unrelated data and cause errors that are horrendously difficult to track down.
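As a small illustration of checking preconditions yourself: std::vector offers both an unchecked operator[] and a checked at(), and a null check on a pointer has to be written by hand. The function names below are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Returns true if a checked access with an out-of-range index throws.
bool checked_access_throws() {
    std::vector<int> v{1, 2, 3};
    assert(v[2] == 3);       // operator[] is unchecked; fine here only because 2 < size
    try {
        (void)v.at(3);       // at() checks the index and throws std::out_of_range
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// A null check done by hand: nothing does it for you before *p.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}
```

Writing v[3] instead of v.at(3) in the same place would compile just as happily and have undefined behaviour.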
Just to throw in some others that haven't been mentioned yet by other answers:
const: C# has a limited idea of const. In C++ 'const-correctness' is important. Methods that don't modify their reference parameters should take const-references, eg.
void func(const MyClass& x)
{
// x cannot be modified, and you can't call non-const methods on x
}
Member functions that don't modify the object should be marked const, ie.
int MyClass::GetSomething() const // <-- here
{
// Doesn't modify the instance of the class
return some_member;
}
This might seem unnecessary, but is actually very useful (see the next point on temporaries), and sometimes required, since libraries like the STL are fully const-correct, and you can't cast const things to non-const things (don't use const_cast! Ever!). It's also useful for callers to know something won't be changed. It is best to think about it in this way: if you omit const, you are saying the object will be modified.
Temporary objects: As another answer mentioned, C++ is much more about value-semantics. Temporary objects can be created and destroyed in expressions, for example:
std::string str = std::string("hello") + " world" + "!";
Here, the first + creates a temporary string with "hello world". The second + combines the temporary with "!", giving a temporary containing "hello world!", which is then copied to str. After the statement is complete, the temporaries are immediately destroyed. To further complicate things, C++11 (known as C++0x while in development) adds rvalue references to address this copying, but that's way out of the scope of this answer!
You can also bind temporary objects to const references (another useful part of const). Consider the previous function again:
void func(const MyClass& x)
This can be called explicitly with a temporary MyClass:
func(MyClass()); // create temporary MyClass - NOT the same as 'new MyClass()'!
A MyClass instance is created on the stack, func accesses it, and then the temporary MyClass is destroyed automatically after func returns. This is convenient and also usually very fast, since the heap is not involved. Note 'new' returns a pointer - not a reference - and requires a corresponding 'delete'. You can also directly assign temporaries to const references:
const int& blah = 5; // 5 is a temporary
const MyClass& myClass = MyClass(); // creating temporary MyClass instance
// The temporary MyClass is destroyed when the const reference goes out of scope
Const references and temporaries are frequent in good C++ style, and the way these work is very different to C#.
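The lifetimes of these temporaries can be observed with a class that counts its live instances. This is just a sketch; Tracked, observe, Counts and temporary_lifetimes are hypothetical names.

```cpp
#include <cassert>

// Hypothetical class that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Takes a const reference, so it can be called with a temporary.
int observe(const Tracked&) { return Tracked::alive; }

struct Counts { int during_call; int after_call; int while_bound; int after_scope; };

Counts temporary_lifetimes() {
    assert(Tracked::alive == 0);
    Counts c{};
    c.during_call = observe(Tracked());   // the temporary is alive during the call
    c.after_call = Tracked::alive;        // and destroyed once the statement ends
    {
        const Tracked& ref = Tracked();   // binding extends the temporary's lifetime
        (void)ref;
        c.while_bound = Tracked::alive;
    }                                     // the reference goes out of scope here
    c.after_scope = Tracked::alive;
    return c;
}
```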
RAII, exception safety, and deterministic destructors. This is actually a useful feature of C++, possibly even an advantage over C#, and it's worth reading up on since it's also good C++ style. I won't cover it here.
Finally, I'll just throw in that 'this' is a pointer, not a reference :)
The traditional stumbling blocks for people coming to C++ from C# or Java are memory management and polymorphic behavior:
While objects always live on the heap and are garbage collected in C#/Java, you can have objects in static storage, on the stack or on the heap ('free store' in standard speak) in C++. You have to clean up the stuff you allocate from the heap (new/delete). An invaluable technique for dealing with that is RAII.
Inheritance/polymorphism work only through pointer or reference in C++.
There are many others, but these will probably get you first.
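The second point is easy to trip over: passing a derived object by value "slices" it down to the base class. A minimal sketch with hypothetical Animal and Dog types:

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
};
struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// By value: the argument is copied into a plain Animal, so the Dog part is lost.
std::string by_value(Animal a) { return a.sound(); }

// By reference or pointer: the call is dispatched on the real type of the object.
std::string by_reference(const Animal& a) { return a.sound(); }
std::string by_pointer(const Animal* a) {
    assert(a != nullptr);
    return a->sound();
}
```

Given a Dog d, by_value(d) returns "..." while by_reference(d) and by_pointer(&d) both return "woof".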
Virtual destructors.
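To spell that out: if an object is deleted through a pointer to its base class, the base class needs a virtual destructor, otherwise the behaviour is undefined and the derived destructor typically never runs. A sketch of the correct form, with hypothetical names:

```cpp
#include <cassert>

int derived_destroyed = 0;   // counts how many times ~Derived has run

struct Base {
    virtual ~Base() = default;   // virtual: delete through Base* is safe
};
struct Derived : Base {
    ~Derived() override { ++derived_destroyed; }
};

// Deletes a Derived through a Base pointer and reports how many
// Derived destructors ran as a result.
int destroy_through_base() {
    int before = derived_destroyed;
    Base* p = new Derived();
    assert(p != nullptr);
    delete p;                    // runs ~Derived, then ~Base
    return derived_destroyed - before;
}
```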
Header files! You'll find yourself asking, "so why do I need to write method declarations twice every time?"
Pointers and Memory Allocation
...I'm a C# guy too and I'm still trying to wrap my head around proper memory practices in C/C++.
Here is a brief overview of Managed C++. There is an article about writing an unmanaged wrapper using Managed C++, and another article about mixing unmanaged with Managed C++ code.
Using Managed C++ would IMHO make it easier to use as a bridge to the C# world and vice versa.
Hope this helps,
Best regards,
Tom.
The biggest difference is C#'s reference semantics (for most types) vs. C++'s value semantics. This means that objects are copied far more often than they are in C#, so it's important to ensure that objects are copied correctly. This means implementing a copy constructor and operator= for any class that has a destructor.
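A minimal sketch of what that looks like for a class that owns heap memory. IntArray is a hypothetical class, and the copy-and-swap form of operator= shown here is one common way to write it, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class IntArray {
public:
    explicit IntArray(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~IntArray() { delete[] data_; }

    // Copy constructor: allocate fresh storage and copy the elements (deep copy).
    IntArray(const IntArray& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment via copy-and-swap: 'other' is already a copy, so take
    // its contents and let its destructor release our old storage.
    IntArray& operator=(IntArray other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

bool copies_are_independent() {
    IntArray a(3);
    a[0] = 1;
    IntArray b = a;      // copy constructor
    b[0] = 99;
    IntArray c(1);
    c = a;               // copy assignment
    c[0] = 7;
    assert(c.size() == 3);
    return a[0] == 1 && b[0] == 99 && c[0] == 7;
}
```

Without the two copy operations, the compiler-generated versions would copy only the pointer, and both objects would try to delete the same array.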
Raw memory twiddling. Unions, memsets, and other direct memory writes. Anytime someone writes to memory as a sequence of bytes (as opposed to as objects), you lose much of the ability to reason about the code.
Linking
Linking with external libraries is not as forgiving as it is in .Net, $DEITY help you if you mix something compiled with different flavors of the same msvcrt (debug, multithread, unicode...)
Strings
And you'll have to deal with Unicode vs Ansi strings, these are not exactly the same.
Have fun :)
The following isn't meant to dissuade in any way :D
C++ is a minefield of gotchas. It's relatively tame if you don't use templates and the STL and just use object orientation, but even then it is a monster. In that case, object-based programming (rather than object-oriented programming) makes it even tamer; often this form of C++ is enforced in certain projects (i.e., don't use any features that have even a chance of being naively used).
However, you should learn all those things, as it's a very powerful language if you do manage to traverse the minefield. If you want to learn about gotchas, you'd better get the books from Herb Sutter, Scott Meyers, and Bjarne Stroustrup. Also, systematically going over the C++ FAQ Lite will help you realize that it does indeed require 10 or so books to turn into a good C++ programmer.