I have been reading a lot and now I am really confused. Consider an ordinary instantiation:
Sampleclass instance1 = new Sampleclass();
After reading a lot, I learned that instance1 is a reference variable stored on the stack, and that it contains the memory address of the object's data, which is stored on the heap.
If this is correct, then where is the object? instance1 is also an object. Sometimes I have seen only an expression like new Sampleclass() on its own. Is that sufficient for object instantiation?
instance1 is a variable.
Because its type is a reference type, it is a reference to an object instance that lives on the heap.
new Sampleclass() is a constructor call that creates a new object on the heap and returns a reference to it.
instance1 contains a copy of the reference, which points to the memory where the new Sampleclass object was created. What can be confusing is that instance1 merely holds a copy of a reference, which is different from the ref keyword (C# Reference), and that might confuse you as it confused me.
The expression new Sampleclass() creates an object. It also has a value which is a pointer to that object. You can do something with this pointer such as store it in a variable (e.g. Sampleclass instance1 = new Sampleclass(); ) or you can ignore it.
Why create something and ignore it? Because, for example, its constructor might have useful side effects.
1) Sampleclass (type) > the declared type of the variable
2) instance1 (identifier) > a user-friendly name for the variable that holds the reference (e.g. 01010101010) to the data stored in memory (heap); the variable instance1 itself is stored on the stack and contains that reference
3) = (operator) > assigns the value on the right side to the variable on the left side
4) new (keyword) > allocates new space in which to store the data
5) Sampleclass(); (constructor) > initializes a new instance of the type Sampleclass in the newly allocated space (this is the actual object, or instance); it is accessed through the name instance1 because instance1 holds the actual location of the data in heap memory.
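The breakdown above can be checked with a short experiment. This is only a minimal sketch: the original never shows the members of Sampleclass, so the Value field is an assumption, and InstantiationDemo and its method names are invented for illustration.

```csharp
using System;

// Hypothetical stand-in for the Sampleclass in the question.
public class Sampleclass
{
    public int Value;
}

public static class InstantiationDemo
{
    // Two variables initialised from the same reference refer to one object,
    // so a write through one is visible through the other.
    public static bool TwoVariablesOneObject()
    {
        Sampleclass instance1 = new Sampleclass(); // object created on the heap
        Sampleclass instance2 = instance1;         // copies the reference, not the object
        instance2.Value = 42;
        return ReferenceEquals(instance1, instance2) && instance1.Value == 42;
    }

    // Each evaluation of 'new' creates a distinct object.
    public static bool TwoNewsTwoObjects()
    {
        return !ReferenceEquals(new Sampleclass(), new Sampleclass());
    }

    public static void Main()
    {
        new Sampleclass(); // legal: the object is created and its reference is discarded
        Console.WriteLine(TwoVariablesOneObject()); // True
        Console.WriteLine(TwoNewsTwoObjects());     // True
    }
}
```

The variable is never the object; it only holds a reference to it, which is why a bare new Sampleclass() is enough to create an instance.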
I found it difficult to come up with a descriptive enough title for this scenario so I'll let the code do most of the talking.
Consider covariance where you can substitute a derived type for a base class.
class Base
{
}
class Derived : Base
{
}
Passing in typeof(Base) to this method and setting that variable to the derived type is possible.
private void TryChangeType(Base instance)
{
var d = new Derived();
instance = d;
Console.WriteLine(instance.GetType().ToString());
}
However, when checking the type from the caller of the above function, the instance will still be of type Base
private void CallChangeType()
{
var b = new Base();
TryChangeType(b);
Console.WriteLine(b.GetType().ToString());
}
I would assume since objects are inherently reference by nature that the caller variable would now be of type Derived. The only way to get the caller to be type Derived is to pass a reference object by ref like so
private void CallChangeTypeByReference()
{
var b = new Base();
TryChangeTypeByReference(ref b);
Console.WriteLine(b.GetType().ToString());
}
private void TryChangeTypeByReference(ref Base instance)
{
var d = new Derived();
instance = d;
}
Further more, I feel like it's common knowledge that passing in an object to a method, editing props, and passing that object down the stack will keep the changes made down the stack. This makes sense as the object is a reference object.
What causes an object to permanently change type down the stack, only if it's passed in by reference?
You have a great many confused and false beliefs. Let's fix that.
Consider covariance where you can substitute a derived type for a base class.
That is not covariance. That is assignment compatibility. An Apple is assignment compatible with a variable of type Fruit because you can assign an Apple to such a variable. Again, that is not covariance. Covariance is the fact that a transformation on a type preserves the assignment compatibility relationship. A sequence of apples can be used somewhere that a sequence of fruit is needed because apples are a kind of fruit. That is covariance. The mapping "apple --> sequence of apples, fruit --> sequence of fruit" is a covariant mapping.
Moving on.
Passing in typeof(Base) to this method and setting that variable to the derived type is possible.
You are confusing types with instances. You do not pass typeof(Base) to this method; you pass a reference to an instance of Base to this method. typeof(Base) is of type System.Type.
As you correctly note, formal parameters are variables. A formal parameter is a new variable, and it is initialized to the actual parameter aka argument.
However, when checking the type from the caller of the above function, the instance will still be of type Base
Correct. The argument is of type Base. You copy that to a variable, and then you reassign the variable. This is no different than saying:
Base x = new Base();
Base y = x;
y = new Derived();
And now x is still Base and y is Derived. You assigned the same variable twice; the second assignment wins. This is no different than if you said a = 1; b = a; b = 2; -- you would not expect a to be 2 afterwards just because you said b = a in the past.
I would assume since objects are inherently reference by nature that the caller variable would now be of type Derived.
That assumption is wrong. Again, you have made two assignments to the same variable, and you have two variables, one in the caller, and one in the callee. Variables contain values; references to objects are values.
The only way to get the caller to be type Derived is to pass a reference object by ref like so
Now we're getting to the crux of the problem.
The correct way to think about this is that ref makes an alias to a variable. A normal formal parameter is a new variable. A ref formal parameter makes the variable in the formal parameter an alias to the variable at the call site. So now you have one variable but it has two names, because the name of the formal parameter is an alias for the variable at the call. This is the same as:
Base x = new Base();
ref Base y = ref x; // x and y are now two names for the same variable
y = new Derived(); // this assigns both x and y because there is only one variable, with two names
Further more, I feel like it's common knowledge that passing in an object to a method, editing props, and passing that object down the stack will keep the changes made down the stack. This makes sense as the object is a reference object.
Correct.
The mistake you are making here is very common. It was a bad idea for the C# design team to name the variable aliasing feature "ref" because this causes confusion. A reference to a variable makes an alias; it gives another name to a variable. A reference to an object is a token that represents a specific object with a specific identity. When you mix the two it gets confusing.
The normal thing to do is to not pass variables by ref, particularly if they contain references.
What causes an object to permanently change type down the stack, only if it's passed in by reference?
Now we have the most fundamental confusion. You have confused objects with variables. An object never changes its type, ever! An apple is an object, and an apple is now and forever an apple. An apple never becomes any other kind of fruit.
Stop thinking that variables are objects, right now. Your life will get so much better. Internalize these rules:
variables are storage locations that store values
references to objects are values
objects have a type that never changes
ref gives a new name to an existing variable
assigning to a variable changes its value
Now if we ask your question again using correct terminology, the confusion disappears immediately:
What causes the value of a variable to change its type down the stack, only if it's passed in by ref?
The answer is now very clear:
A variable passed by ref is an alias to another variable, so changing the value of the parameter is the same as changing the value of the variable at the call site
Assigning an object reference to a variable changes the value of that variable
An object has a particular type
If we don't pass by ref but instead pass normally:
A value passed normally is copied to a new variable, the formal parameter
We now have two variables with no connection; changing one of them does not change the other.
If that's still not clear, start drawing boxes, circles and arrows on a whiteboard, where objects are circles, variables are boxes, and object references are arrows from variables to objects. Making an alias via ref gives a new name to an existing box; calling without ref makes a second box and copies the arrow. It'll all make sense then.
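The rules above can be tried out directly. The following is a minimal sketch that reuses the Base and Derived classes from the question; AliasDemo and its method names are invented for illustration.

```csharp
using System;

public class Base { }
public class Derived : Base { }

public static class AliasDemo
{
    // 'instance' is a new variable that holds a copy of the caller's reference.
    static void Reassign(Base instance)
    {
        instance = new Derived(); // changes only this local variable
    }

    // 'instance' is an alias for the caller's variable.
    static void ReassignByRef(ref Base instance)
    {
        instance = new Derived(); // changes the caller's variable
    }

    public static string TypeAfterByValue()
    {
        Base b = new Base();
        Reassign(b);
        return b.GetType().Name;
    }

    public static string TypeAfterByRef()
    {
        Base b = new Base();
        ReassignByRef(ref b);
        return b.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(TypeAfterByValue()); // Base
        Console.WriteLine(TypeAfterByRef());   // Derived
    }
}
```

In neither case does an object change its type. The original Base object stays a Base; in the ref case the caller's variable simply ends up referring to a different object.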
This is not an issue with inheritance and polymorphism; what you're seeing is the difference between pass-by-value and pass-by-reference.
private void TryChangeType(Base instance)
The preceding method's instance parameter will be a copy of the caller's Base reference. You can change the object that is referenced, and those changes will be visible to the caller because the caller and the callee both reference the same object. But any changes to the reference itself (such as pointing it to a new object) will not affect the caller's reference. This is why it works as expected when you pass by reference.
When you call TryChangeType() you are passing a copy of the reference held in "b" into "instance". Any changes to members of "instance" are made in the same memory still referenced by "b" in your calling method. However, the statement "instance = d" reassigns "instance" so that it refers to a different object. "b" and "instance" no longer point to the same memory. When you return to CallChangeType, "b" still references the original object and hence the original type.
TryChangeTypeByReference passes a reference to where "b"'s pointer value is actually stored. Reassigning "instance" now changes the address that "b" is actually pointing to.
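The difference between changing members of the referenced object and reassigning the parameter can be sketched as follows. The Box class and its Label property are assumptions made for the example, since Base in the question has no members to modify.

```csharp
using System;

// Hypothetical class with one mutable member.
public class Box
{
    public string Label { get; set; }
}

public static class MutateVersusReassign
{
    static void Mutate(Box box)
    {
        box.Label = "changed"; // writes into the object the caller also references
    }

    static void Reassign(Box box)
    {
        box = new Box { Label = "replaced" }; // repoints only the local copy of the reference
    }

    public static string LabelAfterMutate()
    {
        var b = new Box { Label = "original" };
        Mutate(b);
        return b.Label;
    }

    public static string LabelAfterReassign()
    {
        var b = new Box { Label = "original" };
        Reassign(b);
        return b.Label;
    }

    public static void Main()
    {
        Console.WriteLine(LabelAfterMutate());   // changed
        Console.WriteLine(LabelAfterReassign()); // original
    }
}
```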
We know that classes are reference types, so in general when we pass an instance of a class we are passing a reference. But there is a difference between passing just b and passing ref b, which can be understood as follows:
In the first case, the reference is passed by value, which means a separate pointer to the same memory location is created internally. When the derived class object is assigned to that pointer, it starts pointing to another object in memory. When the method returns, only the original pointer remains, and it still refers to the Base instance, while the new object becomes eligible for garbage collection.
However, when the object is passed with ref, a reference to the reference is passed, which is like a pointer to a pointer (a double pointer in C or C++). Assigning through it changes the caller's original variable, and thus you see the difference.
For the first case to show the same result, the new object has to be returned from the method and assigned to the caller's variable, so that it starts pointing to the new derived object.
Following is the modification to your program to get expected result in case 1:
private Base TryChangeType(Base instance)
{
var d = new Derived();
instance = d;
Console.WriteLine(instance.GetType().ToString());
return instance;
}
private void CallChangeType()
{
var b = new Base();
b = TryChangeType(b);
Console.WriteLine(b.GetType().ToString());
}
Following is the pictorial reference of both the cases:
When you do not pass by reference, a copy of the reference to the Base object is passed into the function, and only this copy is changed inside the TryChangeType function. When you print the type of the instance in the caller it is still of type "Base", because only the copy was pointed at the "Derived" object.
When you pass by reference, the caller's variable itself is passed to the function, so any reassignment made inside the function is visible to the caller.
Which of these two cases is the most correct? This case where the rxBytes variable is only declared?
private void ParseRxData(RxMessage rxMessage) {
List<byte> rxBytes;
rxBytes = rxMessage.LengthBytes;
rxBytes.AddRange(rxMessage.Payload);
/* Do stuff with 'rxBytes' variable */
}
Or this case where the rxBytes variable is instantiated?
private void ParseRxData(RxMessage rxMessage) {
var rxBytes = new List<byte>();
rxBytes = rxMessage.LengthBytes;
rxBytes.AddRange(rxMessage.Payload);
/* Do stuff with 'rxBytes' variable */
}
In short, is it necessary to instantiate a variable when it's assigned a value immediately after it's been declared?
I'm a fairly new C#/OOP programmer, so I apologize if I'm not using terminology correctly.
In short, is it necessary to instantiate a variable when it's assigned a value immediately after it's been declared?
No - in fact if you initialize rxBytes to a new List<byte>, then overwrite the variable with rxMessage.LengthBytes, then the list you created is never used and will be garbage collected. You seem to assign it a List<byte> just so you can use var which is unnecessary. Your first code block is the correct approach.
You can also just do
var rxBytes = rxMessage.LengthBytes;
But functionally there's no difference between declaring the variable on one line and assigning it a value on another.
Is there any way to set the equal sign "by value"?
Well for reference types the "value" is a reference, but if you just want the new variable to reference a copy of the list you can do
var rxBytes = new List<byte>(rxMessage.LengthBytes);
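The difference matters for the code in the question, because AddRange modifies whichever list rxBytes refers to. Here is a minimal sketch, assuming a hypothetical RxMessage whose LengthBytes and Payload are both List<byte> (the real type is not shown in the question):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the RxMessage type in the question.
public class RxMessage
{
    public List<byte> LengthBytes { get; } = new List<byte> { 0x00, 0x02 };
    public List<byte> Payload { get; } = new List<byte> { 0xAA, 0xBB };
}

public static class ListCopyDemo
{
    public static int LengthBytesCountAfterAlias()
    {
        var msg = new RxMessage();
        List<byte> rxBytes = msg.LengthBytes; // same list, second reference
        rxBytes.AddRange(msg.Payload);        // also grows msg.LengthBytes
        return msg.LengthBytes.Count;
    }

    public static int LengthBytesCountAfterCopy()
    {
        var msg = new RxMessage();
        var rxBytes = new List<byte>(msg.LengthBytes); // new list with the same elements
        rxBytes.AddRange(msg.Payload);                 // msg.LengthBytes is untouched
        return msg.LengthBytes.Count;
    }

    public static void Main()
    {
        Console.WriteLine(LengthBytesCountAfterAlias()); // 4
        Console.WriteLine(LengthBytesCountAfterCopy());  // 2
    }
}
```

If the message object should stay unchanged after parsing, the copying form is the safer choice.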
The following is entirely legal syntax:
List<byte> rxBytes = rxMessage.LengthBytes;
It's not the instantiation that is required, just an assignment.
When you create a new object using the new keyword, you are allocating memory for that object. In your case, however, you allocate memory for an object and hand a reference to the object to your variable, then immediately assign that variable to point at another object. That first object immediately becomes orphaned and, eventually, garbage collected.
So really, what you are asking if you can do is not only allowed, it is a better practice for responsible programming.
From my experience of C++, I know that in C++, objects declared as ClassName ObjectName; are stored on the stack, and objects created as ClassName* ObjectName = new ClassName; are stored on the heap.
In C#, I seem to be told everywhere that the new keyword must be used, i.e. you cannot initialize an object like ClassName ObjectName; e.g.
Product P;
P.someMethod();
Why is this?
In C#, objects of class types and the values they contain are always stored on the heap. The new keyword allocates memory on the heap for the object and its values, and returns a reference to its location. Until this is done there is no object whose methods you could call.
So, for example:
Product P = new Product();
P is actually a reference to the allocated object. Multiple references can refer to the same object.
Product C = P;
In this case C is not a copy of the object P refers to; it is a copy of the reference to that object.
Structs work differently from classes because they are value types; a struct held in a local variable is typically allocated on the stack. This means the same operation as above will actually copy the struct's contents into new storage.
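The two kinds of copy can be compared side by side. This is a minimal sketch; ProductClass, ProductStruct and the Price field are names invented for the example.

```csharp
using System;

public class ProductClass { public int Price; }   // reference type
public struct ProductStruct { public int Price; } // value type

public static class CopySemanticsDemo
{
    public static int ClassPriceAfterCopyEdit()
    {
        var p = new ProductClass { Price = 10 };
        var c = p;    // copies the reference
        c.Price = 99; // same object as p
        return p.Price;
    }

    public static int StructPriceAfterCopyEdit()
    {
        var p = new ProductStruct { Price = 10 };
        var c = p;    // copies the whole value
        c.Price = 99; // only the copy changes
        return p.Price;
    }

    public static void Main()
    {
        Console.WriteLine(ClassPriceAfterCopyEdit());  // 99
        Console.WriteLine(StructPriceAfterCopyEdit()); // 10
    }
}
```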
I'll answer my own question for the sake of pulling the info together for clarity.
A combination of mohits00691 and Jon Skeet's answers clears this up. Even though P is declared as a type of Product, it has no default value and is not instantiated until it is set with "= new Product".
This differs from C++, where Product P would instantiate an object of class Product.
As far as I know, code like:
Product p;
p.someFunction();
will fail at compile time with the error "Use of unassigned local variable". So in C# you need to assign a value to every local variable, be it of a reference type or a value type, before using it.
At work we encountered a problem where the original object was changed after we sent a copy through a method. We found a workaround by implementing ICloneable in the original class, but we couldn't find out why it happened in the first place.
We wrote this example code to reproduce the problem (which resembles our original code), and hope someone is able to explain why it happens.
public partial class ClassRefTest : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
var myclass = new MyClass();
var copy = myclass;
myclass.Mystring = "jadajadajada";
Dal.DoSomeThing(copy);
lit.Text = myclass.Mystring; // Text is expected to be jadajadajada,
// but ends up being "referenced"
}
}
public class MyClass
{
public string Mystring { get; set; }
}
public static class Dal
{
public static int? DoSomeThing(MyClass daclass)
{
daclass.Mystring = "referenced";
return null;
}
}
As you can see, in the DoSomeThing() method we're not using any ref argument, but lit.Text still ends up being "referenced".
Why does this happen?
It is always interesting to explain how this works. Of course my explanation will not be on par with the magnificence of Jon Skeet's or Joseph Albahari's, but I will try nevertheless.
In the old days of C programming, grasping the concept of pointers was fundamental to working with that language. Many years have passed and now we call them references, but they are still ... glorified pointers and, if you understand how they work, you are halfway to becoming a programmer (just kidding).
What is a reference? In short: it is a number stored in a variable, and this number represents an address in memory where your data lies.
Why do we need references? Because it is much simpler to pass around a single number that lets us reach the memory area of our data than to move a whole object with all its fields along with our code.
So, what happens when we write
var myclass = new MyClass();
We all know that this is a call to the constructor of the class MyClass, but for the framework it is also a request to provide a memory area where the values of the instance (properties, fields and other internal housekeeping information) live at a specific point in time. Suppose that MyClass needs 100 bytes to store everything it needs. The framework searches the computer memory in some way and, let's suppose, finds a place in memory identified by the address 4200. This value (4200) is the value that is assigned to the variable myclass. It is a pointer to the memory (oops, it is a reference to the object instance).
Now what happens when you call?
var copy = myclass;
Nothing in particular. The copy variable gets the same value as myclass (4200). The two variables reference the same memory area, so using one or the other doesn't make any difference. The memory area (the instance of MyClass) is still located at our fictional memory address 4200.
myclass.Mystring = "jadajadajada";
This uses the reference value as a base address to find the area of memory occupied by the property and sets its value to point to the intern area where literal strings are kept. To make an analogy with pointers, it is as if you take the base address (4200) and add an offset to find the place, inside the 100 bytes occupied by our object instance, where the reference representing the property Mystring is kept. Let's say that the Mystring reference is 42 bytes past the beginning of the memory area. Adding 42 to 4200 yields 4242, and this is where the reference to the literal "jadajadajada" will be stored.
Dal.DoSomeThing(copy);
Here is the problem (well, the point where you have the problem). When you pass the copy variable, don't think that the framework repeats the search for a memory area and copies everything from the original area into a new area. No, it would be practically impossible (think about what happens if MyClass contains a property that is an instance of another class, and so on... it could never stop). So the value passed to the DoSomeThing method is again the reference value 4200. This value is automatically assigned to the local variable daclass declared as the input parameter of DoSomeThing (just as you did explicitly before with var copy = myclass;).
At this point it is clear that any operation using daclass acts on the same memory area occupied by the original instance, and you see the results when the code returns to your starting point.
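The walk-through above can be reproduced without the web page. The sketch below keeps MyClass and the names from the question but drops the ASP.NET parts; SharedReferenceDemo and its helper methods are invented for illustration.

```csharp
using System;

public class MyClass
{
    public string Mystring { get; set; }
}

public static class SharedReferenceDemo
{
    // 'daclass' receives a copy of the reference, not a copy of the object.
    static void DoSomeThing(MyClass daclass)
    {
        daclass.Mystring = "referenced";
    }

    public static string Run()
    {
        var myclass = new MyClass();
        var copy = myclass;                // second reference to the same object
        myclass.Mystring = "jadajadajada";
        DoSomeThing(copy);                 // writes through the shared reference
        return myclass.Mystring;
    }

    public static bool CopyIsSameObject()
    {
        var myclass = new MyClass();
        var copy = myclass;
        return ReferenceEquals(myclass, copy);
    }

    public static void Main()
    {
        Console.WriteLine(Run());              // referenced
        Console.WriteLine(CopyIsSameObject()); // True
    }
}
```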
I beg the pardon of the more technically expert users here, particularly for my casual and imprecise use of the term 'memory address'.
That's normal: MyClass is a reference type, so you are passing a reference to the original data, not the data itself. That is why this is the expected behavior.
Here is an explanation of what a reference type is, from Parameter passing in C#:
A reference type is a type which has as its value a reference to the appropriate data rather than the data itself
I see two issues here...
Making a Copy of an object
var copy = myClass; does not make a copy - what it really does is create a second reference ("pointer") to the object that myClass refers to (naming the variable "copy" is misleading). So you have myClass and copy pointing to the exact same object.
To make a copy you have to do something like:
var copy = new MyClass(myClass);
Notice that I created a new object.
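Note that new MyClass(myClass) only compiles if MyClass defines a constructor that accepts another MyClass. A minimal sketch of such a copy constructor follows; it is a hypothetical addition, not part of the class shown in the question.

```csharp
using System;

public class MyClass
{
    public string Mystring { get; set; }

    public MyClass() { }

    // Hypothetical copy constructor: copies each member from 'other'.
    public MyClass(MyClass other)
    {
        Mystring = other.Mystring;
    }
}

public static class CopyConstructorDemo
{
    public static string OriginalAfterCopyEdited()
    {
        var original = new MyClass { Mystring = "jadajadajada" };
        var copy = new MyClass(original); // a second, independent object
        copy.Mystring = "referenced";     // changes only the copy
        return original.Mystring;
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterCopyEdited()); // jadajadajada
    }
}
```

A constructor like this copies member by member, so any member that is itself a mutable reference type would still be shared between the two objects unless it is copied as well.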
Passing By Reference
When passing value type variables without ref, the variable cannot be changed by the receiving method.
Example: DoSomething(int foo) - DoSomething cannot affect the value of foo outside of itself.
When passing value type variables with ref, the variable can be changed
Example: DoSomething(ref int foo) - if DoSomething changes foo, it will remain changed.
When passing an object without ref, the object's data can be changed, but the caller's reference to the object cannot be changed.
void DoSomething(MyClass myClass)
{
myClass.myString = "ABC"; // the caller's object now has its string set to ABC
myClass = new MyClass(); // allowed, but has no effect on the caller's variable
}
When passing an object with ref, the object's data can be changed, and the caller's reference to the object can be changed.
void DoSomething(ref MyClass myClass)
{
myClass.myString = "ABC"; // the string is set to ABC
myClass = new MyClass(); // the caller now sees a new object whose string is unset, since myClass has been changed
}
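The value-type cases described earlier in this answer (DoSomething(int foo) versus DoSomething(ref int foo)) can be sketched the same way. ValueTypeRefDemo and its method names are invented for illustration.

```csharp
using System;

public static class ValueTypeRefDemo
{
    // 'foo' is a copy of the caller's value.
    static void Increment(int foo)
    {
        foo++;
    }

    // 'foo' is an alias for the caller's variable.
    static void IncrementByRef(ref int foo)
    {
        foo++;
    }

    public static int AfterByValue()
    {
        int n = 1;
        Increment(n);
        return n;
    }

    public static int AfterByRef()
    {
        int n = 1;
        IncrementByRef(ref n);
        return n;
    }

    public static void Main()
    {
        Console.WriteLine(AfterByValue()); // 1
        Console.WriteLine(AfterByRef());   // 2
    }
}
```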
The docs at MSDN say it pretty clearly. Value types are passed as a copy by default; for objects, a copy of the reference is passed by default, so the method works on the same object. Methods in C#
Given,
public class SomeClass {
public string SomeName { get; set; }
public List<string> RelatedNames { get; set; }
}
public class Program{
public void Main(){
var someClassInstance = new SomeClass(){ SomeName = "A", RelatedNames = new List<string>(1){ "a" }};
// So now someClassInstance has been allocated some memory on the heap: 1 string object and a list with 1 string object.
// Since SomeClass is mutable, it could be modified as below
someClassInstance.SomeName = "Now This is much more than a name";
someClassInstance.RelatedNames.AddRange(new List<string>(100) { "N","o","w"..... });
//Now what happens inside the heap?
//1. someClassInstance.SomeName will move its pointer to another string inside the heap
//2. someClassInstance.RelatedNames will move its pointer to another List<>(101) inside the heap.
//Is it correct? Then where is 'mutability' ?
}
}
As mentioned in the code comments above, AFAIK, when a mutable object is modified, its internal pointers will just point to another memory location inside the heap. If that is correct, does that mean that all objects inside the heap (reference types) are immutable?
Thanks for your interest.
Where's mutability? Right there:
someClassInstance.SomeName = "Now This is much more than a name";
someClassInstance.RelatedNames = new List<string>(100) { "N","o","w"..... };
You just mutated the object pointed to by someClassInstance.
Also, your example is a bit contrived. Strings are indeed immutable, but Lists are not, so you could have done this:
someClassInstance.RelatedNames.Add("HELLO!");
And then you just mutated the object pointed to by someClassInstance.RelatedNames.
EDIT: I see you changed your question. Well, then:
someClassInstance.SomeName will move its pointer to another string inside the heap
someClassInstance.RelatedNames will move its pointer to another List<>(101) inside the heap.
1 is true because String was designed to be immutable. That's why there's the StringBuilder class in case you need a mutable string.
2 is false, because that's not how List is implemented. Perhaps that's where your confusion comes from. When you invoke AddRange, someClassInstance.RelatedNames will still point to the same instance, but that instance's internal state will have changed (most likely, its backing array will have been changed to point to a different array object, and its count would now be 101). In fact, a reference cannot magically change based on the operations that are invoked on the object it refers to.
And none of that changes the fact that someClassInstance's internal state was mutated anyway.
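Both points can be checked with ReferenceEquals. The sketch below assumes SomeClass has settable properties (as the object initializer in the question requires) and adds only three names instead of the elided hundred; MutabilityDemo and its method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public class SomeClass
{
    public string SomeName { get; set; }
    public List<string> RelatedNames { get; set; }
}

public static class MutabilityDemo
{
    // AddRange changes the list's internal state; the property still refers to the same list.
    public static bool ListReferenceUnchangedByAddRange()
    {
        var s = new SomeClass { SomeName = "A", RelatedNames = new List<string>(1) { "a" } };
        List<string> before = s.RelatedNames;
        s.RelatedNames.AddRange(new[] { "N", "o", "w" });
        return ReferenceEquals(before, s.RelatedNames) && s.RelatedNames.Count == 4;
    }

    // Assigning SomeName makes the property refer to a different string object;
    // the old string itself is unchanged.
    public static bool StringReferenceReplacedByAssignment()
    {
        var s = new SomeClass { SomeName = "A", RelatedNames = new List<string>() };
        string before = s.SomeName;
        s.SomeName = "Now This is much more than a name";
        return !ReferenceEquals(before, s.SomeName) && before == "A";
    }

    public static void Main()
    {
        Console.WriteLine(ListReferenceUnchangedByAddRange());    // True
        Console.WriteLine(StringReferenceReplacedByAssignment()); // True
    }
}
```

In both methods the SomeClass object itself is mutated in place; no new SomeClass instance is ever created.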
Objects in the CLR are definitely not immutable by default. There is a little bit of confusion here because you've used string in your example, which is a type that's implemented as an immutable type. This is certainly not the default in .NET though, and mutability is far more common than immutability.
Take this line as an example
someClassInstance.SomeName = "Now This is much more than a name";
There are 3 objects of interest here in this statement.
The object referenced by someClassInstance.SomeName
The string which has the value "Now This is much more than a name"
The object referenced by someClassInstance
All 3 of these values live in the heap. The execution of this statement will mutate the contents of the object referenced by someClassInstance. This is a prime example of mutability in action. If everything in this scenario were immutable then the setting of SomeName would need to produce a copy of the object referenced by someClassInstance and give it the new value. This doesn't happen here and can be demonstrated by the following
var obj = someClassInstance; // Both reference the same object
someClassInstance.SomeName = "hello";
Console.WriteLine(obj.SomeName); // Prints "hello"
Yes because they are put on the heap using new or malloc and are pointers. As such you can only add or remove pointer references. So technically the objects themselves are not immutable, since they are not on the heap to begin with, but the pointer allocations on the heap are immutable.