Upcasting and its effect on the heap


For the following classes:
public class Parent {
//Parent members
}
public class ChildA : Parent {
//ChildA members
}
public class ChildB : Parent {
//ChildB members
}
If I upcast a ChildA or ChildB instance to a Parent, I can no longer access the child's members, but they are still there: if I downcast again and access them, they still hold their data.
I think this means that the Parent instance keeps memory allocated for the child classes.
So does this mean that when I instantiate a Parent, it allocates memory for the child classes' members, or does that only happen when I cast?
And is it possible for a parent to allocate memory for more than one child if we go backward and forward with casting?

In the case you describe, casting between a base class and a subclass, in either direction, does not change the memory that was allocated.
If you instantiate a Parent you will have a Parent object in memory. If you cast that to either of the child classes it will fail with an InvalidCastException.
If you instantiate either child you will have a child object in memory. You can cast this to the Parent and then back again. The memory allocation does not change in either case.
Additionally, if you instantiate a ChildA, cast it to Parent and then attempt to cast it to ChildB, you will get an InvalidCastException.
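The three cases above can be checked with a small sketch. The fields OnlyInA and OnlyInB and the helper CanCast are hypothetical names added for illustration; they are not part of the question's classes.

```csharp
using System;

public class Parent { public string Name = "parent data"; }
public class ChildA : Parent { public int OnlyInA = 42; }
public class ChildB : Parent { public int OnlyInB = 7; }

public static class CastRules
{
    // Hypothetical helper: true if the cast succeeds, false on InvalidCastException.
    public static bool CanCast<T>(object o)
    {
        try { _ = (T)o; return true; }
        catch (InvalidCastException) { return false; }
    }

    public static void Main()
    {
        Parent plain = new Parent();   // a Parent object
        Parent upcast = new ChildA();  // a ChildA object behind a Parent-typed variable

        Console.WriteLine(CanCast<ChildA>(plain));    // False: a Parent is not a ChildA
        Console.WriteLine(CanCast<ChildA>(upcast));   // True: the object really is a ChildA
        Console.WriteLine(CanCast<ChildB>(upcast));   // False: a ChildA is not a ChildB
        Console.WriteLine(((ChildA)upcast).OnlyInA);  // 42: the child data was never lost
    }
}
```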

"Normal" Upcasting and Downcasting of Reference Types
For reference types, casting variables doesn't change the type of the object already allocated on the heap, it just affects the type of the variable which references the object.
So no, there isn't any additional heap overhead with casting reference types (i.e. object instances from classes) provided that there are no custom conversion operators involved (See below, tolanj's comment).
Consider the following class hierarchy:
public class Fruit
{
public Color Colour {get; set;}
public bool Edible {get; set;}
}
public class Apple : Fruit
{
public Apple() { Colour = Color.Green; Edible = true; KeepsDoctorAtBay = true; }
public bool KeepsDoctorAtBay{get; set;}
}
Which, when used with both upcasting and downcasting:
var foo = new Apple();
Fruit bar = foo; // Upcast
var baz = (Apple)bar; // Downcast
There is only ever one allocation on the heap, which is the initial var foo = new Apple().
After the various variable assignments, all three variables, foo, bar and baz point to the same object (an Apple instance on the heap).
Upcasting (Fruit bar = foo) simply restricts the variable's access to Fruit methods and properties. If the (Apple)bar downcast succeeds, all methods, properties and events of the downcast type are available through the new variable. If the downcast fails, an InvalidCastException is thrown, because the runtime checks the heap object's compatibility with the variable's type.
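One way to convince yourself that only one object exists is to compare the references directly. This is a minimal sketch with a trimmed-down Fruit/Apple pair (the Colour property is left out to keep it self-contained); SameObjectDemo and AllSameObject are hypothetical names.

```csharp
using System;

public class Fruit { public bool Edible { get; set; } }
public class Apple : Fruit { public bool KeepsDoctorAtBay { get; set; } }

public static class SameObjectDemo
{
    public static bool AllSameObject()
    {
        var foo = new Apple { Edible = true, KeepsDoctorAtBay = true }; // the only allocation
        Fruit bar = foo;          // upcast: new variable, same object
        var baz = (Apple)bar;     // downcast: another variable, still the same object

        bar.Edible = false;       // a change made through one variable...
        return ReferenceEquals(foo, bar)
            && ReferenceEquals(bar, baz)
            && !baz.Edible                      // ...is visible through the others
            && bar.GetType() == typeof(Apple);  // the runtime type never changed
    }

    public static void Main()
    {
        Console.WriteLine(AllSameObject()); // True
    }
}
```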
Conversion Operators
As per tolanj's comment, all bets about the heap are off if an explicit conversion operator replaces the default casting of reference types.
For instance, if we add an unrelated class:
public class WaxApple // Not inherited from Fruit or Apple
{
public static explicit operator Apple(WaxApple wax)
{
return new Apple
{
Edible = false,
Colour = Color.Green,
KeepsDoctorAtBay = false
};
}
}
As you can imagine, WaxApple's explicit operator Apple can do whatever it likes, including allocating new objects on the heap.
var wax = new WaxApple();
var fakeApple = (Apple)wax;
// Explicit cast operator called, new heap allocation as per the conversion code.
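To make the difference from an ordinary reference cast visible, the sketch below converts the same WaxApple twice and compares the results. Apple is reduced to a single property here, and ConversionDemo is a hypothetical name.

```csharp
using System;

public class Apple { public bool Edible { get; set; } }

public class WaxApple // not related to Apple by inheritance
{
    // A user-defined conversion runs arbitrary code; here it allocates a new Apple.
    public static explicit operator Apple(WaxApple wax) => new Apple { Edible = false };
}

public static class ConversionDemo
{
    public static bool AllocatesEachTime()
    {
        var wax = new WaxApple();
        var first = (Apple)wax;    // calls the operator
        var second = (Apple)wax;   // calls it again
        return !ReferenceEquals(first, second)  // two distinct Apple objects
            && !ReferenceEquals(wax, first);    // neither is the WaxApple itself
    }

    public static void Main()
    {
        Console.WriteLine(AllocatesEachTime()); // True
    }
}
```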

An upcast is nothing but a view onto an instance of a class through the "eyes of the parent class". Thus you're neither losing nor adding any information or memory by casting; you simply reference the same memory already allocated for the original instance. This is the reason why you can still access (e.g. by reflection) the members of ChildA through a variable of type Parent. The information still exists, it is simply not visible.
So instead of having two memory allocations you have two references to the same memory.
However, be aware that this does not apply if you provide your own conversion, e.g. from ChildB to ChildA. Doing so will typically look more or less like this:
public static explicit operator ChildA(ChildB b)
{
var a = new ChildA((Parent)b);
/* set further properties defined in ChildA but not in ChildB */
return a;
}
Here you have two completely different instances, one of type ChildA and one of type ChildB which both consume their own memory.

I think this means that the Parent Instance keep allocating memory for the Child classes.
No, because the Parent class does not know about its children.
var a = new ChildA();
.NET allocates memory for all members of ChildA (including those inherited from Parent).
var b = (Parent)a;
.NET does not do anything with memory. a and b point to the same memory block (allocated for ChildA).

Related

How does garbage collection collect objects that have inheritance

consider the following code:
class Base
{
}
class Derived : Base
{
// some code
}
and from main if we do
Derived d = new Derived();
I have two questions:
Q1 - We know that when we do new Derived(); the CLR allocates a Derived object on the heap. But since Derived derives from Base, Derived's implicit constructor also calls Base's implicit constructor. Does that mean a Base object is also allocated on the heap?
Q2 - (if the answer to Q1 is yes) In the GC's context, we refer to all reference type variables as roots. So, for example, variable d is a root, and this root points to the Derived object only. Here is the problem: there is no root variable for the Base object, so in theory the Base object would always be marked as unreachable by the garbage collector and get swept, which is obviously not correct. Does this mean that an implicit root variable is assigned to the Base object to keep it reachable?

You've misunderstood the nature of inheritance. There is just one object created here, an object of type Derived. The inheritance means that Derived gains some properties inherited from Base, but that does not mean another object of type Base is created. So to Q1, the answer is no. Therefore there is no need to answer Q2. The GC has one memory allocation to track, that for Derived.

When you create a new instance of a derived object (e.g. Derived d = new Derived();), a single object is allocated, not separate objects for the derived class and its base class.
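A small sketch can show that the base constructor runs on the very same object. The fields SeenInBaseCtor and TypeSeenInBaseCtor and the class SingleObjectDemo are hypothetical additions for illustration only.

```csharp
using System;

public class Base
{
    public readonly object SeenInBaseCtor;
    public readonly Type TypeSeenInBaseCtor;

    public Base()
    {
        SeenInBaseCtor = this;           // record which object this constructor is initializing
        TypeSeenInBaseCtor = GetType();  // and what its runtime type is
    }
}

public class Derived : Base { }

public static class SingleObjectDemo
{
    public static bool IsSingleObject()
    {
        var d = new Derived();
        // The base constructor ran on the Derived object itself, not on a separate Base object.
        return ReferenceEquals(d, d.SeenInBaseCtor)
            && d.TypeSeenInBaseCtor == typeof(Derived);
    }

    public static void Main()
    {
        Console.WriteLine(IsSingleObject()); // True
    }
}
```

Since there is only one object, the root d keeps all of it reachable, including the part declared in Base.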

Imagine they were both structs and you had to compose the derived object manually. You might end up with something like:
public struct Base {
...
}
public struct Derived {
public Base BasePart; // "base" itself is a reserved keyword in C#
...
}
The memory required for Base is contained within the memory footprint of Derived.

Memory allocation for object in inheritance in C#

I'm confused about how object allocation is done in the case of inheritance.
Consider the following code.
class Base
{
}
class Derived : Base
{
// some code
}
and from main if we do
Derived d = new Derived();
and
Base b = new Derived();
What is the memory allocation on the heap in both cases?
Is the derived object inside the base object, or are they beside each other?

Memory allocation for both objects will look exactly the same. Both objects are of the same type Derived.
Of course, each object will be allocated in its own space on the heap.
What counts when creating objects is the class (type) used to construct the object, not the type of reference where object will be stored.
Each object exists as a complete entity, but you can look at it as the sum of all parts from all the classes it inherits from. In a way, a Derived object instance contains a Base object instance inside, not the other way around.

In both cases you instantiate objects of the concrete Derived class, so the memory footprint is the same for both. You refer to them using references of the Base and the Derived type, but you instantiate the Derived class in both cases.
But as to providing a general answer to your question - yes, in memory instances of derived classes contain all the members of their base classes.
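As a rough check of the two declarations from the question, the sketch below compares runtime types and reads members from both levels of the hierarchy. BaseField, DerivedField and LayoutDemo are hypothetical names added so there is something to observe.

```csharp
using System;

public class Base { public int BaseField = 1; }
public class Derived : Base { public int DerivedField = 2; }

public static class LayoutDemo
{
    public static void Main()
    {
        Derived d = new Derived();
        Base b = new Derived();

        Console.WriteLine(d.GetType() == b.GetType());  // True: both objects are Derived
        Console.WriteLine(ReferenceEquals(d, b));       // False: two separate allocations
        Console.WriteLine(b.BaseField);                 // 1: the inherited part
        Console.WriteLine(((Derived)b).DerivedField);   // 2: the derived part is there too
    }
}
```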

How can I add a property to a class through inheritance and then cast the base class to the new class?

I'm wondering if there is a way to do this inheritance situation in C#:
public class Item
{
public string Name { get; set; }
}
public class ItemExtended : Item
{
public int ExtendedProp { get; set; }
}
And let's say I have a method that returns objects of type Item:
public Item[] GetItems();
How can I make code like this run?
ItemExtended[] itemsExt = GetItems().Cast(i => (ItemExtended)i).ToArray();
Where the cast wouldn't fail, the Name property value would be preserved and I would have an additional property ExtendedProp that I could access?
Edit (hopefully to clear some confusion)
In this situation the GetItems method would only ever return items of type Item. I was wondering if there was a casting method that could convert a base type to an inherited type such that all base member values are conserved (without the use of cloning).

If the runtime type of your object is Item, you cannot cast it to an ItemExtended, unless there's a user-defined conversion that can create an ItemExtended from an Item. Note, however, that even then you'll be creating a new instance of ItemExtended.
Inheritance in general doesn't work that way. In managed languages, downcasting only works if the runtime type of your object already is the derived type. Instances of derived classes inherit all the data and behavior of their ancestor classes, but an ancestor doesn't have any knowledge of its derived classes. Consider an example where a derived class introduces a single new field. First, the base class instance is smaller, so at the very least a type cast would require allocating new memory. Second, you would have to decide between changing the runtime type of the original instance (which would be very weird indeed) or making a copy of the old data. The latter would be very similar to the user-defined conversion scenario, except that a user-defined conversion is invoked explicitly, which IMO is the better way.
In unmanaged languages, you can of course make any arbitrary conversion you want -- but that just results in catastrophic failures if you do it wrong. In the example above, you would try to access the new field, but since it would not have been allocated for the instance, you would go beyond the boundaries of the object's memory space and access... whatever was in there, be it sensical or not.
If you want to introduce new behavior to existing classes, the C# way is via extension methods. Extension properties aren't there yet, and may never be, so you don't get the property syntax. You may or may not be able to live with that.
You may also find it interesting, that in XAML, the concept of attached properties sort of fits what you are trying to do: you can define arbitrary new properties for whatever -- but if you look at the implementation, what you are really doing is creating a dictionary that maps objects to their associated property values, and the XAML compiler sugarcoats this by making the markup look like you've added the properties to those objects.
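The dictionary idea can be imitated in plain C# with extension methods. This is only a sketch of the pattern, not how XAML implements attached properties: ItemExtensions, Extra, GetExtendedProp and SetExtendedProp are hypothetical names, and ConditionalWeakTable is used so that the side table does not keep the items alive.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Item { public string Name { get; set; } }

public static class ItemExtensions
{
    // Holder for the extra data; it lives outside the Item object.
    private sealed class Extra { public int ExtendedProp; }

    // Maps each Item to its extra data without preventing garbage collection of the Item.
    private static readonly ConditionalWeakTable<Item, Extra> Extras =
        new ConditionalWeakTable<Item, Extra>();

    public static int GetExtendedProp(this Item item) =>
        Extras.TryGetValue(item, out var extra) ? extra.ExtendedProp : 0;

    public static void SetExtendedProp(this Item item, int value) =>
        Extras.GetValue(item, _ => new Extra()).ExtendedProp = value;
}

public static class AttachedDemo
{
    public static void Main()
    {
        var item = new Item { Name = "widget" };
        item.SetExtendedProp(5);
        Console.WriteLine(item.Name);              // widget
        Console.WriteLine(item.GetExtendedProp()); // 5
    }
}
```

The Item object itself is unchanged; the extra value is method-call syntax over a lookup, which is the trade-off mentioned above.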

You can use OfType instead of Cast:
ItemExtended[] itemsExt = GetItems().OfType<ItemExtended>().ToArray();

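You're on the right track, with a few adjustments:
use Select() instead of Cast(), and
i as ItemExtended rather than (ItemExtended)i.
This line compiles and does not throw, but as yields null for every element whose runtime type is not ItemExtended, so plain Item objects come back as null entries:
ItemExtended[] itemsExt = GetItems().Select(i => i as ItemExtended).ToArray();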
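To compare the suggestions side by side, here is a sketch that assumes, as the question's edit states, that GetItems only ever returns plain Item objects. The GetItems body and the LinqCastDemo helpers are hypothetical.

```csharp
using System;
using System.Linq;

public class Item { public string Name { get; set; } }
public class ItemExtended : Item { public int ExtendedProp { get; set; } }

public static class LinqCastDemo
{
    // Assumption: only plain Item instances are returned.
    public static Item[] GetItems() =>
        new[] { new Item { Name = "a" }, new Item { Name = "b" } };

    // OfType silently drops everything that is not an ItemExtended.
    public static int OfTypeCount() => GetItems().OfType<ItemExtended>().Count();

    // "as" keeps the element count but produces null for each plain Item.
    public static int NullCountWithAs() =>
        GetItems().Select(i => i as ItemExtended).Count(x => x == null);

    // Cast<T> throws as soon as it meets an element of the wrong runtime type.
    public static bool CastThrows()
    {
        try { GetItems().Cast<ItemExtended>().ToArray(); return false; }
        catch (InvalidCastException) { return true; }
    }

    // Getting real ItemExtended objects means creating new instances and copying.
    public static ItemExtended[] CopyToExtended() =>
        GetItems().Select(i => new ItemExtended { Name = i.Name }).ToArray();

    public static void Main()
    {
        Console.WriteLine(OfTypeCount());            // 0
        Console.WriteLine(NullCountWithAs());        // 2
        Console.WriteLine(CastThrows());             // True
        Console.WriteLine(CopyToExtended()[0].Name); // a
    }
}
```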

Assigning derived class object to a parent class reference

I am always puzzled when I see:
Parent ref = new Child();
where Child class extends Parent.
What does the object ref look like in memory?
How is a virtual method treated? A non-virtual one?
How is it different from:
Child ref = new Child();

How does the object look in memory?
Your question is unclear. There are two relevant memory locations. The variable is associated with a storage location. That storage location contains a reference to another storage location.
The variable's storage location is typically realized as a four or eight byte integer that contains a "managed pointer" -- a memory address known to the garbage collector.
The object's memory layout is also an implementation detail of the CLR. The memory buffer associated with the object will contain all the data for the object -- all the values of the fields and whatnot. It also contains a reference to yet another memory location, the virtual function table of the object.
The virtual function table (vtable) then contains even more references, this time references that refer to the methods associated with the most-derived type of the object.
How is a virtual method treated? A non-virtual one?
Virtual methods are executed by looking up the object reference from the variable, then looking up the vtable, then looking up the method in the vtable, and then invoking that method.
Non-virtual methods are not invoked via the vtable because they are known at compile time.
How is it different from...
Non-virtual methods called on the object will call the version of the method based on the type of the variable. Virtual methods called on the object will call the version of the method based on the type of the object that the variable refers to.
If that is not all clear, you might want to read my article that explains how you might "emulate" virtual methods in a language that does not have them. If you can understand how to implement virtual methods yourself in a language that does not have them, that will help you understand how we actually do implement virtual methods.
http://blogs.msdn.com/b/ericlippert/archive/2011/03/17/implementing-the-virtual-method-pattern-in-c-part-one.aspx

ref is a Child object. Virtual methods are called on the Child class. However, methods defined only in the Child class are not visible through a variable of type Parent.
If foo() is not virtual, the compiler selects a method based on the declared type of the variable ref. If you have Parent ref = new Child(); then Parent.foo() will be called. If you have Child ref = new Child(); then Child.foo() will be called. Of course, in this case the C# compiler will ask you to use new in the declaration of Child.foo() to indicate that you mean to hide the implementation in Parent.
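The two dispatch rules can be seen in a short sketch. The method names Virt and NonVirt are hypothetical, and the variable is called p because ref is a reserved word in C#.

```csharp
using System;

public class Parent
{
    public virtual string Virt() => "Parent.Virt";
    public string NonVirt() => "Parent.NonVirt";
}

public class Child : Parent
{
    public override string Virt() => "Child.Virt";
    // "new" hides Parent.NonVirt instead of overriding it.
    public new string NonVirt() => "Child.NonVirt";
}

public static class DispatchDemo
{
    public static void Main()
    {
        Parent p = new Child();
        Child c = (Child)p;    // same object, different static type

        Console.WriteLine(p.Virt());     // Child.Virt: chosen by the object's runtime type
        Console.WriteLine(p.NonVirt());  // Parent.NonVirt: chosen by the variable's declared type
        Console.WriteLine(c.NonVirt());  // Child.NonVirt: same object, different declared type
    }
}
```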

I imagine ref just contains the address where the referred-to Child object can be found. If you call a virtual method, the actual method invoked depends on the dynamic type of the object (Child); if you call a non-virtual method, it depends on the static type (Parent). It's different from Child ref = ... because in that one, the static type is Child rather than Parent.
And I hope this isn't homework :)

Think of it this way (assuming Parent is not an abstract class)
Parent ref = new Child();
and
Parent ref = new Parent();
Are mostly the same, except virtual methods overridden in Child will be called in the former, but not the latter.
The type that you declare the object as will determine which methods are available on it. Declaring an object to be a less specific type than what you instantiate it as—the former case—can affect which methods get called at runtime, but only if those methods are declared as abstract or virtual.
In either case, imagine you called a method foo on ref. The runtime would find method foo on class Parent. The runtime would then see if foo was virtual (or abstract). If foo was not virtual or abstract, the runtime would call the foo that Parent defines right then and there, and be done with it. If, however, foo were virtual or abstract, the runtime would check whether ref was really instantiated as a more specific type that overrides foo. If so, it would call that foo.

C# get access to caller in parent constructor

I have a child and a parent class, as such:
class B : A{
public B() : base(){
// stuff
}
}
class A{
public A(){
// how can I gain access here to the class that called me,
// ie the instance of class B that's being instantiated.
}
}
As above, my question is whether I can see who called the parent constructor within the constructor of the parent class.
One way to do this would be to have a separate function in A to which you pass this from within B. Is there anything simpler, ie can I do this during object initialization, or is that too early in the object construction process ? Does the whole object B need to be "ready" before I can access it from within A ?
Thanks!

Within A, it's easy: you just use this and cast it to B if you're confident that it really is a B rather than any other derived class. The object will already be an instance of B.
However, it's generally a bad idea to call virtual methods from constructors, as the body of the B constructor hasn't been run yet, so it's only half-initialized. I've had a few situations where this is a pain, but if you tell us what you're trying to achieve we may be able to come up with something cleaner.
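The "half-initialized" problem can be reproduced with a sketch like the following. Describe, SeenByBaseCtor and the name field are hypothetical; the field is assigned in B's constructor body on purpose, because that body runs only after A's constructor has finished.

```csharp
using System;

public class A
{
    public readonly string SeenByBaseCtor;

    public A()
    {
        // Virtual call from a constructor: dispatches to B.Describe
        // before B's constructor body has run.
        SeenByBaseCtor = Describe();
    }

    public virtual string Describe() => "A";
}

public class B : A
{
    private readonly string name;

    public B() : base()
    {
        name = "ready"; // assigned only after A's constructor has returned
    }

    public override string Describe() => name ?? "(not set yet)";
}

public static class CtorOrderDemo
{
    public static void Main()
    {
        var b = new B();
        Console.WriteLine(b.SeenByBaseCtor); // (not set yet)
        Console.WriteLine(b.Describe());     // ready
    }
}
```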

You can check what the type is which is being instantiated:
public A()
{
var theType = this.GetType(); // will be typeof(B) in your example
}
But accessing the instance (e.g. its properties) is probably not wise, since the derived type is not yet initialized when the base type's constructor is executing.
